CN115983877A - Patent value evaluation method based on depth map and semantic learning - Google Patents
- Publication number
- CN115983877A (application number CN202310027211.3A)
- Authority
- CN
- China
- Prior art keywords
- value
- index
- indexes
- sea
- evaluation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention belongs to the technical field of patent evaluation and provides a patent value evaluation method based on deep graph learning and semantic learning. First, in the index screening process, patent assignment is combined with the construction of a patent value evaluation index system, giving an objective, fair and highly operable method for feature selection. Second, the novelty of a patent is calculated through text semantic learning, so that patent value is measured from the semantic perspective. Third, deep graph learning is used to maximize the mutual information between local and global representations when learning node feature representations, and the patent value is evaluated in combination with the XGBoost algorithm. The method overcomes the shortcomings of traditional methods for patent value evaluation and introduces the novelty of the patent text as a measure of value. Experimental results show that the method has higher accuracy and reliability. The invention provides a new method for evaluating patent value and a new approach for research on patent value.
Description
Technical Field
The invention belongs to the technical field of patent evaluation, and particularly relates to a patent value evaluation method based on deep graph learning and semantic learning.
Background
High-value patents are a topic of intense interest in industry. Cultivating high-value patents has become a consensus of the era for innovation-driven, high-quality development, and the national intellectual property authority treats the cultivation of high-value patents and the improvement of patent quality as key tasks. How to evaluate patent value and identify high-value patents is therefore a key problem in urgent need of a solution. With the deepening implementation of the intellectual property strategy, the number of patents in China has grown greatly, and conventional patent value evaluation methods can no longer cope with the large volume of patents to be evaluated. Constructing a patent value evaluation model suited to the big-data setting, one that quickly and effectively identifies high-value patents among a large number of patents, has thus become key to improving the quality of innovation-driven development.
Current research on patent value either explores the factors influencing patent value through a single index, for example "Hall B. Market value and patent citations [J]. The RAND Journal of Economics, 2005, 36(1): 16-38.", "Lerner J. The importance of patent scope: an empirical analysis [J]. The RAND Journal of Economics, 1994, 25: 319-333.", "Harhoff D, Scherer F M, Vopel K. Citations, family size, opposition and the value of patent rights [J]. Research Policy, 2003, 32(8)." and "Lanjouw J O, Schankerman M. Patent quality and research productivity: measuring innovation with multiple indicators [J]. Economic Journal, 2004, 114(495): 441-465.", or evaluates patent value through multiple indexes, for example "Wan Xiaoli, Zhu Xuezhong. Evaluation index system and fuzzy comprehensive evaluation of patent value [J]. Science Research Management, 2008(02): 185-191.", "Song Hefa, Mu Rongping. Research on patent quality, its measurement methods and measurement index system [J]. Science of Science and Management of S.& T., 2010, 31(04): 21-27." and "Guo Lei, Cai Hong. Situation analysis of industrial core patents in the context of patent strategy, 2016, 34(11): 1663-1671+1757.".
For example, Hall et al. first proposed using citation frequency to reflect patent value, and Lerner found that the technical scope covered by a patent has a significant influence on its value; however, such single-index methods struggle to reflect the economic value of a patent objectively. Many existing studies instead evaluate patent value through multiple patent indexes, such as patent citations and patent litigation. Wan Xiaoli et al. used the analytic hierarchy process and fuzzy comprehensive evaluation to build an index system of 17 indexes, including degree of innovation and technical content, offering a new approach that combines qualitative and quantitative evaluation. Guo Lei et al. found significant positive relationships between patent value and the scope of rights, the technical scope and self-citation behavior. However, the indexes in these studies are all feature information of the patents themselves, the indexes and index weights differ from model to model, and the academic community has not reached agreement on index selection. Moreover, although the text of a patent is an important reflection of its novelty, prior research has not considered semantic novelty. A patent value evaluation method is therefore needed that effectively fuses multiple indexes and measures patent value from the semantic perspective.
Disclosure of Invention
The present invention addresses the deficiencies of the prior art by providing a patent value evaluation method that combines the characteristics of patents. The patent features are first screened, and the semantic novelty of each patent is then evaluated using deep semantic learning. To fuse external indexes and semantic information effectively, node representations are learned by maximizing mutual information, which preserves both the local information of the nodes and the global information of the network; finally, the patent value is evaluated in combination with the XGBoost algorithm. The invention is the first to use semantic learning and deep graph learning in a big-data-oriented method for evaluating the economic value of patents.
The technical scheme of the invention is as follows: a comprehensive patent value evaluation model that effectively fuses multiple indexes and semantic novelty is built from an existing patent data set and is then applied to the patent data set to be evaluated to predict patent value. The method comprises the following steps:
Step 1, acquiring the attribute features of the patents and the citation relations between patents, and constructing a patent citation network;
Step 2, taking assigned (transferred) patents as the standard for patents of high economic value, determining the candidate indexes for patent value evaluation and the criterion layer to which each index belongs;
the criterion layers of the candidate indexes for patent value evaluation comprise: technical indexes, citation indexes, IPC indexes, internationalization indexes, time indexes, rights indexes and patentee indexes; the construction of the candidate indexes is shown in Table 1;
TABLE 1 Criterion layers and candidate index system
Step 3, screening the candidate indexes for patent value evaluation based on the Kolmogorov-Smirnov (K-S) test and constructing an index system for patent value evaluation;
Step 3.1, standardizing the candidate index data;
the sample data of the candidate indexes are processed by max-min normalization to eliminate the influence of dimension (units of measurement);
Step 3.2, calculating the D value of each single index;
for each candidate index, the maximum difference between the cumulative frequencies of assigned patents and unassigned patents in the existing patent data set is calculated, giving the K-S test statistic D value of that index;
Step 3.3, calculating the correlation coefficients of indexes within the same criterion layer;
the correlation coefficient between any two indexes in the same criterion layer is calculated to identify index pairs that carry repeated information; in each pair whose correlation coefficient exceeds 0.7, the index with the smaller D value is deleted, completing the first screening of the candidate indexes; the remaining k candidate indexes form an index system;
Step 3.4, calculating the economic value score of the patent;
the remaining candidate indexes are weighted according to their K-S test statistic D values, so that an index with a larger D value receives a larger weight, and the economic value score of the patent is calculated by linear weighting; the candidate index weights are calculated using formula (1):
$$w_j = \frac{D_j}{\sum_{m=1}^{k} D_m} \quad (1)$$
the economic value score of the patent is calculated using formula (2):
$$Z = \sum_{j=1}^{k} w_j x_j \quad (2)$$
where $w_j$ is the weight of the $j$-th candidate index; $D_j$ is the K-S test statistic D value of the $j$-th index; $k$ is the number of candidate indexes remaining after the first screening, that is, the number of indexes to be weighted, with $j = 1, 2, \ldots, k$; $Z$ is the economic value score of the patent; and $x_j$ is the normalized value of the $j$-th candidate index of the patent to be evaluated;
Step 3.5, calculating the K-S test statistic D value of the index system;
the D value of the patent economic value score produced by the index system is calculated in the same way as the D value of a single candidate index;
Step 3.6, after the D value of the index system formed by the k candidate indexes remaining after the first screening has been calculated, each candidate index is deleted in turn, the D value of each combination of the remaining k-1 indexes is calculated, and the maximum of these D values is compared with the D value before deletion; if the maximum D value of the remaining combinations is larger than the D value before deletion, the corresponding candidate index is deleted;
Step 3.7, repeating step 3.6 until, whichever candidate index is deleted, the D value of the remaining combination is smaller than the D value before deletion; deletion then stops and the second screening of the candidate indexes is complete; the remaining candidate indexes form the optimal candidate index combination for patent value evaluation;
Step 4, calculating the semantic novelty of the patent, comprising the following steps:
Step 4.1, establishing a corpus $T = \{t_1, t_2, \ldots, t_i\}$ from the invention titles and abstracts of the patents, where $t_i$ is the text of patent $i$, namely the text consisting of the invention title and the abstract of the specification; each text paragraph of a patent is represented by a unique column vector of the paragraph vector matrix $V$, and each word in the text paragraph is represented by a unique column vector of the word vector matrix $W$;
Step 4.2, predicting the probability of the next word in text paragraph $t_i$ from the average of the paragraph vector and the word vectors, that is, of the corresponding unique column vectors of the paragraph vector matrix and the word vector matrix, to obtain the text paragraph representation and the word representations; given a training word sequence $w_1, w_2, \ldots, w_{|T|}$ and a paragraph $v_i$, the following objective is maximized under a fixed-length window $win$:
where $M$ is the number of all training words and $v_i$ is the text paragraph representation vector containing the context words of the current window; the prediction task is performed by hierarchical softmax:
where $N_w$ is the total number of words in the training word sequence and $Pr$ is the output log probability, calculated as:
$$Pr = U\,a(w_{t-|win|}, \ldots, w_{t-1}, w_{t+1}, \ldots, w_{t+|win|}, v_i; W, V) + b \quad (5)$$
where $U$ and $b$ are the softmax parameters and $a$ is obtained by averaging $w_t$ and $v_i$; the PV-DM model is used to vectorize the text paragraph of each patent in the latent space $R^k$, giving the final text representation matrix $V$ of the patents;
Step 4.3, calculating the Euclidean distance between the text paragraph representation vector of a patent and that of each patent it cites:
Step 4.4, summarizing and ranking the Euclidean distances over all $|R|$ patent citation pairs in the patent citation network, and calculating the semantic novelty $S_i$ of the patent:
Step 5, generating a node feature matrix $X$ from the optimal candidate index combination obtained in step 3 and the semantic novelty calculated in step 4, where $n_1 = |V|$ is the number of nodes; establishing a patent citation adjacency matrix $A$ that stores the citation information between nodes; and using an encoder $\varepsilon$ to obtain the final node feature representations, comprising the following steps:
Step 5.1, inputting the node feature matrix $X$ and obtaining the local representations of the nodes in the positive sample by integrating the neighborhood information of each target node through the graph convolutional network $\varepsilon$; the information integration process is:
$$H^{l+1} = \sigma\left(\tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}\,H^{l}\,W^{l}\right)$$
where $\tilde{D}$ is the degree matrix of $\tilde{A}$; $H^l$ is the feature representation learned at layer $l$; $W^l$ is the learnable parameter matrix of layer $l$ of the graph convolutional network; for the input layer, $l = 0$ and $H^0 = X$; and $\sigma$ is a nonlinear activation function;
Step 5.2, using a corruption function to modify the nodes in the network and obtain negative samples, and generating the local node representations of the negative samples with the same information integration method as in step 5.1;
Step 5.3, applying a transfer (readout) function to the local representations $h_i$ of the nodes in the positive sample to compute the global representation of the network:
where $N$ is the number of positive samples;
Step 5.5, minimizing the final loss function $L_n$ to update the final representation $h_i$ of each patent node in the positive sample:
where $N_n$ is the number of negative samples; the negative sample representations are those generated in step 5.2; $s$ is the global representation of the network; $E_{(\cdot)}[\cdot]$ denotes the expected value of the function $[\cdot]$; and the logarithmic terms are the logarithm of equation (10);
Step 6, predicting the patent value; the final patent node representations are input into an XGBoost machine learning model to predict the patent value and obtain a predicted score $\hat{y}_i$. For a patent sample $i$, the final representation $h_i$ of the patent node is input and the prediction is calculated as:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(h_i)$$
where $f_k$ is the $k$-th decision tree in the XGBoost model, $K$ is the number of trees in the model, and $f_k(h_i)$ is the predicted value of patent sample $i$ on the $k$-th tree.
The beneficial effects of the invention are as follows. The invention provides a patent value evaluation method based on deep graph learning and semantic learning. In the index screening process, patent assignment is combined with the construction of a patent value evaluation index system, giving an objective, fair and highly operable method for feature selection. The novelty of a patent is calculated through text semantic learning, so that patent value is measured from the semantic perspective. Deep graph learning is then used to maximize the mutual information between local and global representations when learning node feature representations, and the patent value is evaluated. The method overcomes the shortcomings of traditional methods for patent value evaluation and introduces the novelty of the patent text as a measure of value. Experimental results show that the method has higher accuracy and reliability. The invention provides a new method for evaluating patent value and a new approach for research on patent value.
Drawings
FIG. 1 is a flow chart of the patent value evaluation method based on deep graph learning and semantic learning according to the present invention.
FIG. 2 is a flowchart of index screening.
Detailed Description
The following further describes the specific embodiments of the present invention with reference to the drawings and technical solutions.
In this embodiment, 2209 patents in the biopharmaceutical field that were published more than 5 years ago are taken as an example, and the indexes and criterion layers applicable to patents published more than 5 years ago are used to construct the patent value evaluation model and verify its validity. 1473 patent samples are used to construct the value evaluation model, and the remaining 736 are used for patent value evaluation and for verifying the effectiveness of the model. The technical scheme of the invention is implemented as follows:
1. Construct a patent citation network from the real patent publication and citation information.
2. Select the candidate indexes and construct the criterion layers according to the characteristics of the different patent indexes at the given publication time.
3. Standardize the index data of the patent samples by max-min normalization to eliminate the influence of dimension (units of measurement).
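The max-min normalization in step 3 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `min_max_normalize`, the sample page counts, and the handling of a constant column are assumptions.

```python
def min_max_normalize(values):
    """Scale raw index values to [0, 1] with max-min normalization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no spread, so map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw values for the index "number of specification pages".
pages = [10, 20, 30]
normalized_pages = min_max_normalize(pages)
print(normalized_pages)
```

After scaling, every index lies on the same [0, 1] range, so indexes measured in different units can be compared and combined in the later weighting step.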
4. Calculate the K-S test statistic D value of each single index.
The discriminating power of an index with respect to patent assignment status is measured by the size of its D value: the larger the D value, the greater the difference between assigned and unassigned patents on that index, and the better the index identifies whether a patent has been assigned. The calculation of the single-index D value is described below using the index "number of specification pages" as an example. For ease of understanding, assume the normalized values of "number of specification pages" are 1, 0.5 and 0.
(4.1) Each value of the index "number of specification pages" corresponds to one or more patents. Patents with the same index value form a patent group, and the patent groups are arranged in descending order of index value. The patent group numbers are listed in row 1 of Table 2 and the index values in row 2.
(4.2) The number of assigned patents and the number of unassigned patents in each patent group are counted and listed in rows 3 and 4 of Table 2, respectively.
(4.3) Calculate the number of assigned patents and the number of unassigned patents in each cumulative patent group.
The patent group with the highest index value is the first cumulative patent group; the patent group with the next lower index value is then added each time, so the first two patent groups form the second cumulative patent group and the first three patent groups form the third. The number of assigned patents and the number of unassigned patents in each cumulative patent group are calculated and listed in rows 5 and 6 of Table 2, respectively.
(4.4) Calculate the cumulative frequency of assigned patents and the cumulative frequency of unassigned patents in each cumulative patent group.
The cumulative frequency of assigned patents is obtained by dividing the cumulative number of assigned patents in row 5 of Table 2 by the total number of assigned patents (the last column of row 5), and is listed in row 7 of Table 2. Similarly, the cumulative frequency of unassigned patents is obtained by dividing the cumulative number of unassigned patents by the total number of unassigned patents, and is listed in row 8 of Table 2.
(4.5) Calculate the difference d between the cumulative frequency of assigned patents and the cumulative frequency of unassigned patents in each cumulative patent group, d = |cumulative frequency of assigned patents - cumulative frequency of unassigned patents|; the values are listed in row 9 of Table 2.
(4.6) Determine the K-S test statistic D value of the single index.
The K-S test statistic D is the maximum of the differences d between the cumulative frequency of assigned patents and the cumulative frequency of unassigned patents, i.e., D = max(d); the resulting D value is listed in row 10 of Table 2.
TABLE 2 calculation of the D value of the K-S test statistic
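Steps (4.1) to (4.6) can be sketched in a few lines. The function name `ks_d_value` and the six sample patents are hypothetical; the logic follows the grouping, descending accumulation and maximum-gap rule described above.

```python
def ks_d_value(index_values, assigned_flags):
    """K-S statistic D for one index: the largest gap between the cumulative
    frequencies of assigned and unassigned patents, accumulating patent
    groups from the highest index value downward."""
    n_assigned = sum(1 for a in assigned_flags if a)
    n_unassigned = len(assigned_flags) - n_assigned
    if n_assigned == 0 or n_unassigned == 0:
        raise ValueError("need at least one assigned and one unassigned patent")

    # (4.1)-(4.2): group patents by index value and count both classes.
    groups = {}
    for value, assigned in zip(index_values, assigned_flags):
        counts = groups.setdefault(value, [0, 0])
        counts[0 if assigned else 1] += 1

    # (4.3)-(4.6): accumulate in descending order and track the maximum gap.
    cum_assigned = cum_unassigned = 0
    d_max = 0.0
    for value in sorted(groups, reverse=True):
        cum_assigned += groups[value][0]
        cum_unassigned += groups[value][1]
        d = abs(cum_assigned / n_assigned - cum_unassigned / n_unassigned)
        d_max = max(d_max, d)
    return d_max


# Hypothetical normalized "number of specification pages" for six patents.
values = [1.0, 1.0, 0.5, 0.5, 0.0, 0.0]
assigned = [True, True, True, False, False, False]
d_pages = ks_d_value(values, assigned)
print(d_pages)
```

In this toy sample the assigned patents concentrate at high index values, so the two cumulative curves separate early and D is large; an index distributed identically in both classes would give D = 0.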
5. Delete indexes that carry repeated information (first screening of indexes).
The correlation coefficient between any two indexes in the same criterion layer is calculated, and in each index pair whose correlation coefficient exceeds 0.7 the index with the smaller D value is deleted. This avoids information redundancy in the index system while also avoiding the mistaken deletion of indexes with strong ability to discriminate assignment. The correlation coefficient between index q and index j is calculated as:
$$r_{qj} = \frac{\sum_{i=1}^{n}(x_{iq}-\bar{x}_q)(x_{ij}-\bar{x}_j)}{\sqrt{\sum_{i=1}^{n}(x_{iq}-\bar{x}_q)^2}\,\sqrt{\sum_{i=1}^{n}(x_{ij}-\bar{x}_j)^2}}$$
where $r_{qj}$ is the correlation coefficient between the $q$-th and $j$-th indexes; $x_{iq}$ is the value of the $q$-th index for the $i$-th patent; $\bar{x}_q$ is the mean of the $q$-th index; $x_{ij}$ is the value of the $j$-th index for the $i$-th patent; $\bar{x}_j$ is the mean of the $j$-th index; and $n$ is the number of patent samples.
Through correlation analysis, 9 indexes, such as "number of cited domestic patents" and "number of cited foreign patents", are deleted from the index system for patents published more than 5 years ago, leaving 20 indexes.
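The first screening can be illustrated with a small sketch. The function names, index names, D values and layer labels below are all hypothetical, and the correlation is assumed to be the Pearson coefficient, compared against the 0.7 threshold exactly as stated (without taking an absolute value).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # Assumption: treat a constant column as uncorrelated.
    return cov / (sd_x * sd_y)


def first_screening(columns, d_values, layers, threshold=0.7):
    """Within each criterion layer, drop the lower-D index of every pair
    whose correlation exceeds the threshold; return the surviving names."""
    kept = set(columns)
    names = sorted(columns)
    for pos, q in enumerate(names):
        for j in names[pos + 1:]:
            if q not in kept or j not in kept or layers[q] != layers[j]:
                continue
            if pearson(columns[q], columns[j]) > threshold:
                kept.discard(q if d_values[q] < d_values[j] else j)
    return [name for name in names if name in kept]


# Hypothetical normalized data for three candidate indexes.
columns = {
    "domestic_citations": [1, 2, 3, 4],
    "foreign_citations": [2, 4, 6, 8],
    "claims": [4, 1, 3, 2],
}
d_values = {"domestic_citations": 0.30, "foreign_citations": 0.20, "claims": 0.25}
layers = {"domestic_citations": "citation", "foreign_citations": "citation", "claims": "rights"}
survivors = first_screening(columns, d_values, layers)
print(survivors)
```

Here the two citation indexes are perfectly correlated, so the one with the smaller D value is removed, while the index from a different criterion layer is never compared against them.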
6. Weight the indexes based on their D values.
Indexes are weighted on the principle that the larger the K-S test statistic D of an index, that is, the stronger its ability to discriminate assignment, the larger its weight. The weighting formula is:
$$w_j = \frac{D_j}{\sum_{m=1}^{k} D_m}$$
where $w_j$ is the weight of the $j$-th index; $D_j$ is the K-S test statistic D value of the $j$-th index, representing its ability to discriminate assignment; and $k$ is the number of indexes to be weighted, with $j = 1, 2, \ldots, k$ and $k = 20$ here.
7. Calculate the patent value score.
The economic value score of a patent is calculated by linear weighting:
$$Z = \sum_{j=1}^{k} w_j x_j$$
where $Z$ is the patent value score; $w_j$ is the weight of the $j$-th index; $k$ is the number of weighted indexes ($k = 20$ here); and $x_j$ is the normalized value of the $j$-th index of the patent to be evaluated.
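A minimal sketch of the weighting and scoring in steps 6 and 7 follows. It assumes that each weight is the index's D value divided by the sum of all D values, which is one natural reading of "larger D, larger weight"; the function names and numbers are hypothetical.

```python
def ks_weights(d_values):
    """Weights proportional to the K-S D values (assumed normalization:
    each D value divided by the sum of all D values)."""
    total = sum(d_values)
    if total <= 0:
        raise ValueError("D values must sum to a positive number")
    return [d / total for d in d_values]


def value_score(weights, normalized_values):
    """Linear weighted score Z = sum of w_j * x_j for one patent."""
    return sum(w * x for w, x in zip(weights, normalized_values))


# Two hypothetical indexes with D values 0.4 and 0.1.
weights = ks_weights([0.4, 0.1])
# One hypothetical patent with normalized index values 1.0 and 0.5.
z = value_score(weights, [1.0, 0.5])
print(weights, z)
```

Because the weights sum to 1 and every normalized index lies in [0, 1], the score Z also lies in [0, 1], which keeps scores comparable across patents.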
8. Calculate the D value of the patent value score and perform the second screening of the index system.
(8.1) Calculate the D value, $D_{20}$, of the evaluation index system consisting of the 20 indexes remaining after the first screening.
Following the calculation of the single-index D value, the D value $D_{20}$ of the patent value score of the 20-index system is calculated. The calculation is the same as for a single index, except that the normalized value of the single index is replaced by the patent value score.
(8.2) After $D_{20}$ has been obtained, one index is removed at a time and the D value of the system formed by the remaining 19 indexes is calculated. This gives 20 index combinations and 20 corresponding D values, from which the maximum is selected.
(8.3) Screen out the index system with the stronger ability to discriminate patent assignment, as measured by its D value.
When this maximum is larger than $D_{20}$, the index system formed by the 19 indexes remaining after one index is removed discriminates assigned from unassigned patents better than the 20-index system. The 19-index system is therefore retained.
(8.4) Repeat steps (8.2) and (8.3), continuing to delete indexes until the maximum D value of the systems formed by the remaining k-1 indexes is smaller than the D value of the k-index system, at which point index screening stops.
That is, once removing any one of the k indexes weakens the ability of the remaining k-1 indexes to discriminate patent assignment, the k-index system is retained and index screening terminates.
After the second screening, a further 9 indexes, such as the number of IPC (International Patent Classification) subclasses and the number of figures, are deleted from the index system for patents published more than 5 years ago, leaving 11 indexes. The index system formed by these remaining indexes is the one with strong ability to discriminate patent assignment.
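The second screening is a greedy backward elimination, which can be sketched as below. The callable `system_d` stands in for "re-weight the subset, score every patent, and compute the K-S D value of the scores"; here it is replaced by a hypothetical toy function so that the sketch runs on its own.

```python
def second_screening(index_names, system_d):
    """Greedy backward elimination: repeatedly drop the index whose removal
    gives the largest system D value, as long as that D value improves."""
    current = tuple(index_names)
    current_d = system_d(current)
    while len(current) > 1:
        # Every subset obtained by removing exactly one index.
        candidates = [tuple(n for n in current if n != removed) for removed in current]
        best = max(candidates, key=system_d)
        best_d = system_d(best)
        if best_d > current_d:
            current, current_d = best, best_d
        else:
            break  # No removal improves D: keep the current index system.
    return list(current), current_d


# Hypothetical stand-in for the real system D value: useful indexes raise D,
# any other index lowers it.
USEFUL = {"claims", "citations"}


def toy_system_d(subset):
    useful = sum(1 for name in subset if name in USEFUL)
    noisy = len(subset) - useful
    return 0.1 * useful - 0.05 * noisy


selected, selected_d = second_screening(["claims", "citations", "figures"], toy_system_d)
print(selected, selected_d)
```

The loop stops at a local optimum: it never revisits an index once deleted, which matches the one-at-a-time deletion described in steps (8.2) to (8.4).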
9. Calculate the semantic novelty of the patent.
(9.1) Establish a corpus $T = \{t_1, t_2, \ldots, t_i\}$ from the invention titles and abstracts of the patents, where $t_i$ is the text of patent $i$. Each text paragraph is represented by a unique column vector of matrix $V$, and each word in a sentence by a unique column vector of matrix $W$. The following objective is maximized under a fixed-length window $win$:
where $M$ is the number of all training words and $v_i$ is the document representation vector containing the context words of the current window. The probability of the next word in the document is predicted using hierarchical softmax:
The log probability of each output word is calculated as:
$$Pr = U\,a(w_{t-|win|}, \ldots, w_{t-1}, w_{t+1}, \ldots, w_{t+|win|}, v_i; W, V) + b$$
where $U$ and $b$ are the softmax parameters and $a$ is obtained by averaging the word vectors $w_t$ and the paragraph vector $v_i$. The PV-DM model is used to vectorize each patent text in the latent space $R^k$, giving the text representation matrix $V$ of the patents.
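For orientation, the standard PV-DM (distributed memory paragraph vector) formulation is written out below in the notation of this step. This is an assumption about the form of the objective and the softmax, based on the usual PV-DM model, not a recovery of the patent's own equations.

```latex
% Objective: average log probability of each word given its window and the paragraph vector
\frac{1}{M} \sum_{t=win}^{M-win} \log p\left(w_t \mid w_{t-win}, \ldots, w_{t+win}, v_i\right)

% Softmax over the N_w candidate words, using the unnormalized log probability Pr
p\left(w_t \mid w_{t-win}, \ldots, w_{t+win}, v_i\right)
  = \frac{e^{Pr_{w_t}}}{\sum_{j=1}^{N_w} e^{Pr_j}}
```

Hierarchical softmax replaces the full sum in the denominator with a walk down a binary tree over the vocabulary, which is what makes training on a large patent corpus tractable.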
(9.2) Calculate the distance between the vector of a patent and the vector of each patent it cites:
(9.3) Summarize and rank the distances between all citation pairs, and calculate the semantic novelty score $S_i$ of each patent from them.
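The distance and ranking parts of steps (9.2) and (9.3) can be sketched as follows. The patent identifiers and two-dimensional vectors are hypothetical, and the sketch stops at the ranked distances; it does not attempt to reproduce the formula for $S_i$ itself.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def ranked_citation_distances(vectors, citation_pairs):
    """Return (citing, cited, distance) triples sorted from the largest
    distance (most semantically different) to the smallest."""
    triples = [
        (citing, cited, euclidean(vectors[citing], vectors[cited]))
        for citing, cited in citation_pairs
    ]
    return sorted(triples, key=lambda triple: triple[2], reverse=True)


# Hypothetical text representation vectors and citation pairs.
vectors = {"P1": [0.0, 0.0], "P2": [3.0, 4.0], "P3": [1.0, 0.0]}
pairs = [("P1", "P3"), ("P2", "P1")]
ranking = ranked_citation_distances(vectors, pairs)
print(ranking)
```

A patent whose text vector lies far from the vectors of the patents it cites ranks near the top of this list, which is the intuition behind treating distance as a signal of semantic novelty.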
10. Generate a node feature matrix $X$ from the screened indexes and the calculated semantic novelty, where $n_1 = |V|$ is the number of nodes; establish an adjacency matrix $A$ that stores the citation information between nodes; and use an encoder $\varepsilon$ to obtain the final node feature representations, as follows:
(10.1) Input the feature matrix $X$ and integrate the neighborhood information of each target node through the graph convolutional network $\varepsilon$ to obtain the node representations in the positive sample:
(10.2) Use a corruption function to modify the nodes in the network and obtain negative samples, and generate the representations of the negative samples in the same way as in step (10.1).
(10.3) Apply a transfer (readout) function to the local node representations to compute the global representation of the network:
where $N$ is the number of positive samples.
(10.5) Calculate the final loss function:
where $N_n$ is the number of negative samples.
(10.6) Minimize the loss function to generate the representation $h_i$ of each patent node.
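The procedure in step 10 resembles the Deep Graph Infomax scheme with a graph convolutional encoder. Under that assumption, the usual forms of the propagation rule, readout, discriminator and loss are sketched below. The self-loop convention $\tilde{A} = A + I$, the discriminator weight matrix $W_d$ and the negative-sample symbol $\tilde{h}_j$ are assumptions introduced for illustration.

```latex
% Graph convolution layer, assuming \tilde{A} = A + I and \tilde{D} its degree matrix
H^{l+1} = \sigma\left(\tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}\,H^{l}\,W^{l}\right),
\qquad H^{0} = X

% Readout: global summary of the N positive-sample node representations
s = \sigma\left(\frac{1}{N} \sum_{i=1}^{N} h_i\right)

% Discriminator scoring a (node representation, summary) pair; W_d is hypothetical
\mathcal{D}(h_i, s) = \sigma\left(h_i^{\top} W_d\, s\right)

% Loss to minimize: negative of the mutual-information objective
L_n = -\frac{1}{N + N_n}\left(
  \sum_{i=1}^{N} E\left[\log \mathcal{D}(h_i, s)\right]
  + \sum_{j=1}^{N_n} E\left[\log\left(1 - \mathcal{D}(\tilde{h}_j, s)\right)\right]
\right)
```

Minimizing this loss pushes the discriminator to score real node representations high against the global summary and corrupted ones low, so each $h_i$ retains information shared with the whole citation network.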
11. Predict the patent value. The patent node representations are input into the XGBoost value prediction model to obtain the predicted scores. For a sample $i$, its feature representation $h_i$ is input and the prediction is calculated as:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(h_i)$$
where $f_k$ is the $k$-th decision tree and $K$ is the number of trees in the model.
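The additive structure of the prediction, a sum of per-tree outputs, can be illustrated without the XGBoost library. The two depth-1 trees below are hypothetical hand-written stand-ins with made-up thresholds and leaf values; a trained model would learn these from data.

```python
def predict_additive(trees, h):
    """Sum the outputs of all trees for one node representation h."""
    return sum(tree(h) for tree in trees)


# Hypothetical depth-1 trees (stumps); thresholds and leaf values are made up.
def tree_1(h):
    return 0.6 if h[0] > 0.5 else 0.1


def tree_2(h):
    return 0.3 if h[1] > 0.2 else -0.1


h_i = [0.8, 0.1]  # hypothetical final representation of one patent node
score = predict_additive([tree_1, tree_2], h_i)
print(score)
```

Each tree contributes a correction to the running score, which is why later trees in a boosted ensemble can carry small or negative leaf values.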
The above description is only an embodiment of the present invention, but the scope of the present invention is not limited thereto, and any insubstantial changes and substitutions made by those skilled in the art based on the present invention are included in the scope of the present invention claimed in the claims.
Claims (1)
1. A patent value evaluation method based on deep graph learning and semantic learning, characterized by comprising the following steps:
Step 1, acquiring the attribute features of the patents and the citation relations between patents, and constructing a patent citation network;
Step 2, taking assigned (transferred) patents as the standard for patents of high economic value, determining the candidate indexes for patent value evaluation and the criterion layer to which each index belongs;
the criterion layers of the candidate indexes for patent value evaluation comprise: technical indexes, citation indexes, IPC indexes, internationalization indexes, time indexes, rights indexes and patentee indexes; the construction of the candidate indexes is shown in Table 1;
TABLE 1 Criterion layers and candidate index system
Step 3, screening the sea selection indexes for patent value evaluation based on a K-S method and constructing an index system for patent value evaluation;
step 3.1, standardizing the sea election index data of patent value evaluation;
the data standardization processing is to adopt a maximum value-minimum value standardization method to process the sample data of the sea election index of the patent value evaluation and eliminate the influence of dimension;
step 3.2, calculating a single index D value;
calculating the maximum value of the accumulated frequency difference value of the assigned patents and the non-assigned patents corresponding to the sea election indexes of the patent value evaluation in the existing patent data set to obtain the K-S test statistic D value of the sea election indexes of the patent value evaluation;
step 3.3, calculating index correlation coefficients in the same criterion layer;
calculating a correlation coefficient between any two indexes in the same criterion layer, determining an index pair reflecting repeated information in the sea election indexes of the patent value evaluation, deleting the index with a small D value from the index pair with the correlation coefficient larger than 0.7, and finishing the first screening of the sea election indexes of the patent value evaluation; forming an index system by the remaining K marine selection indexes for patent value evaluation;
Step 3.4, calculating the economic value score of the patent;
The indicators remaining after the first screening are weighted according to their K-S test statistic D values, so that an indicator with a larger D value receives a larger weight, and the economic value score of the patent is calculated by linear weighting; the indicator weights are calculated with formula (1):
$$w_j = \frac{D_j}{\sum_{m=1}^{k} D_m} \tag{1}$$
The economic value score of the patent is calculated with formula (2):
$$Z = \sum_{j=1}^{k} w_j x_j \tag{2}$$
where $w_j$ is the weight of the j-th indicator; $D_j$ is the K-S test statistic D of the j-th indicator; $k$ is the number of indicators to be weighted, $k = 1, 2, \ldots, K$; $K$ is the number of indicators remaining after the first screening; $Z$ is the economic value score of the patent; and $x_j$ is the standardized value of the j-th indicator of the patent to be evaluated;
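As a small numeric illustration of the weighting and scoring step, the sketch below assumes that each weight is the indicator's D value divided by the sum of all D values, so that the weights add up to 1. The function names and numbers are hypothetical.

```python
def ks_weights(d_values):
    """Weights proportional to the K-S D values, normalized to sum to 1."""
    total = sum(d_values)
    return [d / total for d in d_values]


def economic_value_score(weights, x):
    """Linear weighted score of one patent's standardized indicator values."""
    return sum(w * v for w, v in zip(weights, x))


# Hypothetical D values of three retained indicators.
d_values = [0.5, 0.3, 0.2]
weights = ks_weights(d_values)
# Hypothetical standardized indicator values of one patent.
score = economic_value_score(weights, [1.0, 0.5, 0.0])
```

Because the standardized inputs lie in [0, 1] and the weights sum to 1, the score also lies in [0, 1].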
Step 3.5, calculating the K-S test statistic D of the indicator system;
By analogy with the D value of a single indicator, the K-S test statistic D of the economic value scores produced by the indicator system is calculated;
Step 3.6, after the D value of the indicator system formed by the K indicators remaining after the first screening has been calculated, deleting each indicator in turn and calculating the maximum D value among the resulting combinations of K-1 indicators; the D values before and after the deletion are compared, and an indicator is deleted when the D value of the remaining combination is larger than the D value before deletion;
Step 3.7, repeating step 3.6 until, whichever indicator is deleted, the D value of the remaining combination is smaller than the D value before deletion; deletion then stops and the second screening is complete; the remaining indicators form the optimal indicator combination for patent value evaluation;
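Steps 3.5 to 3.7 amount to a greedy backward elimination driven by the D value of the whole indicator system. The following sketch is one possible reading, under these assumptions: weights are proportional to single-indicator D values, labels are 1 for transferred and 0 for non-transferred patents, and the loop stops as soon as no deletion strictly increases D. All names and data are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of two samples."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in set(a) | set(b)
    )


def split(values, labels):
    """Separate values of transferred (1) and non-transferred (0) patents."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return pos, neg


def system_d(columns, labels, names):
    """D value of the weighted score built from the indicators in `names`."""
    d = {n: ks_statistic(*split(columns[n], labels)) for n in names}
    total = sum(d.values())
    if total == 0:
        return 0.0
    scores = [
        sum(d[n] / total * columns[n][i] for n in names)
        for i in range(len(labels))
    ]
    return ks_statistic(*split(scores, labels))


def second_screening(columns, labels, names):
    """Delete indicators one at a time while the system D value increases."""
    names = list(names)
    best = system_d(columns, labels, names)
    while len(names) > 1:
        trials = [
            (system_d(columns, labels, [m for m in names if m != n]), n)
            for n in names
        ]
        trial_d, candidate = max(trials)
        if trial_d <= best:
            break  # no single deletion improves the D value
        names.remove(candidate)
        best = trial_d
    return names, best


labels = [1, 1, 1, 0, 0, 0]
columns = {
    "citations": [0.9, 0.8, 0.7, 0.3, 0.2, 0.1],
    "age": [0.2, 0.05, 0.6, 1.0, 0.8, 0.4],
}
selected, best_d = second_screening(columns, labels, ["citations", "age"])
```

In this toy data the `age` indicator lowers the discriminating power of the combined score, so it is removed and the remaining single-indicator system reaches D = 1.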
Step 4, calculating the semantic novelty of the patent, comprising the following steps:
Step 4.1, building a corpus $T = \{t_1, t_2, \ldots, t_i\}$ from the titles and abstracts of the patents, where $t_i$ is the text of patent i, namely the title of the invention together with the abstract of the specification; each patent text paragraph is represented by a unique column vector of the paragraph vector matrix V, and each word in the text paragraphs is represented by a unique column vector of the word vector matrix W;
Step 4.2, predicting the next word of text paragraph $t_i$ from the average of the corresponding column vectors of the paragraph vector matrix and the word vector matrix, and obtaining the paragraph and word representations from the probability of the next word; given a training word sequence $w_1, w_2, \ldots, w_{|T|}$ and a paragraph vector $v_i$, the following objective is maximized under a fixed-length window win:
where M is the total number of training words and $v_i$ is the representation vector of the text paragraph containing the context words of the current window; the prediction task is performed by hierarchical softmax:
where $N_w$ is the total number of words in the training word sequence and Pr is the output log probability, calculated as:
$$\mathrm{Pr} = U\,a(w_{t-|win|}, \ldots, w_{t-1}, w_{t+1}, \ldots, w_{t+|win|}, v_i; W, V) + b \tag{5}$$
where U and b are the softmax parameters and a is obtained by averaging $w_t$ and $v_i$; using the PV-DM model, the text paragraph of each patent is represented as a vector in the latent space $\mathbb{R}^k$, giving the final patent text representation matrix V;
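The forward computation of formula (5) can be illustrated with a toy example. This sketch is not a trained PV-DM model: for brevity it uses a plain softmax in place of the hierarchical softmax named in the text, and the vectors, the matrix `U` and the bias `b` are made-up values.

```python
from math import exp


def average(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_word_probs(context_word_vecs, paragraph_vec, U, b):
    """Scores U a + b, where a averages the context word vectors and the
    paragraph vector; the scores are then turned into probabilities."""
    a = average(context_word_vecs + [paragraph_vec])
    logits = [sum(u * x for u, x in zip(row, a)) + bias for row, bias in zip(U, b)]
    return softmax(logits)


# Toy 2-dimensional vectors and a 3-word vocabulary (all hypothetical).
context = [[1.0, 0.0], [0.0, 1.0]]
paragraph = [1.0, 1.0]
U = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
b = [0.0, 0.0, 0.0]
probs = next_word_probs(context, paragraph, U, b)
```

In training, the word vectors, the paragraph vectors, `U` and `b` would all be updated to raise the probability of the observed word; only the prediction step is shown here.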
Step 4.3, calculating the Euclidean distance between the text paragraph representation vector of the patent and the text paragraph representation vector of each patent it cites:
Step 4.4, collecting and ranking the Euclidean distances of all |R| patent citation pairs in the patent citation network, and calculating the semantic novelty $S_i$ of the patent:
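The distance and ranking steps can be sketched as follows. The Euclidean distance is standard, but the aggregation into a novelty score is an assumption made purely for illustration, since the novelty formula itself is not reproduced in this text: here a patent's score is the mean percentile rank of the distances to the patents it cites. Patent identifiers and vectors are hypothetical.

```python
from bisect import bisect_right
from math import dist  # Python 3.8+


def citation_distances(vectors, citations):
    """Euclidean distance for every (citing, cited) patent pair."""
    return {(i, j): dist(vectors[i], vectors[j]) for i, j in citations}


def novelty_scores(distances):
    """ASSUMPTION: novelty = mean percentile rank, among all citation pairs,
    of the distances between a patent and the patents it cites."""
    ordered = sorted(distances.values())
    n = len(ordered)
    ranks = {}
    for (citing, _cited), d in distances.items():
        ranks.setdefault(citing, []).append(bisect_right(ordered, d) / n)
    return {p: sum(r) / len(r) for p, r in ranks.items()}


# Hypothetical paragraph vectors and citation pairs (citing, cited).
vectors = {"P1": [0.0, 0.0], "P2": [3.0, 4.0], "P3": [0.0, 1.0]}
citations = [("P1", "P2"), ("P1", "P3"), ("P2", "P3")]
distances = citation_distances(vectors, citations)
novelty = novelty_scores(distances)
```

Under this reading, a patent whose text lies far from the patents it cites receives a higher novelty score.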
Step 5, generating a node feature matrix X from the optimal indicator combination obtained in step 3 and the semantic novelty calculated in step 4, where $n_1 = |V|$; establishing a patent citation adjacency matrix A that stores the citation information between nodes; and using an encoder $\varepsilon$ to obtain the final node feature representation, comprising the following steps:
Step 5.1, inputting the node feature matrix X and obtaining the local representations of the nodes in the positive sample by integrating the neighborhood information of each target node with the graph convolutional network encoder $\varepsilon$; the information integration process is:
$$H^{l+1} = \sigma\left(\hat{D}^{-\frac{1}{2}} \hat{A} \hat{D}^{-\frac{1}{2}} H^{l} W^{l}\right)$$
where $\hat{A} = A + I$ is the adjacency matrix with self-connections; $\hat{D}$ is the degree matrix of $\hat{A}$; $H^l$ is the feature representation learned at layer l; $W^l$ is the learnable parameter of the l-th layer of the graph convolutional network; for the input layer, l = 0 and $H^0 = X$; and $\sigma$ is a nonlinear activation function;
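One propagation step of a graph convolutional layer can be written in a few lines. The sketch assumes the standard symmetric normalization with self-loops, an undirected (symmetric) adjacency matrix and ReLU as the activation; these choices and the toy matrices are assumptions, not details confirmed by the text.

```python
from math import sqrt


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, h, w):
    """One step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so that each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization of the adjacency matrix.
    norm = [[a_hat[i][j] / sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, h), w)
    return [[max(0.0, v) for v in row] for row in out]


# Two patents that cite each other, with 2-dimensional features.
adj = [[0, 1], [1, 0]]
x = [[1.0, 0.0], [0.0, 1.0]]
w = [[1.0, 0.0], [0.0, 1.0]]
h1 = gcn_layer(adj, x, w)
```

Each output row mixes a node's own features with those of its citation neighbors, which is the local representation referred to in step 5.1.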
Step 5.2, using a function that modifies the nodes of the network to obtain negative samples, and generating the node local representations $\tilde{h}_j$ of the negative samples with the same information integration method as in step 5.1;
Step 5.3, computing the global network representation s from the node local representations $h_i$ of the positive sample through the readout function R:
where N is the number of positive samples;
Step 5.5, minimizing the final loss function $L_n$ to update the final representation $h_i$ of each patent node in the positive sample:
where $N_n$ is the number of negative samples; $\tilde{h}_j$ is a negative sample representation; s is the global network representation; $E_{(\cdot)}[\cdot]$ denotes the expected value of the function $[\cdot]$; and $\log$ denotes the logarithm of the value of formula (10);
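The contrast between local and global representations can be sketched in the style of Deep Graph Infomax. Everything specific here is an assumption, because the corresponding formulas are not reproduced in this text: negatives are made by shuffling feature rows, the readout is a sigmoid of the mean node representation, the discriminator is bilinear, and the loss is a binary cross-entropy.

```python
import random
from math import exp, log


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def corrupt(x, seed=0):
    """ASSUMPTION: negative samples come from shuffling the feature rows."""
    rows = [row[:] for row in x]
    random.Random(seed).shuffle(rows)
    return rows


def readout(h):
    """ASSUMPTION: global summary = sigmoid of the mean node representation."""
    n = len(h)
    return [sigmoid(sum(col) / n) for col in zip(*h)]


def discriminator(h_i, s, w):
    """ASSUMPTION: bilinear score sigmoid(h_i^T W s) for a (local, global) pair."""
    ws = [sum(a * b for a, b in zip(row, s)) for row in w]
    return sigmoid(sum(a * b for a, b in zip(h_i, ws)))


def contrastive_loss(h_pos, h_neg, w):
    """Cross-entropy that rewards high scores for positive nodes and low
    scores for negative nodes, both paired with the global summary."""
    s = readout(h_pos)
    eps = 1e-12  # guards against log(0)
    pos = sum(log(discriminator(h, s, w) + eps) for h in h_pos)
    neg = sum(log(1.0 - discriminator(h, s, w) + eps) for h in h_neg)
    return -(pos + neg) / (len(h_pos) + len(h_neg))


# Hypothetical local representations of positive and negative nodes.
h_pos = [[1.0, 0.0], [0.8, 0.2]]
h_neg = [[-1.0, 0.0], [-0.8, -0.2]]
w = [[1.0, 0.0], [0.0, 1.0]]
loss = contrastive_loss(h_pos, h_neg, w)
```

In a full pipeline the negative representations would come from encoding `corrupt(x)` with the same graph encoder; the hand-written `h_neg` above only keeps the example short.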
Step 6, predicting the patent value; finally, the patent node representations are input into the XGBoost machine learning model to predict the value of the patent and obtain a graded prediction result $\hat{y}_i$; for a patent sample i, the final representation $h_i$ of the patent node is input to obtain the prediction, calculated as:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(h_i)$$
where $f_k$ is the k-th decision tree in the XGBoost model, K is the number of trees in the model, and $f_k(h_i)$ is the predicted value of patent sample i on the k-th tree.
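The additive form of the prediction can be illustrated without the XGBoost library itself. In the sketch below, two hand-written depth-1 trees stand in for the trees a trained model would learn; the helper names, thresholds and leaf values are hypothetical, and details such as the base score are left out.

```python
def make_stump(feature, threshold, left, right):
    """A depth-1 regression tree returning the `left` or `right` leaf value."""
    def tree(h):
        return left if h[feature] < threshold else right
    return tree


def predict(trees, h):
    """Additive prediction: the sum of the outputs of all K trees."""
    return sum(f(h) for f in trees)


# K = 2 hypothetical trees over a 2-dimensional node representation.
trees = [make_stump(0, 0.5, -0.2, 0.6), make_stump(1, 0.3, 0.1, 0.4)]
y_hat = predict(trees, [0.7, 0.2])
```

Each tree contributes one leaf value, and the graded prediction is obtained from their sum.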
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310027211.3A CN115983877A (en) | 2023-01-09 | 2023-01-09 | Patent value evaluation method based on depth map and semantic learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115983877A (en) | 2023-04-18 |
Family
ID=85963010
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310027211.3A Pending CN115983877A (en) | 2023-01-09 | 2023-01-09 | Patent value evaluation method based on depth map and semantic learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115983877A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116776868A (en) * | 2023-08-25 | 2023-09-19 | 北京知呱呱科技有限公司 | Evaluation method of model generation text and computer equipment |
CN116776868B (en) * | 2023-08-25 | 2023-11-03 | 北京知呱呱科技有限公司 | Evaluation method of model generation text and computer equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI689871B (en) | Gradient lifting decision tree (GBDT) model feature interpretation method and device | |
CN112948541B (en) | Financial news text emotional tendency analysis method based on graph convolution network | |
CN111080117A (en) | Method and device for constructing equipment risk label, electronic equipment and storage medium | |
CN115983877A (en) | Patent value evaluation method based on depth map and semantic learning | |
CN114169869A (en) | Attention mechanism-based post recommendation method and device | |
CN112527769B (en) | Automatic quality assurance framework for software change log generation method | |
Tiruneh et al. | Feature selection for construction organizational competencies impacting performance | |
Wang et al. | Evaluation of the survival of Yangtze finless porpoise under probabilistic hesitant fuzzy environment | |
CN111291189B (en) | Text processing method and device and computer readable storage medium | |
CN111105041B (en) | Machine learning method and device for intelligent data collision | |
CN113516189B (en) | Website malicious user prediction method based on two-stage random forest algorithm | |
CN115345248A (en) | Deep learning-oriented data depolarization method and device | |
Okagbue et al. | Predicting access mode of multidisciplinary and library and information sciences journals using machine learning | |
Lv et al. | An empirical study of factors influencing entrepreneurship using fuzzy logic: based on provincial panel data | |
Dubois et al. | Measuring the expertise of workers for crowdsourcing applications | |
CN113112166A (en) | Equipment state variable selection method and equipment based on gray fuzzy hierarchical analysis | |
Syafiandini et al. | Classification of Indonesian Government Budget Appropriations or Outlays for Research and Development (GBAORD) using decision tree and naive bayes | |
CN115470332B (en) | Intelligent question-answering system for content matching based on matching degree | |
CN117573814B (en) | Public opinion situation assessment method, device and system and storage medium | |
Omondiagbe et al. | Evaluating simple and complex models’ performance when predicting accepted answers on stack overflow | |
CN108376261B (en) | Tobacco classification method based on density and online semi-supervised learning | |
Anastasopoulos et al. | Computational text analysis for public management research | |
CN116956027A (en) | Employee portrait updating method, device, equipment and storage medium | |
Rahmawati et al. | Classification and Regression Trees (CART) Algorithm for Employee Selection | |
CN117556118A (en) | Visual recommendation system and method based on scientific research big data prediction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |