CN111813951A - Key point identification method based on technical map - Google Patents

Info

Publication number
CN111813951A
Authority
CN
China
Prior art keywords
technical
papers
centrality
key
indexes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010559077.8A
Other languages
Chinese (zh)
Inventor
华斌
宋平
陆启宇
张琪祁
赵三珊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Shanghai Electric Power Co Ltd
Original Assignee
State Grid Shanghai Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Shanghai Electric Power Co Ltd filed Critical State Grid Shanghai Electric Power Co Ltd
Priority to CN202010559077.8A priority Critical patent/CN111813951A/en
Publication of CN111813951A publication Critical patent/CN111813951A/en
Priority to PCT/CN2020/136036 priority patent/WO2021253758A1/en
Priority to AU2020327352A priority patent/AU2020327352B2/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a key point identification method based on a technical map, comprising the following steps: constructing a technical map; performing centrality calculations on the node data in the technical map to obtain key nodes; reducing the multi-dimensional technical indexes of the node data by principal component analysis; and analyzing the relation between the key nodes and the technical indexes to obtain the key nodes under different dimensions. Compared with the prior art, the method jointly considers network centrality indexes and bibliometric measures of scientific and technological resources, and thereby overcomes the one-sidedness and poor practical fit of the indexes used to identify key nodes in a technical map. It calculates the relevant indexes of the technical map quantitatively on the basis of complex network theory, which helps to identify key nodes more accurately, to discover trends or trend clues in technical research, and to provide decision support for scientific and technological innovation.

Description

Key point identification method based on technical map
Technical Field
The invention relates to a data processing method, in particular to a key point identification method based on a technical map.
Background
In a technical map network, identifying the key nodes, that is, the key technologies and hot-spot technologies, greatly assists the planning of science and technology work. Key nodes have traditionally been discussed in the context of centrality and node-importance evaluation in complex networks, where the statistical properties of the network are measured empirically. Identifying key nodes with a single measure or method is strongly one-sided: each measure or method reflects the status of a node in the network from only one aspect, which does not match the actual situation. In an era of rapid internet development, simple combinations of measures can no longer meet practical requirements, and higher accuracy is demanded of key point identification.
In particular, networks are now applied more widely and with greater practical significance, whereas measures derived purely from theory fit practice poorly, which reduces the accuracy of key node identification.
Disclosure of Invention
The invention aims to overcome the above defects of the prior art by providing a key point identification method based on a technical map, which solves the problem that the indexes used to identify key nodes in a technical map are one-sided and fit practice poorly.
The purpose of the invention can be realized by the following technical scheme:
A key point identification method based on a technical map comprises the following steps:
constructing a technical map;
performing centrality calculation on the node data in the technical map to obtain key nodes;
simplifying the technical indexes of multiple dimensions of the node data by adopting a principal component analysis method;
and analyzing the relation between the key nodes and the technical indexes to obtain the key nodes under different dimensions.
The technical map is constructed from the scientific and technological achievements of a plurality of websites and databases by extracting entities, relations and attributes and then performing knowledge fusion.
The websites and databases comprise at least one of the Tongfang Knowledge Network (CNKI), the national research network, a self-built resource library, research and development organization data, policy and regulation data, industry dynamic data, a patent database and an industry standard database.
The centrality comprises degree centrality, closeness centrality and betweenness centrality.
The dimensions of the technical indexes comprise a project-level dimension, a talent-level dimension and a scientific-research-achievement-level dimension.
The technical indexes of the project-level dimension comprise the total number of projects, fund project categories and research funding invested.
The technical indexes of the talent-level dimension comprise the average age of talents, the average educational level of talents and the number of talents.
In the scientific-research-achievement-level dimension, the achievements include papers, patents and other achievements.
The technical indexes related to papers comprise the total number of papers, the total citation count of papers, the number of core-journal papers, the total citation count of core-journal papers, the number of fund-supported papers, the total citation count of fund-supported papers, the proportion of core-journal papers, the average citations per paper, the average citations per core-journal paper, the average citations per fund-supported paper, and the H index; the technical indexes related to patents comprise the total number of patents and the number of invention patents; and the technical indexes related to other achievements comprise achievement awards, achievement appraisal results, the number of standards, and chief-editor or associate-chief-editor roles.
The relation between the key nodes and the technical indexes is analyzed by linear regression.
Compared with the prior art, the method jointly considers network centrality indexes and bibliometric measures of scientific and technological resources, and thereby overcomes the one-sidedness and poor practical fit of the indexes used to identify key nodes in a technical map. It calculates the relevant indexes of the technical map quantitatively on the basis of complex network theory, which helps to identify key nodes more accurately, to discover trends or trend clues in technical research, and to provide decision support for scientific and technological innovation.
Drawings
FIG. 1 is a flowchart of the key point identification method based on a technical map according to the present embodiment;
FIG. 2 is the technical map constructed in the present embodiment;
FIG. 3 is a graph showing the cumulative contribution rate of each evaluation index in the present embodiment.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. The present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation process are given, but the scope of the present invention is not limited to the following embodiments.
Examples
As shown in FIG. 1, a key point identification method based on a technical map includes the following steps:
1) Construction of the technical map
Metadata are acquired from the Tongfang Knowledge Network (CNKI), the national research network and the self-built resource base; external expert and research and development organization data, internal project and scientific and technological achievement data, policy and regulation data, industry dynamic data, patent data and industry standard data are added; entities, relations and attributes are extracted; entity disambiguation and coreference resolution are performed on the extracted information; an ontology is extracted; and the technical map is constructed, as shown in FIG. 2.
2) Key nodes are located using the statistical indexes of complex networks, namely the magnitudes of degree centrality, closeness centrality, betweenness centrality and similar indexes. Nodes with high betweenness centrality and high frequency are the key technologies of the field and represent the research hot spots of the period;
Degree centrality is the total number of direct connections between a node and the other nodes. Because the connections in the technical map are directed, it can be split into in-degree centrality and out-degree centrality. Considering in-degree and out-degree centrality together, the degree centrality of a node is calculated as:
$$C_D(u) = \sum_{v=1}^{n} X_{vu}$$
where u is a node, n is the number of nodes in the graph, and X_{vu} indicates whether nodes v and u are directly connected. Degree centrality is the most direct measure of a node's centrality in network analysis and reflects the cohesion of the node. The higher the degree centrality of a node, the more important the node is in the network;
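The degree centrality described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the edge list, the node names and the function name `degree_centrality` are hypothetical, and the total is taken here as in-degree plus out-degree, which is one possible reading of "considering both together".

```python
from collections import defaultdict

def degree_centrality(edges, nodes):
    """Return {node: (in_degree, out_degree, total)} for a directed graph.

    Assumption: the combined score is in-degree + out-degree.
    """
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for src, dst in set(edges):       # the indicator is 0/1, so drop duplicate edges
        if src == dst:
            continue                  # a self-loop is not a link to another node
        out_deg[src] += 1
        in_deg[dst] += 1
    return {u: (in_deg[u], out_deg[u], in_deg[u] + out_deg[u]) for u in nodes}

# Hypothetical four-technology map
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
scores = degree_centrality(edges, nodes)
print(scores["C"])  # (2, 1, 3): two incoming links, one outgoing link
```

Node C has the highest total here, so under this measure alone it would be the most central technology.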
Closeness centrality is the reciprocal of the sum of the shortest-path distances from a node to all other nodes; it reflects how close the node is to the other nodes in the network. The standardized closeness centrality of a node is calculated as:
$$C_C(u) = \frac{n-1}{\sum_{v \neq u} d(u, v)}$$
where u is a node, n is the number of nodes in the graph, and d(u, v) is the shortest-path distance between u and another node v. Because the connections in the technical map are directed, closeness centrality can be split into in-closeness centrality and out-closeness centrality. In-closeness centrality reflects the integration power of the node, and out-closeness centrality reflects its radiation power;
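A minimal sketch of standardized closeness centrality on a directed, unweighted graph follows. It assumes the score is (n - 1) divided by the sum of shortest-path distances, computes out-closeness on the graph as given and in-closeness on the reversed graph, and returns 0.0 when some node is unreachable. The adjacency dictionary, the helper names and the unreachable-node convention are assumptions made for illustration, not taken from the patent.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def closeness(adj, nodes, u):
    """(n - 1) / sum of distances from u; 0.0 if any node is unreachable."""
    dist = bfs_distances(adj, u)
    if len(dist) < len(nodes):
        return 0.0                     # assumed convention for unreachable nodes
    total = sum(d for v, d in dist.items() if v != u)
    return (len(nodes) - 1) / total if total else 0.0

def reverse(adj):
    """Flip every edge so that distances *to* a node can be measured."""
    rev = {}
    for src, targets in adj.items():
        for dst in targets:
            rev.setdefault(dst, []).append(src)
    return rev

# Hypothetical directed map
nodes = ["A", "B", "C", "D"]
adj = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": ["A"]}
out_closeness_a = closeness(adj, nodes, "A")           # distances 1, 2, 3
in_closeness_a = closeness(reverse(adj), nodes, "A")   # distances 1, 1, 2
print(out_closeness_a, in_closeness_a)                 # 0.5 0.75
```

In this toy graph node A is reached more easily than it reaches others, so its in-closeness exceeds its out-closeness.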
Betweenness centrality measures the number of shortest paths that pass through a node, that is, the number of times the node acts as a bridge on the shortest path between two other nodes. The betweenness centrality of a node is calculated as:
$$C_B(u) = \sum_{s \neq u \neq t} \frac{p(u)}{p}$$
where u is a node, p is the total number of shortest paths between nodes s and t, and p(u) is the number of those shortest paths that pass through node u. The more often a node acts as an intermediary, the greater its betweenness centrality, and the more it acts as a "traffic hub" in the network.
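The betweenness definition above, the share p(u)/p of shortest s-to-t paths that pass through u summed over all ordered pairs, can be sketched as follows. This is a direct, unoptimized illustration and not the patent's implementation; the graph and function names are hypothetical, and for large maps a dedicated algorithm such as Brandes' would normally be preferred.

```python
from collections import deque

def shortest_path_counts(adj, source):
    """BFS returning hop distance and number of shortest paths from source."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                count[nxt] = 0
                queue.append(nxt)
            if dist[nxt] == dist[node] + 1:
                count[nxt] += count[node]   # node is a predecessor on a shortest path
    return dist, count

def betweenness(adj, nodes):
    """Sum over ordered pairs (s, t) of p(u) / p for every other node u."""
    info = {s: shortest_path_counts(adj, s) for s in nodes}
    score = {u: 0.0 for u in nodes}
    for s in nodes:
        dist_s, count_s = info[s]
        for t in nodes:
            if t == s or t not in dist_s:
                continue
            for u in nodes:
                if u in (s, t) or u not in dist_s:
                    continue
                dist_u, count_u = info[u]
                # u lies on a shortest s-t path iff the distances add up
                if t in dist_u and dist_s[u] + dist_u[t] == dist_s[t]:
                    score[u] += count_s[u] * count_u[t] / count_s[t]
    return score

# Hypothetical directed map: A reaches D through B or C, and D leads to E
nodes = ["A", "B", "C", "D", "E"]
adj = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
score = betweenness(adj, nodes)
print(score)  # D scores highest: every path into E passes through it
```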
3) Technical indexes are derived from bibliometric measures of scientific and technological resources, covering two aspects: scientific research investment and scientific research achievements;
Scientific research investment is divided into research projects and the talent echelon. Research project indexes comprise the total number of projects, fund project categories and research funding invested; talent echelon indexes comprise the average age of talents, the average educational level of talents and the number of talents;
Scientific research achievements comprise papers, patents, standards, monographs and other achievements. The indexes considered for papers are the total number of papers, the total citation count of papers, the number of core-journal papers, the total citation count of core-journal papers, the number of fund-supported papers, the total citation count of fund-supported papers, the proportion of core-journal papers, the average citations per paper, the average citations per core-journal paper, the average citations per fund-supported paper, and the H index; the indexes for patents are the total number of patents and the number of invention patents; the indexes for other achievements are achievement awards and achievement appraisals, together with the number of standards, chief-editor or associate-chief-editor roles, and the like;
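Several of the paper-related indexes listed above are simple functions of per-paper citation counts. The sketch below uses the standard definition of the H index (the largest h such that h papers each have at least h citations); the dictionary keys, function names and the sample citation list are hypothetical and chosen only for illustration.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

def paper_indicators(citations):
    """A few paper-level indexes for one technology node."""
    total = len(citations)
    cited = sum(citations)
    return {
        "total_papers": total,
        "total_citations": cited,
        "citations_per_paper": cited / total if total else 0.0,
        "h_index": h_index(citations),
    }

# Hypothetical citation counts for the papers attached to one technology
indicators = paper_indicators([10, 8, 5, 4, 3, 0])
print(indicators)
# {'total_papers': 6, 'total_citations': 30, 'citations_per_paper': 5.0, 'h_index': 4}
```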
4) The multidimensional evaluation indexes defined in 2) and 3) are converted by principal component analysis into mutually independent composite evaluation indexes, which eliminates the correlation among the evaluation indexes and reduces the number of indexes needed to evaluate node criticality.
In this embodiment, a technical map is constructed from the co-occurrence relations of 200 technologies in scientific and technological data, and the criticality of the nodes is evaluated along the dimensions of network topology, project level, talent level and scientific research achievements. The 27 evaluation indexes are calculated for each technology to form a 200 × 27 matrix, and principal component analysis of this matrix yields the eigenvalues, contribution rates and cumulative contribution rates; the cumulative contribution rate is shown in FIG. 3:
As can be seen from the figure, the cumulative contribution rate of the first 5 principal components reaches 90.79%, so the first 5 principal components suffice to represent the information contained in the 27 evaluation indexes. Multiplying the evaluation index matrix by the original-index weight matrix of the first 5 principal components reduces the evaluation matrix to 200 × 5.
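The reduction step can be sketched with NumPy as below. This is a minimal illustration under assumptions the patent does not state: the indexes are standardized before analysis, the components are taken from the correlation matrix, and the number of components is chosen as the smallest count whose cumulative contribution rate reaches a threshold (0.90 here). The input matrix is synthetic random data standing in for the 200 × 27 evaluation matrix, so the number of components it yields is not meaningful.

```python
import numpy as np

def pca_reduce(X, threshold=0.90):
    """Project X onto the fewest principal components reaching the threshold."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize each index
    corr = np.corrcoef(Z, rowvar=False)                # index-by-index correlations
    eigvals, eigvecs = np.linalg.eigh(corr)            # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = eigvals / eigvals.sum()                    # contribution rate
    cumulative = np.cumsum(ratio)                      # cumulative contribution rate
    k = min(int(np.searchsorted(cumulative, threshold)) + 1, len(ratio))
    return Z @ eigvecs[:, :k], ratio[:k], cumulative

# Synthetic stand-in for the evaluation matrix: 200 nodes x 27 indexes
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 27))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 27))

reduced, kept, cumulative = pca_reduce(X)
print(reduced.shape, kept.round(4))
```

The returned `kept` array holds the contribution rates of the retained components, which is the quantity used as weights in a composite score.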
5) Using a linear regression expression in which the contribution rates of the first 5 principal components serve as their weights, a composite criticality value can be obtained for each node. Based on the result of 4), the composite function for evaluating node criticality is:
Z = 0.3284*y1 + 0.1531*y2 + 0.2157*y3 + 0.1196*y4 + 0.0911*y5
where y1 to y5 are the node's scores on the first 5 principal components; the five weights sum to 0.9079, matching the cumulative contribution rate of 90.79%.
The values calculated by this function are sorted to obtain the key nodes, which are marked in conspicuous colors in the network for easy identification. The method can also be applied to networks formed by other subjects, such as research fields, authors and research institutions, to identify the key nodes in those networks.
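The scoring and ranking step can be sketched as a weighted sum followed by a sort. The weights are the five coefficients of the function Z given in the text; the technology names and their principal component scores are hypothetical values invented for the example, as are the function names.

```python
# Coefficients of Z as given in the text (contribution rates of 5 components)
WEIGHTS = (0.3284, 0.1531, 0.2157, 0.1196, 0.0911)

def composite_score(components, weights=WEIGHTS):
    """Z = sum of weight_i * y_i over the retained principal components."""
    if len(components) != len(weights):
        raise ValueError("expected one score per principal component")
    return sum(w * y for w, y in zip(weights, components))

def rank_nodes(component_scores, top_k):
    """Return the top_k node names ordered by descending composite score."""
    ranked = sorted(component_scores,
                    key=lambda name: composite_score(component_scores[name]),
                    reverse=True)
    return ranked[:top_k]

# Hypothetical principal component scores for three technology nodes
component_scores = {
    "smart grid": (2.0, 0.5, 1.0, 0.0, -0.5),
    "energy storage": (1.0, 1.0, 1.0, 1.0, 1.0),
    "relay protection": (-1.0, 0.0, 0.5, 0.2, 0.0),
}
key_nodes = rank_nodes(component_scores, top_k=2)
print(key_nodes)  # ['energy storage', 'smart grid']
```

The nodes returned by `rank_nodes` are the ones that would be highlighted in the network view.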

Claims (10)

1. A key point identification method based on a technical map, characterized by comprising the following steps:
constructing a technical map;
performing centrality calculation on the node data in the technical map to obtain key nodes;
simplifying the technical indexes of multiple dimensions of the node data by adopting a principal component analysis method;
and analyzing the relation between the key nodes and the technical indexes to obtain the key nodes under different dimensions.
2. The key point identification method based on a technical map as claimed in claim 1, wherein the technical map is constructed from the scientific and technological achievements of a plurality of websites and databases by extracting entities, relations and attributes and performing knowledge fusion.
3. The method as claimed in claim 2, wherein the websites and databases include at least one of the Tongfang Knowledge Network (CNKI), the national research network, a self-built resource base, research and development institution data, policy and regulation data, industry dynamic data, a patent database, and an industry standard database.
4. The key point identification method based on a technical map as claimed in claim 1, wherein the centrality comprises degree centrality, closeness centrality and betweenness centrality.
5. The key point identification method based on a technical map as claimed in claim 1, wherein the dimensions of the technical indexes include a project-level dimension, a talent-level dimension, and a scientific-research-achievement-level dimension.
6. The method as claimed in claim 5, wherein the technical indexes of the project-level dimension include the total number of projects, fund project categories and research funding invested.
7. The method as claimed in claim 5, wherein the technical indexes of the talent-level dimension include the average age of talents, the average educational level of talents and the number of talents.
8. The method as claimed in claim 5, wherein the achievements in the scientific-research-achievement-level dimension include papers, patents, and other achievements.
9. The method according to claim 8, wherein the technical indexes related to papers include the total number of papers, the total citation count of papers, the number of core-journal papers, the total citation count of core-journal papers, the number of fund-supported papers, the total citation count of fund-supported papers, the proportion of core-journal papers, the average citations per paper, the average citations per core-journal paper, the average citations per fund-supported paper, and the H index; the technical indexes related to patents include the total number of patents and the number of invention patents; and the technical indexes related to other achievements include achievement awards, achievement appraisal results, the number of standards, and chief-editor or associate-chief-editor roles.
10. The method of claim 1, wherein the relation between the key nodes and the technical indexes is analyzed by linear regression.
CN202010559077.8A 2020-06-18 2020-06-18 Key point identification method based on technical map Pending CN111813951A (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202010559077.8A CN111813951A (en) 2020-06-18 2020-06-18 Key point identification method based on technical map
PCT/CN2020/136036 WO2021253758A1 (en) 2020-06-18 2020-12-14 Key node identification method based on technology graph
AU2020327352A AU2020327352B2 (en) 2020-06-18 2020-12-14 Key node identification method based on technology graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010559077.8A CN111813951A (en) 2020-06-18 2020-06-18 Key point identification method based on technical map

Publications (1)

Publication Number Publication Date
CN111813951A true CN111813951A (en) 2020-10-23

Family

ID=72845160

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010559077.8A Pending CN111813951A (en) 2020-06-18 2020-06-18 Key point identification method based on technical map

Country Status (3)

Country Link
CN (1) CN111813951A (en)
AU (1) AU2020327352B2 (en)
WO (1) WO2021253758A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021253758A1 (en) * 2020-06-18 2021-12-23 国网上海市电力公司 Key node identification method based on technology graph
WO2023207013A1 (en) * 2022-04-26 2023-11-02 广州广电运通金融电子股份有限公司 Graph embedding-based relational graph key personnel analysis method and system

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114417837B (en) * 2022-01-19 2024-02-13 合肥工业大学 Scientific and technological big data popularity and frontier measurement method based on subject evolution trend
CN114567562B (en) * 2022-03-01 2024-02-06 重庆邮电大学 Method for identifying key nodes of coupling network of power grid and communication network
CN116595192B (en) * 2023-05-18 2023-11-21 中国科学技术信息研究所 Technological front information acquisition method and device, electronic equipment and readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106295692A (en) * 2016-08-05 2017-01-04 北京航空航天大学 Product initial failure root primordium recognition methods based on dimensionality reduction Yu support vector machine
CN109446342A (en) * 2018-10-30 2019-03-08 沈阳师范大学 A kind of education of middle and primary schools knowledge mapping analysis method and system based on He Ximan index
CN110490331A (en) * 2019-08-23 2019-11-22 北京明略软件***有限公司 The processing method and processing device of knowledge mapping interior joint
WO2020048058A1 (en) * 2018-09-03 2020-03-12 平安科技(深圳)有限公司 Fund knowledge inference method and system, computer device, and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2008338259A1 (en) * 2007-12-17 2009-06-25 Leximancer Pty Ltd Methods for determining a path through concept nodes
CN110032665B (en) * 2019-03-25 2023-11-17 创新先进技术有限公司 Method and device for determining graph node vector in relational network graph
CN111813951A (en) * 2020-06-18 2020-10-23 国网上海市电力公司 Key point identification method based on technical map

Also Published As

Publication number Publication date
AU2020327352B2 (en) 2023-01-05
AU2020327352A1 (en) 2022-01-20
WO2021253758A1 (en) 2021-12-23

Similar Documents

Publication Publication Date Title
CN111813951A (en) Key point identification method based on technical map
JP4920023B2 (en) Inter-object competition index calculation method and system
US20190272329A1 (en) Statistical process control and analytics for translation supply chain operational management
US20060106755A1 (en) Tracking usage of data elements in electronic business communications
Ji et al. Complexity analysis approach for prefabricated construction products using uncertain data clustering
CN106056287A (en) Equipment and method for carrying out data quality evaluation on data set based on context
CN105868956A (en) Data processing method and device
Reda et al. Towards a data quality assessment in big data
Yanhui et al. A comparative study of first and all-author bibliographic coupling analysis based on Scientometrics
CN111143394A (en) Knowledge data processing method, knowledge data processing device, knowledge data processing medium and electronic equipment
Qureshi et al. OpenRank–a novel approach to rank universities using objective and publicly verifiable data sources
CN107798137B (en) A kind of multi-source heterogeneous data fusion architecture system based on additive models
CN113610626A (en) Bank credit risk identification knowledge graph construction method and device, computer equipment and computer readable storage medium
Shi et al. [Retracted] Research on Fast Recommendation Algorithm of Library Personalized Information Based on Density Clustering
CN115934963B (en) Commercial draft big data analysis method and application map for enterprise finance acquisition
Chen et al. [Retracted] Credibility Analysis of Accounting Cloud Service Based on Complex Network
CN115827994A (en) Data processing method, device, equipment and storage medium
Liu et al. Application of master data classification model in enterprises
US6823294B1 (en) Method and system for measuring circuit design capability
Li et al. Research on optimization of process parameters of traditional Chinese medicine based on data mining technology
Soheili et al. An evaluation of information behaviour studies through the Scholarly Capital Model
Wang et al. A data quality improvement method based on the greedy algorithm
Qu et al. Research on identification of key processes in machining process based on PageRank algorithm
Sikdar et al. On the effectiveness of the scientific peer-review system: a case study of the Journal of High Energy Physics
KR102276448B1 (en) An invention pattern analysis system using patent classification codes and method of analyzing invention patterns using the patent classification code

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination