CN109977419B - Knowledge graph construction system - Google Patents
- Publication number
- CN109977419B (application CN201910280117.2A)
- Authority
- CN
- China
- Prior art keywords
- module
- information
- knowledge
- data
- communication connection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A knowledge graph construction system comprises an information input module, a conversion module, an extraction module, a word segmentation module, a filtering module, a triple identification module, a central processing unit, a knowledge graph generation and storage module, and a database module. The information input module is communicatively connected to the conversion module; the extraction module is communicatively connected to the conversion module and the word segmentation module; the filtering module is communicatively connected to the word segmentation module and the central processing unit; the database module is communicatively connected to the central processing unit; the triple identification module is communicatively connected to the central processing unit and the database module; and the knowledge graph generation and storage module is communicatively connected to the central processing unit. The invention makes it easy to input newly updated knowledge and combine it with existing knowledge to generate a new knowledge graph, thereby facilitating the dissemination and exchange of knowledge.
Description
Technical Field
The invention relates to the technical field of knowledge graph construction, and in particular to a knowledge graph construction system.
Background
A knowledge graph is also called a scientific knowledge graph. In library and information science it is known as knowledge domain visualization or knowledge domain mapping: a family of graphs that show how knowledge develops and how it is structured. It uses visualization techniques to describe knowledge resources and their carriers, and to mine, analyze, construct, draw and display knowledge and the relationships among them. Knowledge graph construction is currently at the core of applying artificial intelligence in specific industries. Existing knowledge graphs are built mainly from established textbooks and literature through secondary processing and editing. As science and technology develop, knowledge in every field is updated and extended very quickly. Only a small part of this knowledge can be queried and browsed on encyclopedia websites, and it exists there as unstructured and semi-structured data. Most updated knowledge must first be added to books and documents, and exchanging it in that form is inconvenient and easily causes information lag. Presenting the updated knowledge of each field as a knowledge graph makes it easier to browse related information and exchange knowledge, but existing construction methods require a large investment of labor and time, so construction is inefficient and costly.
Disclosure of Invention
(I) Object of the invention
To solve the technical problems described in the background, the invention provides a knowledge graph construction system that makes it easy to input newly updated knowledge and combine it with existing knowledge to generate a new knowledge graph, thereby facilitating the dissemination and exchange of knowledge.
(II) Technical solution
To solve the above problems, the invention provides a knowledge graph construction system comprising an information input module, a conversion module, an extraction module, a word segmentation module, a filtering module, a triple identification module, a central processing unit, a knowledge graph generation and storage module, and a database module;
the information input module is communicatively connected to the conversion module, is used for inputting information A, and sends information A to the conversion module; the conversion module is used for converting information A into structured data B;
the extraction module is communicatively connected to the conversion module and the word segmentation module; the extraction module is used for extracting the structured data B and sending the extracted structured data B to the word segmentation module; the word segmentation module is used for segmenting the structured data B to obtain a plurality of text content fragments C;
the filtering module is communicatively connected to the word segmentation module and to the central processing unit; the filtering module is used for filtering the obtained text content fragments C to obtain entity information D of a plurality of key entities, and for sending the entity information D of the key entities to the central processing unit;
the database module is communicatively connected to the central processing unit and is used for storing all structured data in the domain of information A;
the triple identification module is communicatively connected to the central processing unit and the database module; the triple identification module is used for fusing the entity information D of the key entities with the data in the database module according to entity relationships to generate a new data structure E;
the knowledge graph generation and storage module is communicatively connected to the central processing unit and is used for acquiring the new data structure E, generating a corresponding knowledge graph F, and storing the knowledge graph F.
Preferably, the information A input through the information input module comprises structured data, unstructured data and semi-structured data.
Preferably, the system further comprises a data review module; the data review module is used for automatically acquiring unstructured data and semi-structured data in the domain of information A and converting the acquired unstructured data and semi-structured data into structured data; the data review module is communicatively connected to the database module and sends the resulting structured data to the database module.
Preferably, the system further comprises a display module; the display module is communicatively connected to the central processing unit and is used for displaying the generated knowledge graph F.
Preferably, the triple identification module comprises a first identification matching unit, a second identification matching unit, a third identification matching unit and a determining unit;
the first identification matching unit is used for matching at least one keyword in a text content fragment C against the domain topics in all structured data in the domain of information A, to determine the domain topic of that text content fragment;
the second identification matching unit is used for matching, according to a preset rule, at least one keyword in the entity information D of the key entities against the corresponding models included in all structured data in the domain of information A, and for determining the model that matches the entity information D and the knowledge element instances of the entity information D;
the third identification matching unit is used for matching, according to a preset rule, at least one keyword in the entity information D of the key entities against the attributes of the corresponding model, and for determining the attributes of the knowledge element instances in the entity information D;
and the determining unit is used for determining the association relationships among the knowledge element instances by combining the attributes of the knowledge element instances in the entity information D of the key entities.
Preferably, the method of using the knowledge graph construction system comprises the following steps:
S1, information A is input into the information input module;
S2, the conversion module converts information A into structured data B;
S3, the extraction module extracts the structured data B and sends the extracted structured data B to the word segmentation module;
S4, the word segmentation module segments the structured data B to obtain a plurality of text content fragments C;
S5, the filtering module filters the obtained text content fragments C to obtain entity information D of a plurality of key entities, and sends the entity information D of the key entities to the central processing unit;
S6, the triple identification module fuses the entity information D of the key entities with the data in the database module according to entity relationships to generate a new data structure E;
S7, the knowledge graph generation and storage module generates a corresponding knowledge graph F from the new data structure E and stores the knowledge graph F;
S8, the display module displays the generated knowledge graph F.
The technical solution of the invention has the following beneficial effects: newly updated knowledge in a given field, including unstructured and semi-structured data, is input through the information input module, and all input knowledge is converted into structured data; the structured data is extracted and segmented; the entity information of the key entities obtained after segmentation is fused with the existing data according to entity relationships, finally generating a knowledge graph that reflects the latest knowledge. The invention thus makes it easy to input newly updated knowledge and combine it with existing knowledge to generate a new knowledge graph, facilitating the dissemination and exchange of knowledge.
Drawings
FIG. 1 is a schematic block diagram of the knowledge graph construction system according to the present invention.
FIG. 2 is a flow chart of the method of using the knowledge graph construction system according to the present invention.
Reference numerals: 1. information input module; 2. conversion module; 3. extraction module; 4. word segmentation module; 5. filtering module; 6. triple identification module; 7. central processing unit; 8. knowledge graph generation and storage module; 9. database module; 10. data review module.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It is to be understood that these descriptions are only illustrative and are not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
As shown in FIG. 1, the knowledge graph construction system provided by the invention comprises an information input module 1, a conversion module 2, an extraction module 3, a word segmentation module 4, a filtering module 5, a triple identification module 6, a central processing unit 7, a knowledge graph generation and storage module 8, and a database module 9;
the information input module 1 is communicatively connected to the conversion module 2; the information input module 1 is used for inputting information A and sends information A to the conversion module 2; the conversion module 2 is used for converting information A into structured data B;
the extraction module 3 is communicatively connected to the conversion module 2 and to the word segmentation module 4; the extraction module 3 is used for extracting the structured data B and sending the extracted structured data B to the word segmentation module 4; the word segmentation module 4 is used for segmenting the structured data B to obtain a plurality of text content fragments C;
the filtering module 5 is communicatively connected to the word segmentation module 4 and to the central processing unit 7; the filtering module 5 is used for filtering the obtained text content fragments C to obtain entity information D of a plurality of key entities, and for sending the entity information D of the key entities to the central processing unit 7;
the database module 9 is communicatively connected to the central processing unit 7 and is used for storing all structured data in the domain of information A;
the triple identification module 6 is communicatively connected to the central processing unit 7 and to the database module 9; the triple identification module 6 is used for fusing the entity information D of the key entities with the data in the database module 9 according to entity relationships to generate a new data structure E;
the knowledge graph generation and storage module 8 is communicatively connected to the central processing unit 7 and is used for acquiring the new data structure E, generating a corresponding knowledge graph F, and storing the knowledge graph F.
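The word segmentation and filtering stages (modules 4 and 5) can be pictured with a minimal Python sketch. The patent does not specify how segmentation or filtering is performed, so everything here is an assumption made for illustration: the regular-expression tokenizer (Chinese text would need a dedicated word segmenter), the `STOP_WORDS` list, the domain `vocabulary`, and the names `segment`, `filter_entities` and `Entity`.

```python
import re
from dataclasses import dataclass

# Hypothetical stop-word list; the patent does not define the filtering criteria.
STOP_WORDS = {"the", "a", "an", "is", "of", "in", "and", "by"}


@dataclass(frozen=True)
class Entity:
    name: str          # normalized key-entity name (part of entity information D)
    source_field: str  # field of structured data B where it was found


def segment(structured_b):
    """Module 4 (sketch): split every text field of B into (field, fragment) pairs C."""
    fragments = []
    for field_name, text in structured_b.items():
        for token in re.findall(r"[A-Za-z0-9]+", text):
            fragments.append((field_name, token))
    return fragments


def filter_entities(fragments, vocabulary):
    """Module 5 (sketch): keep each in-vocabulary, non-stop-word fragment once."""
    seen = set()
    entities = []
    for field_name, token in fragments:
        key = token.lower()
        if key in STOP_WORDS or key not in vocabulary or key in seen:
            continue
        seen.add(key)
        entities.append(Entity(name=key, source_field=field_name))
    return entities


structured_b = {"title": "Aspirin", "body": "Aspirin is an inhibitor of COX1"}
fragments_c = segment(structured_b)
entities_d = filter_entities(fragments_c, {"aspirin", "cox1", "inhibitor"})
print([e.name for e in entities_d])
```

In this sketch the filter deduplicates entities and records the field each one came from, so a later stage can trace an entity back to its source; the real system may filter on entirely different criteria.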
In an optional embodiment, the information A input through the information input module 1 comprises structured data, unstructured data and semi-structured data.
It should be noted that when information A is already structured data, it can proceed directly to the subsequent operations without conversion.
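The pass-through behavior for structured input can be sketched as follows. This is only an illustration under stated assumptions: the patent does not say how the three kinds of data are distinguished, so the sketch assumes that structured data arrives as a Python `dict`, semi-structured data as a JSON object string, and unstructured data as plain text. The function name `convert` and the `"text"` wrapper key are hypothetical.

```python
import json


def convert(info_a):
    """Conversion module (sketch): return structured data B for information A."""
    if isinstance(info_a, dict):
        # Already structured: no conversion needed, pass through unchanged.
        return info_a
    if isinstance(info_a, str):
        try:
            parsed = json.loads(info_a)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, dict):
            # Semi-structured input (assumed here to be a JSON object string).
            return parsed
        # Unstructured input: wrap the raw text in a single-field record.
        return {"text": info_a}
    raise TypeError(f"unsupported input type: {type(info_a).__name__}")


record = {"title": "Aspirin"}
print(convert(record) is record)           # structured input is returned as-is
print(convert('{"title": "Aspirin"}'))     # semi-structured input is parsed
print(convert("Aspirin inhibits COX1"))    # unstructured input is wrapped
```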
In an optional embodiment, the system further comprises a data review module 10; the data review module 10 is used for automatically acquiring unstructured data and semi-structured data in the domain of information A and converting the acquired unstructured data and semi-structured data into structured data; the data review module 10 is communicatively connected to the database module 9 and sends the resulting structured data to the database module 9.
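As a rough illustration of how a data review module might feed the database module, the sketch below parses semi-structured `key: value` lines into rows and stores them in an in-memory SQLite table. The patent specifies neither the acquisition mechanism nor the storage schema, so the `facts(subject, predicate, object)` table, the `key: value` input format and the function name `review_to_rows` are all assumptions.

```python
import sqlite3


def review_to_rows(subject, raw):
    """Data review module (sketch): turn 'key: value' lines into structured rows."""
    rows = []
    for line in raw.splitlines():
        key, sep, value = line.partition(":")
        # Lines without a separator or with an empty side are skipped.
        if sep and key.strip() and value.strip():
            rows.append((subject, key.strip().lower(), value.strip()))
    return rows


# Database module (sketch): an in-memory SQLite table with an assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (subject TEXT, predicate TEXT, object TEXT, "
    "UNIQUE(subject, predicate, object))"
)

raw = "Formula: C9H8O4\nClass: NSAID\nthis line has no separator"
rows = review_to_rows("aspirin", raw)
# INSERT OR IGNORE keeps repeated acquisitions from creating duplicate rows.
conn.executemany("INSERT OR IGNORE INTO facts VALUES (?, ?, ?)", rows)
conn.commit()

stored = conn.execute(
    "SELECT subject, predicate, object FROM facts ORDER BY predicate"
).fetchall()
print(stored)
```

The uniqueness constraint is a design choice of this sketch: because the module acquires data automatically and repeatedly, ignoring duplicates on insert keeps the stored data stable across runs.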
In an optional embodiment, the system further comprises a display module; the display module is in communication connection with the central processing unit 7 and is used for displaying the generated knowledge graph F.
In an optional embodiment, the triple identification module 6 comprises a first identification matching unit, a second identification matching unit, a third identification matching unit and a determining unit;
the first identification matching unit is used for matching at least one keyword in a text content fragment C against the domain topics in all structured data in the domain of information A, to determine the domain topic of that text content fragment;
the second identification matching unit is used for matching, according to a preset rule, at least one keyword in the entity information D of the key entities against the corresponding models included in all structured data in the domain of information A, and for determining the model that matches the entity information D and the knowledge element instances of the entity information D;
the third identification matching unit is used for matching, according to a preset rule, at least one keyword in the entity information D of the key entities against the attributes of the corresponding model, and for determining the attributes of the knowledge element instances in the entity information D;
and the determining unit is used for determining the association relationships among the knowledge element instances by combining the attributes of the knowledge element instances in the entity information D of the key entities.
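The four units can be sketched in Python as four small functions. The "preset rule" is not defined in the patent; this sketch assumes plain keyword-set membership, and further assumes that two instances are associated when an attribute value of one names the other. The `TOPICS` and `MODELS` tables, the `Instance` class and all function names are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical domain data that would come from the database module.
TOPICS = {"pharmacology": {"drug", "enzyme", "inhibits"}}
MODELS = {
    "Drug": {"keywords": {"aspirin"}, "attributes": {"inhibits"}},
    "Enzyme": {"keywords": {"cox1"}, "attributes": set()},
}


@dataclass
class Instance:
    name: str
    model: str
    attributes: dict = field(default_factory=dict)


def match_topic(keywords):
    """First unit (sketch): pick the first topic sharing a keyword with the fragment."""
    for topic, vocab in TOPICS.items():
        if vocab & set(keywords):
            return topic
    return None


def match_model(keyword):
    """Second unit (sketch): find the model for a keyword and create an instance."""
    for model, spec in MODELS.items():
        if keyword in spec["keywords"]:
            return Instance(name=keyword, model=model)
    return None


def match_attributes(instance, pairs):
    """Third unit (sketch): keep only attributes that the matched model declares."""
    allowed = MODELS[instance.model]["attributes"]
    for attr, value in pairs:
        if attr in allowed:
            instance.attributes[attr] = value


def determine_relations(instances):
    """Determining unit (sketch): link instances whose attribute value names another."""
    names = {inst.name for inst in instances}
    return [
        (inst.name, attr, value)
        for inst in instances
        for attr, value in inst.attributes.items()
        if value in names
    ]


keywords = ["aspirin", "inhibits", "cox1"]
topic = match_topic(keywords)
instances = [inst for kw in keywords if (inst := match_model(kw)) is not None]
match_attributes(instances[0], [("inhibits", "cox1"), ("color", "white")])
triples = determine_relations(instances)
print(topic, triples)
```

Here the undeclared `color` attribute is dropped by the third unit, and the determining unit emits a single (subject, predicate, object) triple linking the two instances.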
As shown in FIG. 2, the method of using the knowledge graph construction system provided by the invention comprises the following steps:
S1, information A is input into the information input module 1;
S2, the conversion module 2 converts information A into structured data B;
S3, the extraction module 3 extracts the structured data B and sends the extracted structured data B to the word segmentation module 4;
S4, the word segmentation module 4 segments the structured data B to obtain a plurality of text content fragments C;
S5, the filtering module 5 filters the obtained text content fragments C to obtain entity information D of a plurality of key entities, and sends the entity information D of the key entities to the central processing unit 7;
S6, the triple identification module 6 fuses the entity information D of the key entities with the data in the database module 9 according to entity relationships to generate a new data structure E;
S7, the knowledge graph generation and storage module 8 generates a corresponding knowledge graph F from the new data structure E and stores the knowledge graph F;
S8, the display module displays the generated knowledge graph F.
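Steps S6 to S8 can be illustrated with a short Python sketch: newly identified triples are merged with stored ones, the merged set is turned into a graph, the graph is serialized for storage, and a text rendering stands in for the display module. The representation of data structure E as a list of (subject, predicate, object) tuples, the adjacency-mapping form of knowledge graph F, JSON as the storage format, and the names `fuse`, `build_graph` and `render` are assumptions, not details taken from the patent.

```python
import json
from collections import defaultdict


def fuse(existing, new):
    """S6 (sketch): merge new triples into existing ones, dropping duplicates."""
    merged = list(existing)
    seen = set(existing)
    for triple in new:
        if triple not in seen:
            seen.add(triple)
            merged.append(triple)
    return merged


def build_graph(triples):
    """S7 (sketch): build an adjacency mapping subject -> [[predicate, object], ...]."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append([predicate, obj])
    return dict(graph)


def render(graph):
    """S8 (sketch): plain-text stand-in for the display module."""
    return "\n".join(
        f"{subject} --{predicate}--> {obj}"
        for subject in sorted(graph)
        for predicate, obj in graph[subject]
    )


existing = [("aspirin", "class", "NSAID")]            # data from the database module
new = [("aspirin", "inhibits", "cox1"), ("aspirin", "class", "NSAID")]

data_e = fuse(existing, new)                          # new data structure E
graph_f = build_graph(data_e)                         # knowledge graph F
stored = json.dumps(graph_f, sort_keys=True)          # stored form of F
print(render(graph_f))
```

Keeping the stored form as JSON means the graph can be reloaded with `json.loads` and compared directly with the in-memory mapping.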
It is to be understood that the above embodiments merely illustrate or explain the principles of the invention and are not to be construed as limiting it. Therefore, any modification, equivalent substitution or improvement made without departing from the spirit and scope of the invention shall fall within its scope of protection. Further, the appended claims are intended to cover all variations and modifications that fall within the scope and boundaries of the claims, or the equivalents of such scope and boundaries.
Claims (4)
1. A knowledge graph construction system, characterized by comprising an information input module (1), a conversion module (2), an extraction module (3), a word segmentation module (4), a filtering module (5), a triple identification module (6), a central processing unit (7), a knowledge graph generation and storage module (8) and a database module (9);
the information input module (1) is communicatively connected to the conversion module (2), the information input module (1) is used for inputting information A, and the information input module (1) sends information A to the conversion module (2); the conversion module (2) is used for converting information A into structured data B;
the extraction module (3) is communicatively connected to the conversion module (2), and the extraction module (3) is communicatively connected to the word segmentation module (4); the extraction module (3) is used for extracting the structured data B and sending the extracted structured data B to the word segmentation module (4); the word segmentation module (4) is used for segmenting the structured data B to obtain a plurality of text content fragments C;
the filtering module (5) is communicatively connected to the word segmentation module (4), and the filtering module (5) is communicatively connected to the central processing unit (7); the filtering module (5) is used for filtering the obtained text content fragments C to obtain entity information D of a plurality of key entities; the filtering module (5) is used for sending the entity information D of the key entities to the central processing unit (7);
the database module (9) is communicatively connected to the central processing unit (7), and the database module (9) is used for storing all structured data in the domain of information A;
the triple identification module (6) is communicatively connected to the central processing unit (7), and the triple identification module (6) is communicatively connected to the database module (9); the triple identification module (6) is used for fusing the entity information D of the key entities with the data in the database module (9) according to entity relationships to generate a new data structure E; the triple identification module (6) comprises a first identification matching unit, a second identification matching unit, a third identification matching unit and a determining unit;
the first identification matching unit is used for matching at least one keyword in a text content fragment C against the domain topics in all structured data in the domain of information A, to determine the domain topic of that text content fragment;
the second identification matching unit is used for matching, according to a preset rule, at least one keyword in the entity information D of the key entities against the corresponding models included in all structured data in the domain of information A, and for determining the model that matches the entity information D and the knowledge element instances of the entity information D;
the third identification matching unit is used for matching, according to a preset rule, at least one keyword in the entity information D of the key entities against the attributes of the corresponding model, and for determining the attributes of the knowledge element instances in the entity information D;
the determining unit is used for determining the association relationships among the knowledge element instances by combining the attributes of the knowledge element instances in the entity information D of the key entities;
the knowledge graph generation and storage module (8) is communicatively connected to the central processing unit (7), and the knowledge graph generation and storage module (8) is used for acquiring the new data structure E, generating a corresponding knowledge graph F and storing the knowledge graph F;
the method of using the knowledge graph construction system comprises the following steps:
S1, information A is input into the information input module (1);
S2, the conversion module (2) converts information A into structured data B;
S3, the extraction module (3) extracts the structured data B and sends the extracted structured data B to the word segmentation module (4);
S4, the word segmentation module (4) segments the structured data B to obtain a plurality of text content fragments C;
S5, the filtering module (5) filters the obtained text content fragments C to obtain entity information D of a plurality of key entities; the filtering module (5) sends the entity information D of the key entities to the central processing unit (7);
S6, the triple identification module (6) fuses the entity information D of the key entities with the data in the database module (9) according to entity relationships to generate a new data structure E;
S7, the knowledge graph generation and storage module (8) generates a corresponding knowledge graph F from the new data structure E and stores the knowledge graph F;
S8, a display module displays the generated knowledge graph F.
2. The knowledge graph construction system according to claim 1, wherein the information A input through the information input module (1) comprises structured data, unstructured data and semi-structured data.
3. The knowledge graph construction system according to claim 1, further comprising a data review module (10); the data review module (10) is used for automatically acquiring unstructured data and semi-structured data in the domain of information A and converting the acquired unstructured data and semi-structured data into structured data; the data review module (10) is communicatively connected to the database module (9), and the data review module (10) sends the resulting structured data to the database module (9).
4. The knowledge graph construction system according to claim 1, further comprising a display module; the display module is communicatively connected to the central processing unit (7) and is used for displaying the generated knowledge graph F.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910280117.2A CN109977419B (en) | 2019-04-09 | 2019-04-09 | Knowledge graph construction system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910280117.2A CN109977419B (en) | 2019-04-09 | 2019-04-09 | Knowledge graph construction system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109977419A (en) | 2019-07-05 |
CN109977419B (en) | 2023-04-07 |
Family
ID=67083639
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910280117.2A Active CN109977419B (en) | 2019-04-09 | 2019-04-09 | Knowledge graph construction system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109977419B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112559704A (en) * | 2020-12-08 | 2021-03-26 | 北京航天云路有限公司 | Knowledge graph generation tool configured by user-defined |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108694177A (en) * | 2017-04-06 | 2018-10-23 | 北大方正集团有限公司 | Knowledge mapping construction method and system |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106777274B (en) * | 2016-06-16 | 2018-05-29 | 北京理工大学 | A kind of Chinese tour field knowledge mapping construction method and system |
CN106168965B (en) * | 2016-07-01 | 2020-06-30 | 竹间智能科技(上海)有限公司 | Knowledge graph construction system |
CN108595494B (en) * | 2018-03-15 | 2022-05-20 | 腾讯科技(深圳)有限公司 | Method and device for acquiring reply information |
CN108595708A (en) * | 2018-05-10 | 2018-09-28 | 北京航空航天大学 | A kind of exception information file classification method of knowledge based collection of illustrative plates |
Non-Patent Citations (1)
Title |
---|
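Research on Chinese Knowledge Graph Construction Methods Based on Multiple Data Sources; Hu Fanghuai; China Doctoral Dissertations Full-text Database; 2015-05-15; main text pp. 1-130 * |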
Also Published As
Publication number | Publication date |
---|---|
CN109977419A (en) | 2019-07-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102426609B (en) | Index generation method and index generation device based on MapReduce programming architecture | |
CN105468605A (en) | Entity information map generation method and device | |
CN111522927B (en) | Entity query method and device based on knowledge graph | |
WO2019153685A1 (en) | Text processing method, apparatus, computer device and storage medium | |
CN111708938B (en) | Method, apparatus, electronic device, and storage medium for information processing | |
US10223471B2 (en) | Web pages processing | |
CN112560468B (en) | Meteorological early warning text processing method, related device and computer program product | |
CN111259160A (en) | Knowledge graph construction method, device, equipment and storage medium | |
CN108847957A (en) | It was found that the method and system with presentation network application access information | |
US20150278248A1 (en) | Personal Information Management Service System | |
US20140280352A1 (en) | Processing semi-structured data | |
US20150379112A1 (en) | Creating an on-line job function ontology | |
CN114595686A (en) | Knowledge extraction method, and training method and device of knowledge extraction model | |
CN107391650B (en) | A kind of structuring method for splitting of document, apparatus and system | |
CN116245177A (en) | Geographic environment knowledge graph automatic construction method and system and readable storage medium | |
CN109977419B (en) | Knowledge graph construction system | |
CN114064923A (en) | Data processing method and device, electronic equipment and storage medium | |
JP2022091686A (en) | Data annotation method, device, electronic apparatus and storage medium | |
EP3564833B1 (en) | Method and device for identifying main picture in web page | |
CN108846134A (en) | A kind of O&M scheme recommender system and method based on web crawlers | |
CN116431828A (en) | Construction method of power grid center data asset knowledge graph database constructed based on neural network technology | |
CN116467433A (en) | Knowledge graph visualization method, device, equipment and medium for multi-source data | |
US9661086B2 (en) | Incorporation of content from an external followed user within a social networking system | |
CN114547477A (en) | Data processing method and device, electronic equipment and storage medium | |
CN114281884A (en) | Method for extracting subject knowledge submodel of knowledge graph |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 20220909; Address after: 361000 units 1702 and 1703, No. 59, Chengyi North Street, phase III, software park, Xiamen, Fujian; Applicant after: XIAMEN USEEAR INFORMATION TECHNOLOGY Co.,Ltd.; Address before: Unit 1701, unit 1704, No. 59, Chengyi North Street, phase III, software park, Xiamen City, Fujian Province, 361000; Applicant before: FUJIAN QIDIAN SPACE-TIME DIGITAL TECHNOLOGY Co.,Ltd. |
GR01 | Patent grant | ||