CN109977419B - Knowledge graph construction system - Google Patents

Info

Publication number
CN109977419B
Authority
CN
China
Prior art keywords
module
information
knowledge
data
communication connection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910280117.2A
Other languages
Chinese (zh)
Other versions
CN109977419A (en)
Inventor
张晶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Useear Information Technology Co ltd
Original Assignee
Xiamen Useear Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Useear Information Technology Co ltd
Priority to CN201910280117.2A
Publication of CN109977419A
Application granted
Publication of CN109977419B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A knowledge graph construction system comprises an information input module, a conversion module, an extraction module, a word segmentation module, a filtering module, a triple identification module, a central processing unit, a knowledge graph generation and storage module, and a database module. The information input module is communicatively connected to the conversion module; the extraction module is communicatively connected to the conversion module and the word segmentation module; the filtering module is communicatively connected to the word segmentation module and the central processing unit; the database module is communicatively connected to the central processing unit; the triple identification module is communicatively connected to the central processing unit and the database module; and the knowledge graph generation and storage module is communicatively connected to the central processing unit. The invention makes it easy to input newly updated knowledge and combine it with existing knowledge to generate a new knowledge graph, thereby facilitating the dissemination and exchange of knowledge.

Description

Knowledge graph construction system
Technical Field
The invention relates to the technical field of knowledge graph construction, in particular to a knowledge graph construction system.
Background
A knowledge graph is also called a scientific knowledge graph. In library and information science it is known as knowledge domain visualization or knowledge domain mapping: a series of graphs that show how knowledge develops and how it is structured. It uses visualization techniques to describe knowledge resources and their carriers, and to mine, analyze, construct, draw and display knowledge and the relationships among these resources and carriers. Constructing knowledge graphs is currently central to applying artificial intelligence in specific industries. Existing knowledge graphs are mainly derived from existing textbooks and literature through secondary processing and editing. As science and technology advance, knowledge in every field is updated and extended very quickly. Only a small part of this knowledge can be queried and browsed on encyclopedia websites, and it takes the form of unstructured and semi-structured data. Most updated knowledge must first be added to books and documents, which are an inconvenient medium for exchange, so information easily lags behind. Presenting the updated knowledge of each field as a knowledge graph makes it easier for people to browse related information and exchange knowledge. However, constructing a knowledge graph currently requires a large investment of labor and time, so construction is inefficient and costly.
Disclosure of Invention
(I) Objects of the invention
To solve the technical problems described in the Background, the invention provides a knowledge graph construction system that makes it easy to input newly updated knowledge and combine it with existing knowledge to generate a new knowledge graph, thereby facilitating the dissemination and exchange of knowledge.
(II) Technical solution
To solve the above problems, the invention provides a knowledge graph construction system comprising an information input module, a conversion module, an extraction module, a word segmentation module, a filtering module, a triple identification module, a central processing unit, a knowledge graph generation and storage module, and a database module;
the information input module is communicatively connected to the conversion module, receives input information A, and sends information A to the conversion module; the conversion module converts information A into structured data B;
the extraction module is communicatively connected to the conversion module and the word segmentation module; the extraction module extracts the structured data B and sends the extracted structured data B to the word segmentation module; the word segmentation module segments the structured data B into words and obtains a plurality of text content fragments C;
the filtering module is communicatively connected to the word segmentation module and the central processing unit; the filtering module filters the text content fragments C to obtain entity information D of a plurality of key entities, and sends the entity information D of the key entities to the central processing unit;
the database module is communicatively connected to the central processing unit and stores all structured data in the field of information A;
the triple identification module is communicatively connected to the central processing unit and the database module; the triple identification module fuses the entity information D of the key entities with the data in the database module according to entity relationships to generate a new data structure E;
the knowledge graph generation and storage module is communicatively connected to the central processing unit; it acquires the new data structure E, generates the corresponding knowledge graph F, and stores the knowledge graph F.
Preferably, the information A input through the information input module comprises structured data, unstructured data and semi-structured data.
Preferably, the system further comprises a data review module; the data review module automatically acquires unstructured data and semi-structured data in the field of information A and converts them into structured data; the data review module is communicatively connected to the database module and sends the resulting structured data to the database module.
Preferably, the system further comprises a display module; the display module is communicatively connected to the central processing unit and displays the generated knowledge graph F.
Preferably, the triple identification module comprises a first identification matching unit, a second identification matching unit, a third identification matching unit and a determination unit;
the first identification matching unit matches at least one keyword in a text content fragment against the domain topics in all structured data in the field of information A, to determine the domain topic of that text content fragment;
the second identification matching unit matches, according to a preset rule, at least one keyword in the entity information D of the key entities against the corresponding models contained in all structured data in the field of information A, and determines the model that matches the entity information D and the knowledge element instances in the entity information D;
the third identification matching unit matches, according to a preset rule, at least one keyword in the entity information D of the key entities against the attributes of the corresponding model, and determines the attributes of the knowledge element instances in the entity information D;
and the determination unit determines the association relationships among the knowledge element instances based on the attributes of the knowledge element instances in the entity information D of the key entities.
Preferably, the knowledge graph construction system is used through the following steps:
S1, information A is input into the information input module;
S2, the conversion module converts information A into structured data B;
S3, the extraction module extracts the structured data B and sends the extracted structured data B to the word segmentation module;
S4, the word segmentation module segments the structured data B into words and obtains a plurality of text content fragments C;
S5, the filtering module filters the text content fragments C to obtain entity information D of a plurality of key entities, and sends the entity information D of the key entities to the central processing unit;
S6, the triple identification module fuses the entity information D of the key entities with the data in the database module according to entity relationships to generate a new data structure E;
S7, the knowledge graph generation and storage module generates the corresponding knowledge graph F from the new data structure E and stores the knowledge graph F;
and S8, the display module displays the generated knowledge graph F.
The technical solution of the invention has the following beneficial effects: newly updated knowledge in a given field, including unstructured and semi-structured data, is input through the information input module, and all input knowledge is converted into structured data; the structured data is extracted and segmented into words; the entity information of the key entities obtained after word segmentation is fused with the existing data according to entity relationships; and a knowledge graph reflecting the latest knowledge is finally generated. The invention thus makes it easy to input newly updated knowledge and combine it with existing knowledge to generate a new knowledge graph, thereby facilitating the dissemination and exchange of knowledge.
Drawings
FIG. 1 is a schematic block diagram of the knowledge graph construction system according to the present invention.
FIG. 2 is a flow chart of the method of using the knowledge graph construction system of the present invention.
Reference numerals: 1. information input module; 2. conversion module; 3. extraction module; 4. word segmentation module; 5. filtering module; 6. triple identification module; 7. central processing unit; 8. knowledge graph generation and storage module; 9. database module; 10. data review module.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It is to be understood that these descriptions are only illustrative and are not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
As shown in FIG. 1, the knowledge graph construction system provided by the invention comprises an information input module 1, a conversion module 2, an extraction module 3, a word segmentation module 4, a filtering module 5, a triple identification module 6, a central processing unit 7, a knowledge graph generation and storage module 8, and a database module 9;
the information input module 1 is communicatively connected to the conversion module 2, receives input information A, and sends information A to the conversion module 2; the conversion module 2 converts information A into structured data B;
the extraction module 3 is communicatively connected to the conversion module 2 and to the word segmentation module 4; the extraction module 3 extracts the structured data B and sends the extracted structured data B to the word segmentation module 4; the word segmentation module 4 segments the structured data B into words and obtains a plurality of text content fragments C;
the filtering module 5 is communicatively connected to the word segmentation module 4 and to the central processing unit 7; the filtering module 5 filters the text content fragments C to obtain entity information D of a plurality of key entities, and sends the entity information D of the key entities to the central processing unit 7;
the database module 9 is communicatively connected to the central processing unit 7 and stores all structured data in the field of information A;
the triple identification module 6 is communicatively connected to the central processing unit 7 and to the database module 9; the triple identification module 6 fuses the entity information D of the key entities with the data in the database module 9 according to entity relationships to generate a new data structure E;
the knowledge graph generation and storage module 8 is communicatively connected to the central processing unit 7; it acquires the new data structure E, generates the corresponding knowledge graph F, and stores the knowledge graph F.
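The description does not say how the word segmentation module 4 or the filtering module 5 work internally. The following is a minimal sketch of the step from structured data B to fragments C to entity information D, assuming regular-expression tokenization and a filter built from a stopword list and a domain vocabulary. The names `STOPWORDS`, `DOMAIN_ENTITIES`, `segment` and `filter_entities` are hypothetical and do not come from the patent.

```python
import re

# Hypothetical stopword list and domain vocabulary; the patent specifies neither.
STOPWORDS = {"the", "a", "an", "is", "of", "and", "in", "by", "was"}
DOMAIN_ENTITIES = {"aspirin", "fever", "headache"}


def segment(structured_text: str) -> list[str]:
    """Word segmentation module: split text taken from structured data B into fragments C."""
    return re.findall(r"[a-z0-9]+", structured_text.lower())


def filter_entities(fragments: list[str]) -> list[str]:
    """Filtering module: keep fragments that look like key entities (entity information D)."""
    seen = set()
    entities = []
    for token in fragments:
        # Drop stopwords and anything outside the assumed domain vocabulary.
        if token in STOPWORDS or token not in DOMAIN_ENTITIES:
            continue
        if token not in seen:  # keep first-seen order, drop repeats
            seen.add(token)
            entities.append(token)
    return entities


fragments_c = segment("Aspirin is used in the treatment of fever and headache.")
entities_d = filter_entities(fragments_c)
```

For Chinese input, a dictionary-based or model-based segmenter would have to replace the regular expression, because Chinese text does not mark word boundaries with spaces.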
In an alternative embodiment, the information A input through the information input module 1 comprises structured data, unstructured data and semi-structured data.
It should be noted that when information A is already structured data, it can undergo the subsequent operations directly, without conversion.
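The pass-through behaviour for structured input can be illustrated with a small sketch of the conversion module. It assumes that structured data is a list of records, that semi-structured input arrives as a JSON string, and that unstructured input is plain text. These representations and the function name `convert` are illustrative assumptions, not details given in the patent.

```python
import json


def convert(information_a):
    """Conversion module sketch: return structured data B as a list of dict records."""
    # Structured input (already a list of records) passes through unchanged.
    if isinstance(information_a, list) and all(isinstance(r, dict) for r in information_a):
        return information_a
    if isinstance(information_a, str):
        # Semi-structured input: assumed here to be a JSON document.
        try:
            parsed = json.loads(information_a)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, dict):
            return [parsed]
        if isinstance(parsed, list):
            return [r for r in parsed if isinstance(r, dict)]
        # Unstructured input: wrap each non-empty line as a text record.
        return [
            {"text": line.strip()}
            for line in information_a.splitlines()
            if line.strip()
        ]
    raise TypeError("unsupported input type")
```

Returning one record shape for all three input kinds means the downstream extraction module needs no knowledge of where the data came from.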
In an optional embodiment, the system further comprises a data review module 10; the data review module 10 automatically acquires unstructured data and semi-structured data in the field of information A and converts them into structured data; the data review module 10 is communicatively connected to the database module 9 and sends the resulting structured data to the database module 9.
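As a rough illustration of the hand-off from the data review module 10 to the database module 9, the sketch below parses semi-structured "key: value" text into a record and stores it as rows in an in-memory SQLite database. The automatic acquisition step (for example, crawling) is left out, and the table layout, the input format and the function names are assumptions made for this example only.

```python
import sqlite3


def parse_semi_structured(text):
    """Turn 'key: value' lines into one structured record (a dict)."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            record[key.strip().lower()] = value.strip()
    return record


def store_record(conn, entity, record):
    """Write each field of the record as an (entity, attribute, value) row."""
    conn.executemany(
        "INSERT INTO facts (entity, attribute, value) VALUES (?, ?, ?)",
        [(entity, k, v) for k, v in record.items() if k != "name"],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory stand-in for the database module
conn.execute("CREATE TABLE facts (entity TEXT, attribute TEXT, value TEXT)")
record = parse_semi_structured("Name: Aspirin\nTreats: fever\nClass: NSAID")
store_record(conn, record["name"], record)
rows = conn.execute(
    "SELECT entity, attribute, value FROM facts ORDER BY attribute"
).fetchall()
```

Storing facts as entity, attribute and value rows keeps the database content close to the triple form that the triple identification module later consumes.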
In an optional embodiment, the system further comprises a display module; the display module is communicatively connected to the central processing unit 7 and displays the generated knowledge graph F.
In an alternative embodiment, the triple identification module 6 comprises a first identification matching unit, a second identification matching unit, a third identification matching unit and a determination unit;
the first identification matching unit matches at least one keyword in a text content fragment against the domain topics in all structured data in the field of information A, to determine the domain topic of that text content fragment;
the second identification matching unit matches, according to a preset rule, at least one keyword in the entity information D of the key entities against the corresponding models contained in all structured data in the field of information A, and determines the model that matches the entity information D and the knowledge element instances in the entity information D;
the third identification matching unit matches, according to a preset rule, at least one keyword in the entity information D of the key entities against the attributes of the corresponding model, and determines the attributes of the knowledge element instances in the entity information D;
and the determination unit determines the association relationships among the knowledge element instances based on the attributes of the knowledge element instances in the entity information D of the key entities.
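The four units can be sketched as four small functions. The patent leaves the "preset rule" open, so this example assumes the simplest possible rule, exact set membership, and uses a tiny hypothetical schema (`TOPICS`, `MODELS`) in place of the structured data held by the database module. It is an illustration of the data flow, not the patented matching logic.

```python
# Hypothetical domain schema standing in for "all structured data in the field of information A".
TOPICS = {"medicine": {"drug", "symptom", "dose"}}
MODELS = {
    "Drug": {"instances": {"aspirin"}, "attributes": {"treats"}},
    "Symptom": {"instances": {"fever"}, "attributes": set()},
}


def match_topic(keywords):
    """First unit: choose the topic that shares the most keywords with the fragment."""
    best = max(TOPICS, key=lambda t: len(TOPICS[t] & set(keywords)))
    return best if TOPICS[best] & set(keywords) else None


def match_model(entity):
    """Second unit: find the model whose known instances contain the entity."""
    for name, model in MODELS.items():
        if entity in model["instances"]:
            return name
    return None


def match_attributes(model_name, keywords):
    """Third unit: keep the keywords that are attributes of the matched model."""
    return sorted(MODELS[model_name]["attributes"] & set(keywords))


def determine_relations(entities, keywords):
    """Determination unit: link each entity that has a matched attribute to every other typed entity."""
    typed = [(e, match_model(e)) for e in entities]
    typed = [(e, m) for e, m in typed if m is not None]
    triples = []
    for subject, model in typed:
        for attribute in match_attributes(model, keywords):
            for obj, _ in typed:
                if obj != subject:
                    triples.append((subject, attribute, obj))
    return triples


keywords = ["aspirin", "treats", "fever", "drug"]
topic = match_topic(keywords)
triples = determine_relations(["aspirin", "fever"], keywords)
```

Each resulting tuple has the form (subject, relation, object), which is the triple shape that gives the module its name.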
As shown in FIG. 2, the knowledge graph construction system provided by the invention is used through the following steps:
S1, information A is input into the information input module 1;
S2, the conversion module 2 converts information A into structured data B;
S3, the extraction module 3 extracts the structured data B and sends the extracted structured data B to the word segmentation module 4;
S4, the word segmentation module 4 segments the structured data B into words and obtains a plurality of text content fragments C;
S5, the filtering module 5 filters the text content fragments C to obtain entity information D of a plurality of key entities, and sends the entity information D of the key entities to the central processing unit 7;
S6, the triple identification module 6 fuses the entity information D of the key entities with the data in the database module 9 according to entity relationships to generate a new data structure E;
S7, the knowledge graph generation and storage module 8 generates the corresponding knowledge graph F from the new data structure E and stores the knowledge graph F;
and S8, the display module displays the generated knowledge graph F.
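Steps S6 to S8 can be sketched as follows, assuming that both the new entity information and the database content are already available as (subject, relation, object) triples. Fusion is reduced here to case-insensitive de-duplication, the graph is a plain adjacency mapping, storage is JSON serialisation, and display is a text rendering. All of these choices and the function names are assumptions for illustration; the patent does not prescribe them.

```python
import json
from collections import defaultdict


def fuse(new_triples, database_triples):
    """S6: merge new triples with stored ones, comparing names case-insensitively."""
    combined = list(database_triples) + list(new_triples)
    return sorted({(s.lower(), p.lower(), o.lower()) for s, p, o in combined})


def build_graph(triples):
    """S7: turn data structure E into an adjacency mapping (knowledge graph F)."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append({"relation": p, "target": o})
    return dict(graph)


def store(graph):
    """S7: serialise graph F; a real system would write this to persistent storage."""
    return json.dumps(graph, sort_keys=True)


def display(graph):
    """S8: render graph F as one text line per edge."""
    return [
        f"{s} -[{e['relation']}]-> {e['target']}"
        for s in sorted(graph)
        for e in graph[s]
    ]


database = [("Aspirin", "treats", "Headache")]
new = [("aspirin", "treats", "fever"), ("aspirin", "treats", "headache")]
structure_e = fuse(new, database)
graph_f = build_graph(structure_e)
lines = display(graph_f)
```

In this example the new triple about headache duplicates a stored one, so the fused structure E contains two edges rather than three.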
It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modifications, equivalents, improvements and the like which are made without departing from the spirit and scope of the present invention shall be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundary of the appended claims, or the equivalents of such scope and boundary.

Claims (4)

1. A knowledge graph construction system, characterized by comprising an information input module (1), a conversion module (2), an extraction module (3), a word segmentation module (4), a filtering module (5), a triple identification module (6), a central processing unit (7), a knowledge graph generation and storage module (8) and a database module (9);
the information input module (1) is communicatively connected to the conversion module (2), receives input information A, and sends information A to the conversion module (2); the conversion module (2) converts information A into structured data B;
the extraction module (3) is communicatively connected to the conversion module (2) and to the word segmentation module (4); the extraction module (3) extracts the structured data B and sends the extracted structured data B to the word segmentation module (4); the word segmentation module (4) segments the structured data B into words and obtains a plurality of text content fragments C;
the filtering module (5) is communicatively connected to the word segmentation module (4) and to the central processing unit (7); the filtering module (5) filters the text content fragments C to obtain entity information D of a plurality of key entities, and sends the entity information D of the key entities to the central processing unit (7);
the database module (9) is communicatively connected to the central processing unit (7) and stores all structured data in the field of information A;
the triple identification module (6) is communicatively connected to the central processing unit (7) and to the database module (9); the triple identification module (6) fuses the entity information D of the key entities with the data in the database module (9) according to entity relationships to generate a new data structure E; the triple identification module (6) comprises a first identification matching unit, a second identification matching unit, a third identification matching unit and a determination unit;
the first identification matching unit matches at least one keyword in a text content fragment against the domain topics in all structured data in the field of information A, to determine the domain topic of that text content fragment;
the second identification matching unit matches, according to a preset rule, at least one keyword in the entity information D of the key entities against the corresponding models contained in all structured data in the field of information A, and determines the model that matches the entity information D and the knowledge element instances in the entity information D;
the third identification matching unit matches, according to a preset rule, at least one keyword in the entity information D of the key entities against the attributes of the corresponding model, and determines the attributes of the knowledge element instances in the entity information D;
the determination unit determines the association relationships among the knowledge element instances based on the attributes of the knowledge element instances in the entity information D of the key entities;
the knowledge graph generation and storage module (8) is communicatively connected to the central processing unit (7); it acquires the new data structure E, generates the corresponding knowledge graph F, and stores the knowledge graph F;
the knowledge graph construction system is used through the following steps:
S1, information A is input into the information input module (1);
S2, the conversion module (2) converts information A into structured data B;
S3, the extraction module (3) extracts the structured data B and sends the extracted structured data B to the word segmentation module (4);
S4, the word segmentation module (4) segments the structured data B into words and obtains a plurality of text content fragments C;
S5, the filtering module (5) filters the text content fragments C to obtain entity information D of a plurality of key entities, and sends the entity information D of the key entities to the central processing unit (7);
S6, the triple identification module (6) fuses the entity information D of the key entities with the data in the database module (9) according to entity relationships to generate a new data structure E;
S7, the knowledge graph generation and storage module (8) generates the corresponding knowledge graph F from the new data structure E and stores the knowledge graph F;
and S8, a display module displays the generated knowledge graph F.
2. The knowledge graph construction system according to claim 1, wherein the information A input through the information input module (1) comprises structured data, unstructured data and semi-structured data.
3. The knowledge graph construction system according to claim 1, further comprising a data review module (10); the data review module (10) automatically acquires unstructured data and semi-structured data in the field of information A and converts them into structured data; the data review module (10) is communicatively connected to the database module (9) and sends the resulting structured data to the database module (9).
4. The knowledge graph construction system according to claim 1, further comprising a display module; the display module is communicatively connected to the central processing unit (7) and displays the generated knowledge graph F.
CN201910280117.2A 2019-04-09 2019-04-09 Knowledge graph construction system Active CN109977419B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910280117.2A CN109977419B (en) 2019-04-09 2019-04-09 Knowledge graph construction system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910280117.2A CN109977419B (en) 2019-04-09 2019-04-09 Knowledge graph construction system

Publications (2)

Publication Number Publication Date
CN109977419A CN109977419A (en) 2019-07-05
CN109977419B CN109977419B (en) 2023-04-07

Family

ID=67083639

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910280117.2A Active CN109977419B (en) 2019-04-09 2019-04-09 Knowledge graph construction system

Country Status (1)

Country Link
CN (1) CN109977419B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559704A (en) * 2020-12-08 2021-03-26 北京航天云路有限公司 Knowledge graph generation tool configured by user-defined

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108694177A (en) * 2017-04-06 2018-10-23 北大方正集团有限公司 Knowledge mapping construction method and system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106777274B (en) * 2016-06-16 2018-05-29 北京理工大学 A kind of Chinese tour field knowledge mapping construction method and system
CN106168965B (en) * 2016-07-01 2020-06-30 竹间智能科技(上海)有限公司 Knowledge graph construction system
CN108595494B (en) * 2018-03-15 2022-05-20 腾讯科技(深圳)有限公司 Method and device for acquiring reply information
CN108595708A (en) * 2018-05-10 2018-09-28 北京航空航天大学 A kind of exception information file classification method of knowledge based collection of illustrative plates

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
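Research on Methods for Constructing Chinese Knowledge Graphs from Multiple Data Sources; Hu Fanghuai (胡芳槐); China Doctoral Dissertations Full-text Database; 2015-05-15; main text pp. 1-130 *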

Also Published As

Publication number Publication date
CN109977419A (en) 2019-07-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220909

Address after: 361000 units 1702 and 1703, No. 59, Chengyi North Street, phase III, software park, Xiamen, Fujian

Applicant after: XIAMEN USEEAR INFORMATION TECHNOLOGY Co.,Ltd.

Address before: Unit 1701, unit 1704, No. 59, Chengyi North Street, phase III, software park, Xiamen City, Fujian Province, 361000

Applicant before: FUJIAN QIDIAN SPACE-TIME DIGITAL TECHNOLOGY Co.,Ltd.

GR01 Patent grant