CN116701650A - Knowledge graph construction method and device and readable storage medium - Google Patents
- Publication number
- CN116701650A (application number CN202310671560.9A)
- Authority
- CN
- China
- Prior art keywords
- knowledge
- entities
- entity
- knowledge graph
- sentence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
Abstract
The application provides a knowledge graph construction method, a knowledge graph construction device and a readable storage medium, wherein the knowledge graph construction method comprises the following steps: acquiring knowledge materials related to mobile network optimization; constructing ontology models of various entity types based on the knowledge materials; extracting entities and relations among the entities in each sentence in the knowledge material based on the ontology model to obtain triplet information for representing the relations among the entities; and taking the entity in the triplet information as a vertex, and taking the relation as an edge to construct a knowledge graph. The method, the device and the readable storage medium can solve the problem that the existing knowledge graph construction method cannot be directly applied to the specific field, especially cannot be directly applied to the field of mobile network optimization.
Description
Technical Field
The present application relates to the field of mobile network optimization, and in particular, to a method and apparatus for constructing a knowledge graph, and a readable storage medium.
Background
In the field of mobile network optimization, one of the core technologies for intelligently diagnosing the network problems that mobile network users encounter is the construction of a knowledge graph for problem diagnosis. However, existing knowledge graph construction methods cannot be applied directly to the construction of a knowledge graph in a specific field, especially in mobile network optimization, where the optimization work is heavy and diagnosing the cause of users' network problems depends strongly on expert experience. A knowledge graph construction method for the field of mobile network optimization is therefore needed.
Disclosure of Invention
The technical problem to be solved by the application is to provide a knowledge graph construction method, a device and a readable storage medium aiming at the defects of the prior art, so as to solve the problem that the prior knowledge graph construction method cannot be directly applied to the specific field, especially cannot be directly applied to the field of mobile network optimization.
In a first aspect, the present application provides a method for constructing a knowledge graph, where the method includes:
acquiring knowledge materials related to mobile network optimization;
constructing ontology models of various entity types based on the knowledge materials;
extracting entities and relations among the entities in each sentence in the knowledge material based on the ontology model to obtain triplet information for representing the relations among the entities;
and taking the entity in the triplet information as a vertex, and taking the relation as an edge to construct a knowledge graph.
Further, the constructing of ontology models of various entity types based on the knowledge materials specifically includes:
constructing, based on the knowledge materials, an ontology model comprising six entity types: user experience, intermediate cause, apparent cause, primary cause, root cause and first-line action.
Further, the user experience is used to infer the intermediate cause, the intermediate cause includes two subclasses, namely the apparent cause and the primary cause, the intermediate cause is used to infer the root cause, and the root cause is used to assign the first-line action.
Further, the extracting the entities and the relationships between the entities in each sentence in the knowledge material based on the ontology model to obtain the triplet information for representing the relationships between the entities specifically includes:
based on the ontology model, performing named entity recognition with a sequence labeling model to extract the entities in each sentence in the knowledge material; and
based on the extracted entities, determining the category of the relationship among the entities with the relation classification model R-BERT, to obtain the triplet information for representing the relationships among the entities.
Further, the identifying of named entities based on the ontology model by using a sequence labeling model to extract entities in each sentence in the knowledge material specifically includes:
inputting each sentence in the knowledge material into a sequence labeling model, which treats the subject and the object of each sentence as entities and labels them as a head entity and a tail entity respectively;
and identifying the head entity and the tail entity by adopting a named entity identification method based on sequence labeling based on the ontology model so as to extract the entities in each sentence in the knowledge material.
Further, after extracting the entities and the relationships between the entities in each sentence in the knowledge material based on the ontology model to obtain the triplet information for representing the relationships between the entities, the method further includes:
and saving the extracted entities and relations to a Neo4J graph database.
Further, after the entity in the triplet information is taken as a vertex, the relation is taken as an edge, and the knowledge graph is constructed, the method further comprises:
building a visual display interface and a query function interface for the knowledge graph based on the progressive framework VUE, so as to realize the display and retrieval of the knowledge graph.
In a second aspect, the present application provides a knowledge graph construction apparatus, including:
the knowledge material acquisition module is used for acquiring knowledge materials related to mobile network optimization;
the ontology model construction module is connected with the knowledge material acquisition module and used for constructing ontology models of various entity types based on the knowledge materials;
the entity relation extracting module is connected with the ontology model constructing module and is used for extracting the entities in each sentence in the knowledge material and the relation among the entities based on the ontology model to obtain triple information for representing the relation among the entities;
and the knowledge graph construction module is connected with the entity relation extraction module and is used for constructing a knowledge graph by taking the entity in the triplet information as a vertex and the relation as an edge.
In a third aspect, the present application provides a knowledge graph construction apparatus, comprising a memory and a processor, the memory storing a computer program, the processor being configured to run the computer program to implement the knowledge graph construction method according to the first aspect.
In a fourth aspect, the present application provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the knowledge graph construction method according to the first aspect.
The application provides a knowledge graph construction method, a knowledge graph construction device and a readable storage medium. Firstly, acquiring knowledge materials related to mobile network optimization; then constructing ontology models of various entity types based on the knowledge materials; extracting entities and relations among the entities in each sentence in the knowledge material based on the ontology model to obtain triple information for representing the relations among the entities; and finally, taking the entity in the triplet information as a vertex, taking the relation as an edge, and constructing a knowledge graph. The application can realize the construction of the complete knowledge graph aiming at the mobile network optimization field, and solves the problem that the existing knowledge graph construction method cannot be directly applied to the specific field, especially the mobile network optimization field.
Drawings
FIG. 1 is a flow chart of a knowledge graph construction method in embodiment 1 of the present application;
FIG. 2 is a schematic diagram of an embodiment of an ontology model;
FIG. 3 is a schematic diagram of a knowledge graph construction framework according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a knowledge graph construction device according to embodiment 2 of the present application;
FIG. 5 is a schematic structural diagram of a knowledge graph construction device according to embodiment 3 of the present application.
Detailed Description
In order to make the technical scheme of the present application better understood by those skilled in the art, the following detailed description of the embodiments of the present application will be given with reference to the accompanying drawings.
It is to be understood that the specific embodiments and figures described herein are merely illustrative of the application, and are not limiting of the application.
It is to be understood that the various embodiments of the application and the features of the embodiments may be combined with each other without conflict.
It is to be understood that only the portions relevant to the present application are shown in the drawings for convenience of description, and the portions irrelevant to the present application are not shown in the drawings.
It should be understood that each unit and module in the embodiments of the present application may correspond to only one physical structure, may be formed by a plurality of physical structures, or may be integrated into one physical structure.
It will be appreciated that, without conflict, the functions and steps noted in the flowcharts and block diagrams of the present application may occur out of the order noted in the figures.
It is to be understood that the flowcharts and block diagrams of the present application illustrate the architecture, functionality, and operation of possible implementations of systems, apparatuses, devices, methods according to various embodiments of the present application. Where each block in the flowchart or block diagrams may represent a unit, module, segment, code, or the like, which comprises executable instructions for implementing the specified functions. Moreover, each block or combination of blocks in the block diagrams and flowchart illustrations can be implemented by hardware-based systems that perform the specified functions, or by combinations of hardware and computer instructions.
It should be understood that the units and modules related in the embodiments of the present application may be implemented by software, or may be implemented by hardware, for example, the units and modules may be located in a processor.
Summary of the application
At present, mobile network optimization relies mainly on the experience and judgment of network optimization experts, and tasks such as diagnosing the root cause of a problem and dispatching operations are carried out manually. This traditional mode of network diagnosis suffers from long cycles, high cost and low efficiency. With the development of artificial intelligence and the successful application of technologies such as deep learning in image recognition, speech processing, board games and other fields, intelligent diagnosis of communication network problems based on deep learning, artificial intelligence and related technologies has become a new research hotspot.
In the field of mobile network optimization, one of the core technologies for intelligently diagnosing the network problems that mobile network users encounter is the construction of a knowledge graph for problem diagnosis. The main knowledge graph technologies are knowledge collection, knowledge extraction, knowledge fusion and knowledge reasoning. Knowledge collection integrates existing knowledge materials into formats and types that can be recognized and processed. Knowledge extraction identifies key information in various types of data, mines the core content hidden in the data, and constructs entities, relations and attributes. Classical knowledge extraction methods include methods based on rules and templates, methods based on statistical data analysis, and methods based on machine learning. Knowledge fusion mainly resolves inconsistency and ambiguity of concepts and relations, and addresses the conflicting definitions, duplicated content, unresolved references and disordered hierarchies that arise when local knowledge bases are merged into one overall knowledge base. The knowledge fusion process uses techniques such as clustering, similarity analysis and probabilistic statistical analysis, and finally yields a compact, clear and complete global knowledge base. The goal of knowledge reasoning is to derive new knowledge or related conclusions; common methods include reasoning based on description logic, on graph structure, on statistical rules and on probabilistic logic.
However, existing knowledge graph construction methods cannot be applied directly to the construction of a knowledge graph in a specific field, especially in mobile network optimization, where the optimization work is heavy and diagnosing the cause of users' network problems depends strongly on expert experience. A knowledge graph construction method for the field of mobile network optimization is therefore needed.
Aiming at the technical problems, the application provides a knowledge graph construction method, a knowledge graph construction device and a readable storage medium, wherein the knowledge graph construction method, the knowledge graph construction device and the readable storage medium firstly acquire knowledge materials related to mobile network optimization; then constructing ontology models of various entity types based on the knowledge materials; extracting entities and relations among the entities in each sentence in the knowledge material based on the ontology model to obtain triple information for representing the relations among the entities; and finally, taking the entity in the triplet information as a vertex, taking the relation as an edge, and constructing a knowledge graph, thereby realizing the construction of the complete knowledge graph aiming at the field of mobile network optimization.
Having described the basic principles of the present application, various non-limiting embodiments of the present application will now be described in detail with reference to the accompanying drawings.
Example 1:
This embodiment provides a method for constructing a knowledge graph which, as shown in FIG. 1, includes:
Step S101: acquiring knowledge materials related to mobile network optimization.
In this embodiment, the knowledge materials related to mobile network optimization include unstructured data such as publicly available encyclopedia pages crawled from the network with a Python crawler, text files of historically solved optimization cases, optimization instruction manuals, domestic and foreign standards, and enterprise standards, which together form the basic corpus text files for constructing the knowledge graph.
Step S102: constructing ontology models of a plurality of entity types based on the knowledge materials.
Specifically, an ontology model comprising six entity types (user experience, intermediate cause, apparent cause, primary cause, root cause and first-line action) is constructed based on the knowledge materials.
FIG. 2 is a structural schematic diagram of the ontology model, in which the user experience is used to infer the intermediate cause, the intermediate cause includes two subclasses (subClassOf), namely the apparent cause and the primary cause, the intermediate cause is used to infer the root cause, and the root cause is used to assign the first-line action. It should be noted that, in addition to the primary cause, the intermediate cause may further include a secondary cause, a tertiary cause, and so on.
In the present embodiment, the explanation of the respective entity types is as follows:
(a) User experience: the problem as directly perceived by a mobile network user during use, such as voice interruption or service blocking;
(b) Intermediate cause: a possible cause of the user experience that cannot be resolved by a first-line action; intermediate causes include the apparent cause, the primary cause, the secondary cause (a cause reached from the apparent cause through several steps of inference), and so on;
(c) Apparent cause: a preliminary classification of the problem derived from the user experience, which cannot be resolved by a first-line action, such as a downlink coverage problem or a poor downlink quality problem;
(d) Primary (secondary, tertiary, ...) cause: a cause reached from the apparent cause through several steps of judgment that still cannot be resolved by a first-line action, such as a base station cell fault or a missing neighbor cell configuration;
(e) Root cause: a cause that can be resolved by assigning a first-line action, such as base station disconnection or an antenna feeder azimuth problem;
(f) First-line action: an action that resolves the root cause and thereby the problem, such as building a new base station or adjusting the antenna downtilt.
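The ontology described above can be sketched as a small data structure. The following Python sketch is illustrative only: the names `EntityType`, `SUBCLASS_OF`, `ALLOWED_RELATIONS` and `is_valid_triple`, and the relation labels `"infers"` and `"assigns"`, are assumptions introduced here, and only the relations named in the text are encoded.

```python
from enum import Enum


class EntityType(Enum):
    """The six entity types of the ontology model (identifiers are hypothetical)."""
    USER_EXPERIENCE = "user experience"
    INTERMEDIATE_CAUSE = "intermediate cause"
    APPARENT_CAUSE = "apparent cause"
    PRIMARY_CAUSE = "primary cause"
    ROOT_CAUSE = "root cause"
    FIRST_LINE_ACTION = "first-line action"


# subClassOf links: apparent and primary causes are subclasses of the intermediate cause.
SUBCLASS_OF = {
    EntityType.APPARENT_CAUSE: EntityType.INTERMEDIATE_CAUSE,
    EntityType.PRIMARY_CAUSE: EntityType.INTERMEDIATE_CAUSE,
}

# Relations stated in the text; the labels "infers" and "assigns" are assumed names.
ALLOWED_RELATIONS = {
    (EntityType.USER_EXPERIENCE, "infers", EntityType.INTERMEDIATE_CAUSE),
    (EntityType.INTERMEDIATE_CAUSE, "infers", EntityType.ROOT_CAUSE),
    (EntityType.ROOT_CAUSE, "assigns", EntityType.FIRST_LINE_ACTION),
}


def generalize(entity_type):
    """Map a subclass to its parent type; other types map to themselves."""
    return SUBCLASS_OF.get(entity_type, entity_type)


def is_valid_triple(head_type, relation, tail_type):
    """Check a (head type, relation, tail type) combination against the ontology."""
    return (generalize(head_type), relation, generalize(tail_type)) in ALLOWED_RELATIONS
```

A check of this kind could be used to discard extracted triplets whose entity types do not fit the ontology, for example a user experience that directly "assigns" a first-line action.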
Step S103: extracting the entities in each sentence in the knowledge material and the relations among the entities based on the ontology model, to obtain the triplet information for representing the relations among the entities.
In this embodiment, the triplet information is constituted by an "entity-relationship-entity" triplet.
Optionally, the extracting the entities and the relationships between the entities in each sentence in the knowledge material based on the ontology model, to obtain the triplet information for representing the relationships between the entities specifically includes:
based on the ontology model, performing named entity recognition with a sequence labeling model to extract the entities in each sentence in the knowledge material; and
based on the extracted entities, determining the category of the relationship among the entities with the relation classification model R-BERT, to obtain the triplet information for representing the relationships among the entities.
In this embodiment, a pipeline approach is adopted: named entity recognition is first performed with a sequence labeling model to predict the entities in each sentence, and the R-BERT relation classification model then determines the category of the entity relationship, so that the triplet information is extracted. The sequence labeling model may be a linear model, a hidden Markov model, a maximum entropy Markov model, a conditional random field, or the like.
Optionally, the identifying named entities by using a sequence labeling model based on the ontology model to extract entities in each sentence in the knowledge material specifically includes:
inputting each sentence in the knowledge material into a sequence labeling model, which treats the subject and the object of each sentence as entities and labels them as a head entity and a tail entity respectively;
and identifying the head entity and the tail entity by adopting a named entity identification method based on sequence labeling based on the ontology model so as to extract the entities in each sentence in the knowledge material.
In this embodiment, the sequence labeling model treats the subject and the object of a sentence as entities and labels them as the head entity and the tail entity respectively. The named entity recognition method based on sequence labeling encodes the text token sequence with a model such as CNN, RNN or BERT, classifies each token of the sequence with a fully connected layer, and finally uses a CRF to make the final label decision. Assuming that the dataset has k entity classes, for example k = 6 with the classes user experience, intermediate cause, apparent cause, primary cause, root cause and first-line action, the procedure for named entity recognition is as follows:
a) Given a text "$W_1, W_2, \ldots, W_n$", the token sequence is $W = [W_1, W_2, \ldots, W_n]$, where each token is a short sentence and $n$ is the length of the text sequence;
b) $W$ passes through the embedding layer (Embedding Layer) to obtain the embedded representation of the sequence $X \in \mathbb{R}^{n \times d}$, where $d$ is the embedding vector dimension;
c) $X$ passes through the text encoding layer (Encoder Layer), which models the token sequence, to obtain the hidden-layer representation of the sequence $H \in \mathbb{R}^{n \times h}$, where $h$ is the hidden-layer vector dimension;
d) The entity label of each token is predicted by a fully connected classification layer (Classification Layer) to obtain the classification result $\mathrm{Logits} \in \mathbb{R}^{n \times k}$, where each row $\mathrm{Logit}_i \in \mathbb{R}^{k}$ holds the predicted scores of $W_i$ for each entity label;
e) Softmax is applied to Logits, and for each $W_i$ the entity class with the highest probability is taken as the label of $W_i$, i.e. each of the $n$ tokens is classified into one of $k$ classes.
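The last two steps of this procedure (scoring each token against the k entity labels, then taking the most probable label after Softmax) can be sketched in plain Python. This is a minimal sketch under stated assumptions: the logits are supplied directly rather than produced by an embedding layer and encoder, the CRF decoding mentioned above is omitted, and the names `LABELS`, `softmax` and `classify_tokens` are hypothetical.

```python
import math

# The k = 6 entity labels; the order of the list is an assumption.
LABELS = [
    "user experience",
    "intermediate cause",
    "apparent cause",
    "primary cause",
    "root cause",
    "first-line action",
]


def softmax(scores):
    """Turn one row of logits into probabilities (shifted by the max for stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_tokens(logits):
    """For each token's row of k logits, return (label, probability) of the best class."""
    results = []
    for row in logits:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((LABELS[best], probs[best]))
    return results


# Two tokens, each with k = 6 scores (made-up numbers for illustration only).
example_logits = [
    [0.0, 0.0, 0.0, 0.0, 5.0, 0.0],
    [3.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
predicted = classify_tokens(example_logits)
```

Here `example_logits` plays the role of the Logits matrix with n = 2 rows and k = 6 columns.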
In this embodiment, relation classification determines the relationship between entities; for example, the relationship between an entity of the intermediate cause type and an entity of the root cause type is "infers", and the relationship between an entity of the root cause type and an entity of the first-line action type is "assigns". The specific steps are as follows. Special markers are inserted before and after each target entity, $ around the head entity and # around the tail entity, and the text is then input into R-BERT. The positions of the two target entities are located in the output embeddings of the R-BERT model, and their embeddings, together with the sentence encoding, are used as inputs to a multi-layer neural network classifier. To do so, two mask vectors are constructed, each equal to 1 only at the positions of one entity, so that the non-entity part is masked out. Each mask vector is multiplied (inner product) with the sequence_output of BERT, the two resulting vectors are concatenated with the pool_output of BERT, and the result is fed into a fully connected network followed by softmax, which outputs the relation category.
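The marker insertion and masking steps can be illustrated without a trained model. The sketch below is an assumption-laden illustration rather than the R-BERT implementation: hidden states are plain lists of floats, the masked vectors are averaged (the text only mentions an inner product with the mask), the fully connected and softmax layers are omitted, and all function names are hypothetical.

```python
def insert_markers(tokens, head_span, tail_span):
    """Wrap the head entity in '$' and the tail entity in '#'.

    Spans are half-open (start, end) token index ranges and must not overlap.
    """
    marked = []
    for i, token in enumerate(tokens):
        if i == head_span[0]:
            marked.append("$")
        if i == tail_span[0]:
            marked.append("#")
        marked.append(token)
        if i == head_span[1] - 1:
            marked.append("$")
        if i == tail_span[1] - 1:
            marked.append("#")
    return marked


def entity_mask(marked_tokens, marker):
    """Return a 0/1 mask that is 1 only for tokens between the two given markers."""
    first = marked_tokens.index(marker)
    second = marked_tokens.index(marker, first + 1)
    return [1 if first < i < second else 0 for i in range(len(marked_tokens))]


def masked_average(hidden, mask):
    """Average the hidden vectors at the positions where the mask is 1."""
    count = sum(mask)
    dim = len(hidden[0])
    return [sum(m * vec[j] for m, vec in zip(mask, hidden)) / count for j in range(dim)]


def relation_features(pooled, hidden, head_mask, tail_mask):
    """Concatenate the sentence vector with the two entity vectors."""
    return pooled + masked_average(hidden, head_mask) + masked_average(hidden, tail_mask)


tokens = ["base", "station", "fault", "infers", "antenna", "tilt"]
marked = insert_markers(tokens, head_span=(0, 3), tail_span=(4, 6))
head_mask = entity_mask(marked, "$")
tail_mask = entity_mask(marked, "#")
# Stand-in hidden states: one 2-dimensional vector per marked token.
hidden = [[float(i), 1.0] for i in range(len(marked))]
features = relation_features([0.5], hidden, head_mask, tail_mask)
```

In a full model, `features` would be passed to the fully connected network and softmax that output the relation category.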
Optionally, after extracting the entities in each sentence and the relationships between the entities in the knowledge material based on the ontology model to obtain the triplet information for characterizing the relationships between the entities, the method further includes:
and saving the extracted entities and relations to a Neo4J graph database.
In this embodiment, the extracted entities and relationships are saved to the Neo4J graph database, that is, the triplet information is saved to the Neo4J graph database. Neo4J is a high-performance NoSQL graph database that stores structured data in a network (graph) rather than in tables, and has advantages such as high performance and being lightweight.
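One way to persist a triplet is to turn it into a parameterized Cypher `MERGE` statement. The sketch below only builds the query string and its parameter map and does not connect to a database; the node label `Entity`, the property `name` and the function `triple_to_cypher` are assumptions made for illustration. Because Cypher does not allow relationship types to be passed as parameters, the relation name is validated before being placed in the query text.

```python
import re

# Relationship types are restricted to upper-case letters, digits and underscores.
_REL_TYPE = re.compile(r"^[A-Z][A-Z0-9_]*$")


def triple_to_cypher(head, relation, tail):
    """Build a (query, parameters) pair that upserts one entity-relation-entity triplet."""
    rel_type = relation.strip().upper().replace(" ", "_").replace("-", "_")
    if not _REL_TYPE.match(rel_type):
        raise ValueError(f"unsupported relation name: {relation!r}")
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{rel_type}]->(t)"
    )
    return query, {"head": head, "tail": tail}


query, params = triple_to_cypher(
    "base station disconnection", "assigns", "build new base station"
)
```

Using `MERGE` rather than `CREATE` means that saving the same triplet twice should not duplicate vertices or edges. The resulting pair can be handed to any Neo4j client that accepts a query string together with a parameter map.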
Step S104: constructing a knowledge graph with the entities in the triplet information as vertices and the relations as edges.
In this embodiment, a knowledge graph of the mobile network optimization domain can be constructed according to the relationship between the entities in the triplet information.
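As a minimal in-memory illustration of this step, the sketch below builds a vertex set and a labeled adjacency list from triplets and then follows the edges from a user experience to the actions it leads to. The function names and the example triplets are hypothetical; the entity names come from the examples in the text, but how they are paired here is invented for illustration.

```python
from collections import defaultdict


def build_graph(triples):
    """Use entities as vertices and relations as directed, labeled edges."""
    vertices = set()
    edges = defaultdict(list)
    for head, relation, tail in triples:
        vertices.update((head, tail))
        edges[head].append((relation, tail))
    return vertices, dict(edges)


def reachable(edges, start):
    """Collect every vertex that can be reached from `start` by following edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for _, neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen


# Illustrative triplets only; the pairings are not taken from the source.
triples = [
    ("voice interruption", "infers", "downlink coverage problem"),
    ("downlink coverage problem", "infers", "base station disconnection"),
    ("base station disconnection", "assigns", "build new base station"),
]
vertices, edges = build_graph(triples)
found = reachable(edges, "voice interruption")
```

Traversing the graph in this way mirrors the diagnostic chain of the ontology: user experience, intermediate cause, root cause, first-line action.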
Optionally, after the entity in the triplet information is taken as a vertex, the relationship is taken as an edge, and the knowledge graph is constructed, the method further comprises:
building a visual display interface and a query function interface for the knowledge graph based on the progressive framework VUE, so as to realize the display and retrieval of the knowledge graph.
In this embodiment, a knowledge graph visualization and query interface is built on the VUE framework to display and retrieve the knowledge graph. VUE is a progressive front-end framework designed for bottom-up incremental development. It provides MVVM data binding and a composable component system behind a simple, flexible API, so that responsive data binding and composable view components can be implemented with little code, which can improve front-end development efficiency.
In a specific embodiment, the purpose of the knowledge graph construction is to construct a knowledge graph composed of "entity-relationship-entity" triples from existing text materials, and the knowledge graph construction method may include the following steps:
(1) Knowledge materials related to mobile network optimization are collected. They include unstructured data such as publicly available encyclopedia pages crawled from the network with a Python crawler, text files of historically solved optimization cases, optimization instruction manuals, domestic and foreign standards, and enterprise standards, which together form the basic corpus text files for constructing the knowledge graph.
(2) As shown in FIG. 2, an ontology model is constructed comprising six entity types: user experience, apparent cause, primary cause, intermediate cause, root cause and first-line action. The user experience infers the intermediate cause, the intermediate cause includes two subclasses (subClassOf), namely the apparent cause and the primary cause, the intermediate cause infers the root cause, and the root cause assigns the first-line action. The entity types are explained as follows:
(a) User experience: the problems that mobile network users directly perceive while using the network, such as voice interruption and service stalling;
(b) Intermediate reason: a possible cause of the user experience that cannot be resolved by a first-line action; intermediate reasons include the appearance reason, the primary reason, the secondary reason (a reason reached by several inference steps starting from the appearance reason), and so on;
(c) Appearance reason: a preliminary problem classification derived from the user-experience problem, which cannot be resolved by a first-line action, such as a downlink coverage problem or a poor downlink quality problem;
(d) Primary (secondary, tertiary, ...) reason: a cause determined through several inference steps starting from the appearance reason, which still cannot be resolved by a first-line action, such as a base station cell fault or a missing neighbor cell configuration;
(e) Root cause: a cause that can be resolved by dispatching a first-line action, such as a base station outage or an antenna feeder azimuth problem;
(f) First-line action: an action that resolves the root cause and thereby the problem, such as building a new base station or adjusting the antenna downtilt.
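The ontology described in step (2) can be sketched as a small type schema. This is only an illustrative sketch: the names `EntityType`, `SUBCLASS_OF`, `ALLOWED_RELATIONS` and `is_allowed`, and the relation labels "infers" and "dispatches", are assumptions made for the example and do not appear in the original text.

```python
from enum import Enum


class EntityType(Enum):
    """The six entity types of the ontology model (identifiers are illustrative)."""
    USER_EXPERIENCE = "user experience"
    INTERMEDIATE_REASON = "intermediate reason"
    APPEARANCE_REASON = "appearance reason"
    PRIMARY_REASON = "primary reason"
    ROOT_CAUSE = "root cause"
    FIRST_LINE_ACTION = "first-line action"


# subClassOf edges: both subclasses generalize to the intermediate reason.
SUBCLASS_OF = {
    EntityType.APPEARANCE_REASON: EntityType.INTERMEDIATE_REASON,
    EntityType.PRIMARY_REASON: EntityType.INTERMEDIATE_REASON,
}

# Relations permitted between entity types (relation names are hypothetical).
ALLOWED_RELATIONS = {
    (EntityType.USER_EXPERIENCE, "infers", EntityType.INTERMEDIATE_REASON),
    (EntityType.INTERMEDIATE_REASON, "infers", EntityType.ROOT_CAUSE),
    (EntityType.ROOT_CAUSE, "dispatches", EntityType.FIRST_LINE_ACTION),
}


def generalize(entity_type):
    """Map a subclass to its parent type; other types map to themselves."""
    return SUBCLASS_OF.get(entity_type, entity_type)


def is_allowed(head_type, relation, tail_type):
    """Check whether a typed triple is consistent with the ontology."""
    key = (generalize(head_type), relation, generalize(tail_type))
    return key in ALLOWED_RELATIONS
```

A schema of this kind could be used to reject extracted triples whose entity types do not fit the ontology, for example a user experience that directly dispatches a first-line action.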
(3) Adopt a pipeline approach: first use a sequence labeling model to perform named entity recognition and predict the entities in each sentence, then use the R-BERT relation classification model to determine the category of the relation between the entities, thereby extracting the relation triples.
Specifically, named entity recognition is performed first. The sequence labeling model treats the subject and the object of a sentence as entities and labels them as the head entity and the tail entity respectively. The sequence-labeling-based named entity recognition method encodes the token sequence of the text with a model such as a CNN, RNN or BERT, classifies each token of the sequence with a fully connected layer, and finally uses a CRF to determine the final labels. Assuming the data set has k entity classes, the named entity recognition process is as follows:
a) Given a text $W = "W_1, W_2, \ldots, W_n"$, its token sequence is $[W_1, W_2, \ldots, W_n]$, where $n$ is the length of the text sequence;
b) $W$ passes through the embedding layer (Embedding Layer) to obtain the embedded representation of the sequence $X \in \mathbb{R}^{n \times d}$, where $d$ is the embedding vector dimension;
c) $X$ passes through the text encoding layer (Encoder Layer), which models the token sequence, to obtain the hidden-layer representation of the sequence $H \in \mathbb{R}^{n \times h}$, where $h$ is the hidden-layer vector dimension;
d) A fully connected classification layer (Classification Layer) predicts the entity label of each token, giving the classification result $\mathrm{Logits} \in \mathbb{R}^{n \times k}$, where each row $\mathrm{Logit}_i \in \mathbb{R}^{k}$ holds the predicted scores of token $W_i$ for each entity label;
e) Softmax is applied to $\mathrm{Logits}$, and for each $W_i$ the entity class with the highest probability is taken as the label of $W_i$, i.e. a k-way classification is performed on each of the n tokens.
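The final decoding step above (Softmax followed by choosing the most probable class per token) can be sketched in plain Python. This is a minimal sketch under stated assumptions: the label set `["O", "HEAD", "TAIL"]`, the logit values and the function names are invented for illustration, and the CRF label determination mentioned earlier is deliberately left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one row of k scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_labels(logits, labels):
    """For each of the n tokens, pick the label with the highest probability.

    logits: n rows of k scores (the Logits matrix of shape n x k).
    labels: the k entity labels, in the same order as the score columns.
    """
    predicted = []
    for row in logits:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        predicted.append(labels[best])
    return predicted


# Hypothetical example with n = 3 tokens and k = 3 labels.
labels = ["O", "HEAD", "TAIL"]
logits = [
    [0.1, 2.3, -1.0],
    [1.5, 0.2, 0.1],
    [-0.3, 0.0, 3.1],
]
print(decode_labels(logits, labels))
```

Because Softmax is monotonic, taking the arg-max of the raw scores would select the same labels; the probabilities matter mainly when a confidence value is also needed.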
Next, relation classification is performed. Special markers are inserted before and after the positions of the target entities, and the text is then fed into R-BERT. The positions of the two target entities are located in the output embeddings of the R-BERT model, and their embeddings, together with the sentence encoding, are used as input to a multi-layer neural network classifier. Concretely, the head entity and the tail entity are wrapped with the markers "$" and "#" respectively. Two mask vectors are then constructed, one per entity, in which only the positions of the entity are 1 and the non-entity positions are masked out. The two mask vectors are each combined by inner product with the sequence_output of BERT to obtain two entity vectors, which are concatenated with the pool_output of BERT and fed into a fully connected network followed by softmax to output the relation category.
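The marker insertion and mask construction described above can be illustrated on toy data. The sketch below is not the R-BERT implementation itself: the helper names, the example sentence and the toy vectors standing in for BERT's sequence_output and pool_output are all assumptions, and averaging the masked vectors is one common choice for turning the mask product into a single entity vector.

```python
def insert_markers(tokens, head_span, tail_span):
    """Wrap the head entity in "$" and the tail entity in "#".

    Spans are half-open (start, end) token index ranges that do not overlap.
    """
    marked = []
    for i, token in enumerate(tokens):
        if i == head_span[0]:
            marked.append("$")
        if i == tail_span[0]:
            marked.append("#")
        marked.append(token)
        if i == head_span[1] - 1:
            marked.append("$")
        if i == tail_span[1] - 1:
            marked.append("#")
    return marked


def entity_mask(marked_tokens, marker):
    """1 for tokens strictly between the two marker occurrences, else 0."""
    first = marked_tokens.index(marker)
    second = marked_tokens.index(marker, first + 1)
    return [1 if first < i < second else 0 for i in range(len(marked_tokens))]


def masked_average(mask, sequence_output):
    """Average the token vectors selected by the mask."""
    count = sum(mask)
    if count == 0:
        raise ValueError("mask selects no tokens")
    dim = len(sequence_output[0])
    return [
        sum(m * vec[d] for m, vec in zip(mask, sequence_output)) / count
        for d in range(dim)
    ]


# Hypothetical sentence; a real model would use its own tokenizer.
tokens = ["cell", "outage", "causes", "call", "drop"]
marked = insert_markers(tokens, head_span=(0, 2), tail_span=(3, 5))
head_mask = entity_mask(marked, "$")
tail_mask = entity_mask(marked, "#")

# Toy stand-ins for the encoder outputs (one 2-dimensional vector per token).
sequence_output = [[float(i), 1.0] for i in range(len(marked))]
pooled_output = [0.5, 0.5]

# Concatenate sentence vector, head vector and tail vector for the classifier.
features = (
    pooled_output
    + masked_average(head_mask, sequence_output)
    + masked_average(tail_mask, sequence_output)
)
```

The resulting `features` list plays the role of the concatenated vector that would be passed to the fully connected layer and softmax. The marker lookup in `entity_mask` assumes the sentence itself contains no "$" or "#" characters.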
(4) Store the extracted entities and relations in a Neo4J graph database, with the entities as vertices and the relations as edges, and build knowledge graph visualization and query interfaces based on the VUE framework to display and search the graph.
Specifically, the knowledge graph construction method can be built on a knowledge graph construction framework. FIG. 3 shows this framework, which comprises seven modules, including domain knowledge collection, basic platform tools, knowledge storage, knowledge modeling and representation, fusion knowledge base, and vertical capability opening, and which is a technical framework covering the whole knowledge graph construction flow and capability opening flow. The modules are described as follows:
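One way to picture the storage step is to turn each extracted triple into a parameterized Cypher MERGE statement, so that repeated entities map to a single vertex. The sketch below only builds the query text and parameters and does not connect to a database; the function name, the node labels `RootCause` and `FirstLineAction`, the relation type `DISPATCHES` and the `name` property are hypothetical choices, not part of the original text.

```python
def triple_to_cypher(head, head_type, relation, tail, tail_type):
    """Build a Cypher MERGE statement and its parameters for one triple.

    Node labels and relationship types cannot be passed as Cypher
    parameters, so they are validated and placed in the query text;
    the entity names are passed as parameters.
    """
    for name in (head_type, relation, tail_type):
        if not name.isidentifier():
            raise ValueError(f"unsafe label or relationship type: {name!r}")
    query = (
        f"MERGE (h:{head_type} {{name: $head}}) "
        f"MERGE (t:{tail_type} {{name: $tail}}) "
        f"MERGE (h)-[:{relation}]->(t)"
    )
    return query, {"head": head, "tail": tail}


# Hypothetical triple built from the examples given for the entity types.
query, params = triple_to_cypher(
    "base station outage", "RootCause",
    "DISPATCHES",
    "build new base station", "FirstLineAction",
)
print(query)
print(params)
```

MERGE is used rather than CREATE so that loading the same entity from several sentences does not produce duplicate vertices.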
1) Domain knowledge collection
The knowledge materials in the mobile network optimization domain are mainly concentrated in the text materials of the provincial and municipal branches. The problem that domain knowledge collection has to solve is to parse these text materials centrally and extract the knowledge they contain.
2) Basic tool
The basic tools provide tooling support for every stage of the whole flow, and include a knowledge modeling tool, a knowledge extraction tool, a knowledge reasoning tool, a knowledge fusion tool and the like.
3) Knowledge storage
Knowledge storage provides database tools that offer persistence and access interfaces for the data structures produced once the knowledge graph has been constructed. These include a primary triplet database, a graph structure database, a relational database and the like, which respectively meet the persistence requirements of knowledge graph data and of vertical application data.
4) Knowledge modeling and representation
The knowledge modeling process defines the key concepts and relations of knowledge in the mobile network optimization domain. The knowledge extracted from the text is serialized as a large number of triples using semantic technologies such as RDF and OWL, and the triples are imported into the primary triplet database and the graph structure database for storage.
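As an illustration of serializing extracted knowledge as RDF triples, the sketch below writes one triple per line in N-Triples syntax. The namespace `http://example.org/mno#` and the function name are placeholders chosen for this example, and the entity and relation names are illustrative only.

```python
from urllib.parse import quote

# Hypothetical namespace for the mobile network optimization vocabulary.
BASE = "http://example.org/mno#"


def to_ntriple(head, relation, tail):
    """Serialize one (head, relation, tail) triple as an N-Triples line.

    Each element becomes an IRI under BASE; characters that are not
    allowed in an IRI, such as spaces, are percent-encoded.
    """
    subject = f"<{BASE}{quote(head)}>"
    predicate = f"<{BASE}{quote(relation)}>"
    obj = f"<{BASE}{quote(tail)}>"
    return f"{subject} {predicate} {obj} ."


triples = [
    ("voice interruption", "infers", "downlink coverage problem"),
    ("base station outage", "dispatches", "build new base station"),
]
document = "\n".join(to_ntriple(*t) for t in triples)
print(document)
```

A line-per-triple format like this is convenient for bulk import, since each line can be parsed and loaded independently.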
5) Fusion knowledge base
The fusion knowledge base provides knowledge graph access interfaces in various fields.
6) Vertical capability opening
Vertical capability opening provides an access interface for third-party business capability development. On the basis of the fusion knowledge base and in line with the needs of vertical industries, it allows capabilities such as query, reasoning and prediction based on the knowledge graph to be invoked.
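A query capability of the kind mentioned above could, for example, follow the inference chain from a user-experience problem to the first-line actions it may eventually lead to. The following breadth-first traversal is a sketch under stated assumptions: the function name, the relation labels and the example triples are invented for illustration and are not taken from the original text.

```python
from collections import deque


def find_actions(triples, start, action_relation="dispatches"):
    """Collect every entity reachable from `start` via an action edge.

    triples: iterable of (head, relation, tail) tuples.
    Returns the tails of `action_relation` edges found during a
    breadth-first traversal, without duplicates.
    """
    graph = {}
    for head, relation, tail in triples:
        graph.setdefault(head, []).append((relation, tail))

    actions = []
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for relation, neighbor in graph.get(node, []):
            if relation == action_relation and neighbor not in actions:
                actions.append(neighbor)
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return actions


# Illustrative triples; the pairings are hypothetical.
triples = [
    ("voice interruption", "infers", "downlink coverage problem"),
    ("downlink coverage problem", "infers", "base station cell fault"),
    ("downlink coverage problem", "infers", "antenna azimuth problem"),
    ("base station cell fault", "infers", "base station outage"),
    ("base station outage", "dispatches", "build new base station"),
    ("antenna azimuth problem", "dispatches", "adjust antenna downtilt"),
]
print(find_actions(triples, "voice interruption"))
```

The `seen` set keeps the traversal finite even if the extracted graph contains cycles.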
It should be noted that existing knowledge graph construction usually aims only at completing the graph itself; it lacks a complete, systematic full-flow framework for graph construction and lacks capability opening for upper-layer third-party applications. The application therefore provides a full-flow technical framework, from knowledge collection and graph construction to vertical capability opening of the knowledge graph, which is suitable for building knowledge graphs in the mobile network optimization domain and enables capability opening for upper-layer third-party applications.
The method for constructing the knowledge graph provided by the embodiment of the application comprises the steps of firstly acquiring knowledge materials related to mobile network optimization; then constructing ontology models of various entity types based on the knowledge materials; extracting entities and relations among the entities in each sentence in the knowledge material based on the ontology model to obtain triple information for representing the relations among the entities; and finally, taking the entity in the triplet information as a vertex, taking the relation as an edge, and constructing a knowledge graph. The application can realize the construction of the complete knowledge graph aiming at the mobile network optimization field, and solves the problem that the existing knowledge graph construction method cannot be directly applied to the specific field, especially the mobile network optimization field.
Example 2:
As shown in FIG. 4, the present embodiment provides a knowledge graph construction device for executing the knowledge graph construction method described above, where the device includes:
the knowledge material acquisition module 11 is used for acquiring knowledge materials related to mobile network optimization;
the ontology model construction module 12 is connected with the knowledge material acquisition module 11 and is used for constructing ontology models of various entity types based on the knowledge materials;
the entity relation extracting module 13 is connected with the ontology model constructing module 12 and is used for extracting entities in each sentence in the knowledge material and relations among the entities based on the ontology model to obtain triplet information for representing the relations among the entities;
the knowledge graph construction module 14 is connected with the entity relation extraction module 13, and is used for constructing a knowledge graph by taking the entity in the triplet information as a vertex and the relation as an edge.
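The chain of four connected modules can be pictured as a simple pipeline in which each stage consumes the output of the previous one. This is only a structural sketch: the function and class names are hypothetical, and the acquisition, ontology and extraction stages are passed in as callables because their internals are described elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Entities are stored as vertices and relation triples as edges."""
    vertices: set = field(default_factory=set)
    edges: set = field(default_factory=set)


def build_graph(triples):
    """Role of module 14: entities become vertices, relations become edges."""
    graph = KnowledgeGraph()
    for head, relation, tail in triples:
        graph.vertices.update((head, tail))
        graph.edges.add((head, relation, tail))
    return graph


def run_pipeline(acquire, build_ontology, extract):
    """Connect the four modules in the order described above."""
    materials = acquire()                   # module 11: knowledge materials
    ontology = build_ontology(materials)    # module 12: ontology model
    triples = extract(materials, ontology)  # module 13: triplet information
    return build_graph(triples)             # module 14: knowledge graph


# Usage with stub stages standing in for the real modules.
graph = run_pipeline(
    acquire=lambda: ["Base station outage requires a new base station."],
    build_ontology=lambda materials: {"entity_types": 6},
    extract=lambda materials, ontology: [
        ("base station outage", "dispatches", "build new base station"),
    ],
)
print(sorted(graph.vertices))
```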
Optionally, the ontology model building module 12 is specifically configured to:
and constructing an ontology model comprising six entity types including user experience, intermediate reasons, appearance reasons, primary reasons, root reasons and first-line actions based on the knowledge materials.
Optionally, the user experience is used to infer the intermediate reason, the intermediate reason includes two subclasses, namely the appearance reason and the primary reason, the intermediate reason is used to infer the root cause, and the root cause is used to dispatch the first-line action.
Optionally, the entity relationship extraction module 13 specifically includes:
the entity extraction unit is used for carrying out named entity identification by adopting a sequence labeling model based on the ontology model so as to extract the entities in each sentence in the knowledge material;
the triplet acquisition unit is used for judging the category of the relation between the entities by utilizing the relation classification model R-BERT based on the extracted entities, and obtaining triplet information for representing the relation between the entities.
Optionally, the entity extraction unit specifically includes:
the labeling unit is used for inputting each sentence in the knowledge material into the sequence labeling model, which treats the subject and the object in each sentence as entities and labels them as a head entity and a tail entity respectively;
and the identification unit is used for identifying the head entity and the tail entity by adopting a named entity identification method based on sequence labeling based on the ontology model so as to extract the entities in each sentence in the knowledge material.
Optionally, the apparatus further comprises:
and the storage module is used for storing the extracted entities and relations to the Neo4J graph database.
Optionally, the apparatus further comprises:
and the display and search module is used for building a visual display interface and a query function interface of the knowledge graph based on the progressive framework VUE so as to realize the display and search of the knowledge graph.
Example 3:
Referring to FIG. 5, the present embodiment provides a knowledge graph construction apparatus including a memory 21 and a processor 22, the memory 21 storing a computer program, the processor 22 being configured to run the computer program to perform the knowledge graph construction method in embodiment 1.
The memory 21 is connected to the processor 22, the memory 21 may be a flash memory, a read-only memory, or other memories, and the processor 22 may be a central processing unit or a single chip microcomputer.
Example 4:
the present embodiment provides a computer-readable storage medium having a computer program stored thereon, which when executed by a processor, implements the knowledge graph construction method in embodiment 1 described above.
Computer-readable storage media include volatile or nonvolatile, removable or non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, computer program modules or other data. Computer-readable storage media include, but are not limited to, RAM (Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory or other memory technology, CD-ROM (Compact Disc Read-Only Memory), Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.
In summary, the method, the device and the readable storage medium for constructing the knowledge graph provided by the embodiment of the application firstly acquire knowledge materials related to mobile network optimization; then constructing ontology models of various entity types based on the knowledge materials; extracting entities and relations among the entities in each sentence in the knowledge material based on the ontology model to obtain triple information for representing the relations among the entities; and finally, taking the entity in the triplet information as a vertex, taking the relation as an edge, and constructing a knowledge graph. The application can realize the construction of the complete knowledge graph aiming at the mobile network optimization field, and solves the problem that the existing knowledge graph construction method cannot be directly applied to the specific field, especially the mobile network optimization field.
It is to be understood that the above embodiments are merely illustrative of the application of the principles of the present application, but not in limitation thereof. Various modifications and improvements may be made by those skilled in the art without departing from the spirit and substance of the application, and are also considered to be within the scope of the application.
Claims (10)
1. A knowledge graph construction method, characterized by comprising the following steps:
acquiring knowledge materials related to mobile network optimization;
constructing ontology models of various entity types based on the knowledge materials;
extracting entities and relations among the entities in each sentence in the knowledge material based on the ontology model to obtain triplet information for representing the relations among the entities;
and taking the entity in the triplet information as a vertex, and taking the relation as an edge to construct a knowledge graph.
2. The method according to claim 1, wherein the building of the ontology model of a plurality of entity types based on the knowledge material specifically comprises:
and constructing an ontology model comprising six entity types including user experience, intermediate reasons, appearance reasons, primary reasons, root reasons and first-line actions based on the knowledge materials.
3. The method of claim 2, wherein the user experience is used to infer the intermediate reason, the intermediate reason comprises two subclasses, namely the appearance reason and the primary reason, the intermediate reason is used to infer the root cause, and the root cause is used to dispatch the first-line action.
4. The method according to claim 1, wherein the extracting the entities and the relationships between the entities in each sentence in the knowledge material based on the ontology model, to obtain the triplet information for characterizing the relationships between the entities, specifically includes:
performing named entity recognition with a sequence labeling model based on the ontology model, so as to extract the entities in each sentence in the knowledge material; and
based on the extracted entities, the relationship classification model R-BERT is utilized to judge the category of the relationship among the entities, and the triplet information used for representing the relationship among the entities is obtained.
5. The method of claim 4, wherein the identifying named entities using a sequence labeling model based on the ontology model to extract entities in each sentence in knowledge material, specifically comprises:
inputting each sentence in the knowledge material into a sequence labeling model, which treats the subject and the object in each sentence as entities and labels them as a head entity and a tail entity respectively;
and identifying the head entity and the tail entity by adopting a named entity identification method based on sequence labeling based on the ontology model so as to extract the entities in each sentence in the knowledge material.
6. The method according to claim 1, wherein after extracting entities and relationships between the entities in each sentence in the knowledge material based on the ontology model, obtaining triplet information for characterizing the relationships between the entities, the method further comprises:
and saving the extracted entities and relations to a Neo4J graph database.
7. The method of claim 1, wherein after constructing a knowledge graph with the entities in the triplet information as vertices and the relationships as edges, the method further comprises:
and building a visual display interface and a query function interface of the knowledge graph based on the progressive framework VUE so as to realize the display and the retrieval of the knowledge graph.
8. The knowledge graph construction device is characterized by comprising:
the knowledge material acquisition module is used for acquiring knowledge materials related to mobile network optimization;
the ontology model construction module is connected with the knowledge material acquisition module and used for constructing ontology models of various entity types based on the knowledge materials;
the entity relation extracting module is connected with the ontology model constructing module and is used for extracting the entities in each sentence in the knowledge material and the relation among the entities based on the ontology model to obtain triple information for representing the relation among the entities;
and the knowledge graph construction module is connected with the entity relation extraction module and is used for constructing a knowledge graph by taking the entity in the triplet information as a vertex and the relation as an edge.
9. A knowledge graph construction apparatus comprising a memory and a processor, the memory having a computer program stored therein, the processor being arranged to run the computer program to implement a knowledge graph construction method according to any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the knowledge-graph construction method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310671560.9A CN116701650A (en) | 2023-06-07 | 2023-06-07 | Knowledge graph construction method and device and readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116701650A true CN116701650A (en) | 2023-09-05 |
Family
ID=87833436
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310671560.9A Pending CN116701650A (en) | 2023-06-07 | 2023-06-07 | Knowledge graph construction method and device and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116701650A (en) |
-
2023
- 2023-06-07 CN CN202310671560.9A patent/CN116701650A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||