CN116701650A - Knowledge graph construction method and device and readable storage medium - Google Patents

Info

Publication number
CN116701650A
Authority
CN
China
Prior art keywords
knowledge
entities
entity
knowledge graph
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310671560.9A
Other languages
Chinese (zh)
Inventor
王昭宁
朱佳佳
程新洲
乔金剑
吕非彼
刘亮
只璐
狄子翔
肖天
成晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a knowledge graph construction method, a knowledge graph construction device and a readable storage medium. The method comprises the following steps: acquiring knowledge materials related to mobile network optimization; constructing an ontology model of a plurality of entity types based on the knowledge materials; extracting, based on the ontology model, the entities in each sentence of the knowledge materials and the relations among the entities to obtain triplet information representing the relations among the entities; and constructing a knowledge graph by taking the entities in the triplet information as vertices and the relations as edges. The method, the device and the readable storage medium solve the problem that existing knowledge graph construction methods cannot be directly applied to a specific field, in particular the field of mobile network optimization.

Description

Knowledge graph construction method and device and readable storage medium
Technical Field
The present application relates to the field of mobile network optimization, and in particular, to a method and apparatus for constructing a knowledge graph, and a readable storage medium.
Background
In the field of mobile network optimization, one of the core technologies for intelligently diagnosing the network problems that mobile network users encounter is the construction of a knowledge graph for problem diagnosis. However, existing knowledge graph construction methods cannot be directly applied to building a knowledge graph for a specific field, especially mobile network optimization, where the optimization workload is heavy and diagnosing the cause of a user's network problem depends strongly on expert experience. A knowledge graph construction method for the field of mobile network optimization is therefore urgently needed.
Disclosure of Invention
The technical problem to be solved by the application is to provide a knowledge graph construction method, a device and a readable storage medium aiming at the defects of the prior art, so as to solve the problem that the prior knowledge graph construction method cannot be directly applied to the specific field, especially cannot be directly applied to the field of mobile network optimization.
In a first aspect, the present application provides a method for constructing a knowledge graph, where the method includes:
acquiring knowledge materials related to mobile network optimization;
constructing ontology models of various entity types based on the knowledge materials;
extracting entities and relations among the entities in each sentence in the knowledge material based on the ontology model to obtain triplet information for representing the relations among the entities;
and taking the entity in the triplet information as a vertex, and taking the relation as an edge to construct a knowledge graph.
Further, the building of the ontology model of the plurality of entity types based on the knowledge materials specifically includes:
constructing, based on the knowledge materials, an ontology model comprising six entity types: user experience, intermediate cause, apparent cause, primary cause, root cause and first-line action.
Further, the user experience is used to infer the intermediate cause; the intermediate cause includes two subclasses, the apparent cause and the primary cause; the intermediate cause is used to infer the root cause; and the root cause is used to assign the first-line action.
Further, the extracting the entities and the relationships between the entities in each sentence in the knowledge material based on the ontology model to obtain the triplet information for representing the relationships between the entities specifically includes:
performing, based on the ontology model, named entity recognition with a sequence labeling model to extract the entities in each sentence of the knowledge materials; and
determining, based on the extracted entities, the category of the relation between the entities with the relation classification model R-BERT, to obtain the triplet information representing the relations among the entities.
Further, the identifying of named entities based on the ontology model by using a sequence labeling model to extract entities in each sentence in the knowledge material specifically includes:
inputting each sentence of the knowledge materials into the sequence labeling model, which takes the subject and the object of each sentence as entities and labels them as the head entity and the tail entity respectively;
and recognizing the head entity and the tail entity, based on the ontology model, with a named entity recognition method based on sequence labeling, so as to extract the entities in each sentence of the knowledge materials.
Further, after extracting the entities and the relationships between the entities in each sentence in the knowledge material based on the ontology model to obtain the triplet information for representing the relationships between the entities, the method further includes:
and saving the extracted entities and relations to a Neo4j graph database.
Further, after the entity in the triplet information is taken as a vertex, the relation is taken as an edge, and the knowledge graph is constructed, the method further comprises:
and building a visual display interface and a query function interface for the knowledge graph based on the progressive VUE framework, so as to realize the display and retrieval of the knowledge graph.
In a second aspect, the present application provides a knowledge graph construction apparatus, including:
the knowledge material acquisition module is used for acquiring knowledge materials related to mobile network optimization;
the ontology model construction module is connected with the knowledge material acquisition module and used for constructing ontology models of various entity types based on the knowledge materials;
the entity relation extracting module is connected with the ontology model constructing module and is used for extracting the entities in each sentence in the knowledge material and the relation among the entities based on the ontology model to obtain triple information for representing the relation among the entities;
and the knowledge graph construction module is connected with the entity relation extraction module and is used for constructing a knowledge graph by taking the entity in the triplet information as a vertex and the relation as an edge.
In a third aspect, the present application provides a knowledge graph construction apparatus, comprising a memory and a processor, the memory storing a computer program, the processor being configured to run the computer program to implement the knowledge graph construction method according to the first aspect.
In a fourth aspect, the present application provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the knowledge graph construction method according to the first aspect.
The application provides a knowledge graph construction method, a knowledge graph construction device and a readable storage medium. Firstly, acquiring knowledge materials related to mobile network optimization; then constructing ontology models of various entity types based on the knowledge materials; extracting entities and relations among the entities in each sentence in the knowledge material based on the ontology model to obtain triple information for representing the relations among the entities; and finally, taking the entity in the triplet information as a vertex, taking the relation as an edge, and constructing a knowledge graph. The application can realize the construction of the complete knowledge graph aiming at the mobile network optimization field, and solves the problem that the existing knowledge graph construction method cannot be directly applied to the specific field, especially the mobile network optimization field.
Drawings
FIG. 1 is a flow chart of a knowledge graph construction method in embodiment 1 of the present application;
FIG. 2 is a schematic diagram of an embodiment of an ontology model;
FIG. 3 is a schematic diagram of a knowledge graph construction framework according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a knowledge graph construction device according to embodiment 2 of the present application;
FIG. 5 is a schematic structural diagram of a knowledge graph construction device according to embodiment 3 of the present application.
Detailed Description
In order to make the technical scheme of the present application better understood by those skilled in the art, the following detailed description of the embodiments of the present application will be given with reference to the accompanying drawings.
It is to be understood that the specific embodiments and figures described herein are merely illustrative of the application, and are not limiting of the application.
It is to be understood that the various embodiments of the application and the features of the embodiments may be combined with each other without conflict.
It is to be understood that only the portions relevant to the present application are shown in the drawings for convenience of description, and the portions irrelevant to the present application are not shown in the drawings.
It should be understood that each unit and module in the embodiments of the present application may correspond to only one physical structure, may be formed by a plurality of physical structures, or may be integrated into one physical structure.
It will be appreciated that, without conflict, the functions and steps noted in the flowcharts and block diagrams of the present application may occur out of the order noted in the figures.
It is to be understood that the flowcharts and block diagrams of the present application illustrate the architecture, functionality, and operation of possible implementations of systems, apparatuses, devices, methods according to various embodiments of the present application. Where each block in the flowchart or block diagrams may represent a unit, module, segment, code, or the like, which comprises executable instructions for implementing the specified functions. Moreover, each block or combination of blocks in the block diagrams and flowchart illustrations can be implemented by hardware-based systems that perform the specified functions, or by combinations of hardware and computer instructions.
It should be understood that the units and modules related in the embodiments of the present application may be implemented by software, or may be implemented by hardware, for example, the units and modules may be located in a processor.
Summary of the application
At present, mobile network optimization relies mainly on the experience and judgment of network optimization experts, and tasks such as root cause diagnosis and the dispatching of operations are carried out manually. This traditional mode of network diagnosis suffers from long cycles, high cost and low efficiency. With the development of artificial intelligence and the successful application of techniques such as deep learning in image recognition, speech processing, board games and other fields, intelligent diagnosis of communication network problems based on deep learning and artificial intelligence has become a new research hotspot.
In the field of mobile network optimization, one of the core technologies for intelligently diagnosing the network problems that mobile network users encounter is the construction of a knowledge graph for problem diagnosis. The main knowledge graph technologies comprise knowledge collection, knowledge extraction, knowledge fusion and knowledge reasoning. Knowledge collection integrates the existing knowledge materials into formats and types that can be recognized and processed. Knowledge extraction identifies key information in various types of data, mines the core content hidden in the data, and constructs entities, relations and attributes; classical knowledge extraction methods include methods based on rules and templates, methods based on statistical data analysis, methods based on machine learning, and the like. Knowledge fusion mainly resolves inconsistencies and ambiguities in concepts and relations, and addresses the conflicting definitions, repeated content, unclear references and disordered hierarchies that arise when local knowledge bases are merged into one overall knowledge base. The fusion process employs techniques such as clustering, similarity analysis and probabilistic statistical analysis, and finally yields a compact, clear and complete global knowledge base. The goal of knowledge reasoning is to obtain new knowledge or related conclusions; common methods include reasoning based on description logic, on graph structure, on statistical rules, on probabilistic logic, and the like.
However, existing knowledge graph construction methods cannot be directly applied to building a knowledge graph for a specific field, especially mobile network optimization, where the optimization workload is heavy and diagnosing the cause of a user's network problem depends strongly on expert experience. A knowledge graph construction method for the field of mobile network optimization is therefore urgently needed.
To address the above technical problems, the application provides a knowledge graph construction method, a knowledge graph construction device and a readable storage medium. Knowledge materials related to mobile network optimization are first acquired; an ontology model of a plurality of entity types is then constructed based on the knowledge materials; the entities in each sentence of the knowledge materials and the relations among the entities are extracted based on the ontology model to obtain triplet information representing the relations among the entities; and finally a knowledge graph is constructed by taking the entities in the triplet information as vertices and the relations as edges, thereby realizing the construction of a complete knowledge graph for the field of mobile network optimization.
Having described the basic principles of the present application, various non-limiting embodiments of the present application will now be described in detail with reference to the accompanying drawings.
Example 1:
This embodiment provides a knowledge graph construction method which, as shown in FIG. 1, includes:
Step S101: acquiring knowledge materials related to mobile network optimization.
In this embodiment, the knowledge materials related to mobile network optimization include unstructured data such as publicly available encyclopedia pages crawled by a Python crawler, text files of historically resolved optimization cases, optimization instruction manuals, domestic and foreign standards, and enterprise standards. Together these form the basic corpus text files for constructing the knowledge graph.
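Because the later steps operate on each sentence of the knowledge materials, the corpus text has to be segmented into sentences first. The following is a minimal sketch of that preprocessing, assuming the materials are already available as plain-text strings and that sentences end with common Chinese or Western end punctuation; the function name `split_sentences` and the sample text are hypothetical, and the source does not specify how segmentation is done.

```python
import re

# A sentence is a run of non-terminator characters followed by an optional
# terminator. The terminator set is an assumption made for illustration.
_SENTENCE = re.compile(r"[^。！？!?.]+[。！？!?.]?")


def split_sentences(text: str) -> list[str]:
    """Split raw corpus text into non-empty, stripped sentences."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]


corpus = (
    "Weak coverage causes voice interruption. "
    "Neighbor cell is missing! Check the antenna tilt."
)
sentences = split_sentences(corpus)
for sentence in sentences:
    print(sentence)
```

A pattern this simple also splits at decimal points and abbreviations, so a real corpus would likely need a more careful segmenter.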
Step S102: constructing an ontology model of a plurality of entity types based on the knowledge materials.
Specifically, an ontology model comprising six entity types, namely user experience, intermediate cause, apparent cause, primary cause, root cause and first-line action, is constructed based on the knowledge materials.
FIG. 2 shows a schematic structural diagram of the ontology model, in which the user experience is used to infer the intermediate cause, the intermediate cause includes two subclasses (subClassOf), the apparent cause and the primary cause, the intermediate cause is used to infer the root cause, and the root cause is used to assign the first-line action. It should be noted that, in addition to the primary cause, the intermediate cause may further include a secondary cause, a tertiary cause, and so on.
In the present embodiment, the individual entity types are explained as follows:
(a) User experience: the problem as directly perceived by a mobile network user during use, such as voice interruption or service stalling;
(b) Intermediate cause: a possible cause of the user experience that cannot be resolved directly by a first-line action; intermediate causes include the apparent cause, the primary cause, the secondary cause (causes reached by several inference steps from the apparent cause), and so on;
(c) Apparent cause: a preliminary classification of the problem derived from the user experience, which cannot be resolved by a first-line action, such as a downlink coverage problem or a poor downlink quality problem;
(d) Primary (secondary, tertiary, ...) cause: a cause reached by several steps of judgment from the apparent cause, which still cannot be resolved by a first-line action, such as a base station cell fault or a missing neighbor cell configuration;
(e) Root cause: a cause that can be resolved by assigning a first-line action, such as a base station outage or an antenna feeder azimuth problem;
(f) First-line action: an action that resolves the root cause and thereby the problem, such as building a new base station or adjusting the antenna downtilt.
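The ontology described above can be written down as a small schema of entity types and permitted relations. The sketch below is one possible encoding, not the representation used in the application: the relation identifiers `infers`, `assigns` and `subClassOf`, the `EntityType` enum and the helper `is_allowed` are all illustrative names.

```python
from enum import Enum


class EntityType(Enum):
    USER_EXPERIENCE = "user experience"
    INTERMEDIATE_CAUSE = "intermediate cause"
    APPARENT_CAUSE = "apparent cause"
    PRIMARY_CAUSE = "primary cause"
    ROOT_CAUSE = "root cause"
    FIRST_LINE_ACTION = "first-line action"


# (head type, relation, tail type) as described for the ontology of FIG. 2.
# The relation names are assumptions chosen for this sketch.
ONTOLOGY = {
    (EntityType.USER_EXPERIENCE, "infers", EntityType.INTERMEDIATE_CAUSE),
    (EntityType.APPARENT_CAUSE, "subClassOf", EntityType.INTERMEDIATE_CAUSE),
    (EntityType.PRIMARY_CAUSE, "subClassOf", EntityType.INTERMEDIATE_CAUSE),
    (EntityType.INTERMEDIATE_CAUSE, "infers", EntityType.ROOT_CAUSE),
    (EntityType.ROOT_CAUSE, "assigns", EntityType.FIRST_LINE_ACTION),
}

# Map each subclass to its superclass so a subclass can stand in for it.
SUBCLASS_OF = {sub: sup for sub, rel, sup in ONTOLOGY if rel == "subClassOf"}


def is_allowed(head: EntityType, relation: str, tail: EntityType) -> bool:
    """Check a typed relation against the ontology, honoring subclasses."""
    heads = {head, SUBCLASS_OF.get(head, head)}
    tails = {tail, SUBCLASS_OF.get(tail, tail)}
    return any((h, relation, t) in ONTOLOGY for h in heads for t in tails)


print(is_allowed(EntityType.USER_EXPERIENCE, "infers", EntityType.APPARENT_CAUSE))
print(is_allowed(EntityType.USER_EXPERIENCE, "assigns", EntityType.FIRST_LINE_ACTION))
```

A schema of this kind can be used to reject extracted triplets whose entity types do not match any relation the ontology permits.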
Step S103: extracting, based on the ontology model, the entities in each sentence of the knowledge materials and the relations among the entities to obtain the triplet information representing the relations among the entities.
In this embodiment, the triplet information is constituted by an "entity-relationship-entity" triplet.
Optionally, the extracting the entities and the relationships between the entities in each sentence in the knowledge material based on the ontology model, to obtain the triplet information for representing the relationships between the entities specifically includes:
performing, based on the ontology model, named entity recognition with a sequence labeling model to extract the entities in each sentence of the knowledge materials; and
determining, based on the extracted entities, the category of the relation between the entities with the relation classification model R-BERT, to obtain the triplet information representing the relations among the entities.
In this embodiment, a pipeline approach is adopted: named entity recognition is first performed with a sequence labeling model to predict the entities in each sentence, and the R-BERT relation classification model is then used to determine the category of the relation between entities, from which the triplet information is extracted. The sequence labeling model may be a linear model, a hidden Markov model, a maximum entropy Markov model, a conditional random field, or the like.
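The two-stage pipeline can be sketched as a function that takes the entity recognizer and the relation classifier as interchangeable components. In the sketch below both components are toy stand-ins (a lexicon lookup and a type-pair rule) rather than a trained sequence labeling model or R-BERT; the names `Entity`, `Triple`, `extract_triples`, `toy_recognize` and `toy_classify`, as well as the sample sentence, are hypothetical.

```python
from itertools import permutations
from typing import Callable, NamedTuple, Optional


class Entity(NamedTuple):
    text: str
    type: str


class Triple(NamedTuple):
    head: Entity
    relation: str
    tail: Entity


def extract_triples(
    sentence: str,
    recognize: Callable[[str], list[Entity]],
    classify: Callable[[str, Entity, Entity], Optional[str]],
) -> list[Triple]:
    """Stage 1: recognize entities. Stage 2: classify each ordered pair."""
    entities = recognize(sentence)
    triples = []
    for head, tail in permutations(entities, 2):
        relation = classify(sentence, head, tail)
        if relation is not None:
            triples.append(Triple(head, relation, tail))
    return triples


# Toy stand-ins for the trained models (assumptions for illustration only).
LEXICON = {
    "base station outage": "root cause",
    "build a new base station": "first-line action",
}
RULES = {("root cause", "first-line action"): "assigns"}


def toy_recognize(sentence: str) -> list[Entity]:
    return [Entity(text, etype) for text, etype in LEXICON.items() if text in sentence]


def toy_classify(sentence: str, head: Entity, tail: Entity) -> Optional[str]:
    return RULES.get((head.type, tail.type))


triples = extract_triples(
    "A base station outage is resolved when crews build a new base station.",
    toy_recognize,
    toy_classify,
)
print(triples)
```

Keeping the two stages behind plain callables makes it possible to swap either model without touching the triplet assembly logic.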
Optionally, the identifying named entities by using a sequence labeling model based on the ontology model to extract entities in each sentence in the knowledge material specifically includes:
inputting each sentence of the knowledge materials into the sequence labeling model, which takes the subject and the object of each sentence as entities and labels them as the head entity and the tail entity respectively;
and recognizing the head entity and the tail entity, based on the ontology model, with a named entity recognition method based on sequence labeling, so as to extract the entities in each sentence of the knowledge materials.
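One common way to turn per-token labels into head and tail entities is a BIO tagging scheme, in which `B-` opens an entity, `I-` continues it and `O` marks tokens outside any entity. The source does not state which tagging scheme is used, so the scheme, the tag names `HEAD` and `TAIL`, and the function `decode_bio` below are assumptions made for this sketch.

```python
def decode_bio(tokens: list[str], tags: list[str]) -> list[tuple[str, str]]:
    """Group BIO-tagged tokens into (role, text) spans."""
    spans = []
    current_role, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        prefix, _, role = tag.partition("-")
        starts_new = prefix == "B" or (prefix == "I" and role != current_role)
        if starts_new:
            # Close any open span before starting a new one.
            if current_tokens:
                spans.append((current_role, " ".join(current_tokens)))
            current_role, current_tokens = role, [token]
        elif prefix == "I":
            current_tokens.append(token)
        else:
            # An "O" tag closes any open span.
            if current_tokens:
                spans.append((current_role, " ".join(current_tokens)))
            current_role, current_tokens = None, []
    if current_tokens:
        spans.append((current_role, " ".join(current_tokens)))
    return spans


tokens = ["neighbor", "cell", "omission", "causes", "voice", "interruption"]
tags = ["B-HEAD", "I-HEAD", "I-HEAD", "O", "B-TAIL", "I-TAIL"]
spans = decode_bio(tokens, tags)
print(spans)
```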
In this embodiment, the sequence labeling model takes the subject and the object of a sentence as entities and labels them as the head entity and the tail entity respectively. The named entity recognition method based on sequence labeling uses a model such as a CNN, an RNN or BERT to encode the text token sequence into a representation, then uses a fully connected layer to classify each token of the sequence, and finally uses a CRF to determine the final labels. Assuming that the data set has k entity classes, for example k = 6 with the classes user experience, intermediate cause, apparent cause, primary cause, root cause and first-line action, the named entity recognition procedure is as follows:
a) Given a text $W = (W_1, W_2, \ldots, W_n)$, its token sequence is $[W_1, W_2, \ldots, W_n]$, where each token is a short sentence and $n$ is the length of the text sequence;
b) $W$ passes through the embedding layer (Embedding Layer) to obtain the embedded representation of the sequence $X \in \mathbb{R}^{n \times d}$, where $d$ is the embedding vector dimension;
c) $X$ passes through the text encoding layer (Encoder Layer), which models the token sequence, to obtain the hidden-layer representation of the sequence $H \in \mathbb{R}^{n \times h}$, where $h$ is the hidden-layer vector dimension;
d) The entity label of each token is predicted through a fully connected classification layer (Classification Layer) to obtain the classification result $\mathrm{Logits} \in \mathbb{R}^{n \times k}$, where each row $\mathrm{Logit}_i \in \mathbb{R}^{k}$ gives the prediction scores of token $W_i$ for each entity label;
e) $\mathrm{Logits}$ is normalized with Softmax, and for each $W_i$ the entity class with the highest probability is taken as the label of $W_i$; that is, each of the $n$ tokens is classified into one of $k$ classes.
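The shapes in the steps above can be traced with a small forward pass. The sketch below uses random weights and a single tanh layer as a stand-in for the encoder, which is an assumption made only to keep the example self-contained; it is not the CNN, RNN or BERT encoder named in the text, and the sizes `n`, `d` and `h` are arbitrary.

```python
import math
import random

random.seed(0)

LABELS = [
    "user experience", "intermediate cause", "apparent cause",
    "primary cause", "root cause", "first-line action",
]
n, d, h, k = 4, 8, 5, len(LABELS)  # tokens, embedding dim, hidden dim, classes


def rand_matrix(rows: int, cols: int) -> list[list[float]]:
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matmul(a: list[list[float]], b: list[list[float]]) -> list[list[float]]:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row: list[float]) -> list[float]:
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


X = rand_matrix(n, d)                      # step b): embeddings, n x d
W_enc = rand_matrix(d, h)                  # stand-in encoder weights
H = [[math.tanh(v) for v in row] for row in matmul(X, W_enc)]  # step c): n x h
W_cls = rand_matrix(h, k)                  # classification layer weights
logits = matmul(H, W_cls)                  # step d): n x k
probs = [softmax(row) for row in logits]   # step e): per-token distribution
pred = [LABELS[max(range(k), key=row.__getitem__)] for row in probs]
print(pred)
```

The classification result has one row per token and one column per entity class, which is why the logits matrix is n by k rather than n by h.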
In this embodiment, relation classification means determining the relation between entities; for example, the relation between an entity of the intermediate cause type and an entity of the root cause type is "infers", and the relation between an entity of the root cause type and an entity of the first-line action type is "assigns". The specific relation classification steps are as follows. Special markers are inserted before and after each target entity, with $ added around the head entity and # around the tail entity, and the text is then input into R-BERT. The positions of the two target entities are located in the output embeddings of the R-BERT model, and their embeddings together with the sentence encoding are used as the input to a multi-layer neural network classifier. To do so, two mask vectors are constructed in which only the entity positions are 1 and the non-entity positions are 0. At the output layer, the inner product of each mask vector with the sequence_output of BERT is taken, the two resulting vectors are concatenated with the pooled_output of BERT, and the result is fed into a fully connected network followed by softmax to output the relation category.
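The marker insertion and mask-based pooling described above can be illustrated without a trained model. In the sketch below the hidden states and the pooled sentence vector are made-up numbers standing in for the BERT outputs, and the masked sum is divided by the entity length (a masked average), which is an assumption; the final fully connected layer and softmax are omitted. The helper names `insert_markers`, `span_between`, `entity_mask` and `masked_average` are hypothetical.

```python
def insert_markers(tokens, head_span, tail_span):
    """Wrap the head entity in '$' and the tail entity in '#'.

    Spans are (start, end) token indices with end exclusive.
    """
    out = []
    for i, tok in enumerate(tokens):
        if i == head_span[0]:
            out.append("$")
        if i == tail_span[0]:
            out.append("#")
        out.append(tok)
        if i == head_span[1] - 1:
            out.append("$")
        if i == tail_span[1] - 1:
            out.append("#")
    return out


def span_between(marked, marker):
    """Return (start, end) of the tokens enclosed by a pair of markers."""
    first = marked.index(marker)
    second = marked.index(marker, first + 1)
    return first + 1, second


def entity_mask(length, start, end):
    """1 at entity positions, 0 elsewhere."""
    return [1 if start <= i < end else 0 for i in range(length)]


def masked_average(mask, sequence_output):
    """Average the hidden vectors at positions where the mask is 1."""
    count = sum(mask)
    dim = len(sequence_output[0])
    return [
        sum(m * vec[j] for m, vec in zip(mask, sequence_output)) / count
        for j in range(dim)
    ]


tokens = ["base", "station", "outage", "requires", "site", "repair"]
marked = insert_markers(tokens, head_span=(0, 3), tail_span=(4, 6))

# Made-up stand-ins for the encoder outputs over the marked sequence.
sequence_output = [[float(i), 1.0] for i in range(len(marked))]
pooled_output = [0.5, -0.5]

head_vec = masked_average(entity_mask(len(marked), *span_between(marked, "$")), sequence_output)
tail_vec = masked_average(entity_mask(len(marked), *span_between(marked, "#")), sequence_output)

# Concatenated feature vector that would feed the fully connected classifier.
features = pooled_output + head_vec + tail_vec
print(marked)
print(features)
```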
Optionally, after extracting the entities in each sentence and the relationships between the entities in the knowledge material based on the ontology model to obtain the triplet information for characterizing the relationships between the entities, the method further includes:
and saving the extracted entities and relations to a Neo4j graph database.
In this embodiment, the extracted entities and relations, that is, the triplet information, are saved to the Neo4j graph database. Neo4j is a high-performance NoSQL graph database that stores structured data in a network (graph) rather than in tables, and has advantages such as high performance and a lightweight footprint.
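Saving a triplet to a graph database such as Neo4j typically means issuing a Cypher statement that creates both nodes and the relationship between them if they do not already exist. The sketch below only builds the Cypher text and its parameters; it does not connect to a database, and the node property `name`, the label and relationship-type naming, and the helpers `to_label` and `triple_to_cypher` are assumptions for illustration. Labels and relationship types are interpolated into the query text because Cypher does not accept them as parameters.

```python
import re


def to_label(name: str) -> str:
    """Turn an entity type such as 'root cause' into a label like 'RootCause'."""
    return "".join(part.capitalize() for part in re.findall(r"[A-Za-z0-9]+", name))


def triple_to_cypher(head, head_type, relation, tail, tail_type):
    """Return (query, parameters) that MERGE both nodes and their relationship."""
    rel_type = "_".join(re.findall(r"[A-Za-z0-9]+", relation)).upper()
    query = (
        f"MERGE (h:{to_label(head_type)} {{name: $head}}) "
        f"MERGE (t:{to_label(tail_type)} {{name: $tail}}) "
        f"MERGE (h)-[:{rel_type}]->(t)"
    )
    return query, {"head": head, "tail": tail}


query, params = triple_to_cypher(
    "base station outage", "root cause",
    "assigns",
    "build new base station", "first-line action",
)
print(query)
print(params)
```

Using MERGE rather than CREATE keeps repeated triplets from producing duplicate vertices or edges when the same entity appears in many sentences.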
Step S104: constructing a knowledge graph by taking the entities in the triplet information as vertices and the relations as edges.
In this embodiment, a knowledge graph of the mobile network optimization domain can be constructed according to the relationship between the entities in the triplet information.
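Taking entities as vertices and relations as edges amounts to building a labeled directed graph from the triplets. The following is a minimal in-memory sketch using an adjacency mapping; the function `build_graph` and the sample triplets are illustrative, with entity names borrowed from the examples given for the entity types.

```python
from collections import defaultdict


def build_graph(triples):
    """Build (vertices, edges) from (head, relation, tail) triples.

    edges maps each head vertex to a list of (relation, tail) pairs.
    """
    vertices = set()
    edges = defaultdict(list)
    for head, relation, tail in triples:
        vertices.update((head, tail))
        edges[head].append((relation, tail))
    return vertices, dict(edges)


sample_triples = [
    ("voice interruption", "infers", "downlink coverage problem"),
    ("downlink coverage problem", "infers", "base station outage"),
    ("base station outage", "assigns", "build new base station"),
]
vertices, edges = build_graph(sample_triples)
print(sorted(vertices))
print(edges)
```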
Optionally, after the entity in the triplet information is taken as a vertex, the relationship is taken as an edge, and the knowledge graph is constructed, the method further comprises:
and building a visual display interface and a query function interface for the knowledge graph based on the progressive VUE framework, so as to realize the display and retrieval of the knowledge graph.
In this embodiment, in order to display and retrieve the knowledge graph, a knowledge graph visualization and query function interface is built on the VUE framework. VUE is a progressive front-end framework designed for bottom-up incremental development. It provides MVVM data binding and a composable component system through a simple and flexible API, so that reactive data binding and composable view components can be implemented easily, and adopting it can improve front-end development efficiency.
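Behind a query interface, retrieval over the knowledge graph could be as simple as following the inference chain from a reported symptom to the first-line actions that address it. The sketch below shows one such lookup as a breadth-first search over an in-memory graph; it illustrates only the query logic, not the VUE front end, and the graph contents, the relation names and the function `find_actions` are assumptions.

```python
from collections import deque

# Hypothetical graph: each vertex maps to a list of (relation, target) edges.
GRAPH = {
    "voice interruption": [("infers", "downlink coverage problem")],
    "downlink coverage problem": [("infers", "base station cell fault")],
    "base station cell fault": [("infers", "base station outage")],
    "base station outage": [("assigns", "build new base station")],
}


def find_actions(graph, symptom):
    """Breadth-first search from a symptom, collecting targets of 'assigns' edges."""
    actions, seen, queue = [], {symptom}, deque([symptom])
    while queue:
        node = queue.popleft()
        for relation, target in graph.get(node, []):
            if relation == "assigns":
                if target not in actions:
                    actions.append(target)
            elif target not in seen:
                seen.add(target)
                queue.append(target)
    return actions


print(find_actions(GRAPH, "voice interruption"))
```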
In a specific embodiment, the purpose of the knowledge graph construction is to construct a knowledge graph composed of "entity-relationship-entity" triples from existing text materials, and the knowledge graph construction method may include the following steps:
(1) Knowledge materials related to mobile network optimization are collected, including unstructured data such as publicly available encyclopedia pages crawled by a Python crawler, text files of historically resolved optimization cases, optimization instruction manuals, domestic and foreign standards, and enterprise standards, so as to form the basic corpus text files for constructing the knowledge graph.
(2) As shown in FIG. 2, an ontology model is constructed that comprises six entity types: user experience, apparent cause, primary cause, intermediate cause, root cause and first-line action. The user experience infers the intermediate cause; the intermediate cause includes two subclasses (subClassOf), the apparent cause and the primary cause; the intermediate cause infers the root cause; and the root cause assigns the first-line action. The individual entity types are explained as follows:
(a) User experience: the problem as directly perceived by a mobile network user during use, such as voice interruption or service stalling;
(b) Intermediate cause: a possible cause of the user experience that cannot be resolved directly by a first-line action; intermediate causes include the apparent cause, the primary cause, the secondary cause (causes reached by several inference steps from the apparent cause), and so on;
(c) Apparent cause: a preliminary classification of the problem derived from the user experience, which cannot be resolved by a first-line action, such as a downlink coverage problem or a poor downlink quality problem;
(d) Primary (secondary, tertiary, ...) cause: a cause reached by several steps of judgment from the apparent cause, which still cannot be resolved by a first-line action, such as a base station cell fault or a missing neighbor cell configuration;
(e) Root cause: a cause that can be resolved by assigning a first-line action, such as a base station outage or an antenna feeder azimuth problem;
(f) First-line action: an action that resolves the root cause and thereby the problem, such as building a new base station or adjusting the antenna downtilt.
(3) A pipeline approach is adopted: named entity recognition is first performed with a sequence labeling model to predict the entities in each sentence, and the R-BERT relation classification model is then used to determine the category of the relation between entities and extract the triplets.
Specifically, named entity recognition is performed first. The sequence labeling model takes the subject and the object of a sentence as entities and labels them as the head entity and the tail entity respectively. The named entity recognition method based on sequence labeling uses a model such as a CNN, an RNN or BERT to encode the text token sequence into a representation, then uses a fully connected layer to classify each token of the sequence, and finally uses a CRF to determine the final labels. Assuming that the data set has k entity classes, the named entity recognition procedure is as follows:
a) Given a text= "W = 1 ,W 2 ...,W n "token sequence is: "[ W ] 1 ,W 2 ...,W n ]", n is the text sequence length;
b) W obtains embedded representation X E R of sequence through Embedding Layer (Embedding Layer) n*d D represents a vector dimension;
c) Modeling the token sequence by X through a text encoding Layer (Encoder Layer) to obtain a hidden Layer representation H E R of the sequence n*h H represents hiddenLayer vector dimensions;
d) Predicting the entity label of each token through a fully connected classification layer (Classification Layer) to obtain a classification result Logits E R n*h Wherein each row of Logit i ∈Logits∈R k Representing W in text i Predictive scores for each entity tag;
f) Logis calculated by Softmax, and each W i The entity class with the highest corresponding probability is taken as the W i I.e. k-sorting the n token sections.
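The tensor shapes in the steps above can be traced with a short sketch. It is a toy under stated assumptions: the weights are random rather than trained, a single tanh projection stands in for the CNN/RNN/BERT encoder, the label set (`O`, `B-HEAD`, ...) is hypothetical, and the final CRF step is omitted.

```python
import math
import random

random.seed(0)  # fixed seed so the toy weights are reproducible


def rand_matrix(rows, cols):
    """Random weights standing in for trained parameters."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matmul(a, b):
    """Plain-Python matrix product of a (r x m) and b (m x c)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


tokens = ["base", "station", "fault", "causes", "weak", "coverage"]  # W, n = 6
labels = ["O", "B-HEAD", "I-HEAD", "B-TAIL", "I-TAIL"]  # hypothetical, k = 5
n, d, h, k = len(tokens), 8, 4, len(labels)

# a)-b) Embedding layer: one d-dimensional vector per token, X in R^{n x d}.
vocab = {tok: rand_matrix(1, d)[0] for tok in dict.fromkeys(tokens)}
X = [vocab[tok] for tok in tokens]

# c) Encoder layer stand-in: a single tanh projection, H in R^{n x h}.
W_enc = rand_matrix(d, h)
H = [[math.tanh(v) for v in row] for row in matmul(X, W_enc)]

# d) Classification layer: Logits in R^{n x k}.
W_cls = rand_matrix(h, k)
logits = matmul(H, W_cls)

# e) Softmax per token, then take the most probable label.
probs = [softmax(row) for row in logits]
pred = [labels[max(range(k), key=lambda j: row[j])] for row in probs]

# The weights are untrained, so the predicted labels are arbitrary.
print(list(zip(tokens, pred)))
```

The point of the sketch is the shape bookkeeping: the classification layer maps the hidden dimension h to the number of entity classes k, so the logits have one row per token and one column per class.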
Next, relation classification is performed. Special markers are inserted before and after each target entity, and the text is then input into R-BERT. The positions of the two target entities are located in the output embeddings of the R-BERT model, and their embeddings, together with the sentence encoding, are used as inputs to a multi-layer neural network classifier. Specifically, the head entity and the tail entity are wrapped in $ and # markers, respectively. Two mask vectors are then constructed in which only the entity positions are 1 and the non-entity positions are masked out. Each of the two mask vectors is multiplied (inner product) with the sequence_output of BERT to obtain the two entity vectors, which are concatenated with the pool_output of BERT and fed into a fully connected network followed by softmax to output the relation category.
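The marker insertion and mask construction described above can be sketched without a real BERT model. In this sketch the helper names (`insert_markers`, `entity_mask`, `masked_average`) are hypothetical, the hidden vectors are toy values, the masks cover only the tokens between the markers (implementations differ on whether the markers themselves are included), and the fully connected classifier with softmax is omitted.

```python
def insert_markers(tokens, head, tail):
    """Wrap the head span in '$' and the tail span in '#'.

    Spans are (start, end) token indices, end exclusive, and must not overlap.
    """
    out = []
    for i, tok in enumerate(tokens):
        if i == head[0]:
            out.append("$")
        if i == tail[0]:
            out.append("#")
        out.append(tok)
        if i == head[1] - 1:
            out.append("$")
        if i == tail[1] - 1:
            out.append("#")
    return out


def entity_mask(marked, marker):
    """1 for tokens strictly between the two occurrences of marker, else 0."""
    first = marked.index(marker)
    second = marked.index(marker, first + 1)
    return [1 if first < i < second else 0 for i in range(len(marked))]


def masked_average(mask, sequence_output):
    """Average the hidden vectors selected by the mask."""
    count = sum(mask)
    width = len(sequence_output[0])
    return [
        sum(m * vec[j] for m, vec in zip(mask, sequence_output)) / count
        for j in range(width)
    ]


tokens = ["base", "station", "fault", "causes", "weak", "coverage"]
marked = ["[CLS]"] + insert_markers(tokens, head=(0, 3), tail=(4, 6))

head_mask = entity_mask(marked, "$")
tail_mask = entity_mask(marked, "#")

# Toy stand-in for BERT's sequence_output: one 2-dimensional vector per token.
sequence_output = [[float(i), 1.0] for i in range(len(marked))]
pooled = sequence_output[0]  # stand-in for the pooled sentence vector

head_vec = masked_average(head_mask, sequence_output)
tail_vec = masked_average(tail_mask, sequence_output)

# Concatenated features that would be fed to the fully connected classifier.
features = pooled + head_vec + tail_vec
print(marked)
print(features)
```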
(4) The extracted entities and relations are stored in a Neo4j graph database, with the entities as the vertices of the graph database and the relations as the edges, and a knowledge graph visualization and query interface is built on the Vue framework to display and search the graph.
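One way to store a triplet as two vertices and an edge is to generate a parameterized Cypher `MERGE` statement. The sketch below only builds the query text and its parameters; the `Entity` node label, the `name` property, and the function names are assumptions for illustration. Because Cypher does not allow a relationship type to be supplied as a query parameter, the relation name is sanitized and placed into the query text.

```python
import re


def relation_label(relation: str) -> str:
    """Turn a free-text relation name into a safe Cypher relationship type."""
    label = re.sub(r"[^0-9A-Za-z]+", "_", relation).strip("_").upper()
    if not label or label[0].isdigit():
        raise ValueError(f"cannot derive a relationship type from {relation!r}")
    return label


def triple_to_cypher(head: str, relation: str, tail: str):
    """Return (query, parameters) that upserts one triplet."""
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{relation_label(relation)}]->(t)"
    )
    return query, {"head": head, "tail": tail}


query, params = triple_to_cypher(
    "base station outage", "assigns", "build a new base station"
)
print(query)
print(params)
```

With a running database, the pair could be passed to a Neo4j driver session, for example `session.run(query, params)`; using `MERGE` rather than `CREATE` keeps repeated imports from duplicating vertices or edges.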
Specifically, the knowledge graph construction method can be implemented on a knowledge graph construction framework, which is shown in Fig. 3. The framework comprises seven modules, including domain knowledge collection, basic platform tools, knowledge storage, knowledge modeling and representation, the fusion knowledge base, and vertical capability opening, and it covers the complete technical workflow of knowledge graph construction and capability opening. The modules are described as follows:
1) Domain knowledge collection
Knowledge materials in the mobile network optimization field are mainly concentrated in the text materials of the provincial and municipal branches. Domain knowledge collection addresses the problem of analyzing these text materials centrally and extracting the knowledge they contain.
2) Basic tools
The basic tools provide tool support for every stage of the workflow, and include knowledge modeling tools, knowledge extraction tools, knowledge reasoning tools, knowledge fusion tools, and the like.
3) Knowledge storage
Knowledge storage provides database tools that persist the data structures produced by knowledge graph construction and expose access interfaces to them. These include a native triplet database, a graph database, a relational database, and the like, which respectively meet the persistence requirements of knowledge graph data and vertical application data.
4) Knowledge modeling and representation
The knowledge modeling process defines the key concepts and relations of knowledge in the mobile network optimization field. Using semantic technologies such as RDF and OWL, the knowledge extracted from the text is serialized as a large number of triplets, which are imported into the native triplet database and the graph database for storage.
5) Fusion knowledge base
The fusion knowledge base provides access interfaces to the knowledge graphs of various fields.
6) Vertical capability opening
Vertical capability opening provides access interfaces for opening capabilities to third-party services. Building on the fusion knowledge base and the requirements of vertical industries, it allows knowledge-graph-based capabilities such as query, reasoning, and prediction to be invoked.
It should be noted that existing knowledge graph construction usually ends once the graph has been built; it lacks a complete, systematic end-to-end framework for graph construction and does not open capabilities to upper-layer third-party applications. The present application therefore provides an end-to-end technical framework spanning knowledge collection, graph construction, and vertical capability opening, which is suited to building knowledge graphs in the mobile network optimization field and can open capabilities to upper-layer third-party applications.
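The serialization mentioned under knowledge modeling and representation can be illustrated with N-Triples, one of the plain-text RDF syntaxes. This is a sketch under assumptions: the text does not name a concrete serialization, the namespace `http://example.org/netopt/` is hypothetical, and every subject, predicate, and object is encoded as an IRI.

```python
from urllib.parse import quote

BASE = "http://example.org/netopt/"  # hypothetical namespace for illustration


def iri(local_name: str) -> str:
    """Build an N-Triples IRI term, percent-encoding spaces and other characters."""
    return f"<{BASE}{quote(local_name, safe='')}>"


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{iri(s)} {iri(p)} {iri(o)} ." for s, p, o in triples)


print(to_ntriples([
    ("downlink coverage problem", "infers", "base station cell fault"),
]))
```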
The knowledge graph construction method provided by this embodiment first acquires knowledge materials related to mobile network optimization; then constructs an ontology model of multiple entity types based on the knowledge materials; then extracts, based on the ontology model, the entities in each sentence of the knowledge materials and the relations between the entities, obtaining triplet information that represents the relations between the entities; and finally constructs the knowledge graph with the entities in the triplet information as vertices and the relations as edges. The application thus enables a complete knowledge graph to be constructed for the mobile network optimization field, and solves the problem that existing knowledge graph construction methods cannot be applied directly to specific fields, in particular the mobile network optimization field.
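The final step, taking entities as vertices and relations as edges, can be sketched with an in-memory adjacency list. The triplets below are illustrative: the user-experience entity "slow web browsing" and the pairing of each cause with the next are assumptions, and the relation names `infers` and `assigns` are shorthand for the relations described in the text.

```python
from collections import defaultdict, deque


def build_graph(triples):
    """Entities become vertices; each (head, relation, tail) becomes a directed edge."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
        graph.setdefault(tail, [])  # make sure tail-only entities are vertices too
    return graph


def reachable(graph, start):
    """Breadth-first walk from start, returning vertices in visiting order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for _, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


triples = [
    ("slow web browsing", "infers", "downlink coverage problem"),
    ("downlink coverage problem", "infers", "base station cell fault"),
    ("base station cell fault", "infers", "base station outage"),
    ("base station outage", "assigns", "build a new base station"),
]

graph = build_graph(triples)
chain = reachable(graph, "slow web browsing")
print(" -> ".join(chain))
```

Walking the graph from a user-experience vertex follows the reasoning chain through the intermediate causes to the root cause and its first-line action.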
Example 2:
As shown in Fig. 4, this embodiment provides a knowledge graph construction apparatus for executing the knowledge graph construction method, the apparatus including:
a knowledge material acquisition module 11, configured to acquire knowledge materials related to mobile network optimization;
an ontology model construction module 12, connected to the knowledge material acquisition module 11 and configured to construct an ontology model of multiple entity types based on the knowledge materials;
an entity relationship extraction module 13, connected to the ontology model construction module 12 and configured to extract, based on the ontology model, the entities in each sentence of the knowledge materials and the relations between the entities, obtaining triplet information that represents the relations between the entities;
a knowledge graph construction module 14, connected to the entity relationship extraction module 13 and configured to construct the knowledge graph with the entities in the triplet information as vertices and the relations as edges.
Optionally, the ontology model construction module 12 is specifically configured to:
construct, based on the knowledge materials, an ontology model comprising six entity types: user experience, intermediate cause, apparent cause, primary cause, root cause, and first-line action.
Optionally, the user experience is used to infer the intermediate cause; the intermediate cause includes two subclasses, the apparent cause and the primary cause; the intermediate cause is used to infer the root cause; and the root cause is used to assign the first-line action.
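The relations between entity types stated above can serve as a schema for checking extracted triplets. The following sketch assumes relation names `infers` and `assigns` and a hypothetical helper `is_valid`; it encodes only the three relations and the two subclasses listed in the preceding paragraph, so step-by-step inference between intermediate causes would need an additional schema entry.

```python
# Subclass relation: both subclasses generalize to "intermediate cause".
PARENT = {
    "apparent cause": "intermediate cause",
    "primary cause": "intermediate cause",
}

# (head type, relation, tail type) combinations stated in the ontology.
ALLOWED = {
    ("user experience", "infers", "intermediate cause"),
    ("intermediate cause", "infers", "root cause"),
    ("root cause", "assigns", "first-line action"),
}


def generalize(entity_type: str) -> str:
    """Map a subclass to its parent type; other types map to themselves."""
    return PARENT.get(entity_type, entity_type)


def is_valid(head_type: str, relation: str, tail_type: str) -> bool:
    """Check a typed triplet against the ontology schema."""
    return (generalize(head_type), relation, generalize(tail_type)) in ALLOWED


print(is_valid("user experience", "infers", "apparent cause"))
print(is_valid("user experience", "assigns", "first-line action"))
```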
Optionally, the entity relationship extraction module 13 specifically includes:
an entity extraction unit, configured to perform named entity recognition with a sequence labeling model, based on the ontology model, so as to extract the entities in each sentence of the knowledge materials;
a triplet acquisition unit, configured to determine, based on the extracted entities, the category of the relation between the entities using the relation classification model R-BERT, obtaining triplet information that represents the relations between the entities.
Optionally, the entity extraction unit specifically includes:
a labeling unit, configured to input each sentence of the knowledge materials into the sequence labeling model, which treats the subject and the object of each sentence as entities and labels them as the head entity and the tail entity, respectively;
an identification unit, configured to identify the head entity and the tail entity, based on the ontology model, using the sequence-labeling-based named entity recognition method, so as to extract the entities in each sentence of the knowledge materials.
Optionally, the apparatus further comprises:
a storage module, configured to store the extracted entities and relations in a Neo4j graph database.
Optionally, the apparatus further comprises:
a display and retrieval module, configured to build a visual display interface and a query interface for the knowledge graph based on the progressive framework Vue, so as to display and search the knowledge graph.
Example 3:
Referring to Fig. 5, this embodiment provides a knowledge graph construction apparatus including a memory 21 and a processor 22, the memory 21 storing a computer program and the processor 22 being configured to run the computer program to perform the knowledge graph construction method of embodiment 1.
The memory 21 is connected to the processor 22. The memory 21 may be a flash memory, a read-only memory, or another type of memory, and the processor 22 may be a central processing unit or a microcontroller.
Example 4:
This embodiment provides a computer-readable storage medium having a computer program stored thereon which, when executed by a processor, implements the knowledge graph construction method of embodiment 1 described above.
Computer-readable storage media include volatile or nonvolatile, removable or non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, computer program modules, or other data. Computer-readable storage media include, but are not limited to, RAM (Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory or other memory technology, CD-ROM (Compact Disc Read-Only Memory), digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer.
In summary, the knowledge graph construction method, apparatus, and readable storage medium provided by the embodiments of the application first acquire knowledge materials related to mobile network optimization; then construct an ontology model of multiple entity types based on the knowledge materials; then extract, based on the ontology model, the entities in each sentence of the knowledge materials and the relations between the entities, obtaining triplet information that represents the relations between the entities; and finally construct the knowledge graph with the entities in the triplet information as vertices and the relations as edges. The application thus enables a complete knowledge graph to be constructed for the mobile network optimization field, and solves the problem that existing knowledge graph construction methods cannot be applied directly to specific fields, in particular the mobile network optimization field.
It is to be understood that the above embodiments are merely exemplary embodiments adopted to illustrate the principles of the present application, and the application is not limited thereto. Those skilled in the art may make various modifications and improvements without departing from the spirit and substance of the application, and such modifications and improvements are also considered to be within the scope of the application.

Claims (10)

1. A knowledge graph construction method, characterized by comprising the following steps:
acquiring knowledge materials related to mobile network optimization;
constructing an ontology model of multiple entity types based on the knowledge materials;
extracting, based on the ontology model, the entities in each sentence of the knowledge materials and the relations between the entities, to obtain triplet information for representing the relations between the entities;
and constructing a knowledge graph with the entities in the triplet information as vertices and the relations as edges.
2. The method according to claim 1, wherein constructing the ontology model of multiple entity types based on the knowledge materials specifically comprises:
constructing, based on the knowledge materials, an ontology model comprising six entity types: user experience, intermediate cause, apparent cause, primary cause, root cause, and first-line action.
3. The method of claim 2, wherein the user experience is used to infer the intermediate cause, the intermediate cause comprises two subclasses, the apparent cause and the primary cause, the intermediate cause is used to infer the root cause, and the root cause is used to assign the first-line action.
4. The method according to claim 1, wherein extracting, based on the ontology model, the entities in each sentence of the knowledge materials and the relations between the entities, to obtain the triplet information for representing the relations between the entities, specifically comprises:
performing named entity recognition with a sequence labeling model, based on the ontology model, to extract the entities in each sentence of the knowledge materials; and
determining, based on the extracted entities, the category of the relation between the entities using the relation classification model R-BERT, to obtain the triplet information for representing the relations between the entities.
5. The method of claim 4, wherein performing named entity recognition with a sequence labeling model, based on the ontology model, to extract the entities in each sentence of the knowledge materials specifically comprises:
inputting each sentence of the knowledge materials into the sequence labeling model, which treats the subject and the object of each sentence as entities and labels them as a head entity and a tail entity, respectively;
and identifying the head entity and the tail entity, based on the ontology model, using a sequence-labeling-based named entity recognition method, to extract the entities in each sentence of the knowledge materials.
6. The method according to claim 1, wherein after extracting, based on the ontology model, the entities in each sentence of the knowledge materials and the relations between the entities, to obtain the triplet information for representing the relations between the entities, the method further comprises:
storing the extracted entities and relations in a Neo4j graph database.
7. The method of claim 1, wherein after constructing the knowledge graph with the entities in the triplet information as vertices and the relations as edges, the method further comprises:
building a visual display interface and a query interface for the knowledge graph based on the progressive framework Vue, so as to display and search the knowledge graph.
8. A knowledge graph construction device, characterized by comprising:
a knowledge material acquisition module, configured to acquire knowledge materials related to mobile network optimization;
an ontology model construction module, connected to the knowledge material acquisition module and configured to construct an ontology model of multiple entity types based on the knowledge materials;
an entity relationship extraction module, connected to the ontology model construction module and configured to extract, based on the ontology model, the entities in each sentence of the knowledge materials and the relations between the entities, to obtain triplet information for representing the relations between the entities;
and a knowledge graph construction module, connected to the entity relationship extraction module and configured to construct a knowledge graph with the entities in the triplet information as vertices and the relations as edges.
9. A knowledge graph construction apparatus comprising a memory and a processor, the memory having a computer program stored therein, the processor being arranged to run the computer program to implement a knowledge graph construction method according to any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the knowledge-graph construction method according to any one of claims 1-7.
CN202310671560.9A 2023-06-07 2023-06-07 Knowledge graph construction method and device and readable storage medium Pending CN116701650A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310671560.9A CN116701650A (en) 2023-06-07 2023-06-07 Knowledge graph construction method and device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310671560.9A CN116701650A (en) 2023-06-07 2023-06-07 Knowledge graph construction method and device and readable storage medium

Publications (1)

Publication Number Publication Date
CN116701650A true CN116701650A (en) 2023-09-05

Family

ID=87833436

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310671560.9A Pending CN116701650A (en) 2023-06-07 2023-06-07 Knowledge graph construction method and device and readable storage medium

Country Status (1)

Country Link
CN (1) CN116701650A (en)

Similar Documents

Publication Publication Date Title
CN111914156B (en) Cross-modal retrieval method and system for self-adaptive label perception graph convolution network
CN115564393B (en) Position recommendation method based on recruitment demand similarity
CN110990590A (en) Dynamic financial knowledge map construction method based on reinforcement learning and transfer learning
CN113254659A (en) File studying and judging method and system based on knowledge graph technology
US11620453B2 (en) System and method for artificial intelligence driven document analysis, including searching, indexing, comparing or associating datasets based on learned representations
CN104573130A (en) Entity resolution method based on group calculation and entity resolution device based on group calculation
CN111191051B (en) Method and system for constructing emergency knowledge map based on Chinese word segmentation technology
CN116975256B (en) Method and system for processing multisource information in construction process of underground factory building of pumped storage power station
US20220327492A1 (en) Ontology-based technology platform for mapping skills, job titles and expertise topics
CN116484024A (en) Multi-level knowledge base construction method based on knowledge graph
CN115330268A (en) Comprehensive emergency command method and system for dealing with mine disaster
CN113220901A (en) Writing concept auxiliary system and network system based on enhanced intelligence
CN115238197A (en) Expert thinking model-based field business auxiliary analysis method
CN112632406B (en) Query method, query device, electronic equipment and storage medium
CN117151222A (en) Domain knowledge guided emergency case entity attribute and relation extraction method thereof, electronic equipment and storage medium
CN115292274B (en) Data warehouse topic model construction method and system
CN116226404A (en) Knowledge graph construction method and knowledge graph system for intestinal-brain axis
CN116701650A (en) Knowledge graph construction method and device and readable storage medium
CN115965085A (en) Ship static attribute reasoning method and system based on knowledge graph technology
CN114694098A (en) Power grid infrastructure construction risk control method based on image recognition and knowledge graph
CN114548325A (en) Zero sample relation extraction method and system based on dual contrast learning
CN114969279A (en) Table text question-answering method based on hierarchical graph neural network
CN113849639A (en) Method and system for constructing theme model categories of urban data warehouse
CN113240443A (en) Entity attribute pair extraction method and system for power customer service question answering
CN116702784B (en) Entity linking method, entity linking device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination