CN113742446A - Knowledge graph question-answering method and system based on path sorting - Google Patents

Publication number
CN113742446A
Authority
CN
China
Prior art keywords
entity
question
query
path
knowledge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110809041.5A
Other languages
Chinese (zh)
Inventor
李开
邹复好
甘早斌
杨建飞
向文
卢萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN202110809041.5A priority Critical patent/CN113742446A/en
Publication of CN113742446A publication Critical patent/CN113742446A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a knowledge graph question-answering method and system based on path ranking. The method comprises: obtaining the entity mentions in a user's question; linking the entity mentions to entity nodes of the knowledge graph to obtain linked entities; querying the knowledge graph according to the linked entities to obtain at least one query path and the candidate answers corresponding to each query path; selecting the most appropriate query path according to the similarity between each query path and the user's question; and obtaining the final answer to the user's question from the candidate answers corresponding to the most appropriate query path. In this way, the linked entity is found in the knowledge graph from the entity mention in the user's question, the candidate answers and their query paths are then retrieved from the knowledge graph according to the linked entity, the most appropriate query path is selected from the multiple query paths, and the corresponding answer is obtained.

Description

Knowledge graph question-answering method and system based on path sorting
Technical Field
The invention relates to the fields of knowledge graphs and natural language processing, and in particular to a knowledge graph question-answering method and system based on path ranking.
Background
Question answering is a research field with a long history in computer science and a current research hotspot in natural language processing. It draws on many related technologies, including linguistics, deep learning, and machine learning. Google proposed the concept of the knowledge graph, which has since been widely accepted. In brief, a knowledge graph uses nodes to represent real-world entities or concepts and connects the nodes with directed edges that represent real-world relationships. Knowledge graphs have strong semantic processing capability and can manage massive data efficiently, which has greatly promoted the development of intelligent question answering.
Deep learning is a machine learning method that has become popular in recent years and has been applied across many industries. Applying deep learning to knowledge graph question answering works well and has gradually become the mainstream approach. There are two main directions. The first uses deep learning as a tool to improve the accuracy of traditional methods such as semantic parsing, template matching, and information extraction. The second represents both the user's question and the knowledge graph as vectors, that is, it embeds the knowledge graph, and ultimately realizes an end-to-end question-answering system. For example, knowledge graph embedding has been used to answer simple questions by recovering the representations of the question's head entity, predicate, and tail entity in the embedding space of the knowledge graph, with good results.
Knowledge graph question answering based on answer ranking treats intelligent question answering as an information retrieval task, which greatly reduces the need for training data. The general idea is to score and rank candidates using features of the question together with one or more features of the candidate answers, and to output the top-ranked answer whose score exceeds a threshold. Traditional template-matching methods generalize poorly: when the content of the knowledge graph changes, the templates must be extended, otherwise the corresponding questions are hard to cover. Methods based on semantic parsing require considerable manual intervention and may even require the designer to have some linguistic knowledge.
Disclosure of Invention
In view of the technical problems in the prior art, the invention provides a knowledge graph question-answering method and system based on path ranking.
According to a first aspect of the invention, a knowledge graph question-answering method based on path ranking is provided, comprising: identifying the named entities in a user's question to obtain entity mentions; linking each entity mention to an entity node of the knowledge graph, based on the degree of match between the entity and the question and on the entity's popularity, to obtain the linked entity of the entity mention in the knowledge graph; querying the knowledge graph with preset query templates according to the linked entity to obtain at least one query path and the candidate answers corresponding to each query path; selecting the most appropriate query path according to a similarity between each query path and the user's question that fuses multiple features; and obtaining the final answer to the user's question from the candidate answers corresponding to the most appropriate query path.
On the basis of the technical scheme, the invention can be improved as follows.
Optionally, identifying the named entities in the user's question to obtain entity mentions comprises: training an entity recognition model to obtain a trained entity recognition model; and identifying the named entities in the user's question with the trained entity recognition model to obtain the entity mentions. Training the entity recognition model comprises: collecting a plurality of user questions, labeling the named entities in each question, and training the entity recognition model on the questions and their named-entity labels.
Optionally, linking the entity mention to an entity node of the knowledge graph, based on the degree of match between the entity and the question and on the entity's popularity, comprises: when exactly one entity in the knowledge graph is identical to the entity mention e_i, taking that entity as the linked entity e_j of the entity mention e_i; when a synonym of an entity in the knowledge graph is identical to the entity mention e_i, taking that entity as the linked entity e_j; when multiple entities in the knowledge graph are identical to the entity mention e_i, selecting as the linked entity e_j the entity that best matches the user's question and has high popularity, where the popularity of an entity is given by the frequency with which the entity appears in the knowledge graph and the number of other entities that have a one-degree relationship with it, and the degree of match between an entity and the question is given by the match between the entity's type and the question and the match between the entity's two-degree subgraph and the question; and when no entity in the knowledge graph, and no synonym of any entity, is identical to the entity mention e_i, selecting as the linked entity e_j the entity with the smallest edit distance to the entity mention e_i.
Optionally, querying the knowledge graph with preset query templates according to the linked entity comprises: recalling from the knowledge graph, through the preset query templates, the subgraphs centered on the linked entity e_j; recording each node or relation in each subgraph as a candidate answer; and recording the query path from the linked entity e_j to each candidate answer.
Optionally, the preset query templates include a first-degree query template and a second-degree query template. The first-degree query template queries the knowledge graph for nodes directly connected to the linked entity e_j, and the second-degree query template queries the knowledge graph for nodes indirectly connected to the linked entity e_j.
Optionally, selecting the most appropriate query path according to a similarity that fuses multiple features comprises: calculating the similarity between each query path and the user's question from the literal matching features of the query path and the question, the semantic matching feature of the query path and the question, the matching feature of the answer type and the question, and the feature of the query path itself, and selecting the most appropriate query path from the multiple query paths.
Optionally, the literal matching features of the query path and the question include the Jaccard distance and the edit distance between two character strings, namely the string corresponding to the query path and the string corresponding to the question; the semantic matching feature is obtained by extracting semantic vectors of the query path and of the question with a BERT language model and computing the cosine similarity between the two vectors; the matching feature of the answer type and the question is the degree to which the answer type agrees with the intent of the question; and the feature of the query path itself is the likelihood that the query path itself leads to the answer.
Optionally, obtaining the final answer to the user's question from the candidate answers corresponding to the most appropriate query path comprises: normalizing the multiple candidate answers corresponding to the most appropriate query path, retaining all answers with different meanings and retaining only one of any answers with the same meaning.
Optionally, whether two answers have the same meaning is determined as follows: calculate the similarity between the two answers, where the similarity includes literal similarity and semantic similarity, and the literal similarity includes the Jaccard distance or the edit distance between the two answers; if the similarity between the two answers is greater than a threshold, the two answers have the same meaning; otherwise, they have different meanings.
According to a second aspect of the present invention, there is provided a path-ranking-based knowledge-graph question-answering system, comprising: the identification module is used for identifying the named entity in the question of the user and acquiring the entity mention; the first acquisition module is used for linking the entity mention to an entity node of a knowledge graph based on the matching degree and the entity popularity of the entity and the question in the question of the user, and acquiring a linked entity of the entity mention on the knowledge graph; the query module is used for querying in the knowledge graph based on a preset query template according to the link entity to obtain at least one query path and candidate answers corresponding to each query path; the second acquisition module is used for acquiring the most appropriate query path according to the similarity of each query path and the question of the user based on the fusion of the multiple features; and the third obtaining module is used for obtaining a final answer result of the question of the user based on the candidate answer corresponding to the most appropriate query path.
According to a third aspect of the present invention, there is provided an electronic device comprising a memory and a processor, the processor implementing the steps of the path-ranking-based knowledge graph question-answering method when executing a computer management program stored in the memory.
According to a fourth aspect of the present invention, there is provided a computer-readable storage medium storing a computer management program which, when executed by a processor, implements the steps of the path-ranking-based knowledge graph question-answering method.
The invention provides a knowledge graph question-answering method and system based on path ranking.
Drawings
FIG. 1 is a flow chart of a knowledge-graph question-answering method based on path ranking according to the present invention;
FIG. 2 is a schematic diagram of a query path;
FIG. 3 is a schematic overall flow chart of a knowledge-graph question-answer based on path ranking according to the present invention;
FIG. 4 is a schematic structural diagram of a knowledge-graph question-answering system based on path ranking according to the present invention;
FIG. 5 is a schematic diagram of a hardware structure of a possible electronic device provided in the present invention;
FIG. 6 is a schematic diagram of a hardware structure of a possible computer-readable storage medium according to the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
FIG. 1 is a flowchart of a knowledge graph question-answering method based on path ranking. As shown in FIG. 1, the method includes: 101. identifying the named entities in a user's question to obtain entity mentions; 102. linking each entity mention to an entity node of the knowledge graph, based on the degree of match between the entity and the question and on the entity's popularity, to obtain the linked entity of the entity mention in the knowledge graph; 103. querying the knowledge graph with preset query templates according to the linked entity to obtain at least one query path and the candidate answers corresponding to each query path; 104. selecting the most appropriate query path according to a similarity between each query path and the user's question that fuses multiple features; 105. obtaining the final answer to the user's question from the candidate answers corresponding to the most appropriate query path.
It can be understood that, in view of the above defects of the prior art and the need for improvement, the invention provides a general knowledge graph question-answering method based on path ranking that addresses the poor generality and heavy manual intervention of conventional knowledge graph question-answering methods. The specific application scenario is finding a correct and appropriate answer to a user's question in a complex knowledge graph.
Specifically, according to an input user question, a named entity is identified from the input user question, entity mentions are further obtained, the entity mentions are linked to entity nodes in the knowledge graph, and the entity nodes in the knowledge graph are called as linked entities. According to the link entity, based on the query template, query in the knowledge graph can be carried out, and a plurality of query paths and candidate answers corresponding to each query path can be obtained. And selecting a most suitable query path from the plurality of query paths, and obtaining a final answer result according to the plurality of candidate answers of the most suitable query path.
According to the method, the corresponding link entity is found in the knowledge graph according to the entity mention in the question of the user, then the corresponding candidate answer and the corresponding query path are queried in the knowledge graph according to the link entity, the most appropriate query path is selected from a plurality of query paths, and the corresponding answer result is obtained.
In a possible embodiment, identifying the named entities in the user's question to obtain entity mentions includes: training an entity recognition model to obtain a trained entity recognition model; and identifying the named entities in the user's question with the trained entity recognition model to obtain the entity mentions. Training the entity recognition model includes: collecting a plurality of user questions, labeling the named entities in each question, and training the entity recognition model on the questions and their named-entity labels.
It can be understood that a BERT-CRF (Bidirectional Encoder Representations from Transformers and Conditional Random Field) entity recognition model performs named entity recognition on the input user question to obtain the entity mention e_i in the question.
The training process of the entity recognition model comprises the steps of collecting user question sentences, labeling named entities in the user question sentences, and inputting the user question sentences and labels into the entity recognition model for training.
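As a minimal sketch of how per-token labels could be turned back into entity mentions, the snippet below assumes a BIO tagging scheme ("B" begins a mention, "I" continues it, "O" is outside). The patent does not state which labeling scheme is used, so the scheme, the function name `extract_mentions`, and the whitespace-joined tokens are all illustrative assumptions.

```python
def extract_mentions(tokens, tags):
    """Collect entity mentions from per-token BIO tags (assumed scheme)."""
    mentions, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":                  # a new mention starts here
            if current:
                mentions.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:    # continue the open mention
            current.append(token)
        else:                           # "O" or a stray "I" closes any open mention
            if current:
                mentions.append(" ".join(current))
            current = []
    if current:                         # flush a mention that ends the sentence
        mentions.append(" ".join(current))
    return mentions
```

The same (tokens, tags) pairs could serve as the labeled training examples described above; the tags here would come from the trained recognition model at inference time.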
In a possible embodiment, linking the entity mention to an entity node of the knowledge graph, based on the degree of match between the entity and the question and on the entity's popularity, to obtain the linked entity of the entity mention in the knowledge graph covers the following cases:
(1) When exactly one entity in the knowledge graph is identical to the entity mention e_i, that entity is the linked entity e_j of the entity mention e_i in the knowledge graph.
(2) When a synonym of an entity in the knowledge graph is identical to the entity mention e_i, that entity is taken as the linked entity e_j.
(3) When multiple entities in the knowledge graph are identical to the entity mention e_i, the entity that best matches the user's question and has high popularity is selected as the linked entity e_j. The entity popularity is given by the frequency with which the entity appears in the knowledge graph and the number of other entities that have a one-degree relationship with it, where a one-degree relationship means that two nodes are directly connected. The degree of match between the entity and the question is given by the match between the entity's type and the question and the match between the entity's two-degree subgraph and the question.
(4) When no entity in the knowledge graph, and no synonym of any entity, is identical to the entity mention e_i, the entity with the smallest edit distance to the entity mention e_i is selected as the linked entity e_j. Here the edit distance between two entities is the edit distance between their corresponding character strings.
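The four cases can be sketched as a single cascade. This is an illustration under stated assumptions, not the patented implementation: the function name `link_entity`, the dictionary layout, and the `score` and `distance` callables are hypothetical, and exact-name and synonym matches are pooled into one candidate list before ranking.

```python
def link_entity(mention, entities, synonyms, score, distance):
    """Pick the linked entity e_j for an entity mention e_i.

    entities: dict mapping entity id -> surface name
    synonyms: dict mapping alias -> list of entity ids (a synonym dictionary)
    score:    callable(entity id) -> float, standing in for the combination of
              question match and popularity (hypothetical)
    distance: callable(str, str) -> number, e.g. an edit distance
    """
    # Cases 1 and 2: entities whose name or synonym equals the mention.
    candidates = [eid for eid, name in entities.items() if name == mention]
    for eid in synonyms.get(mention, []):
        if eid not in candidates:
            candidates.append(eid)

    if len(candidates) == 1:
        return candidates[0]
    if candidates:
        # Case 3: several matches, so rank by question match and popularity.
        return max(candidates, key=score)
    # Case 4: nothing matches, so fall back to the closest name.
    return min(entities, key=lambda eid: distance(mention, entities[eid]))
```

In practice `distance` would be the edit distance between the two strings and `score` would be computed from the question; both are passed in here so the cascade itself stays visible.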
In a possible embodiment, querying the knowledge graph with preset query templates according to the linked entity to obtain at least one query path and the candidate answers corresponding to each query path includes: recalling from the knowledge graph, through the preset query templates, the subgraphs centered on the linked entity e_j; recording each node or relation in each subgraph as a candidate answer; and recording the query path from the linked entity e_j to each candidate answer.
It can be understood that, in the above embodiment, the link entity is found in the knowledge graph, and in this step, the candidate query path and the corresponding candidate answer are obtained by querying in the knowledge graph based on the preset query template according to the link entity.
In a possible embodiment, the preset query templates include a first-degree query template and a second-degree query template. The first-degree query template queries the knowledge graph for nodes directly connected to the linked entity e_j, and the second-degree query template queries the knowledge graph for nodes indirectly connected to the linked entity e_j.
It can be understood that the query templates in the embodiment of the present invention include a first-degree query template and a second-degree query template, as shown in FIG. 2. The first-degree query template covers two directly connected nodes, such as nodes a and b in the items numbered 1 and 2 in FIG. 2. In item 1, r_1 is an outgoing path with respect to node a; in item 2, r_1 is an incoming path with respect to node a.
The second-degree query template takes the linked entity as the center and queries the entities indirectly connected to it. For example, in the items numbered 3, 4, and 5 in FIG. 2, node a is the center and node c is a node that has a two-degree relationship with node a.
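The recall step can be sketched over a list of (head, relation, tail) triples. The sketch below is an assumption-laden illustration: FIG. 2 is not reproduced in the text, so the only second-degree pattern enumerated here is two consecutive outgoing edges, only nodes (not relations) are recorded as candidate answers, and the name `recall_paths` is hypothetical.

```python
from collections import defaultdict


def recall_paths(triples, center):
    """Enumerate query paths from `center` and the candidate answer each reaches.

    triples: iterable of (head, relation, tail) edges of the knowledge graph.
    Returns a list of (path, answer), where path is a tuple of
    (direction, relation) steps and direction is "out" or "in".
    """
    out_edges = defaultdict(list)  # head -> [(relation, tail)]
    in_edges = defaultdict(list)   # tail -> [(relation, head)]
    for head, rel, tail in triples:
        out_edges[head].append((rel, tail))
        in_edges[tail].append((rel, head))

    paths = []
    # First-degree, outgoing: center -rel-> b
    for rel, b in out_edges.get(center, []):
        paths.append(((("out", rel),), b))
        # Second-degree, outgoing only (assumed pattern): center -rel-> b -rel2-> c
        for rel2, c in out_edges.get(b, []):
            if c != center:
                paths.append(((("out", rel), ("out", rel2)), c))
    # First-degree, incoming: b -rel-> center
    for rel, b in in_edges.get(center, []):
        paths.append(((("in", rel),), b))
    return paths
```

For the edges a -r1-> b, b -r2-> c, and d -r3-> a, calling `recall_paths(triples, "a")` yields one outgoing first-degree path to b, one second-degree path to c, and one incoming first-degree path to d.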
In a possible embodiment, selecting the most appropriate query path according to a similarity between each query path and the user's question that fuses multiple features includes: calculating the similarity between each query path and the user's question from the literal matching features of the query path and the question, the semantic matching feature of the query path and the question, the matching feature of the answer type and the question, and the feature of the query path itself, and selecting the most appropriate query path from the multiple query paths.
It can be understood that, in the above steps, a plurality of candidate query paths and corresponding candidate answers are obtained according to the query template. In the step, all candidate query paths and the question of the user are compared and ranked based on the similarity of multiple feature fusions, the most appropriate query path is obtained, and the corresponding answer result is returned.
The plurality of features referred to herein include literal matching features of the query path and the question, semantic matching features of the query path and the question, matching features of the answer type and the question, and features of the query path itself.
The literal matching features of the query path and the question include the Jaccard distance and the edit distance between two character strings, namely the string corresponding to the query path and the string corresponding to the question. The edit distance, also called the Levenshtein distance, is the minimum number of editing operations required to convert one string into the other. The permitted editing operations are replacing one character with another, inserting a character, and deleting a character. In general, the smaller the edit distance, the more similar the two strings.
The semantic matching feature of the query path and the question is obtained by using a BERT language model to extract a semantic vector for the query path and a semantic vector for the question, and then computing the similarity between the two vectors with the cosine function.
The matching feature of the answer type and the question is the degree to which the answer type agrees with the intent of the question, where intents include person, place, time, quantity, and so on. For the question "Who invented the car?", the intent is to ask about a person, so the type of the answer entity should also be a person.
The feature of the query path itself is the likelihood that the query path itself leads to the answer. For example, a first-degree path is more likely to lead to the answer than a second-degree path, and an outgoing path is more likely to lead to the answer than an incoming path.
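Both literal features can be computed with a few lines of code. The sketch below treats each string as a set of characters for the Jaccard distance (one minus the size of the intersection over the size of the union) and uses the standard dynamic program for the Levenshtein distance. Character-level sets and the helper `literal_similarity`, which converts both distances into similarities in [0, 1], are assumptions made for illustration.

```python
def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| over the character sets of the two strings."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:                       # two empty strings are identical
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def edit_distance(a, b):
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))      # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]                       # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # replace ca with cb
        prev = cur
    return prev[-1]


def literal_similarity(path_text, question):
    """Return (Jaccard similarity, normalized edit similarity), both in [0, 1]."""
    longest = max(len(path_text), len(question)) or 1
    edit_sim = 1.0 - edit_distance(path_text, question) / longest
    jaccard_sim = 1.0 - jaccard_distance(path_text, question)
    return jaccard_sim, edit_sim
```

Normalizing the edit distance by the longer string's length is one common way to make it comparable with the other features; the patent does not say how the distances are scaled.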
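Once the two semantic vectors are available, the cosine step is straightforward. The sketch below assumes the vectors have already been produced by the language model and are passed in as plain lists of floats; the encoding step itself is not shown, and the function name `cosine_similarity` is illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:  # undefined for zero vectors; treat as no similarity
        return 0.0
    return dot / (norm_u * norm_v)
```

Vectors pointing in the same direction score close to 1, and orthogonal vectors score 0, regardless of their lengths.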
The similarity between each query path and the user's question is then calculated by combining the literal matching features of the query path and the question, the semantic matching feature of the query path and the question, the matching feature of the answer type and the question, and the feature of the query path itself, and the query path with the highest similarity is selected as the most appropriate query path for the user's question.
In a possible embodiment, obtaining the final answer to the user's question from the candidate answers corresponding to the most appropriate query path includes: normalizing the multiple candidate answers corresponding to the most appropriate query path, retaining all answers with different meanings and retaining only one of any answers with the same meaning.
It can be understood that the most appropriate query path may correspond to multiple candidate answers, that is, the same path may have multiple answers, so normalization is required. Results with different meanings are all retained; for example, if a person's occupations are scientist, biologist, and inventor, all of them are retained. For multiple results with the same meaning, such as "Beijing" and "Beijing (one of China's four direct-administered municipalities)", only one is retained.
Whether two answers have the same meaning is determined as follows: calculate the similarity between the two answers, where the similarity includes literal similarity and semantic similarity, and the literal similarity includes the Jaccard distance or the edit distance between the two answers; if the similarity between the two answers is greater than a threshold, the two answers have the same meaning; otherwise, they have different meanings.
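The patent does not state how the four features are fused. One simple possibility, shown here purely as an assumption, is a weighted sum followed by taking the highest-scoring path. The class `PathFeatures`, the weight values, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PathFeatures:
    path: str          # readable form of the query path, e.g. its relation name
    literal: float     # literal match with the question, in [0, 1]
    semantic: float    # cosine similarity of semantic vectors
    type_match: float  # agreement between answer type and question intent
    path_prior: float  # likelihood of the path itself (degree, direction)


# Hypothetical weights; in practice they would be tuned or learned.
WEIGHTS = {"literal": 0.2, "semantic": 0.4, "type_match": 0.3, "path_prior": 0.1}


def fused_score(f, weights=WEIGHTS):
    """Weighted sum of the four features (an assumed fusion rule)."""
    return (weights["literal"] * f.literal
            + weights["semantic"] * f.semantic
            + weights["type_match"] * f.type_match
            + weights["path_prior"] * f.path_prior)


def best_path(candidates, weights=WEIGHTS):
    """Return the candidate with the highest fused similarity."""
    return max(candidates, key=lambda f: fused_score(f, weights))
```

A learned ranking model could replace the fixed weights without changing the surrounding flow, since only the scoring function would differ.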
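The normalization step can be sketched as a greedy pass that keeps an answer only if it is not too similar to any answer already kept. The function name `normalize_answers`, the default threshold of 0.8, and the `similarity` callable (which stands in for the combined literal and semantic similarity) are assumptions for illustration.

```python
def normalize_answers(answers, similarity, threshold=0.8):
    """Keep one representative per group of same-meaning answers.

    similarity: callable(str, str) -> float; two answers are treated as having
    the same meaning when their similarity is greater than `threshold`.
    """
    kept = []
    for answer in answers:
        # Retain the answer only if it differs in meaning from everything kept so far.
        if all(similarity(answer, other) <= threshold for other in kept):
            kept.append(answer)
    return kept
```

Because the pass is greedy, the first answer in each same-meaning group is the one retained, so the input order decides which surface form survives.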
Referring to fig. 3, a specific example of the method for learning-graph question-answering based on path ranking according to the present invention is shown, where a user question "who is written by a stone? "the entity in the question is identified by the entity identification module to refer to" stone notes ", and the corresponding graph entities may be < stone notes (uk movie in 1972) >, < dream of red house (qing dynasty long story) >, < stone notes (2015 chinese shooting tv drama) >, < stone notes (damming party singing song) >. These entities are referred to herein as candidate entities.
Then, in the entity link module, the candidate entities are scored based on the characteristics of the synonym dictionary, the matching degree of the entities and the problems, the popularity of the entities and the like, the writing in the question is more inclined to the novel, and the popularity of the entity < dream of red building (Qing generation long story novel) > in the knowledge graph is much higher than that of other entities, so that the selected link entity is < dream of red building (Qing generation long story novel) >.
Next, based on the compiled query template, querying information related to < dream of red mansions (qing dynasty long story) > in the knowledge graph, and recording a query path and a corresponding answer result, which are called a candidate query path and a candidate answer, specifically using a template as shown in fig. 2, wherein a node a is a link entity, as can be seen from fig. 2, the template mainly comprises a first-degree query template and a second-degree query template, wherein the first-degree query template comprises an incoming direction and an outgoing direction, and the second-degree query template comprises a template in which the ignoring direction is the incoming direction, so that excessive relation of recalled subgraphs is prevented, and the effect of sorting subsequent paths is reduced.
Then, in the path ranking module, the candidate paths are compared with the question and ranked by a similarity that fuses multiple features, including the literal matching features of the path and the question, the semantic matching features of the path and the question, the matching features of the answer type and the question, and the features of the candidate path itself. For the question "Who wrote The Story of the Stone?", several features indicate that the question asks about a person, so the answer type should be a person, and the semantic similarity score between "author" and "who wrote" is high. The path with the relation "author" therefore scores highest in the path ranking module, and its end points are returned as the answer result.
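The multi-feature fusion can be illustrated with a weighted sum of four features. This is a sketch under stated assumptions, not the method as filed: the equal weights, the `CandidatePath` fields, and the token-level Jaccard measure are hypothetical, and the semantic feature uses a bag-of-words cosine purely as a stand-in for the BERT sentence vectors named in the claims.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class CandidatePath:
    relations: Tuple[str, ...]
    answer_type: str
    prior: float  # assumed likelihood of the path itself, in [0, 1]


def jaccard(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u: Counter, v: Counter) -> float:
    dot = sum(u[k] * v[k] for k in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * \
        math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def score_path(question: str, path: CandidatePath, intent_type: str,
               weights: Sequence[float] = (0.25, 0.25, 0.25, 0.25)) -> float:
    """Weighted sum of literal, semantic, type-match and path features."""
    q_tokens = question.lower().replace("?", "").split()
    p_tokens = " ".join(path.relations).lower().split()
    literal = jaccard(set(q_tokens), set(p_tokens))
    semantic = cosine(Counter(q_tokens), Counter(p_tokens))  # BERT stand-in
    type_match = 1.0 if path.answer_type == intent_type else 0.0
    features = (literal, semantic, type_match, path.prior)
    return sum(w * f for w, f in zip(weights, features))


def best_path(question: str, paths: List[CandidatePath],
              intent_type: str) -> CandidatePath:
    return max(paths, key=lambda p: score_path(question, p, intent_type))


paths = [
    CandidatePath(("author",), "person", 0.5),
    CandidatePath(("publication date",), "date", 0.5),
]
chosen = best_path("Who wrote The Story of the Stone?", paths, "person")
```

With the bag-of-words stand-in, "author" and "wrote" share no tokens, so the type-match feature decides this example; a real sentence encoder would also contribute through the semantic feature.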
Finally, in the result normalization module, the same path corresponds to multiple results, but these results are not semantically identical, so all of them are retained. The final answer to the question "Who wrote The Story of the Stone?" is therefore "Gao E, Cao Xueqin".
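Normalization of the answers on the chosen path can be sketched as a greedy de-duplication. The similarity function is swapped in for illustration: `difflib.SequenceMatcher` from the Python standard library replaces the Jaccard, edit-distance, and semantic measures of the source, and the threshold of 0.8, the function names, and the sample strings are assumptions.

```python
from difflib import SequenceMatcher
from typing import Callable, List


def literal_similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1] based on matching subsequences."""
    return SequenceMatcher(None, a, b).ratio()


def normalize_answers(
    answers: List[str],
    threshold: float = 0.8,
    similarity: Callable[[str, str], float] = literal_similarity,
) -> List[str]:
    """Keep one representative per group of same-meaning answers.

    Two answers count as the same meaning when their similarity is
    greater than the threshold; answers with different meanings are
    all retained, in their original order.
    """
    kept: List[str] = []
    for answer in answers:
        if all(similarity(answer, k) <= threshold for k in kept):
            kept.append(answer)
    return kept


result = normalize_answers(["Cao Xueqin", "Cao Xueqin", "Gao E"])
```

Passing a different `similarity` callable makes it possible to plug in a measure that also accounts for semantics.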
Fig. 4 is a structure diagram of a knowledge-graph question-answering system based on path ranking according to an embodiment of the present invention, and as shown in fig. 4, a knowledge-graph question-answering system based on path ranking includes an identification module 401, a first obtaining module 402, a query module 403, a second obtaining module 404, and a third obtaining module 405, where:
the identification module 401 is configured to identify a named entity in a user question and obtain an entity mention; the first obtaining module 402 is configured to link the entity mention to an entity node of the knowledge graph based on the matching degree between the entity and the user question and on the entity popularity, and obtain the linked entity of the entity mention on the knowledge graph; the query module 403 is configured to query the knowledge graph according to the linked entity based on preset query templates to obtain at least one query path and the candidate answers corresponding to each query path; the second obtaining module 404 is configured to obtain the most appropriate query path according to the similarity, based on multi-feature fusion, between each query path and the user question; the third obtaining module 405 is configured to obtain the final answer result of the user question based on the candidate answers corresponding to the most appropriate query path.
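The way the five modules chain together can be sketched as a small pipeline object. The class name, field names, and type signatures are hypothetical, and the stub functions in the usage example merely stand in for the real modules 401 to 405.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Path = Tuple[str, ...]
Candidate = Tuple[Path, List[str]]  # (query path, candidate answers)


@dataclass
class KGQAPipeline:
    recognize: Callable[[str], str]                    # module 401
    link: Callable[[str, str], str]                    # module 402
    query: Callable[[str], List[Candidate]]            # module 403
    rank: Callable[[str, List[Candidate]], Candidate]  # module 404
    finalize: Callable[[List[str]], List[str]]         # module 405

    def answer(self, question: str) -> List[str]:
        mention = self.recognize(question)       # entity mention
        entity = self.link(mention, question)    # linked entity
        candidates = self.query(entity)          # paths and answers
        _, answers = self.rank(question, candidates)
        return self.finalize(answers)            # normalised result


# Stub modules for illustration only.
pipeline = KGQAPipeline(
    recognize=lambda q: "stone",
    link=lambda mention, q: "E1",
    query=lambda e: [(("author",), ["X", "X", "Y"]), (("genre",), ["novel"])],
    rank=lambda q, cands: cands[0],
    finalize=lambda answers: list(dict.fromkeys(answers)),
)
final = pipeline.answer("who wrote it?")
```

Keeping each module behind a plain callable mirrors the module boundaries in fig. 4 and lets any single stage be replaced without touching the others.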
It can be understood that the knowledge-graph question-answering system based on the path sorting provided by the invention corresponds to the knowledge-graph question-answering method based on the path sorting provided by the foregoing embodiments, and the relevant technical features of the knowledge-graph question-answering system based on the path sorting may refer to the relevant technical features of the knowledge-graph question-answering method based on the path sorting, and are not described herein again.
Referring to fig. 5, fig. 5 is a schematic view of an embodiment of an electronic device according to an embodiment of the invention. As shown in fig. 5, an embodiment of the present invention provides an electronic device 500, which includes a memory 510, a processor 520, and a computer program 511 stored in the memory 510 and executable on the processor 520, wherein the processor 520 executes the computer program 511 to implement the following steps: identifying a named entity in a user question and obtaining an entity mention; linking the entity mention to an entity node of a knowledge graph based on the matching degree between the entity and the user question and on the entity popularity, to obtain the linked entity of the entity mention on the knowledge graph; querying the knowledge graph according to the linked entity based on preset query templates to obtain at least one query path and the candidate answers corresponding to each query path; obtaining the most appropriate query path according to the similarity, based on multi-feature fusion, between each query path and the user question; and obtaining the final answer result of the user question based on the candidate answers corresponding to the most appropriate query path.
Referring to fig. 6, fig. 6 is a schematic diagram of an embodiment of a computer-readable storage medium according to the present invention. As shown in fig. 6, the present embodiment provides a computer-readable storage medium 600 having a computer program 611 stored thereon, the computer program 611, when executed by a processor, implementing the steps of: identifying a named entity in a user question and obtaining an entity mention; linking the entity mention to an entity node of a knowledge graph based on the matching degree between the entity and the user question and on the entity popularity, to obtain the linked entity of the entity mention on the knowledge graph; querying the knowledge graph according to the linked entity based on preset query templates to obtain at least one query path and the candidate answers corresponding to each query path; obtaining the most appropriate query path according to the similarity, based on multi-feature fusion, between each query path and the user question; and obtaining the final answer result of the user question based on the candidate answers corresponding to the most appropriate query path.
The knowledge graph question-answering method and system based on path ranking provided by the embodiments of the invention have the following beneficial effects:
(1) A corresponding linked entity is found in the knowledge graph according to the entity mention in the user question; the corresponding candidate answers and query paths are queried in the knowledge graph according to the linked entity; and the most appropriate query path is selected from the multiple query paths to obtain the corresponding answer result.
(2) Entity linking of the named entity is completed using the synonym dictionary, the matching degree between the question and the entity type, and the popularity of the entity, which improves the accuracy of entity linking.
(3) The query paths are compared with the user question and ranked based on a similarity that fuses multiple features to obtain the most appropriate query path. The features comprise literal matching, numerical statistical matching, and semantic matching; semantic matching is particularly important for question answering over a general knowledge graph and helps improve the generality of the question answering.
It should be noted that, in the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to relevant descriptions of other embodiments for parts that are not described in detail in a certain embodiment.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A knowledge graph question-answering method based on path ranking, characterized by comprising the following steps:
identifying a named entity in a user question and obtaining an entity mention;
linking the entity mention to an entity node of a knowledge graph based on the matching degree between the entity and the user question and on the entity popularity, to obtain the linked entity of the entity mention on the knowledge graph;
querying the knowledge graph according to the linked entity based on preset query templates to obtain at least one query path and the candidate answers corresponding to each query path;
obtaining the most appropriate query path according to the similarity, based on multi-feature fusion, between each query path and the user question;
and obtaining the final answer result of the user question based on the candidate answers corresponding to the most appropriate query path.
2. The knowledge-graph question-answering method according to claim 1, wherein the identifying named entities in the user question and obtaining entity mentions comprises:
training the entity recognition model to obtain the trained entity recognition model;
identifying the named entity in the user question based on the trained entity recognition model to obtain the entity mention;
wherein, training the entity recognition model to obtain the trained entity recognition model comprises:
collecting a plurality of user question sentences, labeling the named entities in each user question sentence, training the entity recognition model based on the plurality of user question sentences and the corresponding named entity labels, and obtaining the trained entity recognition model.
3. The knowledge graph question-answering method according to claim 1 or 2, wherein linking the entity mention to an entity node of the knowledge graph based on the matching degree between the entity and the user question and on the entity popularity, to obtain the linked entity of the entity mention on the knowledge graph, comprises:
when only one entity in the knowledge graph is consistent with the entity mention e_i, that entity in the knowledge graph is the linked entity e_j of the entity mention e_i in the knowledge graph;
when a synonym of an entity in the knowledge graph is consistent with the entity mention e_i, that entity in the knowledge graph is taken as the linked entity e_j;
when multiple entities in the knowledge graph are consistent with the entity mention e_i, the entity with a high matching degree with the user question and a high popularity is selected as the linked entity e_j, wherein the popularity of an entity is the frequency with which the entity appears in the knowledge graph and the number of other entities having a first-degree relationship with the entity, and the matching degree between the entity and the question is the matching degree between the entity type and the question and the matching degree between the second-degree subgraph of the entity and the question;
when no entity in the knowledge graph is consistent with the entity mention e_i, and no synonym of an entity is consistent with the entity mention e_i, the entity with the smallest edit distance to the entity mention e_i is selected as the linked entity e_j.
4. The knowledge-graph question-answering method according to claim 1, wherein querying the knowledge graph according to the linked entity based on preset query templates to obtain at least one query path and the candidate answers corresponding to each query path comprises:
recalling, in the knowledge graph through the preset query templates, the subgraph centered on the linked entity e_j, recording each node or relation in the subgraph as a candidate answer, and recording the query path from the linked entity e_j to each candidate answer.
5. The knowledge-graph question-answering method according to claim 4, wherein the preset query templates comprise a first-degree query template and a second-degree query template, the first-degree query template queries the knowledge graph for nodes directly connected to the linked entity e_j, and the second-degree query template queries the knowledge graph for nodes indirectly connected to the linked entity e_j.
6. The knowledge-graph question-answering method according to claim 1, wherein the obtaining of the most appropriate query path according to the similarity of each query path and the user question based on the fusion of a plurality of features comprises:
and calculating the similarity between each query path and the question of the user based on the literal matching characteristics of the query path and the question, the semantic matching characteristics of the query path and the question, the matching characteristics of the answer type and the question and the characteristics of the query path, and selecting the most appropriate query path from the multiple query paths.
7. The knowledge-graph question-answering method according to claim 6,
the literal matching features of the query path and the question comprise the Jaccard distance and the edit distance between two character strings, wherein the two character strings are the character string corresponding to the query path and the character string corresponding to the question;
the semantic matching features of the query path and the question are obtained by using a BERT language model to extract semantic vectors of the query path and of the question respectively, and using a cosine function to calculate the similarity between the two semantic vectors;
the matching feature of the answer type and the question is the degree of consistency between the answer type and the intention of the question;
the feature of the query path itself is the likelihood that the query path itself yields the answer result.
8. The knowledge-graph question-answering method according to claim 1, wherein the obtaining of the final answer result of the user question based on the candidate answer corresponding to the most suitable query path includes:
and carrying out normalization operation on a plurality of candidate answers corresponding to the most proper query path, reserving all answer results with different meanings, and reserving only one answer result with the same meaning.
9. The knowledge-graph question answering method according to claim 8, wherein it is determined whether two answer results are the same meaning by:
calculating the similarity between the two answer results, wherein the similarity comprises a literal similarity and a semantic similarity, and the literal similarity comprises the Jaccard distance or the edit distance between the two answer results;
if the similarity between the two answer results is greater than the threshold similarity, the two answer results have the same meaning; otherwise, the two answer results have different meanings.
10. A knowledge-graph question-answering system based on path ranking, comprising:
an identification module, configured to identify a named entity in a user question and obtain an entity mention;
a first obtaining module, configured to link the entity mention to an entity node of a knowledge graph based on the matching degree between the entity and the user question and on the entity popularity, and obtain the linked entity of the entity mention on the knowledge graph;
a query module, configured to query the knowledge graph according to the linked entity based on preset query templates to obtain at least one query path and the candidate answers corresponding to each query path;
a second obtaining module, configured to obtain the most appropriate query path according to the similarity, based on multi-feature fusion, between each query path and the user question;
and a third obtaining module, configured to obtain the final answer result of the user question based on the candidate answers corresponding to the most appropriate query path.
CN202110809041.5A 2021-07-16 2021-07-16 Knowledge graph question-answering method and system based on path sorting Pending CN113742446A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110809041.5A CN113742446A (en) 2021-07-16 2021-07-16 Knowledge graph question-answering method and system based on path sorting

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110809041.5A CN113742446A (en) 2021-07-16 2021-07-16 Knowledge graph question-answering method and system based on path sorting

Publications (1)

Publication Number Publication Date
CN113742446A true CN113742446A (en) 2021-12-03

Family

ID=78728728

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110809041.5A Pending CN113742446A (en) 2021-07-16 2021-07-16 Knowledge graph question-answering method and system based on path sorting

Country Status (1)

Country Link
CN (1) CN113742446A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114168756A (en) * 2022-01-29 2022-03-11 浙江口碑网络技术有限公司 Query understanding method and apparatus for search intention, storage medium, and electronic device
CN114564599A (en) * 2022-04-28 2022-05-31 中科雨辰科技有限公司 Retrieval system based on query string template
CN115982338A (en) * 2023-02-24 2023-04-18 中国测绘科学研究院 Query path ordering-based domain knowledge graph question-answering method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110837550A (en) * 2019-11-11 2020-02-25 中山大学 Knowledge graph-based question and answer method and device, electronic equipment and storage medium
CN112328766A (en) * 2020-11-10 2021-02-05 四川长虹电器股份有限公司 Knowledge graph question-answering method and device based on path search
CN112650840A (en) * 2020-12-04 2021-04-13 天津泰凡科技有限公司 Intelligent medical question-answering processing method and system based on knowledge graph reasoning

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114168756A (en) * 2022-01-29 2022-03-11 浙江口碑网络技术有限公司 Query understanding method and apparatus for search intention, storage medium, and electronic device
CN114168756B (en) * 2022-01-29 2022-05-13 浙江口碑网络技术有限公司 Query understanding method and device for search intention, storage medium and electronic device
CN114564599A (en) * 2022-04-28 2022-05-31 中科雨辰科技有限公司 Retrieval system based on query string template
CN115982338A (en) * 2023-02-24 2023-04-18 中国测绘科学研究院 Query path ordering-based domain knowledge graph question-answering method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination