CN112035627B - Automatic question and answer method, device, equipment and storage medium - Google Patents


Info

Publication number
CN112035627B
CN112035627B (application number CN202010731550.6A)
Authority
CN
China
Prior art keywords
question
knowledge
natural language
expression
representation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010731550.6A
Other languages
Chinese (zh)
Other versions
CN112035627A (en)
Inventor
傅向华
杨静莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Technology University
Original Assignee
Shenzhen Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Technology University filed Critical Shenzhen Technology University
Priority to CN202010731550.6A priority Critical patent/CN112035627B/en
Publication of CN112035627A publication Critical patent/CN112035627A/en
Application granted granted Critical
Publication of CN112035627B publication Critical patent/CN112035627B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval of unstructured textual data
    • G06F 16/33: Querying
    • G06F 16/332: Query formulation
    • G06F 16/3329: Natural language query formulation or dialogue systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval of unstructured textual data
    • G06F 16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367: Ontology
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention is applicable to the technical field of computers and provides an automatic question-answering method, apparatus, device and storage medium. The method comprises: acquiring a natural language question; searching for knowledge information of the natural language question in a preset knowledge-graph-based question-answer library; inputting the natural language question and the knowledge information into a pre-trained automatic question-answering model; and encoding, fusing and decoding the natural language question and the knowledge information through the automatic question-answering model to obtain the answer to the natural language question output by the model, thereby effectively improving the accuracy of automatic question answering.

Description

Automatic question and answer method, device, equipment and storage medium
Technical Field
The invention belongs to the fields of natural language processing, machine learning and artificial intelligence within the technical field of computers, and in particular relates to an automatic question-answering method, apparatus, device and storage medium.
Background
Automatic question-answering technology draws on natural language processing, machine learning, artificial intelligence and the like; it automatically analyzes questions posed by a user in natural language and returns corresponding answers to the user.
Researchers have proposed various types of automatic question-answering models, for example models based on the sequence-to-sequence (Seq2Seq) network. However, most current automatic question-answering models face a lack of knowledge, so they cannot accurately understand questions and the answers they produce have low accuracy.
Disclosure of Invention
The invention aims to provide an automatic question-answering method, apparatus, device and storage medium, so as to solve the problem that answers in automatic question answering have low accuracy because the prior art cannot provide an effective automatic question-answering method.
In one aspect, the present invention provides an automatic question-answering method, the method comprising the steps of:
acquiring a natural language question;
searching knowledge information of the natural language question in a preset question-answering library based on the knowledge graph;
and inputting the natural language question and the knowledge information into a pre-trained automatic question-answer model to obtain an answer of the natural language question output by the automatic question-answer model.
In another aspect, the present invention provides an automatic question answering apparatus, the apparatus comprising:
the question acquisition unit is used for acquiring natural language questions;
the knowledge searching unit is used for searching knowledge information of the natural language question in a preset question-answer library based on the knowledge graph; and
and the answer generating unit is used for inputting the natural language question and the knowledge information into a pre-trained automatic question-answer model to obtain the answer of the natural language question output by the automatic question-answer model.
In another aspect, the present invention further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor implements the steps of the automatic question-answering method described above when executing the computer program.
In another aspect, the present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the automatic question-answering method described above.
According to the invention, a natural language question is acquired, knowledge information of the question is searched for in the preset knowledge-graph-based question-answer library, and the question and the knowledge information are input into the pre-trained automatic question-answering model to obtain the answer output by the model. By looking up the knowledge information in a knowledge-graph-based question-answer library and feeding both the question and that knowledge into the question-answering model, the fusion of natural language questions with knowledge information is realized, the knowledge-shortage problem in automatic question answering is effectively alleviated, the efficiency of acquiring knowledge information is improved, and the accuracy and efficiency of automatic question answering are further improved.
Drawings
FIG. 1 is a flowchart of an automatic question-answering method according to an embodiment of the present invention;
fig. 2 is a flowchart of an implementation of an automatic question-answering method according to a second embodiment of the present invention;
fig. 3 is a schematic structural diagram of an automatic question-answering apparatus according to a third embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The following describes in detail the implementation of the present invention in connection with specific embodiments:
embodiment one:
fig. 1 shows a flow of implementing an automatic question-answering method according to a first embodiment of the present invention, and for convenience of explanation, only the parts related to the embodiment of the present invention are shown, which are described in detail below:
in step S101, a natural language question is acquired.
The embodiment of the invention is suitable for electronic equipment such as computers, servers, tablet computers, smart phones and the like.
In the embodiment of the invention, the natural language question input by the user, sent by a terminal device, or acquired in advance can be obtained, so as to answer the acquired natural language question. Natural language refers to languages such as Chinese, English or Japanese, and a natural language question is a question expressed in natural language, for example "what medicine should a child take for a cold?". The number of natural language questions is greater than or equal to 1; when there is more than one, each natural language question can be answered automatically in turn.
In step S102, knowledge information of a natural language question is searched in a preset knowledge-graph-based question-answer library.
In the embodiment of the invention, the knowledge graph comprises multiple pieces of knowledge, and each piece of knowledge can be expressed as a Subject-Predicate-Object (SPO) triple, where the subject and the object represent entities and the predicate represents the relationship between the two entities, so each piece of knowledge can be further understood in the form: entity - entity relationship - entity. The knowledge-graph-based question-answer library comprises a plurality of preset entity types, entity-relationship types and question types, and therefore contains many pieces of knowledge. According to the natural language question, the corresponding knowledge can be looked up in the knowledge graph of the question-answer library, and the knowledge information of the question can be obtained from that knowledge, which effectively improves the efficiency of acquiring knowledge information.
By way of example, taking a knowledge-graph-based medical question-answer library: its entity types may include "diagnostic examination items" (e.g. bronchography, arthroscopy), "medical departments" (e.g. internal medicine, cosmetic orthopedics), "diseases", "medicines", etc.; its entity-relationship types may include "belongs to" (e.g. symptom XX belongs to department XX), "disease commonly used drugs" (e.g. disease XX commonly uses drug XX), and "disease eating-prohibited foods" (e.g. disease XX prohibits eating food XX); its question types may include "query for disease symptoms" (e.g. the question type of "what are the symptoms of disease XX" is "query for disease symptoms") and "query for disease etiology" (e.g. the question type of "why does disease XX occur" is "query for disease etiology").
In the embodiment of the invention, knowledge information related to the natural language question can be looked up in the knowledge-graph-based question-answer library through a database query statement. The number of knowledge information items of a natural language question is one or more, and each item can be a sentence or a paragraph. For example, for the natural language question "what medicine is used to treat a young child's cold and fever", the related knowledge information "when a young child has cold symptom XX, drug XX is commonly used" and "when a young child has a fever, drug XX is commonly used" can be looked up simultaneously in the knowledge-graph-based question-answer library.
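As a toy illustration of looking up SPO knowledge for a question (the entities, relations and drug names below are hypothetical, not from the patent's question-answer library):

```python
# Minimal sketch: knowledge stored as Subject-Predicate-Object (SPO) triples,
# queried by subject entity and relation type. All names are illustrative.
TRIPLES = [
    ("infant cold", "disease commonly used drugs", "drug A"),
    ("infant fever", "disease commonly used drugs", "drug B"),
    ("infant cold", "disease eating-prohibited foods", "cold drinks"),
]

def lookup_knowledge(subject: str, predicate: str) -> list[str]:
    """Return all objects matching the (subject, predicate) pair."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]

print(lookup_knowledge("infant cold", "disease commonly used drugs"))  # ['drug A']
```

In the patent's setting the lookup runs against a full knowledge-graph question-answer library rather than an in-memory list, but the SPO access pattern is the same.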
In step S103, the natural language question and knowledge information are input into an automatic question-answer model trained in advance, and the natural language question and knowledge information are encoded, fused and decoded through the automatic question-answer model, so as to obtain the answer of the natural language question output by the automatic question-answer model.
In the embodiment of the invention, after the knowledge information of the natural language question is obtained, it can be used as external extended knowledge of the question, so as to supplement the question with knowledge and alleviate the knowledge-shortage problem in the automatic question-answering process.
In the embodiment of the invention, the natural language question and its knowledge information can be input into the automatic question-answering model. In the model, the natural language question and the knowledge information are encoded separately to obtain a question representation of the question and a knowledge representation of the knowledge information; the two representations are fused to obtain the question representation fused with the knowledge representation; and this fused representation is decoded to obtain the answer to the natural language question. The question representation is a word-vector sequence of the natural language question, the knowledge representation is a word-vector sequence of the knowledge information, and the automatic question-answering model is a deep neural network model.
In the embodiment of the invention, the knowledge information of the natural language question is looked up in the knowledge-graph-based question-answer library, and the question and the knowledge information are encoded, fused and decoded by the pre-trained automatic question-answering model to obtain the answer output by the model. Obtaining the knowledge information from a knowledge-graph-based question-answer library improves the efficiency of acquiring knowledge information, and processing the question together with the knowledge information in the automatic question-answering model alleviates the lack of knowledge faced by automatic question answering, further improving the accuracy and efficiency of automatic question answering.
Embodiment two:
fig. 2 shows a flow of implementing the automatic question-answering method according to the second embodiment of the present invention, and for convenience of explanation, only the parts related to the embodiment of the present invention are shown, and the details are as follows:
in step S201, a natural language question is acquired.
In the embodiment of the present invention, step S201 may refer to the detailed description of step S101, and will not be described again.
In step S202, knowledge information of a natural language question is searched in a preset knowledge-graph-based question-answer library.
In the embodiment of the invention, the knowledge graph comprises multiple pieces of knowledge; each piece can be expressed as a Subject-Predicate-Object triple and further understood in the form: entity - entity relationship - entity. The knowledge-graph-based question-answer library comprises a plurality of preset entity types, entity-relationship types and question types, and therefore contains many pieces of knowledge.
In the embodiment of the invention, knowledge information related to natural language questions can be searched in a question-answer library based on the knowledge graph through database query sentences. The number of knowledge information of the natural language question is one or more, and each knowledge information can be a sentence or a paragraph.
In the embodiment of the invention, in the process of looking up knowledge information related to the natural language question in the knowledge-graph-based question-answer library through a database query statement, the preset category words in the library can first be acquired and matched against the natural language question to determine the question category or categories to which it belongs; a question may belong to one or more categories, for example the question "is a cold likely to cause a fever" may belong to the question type "query for disease symptoms" or to the type "query for required examinations". Then keywords and subject words are extracted from the natural language question according to the preset keywords and subject words in the question-answer library, for example with the Aho-Corasick (AC) automaton algorithm. Both keywords and subject words are important words in a sentence, but subject words carry more weight than keywords, and a natural language question usually contains few subject words and several keywords. For example, in the question "what medicine is used to treat a young child's cough and fever", "cough" and "fever" are the subject words, while "young child", "medicine" and "treat" are the keywords.
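The keyword/subject-word extraction step names the Aho-Corasick automaton; a compact pure-Python sketch of that matching algorithm (the word lists and sentences are hypothetical):

```python
from collections import deque

def build_automaton(words):
    """Build an Aho-Corasick automaton: a trie plus BFS-computed failure links."""
    trie = [{"next": {}, "fail": 0, "out": []}]
    for w in words:
        node = 0
        for ch in w:
            if ch not in trie[node]["next"]:
                trie[node]["next"][ch] = len(trie)
                trie.append({"next": {}, "fail": 0, "out": []})
            node = trie[node]["next"][ch]
        trie[node]["out"].append(w)
    queue = deque(trie[0]["next"].values())  # depth-1 nodes fail to the root
    while queue:
        cur = queue.popleft()
        for ch, nxt in trie[cur]["next"].items():
            queue.append(nxt)
            f = trie[cur]["fail"]
            while f and ch not in trie[f]["next"]:
                f = trie[f]["fail"]
            trie[nxt]["fail"] = trie[f]["next"].get(ch, 0)
            # inherit outputs reachable through the failure link
            trie[nxt]["out"] += trie[trie[nxt]["fail"]]["out"]
    return trie

def match(trie, text):
    """Scan text once, reporting every dictionary word it contains."""
    node, found = 0, []
    for ch in text:
        while node and ch not in trie[node]["next"]:
            node = trie[node]["fail"]
        node = trie[node]["next"].get(ch, 0)
        found += trie[node]["out"]
    return found

trie = build_automaton(["cough", "fever", "child"])
print(match(trie, "child cough and fever"))  # ['child', 'cough', 'fever']
```

In practice the dictionary would hold the question-answer library's preset subject words and keywords, with subject words and keywords kept in separate automata or tagged on output.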
After the question type, subject words and keywords of the natural language question are obtained, a database query statement can be formed from them and used to look up the knowledge information of the question in the knowledge-graph-based question-answer library. In this way the knowledge information is obtained by first semantically parsing the natural language question and then querying according to the resulting question type, subject words and keywords, which reduces the complexity of the knowledge-information query and improves the efficiency of acquiring knowledge information.
In a possible implementation, the knowledge-graph-based question-answer library is a graph database, which improves the storage and lookup efficiency of the knowledge graph. For example, the library may be the graph database Neo4j, and the database lookup statement may be a structured Cypher query. As another example, an existing knowledge-graph-based medical question-answer library may be used.
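Assuming the Neo4j/Cypher setup mentioned here, a lookup statement could be assembled from the question type, subject words and keywords roughly as follows (the node labels, relationship type and property names are hypothetical, not taken from the patent):

```python
def build_cypher(question_type: str, subjects: list[str], keywords: list[str]) -> str:
    """Assemble a Cypher query for a 'query for disease symptoms' style question.
    Labels/properties (Disease, HAS_SYMPTOM, name) are illustrative only."""
    names = ", ".join(f"'{s}'" for s in subjects)
    return (
        "MATCH (d:Disease)-[:HAS_SYMPTOM]->(s:Symptom) "
        f"WHERE d.name IN [{names}] "
        "RETURN d.name, collect(s.name)"
    )

query = build_cypher("query for disease symptoms", ["cold"], ["symptom"])
print(query)
# The statement would then be executed with the official neo4j Python driver,
# e.g.:  with driver.session() as sess: sess.run(query)
```

A production system would parameterize the query rather than interpolate strings, to avoid Cypher-injection issues.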
In step S203, the natural language question is encoded by an encoder in the automatic question-answering model, and a question representation is obtained.
In the embodiment of the invention, the natural language question is first vectorized, the vectorized question is input into the encoder of the automatic question-answering model, and it is encoded to obtain the question representation of the natural language question. The vectorized natural language question is a word-vector sequence formed by the word vectors of the words in the question, and the encoder is a trained neural network model.
In one possible implementation, the encoder includes a preset number of network layers, and encodes the vectorized natural language question sequentially through each network layer in the encoder to obtain a question representation of the natural language question, where each network layer includes a multi-head attention layer and a feedforward neural network layer, and when an output of a previous network layer is taken as an input of a current network layer, the input is processed sequentially through the multi-head attention layer and the feedforward neural network layer in the current network layer, so as to improve an encoding effect of the natural language question.
In a possible implementation, in the process of processing the vectorized natural language question sequentially through each network layer, the output of the N-th network layer in the encoder is input into the multi-head attention layer of the (N+1)-th network layer to obtain the output of that multi-head attention layer; residual learning and batch normalization are performed on this output, and the result is input into the feed-forward neural network of the (N+1)-th network layer to obtain the output of the (N+1)-th network layer. The output of the (N+1)-th network layer then serves as the input of the (N+2)-th network layer, realizing the encoding process of each network layer, where N is greater than or equal to 1 and the input of the first network layer is the vectorized natural language question. The encoding effect on the natural language question is thus improved by combining multi-head attention, residual learning, batch normalization and a feed-forward neural network.
In one possible embodiment, taking a first network layer as an example, a process of encoding a vectorized natural language question through the first network layer includes the following steps:
(1) Process the vectorized natural language question through the preset weight parameters in the multi-head attention layer of the first network layer. The processing formulas may be:
Q_i = F W_i^Q, K_i = F W_i^K, V_i = F W_i^V
where W_i^Q, W_i^K and W_i^V are the weight parameters of the multi-head attention layer, 1 ≤ i ≤ h, h is a preset parameter, F is the vectorized natural language question, and W_i^Q, W_i^K and W_i^V can be obtained by training during the training of the automatic question-answering model.
(2) In the multi-head attention layer of the first network layer, the preset self-attention formula is applied to the h triples (Q_i, K_i, V_i) to obtain h weighted feature matrices H_i, and the final feature matrix H is obtained from the h weighted feature matrices H_i.
The calculation formula of the feature matrix H_i may be:
H_i = Attention(Q_i, K_i, V_i) = softmax(Q_i K_i^T / √d_k) V_i
where d_k is a preset constant parameter.
The calculation formula of the feature matrix H may be:
H = Concat(H_1, H_2, …, H_h) W_0
where Concat() is a function that merges multiple vectors or arrays and is used here to merge the h feature matrices H_i, and W_0 is a preset weight parameter that can be obtained by training during the training of the automatic question-answering model.
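As an illustrative sketch (not the patent's trained model: the dimensions, head count and random weights are assumptions), the multi-head attention computation of steps (1) and (2) can be written in NumPy as:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, h = 4, 8, 2
d_k = d_model // h                       # preset constant d_k per head
F = rng.normal(size=(seq_len, d_model))  # vectorized question (word-vector sequence)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

heads = []
for i in range(h):
    # W_i^Q, W_i^K, W_i^V would be learned during model training; random here
    Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
    Q, K, V = F @ Wq, F @ Wk, F @ Wv
    H_i = softmax(Q @ K.T / np.sqrt(d_k)) @ V  # Attention(Q_i, K_i, V_i)
    heads.append(H_i)

W0 = rng.normal(size=(h * d_k, d_model))
H = np.concatenate(heads, axis=-1) @ W0        # H = Concat(H_1..H_h) W_0
print(H.shape)  # (4, 8)
```

Each head attends over the whole sequence with its own projections; concatenating the heads and projecting through W_0 restores the model dimension.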
(3) Perform residual learning and batch normalization on the feature matrix H.
The formula for residual learning of the feature matrix H may be:
H' = F + H
where H' is the feature matrix after residual learning.
The formula for batch normalization of the residual-learned feature matrix H' may be:
L = γ (H' − μ) / √(σ² + ε) + β, with μ = (1/n) Σ_{i=1}^{n} H'_i and σ² = (1/n) Σ_{i=1}^{n} (H'_i − μ)²
where γ and β are preset weight parameters that can be obtained by training during the training of the automatic question-answering model, n is the number of nodes of the multi-head attention layer, ε is a preset constant parameter, and L is the feature matrix after batch normalization.
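Step (3) combines a residual connection with normalization; a minimal NumPy sketch (γ and β fixed to 1 and 0 for illustration, normalizing over each row's features):

```python
import numpy as np

def residual_layernorm(F, H, gamma=1.0, beta=0.0, eps=1e-6):
    """H' = F + H, then L = gamma * (H' - mu) / sqrt(var + eps) + beta,
    with mu and var computed over each row's n features."""
    Hp = F + H                               # residual learning
    mu = Hp.mean(axis=-1, keepdims=True)
    var = Hp.var(axis=-1, keepdims=True)
    return gamma * (Hp - mu) / np.sqrt(var + eps) + beta

F = np.array([[1.0, 2.0, 3.0]])
H = np.array([[0.5, 0.5, 0.5]])
L = residual_layernorm(F, H)   # each row now has (approximately) zero mean
```

In the trained model γ and β are learned vectors rather than scalars; the normalization axis here is an assumption consistent with per-row (layer-style) normalization.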
(4) Process the batch-normalized feature matrix in the feed-forward network layer of the first network layer to obtain the output of the feed-forward network layer.
In the feed-forward network layer of the first network layer, the processing formula may be:
FFN(L) = max(0, L W_1 + b_1) W_2 + b_2
where W_1 and W_2 are preset weight parameters, b_1 and b_2 are preset bias parameters, all of which can be obtained by training during the training of the automatic question-answering model, and max(0, L W_1 + b_1) denotes taking the element-wise maximum of 0 and L W_1 + b_1.
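The feed-forward formula is a two-layer network with a ReLU in between; a minimal NumPy sketch, with random stand-ins for the trained parameters W_1, W_2, b_1, b_2:

```python
import numpy as np

def ffn(L, W1, b1, W2, b2):
    """FFN(L) = max(0, L@W1 + b1) @ W2 + b2; the max is an element-wise ReLU."""
    return np.maximum(0.0, L @ W1 + b1) @ W2 + b2

rng = np.random.default_rng(1)
L = rng.normal(size=(4, 8))
W1, b1 = rng.normal(size=(8, 32)), np.zeros(32)
W2, b2 = rng.normal(size=(32, 8)), np.zeros(8)
out = ffn(L, W1, b1, W2, b2)
print(out.shape)  # (4, 8)
```

The inner dimension (32 here) is an assumption; the feed-forward layer typically widens and then projects back to the model dimension.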
(5) Perform residual learning and batch normalization on the output of the feed-forward network layer in the first network layer to obtain the output of the first network layer. The residual learning and batch normalization are as described above and are not repeated.
Thus, steps (1) through (5) complete the processing of the vectorized natural language question by the first network layer; the processing of the second, third and subsequent network layers follows the description of the first network layer and is not repeated. The output of the first network layer is the input of the second network layer, and so on; the output of the last network layer is taken as the question representation of the natural language question.
In step S204, the knowledge information is encoded by an encoder to obtain a knowledge representation.
In the embodiment of the invention, the knowledge information of the natural language question is encoded by the encoder to obtain the knowledge representation of the knowledge information, and the process of encoding the natural language question by the encoder to obtain the question representation can be referred to, so that the description is omitted.
In step S205, a fusion parameter for fusing the question representation and the knowledge representation is determined, and the two representations are fused according to the fusion parameter to obtain the question representation fused with the knowledge representation.
In the embodiment of the invention, after the question representation of the natural language question and the knowledge representation of its knowledge information are obtained, the fusion parameter used for fusing the two can be determined, and the question representation and the knowledge representation are fused according to the fusion parameter to obtain the question representation fused with the knowledge representation, so that the representation used for subsequent decoding contains sufficient knowledge information, improving the accuracy of the answer.
In one possible implementation, in the process of determining the fusion parameter for fusing the question representation and the knowledge representation, the matching degree between the two representations can first be determined, and the fusion parameter is then determined according to the matching degree, which improves the accuracy of the fusion parameter.
In one possible embodiment, the matching degree f(r_q, r_k) between the question representation r_q and the knowledge representation r_k is calculated using a preset weight parameter and a preset bias parameter, both of which can be obtained by training during the training of the automatic question-answering model.
Further, the calculation formula for determining the fusion parameter according to the matching degree may be:
α = relu(W f(r_q, r_k) + b)
where α is the fusion parameter, relu() is the activation function, W is a preset weight parameter and b is a preset bias parameter, both of which can be obtained by training during the training of the automatic question-answering model. α ∈ R^n, where n is the number of knowledge information items and also the number of knowledge representations.
In a possible implementation, in the process of fusing the question representation and the knowledge representation according to the fusion parameter, the fusion parameter plays the role of a soft switch: the knowledge representation is extracted with the fusion parameter as the extraction probability, and the question representation is extracted with 1 minus the fusion parameter as the extraction probability, which improves the flexibility of the fusion and reduces its complexity. The calculation formulas of the fusion process may be:
r'_k = α r_k, r'_q = (1 − α) r_q, R_q = r'_k + r'_q
where R_q is the question representation fused with knowledge information, r'_k is the information extracted from the knowledge representation with extraction probability α, and r'_q is the information extracted from the question representation with extraction probability 1 − α.
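The soft-switch fusion above is a convex combination of the two representations; a minimal NumPy sketch:

```python
import numpy as np

def fuse(r_q, r_k, alpha):
    """R_q = alpha * r_k + (1 - alpha) * r_q; alpha acts as a soft switch."""
    return alpha * r_k + (1.0 - alpha) * r_q

r_q = np.array([1.0, 0.0])   # toy question representation
r_k = np.array([0.0, 1.0])   # toy knowledge representation
assert np.allclose(fuse(r_q, r_k, 0.0), r_q)  # no knowledge found: alpha = 0
assert np.allclose(fuse(r_q, r_k, 1.0), r_k)  # no valid info in question: alpha = 1
```

The two boundary cases correspond to the settings described in the following paragraph: α = 0 keeps only the question representation, α = 1 keeps only the knowledge representation.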
Further, when no knowledge information of the natural language question is found in the knowledge-graph-based question-answer library, the fusion parameter is set to 0, so the extraction probability of the knowledge representation is 0 and that of the question representation is 1. Before fusing the two representations, it can also be detected whether the natural language question contains valid information; if it does not (for example, "why am I uncomfortable today" contains no valid information), the fusion parameter is set to 1, so the extraction probability of the knowledge representation is 1 and that of the question representation is 0. This effectively improves the flexibility of fusing the knowledge representation and the question representation.
In step S206, the question representation fused with the knowledge representation is decoded by a decoder in the automatic question-answering model to obtain an answer.
In the embodiment of the invention, after the question representation fused with the knowledge representation is obtained, it can be decoded by a decoder in the automatic question-answering model to obtain the answer to the natural language question, thereby realizing automatic question answering for the natural language question.
In the embodiment of the invention, the decoder comprises a plurality of network layers, each of which in turn comprises a multi-head attention layer and a feedforward neural network layer. The output of each multi-head attention layer undergoes residual learning and batch normalization, as does the output of each feedforward neural network layer; the multi-head attention layer, the feedforward neural network layer, and the residual learning and batch normalization are described in detail for the encoding layers and are not repeated here. The question representation fused with the knowledge representation is processed by the network layers of the decoder to obtain the predicted words output by the last network layer. Each predicted word is input into a linear network layer and a classification layer connected to the last network layer of the decoder, which, together with a preset probability distribution over the vocabulary, yield the probability of each predicted word. Words are then selected among the predicted words according to these probabilities, and the selected words are combined to obtain the answer to the natural language question.
In one possible implementation, when selecting words among the predicted words according to their probabilities, the predicted word with the highest probability can be selected at each step until the selected word is the preset terminator. The selected words are then combined in selection order to obtain the answer to the natural language question, improving the accuracy of the automatic question answering.
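The greedy word selection described above can be sketched as follows; the terminator symbol, vocabulary, and per-step probabilities are hypothetical toy values:

```python
import numpy as np

END_TOKEN = "<eos>"  # hypothetical preset terminator

def greedy_decode(step_probs, vocab, max_len=50):
    """Select the highest-probability word at each step, stop at the
    terminator, and join the selections in order to form the answer."""
    answer = []
    for probs in step_probs:
        word = vocab[int(np.argmax(probs))]
        if word == END_TOKEN:
            break
        answer.append(word)
        if len(answer) >= max_len:
            break
    return " ".join(answer)

vocab = ["drink", "water", "rest", END_TOKEN]
steps = [np.array([0.7, 0.1, 0.1, 0.1]),
         np.array([0.1, 0.8, 0.05, 0.05]),
         np.array([0.1, 0.1, 0.1, 0.7])]
print(greedy_decode(steps, vocab))  # drink water
```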
In one possible implementation, the linear network layer and the classification layer can be expressed by the following formula:
P_kg = softmax(V * y + b), where P_kg is the probability corresponding to the predicted word y, V is a preset weight parameter, and b is a preset bias parameter; both V and b are obtained by training the automatic question-answering model.
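A minimal numpy sketch of the formula above; the dimensions and parameter values are toy assumptions rather than trained model weights:

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def word_probs(y: np.ndarray, V: np.ndarray, b: np.ndarray) -> np.ndarray:
    """P_kg = softmax(V @ y + b): project the decoder output onto the
    vocabulary and normalize into a probability distribution."""
    return softmax(V @ y + b)

rng = np.random.default_rng(0)
d_model, vocab_size = 8, 5
y = rng.normal(size=d_model)                # decoder output (toy)
V = rng.normal(size=(vocab_size, d_model))  # trained weight parameter (toy)
b = np.zeros(vocab_size)                    # trained bias parameter (toy)

p = word_probs(y, V, b)
assert np.isclose(p.sum(), 1.0) and (p >= 0).all()
```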
In one possible implementation, the encoder and decoder in the automatic question-answering model may be a Transformer encoder and a Transformer decoder, respectively, to enhance the encoding and decoding effects of the model.
In one possible implementation, a supervised training approach may be used to enhance the training effect of the automatic question-answering model when training the automatic question-answering model.
In the embodiment of the invention, the knowledge information of the natural language question is searched for in the knowledge-graph-based question-answer library, and the natural language question and the knowledge information are encoded, fused and decoded by the pre-trained automatic question-answering model to obtain the answer output by the model. Obtaining the knowledge information from a knowledge-graph-based question-answer library improves the efficiency of knowledge acquisition, and processing the question together with this knowledge in the automatic question-answering model alleviates the knowledge-shortage problem faced by automatic question answering, further improving its accuracy and efficiency.
Embodiment III:
Fig. 3 shows the structure of an automatic question answering device according to the third embodiment of the present invention. For convenience of explanation, only the parts related to the embodiment of the present invention are shown, including:
a question acquisition unit 31 for acquiring a natural language question;
a knowledge searching unit 32, configured to search knowledge information of a natural language question in a preset knowledge-graph-based question-answer library; and
the answer generation unit 33 is configured to input the natural language question and knowledge information into an automatic question-answer model trained in advance, and encode, fuse and decode the natural language question and knowledge information through the automatic question-answer model, so as to obtain an answer of the natural language question output by the automatic question-answer model.
In a possible embodiment, the knowledge finding unit 32 is specifically configured to:
determining the question category to which the natural language question belongs according to the category words preset in the knowledge-graph-based question-answer library; extracting keywords and subject words from the natural language question according to the keywords and subject words preset in the knowledge-graph-based question-answer library; and searching the knowledge-graph-based question-answer library for the knowledge information of the natural language question according to the question category to which the natural language question belongs, the keywords in the natural language question, and the subject words in the natural language question.
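A minimal sketch of this lookup; the library contents, category words, keywords and subject words below are illustrative assumptions, not the patent's data:

```python
# Hypothetical mini question-answer library keyed by
# (question category, keyword, subject word).
QA_LIBRARY = {
    ("symptom", "fever", "flu"): "Influenza commonly causes fever.",
}
CATEGORY_WORDS = {"symptom": {"fever", "cough"}}
KEYWORDS = {"fever", "cough"}
SUBJECT_WORDS = {"flu", "cold"}

def find_knowledge(question: str):
    """Determine the category, keyword and subject word of the question,
    then look the resulting triple up in the library."""
    tokens = set(question.lower().split())
    category = next((c for c, ws in CATEGORY_WORDS.items() if ws & tokens), None)
    keyword = next(iter(KEYWORDS & tokens), None)
    subject = next(iter(SUBJECT_WORDS & tokens), None)
    return QA_LIBRARY.get((category, keyword, subject))

print(find_knowledge("why does flu give me a fever"))
# Influenza commonly causes fever.
```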
In a possible embodiment, the answer generation unit 33 is specifically configured to:
encoding the natural language question by an encoder in the automatic question-answering model to obtain a question representation; determining fusion parameters for fusing the question representation and the knowledge representation; fusing the question representation and the knowledge representation according to the fusion parameters to obtain a question representation fused with the knowledge representation; and decoding the question representation fused with the knowledge representation through a decoder in the automatic question-answering model to obtain the answer.
In one possible embodiment, the encoder includes a preset number of network layers; the answer generation unit 33 is specifically configured to:
vectorizing natural language questions; inputting the vectorized natural language question into an encoder; and coding the vectorized natural language question sequentially through each network layer to obtain a question representation, wherein each network layer comprises a multi-head attention layer and a feedforward neural network layer.
In a possible embodiment, the answer generation unit 33 is specifically configured to:
inputting the output of the N-th network layer in the encoder into the multi-head attention layer of the (N+1)-th network layer in the encoder, where N ≥ 1; performing residual learning and batch normalization on the output of the multi-head attention layer in the (N+1)-th network layer to obtain the input of the feedforward neural network layer in the (N+1)-th network layer; and performing residual learning and batch normalization on the output of the feedforward neural network layer in the (N+1)-th network layer to obtain the output of the (N+1)-th network layer.
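The sublayer flow of one such network layer can be sketched as follows. The attention and feedforward sublayers are stand-in callables, and a simple per-feature normalization stands in for the normalization step:

```python
import numpy as np

def normalize(x: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Simple per-feature normalization (stand-in for the normalization step)."""
    return (x - x.mean(axis=-1, keepdims=True)) / (x.std(axis=-1, keepdims=True) + eps)

def network_layer(x, attention, feed_forward):
    """One layer: the attention output gets a residual connection plus
    normalization, then the feedforward output gets the same treatment."""
    a = attention(x)          # multi-head attention sublayer
    x = normalize(x + a)      # residual learning + normalization
    f = feed_forward(x)       # feedforward neural network sublayer
    return normalize(x + f)   # residual learning + normalization

x = np.arange(8.0).reshape(2, 4)  # toy layer input
out = network_layer(x, lambda t: 0.5 * t, lambda t: t + 1.0)
assert out.shape == x.shape
```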
In a possible embodiment, the answer generation unit 33 is specifically configured to:
determining a degree of matching between the question representation and the knowledge representation; and determining fusion parameters according to the matching degree.
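The patent leaves the mapping from matching degree to fusion parameter unspecified; one plausible sketch uses cosine similarity squashed through a sigmoid. This is an illustrative assumption, not the patented formula:

```python
import numpy as np

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + np.exp(-z))

def fusion_parameter(r_q: np.ndarray, r_k: np.ndarray) -> float:
    """Matching degree = cosine similarity of the two representations;
    the fusion parameter is the sigmoid of that degree, so it lies in (0, 1)."""
    match = float(r_q @ r_k / (np.linalg.norm(r_q) * np.linalg.norm(r_k)))
    return sigmoid(match)

alpha = fusion_parameter(np.array([1.0, 0.0]), np.array([1.0, 1.0]))
assert 0.0 < alpha < 1.0
```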
In a possible embodiment, the answer generation unit 33 is specifically configured to:
inputting the question representation fused with the knowledge representation into the decoder and obtaining, from the decoder and a preset vocabulary, a probability distribution over the vocabulary; and sequentially selecting predicted words from the vocabulary according to this probability distribution and obtaining the answer from the selected predicted words.
In the embodiment of the present invention, specific implementation and achieved technical effects of the automatic question answering device may refer to specific descriptions of corresponding method embodiments, and will not be repeated.
In the embodiment of the invention, each unit of the automatic question answering device can be realized by a corresponding hardware or software unit; the units can be independent software or hardware units, or be integrated into a single software or hardware unit, and the invention is not limited in this respect.
Embodiment four:
Fig. 4 shows the structure of an electronic device according to the fourth embodiment of the present invention. For convenience of explanation, only the portions related to the embodiments of the present invention are shown.
The electronic device 4 of the embodiment of the invention comprises a processor 40, a memory 41 and a computer program 42 stored in the memory 41 and executable on the processor 40. The processor 40, when executing the computer program 42, implements the steps of the various method embodiments described above, such as steps S101 to S103 shown in fig. 1 or steps S201 to S206 shown in fig. 2. Alternatively, the processor 40, when executing the computer program 42, performs the functions of the units in the above-described device embodiments, for example the functions of the units 31 to 33 shown in fig. 3.
In the embodiment of the invention, a natural language question is acquired, knowledge information of the question is searched for in the preset knowledge-graph-based question-answer library, and the question and the knowledge information are input into the pre-trained automatic question-answering model, which encodes, fuses and decodes them to obtain the answer to the natural language question. This effectively alleviates the knowledge-shortage problem of automatic question answering and improves its accuracy.
The electronic device of the embodiment of the invention can be a computer, a server, a tablet computer and the like. For the steps performed when the processor 40 executes the computer program 42 to implement the automatic question answering method on the electronic device 4, reference can be made to the description of the foregoing method embodiments, which will not be repeated here.
Embodiment five:
in an embodiment of the present invention, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps in the above-described method embodiments, for example, steps S101 to S103 shown in fig. 1 or steps S201 to S206 shown in fig. 2. Alternatively, the computer program, when executed by a processor, implements the functions of the units in the above-described embodiments of the apparatus, such as the functions of the units 31 to 33 shown in fig. 3.
In the embodiment of the invention, a natural language question is acquired, knowledge information of the question is searched for in the preset knowledge-graph-based question-answer library, and the question and the knowledge information are input into the pre-trained automatic question-answering model, which encodes, fuses and decodes them to obtain the answer to the natural language question. This effectively alleviates the knowledge-shortage problem of automatic question answering and improves its accuracy.
The computer readable storage medium of embodiments of the present invention may include any entity or device capable of carrying computer program code, or a recording medium, such as ROM/RAM, a magnetic disk, an optical disk, flash memory, and so on.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (9)

1. An automatic question-answering method, characterized in that it comprises the steps of:
acquiring a natural language question;
searching knowledge information of the natural language question in a preset question-answering library based on the knowledge graph;
inputting the natural language question and the knowledge information into a pre-trained automatic question-answering model, and encoding the natural language question through an encoder in the automatic question-answering model to obtain a question representation;
encoding the knowledge information through the encoder to obtain knowledge representation;
determining fusion parameters for fusing the question representation with the knowledge representation;
fusing the question representation and the knowledge representation according to the fusion parameters to obtain a question representation fused with the knowledge representation;
decoding the question representation fused with the knowledge representation through a decoder in the automatic question-answering model to obtain an answer to the natural language question output by the automatic question-answering model;
wherein the step of fusing the question representation and the knowledge representation according to the fusion parameters comprises the following steps:
extracting information from the knowledge representation with the fusion parameter as the extraction probability, to obtain the information extracted from the knowledge representation with the fusion parameter as the extraction probability;
extracting information from the question representation with 1 minus the fusion parameter as the extraction probability, to obtain the information extracted from the question representation with 1 minus the fusion parameter as the extraction probability;
and adding the information extracted from the knowledge representation with the fusion parameter as the extraction probability to the information extracted from the question representation with 1 minus the fusion parameter as the extraction probability, to obtain the question representation fused with the knowledge representation.
2. The method according to claim 1, wherein searching knowledge information of the natural language question in a preset knowledge-graph-based question-answer library comprises:
determining the question category to which the natural language question belongs according to the category words preset in the question-answering library based on the knowledge graph;
extracting keywords and subject words from the natural language question according to the keywords and subject words preset in the knowledge-graph-based question-answer library;
searching knowledge information of the natural language question in the knowledge-graph-based question-answer library according to the question category to which the natural language question belongs, the keywords in the natural language question, and the subject words in the natural language question.
3. The method of claim 1, wherein the encoder includes a predetermined number of network layers, and wherein the step of encoding the natural language question by the encoder in the automatic question-answering model to obtain a question representation includes:
vectorizing the natural language question;
inputting the vectorized natural language question into the encoder;
and coding the vectorized natural language question sequentially through each network layer to obtain the question representation, wherein each network layer comprises a multi-head attention layer and a feedforward neural network layer.
4. The method according to claim 3, wherein the coding of the vectorized natural language question sequentially through each of the network layers comprises:
inputting the output of the N-th network layer in the encoder into a multi-head attention layer in the (n+1) -th network layer in the encoder, wherein N is more than or equal to 1;
residual error learning and batch standardization processing are carried out on the output of the multi-head attention layer in the N+1th network layer, so that the input of the feedforward neural network layer in the N+1th network layer is obtained;
inputting the input of the feedforward neural network layer in the (N+1) th network layer into the feedforward neural network in the (N+1) th network layer;
and carrying out residual error learning and batch standardization processing on the output of the feedforward neural network layer in the (N+1) th network layer to obtain the output of the (N+1) th network layer.
5. The method of claim 1, wherein the determining fusion parameters for fusion of the question representation with the knowledge representation comprises:
determining a degree of matching between the question representation and the knowledge representation;
and determining the fusion parameters according to the matching degree.
6. The method according to claim 1, wherein decoding, by a decoder in the automatic question-answering model, the question representation fused with the knowledge representation to obtain an answer to the natural language question outputted by the automatic question-answering model, includes:
inputting the question representation fused with the knowledge representation into the decoder to obtain, according to a probability distribution over a preset vocabulary, the probabilities of a plurality of predicted words;
selecting vocabulary according to the probabilities of the plurality of predicted words;
and obtaining the answer according to the selected vocabulary.
7. An automatic question answering apparatus, the apparatus comprising:
the question acquisition unit is used for acquiring natural language questions;
the knowledge searching unit is used for searching knowledge information of the natural language question in a preset question-answer library based on the knowledge graph; and
the answer generation unit is configured to input the natural language question and the knowledge information into a pre-trained automatic question-answering model, and to encode, fuse and decode the natural language question and the knowledge information through the automatic question-answering model to obtain an answer to the natural language question output by the automatic question-answering model, wherein this comprises:
inputting the natural language question and the knowledge information into a pre-trained automatic question-answering model, and encoding the natural language question through an encoder in the automatic question-answering model to obtain a question representation; encoding the knowledge information through the encoder to obtain a knowledge representation; determining fusion parameters for fusing the question representation with the knowledge representation; fusing the question representation and the knowledge representation according to the fusion parameters to obtain a question representation fused with the knowledge representation; and decoding the question representation fused with the knowledge representation through a decoder in the automatic question-answering model to obtain the answer to the natural language question output by the automatic question-answering model;
wherein the step of fusing the question representation and the knowledge representation according to the fusion parameters to obtain the question representation fused with the knowledge representation comprises the following steps:
extracting information from the knowledge representation with the fusion parameter as the extraction probability, to obtain the information extracted from the knowledge representation with the fusion parameter as the extraction probability; extracting information from the question representation with 1 minus the fusion parameter as the extraction probability, to obtain the information extracted from the question representation with 1 minus the fusion parameter as the extraction probability; and adding the information extracted from the knowledge representation with the fusion parameter as the extraction probability to the information extracted from the question representation with 1 minus the fusion parameter as the extraction probability, to obtain the question representation fused with the knowledge representation.
8. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 6 when the computer program is executed.
9. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 6.
CN202010731550.6A 2020-07-27 2020-07-27 Automatic question and answer method, device, equipment and storage medium Active CN112035627B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010731550.6A CN112035627B (en) 2020-07-27 2020-07-27 Automatic question and answer method, device, equipment and storage medium


Publications (2)

Publication Number Publication Date
CN112035627A CN112035627A (en) 2020-12-04
CN112035627B true CN112035627B (en) 2023-11-17

Family

ID=73583231

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010731550.6A Active CN112035627B (en) 2020-07-27 2020-07-27 Automatic question and answer method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112035627B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559707A (en) * 2020-12-16 2021-03-26 四川智仟科技有限公司 Knowledge-driven customer service question and answer method
CN112650768A (en) * 2020-12-22 2021-04-13 网易(杭州)网络有限公司 Dialog information generation method and device and electronic equipment

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663129A (en) * 2012-04-25 2012-09-12 中国科学院计算技术研究所 Medical field deep question and answer method and medical retrieval system
US9336259B1 (en) * 2013-08-08 2016-05-10 Ca, Inc. Method and apparatus for historical analysis analytics
CN106710596A (en) * 2016-12-15 2017-05-24 腾讯科技(上海)有限公司 Answer statement determination method and device
CN107729493A (en) * 2017-09-29 2018-02-23 北京创鑫旅程网络技术有限公司 Travel the construction method of knowledge mapping, device and travelling answering method, device
CN107748757A (en) * 2017-09-21 2018-03-02 北京航空航天大学 A kind of answering method of knowledge based collection of illustrative plates
CN108073600A (en) * 2016-11-11 2018-05-25 阿里巴巴集团控股有限公司 A kind of intelligent answer exchange method, device and electronic equipment
CN108647233A (en) * 2018-04-02 2018-10-12 北京大学深圳研究生院 A kind of answer sort method for question answering system
CN108984778A (en) * 2018-07-25 2018-12-11 南京瓦尔基里网络科技有限公司 A kind of intelligent interaction automatically request-answering system and self-teaching method
CN109271524A (en) * 2018-08-02 2019-01-25 中国科学院计算技术研究所 Entity link method in knowledge base question answering system
CN109918489A (en) * 2019-02-28 2019-06-21 上海乐言信息科技有限公司 A kind of knowledge question answering method and system of more strategy fusions
CN110188182A (en) * 2019-05-31 2019-08-30 中国科学院深圳先进技术研究院 Model training method, dialogue generation method, device, equipment and medium
CN111125333A (en) * 2019-06-06 2020-05-08 北京理工大学 Generation type knowledge question-answering method based on expression learning and multi-layer covering mechanism
CN111274373A (en) * 2020-01-16 2020-06-12 山东大学 Electronic medical record question-answering method and system based on knowledge graph

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8666928B2 (en) * 2005-08-01 2014-03-04 Evi Technologies Limited Knowledge repository
US20070276723A1 (en) * 2006-04-28 2007-11-29 Gideon Samid BiPSA: an inferential methodology and a computational tool
US9110882B2 (en) * 2010-05-14 2015-08-18 Amazon Technologies, Inc. Extracting structured knowledge from unstructured text

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Ashish Vaswani et al., "Attention is all you need", Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6000-6010. *
Shuman Liu et al., "Knowledge Diffusion for Neural Dialogue Generation", Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Vol. 1, pp. 1489-1498. *
Chen Meiqian, "Research on a multi-encoder neural network method for Chinese question generation fusing knowledge graphs", China Master's Theses Full-text Database, Information Science and Technology, 2020, I138-261. *

Also Published As

Publication number Publication date
CN112035627A (en) 2020-12-04

Similar Documents

Publication Publication Date Title
Liu et al. Probabilistic reasoning via deep learning: Neural association models
US9858263B2 (en) Semantic parsing using deep neural networks for predicting canonical forms
CN115048447B (en) Database natural language interface system based on intelligent semantic completion
WO2022227203A1 (en) Triage method, apparatus and device based on dialogue representation, and storage medium
CN113707307A (en) Disease analysis method and device, electronic equipment and storage medium
CN112035627B (en) Automatic question and answer method, device, equipment and storage medium
CN111858940A (en) Multi-head attention-based legal case similarity calculation method and system
An et al. Extracting causal relations from the literature with word vector mapping
CN112784532A (en) Multi-head attention memory network for short text sentiment classification
CN112380867A (en) Text processing method, text processing device, knowledge base construction method, knowledge base construction device and storage medium
CN112632250A (en) Question and answer method and system under multi-document scene
CN112925918A (en) Question-answer matching system based on disease field knowledge graph
Kim et al. A convolutional neural network in legal question answering
Li et al. Approach of intelligence question-answering system based on physical fitness knowledge graph
CN117056475A (en) Knowledge graph-based intelligent manufacturing question-answering method, device and storage medium
CN116992002A (en) Intelligent care scheme response method and system
Yang et al. Cmu livemedqa at trec 2017 liveqa: A consumer health question answering system
CN115964475A (en) Dialogue abstract generation method for medical inquiry
CN113468311B (en) Knowledge graph-based complex question and answer method, device and storage medium
CN112131363B (en) Automatic question and answer method, device, equipment and storage medium
CN111767388B (en) Candidate pool generation method
CN114003684A (en) Medical information relation prediction method and system based on open world assumption
Zhou et al. A depth evidence score fusion algorithm for chinese medical intelligence question answering system
Zhao et al. A Dynamic Optimization-Based Ensemble Learning Method for Traditional Chinese Medicine Named Entity Recognition
Liu et al. CPK-Adapter: Infusing Medical Knowledge into K-Adapter with Continuous Prompt

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant