WO2022088671A1 - Automatic question answering method and apparatus, device and storage medium - Google Patents

Automatic question answering method and apparatus, device and storage medium

Info

Publication number
WO2022088671A1
Authority
WO
WIPO (PCT)
Prior art keywords
entity
attribute
name
predicted
question
Prior art date
Application number
PCT/CN2021/097419
Other languages
English (en)
Chinese (zh)
Inventor
侯丽
刘翔
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2022088671A1

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular, to an automatic question answering method, apparatus, computer equipment, and computer-readable storage medium.
  • Knowledge graph technology is an important part of artificial intelligence technology, which describes concepts, entities, and the relationships between them in the objective world in a structured way.
  • Knowledge graph technology provides a better ability to organize, manage and understand the massive information on the Internet, expressing the information on the Internet into a form that is closer to human cognition of the world. Therefore, establishing a knowledge base with semantic processing capability and open interconnection capability can generate application value in intelligent information services such as intelligent search, intelligent question answering, and personalized recommendation.
  • the methods used in the current mainstream knowledge base-based automatic question answering can be divided into two categories: semantic parsing-based (SP-based) methods and information retrieval-based (IR-based) methods.
  • the inventors realized that the semantic parsing-based method first converts a natural language question into some form of logical expression.
  • Traditional semantic parsing requires logical forms annotated with part-of-speech information as supervision and is limited to logical predicates from a few narrow domains.
  • the information retrieval-based method first obtains a series of candidate answers from the knowledge base by a relatively coarse method, then extracts features from the questions and candidate answers, uses them to rank the candidate answers, and selects the one with the highest score as the final answer.
  • however, information retrieval methods lack deep semantic understanding, resulting in low accuracy of automatic question answering.
  • the main purpose of the present application is to provide an automatic question answering method, apparatus, computer device, and computer-readable storage medium, aiming to solve the technical problem that traditional semantic parsing requires annotated logical forms as supervision data and relies on a small number of logical predicates, while information retrieval lacks deep semantic understanding, both of which lead to low accuracy of automatic question answering.
  • the present application provides an automatic question answering method, which includes the following steps:
  • obtaining, according to a preset alias dictionary, the entity alias of each word in the question to be predicted, and using the entity aliases as candidate entities, wherein there are multiple entity aliases and candidate entities; determining, based on a preset entity recognition model, the entity name corresponding to the question to be predicted according to the question to be predicted and the multiple candidate entities; determining, according to the entity name and a preset graph database, the triples corresponding to the entity name in the preset graph database, wherein each triple includes the entity name, an attribute name, and an attribute value, and there are multiple triples; and determining, based on a preset attribute mapping model, the target attribute name corresponding to the question to be predicted according to each attribute name and the question to be predicted, and using the attribute value corresponding to the target attribute name as the answer to the question to be predicted.
  • the present application also provides an automatic question answering device, the automatic question answering device comprising:
  • the obtaining module is used to obtain the entity alias of each word in the question to be predicted according to the preset alias dictionary, and use the entity alias as a candidate entity, wherein the entity alias and the candidate entity are multiple;
  • the first determination module is used to determine, based on the preset entity recognition model, the entity name corresponding to the question to be predicted according to the question to be predicted and the multiple candidate entities;
  • the second determination module is used to determine, according to the entity name and a preset graph database, the triples corresponding to the entity name in the preset graph database, wherein each triple includes the entity name, an attribute name, and an attribute value, and there are multiple triples;
  • the third determination module is used to determine, based on the preset attribute mapping model, the target attribute name corresponding to the question to be predicted according to each attribute name and the question to be predicted, and to use the attribute value corresponding to the target attribute name as the answer to the question to be predicted.
  • the present application also provides a computer device, the computer device comprising a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein, when the computer program is executed by the processor, the following steps are implemented:
  • obtaining, according to a preset alias dictionary, the entity alias of each word in the question to be predicted, and using the entity aliases as candidate entities, wherein there are multiple entity aliases and candidate entities; determining, based on a preset entity recognition model, the entity name corresponding to the question to be predicted according to the question to be predicted and the multiple candidate entities; determining, according to the entity name and a preset graph database, the triples corresponding to the entity name in the preset graph database, wherein each triple includes the entity name, an attribute name, and an attribute value, and there are multiple triples; and determining, based on a preset attribute mapping model, the target attribute name corresponding to the question to be predicted according to each attribute name and the question to be predicted, and using the attribute value corresponding to the target attribute name as the answer to the question to be predicted.
  • the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, wherein, when the computer program is executed by a processor, the following steps are implemented:
  • obtaining, according to a preset alias dictionary, the entity alias of each word in the question to be predicted, and using the entity aliases as candidate entities, wherein there are multiple entity aliases and candidate entities; determining, based on a preset entity recognition model, the entity name corresponding to the question to be predicted according to the question to be predicted and the multiple candidate entities; determining, according to the entity name and a preset graph database, the triples corresponding to the entity name in the preset graph database, wherein each triple includes the entity name, an attribute name, and an attribute value, and there are multiple triples; and determining, based on a preset attribute mapping model, the target attribute name corresponding to the question to be predicted according to each attribute name and the question to be predicted, and using the attribute value corresponding to the target attribute name as the answer to the question to be predicted.
  • the present application provides an automatic question answering method, apparatus, computer device, and computer-readable storage medium. Entity aliases of each word in the question to be predicted are obtained according to a preset alias dictionary and used as candidate entities; the entity name corresponding to the question to be predicted is determined according to the question and the multiple candidate entities; the triples corresponding to the entity name are determined in a preset graph database according to the entity name; and the target attribute name corresponding to the question to be predicted is determined according to each attribute name and the question, with the attribute value corresponding to the target attribute name used as the answer to the question. This realizes semantic encoding in both the entity recognition of the question and the attribute mapping of the question, improving the representation ability and generalization ability over the read text and thereby improving the accuracy of automatic question answering.
  • FIG. 1 is a schematic flowchart of an automatic question answering method provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of sub-steps of the automatic question answering method in FIG. 1;
  • FIG. 3 is a schematic flowchart of sub-steps of the automatic question answering method in FIG. 1;
  • FIG. 4 is a schematic flowchart of steps of training a preset entity recognition model;
  • FIG. 5 is a schematic flowchart of steps of training a preset attribute mapping model;
  • FIG. 6 is a schematic block diagram of an automatic question answering device provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural block diagram of a computer device according to an embodiment of the present application.
  • Embodiments of the present application provide an automatic question answering method, apparatus, computer device, and computer-readable storage medium.
  • the automatic question answering method can be applied to a computer device, and the computer device can be an electronic device such as a notebook computer and a desktop computer.
  • FIG. 1 is a schematic flowchart of an automatic question answering method provided by an embodiment of the present application.
  • the automatic question answering method includes steps S101 to S104.
  • Step S101 obtaining entity aliases of each word in the question to be predicted according to a preset alias dictionary, and using the entity aliases as candidate entities, wherein there are multiple entity aliases and candidate entities.
  • the question to be predicted is acquired, and the entity alias of each word in the question to be predicted is acquired according to the alias list of the preset alias dictionary, wherein the alias list includes multiple entity aliases.
  • the entity aliases in the alias list are compared with each word in the question to be predicted; if a word in the question to be predicted is the same as any entity alias in the alias list, all entity aliases in the alias list corresponding to that entity name are determined, and all of those entity aliases are taken as the entity aliases of the word in the question to be predicted.
  • the obtained entity alias is used as a candidate entity, wherein the number of entity aliases and candidate entities is multiple.
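  • For illustration, a minimal Python sketch of this lookup follows; the alias dictionary contents and the word segmentation are illustrative assumptions, not prescribed by the present application:

```python
# Hypothetical alias dictionary: entity name -> alias list (the entity name
# itself also appears in its own alias list).
alias_dict = {
    "红楼梦": ["红楼梦", "石头记"],   # A Dream of Red Mansions / The Story of the Stone
    "笑傲江湖": ["笑傲江湖"],
}

def candidate_entities(words):
    """For each word that equals any alias, every alias in the matching
    alias list becomes a candidate entity of that word."""
    candidates = {}
    for word in words:
        for aliases in alias_dict.values():
            if word in aliases:
                candidates.setdefault(word, []).extend(aliases)
    return candidates

print(candidate_entities(["石头记", "的", "作者", "是", "谁"]))
# {'石头记': ['红楼梦', '石头记']}
```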
  • Step S102 Determine, based on a preset entity recognition model, the entity name corresponding to the question to be predicted according to the question to be predicted and the multiple candidate entities.
  • specifically, the preset entity recognition model is obtained by training a first preset pre-trained language model on training data in advance. Based on the preset entity recognition model, the entity name corresponding to the question to be predicted is determined according to the question to be predicted and the multiple candidate entities. Here, the entity name is the common name of an entity mentioned in the question, a candidate entity is an entity alias, and an entity alias is, for example, a special name or a former name.
  • for example, the question to be predicted is "Who is the author of A Dream of Red Mansions?", where "A Dream of Red Mansions" is the entity name; or the question to be predicted is "Who is the author of The Story of the Stone?", where "The Story of the Stone" is an entity alias, and "A Dream of Red Mansions", which corresponds to "The Story of the Stone", is the entity name.
  • the question to be predicted is input into the preset entity recognition model, the name corresponding to the question to be predicted is identified by the model, and the entity name corresponding to the question to be predicted is determined based on that name and the multiple candidate entities. For example, the name is compared with each candidate entity, and if the name is the same as any one of the multiple candidate entities, the name is used as the entity name of the question to be predicted.
  • step S102 includes: sub-step S1021 to sub-step S1023 .
  • Sub-step S1021 respectively replace the corresponding words in the question to be predicted according to the plurality of candidate entities to generate a plurality of text records.
  • multiple candidate entities corresponding to each word in the question to be predicted are obtained, and based on the candidate entities, the words corresponding to the question to be predicted are respectively replaced to generate multiple text records.
  • specifically, determine whether any word in the question to be predicted corresponds to multiple candidate entities; if multiple candidate entities are candidates for the same word, determine the position of that word in the question to be predicted, replace the word with each candidate entity in turn at that position, and generate a corresponding plurality of text records.
  • for example, the question to be predicted is "Who is the author of The Story of the Stone?", and "A Dream of Red Mansions" is a candidate entity for the word "The Story of the Stone" in that question; the word "The Story of the Stone" is replaced with the candidate entity "A Dream of Red Mansions" at its position, yielding the text record "Who is the author of A Dream of Red Mansions?".
  • if the multiple candidate entities are not candidates for the same word, determine the position in the question to be predicted of the word to which each candidate entity corresponds, and replace each such word with its candidate entity at that position to generate the corresponding text records; that is, the number of text records is the same as the number of candidate entities.
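  • A minimal Python sketch of this replacement step, under the same illustrative assumptions:

```python
def make_text_records(words, candidates):
    """One text record per candidate entity, produced by replacing the
    matched word with that candidate at the word's position."""
    records = []
    for i, word in enumerate(words):
        for candidate in candidates.get(word, []):
            text = "".join(words[:i] + [candidate] + words[i + 1:])
            records.append((text, candidate))
    return records

words = ["石头记", "的", "作者", "是", "谁"]
print(make_text_records(words, {"石头记": ["红楼梦", "石头记"]}))
# [('红楼梦的作者是谁', '红楼梦'), ('石头记的作者是谁', '石头记')]
```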
  • Sub-step S1022 Input the plurality of text records into a preset entity recognition model respectively, and predict the predicted value of the candidate entity in each of the text records.
  • each text record is input into a preset entity recognition model, and the predicted value of the candidate entity in each text record is predicted by the preset entity recognition model.
  • specifically, each text record is input into the preset entity recognition model. The preset entity recognition model includes a dictionary file, and each text record is split by the dictionary file to obtain the text sequences corresponding to the multiple text records.
  • vectorized representation is performed on the text sequence to obtain corresponding text vector information.
  • the preset entity recognition model includes a multi-head attention mechanism model; the text vector information is input into the multi-head attention mechanism model, which obtains, for each word in the text vector information, a vector representation fused with context information.
  • the preset entity recognition model also includes a linear conversion layer; linear conversion is performed on the text semantic vector information corresponding to each text record through the linear conversion layer to obtain the predicted value of the candidate entity in each text record.
  • Sub-step S1023 Determine, according to the predicted value of the candidate entity in each of the text records, the candidate entity in the target text record as the entity name, and use it as the entity name corresponding to the question to be predicted.
  • specifically, the predicted values of the candidate entities in the text records are obtained and compared, and the text record with the highest predicted value is determined.
  • that is, the target text record with the highest predicted value output by the preset entity recognition model is obtained, and the candidate entity in that target text record is used as the entity name of the question to be predicted.
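  • A sketch of sub-steps S1022 and S1023, assuming a hypothetical `score` function that wraps the trained recognition model:

```python
def pick_entity_name(records, score):
    """Score every (text record, candidate entity) pair and return the
    candidate from the highest-scoring record."""
    _text, candidate = max(records, key=lambda rc: score(*rc))
    return candidate

# Toy usage; a real system would obtain these values from the trained model.
toy = {("红楼梦的作者是谁", "红楼梦"): 0.93, ("石头记的作者是谁", "石头记"): 0.41}
print(pick_entity_name(list(toy), lambda t, c: toy[(t, c)]))  # 红楼梦
```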
  • Step S103 Determine, according to the entity name and a preset graph database, the triples corresponding to the entity name in the preset graph database, wherein each triple includes the entity name, an attribute name, and an attribute value, and there are multiple groups of triples.
  • specifically, the preset graph database is queried based on the entity name; the preset graph database includes multiple groups of triples, each of which is stored in the graph database in a structured form.
  • the preset graph database is queried by the entity name, and multiple sets of triples corresponding to the entity name in the preset graph database are obtained, and the triples include the entity name, the attribute name and the attribute value.
  • for example, if the entity name is "Xiaoao Jianghu", the preset graph database is searched for "Xiaoao Jianghu", and the triples whose entity name is "Xiaoao Jianghu", each containing an attribute name and its attribute value, are obtained.
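  • A minimal sketch of the lookup, with the preset graph database stood in for by an in-memory list of triples (the stored facts are illustrative):

```python
# (entity name, attribute name, attribute value) triples.
graph_db = [
    ("笑傲江湖", "语言", "普通话"),
    ("笑傲江湖", "主演", "吕颂贤"),
    ("红楼梦", "作者", "曹雪芹"),
]

def triples_for(entity_name):
    """All triples whose entity name matches the recognized entity."""
    return [t for t in graph_db if t[0] == entity_name]

print(triples_for("笑傲江湖"))
```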
  • Step S104 Determine, based on a preset attribute mapping model, the target attribute name corresponding to the question to be predicted according to each attribute name and the question to be predicted, and use the attribute value corresponding to the target attribute name as the answer to the question to be predicted.
  • specifically, attribute text pairs are obtained by combining the question to be predicted with the attribute name in each group of triples. Each attribute text pair is input into the preset attribute mapping model, which predicts a score for each attribute text pair; the attribute text pair with the highest predicted score is determined as the target attribute text pair, and the target attribute text pair output by the preset attribute mapping model is obtained.
  • the attribute name in the target attribute text pair is determined as the target attribute name, the triple corresponding to the target attribute name is determined in the graph database based on the target attribute name, and the attribute value in that triple is used as the answer corresponding to the question to be predicted.
  • step S104 includes: sub-step S1041 to sub-step S1044.
  • Sub-step S1041 each of the attribute names is combined with the to-be-predicted question to generate a plurality of attribute text pairs.
  • multiple groups of triples corresponding to entity names are obtained, and the attribute names in each group of triples are obtained.
  • the acquired attribute names are respectively combined with the question to be predicted to obtain the attribute text pair corresponding to each combination of an attribute name and the question to be predicted, wherein the number of attribute text pairs is the same as the number of attribute names.
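  • A sketch of this combination step over the illustrative triples above:

```python
def attribute_text_pairs(question, triples):
    """One (question, attribute name) pair per triple of the matched entity;
    each triple is (entity name, attribute name, attribute value)."""
    return [(question, attr) for _entity, attr, _value in triples]

triples = [("笑傲江湖", "语言", "普通话"), ("笑傲江湖", "主演", "吕颂贤")]
print(attribute_text_pairs("笑傲江湖电视剧是什么语言？", triples))
```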
  • Sub-step S1042 Input each of the attribute-text pairs into a preset attribute mapping model to obtain the predicted score of the attribute name in each of the attribute-text pairs.
  • specifically, each obtained attribute text pair is input into the preset attribute mapping model. The preset attribute mapping model includes a dictionary file, and each attribute text pair is split by the dictionary file to obtain the word sequence of each attribute text pair.
  • the word sequence of each question and the word sequence of each attribute name are padded to obtain word sequences of a uniform fixed length.
  • the segmentation position between the question and each attribute name is marked with special symbols, yielding the marked attribute text sequence.
  • the attribute text sequence is vectorized to obtain the corresponding text vector information.
  • the preset attribute mapping model includes a multi-head attention network model; the text vector information is input into the multi-head attention network model, which obtains, for each word in the input text vector, a vector representation fused with context information, so as to obtain the text semantic vector information output by the multi-head attention network model. The segmentation position between the question and each attribute name in the text semantic vector information is located based on the special symbols, and the semantic vector corresponding to each attribute text pair in the text semantic vector information is obtained.
  • the preset attribute mapping model also includes a linear transformation layer; the semantic vector of each attribute text pair is linearly transformed through the linear transformation layer to obtain the predicted score of the attribute name in each attribute text pair.
  • Sub-step S1043 according to the predicted score of the attribute name in each of the attribute text pairs, obtain the target attribute text pair with the highest predicted score output by the preset attribute mapping model.
  • specifically, the predicted scores of the attribute names in the attribute text pairs are obtained and compared, the attribute text pair with the highest predicted score is determined, and that pair is taken as the target attribute text pair output by the preset attribute mapping model.
  • Sub-step S1044 Use the attribute name in the target attribute text pair as the target attribute name corresponding to the question to be predicted.
  • when the target attribute text pair with the highest predicted score output by the preset attribute mapping model is obtained, the target attribute text pair includes the question to be predicted and an attribute name, and the attribute name in the target attribute text pair is used as the target attribute name.
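  • A sketch of sub-steps S1042 to S1044 and the final answer selection, again with a hypothetical `score` wrapper around the trained attribute mapping model:

```python
def answer_question(question, triples, score):
    """Pick the attribute name whose (question, attribute name) pair scores
    highest, and return the attribute value of its triple as the answer."""
    best = max(triples, key=lambda t: score(question, t[1]))
    return best[1], best[2]   # (target attribute name, answer)

triples = [("笑傲江湖", "语言", "普通话"), ("笑傲江湖", "主演", "吕颂贤")]
toy = {"语言": 0.88, "主演": 0.07}
print(answer_question("笑傲江湖电视剧是什么语言？", triples,
                      lambda q, a: toy[a]))   # ('语言', '普通话')
```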
  • in an embodiment, before determining the triples corresponding to the entity name according to the entity name and the preset graph database, the method includes: acquiring any triple in a preset knowledge base, and obtaining the alias list of the entity name in the triple based on the preset alias dictionary; determining, according to the alias list, whether the triple exists in a preset graph knowledge base; if it is determined that the triple exists, using the preset graph knowledge base as the preset graph database; and if it is determined that the triple does not exist, creating a node in the preset graph knowledge base and importing the triple at the node to generate the preset graph database.
  • specifically, a preset knowledge base is obtained, where the preset knowledge base includes multiple groups of triples. Any triple in the preset knowledge base is obtained; the triple includes an entity name, an attribute name, and an attribute value.
  • a preset alias dictionary is queried based on the entity name, where the alias dictionary includes multiple entity aliases corresponding to the entity name, wherein the entity name is also an entity alias.
  • the preset graph knowledge base is searched to determine whether a node with the entity alias exists; if a node with the entity alias exists, it is determined whether that node has the attribute name node from the triple, and if the attribute name node exists, the preset graph knowledge base is used as the preset graph database. If no node with the entity alias exists, a node for the entity alias is created, and the triple corresponding to that node is imported at the node to generate the preset graph database.
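  • A minimal sketch of these existence checks and the import, with graph nodes stood in for by a dictionary keyed by entity alias:

```python
graph_kb = {}  # entity alias -> {attribute name: attribute value}

def import_triple(alias_list, attr_name, attr_value):
    """Create a node for any alias that lacks one, then import the triple
    under every alias of the entity if it is not already present."""
    for alias in alias_list:
        node = graph_kb.setdefault(alias, {})    # create the node if absent
        node.setdefault(attr_name, attr_value)   # import the triple if absent

import_triple(["红楼梦", "石头记"], "作者", "曹雪芹")
print(graph_kb["石头记"])  # {'作者': '曹雪芹'}
```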
  • in an embodiment, before acquiring the entity aliases in the question to be predicted according to the preset alias dictionary and using the entity aliases as candidate entities, the method further includes: acquiring each text in the preset knowledge base and identifying the entity name in each text; and extracting the entity aliases of each entity name based on preset attribute rules to generate the preset alias dictionary.
  • each text in the knowledge base is acquired, and the entity name in each text is identified.
  • the entity name is the common name of an entity, and an entity alias is, for example, a former name of the entity.
  • the identification method includes labeling: all names with the same semantics are obtained, the entity name and its entity aliases are extracted from these names according to the preset attribute rules, and an alias list is generated for the entity name; the alias lists of all entity names form the alias dictionary.
  • the preset attribute rules may operate in a probabilistic manner: for example, the probabilities of all names with the same semantics are obtained, the name with the highest probability is used as the entity name, and the other names are used as entity aliases.
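  • A sketch of this probabilistic rule, assuming observed mention frequencies serve as the probabilities (an assumption; the present application does not fix the estimate):

```python
from collections import Counter

def build_alias_entry(mentions):
    """The most frequent of all names with the same semantics becomes the
    entity name; every observed name goes into the alias list."""
    counts = Counter(mentions)
    entity_name, _ = counts.most_common(1)[0]
    return entity_name, list(counts)

print(build_alias_entry(["红楼梦", "红楼梦", "红楼梦", "石头记"]))
# ('红楼梦', ['红楼梦', '石头记'])
```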
  • in the above embodiment, the candidate entities for the words of the question to be predicted are obtained through the preset alias dictionary, and the candidate entities and the question to be predicted are input into the preset entity recognition model to obtain the entity name of the question to be predicted; each attribute name in the triples corresponding to the entity name is then obtained from the preset graph database, and each attribute name and the question to be predicted are input into the preset attribute mapping model to obtain the target attribute name corresponding to the question to be predicted, and thereby the answer corresponding to the question to be predicted, improving the accuracy of automatic question answering.
  • FIG. 4 is a schematic flowchart of training a preset entity recognition model.
  • specifically, training the preset entity recognition model includes steps S201 to S204.
  • Step S201 Acquire the data to be trained, and determine the target entity name and the candidate entity names of the question in the data to be trained, wherein the target entity name is different from the candidate entity names, and there are multiple candidate entity names.
  • data to be trained is obtained, and the data to be trained includes a question to be trained, a question and answer of the question to be trained, and a triplet corresponding to the question to be trained.
  • for example, the data to be trained is the question to be trained "Who is the author of Meihualuo?" together with its corresponding triple.
  • as for the candidate entity names in the question to be trained, a candidate entity name may be a candidate for the same word or a candidate for each word; for example, the question to be trained is divided into words, and the candidate entity names of each word are obtained through the alias dictionary.
  • Step S202 Obtain a first character for the target entity name, replace the target entity name in the question with the first character, and generate the positive example data of the data to be trained, wherein the label value of the positive example data is 1.
  • the first character of the target entity name is obtained, and the first character is a preset character.
  • for example, the first character is [MASK].
  • the question is "Who is the author of Meihualuo?”, where "Plumblossom" is the target entity name of the question, and the position of the "Meihualuo" in "Who is the author of Meihualuo?" is determined.
  • replace the "Plum Blossoms" with [MASK] to generate the corresponding positive data, where the question in the positive data is "Who is the author of [MASK]?" and label the positive data, the The label value of the positive data is 1.
  • Step S203 Obtain a second character for the candidate entity names, replace each candidate entity name in the question with the second character, and generate multiple negative example data of the data to be trained, wherein the label value of each negative example data is 0.
  • a second character of the candidate entity name is obtained, and the second character is a preset character.
  • the second character is [MASK].
  • the question is "Who is the author of Plum Blossom?", in which "Plum Blossom” and "Hua Luo" are candidate entity names for the question.
  • Step S204 according to the positive example data and the label value of the positive example data, as well as the multiple negative example data and the label value of each negative example data, train the first pre-training language model, and generate a corresponding preset Entity Recognition Model.
  • specifically, the positive example data and the multiple negative example data are input into the first preset pre-trained language model, which is a BERT (Bidirectional Encoder Representations from Transformers) model including the dictionary file vocab.txt; through vocab.txt, the questions in the positive and negative example data are split into words, and the word sequence of the question in each positive and negative example is obtained.
  • according to preset padding or truncation rules, word sequences of uniform length are generated.
  • the divided questions are spliced to obtain the corresponding text sequence, wherein the text sequence includes the type symbol and the position symbol of each question.
  • for example, the [CLS] character is used as the classification symbol of the text sequence, and the [SEP] character is used as the separator marking the position of each question.
  • the obtained text sequence is vectorized to obtain the text vector information corresponding to the text sequence.
  • specifically, each word in the input text sequence is represented by a pre-trained word feature vector to obtain text vector information, where the text vector information is the sum of the semantic representation information, position representation information, and segment representation information of each word in the text sequence.
  • the first preset pre-trained language model includes a multi-head attention network model; the acquired text vector information is input into the multi-head attention network model, which obtains, for each word in the input text vector, a vector representation fused with context information, so as to obtain the text semantic vector information output by the multi-head attention network model.
  • specifically, the multi-head attention network model includes a first linear mapping layer; the text vector information is mapped into different semantic spaces through the first linear mapping layer, so that the resulting semantic vectors capture semantic information of different dimensions. Self-attention operations are performed on the semantic vectors in the different semantic spaces, and the text semantic vectors of the different semantic spaces are output. These text semantic vectors are then spliced, and the spliced vector information is mapped back to the original semantic space through the first linear mapping layer to obtain the output text semantic vector information.
  • the splicing and mapping-back operation can be written as C = Concat(head_1, ..., head_h) W, where Concat is the vector splicing operation, head_1, ..., head_h are the text semantic vectors output in the different semantic spaces, W is the linear term that maps the different semantic spaces back to the initial semantic space, and C is the text semantic vector output by the multi-head self-attention network model; the spliced vector information is thus mapped back to the original semantic space through the first linear mapping layer, and the output text semantic vector information is obtained.
  • the semantic vector of the entity name and each entity alias is obtained from the text semantic vector information.
  • the second linear mapping layer of the first preset pre-trained language model performs a linear transformation on the semantic vectors of the entity name and of each entity alias to obtain the probability score value of the entity name and the probability score value of each entity alias. The obtained probability score values are normalized by softmax, the cross-entropy loss against the label value (1 or 0) is calculated, and the cross-entropy loss is used as the loss function. When the loss is obtained, the corresponding model parameters are obtained through the back-propagation mechanism, the model parameters of the first preset pre-trained language model are updated with these parameters, and the corresponding preset entity recognition model is generated.
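  • A minimal PyTorch sketch of this scoring head and loss; the shapes and vectors are illustrative stand-ins for the BERT outputs:

```python
import torch
import torch.nn as nn

hidden = 768                        # BERT-base hidden size (assumption)
head = nn.Linear(hidden, 1)         # the second linear mapping layer

# One instance: 1 positive and 2 negative candidates.
semantic_vecs = torch.randn(3, hidden)   # stand-in semantic vectors
labels = torch.tensor([1.0, 0.0, 0.0])

scores = head(semantic_vecs).squeeze(-1)    # probability score values
probs = torch.softmax(scores, dim=0)        # softmax normalization
loss = -(labels * torch.log(probs)).sum()   # cross-entropy against labels
loss.backward()                             # back-propagation of the loss
```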
  • in this way, a pre-trained language model is trained to obtain the preset entity recognition model, which realizes entity recognition of the question and thereby semantic encoding of the entity name, improving the representation ability and generalization ability, and thus the accuracy, of the preset entity recognition model.
  • FIG. 5 is a schematic flowchart of steps of training a preset attribute mapping model.
  • specifically, training the preset attribute mapping model includes steps S301 to S304.
  • Step S301 Acquire data to be trained, determine target attribute names of the questions in the data to be trained, and acquire candidate attribute names associated with the target attribute names, wherein the candidate attribute names are multiple.
  • data to be trained is obtained, and the data to be trained includes a question to be trained, a question and answer of the question to be trained, and a triplet corresponding to the question to be trained.
  • for example, the data to be trained is the question to be trained "Who is the author of Meihualuo?" together with its corresponding triple.
  • the target attribute name can be manually annotated, and the associated candidate attribute names are obtained through the target attribute name.
  • specifically, the method of obtaining the candidate attribute names includes querying the preset graph database through the target attribute name; the preset graph database includes multiple groups of triples, and each group of triples includes an entity name, an attribute name, and an attribute value. The attribute names in the triples of the same node as the target attribute are obtained and used as the candidate attribute names of the target attribute name.
  • Step S302 Generate positive example data of the to-be-trained data for the question including the target attribute name, wherein the label value of the positive example data is 1.
  • specifically, positive example data of the data to be trained is generated, wherein the positive example data includes the question to be trained, the answer to the question to be trained, and the corresponding triple; the positive example data is labeled, and its label value is 1.
  • Step S303 Replace the target attribute name in the question with each candidate attribute name respectively, to generate multiple negative example data of the data to be trained, wherein the label value of each negative example data is 0.
  • for example, when multiple candidate attribute names of the question to be trained in the data to be trained are determined, the position of the target attribute name in the question is determined, and the target attribute name is replaced with each candidate attribute name in turn.
  • for instance, the question is "What language is the TV series Xiaoao Jianghu in?", where "language" is the target attribute name of the question, and "dialect", "starring", "director", etc. are the candidate attribute names of the target attribute name.
  • Step S304 according to the positive example data and the label value of the positive example data, as well as the multiple negative example data and the label value of each negative example data, train the second pre-training language model, and generate a corresponding preset Attribute Mapping Model.
  • specifically, the positive example data and the multiple negative example data are input into the second preset pre-trained language model, which is a BERT (Bidirectional Encoder Representations from Transformers) model including the dictionary file vocab.txt; through vocab.txt, the questions in the positive and negative example data are split into words, and the word sequence of the question in each positive and negative example is obtained.
  • according to preset padding or truncation rules, word sequences of uniform length are generated.
  • the divided questions are spliced to obtain the corresponding text sequence, wherein the text sequence includes the type symbol and the position symbol of each question.
  • for example, the [CLS] character is used as the classification symbol of the text sequence, and the [SEP] character is used as the separator marking the position of each question.
  • the obtained text sequence is vectorized to obtain the text vector information corresponding to the text sequence.
  • specifically, each word in the input text sequence is represented by a pre-trained word feature vector to obtain text vector information, where the text vector information is the sum of the semantic representation information, position representation information, and segment representation information of each word in the text sequence.
  • the second preset pre-trained language model includes a multi-head attention network model; the acquired text vector information is input into the multi-head attention network model, which obtains, for each word in the input text vector, a vector representation fused with context information, so as to obtain the text semantic vector information output by the multi-head attention network model.
  • specifically, the multi-head attention network model includes a first linear mapping layer; the text vector information is mapped into different semantic spaces through the first linear mapping layer, so that the resulting semantic vectors capture semantic information of different dimensions. Self-attention operations are performed on the semantic vectors in the different semantic spaces, and the text semantic vectors of the different semantic spaces are output. These text semantic vectors are then spliced, and the spliced vector information is mapped back to the original semantic space through the first linear mapping layer to obtain the output text semantic vector information.
  • the splicing and mapping-back operation can be written as C = Concat(head_1, ..., head_h) W, where Concat is the vector splicing operation, head_1, ..., head_h are the text semantic vectors output in the different semantic spaces, W is the linear term that maps the different semantic spaces back to the initial semantic space, and C is the text semantic vector output by the multi-head self-attention network model; the spliced vector information is thus mapped back to the original semantic space through the first linear mapping layer, and the output text semantic vector information is obtained.
  • the semantic vectors of the attribute name and of each other attribute name are obtained from the text semantic vector information.
  • the second linear mapping layer of the second preset pre-trained language model performs a linear transformation on these semantic vectors to obtain the probability score value of the attribute name and the probability score value of each other attribute name. The obtained probability score values are normalized by softmax, the cross-entropy loss against the label value (1 or 0) is calculated, and the cross-entropy loss is used as the loss function.
  • when the loss is obtained, the corresponding model parameters are obtained through the back-propagation mechanism, the model parameters of the second preset pre-trained language model are updated with these parameters, and the corresponding preset attribute mapping model is generated.
  • in this way, a pre-trained language model is trained to obtain the preset attribute mapping model, which realizes attribute mapping of the question and thereby semantic encoding of the attribute name, improving the representation ability and generalization ability, and thus the accuracy, of the preset attribute mapping model.
  • FIG. 6 is a schematic block diagram of an automatic question answering apparatus provided by an embodiment of the present application.
  • the automatic question answering device 400 includes: an acquisition module 401 , a first determination module 402 , a second determination module 403 , and a third determination module 404 .
  • the obtaining module 401 is configured to obtain the entity alias of each word in the question to be predicted according to the preset alias dictionary, and use the entity alias as a candidate entity, wherein the entity alias and the candidate entity are multiple;
  • a first determination module 402 configured to determine the entity name corresponding to the to-be-predicted question according to the to-be-predicted question and a plurality of the candidate entities based on a preset entity recognition model;
  • the second determining module 403 is configured to determine, according to the entity name and the preset graph database, the triples corresponding to the entity name in the preset graph database, wherein each triple includes the entity name, an attribute name, and an attribute value, and there are multiple groups of triples;
  • the third determining module 404 is configured to determine, based on the preset attribute mapping model, the target attribute name corresponding to the question to be predicted according to each attribute name and the question to be predicted, and to use the attribute value corresponding to the target attribute name as the answer to the question to be predicted.
  • the first determining module 402 is further used for:
  • the corresponding words in the to-be-predicted question are respectively replaced to generate a plurality of text records
  • the candidate entity in the target text record is determined as the entity name, and the entity name is used as the entity name corresponding to the problem to be predicted.
  • the third determining module 404 is also specifically used for:
  • Each of the attribute names is combined with the to-be-predicted question to generate a plurality of attribute text pairs;
  • the attribute name in the target attribute text pair is used as the target attribute name corresponding to the question to be predicted.
  • the automatic question answering device is also used for:
  • the first pre-trained language model is trained according to the positive example data and the label values of the positive example data, as well as the plurality of negative example data and the label values of the respective negative example data, and a corresponding preset entity recognition model is generated .
  • the automatic question answering device is also used for:
  • generating positive example data of the data to be trained from the question including the target attribute name, wherein the label value of the positive example data is 1;
  • replacing the target attribute name in the question with each candidate attribute name respectively, and generating multiple negative example data of the data to be trained, wherein the label value of each negative example data is 0;
  • a second pre-trained language model is trained according to the positive example data and the label values of the positive example data, as well as the multiple negative example data and the label values of the respective negative example data, and a corresponding preset attribute mapping model is generated .
  • the automatic question answering device is also used for:
  • the preset graph knowledge base is used as a preset graph database
  • a node is created in the preset knowledge base and the triplet is imported at the node to generate a preset graph database.
  • the automatic question answering device is also used for:
  • the entity alias of the entity name is extracted based on the preset attribute rule, and a preset alias dictionary is generated.
  • the apparatuses provided by the above embodiments may be implemented in the form of a computer program, and the computer program may be executed on the computer device as shown in FIG. 7 .
  • FIG. 7 is a schematic block diagram of the structure of a computer device according to an embodiment of the present application.
  • the computer device may be a terminal.
  • the computer device includes a processor, a memory, and a network interface connected by a system bus, wherein the memory may include a non-volatile storage medium and an internal memory.
  • the nonvolatile storage medium can store operating systems and computer programs.
  • the computer program includes program instructions that, when executed, can cause the processor to execute any automatic question-answering method.
  • the processor is used to provide computing and control capabilities to support the operation of the entire computer equipment.
  • the internal memory provides an environment for running the computer program in the non-volatile storage medium.
  • the processor can execute any automatic question-answering method.
  • the network interface is used for network communication, such as sending assigned tasks.
  • FIG. 7 is only a block diagram of a partial structure related to the solution of the present application and does not constitute a limitation on the computer device to which the solution of the present application is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • the processor may be a central processing unit (Central Processing Unit, CPU), and the processor may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor can be a microprocessor or the processor can also be any conventional processor or the like.
  • the processor is configured to run a computer program stored in the memory to implement the following steps:
  • according to the preset alias dictionary, obtain the entity alias of each word in the question to be predicted, and use the entity aliases as candidate entities, wherein there are multiple entity aliases and candidate entities;
  • based on the preset entity recognition model, determine the entity name corresponding to the question to be predicted according to the question to be predicted and the multiple candidate entities;
  • according to the entity name and the preset graph database, determine the triples corresponding to the entity name in the preset graph database, wherein each triple includes the entity name, an attribute name, and an attribute value, and there are multiple groups of triples;
  • based on the preset attribute mapping model, determine the target attribute name corresponding to the question to be predicted according to each attribute name and the question to be predicted, and use the attribute value corresponding to the target attribute name as the answer to the question to be predicted.
  • when the processor determines, based on the preset entity recognition model, the entity name corresponding to the question to be predicted according to the question to be predicted and the multiple candidate entities, the processor is configured to implement:
  • the corresponding words in the to-be-predicted question are respectively replaced to generate a plurality of text records
  • the candidate entity in the target text record is determined as the entity name, and the entity name is used as the entity name corresponding to the problem to be predicted.
  • when the processor determines, based on the preset attribute mapping model, the target attribute name corresponding to the question to be predicted according to each of the attribute names and the question to be predicted, the processor is configured to implement:
  • Each of the attribute names is combined with the to-be-predicted question to generate a plurality of attribute text pairs;
  • the attribute name in the target attribute text pair is used as the target attribute name corresponding to the question to be predicted.
  • when implementing the automatic question answering method, the processor is further configured to implement:
  • the first pre-trained language model is trained according to the positive example data and the label values of the positive example data, as well as the plurality of negative example data and the label values of the respective negative example data, and a corresponding preset entity recognition model is generated .
  • when implementing the automatic question answering method, the processor is further configured to implement:
  • generating positive example data of the data to be trained from the question including the target attribute name, wherein the label value of the positive example data is 1;
  • replacing the target attribute name in the question with each candidate attribute name respectively, and generating multiple negative example data of the data to be trained, wherein the label value of each negative example data is 0;
  • a second pre-trained language model is trained according to the positive example data and the label values of the positive example data, as well as the multiple negative example data and the label values of the respective negative example data, and a corresponding preset attribute mapping model is generated .
  • before the processor determines, according to the entity name and the preset graph database, the triples corresponding to the entity name, the processor is configured to implement:
  • the preset graph knowledge base is used as a preset graph database
  • a node is created in the preset knowledge base and the triplet is imported at the node to generate a preset graph database.
  • before the processor obtains the entity aliases in the question to be predicted according to the preset alias dictionary and uses the entity aliases as candidate entities, the processor is configured to implement:
  • the entity alias of the entity name is extracted based on the preset attribute rule, and a preset alias dictionary is generated.
  • Embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, the computer program includes program instructions, and for the method implemented when the program instructions are executed, reference may be made to the embodiments of the automatic question answering method of the present application.
  • the computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiments, such as a hard disk or a memory of the computer device.
  • the computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk equipped on the computer device, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), etc.
  • the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function, and the like, and the storage data area may store data created according to the use of the node, and the like.
  • Blockchain, essentially a decentralized database, is a series of data blocks associated with one another by cryptographic methods; each data block contains a batch of network transaction information, which is used to verify the validity of its information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Automatic question answering method and apparatus, device, and computer-readable storage medium. The method comprises: acquiring, according to a preset alias dictionary, candidate entities for each word in a question to be predicted; determining, based on a preset entity recognition model and according to the question to be predicted and the plurality of candidate entities, an entity name corresponding to the question to be predicted; determining, according to the entity name and a preset graph database, triples corresponding to the entity name; and determining, based on a preset attribute mapping model and according to attribute names and the question to be predicted, a target attribute name corresponding to the question to be predicted, and using the attribute value corresponding to the target attribute name as the answer to the question to be predicted. Semantic encoding is implemented both in entity recognition of a question by the preset entity recognition model and in attribute mapping of the question by the attribute mapping model, so that the representation ability and generalization ability over the machine-read text are improved, and thus the accuracy of the preset entity recognition model and of the preset attribute mapping model is improved.
PCT/CN2021/097419 2020-10-29 2021-05-31 Automatic question answering method and apparatus, device and storage medium WO2022088671A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011187360.9 2020-10-29
CN202011187360.9A CN112328759A (zh) 2020-10-29 2020-10-29 自动问答方法、装置、设备及存储介质

Publications (1)

Publication Number Publication Date
WO2022088671A1 (fr)

Family

ID=74296400

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/097419 WO2022088671A1 (fr) 2020-10-29 2021-05-31 Automatic question answering method and apparatus, device and storage medium

Country Status (2)

Country Link
CN (1) CN112328759A (fr)
WO (1) WO2022088671A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112328759A (zh) * 2020-10-29 2021-02-05 平安科技(深圳)有限公司 Automatic question answering method and apparatus, device, and storage medium
CN113761940B (zh) * 2021-09-09 2023-08-11 杭州隆埠科技有限公司 News subject determination method, device, and computer-readable medium
CN114817510B (zh) * 2022-06-23 2022-10-14 清华大学 Question answering method, and question answering dataset generation method and apparatus

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190065576A1 (en) * 2017-08-23 2019-02-28 Rsvp Technologies Inc. Single-entity-single-relation question answering systems, and methods
CN107748757A (zh) * 2017-09-21 2018-03-02 北京航空航天大学 Knowledge graph-based question answering method
CN110502621A (zh) * 2019-07-03 2019-11-26 平安科技(深圳)有限公司 Question answering method and apparatus, computer device, and storage medium
CN110837550A (zh) * 2019-11-11 2020-02-25 中山大学 Knowledge graph-based question answering method and apparatus, electronic device, and storage medium
CN110765257A (zh) * 2019-12-30 2020-02-07 杭州识度科技有限公司 Knowledge graph-driven legal intelligent consultation system
CN112328759A (zh) * 2020-10-29 2021-02-05 平安科技(深圳)有限公司 Automatic question answering method and apparatus, device, and storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114723073A (zh) * 2022-06-07 2022-07-08 阿里健康科技(杭州)有限公司 Language model pre-training and product search method, apparatus, and computer device
CN114723073B (zh) * 2022-06-07 2023-09-05 阿里健康科技(杭州)有限公司 Language model pre-training and product search method, apparatus, and computer device
CN116991875A (zh) * 2023-09-26 2023-11-03 海信集团控股股份有限公司 Large model-based SQL statement generation and alias mapping method and device
CN116991875B (zh) * 2023-09-26 2024-03-08 海信集团控股股份有限公司 Large model-based SQL statement generation and alias mapping method and device
CN117149985A (zh) * 2023-10-31 2023-12-01 海信集团控股股份有限公司 Large model-based question answering method, apparatus, device, and medium
CN117149985B (zh) * 2023-10-31 2024-03-19 海信集团控股股份有限公司 Large model-based question answering method, apparatus, device, and medium

Also Published As

Publication number Publication date
CN112328759A (zh) 2021-02-05

Similar Documents

Publication Publication Date Title
WO2022088672A1 (fr) Procédé et appareil de compréhension de lecture de machine basés sur bert, dispositif, et support de stockage
WO2022088671A1 (fr) Procédé et appareil de réponse automatique à des questions, dispositif et support de mémoire
US11327978B2 (en) Content authoring
WO2021135910A1 (fr) Procédé d'extraction d'informations basé sur la compréhension de lecture de machine et dispositif associé
CN108959246B (zh) 基于改进的注意力机制的答案选择方法、装置和电子设备
US10650102B2 (en) Method and apparatus for generating parallel text in same language
WO2022105122A1 (fr) Procédé et appareil de génération de réponse basés sur l'intelligence artificielle, ainsi que dispositif informatique et support
US10740678B2 (en) Concept hierarchies
US9373075B2 (en) Applying a genetic algorithm to compositional semantics sentiment analysis to improve performance and accelerate domain adaptation
WO2020147238A1 (fr) Procédé de détermination de mot-clé, procédé, appareil et dispositif de notation automatique, et support
US20170330084A1 (en) Clarification of Submitted Questions in a Question and Answer System
US11017301B2 (en) Obtaining and using a distributed representation of concepts as vectors
US20170161619A1 (en) Concept-Based Navigation
CN110929125B (zh) 搜索召回方法、装置、设备及其存储介质
US10108661B2 (en) Using synthetic events to identify complex relation lookups
CN112287069B (zh) 基于语音语义的信息检索方法、装置及计算机设备
CN115668168A (zh) 用于处理数据记录的方法和***
US20220300543A1 (en) Method of retrieving query, electronic device and medium
US20160110364A1 (en) Realtime Ingestion via Multi-Corpus Knowledge Base with Weighting
US11669679B2 (en) Text sequence generating method and apparatus, device and medium
WO2022141872A1 (fr) Procédé et appareil de génération de résumé de document, dispositif informatique et support de stockage
CN116821373A (zh) 基于图谱的prompt推荐方法、装置、设备及介质
CN115600605A (zh) 一种中文实体关系联合抽取方法、***、设备及存储介质
WO2022073341A1 (fr) Procédé et appareil de mise en correspondance d'entités de maladie fondés sur la sémantique vocale, et dispositif informatique
CN114842982B (zh) 一种面向医疗信息***的知识表达方法、装置及***

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21884414

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21884414

Country of ref document: EP

Kind code of ref document: A1