CN114265922A - Automatic question answering and model training method and device based on cross-language


Info

Publication number
CN114265922A
Authority
CN
China
Prior art keywords
language
entity
tuple
training
question
Prior art date
Legal status
Pending
Application number
CN202111395477.0A
Other languages
Chinese (zh)
Inventor
段智超
李秀星
李振宇
王建勇
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202111395477.0A
Publication of CN114265922A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a cross-language-based automatic question answering and model training method and device. The method includes: acquiring a question and a text, where the language of the text differs from the language of the question; and inputting the question and the text into an automatic question-answering model to obtain position information of the answer to the question in the text output by the automatic question-answering model. The automatic question-answering model is obtained by training a preset model containing a knowledge-enhanced pre-training model on question samples and text samples containing the answers to those question samples, and the knowledge-enhanced pre-training model is obtained by pre-training an initial pre-training model on a plurality of tuples constructed from a knowledge graph. Cross-language knowledge can thus be learned to establish correspondences between different languages, which strengthens the reasoning and understanding abilities of the knowledge-enhanced pre-training model for different languages and improves the performance of the automatic question-answering model on cross-language question answering.

Description

Automatic question answering and model training method and device based on cross-language
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a cross-language-based automatic question answering and model training method and device.
Background
At present, the extractive question-answering task has attracted wide attention as one of the natural language processing tasks. In the extractive question-answering task, given a question and a passage of text, the goal is to find the span in the passage that answers the question.
With the rapid development of information technology, new question-answering scenarios keep emerging, posing new challenges to existing question-answering methods. One such challenge is the cross-language question-answering task, in which the question and the answer exist in different languages. For example, for question 1 expressed in Chinese (meaning "How much did crude oil rise today?"), the answer "$1.55" must be extracted from text 1 expressed in English: "A barrel of light sweet crude on the New York Mercantile Exchange rose $1.55 to settle at $63.38". Conversely, for question 2 expressed in English, "How much did crude oil rise today?", the answer "$1.55" must be extracted from text 2 expressed in Chinese (meaning "a barrel of light sweet crude on the New York Mercantile Exchange rose $1.55 today, closing at $63.38"). For the cross-language question-answering task, no scheme with good performance currently exists, so the answers to some questions cannot be extracted. How to improve the performance of cross-language question answering has become one of the problems the question-answering task urgently needs to solve.
Disclosure of Invention
The embodiments of the invention provide a cross-language-based automatic question answering and model training method and device, which overcome the defect in the prior art that, lacking a well-performing scheme for the cross-language question-answering task, the answers to some questions cannot be extracted; they improve the performance of the automatic question-answering model so that the answers to questions can be extracted accurately.
In a first aspect, an embodiment of the present invention provides an automatic question answering method based on cross-language, including:
obtaining a question and a text, wherein the language of the text is different from the language of the question;
inputting the question and the text into an automatic question-answering model to obtain position information of answers of the question in the text output by the automatic question-answering model;
the automatic question-answering model is obtained by training a preset model comprising a knowledge-enhanced pre-training model based on a question sample and a text sample comprising answers of the question sample, and the knowledge-enhanced pre-training model is obtained by pre-training an initial pre-training model based on a plurality of multi-tuples constructed by using a knowledge graph;
the plurality of tuples includes tuples of a first type, a second type, and a third type; each tuple of the first type includes a head entity, a relation, and a tail entity all expressed in the same one of a plurality of languages; each tuple of the second type includes a head entity expressed in one of the plurality of languages, a relation expressed in the same language as the head entity or as the tail entity, and a tail entity expressed in a language different from that of the head entity; each tuple of the third type includes a head entity, a relation, and a tail entity expressed in one of the plurality of languages, together with the head entity, the relation, and the tail entity expressed in another language.
Optionally, the pre-training process of the knowledge-enhanced pre-training model includes:
for each tuple, inputting the tuple to the initial pre-training model, training the initial pre-training model to output a head entity of each language expression in the tuple, and inputting the tuple to the initial pre-training model, training the initial pre-training model to output a tail entity of each language expression in the tuple.
Optionally, the inputting the tuples into the initial pre-training model, training the initial pre-training model to output head entities of each language expression in the tuples, and inputting the tuples into the initial pre-training model, training the initial pre-training model to output tail entities of each language expression in the tuples includes:
inputting the tuple into the initial pre-training model, masking the head entity expressed in each language in the tuple, and training the initial pre-training model to output the head entity expressed in each language in the tuple;
inputting the tuple into the initial pre-training model, masking the tail entity expressed in each language in the tuple, and training the initial pre-training model to output the tail entity expressed in each language in the tuple.
Optionally, the first loss function used for the pre-training of the knowledge-enhanced pre-training model is obtained based on a maximum likelihood estimation.
Optionally, before the inputting the tuple to the initial pre-training model and training the initial pre-training model to output a head entity of each linguistic expression in the tuple, and inputting the tuple to the initial pre-training model and training the initial pre-training model to output a tail entity of each linguistic expression in the tuple, the method further includes:
acquiring the knowledge graph, wherein the knowledge graph comprises a plurality of preset triples, and the preset triples comprise a head entity, a relation and a tail entity;
acquiring a head entity and a tail entity in each preset triple to form an entity set, and acquiring multiple language expressions corresponding to the entity from a preset language database aiming at each entity in the entity set;
acquiring the relation in each preset triple to form a relation set, and acquiring multiple language expressions corresponding to the relation from the preset language database aiming at each relation in the relation set;
and respectively constructing the first type of multi-tuple, the second type of multi-tuple and the third type of multi-tuple based on the plurality of preset triples, the entity set and the relationship set.
Optionally, constructing the first type of tuple includes:
performing the following operations on each preset triplet:
sequentially selecting one of the multiple languages as a current language;
and acquiring, based on the current language, a head entity and a tail entity expressed in the current language from the entity set for the head entity and the tail entity in the preset triple, and acquiring a relation expressed in the current language from the relation set for the relation in the preset triple, so as to obtain the tuples of the first type.
Optionally, constructing the second type of tuple includes:
performing the following operations on each preset triplet:
sequentially selecting one of the multiple languages as a current language;
determining at least one language different from the current language type in the multiple languages based on the current language, and respectively combining each language different from the current language type with the current language to form at least one first cross-language combination; for each first cross-language combination, aiming at a head entity and a tail entity in the preset triples, acquiring a head entity expressed by the current language in the first cross-language combination and a tail entity expressed by a language different from the current language in the first cross-language combination from the entity set, and aiming at the relationship in the preset triples, acquiring the relationship expressed by the current language or the language different from the current language in the first cross-language combination from the relationship set to obtain the tuples in the second type of tuples.
Optionally, constructing the third type of tuple includes:
performing the following operations on each preset triplet:
sequentially selecting one of the multiple languages as a current language;
determining at least one language different from the current language type in the multiple languages based on the current language, and respectively combining each language different from the current language type with the current language to form at least one second cross-language combination; for each second cross-language combination, acquiring a head entity and a tail entity of a current language expression in the second cross-language combination from the entity set aiming at the head entity and the tail entity in the preset triple, acquiring a relation of the current language expression in the second cross-language combination from the relation set aiming at the relation in the preset triple, acquiring a head entity and a tail entity of a language expression different from the current language type in the second cross-language combination from the entity set aiming at the head entity and the tail entity in the preset triple, and acquiring a relation of a language expression different from the current language type in the second cross-language combination from the relation set aiming at the relation in the preset triple so as to obtain the multi-tuple in the third type multi-tuple.
In a second aspect, an embodiment of the present invention provides a cross-language-based automatic question-answering model training method, including:
acquiring a plurality of tuples constructed based on the knowledge graph; the plurality of tuples includes tuples of a first type, a second type, and a third type; each tuple of the first type includes a head entity, a relation, and a tail entity all expressed in the same one of a plurality of languages; each tuple of the second type includes a head entity expressed in one of the plurality of languages, a relation expressed in the same language as the head entity or as the tail entity, and a tail entity expressed in a language different from that of the head entity; each tuple of the third type includes a head entity, a relation, and a tail entity expressed in one of the plurality of languages, together with the head entity, the relation, and the tail entity expressed in another language;
sequentially inputting each multi-tuple to an initial pre-training model for pre-training to obtain a knowledge-enhanced pre-training model;
inputting a question sample and a text sample containing answers of the question sample into a preset model containing the knowledge-enhanced pre-training model for training to obtain an automatic question and answer model, wherein the automatic question and answer model is used for extracting position information of the answers of the questions from the text.
In a third aspect, embodiments of the present invention also provide an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the program, the steps of the methods provided in the first and second aspects are implemented.
In a fourth aspect, embodiments of the present invention also provide a non-transitory computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps of the method as provided in the first and second aspects above.
The invention provides a cross-language-based automatic question-answering method. A question and a text are acquired, the language of the text being different from that of the question, and the question and the text are input into an automatic question-answering model to obtain position information of the answer to the question in the text output by the model. The automatic question-answering model is obtained by training a preset model containing a knowledge-enhanced pre-training model on question samples and text samples, and the knowledge-enhanced pre-training model is obtained by continuing to pre-train an initial pre-training model on a plurality of tuples constructed from a knowledge graph. The plurality of tuples includes tuples of a first type, a second type, and a third type: each tuple of the first type includes a head entity, a relation, and a tail entity expressed in the same one of a plurality of languages; each tuple of the second type includes a head entity expressed in one of the plurality of languages, a relation expressed in the same language as the head entity or as the tail entity, and a tail entity expressed in a language different from that of the head entity; each tuple of the third type includes a head entity, a relation, and a tail entity expressed in one of the plurality of languages, together with the head entity, the relation, and the tail entity expressed in another language. This is not only beneficial for extracting the key information of entities and relations and filtering out irrelevant information, but also injects knowledge common to several languages into the knowledge-enhanced pre-training model; cross-language knowledge can be learned to establish correspondences between different languages and thereby connect them, so the reasoning and understanding abilities of the knowledge-enhanced pre-training model are strengthened. On this basis, the automatic question-answering model can mine the latent relations between different languages, its cross-language reasoning capability is improved, its performance on cross-language question answering is improved, and the answer to the question is accurately extracted from the text.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of an automatic question answering method based on cross-language according to an embodiment of the present invention;
FIG. 2 is a second flowchart of the cross-language based automatic question answering method according to the embodiment of the present invention;
FIG. 3 is a third flowchart of an automatic question answering method based on cross-language according to an embodiment of the present invention;
FIG. 4 is a fourth flowchart illustrating an automatic question answering method based on cross-language according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the languages contained in the MLQA provided by an embodiment of the present invention;
FIG. 6 is a schematic diagram of an example of a question-answer provided by an embodiment of the present invention;
FIG. 7 is a flowchart illustrating a cross-language based method for training an automatic question-answering model according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The shortage of information connecting different languages has become a bottleneck restricting the performance of the cross-language question-answering task and greatly limits the reasoning capability of an automatic question-answering model. It is therefore desirable to deeply mine the rich semantic knowledge contained in each language and the latent associations between different languages, so as to improve the cross-language reasoning capability of the model. The cross-language-based automatic question answering method provided by the embodiments of the invention is described in more detail below.
Fig. 1 is a schematic flowchart of an automatic question answering method based on cross-language according to an embodiment of the present invention.
The cross-language-based automatic question answering method provided by this embodiment may be executed by an intelligent terminal such as a computer or a combination of hardware and/or software therein, or may be executed by a server or a combination of hardware and/or software therein, as shown in fig. 1, the method may include:
Step 110, a question and a text are obtained, the language of the text being different from the language of the question.
Step 120, inputting the question and the text into an automatic question-answering model to obtain position information of answers of the question in the text output by the automatic question-answering model;
the automatic question-answering model is obtained by training a preset model comprising a knowledge-enhanced pre-training model based on a question sample and a text sample comprising answers of the question sample, and the knowledge-enhanced pre-training model is obtained by pre-training an initial pre-training model based on a plurality of multi-tuples constructed by using a knowledge graph;
the plurality of tuples includes tuples of a first type, a second type, and a third type; each tuple of the first type includes a head entity, a relation, and a tail entity all expressed in the same one of a plurality of languages; each tuple of the second type includes a head entity expressed in one of the plurality of languages, a relation expressed in the same language as the head entity or as the tail entity, and a tail entity expressed in a language different from that of the head entity; each tuple of the third type includes a head entity, a relation, and a tail entity expressed in one of the plurality of languages, together with the head entity, the relation, and the tail entity expressed in another language.
Each of the plurality of languages is a natural language. A natural language is a language that evolves naturally with culture and is the main tool for human communication and thinking. For example, Chinese, English, German, Spanish, Hindi, Arabic, and Vietnamese are natural languages.
A knowledge graph is a representation of knowledge; it can be represented in the form of triples and establishes connections between entities. An entity is the most basic element in a knowledge graph and can be a person, an object, an organization, and the like. A triple may include a head entity, a relation, and a tail entity, denoted (h, r, t), where the relation r is the relation between the head entity h and the tail entity t. For example, in the sentence "Kevin Durant is a basketball player", "Kevin Durant" is the head entity, "basketball player" is the tail entity, and "is a" is the relation between "Kevin Durant" and "basketball player".
In practical application, a plurality of tuples used for pre-training can be constructed based on the knowledge graph. A tuple of the first type is denoted (h_i, r_i, t_i): it includes a head entity h_i, a relation r_i, and a tail entity t_i expressed in a language i among the plurality of languages. A tuple of the second type is denoted (h_j, r_i, t_i) or (h_i, r_i, t_j), where language i and language j are different languages among the plurality of languages; the tuple (h_j, r_i, t_i) includes a head entity h_j expressed in language j together with a relation r_i and a tail entity t_i expressed in language i, and the tuple (h_i, r_i, t_j) includes a head entity h_i and a relation r_i expressed in language i together with a tail entity t_j expressed in language j. A tuple of the third type is denoted (h_i, r_i, t_i, h_j, r_j, t_j), where language i and language j are different languages among the plurality of languages; it includes a head entity h_i, a relation r_i, and a tail entity t_i expressed in one language i, together with a head entity h_j, a relation r_j, and a tail entity t_j expressed in another language j.
The plurality of languages may include at least two languages. The above multiple tuples are illustrated by taking two languages, Chinese and English, as an example.
Illustratively, (Kevin Durant, is a, Basketball Player) is a tuple of the first type in which the head entity, the relation, and the tail entity are all expressed in English. (凯文·杜兰特, 是, 篮球运动员) is also a tuple of the first type, in which the head entity, the relation, and the tail entity are all expressed in Chinese. Here "凯文·杜兰特" and "Kevin Durant" correspond to the same entity, "篮球运动员" and "Basketball Player" correspond to the same entity, and "是" and "is a" correspond to the same relation; only the language of expression differs.
(凯文·杜兰特, 是, Basketball Player) is a tuple of the second type, in which the head entity and the relation are expressed in Chinese and the tail entity is expressed in English.
(Kevin Durant, is a, Basketball Player, 凯文·杜兰特, 是, 篮球运动员) is a tuple of the third type; it includes the head entity, relation, and tail entity expressed in English as well as the head entity, relation, and tail entity expressed in Chinese.
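For illustration only, the three types of tuples can be written down as plain data (a minimal sketch; the variable names and the use of Python tuples are assumptions, not part of the claimed method):

```python
# First type: head entity, relation, and tail entity all expressed in one language.
first_en = ("Kevin Durant", "is a", "Basketball Player")   # all English
first_zh = ("凯文·杜兰特", "是", "篮球运动员")                  # all Chinese

# Second type: head entity and relation in one language, tail entity in another.
second = ("凯文·杜兰特", "是", "Basketball Player")            # Chinese head/relation, English tail

# Third type: the same fact expressed in two languages, concatenated.
third = first_en + first_zh                                  # 6-element tuple
```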
The initial pre-training model is already pre-trained initially by a pre-training method in the related art, and in this step, the initial pre-training model is further pre-trained by using the plurality of tuples.
For a sentence, the head entity, the relation, and the tail entity in a tuple are the critical information. After pre-training with the tuples, the model is better at extracting this critical information and filtering out irrelevant information; not only can knowledge within one language (i.e., monolingual knowledge) be learned, but cross-language knowledge can also be learned to establish correspondences between different languages, thereby connecting them. Taking the third-type tuple (Kevin Durant, is a, Basketball Player, 凯文·杜兰特, 是, 篮球运动员) as an example, the correspondence between the English "Basketball Player" and the Chinese "篮球运动员" can be learned without any translation step. In this way, knowledge common to several languages is injected, and the knowledge-enhanced pre-training model is obtained, so that its reasoning and understanding abilities for different languages are strengthened.
The position information of the answer may include a start position and an end position of the answer.
In this embodiment, a question and a text whose language differs from that of the question are acquired and input into the automatic question-answering model, which outputs position information of the answer to the question in the text. The automatic question-answering model is obtained by training a preset model containing a knowledge-enhanced pre-training model on question samples and text samples, and the knowledge-enhanced pre-training model is obtained by continuing to pre-train an initial pre-training model on a plurality of tuples constructed from a knowledge graph: tuples of a first type, each including a head entity, a relation, and a tail entity expressed in the same one of a plurality of languages; tuples of a second type, each including a head entity expressed in one of the plurality of languages, a relation expressed in the same language as the head entity or as the tail entity, and a tail entity expressed in a language different from that of the head entity; and tuples of a third type, each including a head entity, a relation, and a tail entity expressed in one language together with the head entity, the relation, and the tail entity expressed in another language. This is not only beneficial for extracting the key information of entities and relations and filtering out irrelevant information, but also injects knowledge common to several languages into the knowledge-enhanced pre-training model, and cross-language knowledge can be learned to establish correspondences between different languages so as to connect them, strengthening the reasoning and understanding abilities of the knowledge-enhanced pre-training model. On this basis, the automatic question-answering model can mine the latent relations between different languages, and its cross-language reasoning capability is improved, so its performance on cross-language question answering is improved and the answer to the question is accurately extracted from the text.
Based on the above embodiments, the training process of the automatic question-answering model may include: inputting the question sample and the text sample containing the answer of the question sample into a preset model containing a knowledge-enhanced pre-training model for training so as to obtain an automatic question-answering model.
Specifically, question samples expressed in one of the plurality of languages and text samples containing the answers to those question samples may be input into a preset model containing the knowledge-enhanced pre-training model for training, so as to obtain the automatic question-answering model. Illustratively, question samples and text samples in the widely used English language may be employed. As shown in Fig. 2, this step mainly fine-tunes the preset model containing the knowledge-enhanced pre-training model so that it adapts to the automatic question-answering task, yielding the automatic question-answering model; subsequently, the automatic question-answering model can be used to predict the position information of the answer to a question in a text.
Based on any of the above embodiments, inputting the question sample and the text sample containing the answer to the question sample into a preset model containing the knowledge-enhanced pre-training model for training to obtain the automatic question-answering model may specifically include: inputting the question sample and the text sample containing the answer to the question sample into the preset model containing the knowledge-enhanced pre-training model to obtain position information of the answer to the question sample; extracting the answer to the question sample from the text sample based on that position information; computing a second loss function based on the extracted answer and the answer pre-annotated for the question sample; and training the preset model using the second loss function to obtain the automatic question-answering model.
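The patent does not give the concrete form of the second loss function; a common choice for extractive question answering, shown here purely as an assumption, is cross-entropy between the predicted start/end distributions and the annotated span positions:

```python
import torch.nn.functional as F

def span_loss(start_logits, end_logits, gold_start, gold_end):
    """Assumed form of the 'second loss function': cross-entropy over the
    start and end positions of the annotated answer span.
    start_logits, end_logits: (batch, seq_len); gold_start, gold_end: (batch,)"""
    return (F.cross_entropy(start_logits, gold_start)
            + F.cross_entropy(end_logits, gold_end)) / 2
```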
Based on the above embodiments, the pre-training process of the knowledge-enhanced pre-training model may include: inputting each tuple into the initial pre-training model and training the initial pre-training model to output the head entity expressed in each language in the tuple, and inputting the tuple into the initial pre-training model and training the initial pre-training model to output the tail entity expressed in each language in the tuple. In this embodiment, during pre-training, the initial pre-training model is trained to output the entities expressed in each language in the tuple, so that multilingual knowledge is injected; that is, the knowledge stored in the pre-training model is enriched by means of a link-prediction technique, and in turn the knowledge stored in the automatic question-answering model is enriched.
Based on the above embodiment, the specific implementation of inputting the tuple into the initial pre-training model and training it to output the head entity and the tail entity expressed in each language in the tuple may include: inputting the tuple into the initial pre-training model, masking the head entity expressed in each language in the tuple, and training the initial pre-training model to output that head entity; and inputting the tuple into the initial pre-training model, masking the tail entity expressed in each language in the tuple, and training the initial pre-training model to output that tail entity.
In practice, one of the important reasons that a pre-trained model can be successful in a wide range of applications is to optimize the entire model using unsupervised training objectives. More specifically, methods using mask language modeling build a pre-trained model based on large unlabeled text.
Taking Fig. 2 as an example, for the first-type tuple (Kevin Durant, is a, Basketball Player), the second-type tuple (凯文·杜兰特, 是, Basketball Player), and the third-type tuple (Kevin Durant, is a, Basketball Player, 凯文·杜兰特, 是, 篮球运动员), the English-expressed entity "Basketball Player" and the Chinese-expressed entity "篮球运动员" are replaced with the [MASK] token so as to mask them; the goal of pre-training is then to recover these missing (i.e., masked) entities, obtaining the corresponding outputs for the first, second, and third types of tuples. As shown in Fig. 2, the English-expressed "Basketball Player" and the Chinese-expressed "篮球运动员" corresponding to each tuple are output.
In this embodiment, a masked-language-modeling approach is adopted: the head entity or the tail entity of a tuple is masked and then output directly, i.e., a generative method completes the missing entity (entity completion), thereby strengthening the reasoning ability of the model. The masked entity is output under the explicit constraint of the entities expressed in each unmasked language, which leads to better model parameters.
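A minimal sketch of the masking step described above (the token granularity and the [MASK] string follow BERT-style conventions and are assumptions here):

```python
def mask_entity(tuple_tokens, entity_surface_forms, mask_token="[MASK]"):
    """Replace every occurrence of the target entity (in any language) with
    [MASK]; the model is then trained to regenerate the masked entity from
    the unmasked entities and the relation."""
    targets = set(entity_surface_forms)
    return [mask_token if tok in targets else tok for tok in tuple_tokens]

# Third-type tuple with both language expressions of the tail entity masked:
tokens = ["Kevin Durant", "is a", "Basketball Player", "凯文·杜兰特", "是", "篮球运动员"]
print(mask_entity(tokens, ["Basketball Player", "篮球运动员"]))
# -> ['Kevin Durant', 'is a', '[MASK]', '凯文·杜兰特', '是', '[MASK]']
```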
Based on any of the above embodiments, the first loss function employed for the pre-training of the knowledge-enhanced pre-trained model is derived based on a maximum likelihood estimation. Specifically, when the pre-training is performed, the training is stopped when the first loss function is smaller than a preset threshold. The preset threshold may be set according to actual conditions, and is not specifically limited herein.
In practical applications of masked language modeling, a model is trained by minimizing a loss function to predict missing tokens from the given information; generally, a sentence is divided into several tokens. For example, the Chinese sentence meaning "Kevin Durant is a basketball player" is divided into three tokens: "凯文·杜兰特", "是", and "篮球运动员".
The loss function based on maximum likelihood estimation usually adopted in masked language modeling is shown in formula (1):

\[ L_{\mathrm{mlm}}(\theta) = -\sum_{k=1}^{n} m_k \log p(x_k \mid x \circ m;\ \theta) \tag{1} \]

where x is the sequence of tokens derived from a sentence, comprising x_1, x_2, ..., x_k, ..., x_n, with n the number of tokens and x_k the k-th token; m is an indicator sequence comprising m_1, m_2, ..., m_k, ..., m_n, where m_k indicates whether the k-th token in the sequence is replaced by [MASK]; x ∘ m denotes the element-wise product of x and m; θ is the parameter of the model; p(x_k | x ∘ m; θ) is the predicted probability of x_k; mlm denotes masked language modeling; and L_mlm(θ) is the loss function adopted for masked language modeling.
The specific implementation of the maximum likelihood estimation may be implemented by referring to the related art, which is not described herein.
Based on this, in the present embodiment, the first loss function adopted may be formulated as follows:

\[ L_{\mathrm{ec}}(\theta) = -\sum_{y=1}^{w} \log p(e_y \mid e', r;\ \theta) \]

where θ is the parameter of the model, w is the number of masked entities in the tuple, e_y denotes the y-th masked entity, p(e_y | e', r; θ) is the predicted probability of the masked entity e_y, e' denotes the unmasked entities, r denotes the relation, ec denotes entity completion, and L_ec(θ) is the first loss function adopted for entity completion.
In this embodiment, pre-training based on the first loss function derived from maximum likelihood estimation is beneficial for obtaining better model parameters.
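Under the assumption that each masked entity is predicted token-by-token with a softmax over the vocabulary, the first loss function can be computed as a sum of negative log-likelihoods at the masked positions (a sketch, not the patent's exact implementation):

```python
import torch
import torch.nn.functional as F

def entity_completion_loss(logits, target_ids, masked_positions):
    """L_ec(theta): negative log-likelihood of the masked entities e_y given
    the unmasked entities e' and the relation r, summed over all masked
    positions. logits: (seq_len, vocab_size); target_ids: (seq_len,)."""
    idx = torch.as_tensor(masked_positions)
    return F.cross_entropy(logits[idx], target_ids[idx], reduction="sum")
```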
Based on any of the above embodiments, the automatic question-answering model may further include a fully connected layer, and inputting the question and the text into the automatic question-answering model to obtain the position information of the answer to the question in the text output by the model may specifically include: inputting the question and the text into the knowledge-enhanced pre-training model to obtain feature vectors, and inputting the feature vectors into the fully connected layer to obtain the position information of the answer to the question in the text. The fully connected layer thus performs a dimensionality reduction and yields the position information of the answer.
In practical applications, the knowledge-enhanced pre-training model may include a Transformer model. The Transformer is a model based on the attention mechanism and trains quickly.
As shown in Fig. 3, inputting the question and the text into the automatic question-answering model may specifically include: concatenating the question and the text, adding a first token [CLS] before the question and a second token [SEP] between the question and the text to help the model distinguish the two parts ([CLS] is generally placed at the head of the first sentence, and [SEP] is used to separate two sentences); passing the question and the text through a Token Embedding layer, a Position Embedding layer, and a Segment Embedding layer respectively for embedding; inputting the result into the knowledge-enhanced pre-training model to obtain feature vectors; and inputting the feature vectors into the fully connected layer to obtain the position information of the answer in the text.
Token embedding converts each token into a vector. Position embedding assigns position information to each token; the different positions are indicated in Fig. 3 by 1, 2, 3, ..., n, n+1, n+2, n+3, n+4, n+5. Segment embedding distinguishes the tokens of different sentences; in Fig. 3, the tokens of the question are marked with symbol A and the tokens of the text with symbol B.
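The fully connected output layer can be sketched as follows (a standard span-prediction head, given here under the assumption that the pre-training model returns one feature vector per token):

```python
import torch.nn as nn

class QAOutputHead(nn.Module):
    """Maps each token's feature vector to a start score and an end score;
    the highest-scoring positions give the position information of the answer."""
    def __init__(self, hidden_size):
        super().__init__()
        self.fc = nn.Linear(hidden_size, 2)  # dimensionality reduction to 2 scores

    def forward(self, features):                     # (batch, seq_len, hidden)
        start_logits, end_logits = self.fc(features).split(1, dim=-1)
        return start_logits.squeeze(-1), end_logits.squeeze(-1)

# Input layout fed to the model: [CLS] question tokens [SEP] text tokens,
# with token + position + segment (A = question, B = text) embeddings summed.
```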
Based on any of the above embodiments, before inputting the tuple to the initial pre-training model and training the initial pre-training model to output the head entity of each linguistic expression in the tuple, and inputting the tuple to the initial pre-training model and training the initial pre-training model to output the tail entity of each linguistic expression in the tuple, as shown in fig. 4, the method may further include:
step 410, acquiring a knowledge graph, wherein the knowledge graph comprises a plurality of preset triples, and the preset triples comprise a head entity, a relation and a tail entity.
For example, a plurality of preset triples may be obtained from a preset knowledge-graph database, such as Wikidata. For example, 20,000 triples may be extracted from Wikidata.
Step 420, obtaining a head entity and a tail entity in each preset triple to form an entity set, and obtaining multiple language expressions corresponding to the entity from a preset language database for each entity in the entity set.
In practical application, a head entity and a tail entity in each preset triple are used to form an entity set E. Multiple language expressions may be stored for an entity in a preset language database. Based on this, a plurality of language expressions can be obtained for each entity in the entity set by accessing the preset language database.
And 430, acquiring the relation in each preset triple to form a relation set, and acquiring multiple language expressions corresponding to the relation from a preset language database aiming at each relation in the relation set.
Similarly, a relationship set R is formed by using the relationships in each preset triplet. A plurality of language expressions may be stored for the relationship in a preset language database. Based on this, multiple linguistic expressions can be obtained for each relationship in the set of relationships by accessing a preset linguistic database.
Step 440, respectively constructing a first type of tuple, a second type of tuple, and a third type of tuple based on the plurality of preset triples, the entity set, and the relationship set.
In this embodiment, the preset triples in the knowledge graph are used to construct an entity set formed by entities expressed in multiple languages, and a relationship set formed by relationships expressed in multiple languages, so that the automatic construction of the various types of tuples can be realized based on the multiple preset triples, the entity set, and the relationship set.
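Steps 410 to 440 can be condensed into the following sketch; the nested mapping lang_db, standing in for the preset language database, is an assumed structure:

```python
def collect_sets(preset_triples, lang_db, langs):
    """Steps 420-430: form the entity set and the relation set from the preset
    triples, then look up each element's expression in every language."""
    entities = {h for h, _, t in preset_triples} | {t for h, _, t in preset_triples}
    relations = {r for _, r, _ in preset_triples}
    expressions = {x: {lang: lang_db[x][lang] for lang in langs}
                   for x in entities | relations}
    return entities, relations, expressions
```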
Based on the above embodiment, the first-type tuple is constructed, and a specific implementation manner thereof may include: performing the following operations on each preset triplet:
firstly, one of a plurality of languages is selected as a current language in sequence;
then, based on the current language, the head entity and the tail entity expressed in the current language are obtained from the entity set for the head entity and the tail entity in the preset triple, and the relation expressed in the current language is obtained from the relation set for the relation in the preset triple, so as to obtain the tuples of the first type.
For example, Chinese may first be selected as the current language: for the head entity and the tail entity in the preset triple, the head entity and the tail entity expressed in Chinese are obtained from the entity set, and for the relation in the preset triple, the relation expressed in Chinese is obtained from the relation set, yielding one tuple of the first type. English is then selected as the current language: the head entity and the tail entity expressed in English are obtained from the entity set, and the relation expressed in English is obtained from the relation set, yielding another tuple of the first type. Similarly, the next language is selected as the current language, and the corresponding tuples of the first type are obtained.
In this embodiment, various possible tuples in the first-class tuple can be comprehensively obtained.
Based on the above embodiment, the second type of multi-tuple is constructed, and the specific implementation manner thereof may include: performing the following operations on each preset triplet:
firstly, one of a plurality of languages is selected as a current language in sequence;
then, based on the current language, at least one language different from the current language among the plurality of languages is determined, and each such language is combined with the current language to form at least one first cross-language combination; for each first cross-language combination, the head entity expressed in the current language and the tail entity expressed in the other language of the combination are obtained from the entity set for the head entity and the tail entity in the preset triple, and the relation expressed in the current language or in the other language of the combination is obtained from the relation set for the relation in the preset triple, so as to obtain the tuples of the second type.
For example, among a plurality of languages consisting of Chinese, English, German, Spanish, Hindi, Arabic, and Vietnamese, Chinese may first be selected as the current language. English, German, Spanish, Hindi, Arabic, and Vietnamese are each determined to be languages different from Chinese, and each is combined with Chinese to form 6 first cross-language combinations. Then, for each first cross-language combination, for the head entity and the tail entity in the preset triple, the head entity expressed in Chinese and the tail entity expressed in the other language of the combination are obtained from the entity set, and for the relation in the preset triple, the relation expressed in Chinese or in the other language of the combination is obtained from the relation set, yielding tuples of the second type. English is then selected as the current language, and so on.
In this embodiment, various possible tuples in the second type of tuples can be obtained comprehensively.
In addition, the tuples of the second type can also be constructed in another way: based on the constructed tuples of the first type, the language of expression of the tail entity of a first-type tuple is changed, or the language of expression of the relation and the tail entity is changed. For example, the English-expressed tail entity "Basketball Player" in the first-type tuple (Kevin Durant, is a, Basketball Player) is changed to the Chinese-expressed "篮球运动员". In this way, the tuples of the second type can be constructed quickly.
Based on the above embodiments, a third type of multi-tuple is constructed, and a specific implementation manner thereof may include: performing the following operations on each preset triplet:
firstly, one of a plurality of languages is selected as a current language in sequence;
then, based on the current language, at least one language different from the current language among the plurality of languages is determined, and each such language is combined with the current language to form at least one second cross-language combination; for each second cross-language combination, the head entity and the tail entity expressed in the current language are obtained from the entity set for the head entity and the tail entity in the preset triple, the relation expressed in the current language is obtained from the relation set for the relation in the preset triple, the head entity and the tail entity expressed in the other language of the combination are obtained from the entity set, and the relation expressed in that other language is obtained from the relation set, so as to obtain the tuples of the third type.
For example, among a plurality of languages consisting of Chinese, English, German, Spanish, Hindi, Arabic, and Vietnamese, Chinese may first be selected as the current language. English, German, Spanish, Hindi, Arabic, and Vietnamese are each combined with Chinese to form 6 second cross-language combinations. Then, for each second cross-language combination, the head entity and the tail entity expressed in Chinese and the relation expressed in Chinese are obtained from the entity set and the relation set for the corresponding elements of the preset triple, and likewise the head entity, the tail entity, and the relation expressed in the other language of the combination are obtained, yielding tuples of the third type. English is then selected as the current language, and so on.
In the embodiment, various possible tuples in the third type of tuples can be comprehensively obtained.
In addition, the tuples of the third type can also be constructed in another way: based on the constructed tuples of the first type, two first-type tuples expressed in two different languages are concatenated to form one tuple of the third type, where the two concatenated tuples have the same head entity, relation, and tail entity, differing only in language. When concatenating two first-type tuples expressed in two different languages, the two tuples are first concatenated with the tuple expressed in one language preceding the tuple expressed in the other language, forming one tuple of the third type; the order of the two tuples is then swapped and they are concatenated again, forming another tuple of the third type. For example, the first-type tuple (Kevin Durant, is a, Basketball Player) and the first-type tuple (凯文·杜兰特, 是, 篮球运动员) are concatenated to form the third-type tuple (Kevin Durant, is a, Basketball Player, 凯文·杜兰特, 是, 篮球运动员), and, with the order swapped, the third-type tuple (凯文·杜兰特, 是, 篮球运动员, Kevin Durant, is a, Basketball Player). In this way, the tuples of the third type can also be constructed quickly.
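Putting the three construction procedures together (a sketch only; the expressions mapping is the one produced by the collect_sets sketch above, and iterating over ordered language pairs realizes both the per-language variants of the second type and the swapped-order variant of the third type):

```python
from itertools import permutations

def build_tuples(preset_triples, expressions, langs):
    """Construct the first, second, and third types of tuples from the
    preset triples and the per-language expressions of entities/relations."""
    first, second, third = [], [], []
    for h, r, t in preset_triples:
        for li in langs:
            first.append((expressions[h][li], expressions[r][li], expressions[t][li]))
        for li, lj in permutations(langs, 2):
            # head and relation in li, tail in lj (the relation may also take lj)
            second.append((expressions[h][li], expressions[r][li], expressions[t][lj]))
            # the full fact in li followed by the full fact in lj
            third.append((expressions[h][li], expressions[r][li], expressions[t][li],
                          expressions[h][lj], expressions[r][lj], expressions[t][lj]))
    return first, second, third
```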
The following illustrates a scheme provided by an embodiment of the present invention in a specific application scenario.
In the scenario of the cross-language question-answering task (G-XLT), the baseline method compared against the automatic question answering method provided by the embodiments of the invention adopts an existing cross-language automatic question-answering model, which may be one obtained from the XLM-RoBERTa Base model. The XLM-RoBERTa Base model, XLM-R-Base for short, is a cross-lingual pre-training model; XLM stands for cross-lingual language model.
The performance of the automatic question-answering model is evaluated on the real dataset MLQA (Multilingual Question Answering). MLQA is a highly parallel multilingual extractive question-answering dataset with question-answer samples in 7 languages: English, Arabic, German, Spanish, Hindi, Vietnamese, and Simplified Chinese. Because English coverage is the most extensive, MLQA stores English expressions for all tokens corresponding to the triples of the knowledge graph, while the other 6 languages may lack the expressions of some tokens. Fig. 5 illustrates the coverage rates of the 6 languages other than English, where the coverage rate of a language is the ratio of the intersection of all tokens expressed in that language with all tokens corresponding to the triples of the knowledge graph, to all tokens expressed in that language. The dataset has 12,738 English extractive question-answer instances and over 5,000 extractive question-answer instances in each target language (i.e., the languages other than English). The question-answer instances are multi-way parallel, i.e., questions and texts with the same semantics are expressed in several languages; for example, alongside a question A and text AA expressed in English, there also exist a question A and text AA with the same meaning expressed in Chinese. Specifically, 9,019 instances in the dataset are 4-way parallel, 2,930 are 3-way parallel, and 789 are 2-way parallel. MLQA is constructed from Wikipedia articles, which have similar content across parallel language versions. The annotation work was completed by crowdsourcing, and the translation work by professional translators. There are 46,000 question-answer instances in total. MLQA provides data support for question-answering experiments in cross-language environments and comes with a given train/test split. During evaluation, Exact Match (EM) and the F1 score can be used as measures of the results.
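EM and F1 here are the usual extractive question-answering metrics; a simplified sketch of how they are typically computed (MLQA's official evaluation additionally applies language-specific normalization of punctuation and articles, which is omitted here):

```python
def exact_match(prediction, gold):
    """1 if the predicted answer string equals the gold answer exactly."""
    return int(prediction.strip() == gold.strip())

def f1(prediction, gold):
    """Token-overlap F1 between the predicted and gold answer strings."""
    pred_toks, gold_toks = prediction.split(), gold.split()
    overlap = sum(min(pred_toks.count(t), gold_toks.count(t))
                  for t in set(pred_toks) & set(gold_toks))
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred_toks), overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```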
The experimental results on the real dataset MLQA are shown in Table 1, which compares the evaluation results of the automatic question-answering model of the method provided by the embodiments of the invention with those of the baseline method, listing the F1 score and EM values obtained when English is crossed with each of the other 6 languages.
TABLE 1F 1 score values and EM value comparison
Compared with the baseline method, the automatic question answering method provided by the embodiments of the invention improves the F1 score by 13.18% on average and the EM value by 12.00% on average, a relatively significant performance improvement.
Fig. 6 illustrates a concrete question-answer example, showing the question-answering results when a question expressed in Chinese is crossed with a text expressed in English. In the figure, Question denotes the question, Context the text, XLM-R-Base Prediction the answer output by the baseline method, Ours the answer output by the method of the embodiments of the invention, and Ground Truth Answer the actual answer. As can be seen from the figure, for the first question the baseline method outputs no answer, whereas the method provided by the embodiments of the invention outputs an answer that matches the actual answer. For the second question, the baseline method likewise outputs no answer, while the method provided by the embodiments of the invention outputs an answer that matches the actual answer. For the third question, both the baseline method and the method provided by the embodiments of the invention output an answer that matches the actual answer. Overall, compared with the baseline method, the method provided by the embodiments of the invention can accurately extract the answers to questions, and the model performance is higher.
The following describes an automatic question-answering model training method based on cross-language provided by the embodiment of the invention, and the automatic question-answering model training method based on cross-language described below and the automatic question-answering method based on cross-language described above can be referred to correspondingly.
Fig. 7 is a schematic flowchart of an automatic question-answering model training method based on cross-language according to an embodiment of the present invention.
The cross-language-based automatic question-answering model training method provided by this embodiment may be executed by an intelligent terminal such as a computer or a combination of hardware and/or software therein, or may be executed by a server or a combination of hardware and/or software therein, as shown in fig. 7, the method may include:
Step 710, acquiring a plurality of tuples constructed based on the knowledge graph; the plurality of tuples includes tuples of a first type, a second type, and a third type: each tuple of the first type includes a head entity, a relation, and a tail entity all expressed in the same one of a plurality of languages; each tuple of the second type includes a head entity expressed in one of the plurality of languages, a relation expressed in the same language as the head entity or as the tail entity, and a tail entity expressed in a language different from that of the head entity; each tuple of the third type includes a head entity, a relation, and a tail entity expressed in one of the plurality of languages, together with the head entity, the relation, and the tail entity expressed in another language.
Step 720: sequentially inputting each tuple into an initial pre-training model for pre-training, to obtain a knowledge-enhanced pre-training model.
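Step 720 can be read as a masked-entity variant of masked language modeling: a tuple is serialized to text, the tokens of the head (or tail) entity are replaced by mask tokens, and the model is trained to recover them under a maximum-likelihood objective. A minimal sketch follows, assuming an XLM-RoBERTa masked-language-model backbone; the patent does not name the backbone, and the serialization format and the helper entity_masking_loss are assumptions.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")

def entity_masking_loss(head, relation, tail, mask_head=True):
    """Serialize a tuple, mask the head (or tail) entity tokens, and
    return the MLM loss of recovering the masked entity."""
    text = f"{head} {relation} {tail}"  # assumed serialization of a tuple
    enc = tokenizer(text, return_tensors="pt")
    input_ids = enc["input_ids"].clone()
    labels = torch.full_like(input_ids, -100)  # -100 is ignored by the loss

    # Locate the token span of the entity to be masked.
    entity = head if mask_head else tail
    entity_ids = tokenizer(entity, add_special_tokens=False)["input_ids"]
    ids = input_ids[0].tolist()
    for i in range(len(ids) - len(entity_ids) + 1):
        if ids[i:i + len(entity_ids)] == entity_ids:
            labels[0, i:i + len(entity_ids)] = input_ids[0, i:i + len(entity_ids)]
            input_ids[0, i:i + len(entity_ids)] = tokenizer.mask_token_id
            break

    out = model(input_ids=input_ids,
                attention_mask=enc["attention_mask"],
                labels=labels)
    return out.loss  # cross-entropy over the masked entity tokens only

# One pre-training step on one tuple: mask the head, then the tail.
loss = entity_masking_loss("Paris", "capital of", "France", mask_head=True) \
     + entity_masking_loss("Paris", "capital of", "France", mask_head=False)
loss.backward()
```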
Step 730: inputting a question sample and a text sample containing the answer to the question sample into a preset model containing the knowledge-enhanced pre-training model for training, to obtain an automatic question-answering model, where the automatic question-answering model is used to extract the position information of the answer to a question from a text.
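Step 730 corresponds to standard extractive-QA fine-tuning: a span-prediction head on top of the knowledge-enhanced encoder is trained to output the start and end positions of the answer in the text. The sketch below shows inference with such a model; the local checkpoint path ./knowledge-enhanced-xlmr is hypothetical, standing in for the model produced by step 720.

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Hypothetical path to the knowledge-enhanced model saved after step 720.
tokenizer = AutoTokenizer.from_pretrained("./knowledge-enhanced-xlmr")
model = AutoModelForQuestionAnswering.from_pretrained("./knowledge-enhanced-xlmr")

def answer_span(question, context):
    """Return the answer substring by picking the most likely start and
    end token positions, i.e. the position information of step 730."""
    enc = tokenizer(question, context, return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = model(**enc)
    # Simplest decoding: independent argmax; a production system would
    # restrict end >= start and search only within the context tokens.
    start = int(out.start_logits.argmax())
    end = int(out.end_logits.argmax())
    tokens = enc["input_ids"][0][start:end + 1]
    return tokenizer.decode(tokens, skip_special_tokens=True)

# Cross-language example: Chinese question, English context.
print(answer_span("今天原油涨了多少？",
                  "A barrel of light sweet crude rose $1.55 on the "
                  "New York Mercantile Exchange to settle at $63.38."))
```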
Fig. 8 is a schematic diagram of the physical structure of an electronic device. As shown in fig. 8, the electronic device may include: a processor 810, a communication interface 820, a memory 830, and a communication bus 840, wherein the processor 810, the communication interface 820, and the memory 830 communicate with each other via the communication bus 840. The processor 810 may call logic instructions in the memory 830 to perform the method provided by any of the above embodiments.
In addition, the logic instructions in the memory 830 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Embodiments of the present invention also provide a computer program product, which includes a computer program stored on a non-transitory computer-readable storage medium, the computer program including program instructions, when the program instructions are executed by a computer, the computer being capable of executing the method provided by any of the above embodiments.
Embodiments of the present invention also provide a non-transitory computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, performing the method provided by any of the above embodiments.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. An automatic question-answering method based on cross-language is characterized by comprising the following steps:
obtaining a question and a text, wherein the language of the text is different from the language of the question;
inputting the question and the text into an automatic question-answering model to obtain position information of answers of the question in the text output by the automatic question-answering model;
the automatic question-answering model is obtained by training a preset model comprising a knowledge-enhanced pre-training model based on a question sample and a text sample comprising answers of the question sample, and the knowledge-enhanced pre-training model is obtained by pre-training an initial pre-training model based on a plurality of multi-tuples constructed by using a knowledge graph;
the tuples comprise a first type of tuple, a second type of tuple, and a third type of tuple; each tuple of the first type comprises a head entity, a relation, and a tail entity all expressed in the same one of a plurality of languages; each tuple of the second type comprises a head entity expressed in one of the plurality of languages, a tail entity expressed in a language different from that of the head entity, and a relation expressed in the same language as either the head entity or the tail entity; each tuple of the third type comprises a head entity, a relation, and a tail entity expressed in one of the plurality of languages and the head entity, relation, and tail entity expressed in another language.
2. The cross-language based automatic question answering method according to claim 1, wherein the pre-training process of the knowledge enhanced pre-training model comprises:
for each tuple, inputting the tuple into the initial pre-training model and training the initial pre-training model to output the head entity expressed in each language in the tuple, and inputting the tuple into the initial pre-training model and training the initial pre-training model to output the tail entity expressed in each language in the tuple.
3. The method according to claim 2, wherein the inputting the tuple into the initial pre-training model and training the initial pre-training model to output the head entity expressed in each language in the tuple, and the inputting the tuple into the initial pre-training model and training the initial pre-training model to output the tail entity expressed in each language in the tuple, comprise:
inputting the tuple into the initial pre-training model, masking the head entity expressed in each language in the tuple, and training the initial pre-training model to output the head entity expressed in each language in the tuple;
inputting the tuple into the initial pre-training model, masking the tail entity expressed in each language in the tuple, and training the initial pre-training model to output the tail entity expressed in each language in the tuple.
4. The cross-language based automatic question-answering method according to claim 3, wherein the first loss function employed for the pre-training of the knowledge-enhanced pre-trained model is obtained based on maximum likelihood estimation.
5. The method according to claim 2, wherein, before the inputting, for each tuple, the tuple into the initial pre-training model and training the initial pre-training model to output the head entity expressed in each language in the tuple, and inputting the tuple into the initial pre-training model and training the initial pre-training model to output the tail entity expressed in each language in the tuple, the method further comprises:
acquiring the knowledge graph, wherein the knowledge graph comprises a plurality of preset triples, and the preset triples comprise a head entity, a relation and a tail entity;
acquiring the head entity and the tail entity in each preset triple to form an entity set, and, for each entity in the entity set, acquiring the expressions of the entity in multiple languages from a preset language database;
acquiring the relation in each preset triple to form a relation set, and, for each relation in the relation set, acquiring the expressions of the relation in multiple languages from the preset language database;
and constructing the first type of tuple, the second type of tuple, and the third type of tuple, respectively, based on the plurality of preset triples, the entity set, and the relation set.
6. The cross-language based automatic question answering method according to claim 5, wherein constructing the first type of tuples comprises:
performing the following operations for each preset triple:
sequentially selecting one of the multiple languages as a current language;
and, based on the current language, acquiring, for the head entity and the tail entity in the preset triple, the head entity and the tail entity expressed in the current language from the entity set, and acquiring, for the relation in the preset triple, the relation expressed in the current language from the relation set, so as to obtain a tuple of the first type.
7. The cross-language based automatic question answering method according to claim 5, wherein constructing the second type of tuples comprises:
performing the following operations for each preset triple:
sequentially selecting one of the multiple languages as a current language;
determining, based on the current language, at least one language of the multiple languages different from the current language, and combining each such language with the current language to form at least one first cross-language combination; for each first cross-language combination, acquiring, for the head entity and the tail entity in the preset triple, the head entity expressed in the current language of the combination and the tail entity expressed in the other language of the combination from the entity set, and acquiring, for the relation in the preset triple, the relation expressed in either the current language or the other language of the combination from the relation set, so as to obtain a tuple of the second type.
8. The cross-language based automatic question answering method according to claim 5, wherein constructing the third type of tuples comprises:
performing the following operations for each preset triple:
sequentially selecting one of the multiple languages as a current language;
determining, based on the current language, at least one language of the multiple languages different from the current language, and combining each such language with the current language to form at least one second cross-language combination; for each second cross-language combination, acquiring from the entity set the head entity and the tail entity expressed in the current language of the combination, acquiring from the relation set the relation expressed in the current language of the combination, acquiring from the entity set the head entity and the tail entity expressed in the other language of the combination, and acquiring from the relation set the relation expressed in the other language of the combination, so as to obtain a tuple of the third type.
9. A cross-language-based automatic question-answering model training method is characterized by comprising the following steps:
acquiring a plurality of tuples constructed based on the knowledge graph; the tuples comprise a first type of tuple, a second type of tuple, and a third type of tuple; each tuple of the first type comprises a head entity, a relation, and a tail entity all expressed in the same one of a plurality of languages; each tuple of the second type comprises a head entity expressed in one of the plurality of languages, a tail entity expressed in a language different from that of the head entity, and a relation expressed in the same language as either the head entity or the tail entity; each tuple of the third type comprises a head entity, a relation, and a tail entity expressed in one of the plurality of languages and the head entity, relation, and tail entity expressed in another language;
sequentially inputting each tuple into an initial pre-training model for pre-training to obtain a knowledge-enhanced pre-training model;
inputting a question sample and a text sample containing the answer to the question sample into a preset model containing the knowledge-enhanced pre-training model for training, to obtain an automatic question-answering model, wherein the automatic question-answering model is used for extracting the position information of the answer to a question from a text.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1 to 9 are implemented when the processor executes the program.
CN202111395477.0A 2021-11-23 2021-11-23 Automatic question answering and model training method and device based on cross-language Pending CN114265922A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111395477.0A CN114265922A (en) 2021-11-23 2021-11-23 Automatic question answering and model training method and device based on cross-language


Publications (1)

Publication Number Publication Date
CN114265922A true CN114265922A (en) 2022-04-01

Family

ID=80825390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111395477.0A Pending CN114265922A (en) 2021-11-23 2021-11-23 Automatic question answering and model training method and device based on cross-language

Country Status (1)

Country Link
CN (1) CN114265922A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115357719A (en) * 2022-10-20 2022-11-18 国网天津市电力公司培训中心 Power audit text classification method and device based on improved BERT model
CN115357719B (en) * 2022-10-20 2023-01-03 国网天津市电力公司培训中心 Power audit text classification method and device based on improved BERT model

Similar Documents

Publication Publication Date Title
CN112270196B (en) Entity relationship identification method and device and electronic equipment
CN110096698B (en) Topic-considered machine reading understanding model generation method and system
CN110598203A (en) Military imagination document entity information extraction method and device combined with dictionary
CN112149421A (en) Software programming field entity identification method based on BERT embedding
CN109271524B (en) Entity linking method in knowledge base question-answering system
CN110688489B (en) Knowledge graph deduction method and device based on interactive attention and storage medium
US11170169B2 (en) System and method for language-independent contextual embedding
CN108829682A (en) Computer readable storage medium, intelligent answer method and intelligent answer device
CN109460552A (en) Rule-based and corpus Chinese faulty wording automatic testing method and equipment
KR20210042845A (en) Method and system for automatically generating fill-in-the-blank questions of foreign language sentence
CN111428448A (en) Text generation method and device, computer equipment and readable storage medium
CN113901170A (en) Event extraction method and system combining Bert model and template matching and electronic equipment
CN115292457A (en) Knowledge question answering method and device, computer readable medium and electronic equipment
Lagakis et al. Automated essay scoring: A review of the field
CN112131881A (en) Information extraction method and device, electronic equipment and storage medium
CN113705237A (en) Relation extraction method and device fusing relation phrase knowledge and electronic equipment
CN113010657A (en) Answer processing method and answer recommending method based on answering text
CN110633456B (en) Language identification method, language identification device, server and storage medium
CN114265922A (en) Automatic question answering and model training method and device based on cross-language
CN112232681B (en) Intelligent examination paper marking method for computational analysis type non-choice questions
CN107783958B (en) Target statement identification method and device
CN110413737B (en) Synonym determination method, synonym determination device, server and readable storage medium
CN113160917A (en) Electronic medical record entity relation extraction method
CN113886521A (en) Text relation automatic labeling method based on similar vocabulary
CN111310457B (en) Word mismatching recognition method and device, electronic equipment and storage medium

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination