CN114691815A - Model training method and device, electronic equipment and storage medium - Google Patents

Model training method and device, electronic equipment and storage medium

Info

Publication number
CN114691815A
CN114691815A (application CN202011566127.1A)
Authority
CN
China
Prior art keywords
question
answer
sentences
answer set
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011566127.1A
Other languages
Chinese (zh)
Inventor
秦昌博
谢韬
高倩
邵长东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ecovacs Commercial Robotics Co Ltd
Original Assignee
Ecovacs Commercial Robotics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ecovacs Commercial Robotics Co Ltd filed Critical Ecovacs Commercial Robotics Co Ltd
Priority to CN202011566127.1A priority Critical patent/CN114691815A/en
Publication of CN114691815A publication Critical patent/CN114691815A/en
Pending legal-status Critical Current

Classifications

    • G06F16/3329: Natural language query formulation or dialogue systems
    • G06F16/3344: Query execution using natural language analysis
    • G06F16/35: Clustering; Classification (of unstructured textual data)
    • G06F40/211: Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F40/30: Semantic analysis
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods (neural networks)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

An embodiment of the invention provides a model training method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring a first question-answer set and filtering redundant sentences out of it to obtain a second question-answer set. Any question-answer pair in the second question-answer set consists of a target question sentence and a target answer sentence that are semantically related, i.e., it is a positive sample. The target question sentences and the non-target answer sentences of the second question-answer set then form a third question-answer set, i.e., the negative samples. Finally, a language model is trained on the second and third question-answer sets. In this scheme, the redundant-sentence filtering makes the question and answer sentences within each pair of the second question-answer set correspond semantically, while the question and answer sentences paired in the third question-answer set do not correspond, so the positive and negative samples are divided more accurately. Training with accurately divided positive and negative samples lets the language model learn the semantic relationship between question sentences and answer sentences, which ensures the training effect of the model.

Description

Model training method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a model training method and apparatus, an electronic device, and a storage medium.
Background
With the development of artificial intelligence technology, various intelligent robots increasingly enter people's lives, such as service robots, self-moving vending robots, and other commercial robots. For the convenience of users, the intelligent robot generally supports various human-computer interaction modes, such as a human-computer interaction mode based on touch operation, an interaction mode based on voice, and the like.
In practical application, after receiving the dialog content input by the user, the intelligent robot usually determines the semantics of the dialog content first, and then obtains the response content corresponding to the dialog content according to the semantics, thereby realizing human-computer interaction.
Disclosure of Invention
The embodiment of the invention provides a model training method and device, electronic equipment and a storage medium, which are used for ensuring the fluency of man-machine conversation.
The embodiment of the invention provides a model training method, which comprises the following steps:
acquiring a first question-answer set;
filtering out redundant sentences in the first question-answer set to obtain a second question-answer set, wherein any question-answer pair in the second question-answer set comprises a target question sentence and a target answer sentence;
determining a third question-answer set according to the target question sentences and the non-target answer sentences in the second question-answer set;
and training a language model according to the second question-answer set and the third question-answer set.
An embodiment of the present invention provides a model training apparatus, including:
the acquisition module is used for acquiring a first question-answer set;
a filtering module, configured to filter redundant statements in the first question-answer set to obtain a second question-answer set, where any question-answer pair in the second question-answer set includes a target question statement and a target answer statement;
a set determining module, configured to determine a third question-answer set according to the target question statements and the non-target answer statements in the second question-answer set;
and the training module is used for training a language model according to the second question-answer set and the third question-answer set.
An embodiment of the present invention provides an electronic device, including: a processor and a memory; wherein the memory is to store one or more computer instructions that when executed by the processor implement:
acquiring a first question-answer set;
filtering out redundant sentences in the first question-answer set to obtain a second question-answer set, wherein any question-answer pair in the second question-answer set comprises a target question sentence and a target answer sentence;
determining a third question-answer set according to the target question sentences and the non-target answer sentences in the second question-answer set;
and training a language model according to the second question-answer set and the third question-answer set.
Embodiments of the present invention provide a computer-readable storage medium storing computer instructions that, when executed by one or more processors, cause the one or more processors to perform at least the following:
acquiring a first question-answer set;
filtering out redundant sentences in the first question-answer set to obtain a second question-answer set, wherein any question-answer pair in the second question-answer set comprises a target question sentence and a target answer sentence;
determining a third question-answer set according to the target question sentences and the non-target answer sentences in the second question-answer set;
and training a language model according to the second question-answer set and the third question-answer set.
The model training method provided by the invention acquires a first question-answer set and filters redundant sentences out of it to obtain a second question-answer set. Any question-answer pair in the second question-answer set consists of a target question sentence and a target answer sentence; the two are semantically related, so the pair can be regarded as a positive sample. The target question sentences and the non-target answer sentences of the second question-answer set then form a third question-answer set. The question-answer pairs in the third question-answer set are semantically uncorrelated and can be regarded as negative samples. Finally, the language model is trained on the second and third question-answer sets.
In practical applications, the first question-answer set may contain multiple question sentences or answer sentences with the same or similar semantics, so one question sentence may be semantically associated with several answer sentences, or one answer sentence with several question sentences. Such a lack of one-to-one semantic correspondence between question and answer sentences degrades the training effect of the language model. In this scheme, filtering out the redundant sentences makes the question and answer sentences in the second question-answer set correspond strictly one-to-one in semantics, while the question and answer sentences paired in the third question-answer set do not correspond, so the positive and negative samples are divided more accurately and unambiguously. Training with these accurately divided samples lets the language model learn the intrinsic relationship between question and answer sentences, covering both semantic association and disassociation, which ensures the model training effect and, in turn, the fluency of human-machine conversation.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a schematic structural diagram of a dialogue system according to an embodiment of the present invention;
Fig. 2 is a flowchart of a model training method according to an embodiment of the present invention;
Fig. 3 is a flowchart of another model training method according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a language model and its training process according to an embodiment of the present invention;
Fig. 5a is a schematic diagram of the model training method applied in a banking scenario according to an embodiment of the present invention;
Fig. 5b is another schematic diagram of the model training method applied in a banking scenario according to an embodiment of the present invention;
Fig. 5c is another schematic diagram of the model training method applied in a banking scenario according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a model training apparatus according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an electronic device corresponding to the model training apparatus of the embodiment shown in fig. 6.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the examples of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well. "plurality" generally includes at least two unless the context clearly dictates otherwise.
The words "if", as used herein, may be interpreted as "at … …" or "at … …" or "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrases "if determined" or "if detected (a stated condition or event)" may be interpreted as "when determined" or "in response to a determination" or "when detected (a stated condition or event)" or "in response to a detection (a stated condition or event)", depending on the context.
It is also noted that the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a product or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a product or system. Without further limitation, an element preceded by "comprising a …" does not exclude the presence of additional identical elements in the product or system that comprises that element.
Before explaining the model training method provided by the embodiment of the present invention, the practical use of the language model is described by way of example:
As mentioned in the background, intelligent robots applied in different scenarios can provide a human-computer interaction function, i.e., a human-machine dialogue function. This dialogue function is generally implemented by a dialogue system configured in the intelligent robot; a specific structure of the dialogue system may be as shown in fig. 1.
The workflow of the dialogue system may be as follows: the intelligent robot receives a question sentence input by a user and feeds it into the dialogue system. The language model in the dialogue system converts the question sentence into a sentence vector, and the retrieval model then computes the similarity between this sentence vector and each sentence vector in the knowledge base. The knowledge base contains pre-collected sentence vectors corresponding to different sentences. The ranking model sorts the knowledge-base sentence vectors by similarity and selects the vector with the highest similarity as the target sentence vector. Finally, the dialogue system outputs the text form of the target sentence vector, i.e., the answer sentence.
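For illustration only, this retrieval flow can be sketched in a few lines of Python. The `encode` callable stands in for the language model, and a dot product stands in for the similarity computation; both, along with the knowledge-base layout, are assumptions of the sketch rather than details fixed by this embodiment.

```python
import numpy as np

def answer(question, encode, knowledge_base):
    """Return the stored answer whose vector is most similar to the question.

    encode:         stand-in for the language model (text -> 1-D vector)
    knowledge_base: list of (answer_text, answer_vector) entries
    """
    q_vec = encode(question)                      # language model step
    scores = [float(np.dot(q_vec, a_vec))         # retrieval model step
              for _, a_vec in knowledge_base]
    best = int(np.argmax(scores))                 # ranking model step
    return knowledge_base[best][0]                # text form of the target vector
```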
In practical applications, the dialogue system containing the language model may be deployed on an intelligent robot such as a service robot or a self-moving vending robot. It may also be deployed in a human-machine dialogue plug-in (also called a human-machine dialogue interface or function module) integrated into systems such as online shopping systems and public service systems, or on intelligent terminals such as smart household appliances and smart wearable devices. Broadly speaking, the dialogue system containing the language model can be deployed in any device or system that supports human-machine dialogue.
As the working process above shows, the language model is the core component of the system. Whether it can convert a question sentence into an accurate sentence vector directly affects the fluency of the human-machine conversation. Therefore, to ensure that fluency, the language model can be trained with the model training method provided by the invention so that it outputs accurate sentence vectors.
In the human-machine dialogue scenario, the training samples used for training the language model can be question-answer pairs collected in advance. Based on such training samples, the overall idea of the model training method provided by the embodiments of the present invention can be summarized as two-stage training combined with contrastive learning.
Specifically, a first training sample (the first question-answer set in the following embodiments) is obtained, and a second training sample (the fourth question-answer set in the following embodiments) is generated from it. The first training sample serves as positive samples and the second as negative samples for the first-stage training, which produces a language model (the first language model in the following embodiments).
Then, redundant sentences in the first training sample are filtered out to obtain a third training sample (the second question-answer set in the following embodiments), from which a fourth training sample (the third question-answer set in the following embodiments) is generated. The third training sample serves as positive samples and the fourth as negative samples for the second-stage training; that is, the language model obtained after the first stage is trained further, yielding the second language model of the embodiments below.
The specific implementation of the two-stage training process is described in the following embodiments.
Based on the above description, some embodiments of the present invention will be described in detail below with reference to the accompanying drawings. The features of the embodiments and examples described below may be combined with each other without conflict between the embodiments. In addition, the sequence of steps in each method embodiment described below is only an example and is not strictly limited.
Fig. 2 is a flowchart of a model training method according to an embodiment of the present invention, where the model training method according to the embodiment of the present invention may be executed by a training device. It will be appreciated that the training device may be implemented as software, or a combination of software and hardware, and in particular may be a server. As shown in fig. 2, the method comprises the steps of:
101. A first question-answer set is acquired.
The user may collect the first question-answer set via the internet. The set is composed of a plurality of question-answer pairs, and the question sentence and answer sentence in each pair are semantically associated; the question-answer pairs in the first question-answer set can therefore be regarded as positive samples.
To ensure the training effect of the language model, negative samples can be generated from these positive samples. Suppose question-answer pair 1 in the first question-answer set consists of question sentence Q1 and answer sentence A1; question-answer pair 1 may be any pair in the set. Each answer sentence other than A1 in the first question-answer set can be recombined with Q1 into a new question-answer pair, and the recombined pairs are taken as negative samples. These negative samples are also the fourth question-answer set in the embodiment shown in fig. 3.
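As a minimal sketch of this recombination (the helper name and the pair layout are illustrative, not from the embodiment), each question is crossed with every answer except its own:

```python
def build_negatives(qa_pairs):
    """Recombine each question with every answer except its own (hypothetical helper)."""
    return [(q, a)
            for i, (q, _) in enumerate(qa_pairs)
            for j, (_, a) in enumerate(qa_pairs)
            if i != j]

positives = [("Q1", "A1"), ("Q2", "A2"), ("Q3", "A3")]
negatives = build_negatives(positives)
# negatives: [('Q1', 'A2'), ('Q1', 'A3'), ('Q2', 'A1'), ('Q2', 'A3'), ('Q3', 'A1'), ('Q3', 'A2')]
```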
In practical application, the language model could be trained directly with these positive and negative samples. Using them in combination lets the language model learn the semantic relationship between each question sentence and each answer sentence in the first question-answer set, i.e., realizes contrastive learning, which ensures the training effect of the model.
It should be noted that, ideally, the question sentence and the answer sentence in a negative sample are semantically uncorrelated. In real conversations, however, and especially in chit-chat scenarios, the semantic association between question and answer sentences is flexible: one question sentence in the first question-answer set may be semantically associated with multiple answer sentences, or one answer sentence with multiple question sentences. A negative sample obtained in the above manner may therefore still pair a question sentence with a semantically related answer sentence; such a negative sample is plainly invalid and can harm the training effect of the language model. To avoid this problem, the following sentence filtering step is performed.
102. Redundant sentences in the first question-answer set are filtered out to obtain a second question-answer set, wherein any question-answer pair in the second question-answer set comprises a target question sentence and a target answer sentence.
The redundant sentences in the first question-answer set may be redundant question sentences and/or redundant answer sentences. First, a first similarity between the question sentences in the first question-answer set and a second similarity between the answer sentences are calculated. Groups of mutually similar question sentences are then identified from the first similarity; optionally, one sentence of each group is retained at random and the rest are filtered out. The answer sentences are filtered likewise according to the second similarity. This completes the first filtering step.
After this step, it is likely that the question sentence or the answer sentence of some pair has been removed on its own. A second filtering step is therefore applied to the remaining sentences of the first question-answer set: question or answer sentences whose identifiers can no longer be paired are filtered out according to the question-answer relationship identifiers, which completes the filtering and yields the second question-answer set. The question-answer relationship identifier of every sentence in the first question-answer set is preset; the question sentence and the answer sentence of one pair share the same identifier, and sharing an identifier indicates that the two sentences have the preset question-answer relationship.
In effect, the filtering first removes sentences with identical or similar semantics according to similarity, and then removes isolated sentences that can no longer form a question-answer pair according to the sentence identifiers. The filtered second question-answer set is again composed of question-answer pairs, which serve as positive samples. If any pair in the second question-answer set consists of a target question sentence and a target answer sentence, then within the set the target question sentence is semantically associated only with the target answer sentence and with no non-target answer sentence; that is, the question and answer sentences in the second question-answer set correspond strictly one-to-one in semantics.
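The two filtering steps can be sketched as follows. The `encode` callable, the cosine measure, the 0.9 threshold and the `qa_id` mapping are all illustrative assumptions; the embodiment fixes only the two-step structure (similarity-based deduplication, then identifier-based re-pairing):

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def dedupe(sentences, encode, threshold=0.9):
    """First step: within each group of near-duplicate sentences, keep one."""
    kept, kept_vecs = [], []
    for s in sentences:
        v = encode(s)
        if any(cosine(v, u) > threshold for u in kept_vecs):
            continue                     # redundant sentence: filter out
        kept.append(s)
        kept_vecs.append(v)
    return kept

def rebuild_pairs(questions, answers, qa_id):
    """Second step: re-pair sentences via their question-answer relationship
    identifier; sentences whose partner was filtered out drop away."""
    answer_by_id = {qa_id[a]: a for a in answers}
    return [(q, answer_by_id[qa_id[q]])
            for q in questions if qa_id[q] in answer_by_id]
```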
103. A third question-answer set is determined according to the target question sentences and the non-target answer sentences in the second question-answer set.
Then, analogously to step 101, a third question-answer set can be generated from the second question-answer set. Each question-answer pair in the third question-answer set consists of a target question sentence and a non-target answer sentence of the second question-answer set. Owing to the filtering in step 102, the question and answer sentences in any pair of the third question-answer set are semantically uncorrelated, so every pair in the third question-answer set is a valid negative sample.
104. The language model is trained according to the second question-answer set and the third question-answer set.
Finally, the second question-answer set is used as positive samples and the third question-answer set as negative samples to train the language model. Their combined use lets the language model learn the semantic relationship between each question sentence and each answer sentence in the second question-answer set, i.e., realizes contrastive learning, which ensures the training effect of the model.
In this embodiment, a first question-answer set is acquired and its redundant sentences are filtered out to obtain a second question-answer set. The target question sentences and the non-target answer sentences of the second question-answer set then form a third question-answer set. Finally, the language model is trained with the second question-answer set as positive samples and the third question-answer set as negative samples.
Because the redundant sentences are filtered out, the question and answer sentences in the second question-answer set correspond one-to-one in semantics, while the question and answer sentences paired in the third question-answer set do not correspond, so the positive and negative samples are divided more accurately. Training with accurately divided positive and negative samples lets the language model learn the semantic relationship between question and answer sentences, ensures the model training effect, and in turn the fluency of human-machine conversation.
As mentioned above, the overall idea of the model training method is two-stage training combined with contrastive learning, and the embodiment shown in fig. 2 describes the second training stage. Building on that embodiment, fig. 3 is a flowchart of another model training method provided in an embodiment of the present invention; as shown in fig. 3, the method may include the following steps:
201. A first question-answer set is acquired, wherein any question-answer pair in the first question-answer set comprises a target question sentence and a target answer sentence.
202. A fourth question-answer set is determined according to the target question sentences and the non-target answer sentences in the first question-answer set.
203. Model training is performed according to the first question-answer set and the fourth question-answer set to obtain a first language model.
In the acquired first question-answer set, any question-answer pair consists of a target question sentence and a target answer sentence, which are semantically related. A plurality of new question-answer pairs can also be formed from the target question sentences and the non-target answer sentences of the first question-answer set, and these recombined pairs constitute the fourth question-answer set, whose question and answer sentences are semantically uncorrelated. The first question-answer set can be regarded as positive samples and the fourth as negative samples, and the two together are used for the first stage of model training, which produces the first language model.
The above steps 201 to 203 can be understood in conjunction with the description in step 101.
204. Redundant sentences in the first question-answer set are filtered out to obtain a second question-answer set, wherein any question-answer pair in the second question-answer set comprises a target question sentence and a target answer sentence.
The filtering of redundant sentences can now be performed. As described in the embodiment shown in fig. 2, sentence filtering relies on the first similarity and the second similarity, which can be calculated as follows:
The first question-answer set is input into the first language model obtained after the first-stage training, and the first language model converts the question sentences and the answer sentences of the first question-answer set into corresponding sentence vectors. The first similarity is then calculated between the sentence vectors of the question sentences. Optionally, the first similarity may be expressed as a distance between sentence vectors (the smaller the distance, the higher the similarity) or as a dot product (the larger the dot product, the higher the similarity). The second similarity between the sentence vectors of the answer sentences may likewise be expressed as distances or dot products.
Optionally, since question sentences and answer sentences differ semantically, the sentence-vector conversion is in practice usually split: the question sentences of the first question-answer set are vector-converted by a first conversion network in the first language model, and the answer sentences by a second conversion network. Optionally, the first conversion network and the second conversion network may be convolutional neural networks.
After the first filtering step is completed with the first and second conversion networks, the second filtering step can be performed according to the question-answer relationship identifiers of the sentences. For the specific process of the second step, see the related description in step 102, which is not repeated here.
After the two filtering steps, at least one target question-answer pair remains in the first question-answer set, and each target question-answer pair has the preset question-answer relationship; that is, the question sentence and the answer sentence of each target pair share the same question-answer relationship identifier and are semantically related. In the embodiment shown in fig. 2, the second question-answer set is formed directly from the target question-answer pairs to complete the training of the language model.
However, the question-answer relationship identifiers are set manually during sentence collection and may therefore contain deviations; that is, the semantic association between the question sentence and the answer sentence of a target pair may be inaccurate, and the preset question-answer relationship does not necessarily hold. Using such a target pair directly as a positive sample for model training would clearly affect the training effect.
To avoid this problem, the preset question-answer relationships of the target question-answer pairs can optionally be verified by means of the first language model: a target question-answer pair is input into the first language model, and a classification network in the first language model classifies whether the pair's preset question-answer relationship holds. If the confidence of the classification result output by the classification network is lower than a preset threshold, indicating that the preset question-answer relationship does not hold, the target question-answer pair is filtered out. The question-answer pairs remaining after this processing form the second question-answer set. This can be regarded as a third filtering step.
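A sketch of this third, verification-based filtering step; the `classifier` callable and the 0.5 threshold are assumptions standing in for the classification network and the preset confidence threshold:

```python
def verify_pairs(pairs, classifier, min_confidence=0.5):
    """Third step: keep a target pair only when the classification network is
    confident that its preset question-answer relationship holds.

    classifier(q, a) -> confidence that the relationship holds (assumed API)
    min_confidence   -> stand-in for the preset threshold
    """
    return [(q, a) for (q, a) in pairs if classifier(q, a) >= min_confidence]
```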
It should be noted that, to ensure the effect of model training, the number of question-answer pairs in the second question-answer set needs to be greater than or equal to a preset number. In practice, the first question-answer set may therefore contain N times as many question-answer pairs as the second question-answer set, where N ≥ 5.
205. A third question-answer set is determined according to the target question sentences and the non-target answer sentences in the second question-answer set.
206. The first language model is trained according to the second question-answer set and the third question-answer set to obtain a second language model.
Finally, with the second question-answer set as positive samples and the third question-answer set as negative samples, the second-stage training is performed on the basis of the first language model obtained in step 203, yielding the second language model. Steps 205 and 206 are executed similarly to the corresponding steps of the foregoing embodiment; see the related description of the embodiment shown in fig. 2, which is not repeated here.
In this embodiment, the first question-answer set is first used as positive samples and the fourth question-answer set as negative samples for the first-stage contrastive training, producing the first language model. This combined use of positive and negative samples lets the language model learn the semantic association between each question sentence and each answer sentence in the first question-answer set. Redundant sentences are then filtered out of the first question-answer set according to semantics and the question-answer relationship identifiers to obtain the second question-answer set, which serves as positive samples, with the third question-answer set as negative samples, for the second-stage contrastive training that produces the second language model. This likewise lets the model learn the semantic association between each question sentence and each answer sentence in the second question-answer set. The two stages of contrastive learning together ensure the training effect of the model.
The structure of the language model in the embodiment shown in fig. 3 and the training process of the first stage may be as shown in fig. 4. The same language model applies to the embodiment shown in fig. 2. In the first training stage, the first conversion network, the second conversion network and the classification network included in the first language model are trained together when step 203 is performed.
Optionally, the first and second conversion networks may be trained simultaneously on the first and fourth question-answer sets. Note that the network parameters of the two conversion networks are independent and not shared.
For the training of the two conversion networks, suppose the first and fourth question-answer sets together contain N question sentences and N answer sentences. The first conversion network outputs the sentence vectors of the N question sentences, and the second conversion network outputs the sentence vectors of the N answer sentences.
The similarities between the sentence vectors of the N question sentences and the sentence vectors of the N answer sentences are then calculated, giving N × N similarities, and the network parameters of the two conversion networks are adjusted according to them. The parameters are adjusted so that the similarity between question sentence Q1 and its answer sentence A1 becomes greater than the similarity between Q1 and any other answer sentence. Here Q1 and A1 form a positive sample, i.e., a question-answer pair in the first question-answer set, while Q1 with any other answer sentence forms a negative sample, i.e., a question-answer pair in the fourth question-answer set.
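This objective matches the usual in-batch contrastive setup, sketched below with PyTorch. Treating the N × N dot products as logits and applying cross-entropy is one common way to push the diagonal (positive) similarities above the off-diagonal (negative) ones; this particular loss is an assumption of the sketch, not prescribed by the embodiment:

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(q_vecs, a_vecs):
    """Sketch of the N x N similarity objective for N aligned pairs.

    q_vecs, a_vecs: (N, d) tensors from the first / second conversion network.
    The diagonal of the similarity matrix holds the positive pairs (first
    question-answer set); every off-diagonal entry is a negative (fourth set).
    """
    sims = q_vecs @ a_vecs.t()                  # N x N dot-product similarities
    targets = torch.arange(sims.size(0))        # positive index for each question
    # Cross-entropy raises sim(Qi, Ai) above sim(Qi, Aj) for j != i.
    return F.cross_entropy(sims, targets)
```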
For the training of the classification network, since the sentences in the first and fourth question-answer sets carry preset question-answer relationship identifiers, and these identifiers reflect the preset question-answer relationship between question and answer sentences, the preset relationship can be used as supervision information. The network parameters of the classification network are adjusted according to the difference between the preset question-answer relationship and the predicted relationship output by the classification network.
The training of the conversion networks and the classification network described above takes place in the first training stage; as described in the embodiment shown in fig. 3, the second stage then continues with the networks obtained after the first stage. During the second-stage training, the network parameters of the classification network are fixed, and only the first and second conversion networks are trained further.
Specifically, the second and third question-answer sets are input into the first language model; the first conversion network outputs the sentence vectors of the question sentences, and the second conversion network those of the answer sentences. The network parameters of the two conversion networks are adjusted according to the similarity between question-sentence vectors and answer-sentence vectors, yielding the second language model. The adjustment principle is the same as in the first-stage training; see the description above.
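A toy sketch of this second-stage setup; the module and attribute names are assumptions, and only the pattern (classification network frozen, conversion networks trainable) reflects the embodiment:

```python
import torch

class TwoTowerModel(torch.nn.Module):
    """Toy stand-in for the first language model (names are assumptions)."""
    def __init__(self, dim=32):
        super().__init__()
        self.q_encoder = torch.nn.Linear(dim, dim)    # first conversion network
        self.a_encoder = torch.nn.Linear(dim, dim)    # second conversion network
        self.classifier = torch.nn.Linear(2 * dim, 2) # classification network

model = TwoTowerModel()

# Second stage: fix the classification network, train only the two
# conversion networks.
for p in model.classifier.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(
    list(model.q_encoder.parameters()) + list(model.a_encoder.parameters()),
    lr=1e-5,  # illustrative value
)
```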
Summarizing the above embodiments: in one approach, after the first question-answer set is obtained, redundant sentences are filtered out directly to obtain the second question-answer set as positive samples, and the third question-answer set is generated from it as negative samples. After the sentence filtering, the question and answer sentences in the second question-answer set correspond strictly one-to-one in semantics, while those paired in the third question-answer set are semantically uncorrelated, which guarantees the validity of the negative samples. The language model is then trained with the positive and negative samples by contrastive learning, ensuring the training effect. This is the approach of the embodiment shown in fig. 2.
To improve training efficiency and effect, in another approach, after the first question-answer set (positive samples) is obtained, the fourth question-answer set (negative samples) is generated from it. The first-stage model training is performed by contrastive learning on the first and fourth question-answer sets, producing a first language model comprising a first conversion network, a second conversion network and a classification network.
On the basis of the first language model, the two conversion networks convert the question sentences and answer sentences of the first question-answer set into sentence vectors, and the first filtering step of redundant sentences is performed on these vectors. The second filtering step uses the question-answer relationship identifiers of the sentences, and the third uses the classification network of the first language model, yielding the second question-answer set as positive samples. The third question-answer set is generated from it as negative samples. Finally, the two conversion networks of the first language model undergo the second-stage training on the second and third question-answer sets, producing the second language model. During the second stage, the classification network is fixed and does not participate in training.
It should be noted that the classification network is used only to verify whether the preset question-answer relationship of a question-answer pair holds, and the second conversion network only to convert answer sentences into sentence vectors. In actual use of the language model, only the first conversion network is needed, to convert the user's question sentence into a sentence vector; the second conversion network and the classification network are unused and can be removed.
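Continuing the toy sketch above, deployment can keep only the question encoder:

```python
# Only the first conversion network is needed at serving time; the answer
# encoder and the classification network can be removed from the deployed
# model. `model` and its attribute names are the assumptions from the
# previous sketch.
question_encoder = model.q_encoder
del model.a_encoder, model.classifier          # drop the unused networks

dummy_question_vec = torch.randn(1, 32)        # stand-in for an embedded question
q_vec = question_encoder(dummy_question_vec)   # sentence vector for retrieval
```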
For ease of understanding, a specific implementation of the model training method provided above is illustrated in conjunction with the following application scenarios.
Taking a bank in a public-service scenario as an example, the acquired first question-answer set may include the following question-answer pairs:
Question sentence Q1: What do you like? Answer sentence A1: I like watching movies.
Question sentence Q2: Do you like watching movies? Answer sentence A2: I like watching movies.
Question sentence Q3: What financial products do you recommend? Answer sentence A3: Product M has a high financial return; purchase is recommended.
Question sentence Q4: How is the weather today? Answer sentence A4: Blue sky and white clouds.
In the first question-answer set, the question-answer relationship identifier set for Q1 and A1 may be C1, that for Q2 and A2 may be C2, and so on.
A fourth question-answer set is generated from the first question-answer set, comprising: the question-answer pairs of Q1 with A2 to A4, of Q2 with A1, A3 and A4, of Q3 with A1, A2 and A4, and of Q4 with A1 to A3.
The first-stage model training is performed with these two question-answer sets; the process may be as shown in fig. 5a. The similarities between the 4 sentence vectors output by the first conversion network and the 4 output by the second conversion network are calculated, and the parameters of the two conversion networks are adjusted according to the 16 similarity values, so that the conversion networks of the first language model acquire a basic sentence-vector conversion capability. Meanwhile, the parameters of the classification network are adjusted according to the predicted classification results it outputs, so that the classification network can accurately distinguish whether the question and answer sentences of a pair are semantically related.
The second stage can then begin: since A1 and A2 are identical, one of them, say A1, is filtered out at random using the sentence vectors output by the two conversion networks, and Q1 is then filtered out according to the question-answer relationship identifiers because it can no longer be paired. Optionally, the classification network of the first language model verifies the preset question-answer relationships of the remaining 3 pairs; if it determines that the preset relationship between Q4 and A4 does not hold, that pair is filtered out. The filtering process is shown in fig. 5b.
After this filtering, Q2 with A2 and Q3 with A3 form the second question-answer set as positive samples, while Q2 with A3 and Q3 with A2 form the third question-answer set as negative samples, and the first language model undergoes the second-stage training. During this training the classification network is fixed, and the parameters of the two conversion networks are adjusted only according to the similarities between the sentence vectors they output, yielding the second language model. The process is shown in fig. 5c.
If the language model were trained with a traditional method, then in this scenario, when the user inputs the question sentence "Who are you?", the dialogue system containing the model might output the illogical response "Who says?". With the embodiment of the invention, the dialogue system based on the second language model outputs a response such as "I am a robot, at your service", which is logical and keeps the conversation fluent.
In addition, taking a shopping-mall scenario as an example, the acquired first question-answer set may include the following question-answer pairs:
Question sentence Q1: What movie is worth watching? Answer sentence A1: I do not like watching movies.
Question sentence Q2: Does store M have a discount? Answer sentence A2: Store M offers 20% off today.
Question sentence Q3: Which clothes are discounted? Answer sentence A3: Store M offers 20% off today. Question sentence Q4: How is the weather today? Answer sentence A4: Blue sky and white clouds.
As in the previous scenario, a fourth question-answer set is generated, and the first-stage training is completed with the two question-answer sets to obtain the first language model.
Then, sentence A2 (identical to A3) is filtered out at random according to the similarity, sentence Q2 is filtered out according to the question-answer relationship identifiers, and sentences Q4 and A4 are filtered out according to the classification results output by the classification network.
The remaining sentences Q1 with A1 and Q3 with A3 therefore form the second question-answer set, Q1 with A3 and Q3 with A1 form the third question-answer set, and the second-stage training is performed to obtain the second language model.
If the language model were trained with a traditional method, then in the mall scenario, when the user inputs the question sentence "Tell a joke", the dialogue system containing the model might output the illogical response "Enter and speak". With the embodiment of the invention, the dialogue system based on the second language model outputs "While the sun is shining on you, you are also shining on the sun", a coherent response.
The training process in this scenario is similar to that shown in figs. 5a to 5c, so the training diagram is not repeated.
The model training apparatus of one or more embodiments of the present invention will be described in detail below. Those skilled in the art will appreciate that these model training devices can each be constructed using commercially available hardware components configured through the steps taught in the present scheme.
Fig. 6 is a schematic structural diagram of a model training apparatus according to an embodiment of the present invention, and as shown in fig. 6, the apparatus includes:
the obtaining module 11 is configured to obtain a first question-answer set.
A filtering module 12, configured to filter out redundant statements in the first question-answer set to obtain a second question-answer set, where any question-answer pair in the second question-answer set includes a target question statement and a target answer statement.
And a set determining module 13, configured to determine a third question-answer set according to the target question statements and the non-target answer statements in the second question-answer set.
And the training module 14 is configured to train a language model according to the second question-answer set and the third question-answer set.
Optionally, any question-answer pair in the first question-answer set includes a target question statement and a target answer statement;
the set determining module 13 is further configured to determine a fourth question-answer set according to the target question statements and the non-target answer statements in the first question-answer set.
The training module 14 is further configured to perform model training according to the first question-answer set and the fourth question-answer set to obtain a first language model; and training the first language model according to the second question-answer set and the third question-answer set to obtain a second language model.
Optionally, the filtering module 12 is specifically configured to:
filtering out redundant question sentences in the first question-answer set according to the first similarity between the question sentences in the first question-answer set; filtering out redundant answer sentences in the first question-answer set according to the second similarity between the answer sentences in the first question-answer set; and performing filtering processing on the remaining sentences in the first question-answer set according to their question-answer relationship identifiers to obtain the second question-answer set.
Optionally, the apparatus further comprises:
an input module 21, configured to input the first question-answer set into the first language model, so that the first language model converts the question sentences and answer sentences in the first question-answer set into sentence vectors respectively;
A similarity determining module 22, configured to determine the first similarity according to the sentence vectors corresponding to the question sentences respectively; and determining the second similarity according to the sentence vectors corresponding to the answer sentences respectively.
Optionally, the first language model is configured to vector-convert the question statements through a first conversion network in the first language model, and to vector-convert the answer statements through a second conversion network in the first language model.
Optionally, the filtering module 12 is specifically configured to: determine, according to the question-answer relationship identifiers, target question-answer pairs having the preset question-answer relationship among the remaining sentences; input the target question-answer pairs into the first language model, so that a classification network in the first language model classifies whether each pair's preset question-answer relationship holds; and filter out the target question-answer pairs whose preset question-answer relationship does not hold, according to the confidence of the classification results output by the classification network.
Optionally, the training module 14 is specifically configured to: and performing model training according to the first question-answer set and the fourth question-answer set to obtain a first conversion network and a second conversion network in the first language model.
Optionally, the training module 14 is further specifically configured to: inputting the second question-answer set and the third question-answer set into the first language model, outputting sentence vectors corresponding to question sentences through the first conversion network in the first language model, and outputting sentence vectors corresponding to answer sentences through the second conversion network;
and adjusting respective network parameters of the first conversion network and the second conversion network according to the similarity between the sentence vector corresponding to the question sentence and the sentence vector corresponding to the answer sentence to obtain the second language model.
Optionally, the training module 14 is further specifically configured to: and performing model training according to a preset question-answer relationship between question sentences and answer sentences in the first question-answer set to obtain a classification network in the first language model.
The apparatus shown in fig. 6 may perform the model training method provided in the embodiments shown in fig. 1 to fig. 4; for parts not described in detail in this embodiment, reference may be made to the related description of the embodiments shown in fig. 1 to fig. 4, which is not repeated here.
Having described the internal functions and structure of the model training apparatus, in one possible design, the structure of the model training apparatus may be implemented as an electronic device, as shown in fig. 7, which may include: a processor 31 and a memory 32, wherein the memory 32 is used for storing a program that supports the electronic device in executing the model training method provided in the foregoing embodiments shown in fig. 1 to 4, and the processor 31 is configured to execute the program stored in the memory 32.
The program comprises one or more computer instructions which, when executed by the processor 31, are capable of performing the following steps:
acquiring a first question-answer set;
filtering out redundant sentences in the first question-answer set to obtain a second question-answer set, wherein any question-answer pair in the second question-answer set comprises a target question sentence and a target answer sentence;
determining a third question-answer set according to the target question sentences and the non-target answer sentences in the second question-answer set;
and training a language model according to the second question-answer set and the third question-answer set.
Optionally, the processor 31 is further configured to perform all or part of the steps in the foregoing embodiments shown in fig. 1 to 4.
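As an illustration of the construction of the third question-answer set from the second question-answer set, the sketch below pairs each target question sentence with every answer sentence other than its own target answer. Exhaustive pairing is an assumption; the embodiments do not specify how many non-target answer sentences are used.

    def build_third_set(second_set):
        """second_set: list of (target_question, target_answer) pairs."""
        third_set = []
        for i, (question, _) in enumerate(second_set):
            for j, (_, other_answer) in enumerate(second_set):
                if i != j:
                    third_set.append((question, other_answer))
        return third_set

    pairs = [("How do I reset the robot?", "Hold the button for five seconds."),
             ("What is the battery runtime?", "About two hours.")]
    print(build_third_set(pairs))
    # [('How do I reset the robot?', 'About two hours.'),
    #  ('What is the battery runtime?', 'Hold the button for five seconds.')]

The fourth question-answer set can be built from the first question-answer set in the same way.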
The electronic device may further include a communication interface 33 for communicating with other devices or a communication network.
Additionally, embodiments of the present invention provide a computer-readable storage medium storing computer instructions that, when executed by one or more processors, cause the one or more processors to perform at least the following:
acquiring a first question-answer set;
filtering out redundant sentences in the first question-answer set to obtain a second question-answer set, wherein any question-answer pair in the second question-answer set comprises a target question sentence and a target answer sentence;
determining a third question-answer set according to the target question sentences and the non-target answer sentences in the second question-answer set;
and training a language model according to the second question-answer set and the third question-answer set.
The above-described apparatus embodiments are merely illustrative, wherein the modules illustrated as separate components may or may not be physically separate. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by adding a necessary general hardware platform, and of course can also be implemented by a combination of hardware and software. Based on this understanding, the part of the above technical solutions that in essence contributes over the prior art may be embodied in the form of a computer software product.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (12)

1. A method of model training, comprising:
acquiring a first question-answer set;
filtering out redundant sentences in the first question-answer set to obtain a second question-answer set, wherein any question-answer pair in the second question-answer set comprises a target question sentence and a target answer sentence;
determining a third question-answer set according to the target question sentences and the non-target answer sentences in the second question-answer set;
and training a language model according to the second question-answer set and the third question-answer set.
2. The method according to claim 1, wherein any question-answer pair in the first question-answer set contains a target question sentence and a target answer sentence;
before filtering out the redundant sentences in the first question-answer set to obtain a second question-answer set, the method further includes:
determining a fourth question-answer set according to the target question sentences and the non-target answer sentences in the first question-answer set;
performing model training according to the first question-answer set and the fourth question-answer set to obtain a first language model;
training a language model according to the second question-answer set and the third question-answer set, including:
and training the first language model according to the second question-answer set and the third question-answer set to obtain a second language model.
3. The method of claim 2, wherein filtering out redundant sentences in the first question-answer set to obtain a second question-answer set comprises:
filtering out redundant question sentences in the first question-answer set according to a first similarity between the question sentences in the first question-answer set;
filtering out redundant answer sentences in the first question-answer set according to a second similarity between the answer sentences in the first question-answer set;
and performing filtering processing on the remaining sentences according to the question-answer relationship identifiers of the remaining sentences in the first question-answer set, to obtain a second question-answer set.
4. The method of claim 3, further comprising:
inputting the first question-answer set into the first language model so that question sentences and answer sentences in the first question-answer set are converted into sentence vectors by the first language model respectively;
determining the first similarity according to sentence vectors corresponding to the question sentences respectively;
and determining the second similarity according to the sentence vectors corresponding to the answer sentences respectively.
5. The method of claim 4, wherein converting, by the first language model, the question sentences and answer sentences in the first question-answer set into sentence vectors respectively comprises:
performing vector conversion on the question sentences through a first conversion network in the first language model;
and performing vector conversion on the answer sentences through a second conversion network in the first language model.
6. The method according to claim 4 or 5, wherein performing filtering processing on the remaining sentences according to the question-answer relationship identifiers of the remaining sentences in the first question-answer set comprises:
determining, according to the question-answer relationship identifiers, target question-answer pairs having a preset question-answer relationship among the remaining sentences;
inputting the target question-answer pairs into the first language model, and classifying, by a classification network in the first language model, whether the preset question-answer relationship of each target question-answer pair holds;
and filtering out, according to the confidence of the classification result output by the classification network, target question-answer pairs whose preset question-answer relationship does not hold.
7. The method according to claim 5, wherein the model training according to the first question-answer set and the fourth question-answer set to obtain a first language model comprises:
and performing model training according to the first question-answer set and the fourth question-answer set to obtain a first conversion network and a second conversion network in the first language model.
8. The method of claim 5, wherein training the first language model according to the second question-answer set and the third question-answer set to obtain a second language model comprises:
inputting the second question-answer set and the third question-answer set into the first language model, outputting sentence vectors corresponding to question sentences through the first conversion network in the first language model, and outputting sentence vectors corresponding to answer sentences through the second conversion network;
and adjusting respective network parameters of the first conversion network and the second conversion network according to the similarity between the sentence vector corresponding to the question sentence and the sentence vector corresponding to the answer sentence to obtain the second language model.
9. The method of claim 6, wherein before filtering out redundant sentences in the first question-answer set to obtain a second question-answer set, the method further comprises:
and performing model training according to a preset question-answer relationship between question sentences and answer sentences in the first question-answer set to obtain a classification network in the first language model.
10. A model training apparatus, comprising:
the acquisition module is used for acquiring a first question-answer set;
a filtering module, configured to filter out redundant sentences in the first question-answer set to obtain a second question-answer set, wherein any question-answer pair in the second question-answer set comprises a target question sentence and a target answer sentence;
a set determining module, configured to determine a third question-answer set according to the target question sentences and the non-target answer sentences in the second question-answer set;
and the training module is used for training a language model according to the second question-answer set and the third question-answer set.
11. An electronic device, comprising: a processor and a memory; wherein the memory is configured to store one or more computer instructions which, when executed by the processor, implement:
acquiring a first question-answer set;
filtering out redundant sentences in the first question-answer set to obtain a second question-answer set, wherein any question-answer pair in the second question-answer set comprises a target question sentence and a target answer sentence;
determining a third question-answer set according to the target question sentences and the non-target answer sentences in the second question-answer set;
and training a language model according to the second question-answer set and the third question-answer set.
12. A computer-readable storage medium storing computer instructions, which when executed by one or more processors, cause the one or more processors to perform at least the following acts:
acquiring a first question-answer set;
filtering out redundant sentences in the first question-answer set to obtain a second question-answer set, wherein any question-answer pair in the second question-answer set comprises a target question sentence and a target answer sentence;
determining a third question-answer set according to the target question sentences and the non-target answer sentences in the second question-answer set;
and training a language model according to the second question-answer set and the third question-answer set.
CN202011566127.1A 2020-12-25 2020-12-25 Model training method and device, electronic equipment and storage medium Pending CN114691815A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011566127.1A CN114691815A (en) 2020-12-25 2020-12-25 Model training method and device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN114691815A 2022-07-01

Family

ID=82129614

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011566127.1A Pending CN114691815A (en) 2020-12-25 2020-12-25 Model training method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114691815A (en)


Patent Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104616031A (en) * 2015-01-22 2015-05-13 哈尔滨工业大学深圳研究生院 Transfer learning method and device
US9589049B1 (en) * 2015-12-10 2017-03-07 International Business Machines Corporation Correcting natural language processing annotators in a question answering system
CN107305578A (en) * 2016-04-25 2017-10-31 北京京东尚科信息技术有限公司 Human-machine intelligence's answering method and device
US20200134263A1 (en) * 2017-07-13 2020-04-30 National Institute Of Information And Communications Technology Non-factoid question-answering device
CN108170853A (en) * 2018-01-19 2018-06-15 广东惠禾科技发展有限公司 A kind of chat language material method for self-cleaning, device and user terminal
CN110597966A (en) * 2018-05-23 2019-12-20 北京国双科技有限公司 Automatic question answering method and device
CN108846126A (en) * 2018-06-29 2018-11-20 北京百度网讯科技有限公司 Generation, question and answer mode polymerization, device and the equipment of related question polymerization model
CN110895553A (en) * 2018-08-23 2020-03-20 国信优易数据有限公司 Semantic matching model training method, semantic matching method and answer obtaining method
CN109739956A (en) * 2018-11-08 2019-05-10 第四范式(北京)技术有限公司 Corpus cleaning method, device, equipment and medium
CN109582793A (en) * 2018-11-23 2019-04-05 深圳前海微众银行股份有限公司 Model training method, customer service system and data labeling system, readable storage medium storing program for executing
CN110188182A (en) * 2019-05-31 2019-08-30 中国科学院深圳先进技术研究院 Model training method, dialogue generation method, device, equipment and medium
CN110362681A (en) * 2019-06-19 2019-10-22 平安科技(深圳)有限公司 The recognition methods of question answering system replication problem, device and storage medium
CN110309283A (en) * 2019-06-28 2019-10-08 阿里巴巴集团控股有限公司 A kind of answer of intelligent answer determines method and device
CN110362659A (en) * 2019-07-16 2019-10-22 北京洛必德科技有限公司 The abnormal statement filter method and system of the open corpus of robot
CN110543552A (en) * 2019-09-06 2019-12-06 网易(杭州)网络有限公司 Conversation interaction method and device and electronic equipment
CN110674246A (en) * 2019-09-19 2020-01-10 北京小米智能科技有限公司 Question-answering model training method, automatic question-answering method and device
CN110704597A (en) * 2019-09-29 2020-01-17 北京金山安全软件有限公司 Dialogue system reliability verification method, model generation method and device
CN110807332A (en) * 2019-10-30 2020-02-18 腾讯科技(深圳)有限公司 Training method of semantic understanding model, semantic processing method, semantic processing device and storage medium
CN111259127A (en) * 2020-01-15 2020-06-09 浙江大学 Long text answer selection method based on transfer learning sentence vector
CN111737424A (en) * 2020-02-21 2020-10-02 北京沃东天骏信息技术有限公司 Question matching method, device, equipment and storage medium
CN111694941A (en) * 2020-05-22 2020-09-22 腾讯科技(深圳)有限公司 Reply information determining method and device, storage medium and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHRIS ALBERTI ET AL.: "Synthetic QA Corpora Generation with Roundtrip Consistency", arXiv, 12 June 2019 (2019-06-12), pages 1-7 *
彭朝劲: "Cleaning of Training Corpora for Question-Answering Robots" (in Chinese), pages 1-2, Retrieved from the Internet <URL:https://blog.csdn.net/u013705518/article/details/95944853> *

Similar Documents

Publication Publication Date Title
CN107967261B (en) Interactive question semantic understanding method in intelligent customer service
CN109697282B (en) Sentence user intention recognition method and device
CN111932144B (en) Customer service agent distribution method and device, server and storage medium
CN106294854B (en) Man-machine interaction method and device for intelligent robot
CN109102809A (en) A kind of dialogue method and system for intelligent robot
US11423884B2 (en) Device with convolutional neural network for acquiring multiple intent words, and method thereof
CN110347863A (en) Talk about art recommended method and device and storage medium
CN111179935B (en) Voice quality inspection method and device
CN108491808B (en) Method and device for acquiring information
CN110069612B (en) Reply generation method and device
CN112487173A (en) Man-machine conversation method, device and storage medium
CN117521675A (en) Information processing method, device, equipment and storage medium based on large language model
CN113392640B (en) Title determination method, device, equipment and storage medium
CN116070169A (en) Model training method and device, electronic equipment and storage medium
CN108628908B (en) Method, device and electronic equipment for classifying user question-answer boundaries
CN115309877A (en) Dialog generation method, dialog model training method and device
CN110149265A (en) Message shows method, apparatus and computer equipment
CN106708950B (en) Data processing method and device for intelligent robot self-learning system
CN111046674B (en) Semantic understanding method and device, electronic equipment and storage medium
CN116561284A (en) Intelligent response method, device, electronic equipment and medium
CN116521832A (en) Dialogue interaction method, device and system, electronic equipment and storage medium
CN114691815A (en) Model training method and device, electronic equipment and storage medium
EP4093005A1 (en) System method and apparatus for combining words and behaviors
CN110991155A (en) Text correction method, apparatus, and medium
CN112149426B (en) Reading task processing method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination