CN114398908A - Model training method, information processing method, device, equipment and storage medium - Google Patents

Model training method, information processing method, device, equipment and storage medium

Info

Publication number
CN114398908A
Authority
CN
China
Prior art keywords
data
intention
training
generation
seed
Prior art date
Legal status
Pending
Application number
CN202210028961.8A
Other languages
Chinese (zh)
Inventor
陈谦
王雯
Current Assignee
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN202210028961.8A priority Critical patent/CN114398908A/en
Publication of CN114398908A publication Critical patent/CN114398908A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/30: Semantic analysis
    • G06F 40/35: Discourse or dialogue representation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The application provides a model training method, an information processing method, a device, equipment and a storage medium. The method includes: training a dialogue understanding model according to a plurality of seed training data, and generating generation data matched with the seed training data according to each seed training data in at least part of the seed training data; selecting, from the obtained generation data, generation data whose prediction intention matches the generation intention, where the generation intention of a piece of generation data is the intention of its corresponding seed training data and the prediction intention is the intention predicted for the generation data by the dialogue understanding model; and performing pseudo-data training on the dialogue understanding model according to the selected generation data. The method solves the problem that the traditional approach, which retrieves training data from web crawler data, has difficulty collecting enough in-domain data, and improves the overall training efficiency and accuracy of the dialogue understanding model.

Description

Model training method, information processing method, device, equipment and storage medium
Technical Field
The present application relates to the field of natural language processing technologies, and in particular, to a model training method, an information processing method, an apparatus, a device, and a storage medium.
Background
A dialogue understanding model can analyze the information input by a user, determine the user's intention, and push corresponding reply information according to that intention; it plays a very important role in the field of natural language processing.
Training a dialogue understanding model requires a large amount of training data. The traditional approach retrieves in-domain data to use as training data, but little human-computer dialogue data can be retrieved from the web, so in-domain data is difficult to collect from web crawler data, and the overall training efficiency and accuracy of the dialogue understanding model are low.
Disclosure of Invention
The embodiments of the present application mainly aim to provide a model training method, an information processing method, an apparatus, a device, and a storage medium, so as to quickly and accurately generate data for training a dialogue understanding model and improve the efficiency and accuracy of training the dialogue understanding model.
In a first aspect, an embodiment of the present application provides a model training method, including:
training a dialogue understanding model according to a plurality of seed training data, and generating generation data matched with the seed training data according to each seed training data in at least part of the seed training data;
selecting generation data with a prediction intention matched with the generation intention from the obtained generation data; the generation intention of the generated data is the intention of corresponding seed training data, and the prediction intention is the intention obtained by predicting the generated data through the dialogue understanding model;
and performing pseudo-data training on the dialogue understanding model according to the selected generated data.
Optionally, generating, according to each seed training data in at least part of the seed training data, generation data matched with the seed training data includes:
dividing at least part of the seed training data into at least one group of seed training data, wherein each group comprises at least one seed training data and the seed training data in a group all correspond to the same intention;
and generating generation data matched with the group of seed training data according to each group of seed training data, wherein the number of the generation data is one or more.
Optionally, generating generation data matched with the set of seed training data includes:
splicing the seed training data in the group;
and performing prompt generation based on a pre-trained language model according to the spliced data to obtain corresponding generated data.
Optionally, selecting generation data with a prediction intention matched with the generation intention from the obtained generation data includes:
from the obtained generated data, selecting generated data that matches the semantics of any seed training data and whose prediction intention matches the generation intention.
Optionally, selecting generated data that matches semantics of any seed training data and has a prediction intention consistent with the generation intention from the obtained generated data, including:
for each seed training data, determining the semantic matching degree between the seed training data and each generated data, and selecting a preset number of generated data with the highest semantic matching degree and/or generated data whose semantic matching degree is larger than a threshold value, to obtain adjacent generated data;
predicting adjacent generated data through the dialogue understanding model to obtain a prediction intention of the adjacent generated data;
from the adjacent generated data, generated data whose generation intention matches the prediction intention is selected.
Optionally, predicting neighboring generation data through the dialogue understanding model to obtain a prediction intention of the neighboring generation data includes: predicting the adjacent generated data through the dialogue understanding model to obtain a pseudo label of the adjacent generated data, wherein the pseudo label comprises a prediction intention and a predicted slot value;
performing pseudo-data training on the dialogue understanding model according to the selected generated data includes: continuing to train the dialogue understanding model according to the selected generated data and the corresponding pseudo labels, together with the seed training data and the corresponding labels.
In a second aspect, an embodiment of the present application provides an information processing method, including:
acquiring request information input by a user, and determining an intention corresponding to the request information through a dialogue understanding model;
outputting reply information according to the intention;
wherein the dialogue understanding model is trained according to the method of any one of the first aspect.
In a third aspect, an embodiment of the present application provides a model training apparatus, including:
the generating module is used for training the dialogue understanding model according to a plurality of seed training data and generating generation data matched with the seed training data according to each seed training data in at least part of the seed training data;
a matching module for selecting generation data with the prediction intention matched with the generation intention from the obtained generation data; the generation intention of the generated data is the intention of corresponding seed training data, and the prediction intention is the intention obtained by predicting the generated data through the dialogue understanding model;
and the training module is used for carrying out pseudo data training on the dialogue understanding model according to the selected generated data.
In a fourth aspect, an embodiment of the present application provides an information processing apparatus, including:
the acquisition module is used for acquiring request information input by a user and determining the corresponding intention of the request information through a dialogue understanding model;
the output module is used for outputting reply information according to the intention;
wherein the dialogue understanding model is trained by the apparatus according to the third aspect.
In a fifth aspect, an embodiment of the present application provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to cause the electronic device to perform the method of any of the above aspects.
In a sixth aspect, an embodiment of the present application provides a computer-readable storage medium, where computer-executable instructions are stored, and when a processor executes the computer-executable instructions, the method according to any one of the above aspects is implemented.
In a seventh aspect, the present application provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, the computer program implements the method of any one of the above aspects.
The model training method, the information processing method, the device, the equipment and the storage medium can train a dialogue understanding model according to a plurality of seed training data, generate generated data matched with the seed training data according to each seed training data in at least part of the seed training data, and select, from the generated data, generated data whose prediction intention matches the generation intention, where the generation intention of a piece of generated data is the intention of its corresponding seed training data and the prediction intention is the intention predicted for the generated data by the dialogue understanding model; pseudo-data training is then performed on the dialogue understanding model according to the selected generated data. In this way, a large amount of weakly supervised data can be generated from the seed training data, which solves the problem that the traditional retrieval-based method has difficulty collecting data from web crawler data. The generated data is further filtered by comparing the prediction intention with the generation intention, so that generated data whose predicted labels are highly reliable can be found quickly and accurately for model training, improving the overall training efficiency and accuracy of the dialogue understanding model.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present application;
fig. 2 is a schematic application diagram of a dialog understanding model provided in an embodiment of the present application;
fig. 3 is a schematic flowchart of a model training method according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram illustrating a comparison between a predicted intent and a generated intent provided by an embodiment of the present application;
fig. 5 is a schematic diagram of obtaining generated data according to an embodiment of the present application;
FIG. 6 is a schematic flow chart illustrating a process of screening generated data according to an embodiment of the present application;
FIG. 7 is a schematic flow chart diagram illustrating another model training method according to an embodiment of the present disclosure;
fig. 8 is a schematic flowchart of an information processing method according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a model training apparatus according to an embodiment of the present disclosure;
fig. 10 is a schematic structural diagram of an information processing apparatus according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all embodiments consistent with the present application.
The terms referred to in this application are explained first:
Prompt generation technique: prompt-based generation produces data matching a given piece of data; for example, given a piece of text, it generates other text similar to that text.
Dialogue understanding: natural language understanding, which determines the corresponding intention from the preceding context of a conversation; the intention can then be used to determine what follows in the conversation.
Weak supervision: common types of weak supervision include incomplete supervision, inaccurate supervision, and the like.
Self-training: training a model jointly with a small amount of labeled data and a large amount of unlabeled data.
Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present application. As shown in fig. 1, the server may train the dialogue understanding model according to the training data and deliver the trained dialogue understanding model to the terminal device; the terminal device may then interact with the user based on the model. For example, when the user inputs request information, the terminal device may determine the reply information corresponding to the request information through the model and push the reply information to the user.
In addition, the training and use of the model can be arranged according to actual needs. For example, the terminal device may train the model and then use the trained model; or the server may both train and use the model, in which case the terminal device uploads the user's request information to the server, obtains the reply information returned by the server, and pushes it to the user.
Optionally, the dialogue understanding model in the embodiments of the present application may refer to a model for determining the intention of an input request. Fig. 2 is an application diagram of a dialogue understanding model according to an embodiment of the present application. As shown in fig. 2, if the request information input by the user is "what is the weather of Beijing tomorrow", the corresponding intention can be determined as "get weather" through the dialogue understanding model, so the intention of the request information is understood and the reply information can be determined according to that intention; for example, the reply information may be "Beijing will be cloudy with light rain tomorrow, please remember to take an umbrella".
Optionally, in addition to the intention, the dialogue understanding model may also determine slot values, where a slot value is at least one parameter value corresponding to the intention. In the above example, the corresponding slot values may include: location = "Beijing" and time = "tomorrow", and the reply information can be determined more accurately from the intention "get weather" together with the slot values "Beijing" and "tomorrow".
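As an illustration of the intention-and-slot output described above, the following Python sketch shows one possible way to represent the result of the dialogue understanding model; the class name and field names are illustrative assumptions and are not part of the claimed method.

    from dataclasses import dataclass, field
    from typing import Dict

    @dataclass
    class UnderstandingResult:
        """Hypothetical container for the output of a dialogue understanding model."""
        intent: str                                           # e.g. "get_weather"
        slots: Dict[str, str] = field(default_factory=dict)   # e.g. {"place": "Beijing"}

    # For the request "what is the weather of Beijing tomorrow", the model described
    # above would ideally produce something like:
    result = UnderstandingResult(
        intent="get_weather",
        slots={"place": "Beijing", "time": "tomorrow"},
    )
    print(result.intent, result.slots)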
In the embodiments of the application, the dialogue understanding model can be applied to any product that needs to identify the user's intention, for example, navigation apps, video apps, smart speakers, and the like. The user may be supported to input request information as text, or to use a voice assistant and input request information by voice, with the device converting between speech and text and determining the corresponding reply information.
To train the dialogue understanding models used by different products, a large amount of training data for the corresponding domain often needs to be acquired; for example, the dialogue understanding model for a navigation app needs a large amount of navigation-related human-computer dialogue data, and the model for a video app needs a large amount of video-related human-computer dialogue data. The traditional method generally acquires such data by retrieving web crawler data, which is inefficient, and the human-computer dialogue data available on the web is insufficient, so the amount of data is still insufficient when training the model and the accuracy of the model is poor.
In view of this, an embodiment of the present application provides a model training method: an original dialogue understanding model is obtained by training on a plurality of seed training data, and generated data matched with the seed training data is obtained through a prompt generation technique. Because there are many seed training data, each round of prompt generation produces at least one generated data from at least one seed training data, and multiple rounds of prompt generation yield a large amount of generated data. Consistency detection is then performed on the generated data to screen out the generated data whose generation intention and prediction intention are consistent, and the original dialogue understanding model is further trained on the screened generated data to obtain a new dialogue understanding model.
The generation intention corresponding to each generation data may refer to an intention of seed training data corresponding to the generation data, and the prediction intention may refer to an intention predicted for the generation data by the original dialogue understanding model.
In this way, a large amount of weakly supervised human-computer dialogue data can be generated through the prompt generation technique, which solves the problem that in-domain human-computer dialogue data is difficult to collect from web crawler data by the traditional retrieval method. The generated data is then filtered by comparing the prediction intention with the generation intention: if the generation intention of a piece of generated data is inconsistent with its prediction intention, the quality of that generated data is poor and the prediction intention is unreliable as a pseudo label, so that data is discarded, and only generated data whose generation intention is consistent with its prediction intention is retained. As a result, generated data with high quality and highly reliable predicted labels can be found quickly and accurately for model training, improving the overall training efficiency and accuracy of the dialogue understanding model.
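A minimal Python sketch of this self-training loop is given below; the callables train_model, prompt_generate and predict_intent are hypothetical stand-ins for the actual training, prompt-generation and prediction components, and only the control flow reflects the procedure described here.

    from collections import defaultdict

    def group_by_intent(samples):
        """Group labeled seed samples by their intent label."""
        groups = defaultdict(list)
        for sample in samples:
            groups[sample["intent"]].append(sample)
        return list(groups.values())

    def self_train(seed_data, train_model, prompt_generate, predict_intent, rounds=2):
        """Sketch of the self-training loop: train on seeds, prompt-generate,
        keep intent-consistent generated data, then continue training on it."""
        model = train_model(seed_data, init=None)              # original dialogue understanding model
        for _ in range(rounds):
            generated = []
            for group in group_by_intent(seed_data):
                intent = group[0]["intent"]                    # generation intention = intent of the seeds
                for text in prompt_generate(group):
                    generated.append({"text": text, "intent": intent})
            # consistency screening: keep samples whose predicted intent matches the generation intent
            kept = [g for g in generated if predict_intent(model, g["text"]) == g["intent"]]
            model = train_model(seed_data + kept, init=model)  # pseudo-data training
        return model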
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The features of the embodiments and examples described below may be combined with each other without conflict between the embodiments. In addition, the sequence of steps in each method embodiment described below is only an example and is not strictly limited.
Fig. 3 is a schematic flowchart of a model training method according to an embodiment of the present application. The execution subject of the method in this embodiment may be applied to any device having a data processing function, such as a terminal device and/or a server. As shown in fig. 3, the method may include:
step 301, training a dialogue understanding model according to a plurality of seed training data, and generating generation data matched with the seed training data according to various sub-training data in at least part of the seed training data.
Optionally, a seed training data set may be obtained, where the seed training data set may include a plurality of seed training data, the seed training data may be natural language text information, specifically may be query (request information) in human-computer dialogue data, the seed training data may correspond to a label, the label may include an intention, and the label may be obtained by manual marking or other manners.
The initial dialogue understanding model can be trained through a plurality of seed training data in the seed training data set and corresponding labels, and the obtained dialogue understanding model can be marked as an original dialogue understanding model.
In addition, corresponding generated data can be obtained through a prompt generation technique according to the seed training data, where the prompt generation technique is used to generate generated data that matches the seed training data, e.g., generated data that is similar to the seed training data.
Some or all of the seed training data may be selected from the seed training data set for prompt generation, and when a prompt is generated, one seed training data may be used as an input to obtain corresponding generated data, or a plurality of seed training data may be used as an input to obtain corresponding generated data. The prompt generation may be performed one or more times to obtain sufficient generated data.
It should be noted that, in this step, training the dialog understanding model and obtaining the generated data may be performed simultaneously or sequentially, and this embodiment does not limit the execution order of the two.
Step 302, selecting generation data with the prediction intention matched with the generation intention from the obtained generation data.
The generation intention of the generation data is the intention of corresponding seed training data, and the prediction intention is the intention obtained by predicting the generation data through the dialogue understanding model.
Optionally, for each generated data, the generated data may be input into the original dialog understanding model obtained through the training in step 301, so as to obtain an intention corresponding to the generated data.
In an example, the prediction intent matches the generation intent, which may mean that the prediction intent is consistent with the generation intent. For any seed training data, assuming that the intention is A, the generation data corresponding to the seed training data can be obtained by a prompt generation technology, the prediction intention of the generation data is predicted, whether the predicted intention is consistent with A or not is judged, and if the predicted intention is not consistent with A, the generation data is discarded without use.
Fig. 4 is a schematic diagram illustrating a comparison between a predicted intention and a generated intention according to an embodiment of the present application. As shown in fig. 4, there are 3 seed training data whose intentions are A, B and C respectively. Corresponding generated data are obtained from these 3 seed training data, and the intention of each generated data is then predicted by the dialogue understanding model, assumed to be A, B and D. By comparing the generation intention with the prediction intention, it can be seen that the first two generated data meet the requirement while the 3rd does not and needs to be removed.
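The comparison in fig. 4 can be sketched as a simple filtering step; the sample representation below is an illustrative assumption.

    def filter_by_intent_consistency(generated, predicted_intents):
        """Keep only generated samples whose predicted intent equals the
        generation intent (the intent of the seed data they came from)."""
        kept, dropped = [], []
        for sample, predicted in zip(generated, predicted_intents):
            (kept if predicted == sample["generation_intent"] else dropped).append(sample)
        return kept, dropped

    # Toy reproduction of the fig. 4 example: generation intents A, B, C
    # against predicted intents A, B, D.
    generated = [
        {"text": "generated 1", "generation_intent": "A"},
        {"text": "generated 2", "generation_intent": "B"},
        {"text": "generated 3", "generation_intent": "C"},
    ]
    kept, dropped = filter_by_intent_consistency(generated, ["A", "B", "D"])
    print(len(kept), len(dropped))  # 2 kept, 1 removed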
In another example, the prediction intent matches the generation intent, which may mean that the prediction intent is consistent with or close to the generation intent.
And 303, performing pseudo data training on the dialogue understanding model according to the selected generated data.
After the generated data whose prediction intention matches the generation intention is screened out, the original dialogue understanding model trained in step 301 may be further trained according to the screened generated data. The labels corresponding to the generated data are not obtained by manual annotation but are produced by the prediction of the original dialogue understanding model, so these labels can be regarded as pseudo labels, and training the model based on such generated data can be regarded as pseudo-data training.
Optionally, during training, seed training data may also be added, that is, this step may specifically include: training the dialogue understanding model according to the selected generation data and the seed training data.
In practical applications, steps 301 to 303 may be performed repeatedly, and each time step 301 or step 303 is repeated, training continues from the latest dialogue understanding model obtained so far. The seed training data set used in each round of training may be the same or different; for example, in different iterations, other labeled seed training data may be added, or generated data with pseudo labels may be added to the seed training data set for further training.
Self-training of the dialogue understanding model can be achieved by repeatedly performing the steps. The resulting conversational understanding model may be deployed at a server or terminal device for use in interacting with a user.
In summary, the model training method provided in this embodiment trains a dialogue understanding model according to a plurality of seed training data, generates generated data matched with the seed training data according to each seed training data in at least part of the seed training data, and selects from the generated data the generated data whose prediction intention matches the generation intention, where the generation intention of a piece of generated data is the intention of its corresponding seed training data and the prediction intention is the intention predicted for the generated data by the dialogue understanding model; pseudo-data training is then performed on the dialogue understanding model according to the selected generated data. In this way, a large amount of weakly supervised data can be generated from the seed training data, which solves the problem that the traditional retrieval-based method has difficulty collecting data from web crawler data; the generated data is further filtered by the prediction intention and the generation intention, so generated data with highly reliable predicted labels can be found quickly and accurately for model training, improving the overall training efficiency and accuracy of the dialogue understanding model.
In one or more embodiments of the present application, optionally, generating, according to each seed training data in at least part of the seed training data, generated data matched with the seed training data may include:
dividing at least part of the seed training data into at least one group of seed training data, wherein each group comprises at least one seed training data and the seed training data in a group all correspond to the same intention; and generating, according to each group of seed training data, generated data matched with that group of seed training data, wherein the number of generated data may be one or more.
Optionally, seed training data with the same intention may be screened from the multiple seed training data, and a plurality of seed training data with the same intention may be randomly selected from the seed training data as a group for prompt generation.
For example, there may be multiple seed training data whose intention is "get weather"; some or all of them may be selected as a group, and the generated data corresponding to that group is obtained. For seed training data with other intentions, the corresponding generated data may likewise be obtained group by group.
Optionally, a set of generated data may be generated according to a set of seed training data with the same intent, and a set of generated data may include one or more generated data, each generated data being similar to one seed training data and may be a sentence.
In conclusion, by using a plurality of seed training data with the same intention for prompt generation, the generated data can be obtained on the basis of more learned knowledge, which improves the quality of the generated data and the training effect of the dialogue understanding model.
Fig. 5 is a schematic diagram of obtaining generated data according to an embodiment of the present application. As shown in fig. 5, generating generated data that matches a group of seed training data may include: splicing the seed training data in the group; and performing prompt generation based on a pre-trained language model according to the spliced data to obtain corresponding generated data.
Optionally, during splicing, a separator may be placed between adjacent seed training data; for example, the seed training data may be spliced together with vertical bars. The spliced data is then input into the pre-trained language model to obtain generated data of a preset length, where the preset length may be set according to actual needs, for example 500 words.
Optionally, the output of the pre-trained language model may itself contain separators, and splitting on the separators yields one or more generated data, i.e., one sentence or several sentences.
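A minimal sketch of this splicing-and-splitting step, assuming a hypothetical generate(prefix, max_length) call in place of the pre-trained language model and using a vertical bar as the separator, consistent with the fig. 5 example that follows.

    SEPARATOR = " | "

    def build_prompt(seed_group):
        """Splice the seed queries of one group into a single prefix,
        separated by vertical bars."""
        return SEPARATOR.join(sample["text"] for sample in seed_group) + SEPARATOR

    def split_generation(raw_output):
        """Split the model's continuation on the separator to recover one or
        more candidate generated sentences."""
        return [piece.strip() for piece in raw_output.split("|") if piece.strip()]

    seed_group = [
        {"text": "how is the weather in Beijing today", "intent": "get_weather"},
        {"text": "how is the weather in Hangzhou", "intent": "get_weather"},
        {"text": "how is the weather in Chongqing this weekend", "intent": "get_weather"},
    ]
    prefix = build_prompt(seed_group)
    # raw = generate(prefix, max_length=500)   # hypothetical call to the pre-trained language model
    raw = "how is the weather today | how high is the temperature tomorrow"  # placeholder output
    print(split_generation(raw))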
Referring to fig. 5, there are 3 seed training data, namely "how is the weather in Beijing today", "how is the weather in Hangzhou" and "how is the weather in Chongqing this weekend", all with the intention "get weather". The three seed training data are spliced into a prefix and input into the pre-trained language model to obtain generated data of a preset length, which is split into "how is the weather today", "how high is the temperature tomorrow", "what is the weather in Wuhan tomorrow", "how is the weather in Yuncheng today", and so on.
Alternatively, the pre-trained language model may be an existing model such as GPT-3 or T5, using its own rewriting capability: a set of data is input as the preceding context, and the model outputs a continuation similar to it.
Or, a pre-trained language model may be designed and trained separately as needed; for example, a plurality of sets of matched natural language text information may be labeled manually, and the model may be trained on the manually labeled data, so that the trained model can output matched data according to the input data.
In summary, a group of seed training data is spliced and then input into a model such as GPT-3 for prompt generation, so that generated data matched with the group of seed training data can be acquired quickly and accurately, improving the efficiency and accuracy of obtaining generated data.
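As a concrete illustration of this splice-and-generate step, the following sketch uses the Hugging Face transformers library with a small GPT-2 model as an accessible stand-in for GPT-3 or T5; the description does not tie the method to any particular library, so the calls below are only one possible implementation.

    from transformers import pipeline

    # Small open model standing in for the large-scale pre-trained language model.
    generator = pipeline("text-generation", model="gpt2")

    # Prefix spliced from same-intent seed queries, following the fig. 5 example.
    prefix = ("how is the weather in Beijing today | "
              "how is the weather in Hangzhou | "
              "how is the weather in Chongqing this weekend | ")
    outputs = generator(prefix, max_new_tokens=60, num_return_sequences=1, do_sample=True)
    continuation = outputs[0]["generated_text"][len(prefix):]

    # Split the continuation on the separator to recover candidate generated queries.
    candidates = [piece.strip() for piece in continuation.split("|") if piece.strip()]
    print(candidates)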
In other alternative implementations, a single seed training data may also be used as an input to obtain corresponding generation data.
In one or more embodiments of the present application, optionally, selecting generation data with a prediction intention matching the generation intention from the obtained generation data may include: from the generated data obtained, generated data that matches the semantics of arbitrary seed training data and matches the prediction intention with the generation intention is selected.
Optionally, the obtained generated data may be filtered through semantic retrieval and consistency filtering; the semantic retrieval is used for screening out generated data matched with the semantics of various sub-training data; and the consistency screening is used for screening out the generated data with the generation intention consistent with the prediction intention.
The sequence of semantic retrieval and consistency screening is not limited, and part of data can be screened out from the generated data through semantic retrieval first, and then consistency screening is further performed from the screened-out data, or part of data can be obtained through consistency screening first, and then further screening is performed through semantic retrieval. The generated data which passes the semantic retrieval and the consistency screening can be used as training data for the next training.
In practical application, if a generated data does not semantically match any of the seed training data, it is semantically far from the current seed training data set and is therefore discarded and not used.
In conclusion, through semantic retrieval and consistency screening, the finally obtained generated data can meet the requirements on intentions and semantic descriptions, the quality of the generated training data is improved, and the model training effect is further improved.
Fig. 6 is a schematic flowchart illustrating a process of screening generated data according to an embodiment of the present application. As shown in fig. 6, selecting, from the generated data obtained, generated data that matches the semantics of arbitrary seed training data and whose prediction intention matches the generation intention may include:
step 601, determining semantic matching degrees of each generated data and each seed training data according to each seed training data, and selecting a preset number of generated data with the highest semantic matching degree and/or generated data with the semantic matching degree larger than a threshold value to obtain adjacent generated data.
This step can be used to realize semantic retrieval. Optionally, the semantic matching degree between the seed training data and each generated data may be determined by a semantic retrieval model, where the semantic retrieval model may use a vector semantic model such as SimCSE (Simple Contrastive Learning of Sentence Embeddings), or a simple character-matching model.
In one example, for each seed training data, Top-N generation data may be semantically retrieved, for example, N is 3, and then 3 generation data with the highest matching degree with the seed training data may be selected as the generation data matching with the seed training data.
In another example, for each seed training data, the generation data with semantic matching degree greater than a threshold may be found, for example, if the threshold is set to 0.8, the generation data with matching degree greater than 0.8 with the seed training data may be screened out as the generation data matching with the seed training data.
In yet another example, a preset number of generated data with the highest semantic matching degree and generated data with a semantic matching degree greater than a threshold value may be selected.
The generated data selected by semantic retrieval is denoted adjacent generated data and represents the part of the obtained generated data that is semantically closest to the seed training data set.
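A minimal sketch of the neighbor selection in step 601, using the simple character-matching variant mentioned above in place of a vector model such as SimCSE; the similarity measure, top-N value and threshold are illustrative assumptions.

    def char_overlap_similarity(a, b):
        """Crude semantic-matching stand-in: character-set overlap (Jaccard).
        A real system could use a vector model such as SimCSE instead."""
        sa, sb = set(a), set(b)
        return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

    def select_neighbors(seed_text, generated_texts, top_n=3, threshold=0.8):
        """For one seed query, keep the top-N most similar generated sentences
        and/or those whose matching degree exceeds the threshold."""
        scored = sorted(((char_overlap_similarity(seed_text, g), g) for g in generated_texts),
                        reverse=True)
        top = {g for _, g in scored[:top_n]}
        above = {g for score, g in scored if score > threshold}
        return top | above   # the "adjacent generated data" for this seed

    candidates = ["how is the weather today", "how high is the temperature tomorrow",
                  "what is the weather in Wuhan tomorrow"]
    print(select_neighbors("how is the weather in Beijing today", candidates))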
Step 602, predicting the adjacent generated data through the dialogue understanding model to obtain a prediction intention of the adjacent generated data.
Alternatively, the dialogue understanding model may be a dialogue understanding model trained by seed training data. Pseudo-labels, which may include predicted intents, may be derived adjacent to the generated data by the conversational understanding model.
Step 603 selects generation data having a generation intention matching the prediction intention from the adjacent generation data.
In practical application, the many generated data obtained by the prompt generation technique can be filtered layer by layer through semantic retrieval and consistency screening: adjacent generated data that meets the semantic requirement is first screened out, and intention prediction is performed only on the adjacent generated data through the dialogue understanding model, so not all generated data produced by prompt generation needs to be input into the dialogue understanding model for intention prediction. Consistency screening then follows the intention prediction, further screening out generated data whose generation intention is consistent with the prediction intention for subsequent training.
In conclusion, generated data with consistent intentions is screened out from the adjacent generated data obtained by semantic retrieval, which effectively improves the efficiency of screening generated data and further improves the overall training efficiency of the model.
In one or more embodiments of the present application, optionally, selecting generation data whose generation intention is consistent with the prediction intention from adjacent generation data may include: selecting generation data with consistent generation intention, prediction intention and semantic retrieval intention from adjacent generation data; and the semantic retrieval intention of the generated data is the intention of seed training data matched with the semantics of the generated data.
Illustratively, prompt generation is performed based on the seed training data P to obtain corresponding generated data, and in the semantic retrieval process, the seed training data with the highest semantic matching degree with that generated data is Q. It can then be detected whether the intention of seed training data P, the intention of seed training data Q, and the predicted intention of the generated data are consistent; if they are consistent, the generated data can be used for subsequent training.
In conclusion, the consistency screening is carried out on the generating intention, the predicting intention and the semantic retrieval intention, so that the training effect of the model can be further improved.
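This three-way consistency screening can be sketched as follows; the intent labels are illustrative.

    def three_way_consistent(generation_intent, prediction_intent, retrieval_intent):
        """Keep a generated sample only if the generation intent (from the seed
        that produced it), the prediction intent (from the dialogue understanding
        model) and the semantic-retrieval intent (from its closest seed) agree."""
        return generation_intent == prediction_intent == retrieval_intent

    # Following the example above: seed P produced the sample, seed Q is its
    # closest semantic neighbor.
    print(three_way_consistent("get_weather", "get_weather", "get_weather"))      # True, keep
    print(three_way_consistent("get_weather", "get_temperature", "get_weather"))  # False, discard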
In one or more embodiments of the present application, optionally, predicting neighboring generated data through the dialogue understanding model to obtain a prediction intention of the neighboring generated data may include: predicting the adjacent generated data through the dialogue understanding model to obtain a pseudo label of the adjacent generated data, wherein the pseudo label comprises a prediction intention and a predicted slot value. When the dialogue understanding model is trained according to the generated data, both the prediction intentions and the predicted slot values can be used for training, so that the trained dialogue understanding model can predict not only intentions but also slot values.
Optionally, the performing the pseudo data training on the dialog understanding model according to the selected generated data may include: and continuing to train the dialogue understanding model according to the selected generation data and the corresponding pseudo label, the seed training data and the corresponding label.
In practical applications, the selected generation data and the original seed training data may be combined, and the dialogue understanding model may be further trained according to the combined data.
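A small sketch of this merging step, assuming each sample is represented as a dictionary with text, intent and slot fields; the pseudo labels on the generated samples come from the model's prediction rather than manual annotation.

    def build_training_set(seed_data, selected_generated, pseudo_labels):
        """Combine manually labeled seed data with the screened generated data
        and its model-predicted pseudo labels (intention and slot values)."""
        pseudo_samples = [
            {"text": text, "intent": label["intent"], "slots": label["slots"], "pseudo": True}
            for text, label in zip(selected_generated, pseudo_labels)
        ]
        return seed_data + pseudo_samples

    seed_data = [{"text": "what is the weather of Beijing tomorrow",
                  "intent": "get_weather",
                  "slots": {"place": "Beijing", "time": "tomorrow"},
                  "pseudo": False}]
    merged = build_training_set(
        seed_data,
        ["what is the weather in Wuhan tomorrow"],
        [{"intent": "get_weather", "slots": {"place": "Wuhan", "time": "tomorrow"}}],
    )
    print(len(merged))  # 2 samples for the next round of training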
In conclusion, by predicting both the intention and the slot value, a dialogue understanding model with richer functions can be trained, and by combining the generated data with the seed training data, the dialogue understanding model can be trained on more, and more accurate, samples, improving its accuracy.
Fig. 7 is a schematic flowchart of another model training method according to an embodiment of the present application. The embodiment provides a specific implementation manner for completing model training by sequentially performing initial training, prompt generation, semantic retrieval and consistency screening on the basis of the technical scheme provided by the embodiment. As shown in fig. 7, the method includes:
and a, training a dialogue understanding model by using seed training data to obtain an original dialogue understanding model.
Optionally, the number of the seed training data may be multiple, each seed training data may be a sentence, has a corresponding label, and the label may include an intention, and further, may also include a slot value, which may be obtained by manual labeling or other means.
And b, screening out the seed training data with the same intention, and randomly selecting a plurality of data from the seed training data.
Specifically, samples with the same intention can be screened from the seed training data, and a plurality of them can then be randomly selected. Illustratively, 3 queries with the intention "get weather" are extracted: "how is the weather in Beijing today", "how is the weather in Hangzhou", and "how is the weather in Chongqing this weekend".
And c, using the same intention data as a prefix to perform prompt generation to obtain generated data.
Optionally, the extracted queries with the same intention may be spliced into a prefix, and prompt generation is performed on the prefix using a large-scale pre-trained model such as GPT-3. Steps b and c may be repeated several times until enough generated data is obtained. Optionally, seed training data with different intentions may be selected each time, so as to obtain generated data corresponding to seed training data with more intentions.
And d, performing semantic retrieval in the generated data for each seed training data to obtain adjacent generated data.
Alternatively, multiple seed training data may be traversed, and for each seed training data, Top-N generation data may be semantically retrieved from the generation data.
Illustratively, 3 adjacent generated data are retrieved: "how high is the temperature tomorrow", "what is the weather in Wuhan tomorrow", and "how is the weather in Yuncheng today".
And e, carrying out pseudo label prediction by using the original dialogue understanding model to obtain a prediction intention, and screening out generated data with the prediction intention consistent with the generation intention.
Optionally, for each piece of the adjacent generated data, the original dialogue understanding model may be used to perform pseudo-label prediction to obtain a prediction intention and a predicted slot value, and the data whose prediction intention is consistent with the generation intention is then screened out.
For example, generated data 4 is deleted because its prediction intention "get temperature" does not coincide with the generation intention "get weather"; the strikethrough on generated data 4 in the figure indicates that the data is discarded.
And f, combining the screened generation data and the original seed training data, and continuing training to obtain a new dialogue understanding model.
Specifically, the original dialogue understanding model may be trained by using the filtered generation data and the original seed training data to obtain a new dialogue understanding model.
Through steps a to f, self-training of the dialogue understanding model can be achieved. Optionally, steps a to f can be executed repeatedly, continuously adding new generated data and training the model, which improves the prediction effect of the model.
In conclusion, generating weakly supervised human-computer dialogue data with the prompt generation technique solves the problem that the traditional retrieval method has difficulty collecting in-domain human-computer dialogue data from web crawler data; the Top-N data closest to the existing training data is then screened by semantic retrieval, consistency screening is carried out, and finally the screened data is used for self-training, thereby improving the accuracy of the dialogue understanding model.
In one or more embodiments of the present application, optionally, when generating the prompt, at least part of the seed training data may be divided into at least one set of seed training data according to the intention and the slot value, each seed training data included in each set corresponds to the same intention and slot value, and the generated data matched with the set of seed training data is generated according to each set of seed training data.
Accordingly, selecting, from the obtained generated data, generated data whose prediction intention matches the generation intention may include: selecting, from the obtained generated data, generated data whose prediction intention matches the generation intention and whose predicted slot value matches the generation slot value; the predicted slot value is the slot value obtained by predicting the generated data through the dialogue understanding model, and the generation slot value is the slot value of the seed training data corresponding to the generated data.
Illustratively, seed training data with the intention "get weather" and slot values including "Beijing" and "tomorrow" may be grouped together, and the generated data corresponding to that group can be obtained by the prompt generation technique. During consistency screening, the prediction intention and predicted slot values of each generated data are obtained through the trained dialogue understanding model, and generated data whose prediction intention is not "get weather" or whose predicted slot values do not include "Beijing" and "tomorrow" is discarded.
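The slot-value variant of the consistency screening in this example can be sketched as follows; the sample representation is an illustrative assumption.

    def keep_sample(pred_intent, pred_slots, gen_intent, gen_slots):
        """Keep a generated sample only if the predicted intent matches the
        generation intent and the predicted slot values cover the generation
        slot values."""
        return pred_intent == gen_intent and all(
            pred_slots.get(key) == value for key, value in gen_slots.items()
        )

    gen_intent, gen_slots = "get_weather", {"place": "Beijing", "time": "tomorrow"}
    print(keep_sample("get_weather", {"place": "Beijing", "time": "tomorrow"}, gen_intent, gen_slots))  # keep
    print(keep_sample("get_weather", {"place": "Wuhan", "time": "tomorrow"}, gen_intent, gen_slots))    # discard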
In conclusion, using both the intention and the slot value for prompt generation makes the seed training data used for prompting more concentrated, so the obtained generated data matches the seed training data better; performing consistency screening on both the intention and the slot value yields generated data of higher quality and further improves the model training effect.
Fig. 8 is a flowchart illustrating an information processing method according to an embodiment of the present application. The information processing method can be applied to any device having an information processing capability, such as a terminal device and/or a server. As shown in fig. 8, the method may include:
step 801, acquiring request information input by a user, and determining an intention corresponding to the request information through a dialogue understanding model.
The dialogue understanding model is obtained by training according to the model training method in any embodiment of the application.
In practical application, both the seed training data and the generated data used for training the model can be natural language text information, and the trained model can be used for predicting a corresponding intention according to the natural language text information.
In use, the acquired request information may be any type of information, such as text information, voice information, image information, and the like. Illustratively, according to voice information or image information input by a user, the voice information or the image information can be converted to obtain corresponding text information, and then the corresponding intention is determined through a dialog understanding model.
And step 802, outputting reply information according to the intention.
Optionally, after determining the intent, the reply information may be determined according to the intent. The specific manner of determining the reply information according to the intent is not limited by the embodiments of the application.
Illustratively, if the intent is to "play a joke," a joke can be obtained from a joke library and pushed to the user.
Optionally, the output of the dialog understanding model may include a slot value in addition to the intent, and the reply information may be determined based on the intent and the slot value.
Illustratively, if the intention corresponding to the request information is "get weather" and the slot values are time = "tomorrow" and place = "Beijing", then the weather in Beijing tomorrow can be queried and pushed to the user.
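An illustrative sketch of turning the predicted intention and slot values into reply information; the weather lookup is a hypothetical placeholder for a real backend query, and the intent and slot names are assumptions.

    def lookup_weather(place, time):
        """Stand-in for querying a weather service."""
        return "cloudy with light rain"

    def build_reply(intent, slots):
        """Route the predicted intention and slot values to reply information."""
        if intent == "get_weather":
            place, time = slots.get("place", ""), slots.get("time", "")
            forecast = lookup_weather(place, time)   # hypothetical backend call
            return f"The weather in {place} {time} is {forecast}."
        return "Sorry, I did not understand the request."

    print(build_reply("get_weather", {"place": "Beijing", "time": "tomorrow"}))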
Optionally, there may be at least one way of outputting the reply message, including but not limited to: display, voice play, send to other devices, etc.
In practical application, the terminal device may be deployed with a trained dialogue understanding model, directly process request information input by a user according to the model, and output reply information. Or the trained conversation understanding model can be deployed in a server, the terminal device can send the request information to the server after acquiring the request information input by the user, the server feeds back the corresponding intention to the terminal device according to the model, or the server feeds back reply information determined according to the intention to the terminal device, and the reply information is pushed to the user by the terminal device.
In summary, the information processing method provided in this embodiment performs information processing with a dialogue understanding model trained using the prompt generation technique. A large amount of weakly supervised human-computer dialogue data can be generated by prompt generation, which solves the problem that the traditional retrieval method has difficulty collecting in-domain human-computer dialogue data from web crawler data; combined with consistency screening of the prediction intention and the generation intention, generated data with higher quality and more reliable predicted labels can be found quickly and accurately for model training, thereby improving the accuracy of the output reply information and the user experience.
In the embodiments of the present application, the apparatus for training and using the model is not limited. The training and use process may involve one or more devices, for example, some steps may be performed by the terminal device and other steps may be performed by the server during the model training process.
Corresponding to the model training method, the embodiment of the application also provides a model training device. Fig. 9 is a schematic structural diagram of a model training apparatus according to an embodiment of the present application. As shown in fig. 9, the apparatus includes:
a generating module 901, configured to train a dialogue understanding model according to a plurality of seed training data, and generate generated data matched with the seed training data according to each seed training data in at least part of the seed training data;
a matching module 902, configured to select generation data with a prediction intent matching the generation intent from the obtained generation data; the generation intention of the generated data is the intention of corresponding seed training data, and the prediction intention is the intention obtained by predicting the generated data through the dialogue understanding model;
a training module 903, configured to perform pseudo data training on the dialog understanding model according to the selected generation data.
In one or more embodiments of the present application, optionally, when generating, according to each seed training data in at least part of the seed training data, generated data matched with the seed training data, the generation module 901 is specifically configured to:
dividing at least part of the seed training data into at least one group of seed training data, wherein each group comprises at least one seed training data and the seed training data in a group all correspond to the same intention;
and generating generation data matched with the group of seed training data according to each group of seed training data, wherein the number of the generation data is one or more.
In one or more embodiments of the present application, optionally, when generating the generated data matched with the set of seed training data, the generating module 901 is specifically configured to:
splicing the seed training data in the group;
and performing prompt generation based on a pre-trained language model according to the spliced data to obtain corresponding generated data.
In one or more embodiments of the present application, optionally, the matching module 902 is specifically configured to:
from the obtained generated data, generated data that matches the semantics of any seed training data and whose prediction intention matches the generation intention is selected.
In one or more embodiments of the present application, optionally, the matching module 902 is specifically configured to:
determining, for each seed training data, the semantic matching degree between the seed training data and each generated data, and selecting a preset number of generated data with the highest semantic matching degree and/or generated data whose semantic matching degree is larger than a threshold value, to obtain adjacent generated data;
predicting adjacent generated data through the dialogue understanding model to obtain a prediction intention of the adjacent generated data;
from the adjacent generated data, generated data whose generation intention matches the prediction intention is selected.
In one or more embodiments of the present application, optionally, when predicting the neighboring generated data through the dialogue understanding model to obtain the prediction intention of the neighboring generated data, the matching module 902 is specifically configured to: predict the adjacent generated data through the dialogue understanding model to obtain a pseudo label of the adjacent generated data, wherein the pseudo label comprises a prediction intention and a predicted slot value.
The training module 903 is specifically configured to: continue to train the dialogue understanding model according to the selected generated data and the corresponding pseudo labels, together with the seed training data and the corresponding labels.
The model training device provided in the embodiment of the present application may be used to implement the technical solutions of the embodiments shown in fig. 1 to 7, and the implementation principles and technical effects thereof are similar and are not described herein again.
Corresponding to the information processing method, the embodiment of the application also provides an information processing device. Fig. 10 is a schematic structural diagram of an information processing apparatus according to an embodiment of the present application. As shown in fig. 10, the apparatus includes:
an obtaining module 1001, configured to obtain request information input by a user, and determine an intention corresponding to the request information through a dialog understanding model;
an output module 1002, configured to output reply information according to the intention;
wherein, the dialogue understanding model is obtained by training according to the model training device of any one of the above embodiments.
The information processing apparatus provided in the embodiment of the present application may be configured to execute the technical solution in the embodiment shown in fig. 8, and the implementation principle and the technical effect are similar, which are not described herein again.
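The request-to-reply flow of the information processing apparatus can be sketched as follows; the reply templates and the `dialogue_model.predict` interface are hypothetical placeholders used only for illustration.

```python
# Hypothetical sketch: take a user request, let the trained dialogue understanding model
# predict its intent (and slot values), and map the intent to a reply.
REPLY_TEMPLATES = {
    "check_weather": "Here is today's weather forecast for {city}.",
    "book_ticket": "Sure, I can help you book a ticket to {destination}.",
}

def handle_request(dialogue_model, request_text):
    intent, slots = dialogue_model.predict(request_text)
    template = REPLY_TEMPLATES.get(intent, "Sorry, I did not understand that.")
    try:
        return template.format(**slots)
    except KeyError:
        return template  # a required slot value is missing: fall back to the bare template
```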
Fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 11, the electronic device of the present embodiment may include:
at least one processor 1101; and
a memory 1102 communicatively coupled to the at least one processor;
wherein the memory 1102 stores instructions executable by the at least one processor 1101 to cause the electronic device to perform a method according to any one of the embodiments described above.
Alternatively, the memory 1102 may be separate or integrated with the processor 1101.
For the implementation principle and the technical effect of the electronic device provided by this embodiment, reference may be made to the foregoing embodiments, which are not described herein again.
The embodiment of the present application further provides a computer-readable storage medium, in which computer-executable instructions are stored, and when a processor executes the computer-executable instructions, the method described in any one of the foregoing embodiments is implemented.
The present application further provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, the computer program implements the method described in any of the foregoing embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the modules is only one logical division, and other divisions may be realized in practice, for example, a plurality of modules may be combined or integrated into another system, or some features may be omitted, or not executed.
The integrated module implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor to execute some steps of the methods described in the embodiments of the present application.
It should be understood that the Processor may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), etc. A general purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of a method disclosed in connection with the present application may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor. The memory may comprise a high-speed RAM memory, and may further comprise a non-volatile memory (NVM), such as at least one disk memory, and may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, or an optical disk.
The storage medium may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer.
An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an Application Specific Integrated Circuit (ASIC). Of course, the processor and the storage medium may also reside as discrete components in an electronic device or host device.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.
The above description is only a preferred embodiment of the present application, and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application, or which are directly or indirectly applied to other related technical fields, are included in the scope of the present application.

Claims (12)

1. A method of model training, comprising:
training a dialogue understanding model according to a plurality of seed training data, and generating generation data matched with the seed training data according to each seed training data in at least part of the seed training data;
selecting generation data with a prediction intention matched with the generation intention from the obtained generation data; the generation intention of the generated data is the intention of corresponding seed training data, and the prediction intention is the intention obtained by predicting the generated data through the dialogue understanding model;
and performing pseudo-data training on the dialogue understanding model according to the selected generated data.
2. The method of claim 1, wherein generating generation data matched with the seed training data according to each seed training data in at least a portion of the seed training data comprises:
dividing at least part of the seed training data into at least one group of seed training data, wherein each group comprises at least one seed training data, and the seed training data in each group correspond to the same intention;
and generating generation data matched with the group of seed training data according to each group of seed training data, wherein the number of the generation data is one or more.
3. The method of claim 2, wherein generating generation data that matches the set of seed training data comprises:
splicing the seed training data in the group;
and performing prompt-based generation with a pre-trained language model according to the spliced data to obtain the corresponding generated data.
4. The method according to any one of claims 1 to 3, wherein selecting, from the obtained generation data, generation data whose prediction intention matches the generation intention comprises:
from the obtained generated data, selecting generated data whose semantics match those of any seed training data and whose prediction intention is consistent with the generation intention.
5. The method of claim 4, wherein selecting, from the obtained generated data, generated data whose semantics match those of any seed training data and whose prediction intention is consistent with the generation intention comprises:
for each seed training data, determining the semantic matching degree between each generated data and the seed training data, and selecting a preset number of generated data with the highest semantic matching degree and/or generated data whose semantic matching degree is greater than a threshold value, to obtain adjacent generated data;
predicting adjacent generated data through the dialogue understanding model to obtain a prediction intention of the adjacent generated data;
selecting, from the adjacent generated data, generated data whose generation intention matches the prediction intention.
6. The method of claim 5, wherein:
predicting adjacent generated data through the dialogue understanding model to obtain the prediction intention of the adjacent generated data comprises: predicting the adjacent generated data through the dialogue understanding model to obtain a pseudo label of the adjacent generated data, wherein the pseudo label comprises a prediction intention and a predicted slot value;
performing pseudo-data training on the dialogue understanding model according to the selected generated data comprises: continuing to train the dialogue understanding model according to the selected generated data with the corresponding pseudo labels, and the seed training data with the corresponding labels.
7. An information processing method characterized by comprising:
acquiring request information input by a user, and determining an intention corresponding to the request information through a dialogue understanding model;
outputting reply information according to the intention;
wherein the dialogue understanding model is trained according to the method of any one of claims 1-6.
8. A model training apparatus, comprising:
the generating module is used for training the dialogue understanding model according to a plurality of seed training data, and generating generation data matched with the seed training data according to each seed training data in at least part of the seed training data;
a matching module for selecting generation data with the prediction intention matched with the generation intention from the obtained generation data; the generation intention of the generated data is the intention of corresponding seed training data, and the prediction intention is the intention obtained by predicting the generated data through the dialogue understanding model;
and the training module is used for carrying out pseudo data training on the dialogue understanding model according to the selected generated data.
9. An information processing apparatus characterized by comprising:
the acquisition module is used for acquiring request information input by a user and determining the corresponding intention of the request information through a dialogue understanding model;
the output module is used for outputting reply information according to the intention;
wherein the dialogue understanding model is trained by the apparatus of claim 8.
10. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to cause the electronic device to perform the method of any of claims 1-7.
11. A computer-readable storage medium having computer-executable instructions stored thereon which, when executed by a processor, implement the method of any one of claims 1-7.
12. A computer program product comprising a computer program, characterized in that the computer program realizes the method according to any of claims 1-7 when executed by a processor.
CN202210028961.8A 2022-01-11 2022-01-11 Model training method, information processing method, device, equipment and storage medium Pending CN114398908A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210028961.8A CN114398908A (en) 2022-01-11 2022-01-11 Model training method, information processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210028961.8A CN114398908A (en) 2022-01-11 2022-01-11 Model training method, information processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114398908A (en) 2022-04-26

Family

ID=81230299

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210028961.8A Pending CN114398908A (en) 2022-01-11 2022-01-11 Model training method, information processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114398908A (en)

Similar Documents

Publication Publication Date Title
CN108460014B (en) Enterprise entity identification method and device, computer equipment and storage medium
CN106649818B (en) Application search intention identification method and device, application search method and server
CN108304375B (en) Information identification method and equipment, storage medium and terminal thereof
CN109726274B (en) Question generation method, device and storage medium
CN106570180B (en) Voice search method and device based on artificial intelligence
US8868609B2 (en) Tagging method and apparatus based on structured data set
CN110750993A (en) Word segmentation method, word segmentation device, named entity identification method and system
CN111339250B (en) Mining method for new category labels, electronic equipment and computer readable medium
CN109243468B (en) Voice recognition method and device, electronic equipment and storage medium
CN110223675B (en) Method and system for screening training text data for voice recognition
CN105279227B (en) Method and device for processing voice search of homophone
CN104615589A (en) Named-entity recognition model training method and named-entity recognition method and device
CN111460149B (en) Text classification method, related device and readable storage medium
CN115630640B (en) Intelligent writing method, device, equipment and medium
CN113076720B (en) Long text segmentation method and device, storage medium and electronic device
CN107748744A (en) A kind of method for building up and device for sketching the contours frame knowledge base
JP2023501010A (en) A Classification Method for Application Preference Text Based on TextRank
CN111916063A (en) Sequencing method, training method, system and storage medium based on BPE (Business Process Engineer) coding
CN111414735A (en) Text data generation method and device
CN112765976A (en) Text similarity calculation method, device and equipment and storage medium
CN111611793B (en) Data processing method, device, equipment and storage medium
CN114398908A (en) Model training method, information processing method, device, equipment and storage medium
CN112036183A (en) Word segmentation method and device based on BilSTM network model and CRF model, computer device and computer storage medium
CN113535895A (en) Search text processing method and device, electronic equipment and medium
CN113704427A (en) Text provenance determination method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination