CN112528679A - Intention understanding model training method and device, and intention understanding method and device


Info

Publication number
CN112528679A
Authority
CN
China
Prior art keywords
data
target language
intention
language
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011500085.1A
Other languages
Chinese (zh)
Other versions
CN112528679B (en)
Inventor
尹坤
刘权
陈志刚
王智国
胡国平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by iFlytek Co Ltd
Priority to CN202011500085.1A
Publication of CN112528679A
Application granted
Publication of CN112528679B
Status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • G06F40/35Discourse or dialogue representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/167Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/42Data-driven translation
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses an intention understanding model training method and device, and an intention understanding method and device. The intention understanding model training method comprises the following steps: after target language training data and auxiliary language training data are obtained, the target language training data and the auxiliary language training data are input into an intention understanding model to obtain the predicted intention corresponding to the target language training data and the predicted intention corresponding to the auxiliary language training data output by the intention understanding model; the model prediction loss of the intention understanding model is determined according to these two predicted intentions; the intention understanding model is updated according to the model prediction loss, and the step of inputting the target language training data and the auxiliary language training data into the intention understanding model and the subsequent steps are executed again until a preset stop condition is reached. This can effectively improve the intention understanding performance of the intention understanding model.

Description

Intention understanding model training method and device and intention understanding method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to an intention understanding model training method and apparatus, and an intention understanding method and apparatus.
Background
Currently, some human-computer interaction devices can perform human-computer interaction with a user according to a user sentence (e.g., a voice sentence and/or a text sentence) input by the user, so that the human-computer interaction devices can assist the user to complete a corresponding operation requirement (e.g., a requirement of route inquiry, air ticket ordering, and the like).
For the human-computer interaction device, after the human-computer interaction device receives a user statement input by a user, the human-computer interaction device needs to understand the intention of the user statement to determine the intention of the user, and then the human-computer interaction device performs human-computer interaction with the user according to the intention of the user.
However, existing human-computer interaction devices still cannot accurately understand user sentences (especially user sentences in languages with a small usage range, such as local dialects and minority languages), so how to accurately understand the user intention is a technical problem to be solved urgently.
Disclosure of Invention
The embodiments of the present application mainly aim to provide an intention understanding model training method and apparatus, and an intention understanding method and apparatus, which can accurately understand a user intention from a user sentence, and in particular from a user sentence in a language with a small usage range, such as a local dialect or a minority language.
The embodiment of the application provides an intention understanding model training method, which comprises the following steps:
acquiring target language training data and auxiliary language training data;
inputting the target language training data and the auxiliary language training data into an intention understanding model to obtain a prediction intention corresponding to the target language training data and a prediction intention corresponding to the auxiliary language training data which are output by the intention understanding model;
determining model prediction loss of the intention understanding model according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data;
and updating the intention understanding model according to the model prediction loss of the intention understanding model, and continuously executing the step of inputting the target language training data and the auxiliary language training data into the intention understanding model until a preset stop condition is reached.
In one possible embodiment, the target language training data includes at least one of target language real data, target language translation data, and target language generation data; the target language translation data is obtained by translating the auxiliary language real data; the target language generation data is generated from candidate intent data.
In a possible implementation manner, the acquiring process of the target language generation data is as follows:
inputting the candidate intention data into a pre-constructed target language data generation model to obtain target language generation data output by the target language data generation model; the target language data generation model is obtained by training by using target language labeling data and auxiliary language labeling data.
In one possible embodiment, the building process of the target language data generation model includes:
training a model to be trained by using the auxiliary language labeling data to obtain an auxiliary language data generation model;
and training the auxiliary language data generation model by using the target language marking data to obtain the target language data generation model.
In a possible implementation manner, the obtaining process of the target language translation data is as follows:
determining a target language vocabulary corresponding to the vocabulary to be translated according to a preset vocabulary mapping relation and the vocabulary to be translated in the auxiliary language real data, and determining the target language translation data according to the auxiliary language real data and the target language vocabulary corresponding to the vocabulary to be translated; the preset vocabulary mapping relation comprises a corresponding relation between the vocabulary to be translated and a target language vocabulary corresponding to the vocabulary to be translated;
or, alternatively,
inputting the auxiliary language real data into a pre-constructed language translation model to obtain a translation result corresponding to the auxiliary language real data output by the language translation model, and determining the target language translation data according to the translation result corresponding to the auxiliary language real data.
In one possible embodiment, the determining a model prediction loss of the intention understanding model according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data includes:
determining the prediction loss of the target language according to the prediction intention corresponding to the training data of the target language;
determining the prediction loss of the auxiliary language according to the prediction intention corresponding to the auxiliary language training data;
determining a model prediction loss of the intent understanding model based on the target language prediction loss and the auxiliary language prediction loss.
In one possible implementation, when the target language training data includes target language real data, target language translation data, and target language generation data, and the predicted intention corresponding to the target language training data includes a predicted intention of the target language real data, a predicted intention of the target language translation data, and a predicted intention of the target language generation data, the method further includes:
acquiring an actual intention of target language real data, a reference intention of target language translation data and an actual intention of target language generation data;
determining a target language prediction loss according to the prediction intention corresponding to the target language training data, wherein the determining comprises the following steps:
determining a prediction loss corresponding to the target language real data according to the prediction intention of the target language real data and the actual intention of the target language real data;
determining a prediction loss corresponding to the target language translation data according to the prediction intention of the target language translation data and the reference intention of the target language translation data;
determining a prediction loss corresponding to the target language generation data according to the predicted intention of the target language generation data and the actual intention of the target language generation data;
and determining the prediction loss of the target language according to the prediction loss corresponding to the real data of the target language, the prediction loss corresponding to the translation data of the target language and the prediction loss corresponding to the generation data of the target language.
The embodiment of the application also provides an intention understanding method, which comprises the following steps:
acquiring target language data to be understood;
inputting the target language data to be understood into the intention understanding model to obtain a prediction intention corresponding to the target language data to be understood output by the intention understanding model; the intention understanding model is trained by any implementation mode of the intention understanding model training method provided by the embodiment of the application.
An embodiment of the present application further provides an intention understanding model training device, which includes:
a first acquisition unit configured to acquire target language training data and auxiliary language training data;
a first prediction unit, configured to input the target language training data and the auxiliary language training data into an intention understanding model, and obtain a prediction intention corresponding to the target language training data and a prediction intention corresponding to the auxiliary language training data that are output by the intention understanding model;
a loss determination unit, configured to determine a model prediction loss of the intention understanding model according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data;
and the model updating unit is used for updating the intention understanding model according to the model prediction loss of the intention understanding model and returning to the first prediction unit to continue executing the step of inputting the target language training data and the auxiliary language training data into the intention understanding model until a preset stop condition is reached.
An intention understanding device is also provided in an embodiment of the present application, the device including:
the second acquisition unit is used for acquiring the target language data to be understood;
the second prediction unit is used for inputting the target language data to be understood into the intention understanding model to obtain a prediction intention corresponding to the target language data to be understood output by the intention understanding model; the intention understanding model is trained by any implementation mode of the intention understanding model training method provided by the embodiment of the application.
The embodiment of the present application further provides an intention understanding model training device, including: a processor, a memory, a system bus;
the processor and the memory are connected through the system bus;
the memory is configured to store one or more programs, the one or more programs including instructions, which when executed by the processor, cause the processor to perform any of the implementations of the intent understanding model training methods provided by the embodiments of the present application.
An embodiment of the present application further provides an intention understanding apparatus, including: a processor, a memory, a system bus;
the processor and the memory are connected through the system bus;
the memory is used for storing one or more programs, the one or more programs comprising instructions, which when executed by the processor, cause the processor to perform any implementation of the intent understanding method provided by the embodiments of the present application.
The embodiment of the present application further provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, and when the instructions are executed on a terminal device, the terminal device is caused to execute any implementation of the intention understanding model training method provided in the embodiment of the present application, or execute any implementation of the intention understanding method provided in the embodiment of the present application.
The embodiment of the present application further provides a computer program product, which when running on a terminal device, enables the terminal device to execute any implementation of the method for training an intention understanding model provided in the embodiment of the present application, or execute any implementation of the method for understanding an intention provided in the embodiment of the present application.
Based on the technical scheme, the method has the following beneficial effects:
according to the intention understanding model training method, after target language training data and auxiliary language training data are obtained, the target language training data and the auxiliary language training data are input into an intention understanding model, a prediction intention corresponding to the target language training data and a prediction intention corresponding to the auxiliary language training data output by the intention understanding model are obtained, and model prediction loss of the intention understanding model is determined according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data; and updating the intention understanding model according to the model prediction loss, and returning to execute the step of inputting the target language training data and the auxiliary language training data into the intention understanding model and the subsequent steps until a preset stop condition is reached.
Because the intention understanding model is trained on the basis of the target language training data, the intention understanding model can accurately recognize the user intention described by user sentences in the target language, and the user intention can therefore be accurately understood from user sentences in the target language (for example, a language with a small usage range, such as a local dialect or a minority language) on the basis of the intention understanding model.
In addition, because the difference between the auxiliary language data and the target language data is small, the auxiliary language data can be used to expand the training data corresponding to the intention understanding model, avoiding the adverse effects caused by having too little training data. The intention understanding model trained on both the target language training data and the auxiliary language training data therefore has better intention understanding performance, and the trained intention understanding model can accurately understand the user intention from user sentences, especially user sentences in languages with a small usage range, such as local dialects and minority languages.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a flowchart of an intention understanding model training method provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of a language translation model provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a process for constructing a target language data generation model according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an intended understanding model provided in an embodiment of the present application;
FIG. 5 is a schematic diagram illustrating training of an intent understanding model provided by an embodiment of the present application;
FIG. 6 is a flow chart of an intent understanding method provided by an embodiment of the present application;
FIG. 7 is a schematic training diagram of an intention understanding model applied to Cantonese intention understanding according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an intention understanding model training apparatus according to an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of an intention understanding apparatus provided in an embodiment of the present application.
Detailed Description
The inventors found in the study of intention understanding that, in the related art, an intention understanding model can be trained in advance using user history statements in a target language, so that the trained intention understanding model can understand the intention of user sentences in the target language input by the user. However, if the target language (e.g., a local dialect or a minority language) has a small usage range, there are few user history statements in that language; the training data available for the intention understanding model is therefore scarce, and the intention understanding performance of a model trained on such a small amount of training data is poor.
The inventors also found in the study of intention understanding that there is similarity between different languages. For example, a Chinese local dialect (e.g., Cantonese) differs only slightly from the Chinese official language (i.e., Mandarin) in word order and pronunciation, so the similarity between the two is relatively large and the difference between them is relatively small. Therefore, when the intention understanding model is used for intention understanding of user sentences in a target language that is a Chinese local dialect, the training data corresponding to the intention understanding model can be expanded with Mandarin data, and the intention understanding model can be trained on both user history statements in Mandarin and user history statements in the Chinese local dialect.
Based on the research findings of the inventor, in order to solve the technical problems in the background art section and the defects in the related art, an intention understanding model training method is provided in an embodiment of the present application, and includes: acquiring target language training data and auxiliary language training data; inputting the target language training data and the auxiliary language training data into an intention understanding model to obtain a prediction intention corresponding to the target language training data and a prediction intention corresponding to the auxiliary language training data which are output by the intention understanding model; determining model prediction loss of the intention understanding model according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data; and updating the intention understanding model according to the model prediction loss, and returning to execute the step of inputting the target language training data and the auxiliary language training data into the intention understanding model and the subsequent steps until a preset stop condition is reached.
It can be seen that, since the intention understanding model is trained based on the target language training data, the intention understanding model can accurately recognize the user intention described by user sentences in the target language, so the user intention can be accurately understood from user sentences in the target language (e.g., a language with a small usage range, such as a local dialect or a minority language) based on the intention understanding model. Moreover, because the difference between the auxiliary language data and the target language data is small, the auxiliary language data can be used to expand the training data corresponding to the intention understanding model, avoiding the adverse effects caused by having too little training data. The intention understanding model trained on both the target language training data and the auxiliary language training data therefore has better intention understanding performance, and the trained intention understanding model can accurately understand the user intention from user sentences, especially user sentences in languages with a small usage range, such as local dialects and minority languages.
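For concreteness, the training loop sketched above can be outlined in code. The following is only an illustrative sketch under assumed names (train_intent_model, the batch dictionaries, the simple sum of the two losses); it is not the patent's actual implementation:

```python
import torch
import torch.nn.functional as F

def train_intent_model(model, target_batch, auxiliary_batch, optimizer,
                       max_rounds=1000, loss_threshold=1e-3):
    """Sketch of the loop: predict intents for both kinds of training data,
    combine the two prediction losses, update the model, and repeat until
    a preset stop condition is reached."""
    for _ in range(max_rounds):
        # Predicted intents for the target and auxiliary language data.
        target_logits = model(**target_batch["inputs"])
        auxiliary_logits = model(**auxiliary_batch["inputs"])

        # Model prediction loss determined from both predicted intents
        # (here simply summed; a weighted combination is equally possible).
        loss = (F.cross_entropy(target_logits, target_batch["labels"])
                + F.cross_entropy(auxiliary_logits, auxiliary_batch["labels"]))

        # Update the intention understanding model, then check the
        # preset stop condition (here: the loss is small enough).
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if loss.item() < loss_threshold:
            break
    return model
```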
In addition, the embodiment of the present application does not limit the execution subject of the intention understanding model training method, and for example, the intention understanding model training method provided in the embodiment of the present application may be applied to a data processing device such as a terminal device or a server. The terminal device may be a smart phone, a computer, a Personal Digital Assistant (PDA), a tablet computer, or the like. The server may be a stand-alone server, a cluster server, or a cloud server.
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Method embodiment one
Referring to fig. 1, the figure is a flowchart of an intention understanding model training method provided in an embodiment of the present application.
The intention understanding model training method provided by the embodiment of the application comprises the following steps of S101-S105:
s101: target language training data and auxiliary language training data are obtained.
The target language refers to a language with a small usage range (i.e., used by a small number of people). The embodiment of the present application does not limit the target language; for example, the target language may be a local dialect or a minority language.
The auxiliary language refers to a language whose difference from the target language is small (i.e., the difference between the auxiliary language and the target language is below a preset difference threshold). For example, the official Chinese language (i.e., Mandarin) differs only slightly from the Chinese local dialects (e.g., Northern, Wu, Xiang, Gan, Hakka, Min, Yue, etc.) in word order and pronunciation, so the difference between Mandarin and a Chinese local dialect is small; hence Mandarin can be used as the auxiliary language when the target language is a Chinese local dialect.
The auxiliary language training data refers to auxiliary language sentences used for training the intention understanding model, and the auxiliary language training data is used for augmenting training data used for the intention understanding model. It should be noted that, the embodiment of the present application does not limit the manner of obtaining the auxiliary language training data, for example, the auxiliary language training data may be legally crawled from a web page, or may be read from a user history statement in an auxiliary language stored in the human-computer interaction device.
Target language training data refers to target language statements used to train the intent understanding model. In addition, the target language training data is not limited in the embodiments of the present application, and for example, the target language training data may include at least one of target language real data, target language translation data, and target language generation data. For ease of understanding, the target language real data, the target language translation data, and the target language generation data are described below separately.
The target language real data refers to real target language sentences acquired by preset acquisition means (for example, means of legally crawling from a webpage, reading from user history sentences in a target language stored in the human-computer interaction device, and the like). It should be noted that the preset acquisition means may be preset, and the embodiment of the present application does not limit the preset acquisition means.
The target language translation data refers to target language sentences acquired through a preset translation means. It should be noted that the preset translation means may be preset, and the preset translation means is not limited in the embodiment of the present application, for example, the preset translation means may be a means for performing translation based on a preset vocabulary mapping relationship, which is described below, or a means for performing translation based on a language translation model, which is described below. In addition, the embodiment of the present application also does not limit the translation object of the preset translation means, for example, the translation object may be an auxiliary language sentence because the difference between the auxiliary language and the target language is small.
To facilitate understanding of the acquisition process of the target language translation data, the following description is made in conjunction with two possible embodiments.
In a first possible implementation, the process of obtaining target language translation data may include steps 11 to 12:
step 11: and determining a target language vocabulary corresponding to the vocabulary to be translated according to the preset vocabulary mapping relation and the vocabulary to be translated in the auxiliary language real data. The preset vocabulary mapping relation comprises a corresponding relation between the vocabulary to be translated and the target language vocabulary corresponding to the vocabulary to be translated.
The preset vocabulary mapping relation is used for recording the corresponding relation between each target language vocabulary and each auxiliary language vocabulary. For example, when the target language is cantonese and the auxiliary language is mandarin, the predetermined vocabulary mapping relationship can be used to record the corresponding relationship between each cantonese vocabulary and each mandarin vocabulary (as shown in table 1).
Cantonese vocabulary          Mandarin vocabulary
"degree of margin"            "where"
"sample application line"     "how to go"
……                            ……
"spotting is carried out"     "how to look like"

TABLE 1 Correspondence between each Cantonese vocabulary item and each Mandarin vocabulary item
The auxiliary language real data refers to the collected real auxiliary language sentences, and the auxiliary language real data comprises at least one auxiliary language vocabulary. It should be noted that the embodiment of the present application does not limit the collection manner of the auxiliary language real data, for example, the auxiliary language real data may be legally crawled from a web page, or may be read from a user history statement in the auxiliary language stored in the human-computer interaction device. In addition, the embodiment of the present application does not limit the relationship between the auxiliary language real data and the above auxiliary language training data, for example, the auxiliary language real data may be the same as the above auxiliary language training data or different from the above auxiliary language training data.
The vocabulary to be translated refers to the auxiliary language vocabulary in the auxiliary language real data that needs to be translated into target language vocabulary, and it can be determined according to the preset vocabulary mapping relation. For example, if the auxiliary language real data is the Mandarin sentence "Wanda, how to go?", the Mandarin vocabulary "how to go" in that sentence can be determined as the vocabulary to be translated, because according to the preset vocabulary mapping relation shown in Table 1 the Mandarin vocabulary "how to go" corresponds to the Cantonese vocabulary "sample application line".
Based on the related content in step 11, after the auxiliary language real data (e.g., the Mandarin sentence "Wanda, how to go?") is obtained, each vocabulary to be translated in the auxiliary language real data (e.g., the Mandarin vocabulary "how to go") may be determined according to the preset vocabulary mapping relation; the target language vocabulary corresponding to each vocabulary to be translated (e.g., the Cantonese vocabulary "sample application line") is then looked up in the preset vocabulary mapping relation, so that the auxiliary language real data can be translated into its corresponding target language translation data (e.g., the Cantonese sentence "Wanda, sample application line") based on the target language vocabulary corresponding to each vocabulary to be translated.
Step 12: and determining target language translation data according to the auxiliary language real data and the target language vocabulary corresponding to the vocabulary to be translated.
In the embodiment of the application, after the target language vocabulary corresponding to each vocabulary to be translated in the auxiliary language real data is obtained, each vocabulary to be translated can be replaced with its corresponding target language vocabulary to obtain the target language translation data. For example, when the auxiliary language real data includes N words to be translated, the 1st word to be translated is replaced with the target language word corresponding to the 1st word, the 2nd word to be translated is replaced with the target language word corresponding to the 2nd word, and so on, until the Nth word to be translated is replaced with the target language word corresponding to the Nth word. The resulting target language translation data thus includes the target language words corresponding to the N words to be translated and no longer contains the N words to be translated themselves.
Based on the related content of the first possible implementation of obtaining target language translation data, each vocabulary to be translated in the auxiliary language real data can be directly replaced with its corresponding target language vocabulary to obtain the target language translation data corresponding to the auxiliary language real data, so that the target language translation data includes the target language vocabulary corresponding to each vocabulary to be translated. For example, the Mandarin vocabulary "how to go" in the Mandarin sentence "Wanda, how to go?" is directly replaced with the Cantonese vocabulary "sample application line", yielding the Cantonese sentence "Wanda, sample application line".
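A minimal sketch of this replacement-based translation (steps 11 to 12), assuming the auxiliary language sentence is already segmented into words and the preset vocabulary mapping relation is available as a plain dictionary; the toy mapping below reuses the renderings from Table 1 and is purely illustrative:

```python
def translate_by_vocab_mapping(auxiliary_words, vocab_mapping):
    """Look up each word of the segmented auxiliary language sentence in the
    preset vocabulary mapping relation and replace every word to be
    translated with its target language counterpart; unmapped words are
    kept as-is."""
    return [vocab_mapping.get(word, word) for word in auxiliary_words]

# Toy Mandarin -> Cantonese mapping in the spirit of Table 1.
vocab_mapping = {"how to go": "sample application line",
                 "where": "degree of margin"}
print(translate_by_vocab_mapping(["Wanda", "how to go"], vocab_mapping))
# -> ['Wanda', 'sample application line']
```

Note that a real implementation for Chinese text would also need a word segmentation step before the lookup.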
In a second possible implementation, the process of obtaining the target language translation data may include steps 21-22:
step 21: and inputting the auxiliary language real data into a pre-constructed language translation model to obtain a translation result corresponding to the auxiliary language real data output by the language translation model.
The language translation model is used for translating the auxiliary language sentences into the target language sentences. Wherein the auxiliary language sentence includes at least one auxiliary language vocabulary, and the target language sentence includes at least one target language vocabulary.
It should be noted that the embodiment of the present application is not limited to the language translation model, and the language translation model may be any existing or future-appearing model capable of translating the auxiliary language statement into the target language statement.
The translation result corresponding to the auxiliary language real data is the result obtained when the language translation model translates the auxiliary language real data. The embodiment of the application does not limit the number of translation results corresponding to the auxiliary language real data. For example, as shown in fig. 2, when the target language is Cantonese and the auxiliary language is Mandarin, if the auxiliary language real data is the Mandarin sentence "close navigation", the language translation model can translate it and output M translation results ordered by recommendation (e.g., "close navigation"; "close the tour"; ……), where M is a positive integer.
Based on the related content in step 21, after the auxiliary language real data is obtained, the auxiliary language real data may be directly input into a pre-constructed language translation model, so that the language translation model can translate the auxiliary language real data and output M translation results corresponding to the auxiliary language real data, so that target language translation data corresponding to the auxiliary language real data can be determined from the M translation results in the following process.
Step 22: and determining target language translation data according to the translation result corresponding to the auxiliary language real data.
In the embodiment of the application, after the M translation results corresponding to the auxiliary language real data are obtained, the target language translation data corresponding to the auxiliary language real data can be selected directly from the M translation results. For example, the top-ranked one of the M translation results (i.e., the Top1 translation result) may be determined as the target language translation data. Alternatively, in order to increase the diversity of the training data, one translation result may be randomly selected from the top G of the M translation results (i.e., the Top1 to TopG translation results) and determined as the target language translation data, where G is a positive integer.
Based on the related content of the second possible implementation manner of obtaining the target language translation data, the pre-constructed language translation model may be directly used to translate the auxiliary language real data and output at least one translation result corresponding to the auxiliary language real data, and the target language translation data corresponding to the auxiliary language real data may be determined according to the at least one translation result.
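The selection in step 22 can be sketched as follows, where `candidates` stands in for the M translation results returned by the language translation model (an assumed representation, for illustration only):

```python
import random

def select_translation(candidates, top_g=1):
    """Keep the Top1 translation result when top_g == 1; otherwise randomly
    pick one of the Top1..TopG results to increase the diversity of the
    training data, as described above."""
    if not candidates:
        raise ValueError("the language translation model returned no results")
    return random.choice(candidates[:top_g])

# Usage: always take the best result, or sample among the top 3.
results = ["close navigation", "close the tour", "stop navigating"]
best = select_translation(results)              # Top1
diverse = select_translation(results, top_g=3)  # random among Top1..Top3
```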
The target language generation data refers to a target language sentence generated from the candidate intention data. Among them, the candidate intention data is used to describe the user intention under a preset application field (that is, an application field of the intention understanding model).
It should be noted that the preset application field may be preset, and the preset application field is not limited in the embodiment of the present application, for example, the preset application field may be a navigation technology field. In addition, the user intention of the preset application field is not limited in the embodiment of the present application, for example, if the preset application field is a navigation technology field, the user intention of the preset application field may include the user intention in the navigation technology field shown in table 2.
[Table 2 (user intention list in the navigation technology field) is presented as an image in the original and its contents are not recoverable here.]
In addition, the embodiment of the present application does not limit the representation form of the candidate intention data; for example, the candidate intention data may be represented by a two-tuple, i.e., as (intention type, intention parameter). The intention type refers to the type to which the user intention belongs; for example, the intention type may be positioning, POI searching, and the like. The intention parameter (also called a slot) refers to a parameter involved in the user intention; for example, the intention parameter may be a POI (e.g., Shanghai).
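Purely for illustration, the two-tuple form of candidate intention data could be carried in a small structure such as the following (the class and field names are hypothetical, not from the patent):

```python
from dataclasses import dataclass

@dataclass
class CandidateIntent:
    intent_type: str   # type the user intention belongs to, e.g. "POI search"
    intent_param: str  # slot involved in the intention, e.g. a POI such as "Shanghai"

candidate = CandidateIntent(intent_type="POI search", intent_param="Shanghai")
```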
In addition, the embodiment of the present application does not limit the process of acquiring the target language generation data, and for convenience of understanding, the following description is made with reference to a possible implementation manner of acquiring the target language generation data.
In a possible implementation manner, the acquiring process of the target language generation data may specifically be: and inputting the candidate intention data into a pre-constructed target language data generation model to obtain target language generation data output by the target language data generation model.
The target language data generation model is used for generating target language generation data according to the candidate intention data, and the target language data generation model is obtained by utilizing target language marking data and auxiliary language marking data for training.
The target language marking data refers to target language sentences marked with real user intentions and used for training the target language data generation model. In addition, the present embodiment does not limit the representation form of the target language markup data, and for example, the target language markup data may be represented by a triplet (target language sentence, intent type, intent parameter). In addition, the embodiment of the present application also does not limit the relationship between the target language markup data and the above target language real data, for example, the target language markup data may be the same as the above target language real data or different from the above target language real data.
The auxiliary language marking data refers to auxiliary language sentences marked with real user intentions and used for training the target language data generation model. The embodiment of the present application does not limit the representation form of the auxiliary language marking data; for example, it may be represented by a triplet (auxiliary language sentence, intention type, intention parameter). The embodiment also does not limit the relationship between the auxiliary language marking data and the above auxiliary language real data; for example, the auxiliary language marking data may be the same as or different from the above auxiliary language real data.
In addition, the target language data generation model may be a generative model, and the embodiment of the present application does not limit its generative framework; for example, the generative framework of the target language data generation model may be UniLM-v2 proposed by Microsoft.
In addition, the embodiment of the present application also does not limit the process of constructing the target language data generation model, and for the convenience of understanding, the following description is made with reference to one possible implementation of constructing the target language data generation model.
In one possible embodiment, the process of constructing the target language data generation model may include steps 31-32:
step 31: and training the model to be trained by using the auxiliary language labeling data to obtain an auxiliary language data generation model.
The model to be trained is the basic model used to construct the target language data generation model; through two rounds of training, the model to be trained becomes the target language data generation model. The embodiment of the present application does not limit the model to be trained; for example, it may be a generative model, in particular a generative model with UniLM-v2 as its generative framework.
The auxiliary language data generation model is used for generating auxiliary language sentences according to the candidate intention data. For example, when the candidate intent data is A and the auxiliary language statement is X, then the auxiliary language data generation model is used to generate X from A (i.e., A → X). As can be seen, the auxiliary linguistic data generation model may be represented as formula (1).
X = SC_f(A)  (1)

where X is an auxiliary language sentence (e.g., the Mandarin sentence "go to Shanghai tomorrow"); A is the candidate intention data; and SC_f(·) is the model function of the auxiliary language data generation model. The present embodiment does not limit the expression of A; for example, A = (I, P), where I is an intention type (e.g., POI search) and P is an intention parameter (e.g., Shanghai).
It should be noted that, the training process of step 31 is not limited in the embodiment of the present application, and any existing or future training method that can train the model to be trained into the auxiliary language data generation model may be used to implement the training process.
Step 32: and training the auxiliary language data generation model by using the target language marking data to obtain the target language data generation model.
And the target language data generation model is used for generating target language sentences according to the candidate intention data. For example, if the candidate intent data is A and the target language statement is L, then the target language data generation model is used to generate L from A (i.e., A → L). As can be seen, the target language data generation model can be expressed as formula (2).
L = SC_m(A)  (2)

where L is a target language sentence (e.g., the Cantonese sentence meaning "go to Shanghai tomorrow"); A is the candidate intention data; and SC_m(·) is the model function of the target language data generation model.
The embodiment of the present application does not limit the training process of step 32, which may be implemented by any existing or future training method capable of training the auxiliary language data generation model into the target language data generation model.
It should also be noted that, for the target language data generation model, step 31 is a pre-training process and step 32 is a transfer-learning process. For example, as shown in fig. 3, the construction of a target language data generation model whose generative framework is UniLM-v2 may include two phases: pre-training and transfer learning.
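Steps 31 to 32 amount to two successive fine-tuning passes over one network. A hedged sketch follows, where `fine_tune` stands in for an ordinary sequence-to-sequence training routine (intention tuple in, sentence out); it is an assumed helper, not an API of UniLM-v2:

```python
def build_target_language_generator(base_model, auxiliary_labeled_data,
                                    target_labeled_data, fine_tune):
    """Two-phase construction of the target language data generation model."""
    # Step 31 (pre-training): train the model to be trained with the
    # auxiliary language marking data -> auxiliary language data
    # generation model, X = SC_f(A).
    auxiliary_generator = fine_tune(base_model, auxiliary_labeled_data)

    # Step 32 (transfer learning): continue training with the scarcer
    # target language marking data -> target language data generation
    # model, L = SC_m(A).
    target_generator = fine_tune(auxiliary_generator, target_labeled_data)
    return target_generator
```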
Based on the above related content of S101, in the embodiment of the present application, in order to train an intention understanding model with better intention understanding performance, target language training data and auxiliary language training data may both be obtained, so that the intention understanding model can subsequently be trained on both. This provides sufficient training data for training the intention understanding model and effectively avoids the adverse effects on the model of having too little training data.
S102: and inputting the target language training data and the auxiliary language training data into an intention understanding model to obtain a predicted intention corresponding to the target language training data and a predicted intention corresponding to the auxiliary language training data which are output by the intention understanding model.
The intention understanding model is used for carrying out user intention understanding on user sentences in the target language.
The present embodiment does not limit the structure of the intention understanding model, which may be implemented using any model structure that is currently available or may appear in the future for intention understanding. For example, as shown in fig. 4, the intention understanding model may include an encoder (i.e., an encoding layer) and a classifier (i.e., a fully connected layer). The encoder performs semantic understanding on the model input data of the intention understanding model to obtain the sentence vector corresponding to the model input data. The classifier determines the predicted intent (i.e., the model output data) corresponding to the model input data based on the sentence vector output by the encoder.
It should be noted that the present embodiment does not limit the encoder, which may be implemented using any existing or future semantic understanding model; for example, the encoder may be BERT (Bidirectional Encoder Representations from Transformers). Likewise, the embodiment of the present application does not limit the classifier, which may be implemented using any existing or future classifier; for example, the classifier may be a linear classifier. It can be seen that when the intention understanding model includes an encoder and a classifier, the encoder is BERT, and the classifier is a linear classifier, the intention understanding model can be implemented using formula (3).
y = linear(BERT(x))  (3)

where y is the model output data of the intention understanding model (i.e., the predicted intent corresponding to the input data x); x is the model input data of the intention understanding model; BERT(·) is the encoding function in the intention understanding model; and linear(·) is the classification function in the intention understanding model.
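A minimal PyTorch sketch of formula (3), assuming the Hugging Face transformers library and a generic Chinese BERT checkpoint; the patent does not prescribe a particular implementation, so everything below is illustrative:

```python
import torch
from transformers import BertModel, BertTokenizer

class IntentUnderstandingModel(torch.nn.Module):
    """y = linear(BERT(x)): encode the sentence, then classify its intent."""

    def __init__(self, num_intents, pretrained="bert-base-chinese"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(pretrained)
        self.classifier = torch.nn.Linear(self.encoder.config.hidden_size,
                                          num_intents)

    def forward(self, input_ids, attention_mask):
        outputs = self.encoder(input_ids=input_ids,
                               attention_mask=attention_mask)
        sentence_vector = outputs.pooler_output  # sentence vector from BERT
        return self.classifier(sentence_vector)  # predicted intent logits

# Usage sketch: one user sentence in, intent logits out.
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = IntentUnderstandingModel(num_intents=10)
batch = tokenizer(["close navigation"], return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])
```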
The predicted intention corresponding to the target language training data is obtained by the intention understanding model through intention understanding of the target language training data.
The predicted intention corresponding to the auxiliary language training data is obtained by intention understanding of the auxiliary language training data by an intention understanding model.
Based on the above content of S102, the target language training data and the auxiliary language training data may be input into the intention understanding model, so that the intention understanding model performs intention understanding on both and outputs the predicted intention corresponding to the target language training data and the predicted intention corresponding to the auxiliary language training data. For example, as shown in fig. 5, when the target language training data includes target language real data, target language translation data, and target language generation data, these three kinds of data and the auxiliary language training data may all be input into the intention understanding model, which outputs the predicted intention corresponding to each of them, so that the intention understanding performance of the intention understanding model can subsequently be determined based on these predicted intentions.
S103: and determining model prediction loss of the intention understanding model according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data.
The model prediction loss is used to characterize the intention understanding performance of the intention understanding model: the larger the model prediction loss, the worse the intention understanding performance of the intention understanding model; the smaller the model prediction loss, the better the intention understanding performance of the intention understanding model.
The embodiment of the present application does not limit the calculation method of the model prediction loss, and for the convenience of understanding, the following description will be made with reference to an example.
As an example, S103 may specifically include S1031-S1033:
s1031: and determining the prediction loss of the target language according to the prediction intention corresponding to the training data of the target language.
Wherein the target language prediction loss is used to characterize an intent understanding performance of the intent understanding model on the target language statement. In addition, the embodiment of the present application does not limit the determination method of the target language prediction loss. For ease of understanding, the following description is made in connection with one possible embodiment.
In a possible implementation manner, when the target language training data includes target language real data, target language translation data, and target language generation data, and the predicted intention corresponding to the target language training data includes a predicted intention of the target language real data, a predicted intention of the target language translation data, and a predicted intention of the target language generation data, the obtaining process of the target language prediction loss may specifically include steps 41 to 45:
step 41: and acquiring the actual intention of the target language real data, the reference intention of the target language translation data and the actual intention of the target language generation data.
The actual intention of the target language real data refers to the actual intention described by the target language real data; in the training process of the intention understanding model, the actual intention of the target language real data can be used as label information to guide the training process of the intention understanding model, so that the predicted intention output by the trained intention understanding model aiming at the target language real data can be as close to the actual intention of the target language real data as possible.
The reference intention of the target language translation data can be used as label information to guide the training process of the intention understanding model, so that the predicted intention output by the trained intention understanding model for the target language translation data can be as close to the reference intention of the target language translation data as possible.
It should be noted that the embodiments of the present application do not limit the reference intent of the target language translation data. In some cases, if the target language translation data is obtained by translating the auxiliary language real data, the reference intention of the target language translation data may be a predicted intention corresponding to the auxiliary language real data. And the predicted intention corresponding to the auxiliary language real data is obtained by intention understanding of the auxiliary language real data by an intention understanding model. For example, if the auxiliary language real data is the above auxiliary language training data, the reference intention of the target language translation data may be a predicted intention corresponding to the above auxiliary language training data.
The actual intention of the target language generation data refers to the real intention of the target language generation data; and if the target language generation data is generated from the candidate intent data, the actual intent of the target language generation data may be the intent described by the candidate intent data.
In addition, in the training process of the intention understanding model, the actual intention of the target language generation data may be used as the label information to guide the training process of the intention understanding model, so that the predicted intention of the trained intention understanding model output for the target language generation data can be as close as possible to the actual intention of the target language generation data.
Step 42: and determining the prediction loss corresponding to the target language real data according to the prediction intention of the target language real data and the actual intention of the target language real data.
And the predicted loss corresponding to the target language real data is used for representing the intention understanding performance of the intention understanding model on the target language real data.
In addition, the embodiment of the present application does not limit the calculation method of the predicted loss corresponding to the target language real data, for example, step 42 may specifically be: as shown in formula (4), Cross Entropy (CE) between the predicted intent of the target language real data and the actual intent of the target language real data is determined as the predicted loss corresponding to the target language real data.
$$\mathrm{loss}_{real}^{tgt} = CE\left(\hat{y}_{real}^{tgt},\ y_{real}^{tgt}\right) \tag{4}$$

In the formula, $\mathrm{loss}_{real}^{tgt}$ is the prediction loss corresponding to the target language real data; $\hat{y}_{real}^{tgt}$ is the predicted intent of the target language real data (e.g., $\hat{y}_{real}^{tgt}=f\left(x_{real}^{tgt}\right)$, where $x_{real}^{tgt}$ is the target language real data and $f(\cdot)$ denotes the intention understanding model); $y_{real}^{tgt}$ is the actual intent of the target language real data; and $CE(\cdot)$ is the cross entropy function.
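For ease of understanding only, the following is a minimal sketch of how formula (4) could be computed in a PyTorch-style setup; the batch size, intent count, and tensor names are illustrative assumptions rather than values given in this embodiment.

```python
import torch
import torch.nn.functional as F

# Minimal sketch of formula (4); 8 sentences and 20 candidate intents
# are illustrative assumptions, not values from this embodiment.
pred_logits = torch.randn(8, 20)            # predicted intent scores
actual_intent = torch.randint(0, 20, (8,))  # actual intent labels used as supervision

# F.cross_entropy applies log-softmax internally, so it directly computes
# the cross entropy between the predicted intent distribution and the
# actual intent of the target language real data.
loss_real = F.cross_entropy(pred_logits, actual_intent)
```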
Step 43: and determining the corresponding prediction loss of the target language translation data according to the prediction intention of the target language translation data and the reference intention of the target language translation data.
And the predicted loss corresponding to the target language translation data is used for representing the intention understanding performance of the intention understanding model on the target language translation data.
In addition, the embodiment of the present application does not limit the calculation method of the prediction loss corresponding to the target language translation data, for example, step 43 may specifically be: as shown in equation (5), the relative entropy (KL Divergence) between the prediction intent of the target language translation data and the reference intent of the target language translation data is determined as the prediction loss corresponding to the target language translation data.
$$\mathrm{loss}_{trans}^{tgt} = KL\left(\hat{y}_{trans}^{tgt},\ y_{trans}^{ref}\right) \tag{5}$$

In the formula, $\mathrm{loss}_{trans}^{tgt}$ is the prediction loss corresponding to the target language translation data; $\hat{y}_{trans}^{tgt}$ is the predicted intent of the target language translation data (e.g., $\hat{y}_{trans}^{tgt}=f\left(x_{trans}^{tgt}\right)$, where $x_{trans}^{tgt}$ is the target language translation data); $y_{trans}^{ref}$ is the reference intent of the target language translation data; and $KL(\cdot)$ is the KL divergence function.
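Similarly, the following is a minimal sketch of formula (5), assuming the reference intent is the intent distribution the model predicts on the corresponding auxiliary language sentence and is treated as fixed label information (hence the detach call); the direction of the KL divergence follows common knowledge-distillation practice and is likewise an assumption.

```python
import torch
import torch.nn.functional as F

# Sketch of formula (5); shapes and names are illustrative assumptions.
trans_logits = torch.randn(8, 20)  # predictions on target language translation data
ref_logits = torch.randn(8, 20)    # predictions on the auxiliary language originals

# kl_div expects log-probabilities as input and probabilities as target;
# detaching the reference treats it as label information, per step 43.
loss_trans = F.kl_div(
    F.log_softmax(trans_logits, dim=-1),
    F.softmax(ref_logits.detach(), dim=-1),
    reduction="batchmean",
)
```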
Step 44: and determining the prediction loss corresponding to the target language generation data according to the predicted intention of the target language generation data and the actual intention of the target language generation data.
And the predicted loss corresponding to the target language generation data is used for characterizing the intention understanding performance of the intention understanding model on the target language generation data.
In addition, the embodiment of the present application is not limited to the calculation method of the prediction loss corresponding to the target language generation data, and for example, step 44 may specifically be: as shown in equation (6), the cross entropy between the predicted intent of the target language generation data and the actual intent of the target language generation data is determined as the prediction loss corresponding to the target language generation data.
$$\mathrm{loss}_{gen}^{tgt} = CE\left(\hat{y}_{gen}^{tgt},\ y_{gen}^{tgt}\right) \tag{6}$$

In the formula, $\mathrm{loss}_{gen}^{tgt}$ is the prediction loss corresponding to the target language generation data; $\hat{y}_{gen}^{tgt}$ is the predicted intent of the target language generation data (e.g., $\hat{y}_{gen}^{tgt}=f\left(x_{gen}^{tgt}\right)$, where $x_{gen}^{tgt}$ is the target language generation data); $y_{gen}^{tgt}$ is the actual intent of the target language generation data; and $CE(\cdot)$ is the cross entropy function.
Step 45: and determining the target language prediction loss according to the prediction loss corresponding to the target language real data, the prediction loss corresponding to the target language translation data and the prediction loss corresponding to the target language generation data.
In the embodiment of the application, after the prediction loss corresponding to the target language real data, the prediction loss corresponding to the target language translation data and the prediction loss corresponding to the target language generation data are obtained, weighted summation can be performed among the prediction loss corresponding to the target language real data, the prediction loss corresponding to the target language translation data and the prediction loss corresponding to the target language generation data, so that the target language prediction loss is obtained. It should be noted that the weights involved in the weighted summation process may be set in advance according to an application scenario.
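For illustration, step 45 could be realized as the following small helper; the default weight values are placeholders standing in for whatever weights are preset for the application scenario.

```python
def target_language_loss(loss_real, loss_trans, loss_gen,
                         w_real=1.0, w_trans=0.5, w_gen=0.5):
    """Weighted summation of step 45; the default weights are
    illustrative placeholders, not values from this embodiment."""
    return w_real * loss_real + w_trans * loss_trans + w_gen * loss_gen
```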
Based on the relevant content of the above steps 41 to 45, if the target language training data includes the target language real data, the target language translation data, and the target language generation data, the prediction losses corresponding to the target language real data, the target language translation data, and the target language generation data may be calculated from their respective predicted intentions; the target language prediction loss may then be determined from these three prediction losses, so that the target language prediction loss can accurately represent the intention understanding performance of the intention understanding model on target language sentences.
S1032: and determining the prediction loss of the auxiliary language according to the prediction intention corresponding to the auxiliary language training data.
Wherein the auxiliary language prediction loss is used for characterizing the intention understanding performance of the intention understanding model on the auxiliary language sentence.
In addition, the embodiment of the present application does not limit the calculation process of the auxiliary language prediction loss, for example, the acquisition process of the auxiliary language prediction loss may specifically include steps 51 to 52:
step 51: and acquiring the actual intention corresponding to the auxiliary language training data.
The actual intention corresponding to the auxiliary language training data refers to the real intention of the auxiliary language training data; in the training process of the intention understanding model, the actual intention corresponding to the auxiliary language training data can be used as label information to guide the training process of the intention understanding model, so that the predicted intention output by the trained intention understanding model aiming at the auxiliary language training data can be as close as possible to the actual intention corresponding to the auxiliary language training data. The embodiment of the present application does not limit the manner of acquiring the actual intention corresponding to the auxiliary language training data.
Step 52: and determining the auxiliary language prediction loss according to the prediction intention corresponding to the auxiliary language training data and the actual intention corresponding to the auxiliary language training data.
The embodiment of the present application does not limit the calculation method of the auxiliary language prediction loss, and for the convenience of understanding, the following description is made with reference to an example.
As an example, step 52 may specifically be: as shown in equation (7), the cross entropy between the predicted intent corresponding to the auxiliary language training data and the actual intent corresponding to the auxiliary language training data is determined as the auxiliary language prediction loss.
$$\mathrm{loss}^{aux} = CE\left(\hat{y}^{aux},\ y^{aux}\right) \tag{7}$$

In the formula, $\mathrm{loss}^{aux}$ is the auxiliary language prediction loss; $\hat{y}^{aux}$ is the predicted intent corresponding to the auxiliary language training data (e.g., $\hat{y}^{aux}=f\left(x^{aux}\right)$, where $x^{aux}$ is the auxiliary language training data); $y^{aux}$ is the actual intent corresponding to the auxiliary language training data; and $CE(\cdot)$ is the cross entropy function.
Based on the above-mentioned related contents of step 51 to step 52, the cross entropy between the predicted intent corresponding to the auxiliary language training data and the actual intent corresponding to the auxiliary language training data can be determined as the auxiliary language prediction loss, so that the auxiliary language prediction loss can accurately represent the intention understanding performance of the intention understanding model for the auxiliary language sentence.
S1033: and determining the model prediction loss of the intention understanding model according to the target language prediction loss and the auxiliary language prediction loss.
In the embodiment of the application, after the target language prediction loss and the auxiliary language prediction loss are obtained, the target language prediction loss and the auxiliary language prediction loss can be subjected to weighted summation to obtain the model prediction loss of the intention understanding model. It should be noted that the weights involved in the weighted summation process may be set in advance according to an application scenario.
Based on the relevant content in S1031 to S1033, after the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data are obtained by using the intention understanding model, the prediction loss of the target language may be obtained according to the prediction intention corresponding to the target language training data, and the prediction loss of the auxiliary language may be obtained according to the prediction intention corresponding to the auxiliary language training data; and then carrying out weighted summation on the target language prediction loss and the auxiliary language prediction loss to obtain the model prediction loss of the intention understanding model, so that the model prediction loss can accurately represent the intention understanding performance of the intention understanding model.
S104: judging whether a preset stopping condition is reached, if so, ending the training process of the intention understanding model; if not, go to S105.
The preset stopping condition refers to a preset constraint condition required for stopping training the intention understanding model, and it may be set in advance according to the application scenario. The embodiment of the present application does not limit the preset stop condition. For example, the preset stop condition may be that the model prediction loss of the intention understanding model is lower than a first threshold. For another example, the preset stop condition may be that the rate of change of the model prediction loss of the intention understanding model is smaller than a second threshold (i.e., the predicted intent output by the intention understanding model reaches convergence). As yet another example, the preset stop condition may be that the number of updates of the intention understanding model reaches a third threshold. The first threshold, the second threshold and the third threshold may all be preset.
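For ease of understanding only, the three example stop conditions could be checked as in the following sketch; every threshold value below is an illustrative placeholder rather than a value given in this embodiment.

```python
def reached_stop_condition(loss: float, prev_loss: float, num_updates: int,
                           first_threshold: float = 0.05,
                           second_threshold: float = 1e-4,
                           third_threshold: int = 100_000) -> bool:
    """Checks the three example stop conditions of S104; the default
    thresholds are placeholders, not values from this embodiment."""
    if loss < first_threshold:                    # loss low enough
        return True
    if abs(prev_loss - loss) < second_threshold:  # loss has converged
        return True
    if num_updates >= third_threshold:            # update budget reached
        return True
    return False
```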
In the embodiment of the application, if the intention understanding model of the current round reaches the preset stop condition, the intention understanding model of the current round has high intention understanding performance, so the training process can be ended directly and the current intention understanding model can be stored, so that the stored intention understanding model can subsequently be used to perform intention understanding on user sentences in the target language; if the intention understanding model of the current round does not reach the preset stop condition, its intention understanding performance is still low, so the intention understanding model can be updated according to the model prediction loss of the intention understanding model, and the intention understanding performance of the updated model can then be tested again with the target language training data and the auxiliary language training data.
S105: the intention understanding model is updated based on the model prediction loss of the intention understanding model, and execution returns to S102.
It should be noted that the embodiment of the present application does not limit the update process of the intention understanding model, which may be implemented by any existing or future model update method.
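For illustration, one common realization of S105 is a single gradient step, as sketched below; the stand-in classifier and the optimizer choice are assumptions, since this embodiment does not prescribe any particular update method.

```python
import torch

# Stand-in intent classifier and optimizer; both are illustrative
# assumptions, not components prescribed by this embodiment.
model = torch.nn.Linear(768, 20)  # maps sentence features to 20 intents
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def update_model(model_loss: torch.Tensor) -> None:
    """One realization of S105: update the model from the model
    prediction loss, after which training returns to S102."""
    optimizer.zero_grad()
    model_loss.backward()  # back-propagate the model prediction loss
    optimizer.step()       # apply the gradient update
```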
Based on the relevant contents of S101 to S105, in the intention understanding model training method provided in the present application, after target language training data and auxiliary language training data are acquired, the target language training data and the auxiliary language training data are input into an intention understanding model, a predicted intention corresponding to the target language training data and a predicted intention corresponding to the auxiliary language training data output by the intention understanding model are obtained, and a model prediction loss of the intention understanding model is determined according to the predicted intention corresponding to the target language training data and the predicted intention corresponding to the auxiliary language training data; and updating the intention understanding model according to the model prediction loss, and returning to execute the step of inputting the target language training data and the auxiliary language training data into the intention understanding model and the subsequent steps until a preset stop condition is reached.
It can be seen that, since the intention understanding model is trained based on the target language training data, the intention understanding model can accurately recognize the user intention described by a user sentence in the target language, so that the user intention can be accurately understood from user sentences in the target language (e.g., a language with a small use range, such as a local dialect or a minority language) based on the intention understanding model. Moreover, because the difference between the auxiliary language and the target language is small, the auxiliary language data can be used to expand the training data corresponding to the intention understanding model, thereby avoiding the adverse effects caused by having little training data for the intention understanding model. As a result, the intention understanding model trained based on both the target language training data and the auxiliary language training data has better intention understanding performance, so that the trained intention understanding model can accurately understand the user intention from user sentences, especially user sentences in languages with a small use range such as local dialects and minority languages.
Based on the relevant contents of the above-described intention understanding model training method, since the trained intention understanding model can accurately predict the intention of a user sentence in the target language, after the intention understanding model is trained by using any one of the embodiments of the above-described intention understanding model training method, the intention of user sentences in the target language can be understood by using the trained intention understanding model. Based on this, the embodiment of the present application also provides an intention understanding method, which is described below in conjunction with Method Embodiment Two.
Method embodiment two
Referring to fig. 6, a flowchart of an intent understanding method provided by an embodiment of the present application is shown.
The intention understanding method provided by the embodiment of the application comprises the following steps of S601-S602:
S601: and acquiring data to be understood in the target language.
The target language to-be-understood data refers to target language statements which need to be understood intently. In addition, the embodiment of the application does not limit the acquisition process of the data to be understood by the target language. For ease of understanding, the following description is made in connection with two examples.
Example 1, S601 may specifically be: and determining the data to be understood in the target language according to the target language text data input by the user. The target language text data refers to a target language sentence input by a user through a text input mode (for example, typing in a text box).
Therefore, after the target language text data input by the user in the text input mode is acquired, the target language to-be-understood data can be determined according to the target language text data. For example, the target language text data may be directly determined as the target language to-be-understood data. For another example, the target language text data may be first processed, and then the target language text data after the first processing is determined as the target language data to be understood; the first process may be preset, and the embodiment of the present application does not limit the first process (for example, the first process may include a wrong word correction process, etc.).
Example 2, S601 may specifically include S6011-S6013:
S6011: and acquiring target language speech data input by a user.

The target language speech data refers to a target language sentence input by a user in a speech input mode. It should be noted that, in the embodiment of the present application, the acquisition process of the target language speech data is not limited, and may be implemented by using any existing or future speech acquisition mode.

S6012: and carrying out speech recognition on the target language speech data to obtain text data corresponding to the target language speech data.

The text data corresponding to the target language speech data is the speech recognition result of the target language speech data, and it includes the text information recorded in the target language speech data.

The present embodiment is not limited to any particular implementation of speech recognition, which may be implemented by any existing or future speech recognition method.

S6013: and determining the data to be understood by the target language according to the text data corresponding to the target language speech data.

In the embodiment of the application, after the text data corresponding to the target language speech data is obtained, the data to be understood by the target language can be determined according to that text data. For example, the text data corresponding to the target language speech data may be directly determined as the target language to-be-understood data. For another example, the text data corresponding to the target language speech data may first be subjected to a second processing, and the text data after the second processing may then be determined as the data to be understood by the target language. The second processing may be preset, and the embodiment of the present application does not limit the second processing (for example, the second processing may include a wrong word correction processing, etc.).
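For ease of understanding only, S6011 to S6013 could be assembled into a pipeline as sketched below; speech_to_text and correct_typos are hypothetical helper names standing in for any speech recognition engine and any second processing, neither of which is prescribed by this embodiment.

```python
def speech_to_text(audio_bytes: bytes) -> str:
    """Hypothetical stand-in for any existing or future ASR system."""
    raise NotImplementedError("plug in a speech recognition engine here")

def correct_typos(text: str) -> str:
    """Hypothetical stand-in for the optional second processing,
    e.g., wrong word correction."""
    return text

def get_data_to_understand(audio_bytes: bytes) -> str:
    text = speech_to_text(audio_bytes)  # S6012: speech recognition
    return correct_typos(text)          # S6013: determine data to be understood
```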
S602: and inputting the target language data to be understood into the intention understanding model to obtain the prediction intention corresponding to the target language data to be understood output by the intention understanding model.
The intention understanding model is trained by any implementation of the intention understanding model training method provided in the embodiments of the present application. In addition, for the relevant contents of the intention understanding model, please refer to Method Embodiment One above.
Based on the relevant contents of S601 to S602, after the target language to-be-understood data is obtained, the target language to-be-understood data may be input to the intention understanding model obtained by training according to any embodiment of the intention understanding model training method provided above, so that the intention understanding model can perform intention understanding on the target language to-be-understood data, and a predicted intention corresponding to the target language to-be-understood data output by the intention understanding model is obtained. The intention understanding model can accurately identify the user intention described by the user statement in the target language, so that the predicted intention corresponding to the data to be understood in the target language, which is predicted by the intention understanding model, can accurately represent the user intention described by the data to be understood in the target language, and the intention understanding accuracy of the user statement in the target language can be effectively improved.
In order to facilitate understanding of the above intent to understand the model training method and intent to understand the method, the following description is made in conjunction with the scenario embodiments.
Scene embodiment
Assuming that the target language is Cantonese and the auxiliary language is Mandarin; the application domain of the intention understanding model (i.e., the above preset application domain) is the navigation technical field; the candidate intention data is navigation intention data, and the navigation intention data is used for describing user intentions in the navigation technical field. The number of pieces of navigation intention data is not limited in the embodiments of the present application.
Based on the above assumptions, the intention understanding model can be used to perform intention understanding on Cantonese sentences; moreover, the training process of the intention understanding model may specifically include steps 60 to 69:

Step 60: and acquiring Cantonese real data, the actual intention of the Cantonese real data, Mandarin real data, the actual intention of the Mandarin real data, and navigation intention data.

Wherein the actual intention of the Cantonese real data is used as the label information of the Cantonese real data, and the actual intention of the Mandarin real data is used as the label information of the Mandarin real data.

The embodiments of the present application do not limit the process of acquiring the Cantonese real data and its actual intention; for example, they may be legally crawled from web pages or read from historical Cantonese dialogues stored in a human-computer interaction device. Similarly, the embodiment of the application does not limit the acquisition process of the Mandarin real data and its actual intention; for example, they may be legally crawled from web pages or read from historical Mandarin dialogues stored in a human-computer interaction device.
It should be further noted that, the embodiment of the present application does not limit the acquisition process of the navigation intention data, and for example, the navigation intention data may be generated according to a legally crawled navigation intention in a web page, or may be generated according to a historical navigation intention stored in a preset navigation application.
Step 61: and performing Mandarin-to-Cantonese translation on the Mandarin real data to obtain Cantonese translation data.

Mandarin-to-Cantonese translation refers to translating a Mandarin sentence into a Cantonese sentence. The present embodiment does not limit how this translation is implemented; any method capable of translating a Mandarin sentence into a Cantonese sentence may be used. For example, the translation may be performed with a pre-constructed Mandarin-Cantonese vocabulary mapping relation, which records the mapping relation between each Mandarin vocabulary and the corresponding Cantonese vocabulary. For another example, the translation may be performed with a pre-constructed language translation model having a Mandarin-to-Cantonese translation function.
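For illustration, the vocabulary-mapping approach could look like the following toy sketch; the two-entry mapping is purely illustrative, whereas a real Mandarin-Cantonese mapping relation would be pre-constructed and far larger.

```python
# Illustrative Mandarin-to-Cantonese vocabulary mapping; a real mapping
# relation would be built in advance and cover much more vocabulary.
MANDARIN_TO_CANTONESE = {
    "哪里": "邊度",  # "where"
    "没有": "冇",    # "do not have"
}

def translate_mandarin_sentence(sentence: str) -> str:
    """Replace each mapped Mandarin vocabulary with its Cantonese
    counterpart; words without a mapping are kept as-is."""
    for mandarin_word, cantonese_word in MANDARIN_TO_CANTONESE.items():
        sentence = sentence.replace(mandarin_word, cantonese_word)
    return sentence
```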
Step 62: and inputting the navigation intention data into a pre-constructed Cantonese data generation model to obtain Cantonese generation data.

The Cantonese data generation model can generate Cantonese generation data according to the navigation intention data, and the training process of the Cantonese data generation model is similar to that of the target language data generation model.

Step 63: inputting the Cantonese real data, Mandarin real data, Cantonese translation data and Cantonese generation data into the intention understanding model to obtain the prediction intention corresponding to the Cantonese real data, the prediction intention corresponding to the Mandarin real data, the prediction intention corresponding to the Cantonese translation data and the prediction intention corresponding to the Cantonese generation data output by the intention understanding model.
Step 64: and determining the cross entropy between the predicted intention corresponding to the Cantonese real data and the actual intention of the Cantonese real data as the prediction loss corresponding to the Cantonese real data.

Step 65: and determining the cross entropy between the predicted intention corresponding to the Mandarin real data and the actual intention of the Mandarin real data as the prediction loss corresponding to the Mandarin real data.

Step 66: and determining the KL divergence between the predicted intention corresponding to the Cantonese translation data and the predicted intention corresponding to the Mandarin real data as the prediction loss corresponding to the Cantonese translation data.

Step 67: and determining the cross entropy between the predicted intention corresponding to the Cantonese generation data and the intention described by the navigation intention data as the prediction loss corresponding to the Cantonese generation data.

Step 68: and carrying out weighted summation on the prediction loss corresponding to the Cantonese real data, the prediction loss corresponding to the Mandarin real data, the prediction loss corresponding to the Cantonese translation data and the prediction loss corresponding to the Cantonese generation data to obtain the model prediction loss of the intention understanding model.
That is, step 68 may be calculated using equation (8).
$$\mathrm{loss}_{model} = \alpha\,\mathrm{loss}_{real}^{yue} + \beta\,\mathrm{loss}_{real}^{pu} + \gamma\,\mathrm{loss}_{trans}^{yue} + \delta\,\mathrm{loss}_{gen}^{yue} \tag{8}$$

In the formula, $\mathrm{loss}_{model}$ is the model prediction loss of the intention understanding model; $\mathrm{loss}_{real}^{yue}$ is the prediction loss corresponding to the Cantonese real data and $\alpha$ is its weight; $\mathrm{loss}_{real}^{pu}$ is the prediction loss corresponding to the Mandarin real data and $\beta$ is its weight; $\mathrm{loss}_{trans}^{yue}$ is the prediction loss corresponding to the Cantonese translation data and $\gamma$ is its weight; $\mathrm{loss}_{gen}^{yue}$ is the prediction loss corresponding to the Cantonese generation data and $\delta$ is its weight.
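For ease of understanding only, formula (8) could be realized as the following helper; the default weight values are placeholders, since the embodiment only states that the weights are preset according to the application scenario.

```python
def model_prediction_loss(loss_yue_real, loss_pu_real,
                          loss_yue_trans, loss_yue_gen,
                          alpha=1.0, beta=1.0, gamma=0.5, delta=0.5):
    """Weighted summation of the four scenario losses per formula (8);
    the default weights are placeholders, not prescribed values."""
    return (alpha * loss_yue_real + beta * loss_pu_real
            + gamma * loss_yue_trans + delta * loss_yue_gen)
```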
Step 69: judging whether a preset stopping condition is reached, if so, ending the training process of the intention understanding model; if not, the intention understanding model is updated according to the model prediction loss of the intention understanding model, and the step 63 is returned to and executed.
As can be seen from the above-mentioned related contents of steps 60 to 69, in the embodiment of the present application, the Cantonese real data having label information, the Mandarin real data having label information, and the navigation intention data may be combined to generate the training data required for training the intention understanding model (for example, Cantonese real data, Mandarin real data, Cantonese translation data, and Cantonese generation data), and the intention understanding model may then be trained based on that training data.

Because the difference between Mandarin sentences and Cantonese sentences is small, the difference between the Mandarin real data formed by Mandarin sentences and the Cantonese real data formed by Cantonese sentences is also small, so the Mandarin real data and the Cantonese translation data generated from the Mandarin real data can be used to expand the training data corresponding to the intention understanding model; this effectively avoids the adverse effects caused by having little training data and helps improve the intention understanding performance of the intention understanding model. In addition, the navigation intention data can accurately describe user intentions in the application field of the intention understanding model, so the Cantonese generation data generated from the navigation intention data can more accurately represent user intentions in that application field; an intention understanding model trained with the Cantonese generation data can therefore better predict user intentions in the application field, which also helps improve the intention understanding performance of the intention understanding model.
Based on the above assumptions and the above training process of the intention understanding model, the use process of the intention understanding model includes steps 71 to 72:
step 71: and acquiring data to be understood in cantonese.
The data to be understood in cantonese is a cantonese sentence that is required to be understood with intent. In addition, the process of acquiring data to be understood in cantonese is similar to that of acquiring data to be understood in the target language above.
Step 72: inputting the cantonese data to be understood into the intention understanding model trained in the steps 60 to 69, and obtaining the predicted intention corresponding to the cantonese data to be understood output by the intention understanding model.
Based on the above-mentioned contents of steps 71 to 72, after the intention understanding model is obtained by training in steps 60 to 69, the cantonese data to be understood can be directly input to the intention understanding model, so that the intention understanding model can accurately determine the user intention described by the cantonese data to be understood, thereby improving the understanding accuracy of the cantonese user intention.
Based on the intention understanding model training method provided by the above method embodiment, the embodiment of the present application further provides an intention understanding model training device, which is explained and explained below with reference to the accompanying drawings.
Apparatus embodiment one
This device embodiment introduces an intention understanding model training device; for related contents, please refer to the above method embodiments.
Referring to fig. 8, the drawing is a schematic structural diagram of an intention understanding model training apparatus provided in an embodiment of the present application.
The intention understanding model training device 800 provided by the embodiment of the application comprises:
a first acquiring unit 801 configured to acquire target language training data and auxiliary language training data;
a first prediction unit 802, configured to input the target language training data and the auxiliary language training data into an intention understanding model, so as to obtain a prediction intention corresponding to the target language training data and a prediction intention corresponding to the auxiliary language training data that are output by the intention understanding model;
a loss determining unit 803, configured to determine a model prediction loss of the intention understanding model according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data;
a model updating unit 804, configured to update the intention understanding model according to the model prediction loss of the intention understanding model, and return to the first predicting unit 802 to continue to perform the step of inputting the target language training data and the auxiliary language training data into the intention understanding model until a preset stop condition is reached.
In one possible implementation, the target language training data includes at least one of target language real data, target language translation data, and target language generation data; the target language translation data is obtained by translating the auxiliary language real data; the target language generation data is generated from candidate intent data.
In a possible implementation manner, the obtaining process of the target language generation data is as follows:
inputting the candidate intention data into a pre-constructed target language data generation model to obtain target language generation data output by the target language data generation model; the target language data generation model is obtained by training by using target language labeling data and auxiliary language labeling data.
In one possible implementation, the building process of the target language data generation model includes:
training a model to be trained by using the auxiliary language labeling data to obtain an auxiliary language data generation model;
and training the auxiliary language data generation model by using the target language marking data to obtain the target language data generation model.
In a possible implementation manner, the obtaining process of the target language translation data is as follows:
determining a target language vocabulary corresponding to the vocabulary to be translated according to a preset vocabulary mapping relation and the vocabulary to be translated in the auxiliary language real data, and determining the target language translation data according to the auxiliary language real data and the target language vocabulary corresponding to the vocabulary to be translated; the preset vocabulary mapping relation comprises a corresponding relation between the vocabulary to be translated and a target language vocabulary corresponding to the vocabulary to be translated;
alternatively,
inputting the auxiliary language real data into a pre-constructed language translation model to obtain a translation result corresponding to the auxiliary language real data output by the language translation model, and determining the target language translation data according to the translation result corresponding to the auxiliary language real data.
In a possible implementation, the loss determining unit 803 includes:
the first determining subunit is used for determining the prediction loss of the target language according to the prediction intention corresponding to the target language training data;
the second determining subunit is used for determining the auxiliary language prediction loss according to the prediction intention corresponding to the auxiliary language training data;
a third determining subunit, configured to determine a model prediction loss of the intent understanding model according to the target language prediction loss and the auxiliary language prediction loss.
In one possible implementation, when the target language training data includes target language real data, target language translation data, and target language generation data, and the predicted intention corresponding to the target language training data includes a predicted intention of the target language real data, a predicted intention of the target language translation data, and a predicted intention of the target language generation data, the intention understanding model training device 800 further includes:
a third acquisition unit configured to acquire an actual intention of the target language real data, a reference intention of the target language translation data, and an actual intention of the target language generation data;
the first determining subunit is specifically configured to: determining a prediction loss corresponding to the target language real data according to the prediction intention of the target language real data and the actual intention of the target language real data; determining a prediction loss corresponding to the target language translation data according to the prediction intention of the target language translation data and the reference intention of the target language translation data; determining a prediction loss corresponding to the target language generation data according to the actual intention of the target language generation data and the actual intention of the target language generation data; and determining the prediction loss of the target language according to the prediction loss corresponding to the real data of the target language, the prediction loss corresponding to the translation data of the target language and the prediction loss corresponding to the generation data of the target language.
Based on the intention understanding method provided by the above method embodiment, the embodiment of the present application also provides an intention understanding device, which is explained and explained below with reference to the accompanying drawings.
Device embodiment II
This device embodiment introduces an intention understanding device; for related contents, please refer to the above method embodiments.
Referring to fig. 9, the figure is a schematic structural diagram of an intended understanding apparatus provided in the embodiment of the present application.
The intention understanding apparatus 900 provided by the embodiment of the present application includes:
a second obtaining unit 901, configured to obtain data to be understood by a target language;
a second prediction unit 902, configured to input the target language to-be-understood data into the intention understanding model, and obtain a prediction intention corresponding to the target language to-be-understood data output by the intention understanding model; the intention understanding model is trained by any implementation mode of the intention understanding model training method provided by the embodiment of the application.
Further, an intention understanding model training device is also provided in an embodiment of the present application, including: a processor, a memory, a system bus;
the processor and the memory are connected through the system bus;
the memory is for storing one or more programs, the one or more programs including instructions, which when executed by the processor, cause the processor to perform any of the embodiments of the intended understanding model training method described above.
Further, an intention understanding apparatus is also provided in an embodiment of the present application, including: a processor, a memory, a system bus;
the processor and the memory are connected through the system bus;
the memory is for storing one or more programs, the one or more programs including instructions, which when executed by the processor, cause the processor to perform any of the embodiments of the intended understanding method described above.
Further, an embodiment of the present application also provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, and when the instructions are executed on a terminal device, the instructions cause the terminal device to perform any one of the above-mentioned method for training an intention understanding model, or perform any one of the above-mentioned method for understanding an intention.
Further, an embodiment of the present application also provides a computer program product, which when run on a terminal device, causes the terminal device to execute any embodiment of the above-mentioned intent understanding model training method, or execute any embodiment of the above-mentioned intent understanding method.
As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that all or part of the steps in the above embodiment methods can be implemented by software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network communication device such as a media gateway, etc.) to execute the method according to the embodiments or some parts of the embodiments of the present application.
It should be noted that, in the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for training an intent understanding model, the method comprising:
acquiring target language training data and auxiliary language training data;
inputting the target language training data and the auxiliary language training data into an intention understanding model to obtain a prediction intention corresponding to the target language training data and a prediction intention corresponding to the auxiliary language training data which are output by the intention understanding model;
determining model prediction loss of the intention understanding model according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data;
and updating the intention understanding model according to the model prediction loss of the intention understanding model, and continuously executing the step of inputting the target language training data and the auxiliary language training data into the intention understanding model until a preset stop condition is reached.
2. The method of claim 1, wherein the target language training data comprises at least one of target language real data, target language translation data, and target language generation data; the target language translation data is obtained by translating the auxiliary language real data; the target language generation data is generated from candidate intent data.
3. The method of claim 2, wherein the target language generation data is obtained by:
inputting the candidate intention data into a pre-constructed target language data generation model to obtain target language generation data output by the target language data generation model; the target language data generation model is obtained by training by using target language labeling data and auxiliary language labeling data.
4. The method of claim 3, wherein the building of the target language data generation model comprises:
training a model to be trained by using the auxiliary language labeling data to obtain an auxiliary language data generation model;
and training the auxiliary language data generation model by using the target language marking data to obtain the target language data generation model.
5. The method of claim 2, wherein the target language translation data is obtained by:
determining a target language vocabulary corresponding to the vocabulary to be translated according to a preset vocabulary mapping relation and the vocabulary to be translated in the auxiliary language real data, and determining the target language translation data according to the auxiliary language real data and the target language vocabulary corresponding to the vocabulary to be translated; the preset vocabulary mapping relation comprises a corresponding relation between the vocabulary to be translated and a target language vocabulary corresponding to the vocabulary to be translated;
alternatively,
inputting the auxiliary language real data into a pre-constructed language translation model to obtain a translation result corresponding to the auxiliary language real data output by the language translation model, and determining the target language translation data according to the translation result corresponding to the auxiliary language real data.
6. The method of claim 1, wherein determining a model prediction loss of the intent understanding model based on the predicted intent corresponding to the target language training data and the predicted intent corresponding to the auxiliary language training data comprises:
determining the prediction loss of the target language according to the prediction intention corresponding to the training data of the target language;
determining the prediction loss of the auxiliary language according to the prediction intention corresponding to the auxiliary language training data;
determining a model prediction loss of the intent understanding model based on the target language prediction loss and the auxiliary language prediction loss.
7. The method of claim 6, wherein when the target language training data comprises target language real data, target language translation data, and target language generation data, and the predicted intent corresponding to the target language training data comprises a predicted intent of the target language real data, a predicted intent of the target language translation data, and a predicted intent of the target language generation data, the method further comprises:
acquiring an actual intention of target language real data, a reference intention of target language translation data and an actual intention of target language generation data;
determining a target language prediction loss according to the prediction intention corresponding to the target language training data, wherein the determining comprises the following steps:
determining a prediction loss corresponding to the target language real data according to the prediction intention of the target language real data and the actual intention of the target language real data;
determining a prediction loss corresponding to the target language translation data according to the prediction intention of the target language translation data and the reference intention of the target language translation data;
determining a prediction loss corresponding to the target language generation data according to the predicted intention of the target language generation data and the actual intention of the target language generation data;
and determining the prediction loss of the target language according to the prediction loss corresponding to the real data of the target language, the prediction loss corresponding to the translation data of the target language and the prediction loss corresponding to the generation data of the target language.
8. A method for intent understanding, the method comprising:
acquiring data to be understood of a target language;
inputting the target language data to be understood into the intention understanding model to obtain a prediction intention corresponding to the target language data to be understood output by the intention understanding model; wherein the intention understanding model is trained by the intention understanding model training method of any one of claims 1 to 7.
9. An intent understanding model training apparatus, characterized in that the apparatus comprises:
a first acquisition unit configured to acquire target language training data and auxiliary language training data;
a first prediction unit, configured to input the target language training data and the auxiliary language training data into an intention understanding model, and obtain a prediction intention corresponding to the target language training data and a prediction intention corresponding to the auxiliary language training data that are output by the intention understanding model;
a loss determination unit, configured to determine a model prediction loss of the intention understanding model according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data;
and the model updating unit is used for updating the intention understanding model according to the model prediction loss of the intention understanding model and returning to the first prediction unit to continue executing the step of inputting the target language training data and the auxiliary language training data into the intention understanding model until a preset stop condition is reached.
10. An intent understanding apparatus, characterized in that the apparatus comprises:
the second acquisition unit is used for acquiring the data to be understood by the target language;
the second prediction unit is used for inputting the target language data to be understood into the intention understanding model to obtain a prediction intention corresponding to the target language data to be understood output by the intention understanding model; wherein the intention understanding model is trained by the intention understanding model training method of any one of claims 1 to 7.
CN202011500085.1A 2020-12-17 2020-12-17 Method and device for training intention understanding model, and method and device for intention understanding Active CN112528679B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011500085.1A CN112528679B (en) 2020-12-17 2020-12-17 Method and device for training intention understanding model, and method and device for intention understanding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011500085.1A CN112528679B (en) 2020-12-17 2020-12-17 Method and device for training intention understanding model, and method and device for intention understanding

Publications (2)

Publication Number Publication Date
CN112528679A true CN112528679A (en) 2021-03-19
CN112528679B CN112528679B (en) 2024-02-13

Family

ID=75001363

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011500085.1A Active CN112528679B (en) 2020-12-17 2020-12-17 Method and device for training intention understanding model, and method and device for intention understanding

Country Status (1)

Country Link
CN (1) CN112528679B (en)



Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120197627A1 (en) * 2010-02-22 2012-08-02 Lei Shi Bootstrapping Text Classifiers By Language Adaptation
CN108172218A (en) * 2016-12-05 2018-06-15 ***通信有限公司研究院 A kind of pronunciation modeling method and device
CN108920622A (en) * 2018-06-29 2018-11-30 北京奇艺世纪科技有限公司 A kind of training method of intention assessment, training device and identification device
CN109543190A (en) * 2018-11-29 2019-03-29 北京羽扇智信息科技有限公司 A kind of intension recognizing method, device, equipment and storage medium
WO2020177282A1 (en) * 2019-03-01 2020-09-10 平安科技(深圳)有限公司 Machine dialogue method and apparatus, computer device, and storage medium
CN109785824A (en) * 2019-03-15 2019-05-21 科大讯飞股份有限公司 A kind of training method and device of voiced translation model
CN112016271A (en) * 2019-05-30 2020-12-01 北京三星通信技术研究有限公司 Language style conversion model training method, text processing method and device
CN111046674A (en) * 2019-12-20 2020-04-21 科大讯飞股份有限公司 Semantic understanding method and device, electronic equipment and storage medium
CN111368559A (en) * 2020-02-28 2020-07-03 北京字节跳动网络技术有限公司 Voice translation method and device, electronic equipment and storage medium
CN111931517A (en) * 2020-08-26 2020-11-13 腾讯科技(深圳)有限公司 Text translation method and device, electronic equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JULIA KISELEVA; HOANG THANH LAM; MYKOLA PECHENIZKIY; TOON CALDERS: "Predicting Current User Intent with Contextual Markov Models", 2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS *
LU JINLIANG; ZHANG JIAJUN: "Translation Quality Estimation Method Based on Multilingual Pre-trained Language Models", JOURNAL OF XIAMEN UNIVERSITY (NATURAL SCIENCE), no. 02 *
CHEN ZHENSHUN; LIU JIANMING: "Intention-based Neural Network Dialogue Model", JOURNAL OF GUILIN UNIVERSITY OF ELECTRONIC TECHNOLOGY, no. 05 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113223502A (en) * 2021-04-28 2021-08-06 平安科技(深圳)有限公司 Speech recognition system optimization method, device, equipment and readable storage medium
CN113223502B (en) * 2021-04-28 2024-01-30 平安科技(深圳)有限公司 Speech recognition system optimization method, device, equipment and readable storage medium

Similar Documents

Publication Publication Date Title
US11238845B2 (en) Multi-dialect and multilingual speech recognition
CN107590135B (en) Automatic translation method, device and system
CN109979432B (en) Dialect translation method and device
JP5901001B1 (en) Method and device for acoustic language model training
US20170206897A1 (en) Analyzing textual data
JP5167546B2 (en) Sentence search method, sentence search device, computer program, recording medium, and document storage device
CN114580382A (en) Text error correction method and device
CN108536807B (en) Information processing method and device
CN111967264B (en) Named entity identification method
CN111462748B (en) Speech recognition processing method and device, electronic equipment and storage medium
Zhu et al. CATSLU: The 1st Chinese audio-textual spoken language understanding challenge
CN115273815A (en) Method, device and equipment for detecting voice keywords and storage medium
Rajendran et al. Language dialect based speech emotion recognition through deep learning techniques
Ghannay et al. Where are we in semantic concept extraction for Spoken Language Understanding?
US11615787B2 (en) Dialogue system and method of controlling the same
CN112633007B (en) Semantic understanding model construction method and device and semantic understanding method and device
US11551012B2 (en) Apparatus and method for providing personal assistant service based on automatic translation
CN112528679B (en) Method and device for training intention understanding model, and method and device for intention understanding
CN113192534A (en) Address search method and device, electronic equipment and storage medium
Domokos et al. Romanian phonetic transcription dictionary for speeding up language technology development
CN111489742B (en) Acoustic model training method, voice recognition device and electronic equipment
KR102519619B1 (en) Apparatus and method for providing personal assistant service based on automatic translation
Alphonso et al. Ranking approach to compact text representation for personal digital assistants
Sharan et al. ASR for Speech based Search in Hindi using Attention based Model
US20220215834A1 (en) System and method for speech to text conversion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant