CN114757176A - Method for obtaining target intention recognition model and intention recognition method - Google Patents

Method for obtaining target intention recognition model and intention recognition method

Info

Publication number
CN114757176A
CN114757176A (application CN202210571180.3A; granted as CN114757176B)
Authority
CN
China
Prior art keywords
training
target
original
sentence
ith
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210571180.3A
Other languages
Chinese (zh)
Other versions
CN114757176B (en)
Inventor
吴鹏劼
胡景超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Hongji Information Technology Co Ltd
Original Assignee
Shanghai Hongji Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Hongji Information Technology Co Ltd
Priority to CN202210571180.3A
Publication of CN114757176A
Application granted
Publication of CN114757176B
Status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F40/10 - Text processing
    • G06F40/166 - Editing, e.g. inserting or deleting
    • G06F40/186 - Templates
    • G06F40/30 - Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The embodiments of the present application provide a method for obtaining an intention recognition model and an intention recognition method. The method comprises the following steps: acquiring an original training text set, wherein the original training text set comprises training data corresponding to each of at least two intention types, and the training data comprises a plurality of original training sentences and an original annotation label corresponding to each original training sentence; obtaining a target training text set from the data in the original training text set, wherein each target training sentence in the target training text set is obtained by masking an original annotation label; and fine-tuning a pre-trained masked language model Bert based on the data in the target training text set to obtain a target intention recognition model. According to the embodiments of the present application, an intention recognition model with better generalization and robustness can be trained under conditions of small samples and unbalanced data.

Description

Method for obtaining target intention recognition model and intention recognition method
Technical Field
The application relates to the technical field of natural language processing, in particular to a method for obtaining a target intention recognition model and an intention recognition method.
Background
Taking a business inquiry robot as an example, the conversation process of a conversation robot is as follows: a customer's question Q is obtained; the question is abstracted into the customer's target intention (intent) by an intention recognition module; the target intention is passed to the corresponding interaction module to obtain the answer sentence most relevant to the customer's question; and the answer sentence is provided to the customer. It can be understood that user intention recognition is the core of current conversation robots, since only when the intention is clear can the corresponding interaction module be found and a targeted answer be given by that module.
A target intention recognition model running in an existing intention recognition module needs to be trained on data in an original training text set to acquire intention recognition capability, and a multi-functional (i.e., multi-intention-type) conversation robot faces the problem that the training data collected for different intention types are unbalanced. For example, suppose a chatting function is added to a customer service robot that originally only answered frequently asked questions (FAQ, Frequently Asked Questions). Because the numbers of FAQ and chatting samples differ greatly (that is, the scale of the training data for recognizing chatting is far larger than the scale of the training data for recognizing FAQ), the trained intention recognition model tends to overfit: its predictions are biased toward judging user input as a chatting intention, so questions that should be classified as FAQ are treated as chatting, which seriously degrades the user experience.
Disclosure of Invention
The embodiments of the present application aim to provide a method for obtaining a target intention recognition model and an intention recognition method, by which an intention recognition model with better generalization and robustness can be trained under conditions of small samples and unbalanced training data.
In a first aspect, some embodiments of the present application provide a method of obtaining a target intention recognition model, the method comprising: acquiring an original training text set, wherein the original training text set comprises training data corresponding to each of at least two intention types, the training data comprises a plurality of original training sentences and original annotation labels corresponding to the original training sentences, and the original annotation labels represent the real intentions of the corresponding original training sentences; obtaining a target training text set from the data in the original training text set, wherein each target training sentence in the target training text set is obtained by masking an original annotation label; and fine-tuning a pre-trained masked language model Bert based on the data in the target training text set to obtain a target intention recognition model.
According to some embodiments of the present application, the pre-trained masked language model is fine-tuned with target training sentences that contain masked parts, so that the target intention recognition model obtained after fine-tuning has the ability to infer the masked original annotation labels. An intention recognition model with better generalization and robustness can therefore be trained under conditions of small samples and unbalanced training data.
In some embodiments, obtaining the target training text set from the data in the original training text set comprises: extracting all original training sentences corresponding to the various intention types from the original training text set, and obtaining the data in the target training text set from all the original training sentences.
In some embodiments, obtaining the target training text set from the data in the original training text set comprises: acquiring, from the original training text set, the ith original training sentence having the same intention type as the ith target training sentence, and acquiring the jth original annotation label corresponding to the ith original training sentence; acquiring a prompt template, wherein the prompt template comprises a prompt part and a blank part to be filled with content, and the prompt part together with the content filled into the blank part forms a sentence with complete semantics, complete semantics meaning that the sentence comprises a subject, a predicate and an object; and obtaining the ith target training sentence and the target annotation label from the prompt template and the ith original training sentence.
According to some embodiments of the present application, the target training sentences and target annotation labels are obtained from the prompt template and the content of the original training text set, so that the resulting data in the target training text set meet the input requirements of the pre-trained masked language model.
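As a loose illustration of such a prompt template (the template wording, function name and English rendering are assumptions for demonstration, not the patent's own notation), the prompt part and blank part can be modeled as a simple format string:

```python
# Illustrative sketch of a prompt template: a prompt part plus a blank part
# to be filled with content. The wording is an assumption, not the patent's.
PROMPT_PART = "The intention of this sentence is"

def fill_blank(label: str) -> str:
    """Fill the blank part with an annotation label, producing a clause
    with complete semantics (subject, predicate, object)."""
    return f"{PROMPT_PART} {label}"

filled = fill_blank("chat")
```

Filling the blank with the label "chat" yields "The intention of this sentence is chat", a semantically complete clause that can later be masked.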
In some embodiments, obtaining the ith target training sentence from the prompt template and the ith original training sentence comprises: filling the jth original annotation label into the blank part to obtain a text to be mixed in; masking the jth original annotation label included in the text to be mixed in to obtain a target mixed-in text; and obtaining the ith target training sentence from the target mixed-in text and the ith original training sentence.
Some embodiments of the present application thus obtain any target training sentence by masking the original annotation label filled into the prompt template. It can be understood that in these embodiments the target intention recognition model is obtained by training the pre-trained masked language model to recognize the masked original annotation label, which reduces the amount of training data required and improves the recognition performance of the target intention recognition model.
In some embodiments, obtaining the ith target training sentence from the prompt template and the ith original training sentence comprises: masking the blank part to obtain a target mixed-in text; and obtaining the ith target training sentence from the target mixed-in text and the ith original training sentence.
In some embodiments, obtaining the ith target training sentence from the target mixed-in text and the ith original training sentence comprises: using the target mixed-in text as a prefix or suffix of the ith original training sentence to obtain the ith target training sentence.
Some embodiments of the present application provide two methods of mixing the target mixed-in text (i.e., the prompt part plus the masked part) with the ith original training sentence, by either of which the corresponding target training sentence can be obtained.
In some embodiments, masking the jth original annotation label included in the text to be mixed in to obtain the target mixed-in text comprises: placing a masking text at the position of the jth original annotation label included in the text to be mixed in to obtain the target mixed-in text.
Some embodiments of the present application thus obtain the target mixed-in text by placing a masking text at the position of the jth original annotation label.
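A minimal sketch of this masking step, assuming Bert's "[MASK]" token as the masking text and one mask per character (an assumption mirroring character-level segmentation; all names are illustrative):

```python
# Sketch: place the masking text at the position of the label inside the
# text to be mixed in. One "[MASK]" per character is an assumption that
# mirrors character-level segmentation.
def cover_label(text_to_mix: str, label: str, mask_token: str = "[MASK]") -> str:
    return text_to_mix.replace(label, mask_token * len(label))

to_mix = "The intention of this sentence is chat"
target_mixed_in = cover_label(to_mix, "chat")
# -> "The intention of this sentence is [MASK][MASK][MASK][MASK]"
```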
In some embodiments, the ith target training sentence comprises the prompt part, a masking text and the ith original training sentence, wherein the masking text blocks either the jth original annotation label or the blank part.
In some embodiments of the present application, the ith target training sentence has the three-part structure above. Training the intention recognition model with such sentences makes full use of the inherent properties of the pre-trained masked language model, reduces the amount of training data required, and effectively mitigates the overfitting caused by imbalance among the original training data for different intention types.
In some embodiments, obtaining the target annotation label from the prompt template and the ith original training sentence comprises: filling the jth original annotation label into the blank part to obtain a text to be mixed in; and using the text to be mixed in as a prefix or suffix of the ith original training sentence to obtain the target annotation label.
Some embodiments of the present application provide two methods of mixing the text to be mixed in with the ith original training sentence.
In some embodiments, the target annotation label comprises: the prompt part, the jth original annotation label and the ith original training sentence.
Some embodiments of the present application thus construct the ith target annotation label from the prompt part, the jth original annotation label and the ith original training sentence.
In some embodiments, if the target mixed-in text is a prefix of the ith original training sentence, the ith target training sentence is: the prompt part + the masking text + the ith original training sentence, and the target annotation label is: the prompt part + the jth original annotation label + the ith original training sentence.
Some embodiments of the present application thus provide a concrete structure for mixing the text to be mixed in with the ith original training sentence as a prefix to obtain the target training sentence and the target annotation label.
In some embodiments, the at least two intention types comprise a first intention and a second intention, the training data corresponding to the first intention is first training data, the training data corresponding to the second intention is second training data, the first training data comprises a first original training sentence whose original annotation label is a first word, and the second training data comprises a second original training sentence whose original annotation label is a second word. The first target training sentence included in the target training text set is: the prompt part + the masking text + the first original training sentence, and its corresponding target annotation label is: the prompt part + the first word + the first original training sentence. The second target training sentence included in the target training text set is: the prompt part + the masking text + the second original training sentence, and its corresponding target annotation label is: the prompt part + the second word + the second original training sentence.
In some embodiments of the present application, if the target mixed-in text is a suffix of the ith original training sentence, the ith target training sentence is: the ith original training sentence + the prompt part + the masking text, and the target annotation label is: the ith original training sentence + the prompt part + the jth original annotation label.
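The prefix and suffix constructions described above can be sketched as follows (a toy illustration; the function name, separator handling and example strings are assumptions, not the patent's notation):

```python
# Build the i-th target training sentence and target annotation label from
# the prompt part, a masking text and the i-th original sentence, in either
# prefix or suffix form.
MASK_TEXT = "[MASK]"

def build_pair(prompt: str, label: str, original: str, prefix: bool = True):
    if prefix:
        sentence = prompt + MASK_TEXT + original       # prompt + mask + original
        target = prompt + label + original             # prompt + label + original
    else:
        sentence = original + prompt + MASK_TEXT       # original + prompt + mask
        target = original + prompt + label             # original + prompt + label
    return sentence, target

s, t = build_pair("Intention:", "chat", "I am not happy today", prefix=True)
# s == "Intention:[MASK]I am not happy today"
# t == "Intention:chatI am not happy today"
```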
In some embodiments of the present application, the target training text set comprises: a plurality of target training sentences and a target annotation label corresponding to each target training sentence. Fine-tuning the pre-trained masked language model Bert based on the data in the target training text set comprises: loading a tokenizer; segmenting the target annotation labels and the target training sentences to character level with the tokenizer to obtain character sequences, wherein each target training sentence and each target annotation label corresponds to one character sequence; and fine-tuning the pre-trained masked language model Bert according to the character sequences.
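A rough stand-in for the character-level segmentation step (a real implementation would use Bert's own tokenizer; keeping the mask token whole during segmentation is an assumption):

```python
# Character-level segmentation sketch: split a sentence into single
# characters while keeping special tokens such as "[MASK]" intact.
def char_tokenize(sentence: str):
    tokens, i = [], 0
    while i < len(sentence):
        if sentence.startswith("[MASK]", i):
            tokens.append("[MASK]")
            i += len("[MASK]")
        else:
            tokens.append(sentence[i])
            i += 1
    return tokens

print(char_tokenize("hi[MASK]!"))  # ['h', 'i', '[MASK]', '!']
```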
In some embodiments of the present application, fine-tuning the pre-trained masked language model Bert according to the character sequences comprises: acquiring, according to the tokenizer's dictionary, an input ID sequence, a Token type sequence and a Token position sequence corresponding to each character sequence; and fine-tuning the pre-trained masked language model Bert according to the input ID sequence, the Token type sequence and the Token position sequence.
In some embodiments of the present application, fine-tuning the pre-trained masked language model Bert according to the input ID sequence, the Token type sequence and the Token position sequence comprises: inputting the input ID sequence, the Token type sequence and the Token position sequence into the pre-trained masked language model Bert and obtaining a prediction result, wherein the prediction result is a prediction of the content of the masked part; and obtaining a loss function value from the prediction result and back-propagating it to update the parameters of the pre-trained masked language model Bert.
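The sequence construction and loss computation can be sketched with a toy vocabulary (the dictionary, the logits and the masked-position cross-entropy below are invented for illustration; a real setup would use Bert's vocabulary and actual model outputs):

```python
import math

# Map characters to IDs via a toy dictionary and build the three sequences.
vocab = {"[PAD]": 0, "[MASK]": 1, "a": 2, "b": 3}
tokens = ["a", "[MASK]", "b"]
input_ids = [vocab[t] for t in tokens]        # [2, 1, 3]
token_type_ids = [0] * len(tokens)            # single-segment input
position_ids = list(range(len(tokens)))       # [0, 1, 2]

# Pretend the model emits these logits over the vocabulary at the masked
# position; the true (masked) token is "a" (id 2).
logits = [0.1, 0.2, 2.0, 0.3]
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]
loss = -math.log(probs[2])  # cross-entropy at the masked position
# Back-propagating this loss would update the model's parameters.
```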
In a second aspect, some embodiments of the present application provide a method of recognizing an intention type, the method comprising: obtaining a sentence to be recognized; and inputting the sentence to be recognized into the target intention recognition model obtained by the method of any embodiment of the first aspect, and obtaining, through the target intention recognition model, the target intention corresponding to the sentence to be recognized.
In a third aspect, some embodiments of the present application provide an apparatus for obtaining a target intention recognition model, the apparatus comprising: an original training text set obtaining module, configured to obtain an original training text set, wherein the original training text set comprises training data corresponding to each of at least two intention types, the training data comprises a plurality of original training sentences and original annotation labels corresponding to each original training sentence, and the original annotation labels represent the real intentions of the corresponding original training sentences; a target training text set obtaining module, configured to obtain a target training text set from the data in the original training text set, wherein each target training sentence in the target training text set is obtained by masking an original annotation label; and a training module, configured to fine-tune a pre-trained masked language model Bert based on the data in the target training text set to obtain a target intention recognition model.
In a fourth aspect, some embodiments of the present application provide an apparatus for identifying an intent type, the apparatus comprising: the sentence to be recognized acquiring module is configured to acquire a sentence to be recognized; and a target intention recognition model trained by the method according to any one of the embodiments of the first aspect, and configured to: and receiving the input sentence to be recognized, and acquiring a target intention corresponding to the sentence to be recognized.
In a fifth aspect, some embodiments of the present application provide a robot, comprising: an audio data acquisition unit configured to acquire a sentence to be recognized; an intention recognition unit configured to obtain a target intention type corresponding to the sentence to be recognized according to the sentence to be recognized and a target intention recognition model trained by any embodiment of the first aspect; at least one interaction module, wherein a target interaction module among the at least one interaction module is configured to obtain an output sentence corresponding to the sentence to be recognized according to the sentence to be recognized and the target intention type; and an output unit configured to provide the output sentence.
In a sixth aspect, some embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor, may implement the method as described in any of the embodiments of the first or second aspect.
In a seventh aspect, some embodiments of the present application provide an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, may implement the method according to any one of the first aspect or the second aspect.
In an eighth aspect, some embodiments of the present application provide a computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the method according to any of the embodiments of the first or second aspect.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered limiting of the scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a schematic view of a conversation process and composition of a conversation robot provided in the related art;
FIG. 2 is a diagram illustrating a process for obtaining an intention recognition model provided in the related art;
FIG. 3 is a flowchart of a method for obtaining an intention recognition model according to an embodiment of the present application;
FIG. 4 is a flowchart of a method for identifying intent provided by an embodiment of the present application;
FIG. 5 is a block diagram illustrating components of an apparatus for obtaining an intention recognition model according to an embodiment of the present application;
FIG. 6 is a block diagram of an apparatus for identifying intent, provided in an embodiment of the present application;
FIG. 7 is a schematic diagram of a robot according to an embodiment of the present disclosure;
fig. 8 is a schematic composition diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
The technical deficiencies of the related art are exemplarily explained below in conjunction with fig. 1 and 2.
Referring to fig. 1, fig. 1 is a schematic diagram of a conversation process between a conversation robot and a customer and of the internal composition of the conversation robot provided in the related art. The conversation robot 20 of fig. 1 contains an intention recognition module 100, a chatting mode processing module 200 (one kind of interaction module) and a question-and-answer mode processing module 300 (another kind of interaction module). The intention recognition module 100 performs intention recognition on a received sentence based on an intention recognition model, which is trained on training data in an original training text set to acquire that recognition capability. The chatting mode processing module 200 is configured to obtain an output sentence in chatting mode from the input data (i.e., the collected sentence to be recognized), and the question-and-answer mode processing module 300 is configured to obtain an output sentence in question-and-answer mode from the input data. It can be understood that only one of the two processing modules is active during any specific conversation. It should be noted that the original training text set comprises training data corresponding to each of at least two intention types, the training data comprises multiple original training sentences and original annotation labels corresponding to those sentences, and the original annotation labels represent the real intentions of the corresponding original training sentences.
The following describes an exemplary conversation process of the conversation robot in conjunction with a specific conversation process.
The conversation robot 20 in fig. 1 acquires an input sentence 11 from the user 10 through its own audio capturing device. The conversation robot 20 then performs intention recognition on the input sentence 11 with its built-in intention recognition module 100 and determines that the target intention corresponding to the input sentence 11 is chatting; it therefore passes the input sentence 11 to the chatting mode processing module 200, which obtains a corresponding output sentence 21, and the conversation robot provides the output sentence 21 to the user 10 through an output unit (e.g., an audio signal output device). This concludes the session. That is, when the intention recognition module 100 of fig. 1 determines that the user's intention is FAQ (i.e., question-and-answer mode), the conversation robot forwards the request to the question-and-answer mode processing module 300, and when it determines that the intention is chatting, the robot forwards the request to the chatting mode processing module 200. It can be understood that if the user asks an FAQ question but the intention recognition module classifies it as chatting, the chatting module will output an answer irrelevant to the FAQ service, resulting in a poor user experience. The intention recognition capability of the intention recognition module therefore determines the conversation quality of the conversation robot.
As described in the background section, the existing intention recognition module performs intention recognition by means of an intention recognition model, which must be trained on data in the original training text set to acquire that capability, while a multi-functional (i.e., multi-intention-type) conversation robot faces unbalanced training data across intention types. If the training data for different intention types are unbalanced, the trained target intention recognition model overfits, so that the intention predictions of the intention recognition module are biased toward the intention type with more training data, causing recognition errors. In addition, the training approach disclosed in the related art does not fully exploit the inherent properties of the trained model (a pre-trained masked language model plus a fully connected layer), and therefore requires a large amount of training data while achieving an unsatisfactory training effect.
Fig. 2 is a process of obtaining an intention recognition model of the related art disclosure, the process including:
s101, obtaining an original training sentence.
It should be noted that an original training sentence in the related art does not include a masked part (i.e., it is a sentence with complete semantics), for example: What is the company wifi password?
In S101, one or more original training sentences are read from an original training text set, wherein the original training text set comprises training data corresponding to each intention type to be recognized, the training data for each intention type comprises a plurality of training sentences and original annotation labels corresponding to those sentences, and the original annotation labels represent the real intentions of the corresponding training sentences.
S102, fine-tuning the pre-trained language model according to the original training sentences and training the fully connected layer connected to it, i.e., training the trained model on the original training sentences.
Steps S101 and S102 are repeated until training is finished and the intention recognition model is obtained.
It is to be understood that the training sentences described above are complete sentences (containing no masked parts), and that the related-art trained model comprises a pre-trained masked language model and a fully connected layer. The pre-trained masked language model can predict a masked part from its context, but the related art does not exploit this property during training; instead it uses complete sentences without masked parts as original training sentences, which leads to long training times, a large required amount of training data and other technical drawbacks. In addition, if the training data for different intention types are unbalanced, the intention recognition model obtained by the training method of fig. 2 still overfits, so that its intention recognition capability is seriously degraded.
At least in order to solve the above problems, the embodiments of the present application construct a target training text set from the data in an original training text set (each target training sentence in this set masks its intention type) and retrain (i.e., fine-tune) a pre-trained masked language model on the target training text set to obtain a target intention recognition model. An intention recognition model trained on the target training text set constructed in this way carries no risk of overfitting, which effectively improves the accuracy of intention recognition and hence the user experience, while requiring less training data and a shorter training time.
Fig. 3 is a flowchart of an example method for obtaining a target intention recognition model according to some embodiments of the present application; in this method, the target intention recognition model is obtained by directly fine-tuning a pre-training mask language model on the newly constructed target training text set.
As shown in fig. 3, some embodiments of the present application provide a method of obtaining a target intent recognition model, the method comprising:
s201, acquiring an original training text set.
It should be noted that the original training text set in S201 includes training data corresponding to each of at least two intention types, where the training data includes a plurality of original training sentences (obtained through collection) and an original annotation label (obtained through labeling) corresponding to each original training sentence; an original annotation label represents the real intention of its corresponding original training sentence.
For example, if the conversational robot is deployed at an insurance company and the intention types comprise a chatting intention type and a query-business intention type, the original annotation label of an original training sentence of the chatting intention type is "chat" and the original annotation label of an original training sentence of the query-business intention type is "ask". A sample original training sentence of the chatting intention type is: "I am not happy today", and a sample original training sentence of the query-business intention type is: "How many categories of insurance are there?".
For another example, if the conversational robot is applied in the field of intelligent customer service and the intention types include a query-business intention type and a manual-processing intention type, the original annotation label of an original training sentence of the query-business intention type is "ask" and that of the manual-processing intention type is "manual". A sample original training sentence of the query-business intention type is: "Is the company open today?", and a sample original training sentence of the manual-processing intention type is: "Please transfer me to a human agent."
It should be noted that the above examples use only two intention types to illustrate the kinds of original training sentences; the embodiments of the present application do not limit the specific number of intention types that a conversational robot or intelligent customer service device can recognize. The examples above show only one original training sentence each, but to obtain a good intention recognition model, those skilled in the art need to collect a large number of original training sentences and label each with its real intention type to obtain the corresponding original annotation label. The embodiments of the present application do not limit the number of original training sentences collected for each intention type.
S202, a target training text set is obtained from the data in the original training text set, where each target training sentence in the target training text set is obtained by masking an original annotation label.
It should be noted that, in some embodiments of the present application, all original training sentences corresponding to each intention type are extracted from the original training text set, and the data in the target training text set are obtained from these sentences; in other embodiments, only a subset of the original training sentences corresponding to each intention type is extracted, and the data in the target training text set are obtained from that subset. For example, suppose the conversational robot needs to recognize three intentions (a first intention, a second intention and a third intention), and the numbers of original training sentences collected for them are 100, 10000 and 2000 respectively. In some embodiments of the present application, the numbers of target training sentences in the target training text set for the three intention types are then: 100 (i.e., the 100 target training sentences are obtained from all 100 original training sentences of the first intention), 100 (i.e., 100 original training sentences are screened out of the 10000 original training sentences of the second intention, and the 100 target training sentences are obtained from the screened sentences), and 100 (i.e., 100 original training sentences are screened out of the 2000 original training sentences of the third intention, and the 100 target training sentences are obtained from the screened sentences).
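The per-class screening in the example above can be sketched as follows; the function name `balance_training_sets` and the choice of the smallest class as the cap are illustrative assumptions, not part of the claimed embodiments:

```python
import random

def balance_training_sets(sentences_by_intent, seed=0):
    """Downsample every intention type to the size of the smallest class,
    mirroring the 100/10000/2000 -> 100/100/100 screening example above."""
    rng = random.Random(seed)
    target_n = min(len(s) for s in sentences_by_intent.values())
    balanced = {}
    for intent, sentences in sentences_by_intent.items():
        if len(sentences) > target_n:
            # screen a random subset of the over-represented class
            balanced[intent] = rng.sample(sentences, target_n)
        else:
            balanced[intent] = list(sentences)
    return balanced
```

With the counts from the example (100, 10000 and 2000 collected sentences), each intention then contributes 100 original training sentences to the construction of the target training text set.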
It can be understood that, because the target training sentences in the target training text set constructed in the embodiments of the present application fully exploit the property of the pre-training mask language model Bert (namely, predicting the content of a masked portion from the context of that portion, the masked portion here being the original annotation label corresponding to an intention type), model training can be completed with only a small number of training samples, and the overfitting problem that would otherwise arise when the training data of two or more intention types are unbalanced is effectively solved.
Taking any one target training sentence (i.e., the ith target training sentence) and its corresponding target annotation label as an example, the implementation of S202 is exemplarily described below.
At least in order to obtain each target training sentence quickly, in some embodiments of the present application the target training text set in S202 includes an ith target training sentence and a target annotation label corresponding to the ith target training sentence, and S202 then exemplarily includes:
Firstly, obtaining the ith original training sentence and the jth original annotation label corresponding to it from the original training text set obtained in S201.
For example, if the ith original training sentence is: "Is it raining today?" and the jth original annotation label corresponding to it is: "chat", then executing the first step reads the ith original training sentence and its original annotation label out of the original training text set.
Secondly, acquiring a prompt template.
It should be noted that the prompt template includes a prompt part and a blank part to be filled with content. It can be understood that the prompt part includes one or more prompt words. In some embodiments of the present application, the design principle of the prompt template is that, after an original annotation label is filled into the blank part, the label content together with the prompt part should be as semantically complete and fluent as possible. For example, if the original annotation label is a noun, the prompt template may be "subject + predicate + blank part", where the blank part, once filled with the noun label, serves as the object of the sentence, so that the filled-in sentence has a complete grammatical structure; here the subject and predicate constitute the prompt part. It can be understood that those skilled in the art may design the prompt template according to particular needs. For example, in some embodiments of the present application the prompt template is: "I want to + blank part (to be filled with the original annotation label)"; in other embodiments it is: "I cannot follow you + blank part (to be filled with the original annotation label)".
Thirdly, filling the jth original annotation label into the blank part to obtain the text to be mixed in.
That is to say, the jth original annotation label corresponding to the ith original training sentence is filled into the blank part of the prompt template acquired in the second step, obtaining the text to be mixed in.
For example, if the prompt template is "I want to + blank part" and the jth original annotation label corresponding to the ith original training sentence is "chat", then executing the third step fills the word "chat" into the blank part of the prompt template, and the obtained text to be mixed in is "I want to chat".
It should be noted that "+" is not a symbol actually contained in the prompt template; it merely indicates that the different parts of the prompt template are concatenated.
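The second and third steps amount to a single string concatenation; the helper below is a minimal sketch (the function name and the English prompt wording are assumptions for illustration):

```python
def fill_prompt(prompt_part, original_label):
    """Fill the j-th original annotation label into the blank part of the
    prompt template; '+' here is the same splicing operation noted above."""
    return prompt_part + original_label
```

For instance, `fill_prompt("I want to ", "chat")` returns the text to be mixed in, `"I want to chat"`.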
Fourthly, obtaining the ith target training sentence and the target annotation label from the text to be mixed in and the ith original training sentence.
The implementation process of the fourth step of obtaining the ith target training sentence is exemplarily described below.
In some embodiments of the present application, obtaining the ith target training sentence from the text to be mixed in and the ith original training sentence in the fourth step exemplarily includes: masking the jth original annotation label contained in the text to be mixed in to obtain the target mixed-in text (for example, putting the masking text at the position of the jth original annotation label contained in the text to be mixed in); and obtaining the ith target training sentence from the target mixed-in text and the ith original training sentence (for example, using the target mixed-in text as a prefix or suffix of the ith original training sentence). That is to say, some embodiments of the present application obtain any one target training sentence by masking the original annotation label filled into the prompt template; training data constructed in this way make better use of the inherent property of the pre-training mask language model, saving training time and improving the training effect.
That is, some embodiments of the present application provide two ways (prefix or suffix) of mixing the target mixed-in text (consisting of the prompt part and the masked portion) with the ith original training sentence, by which the corresponding target training sentence can be derived from the target mixed-in text; and a way of obtaining the target mixed-in text by placing the masking text at the position of the jth original annotation label.
It can be understood that, in some embodiments of the present application, the ith target training sentence comprises: the prompt part of the prompt template, the masking text that masks the intention type, and the ith original training sentence corresponding to that intention type. Because the structure of the ith target training sentence comprises these three parts, training the intention recognition model on such sentences makes full use of the inherent property of the pre-training mask language model, reduces the amount of training data required, and effectively solves the overfitting problem caused by imbalance of the original training data across intention types.
The following exemplarily illustrates the implementation process of obtaining the target label corresponding to the ith target training sentence in the fourth step.
In some embodiments of the present application, obtaining the target annotation label from the text to be mixed in and the ith original training sentence in the fourth step exemplarily includes: using the text to be mixed in as a prefix or suffix of the ith original training sentence to obtain the target annotation label. It can be understood that, if mixed in as a prefix, the target annotation label consists, in order, of the following three parts: the prompt part, the jth original annotation label, and the ith original training sentence; if mixed in as a suffix, it consists, in order, of: the ith original training sentence, the prompt part, and the jth original annotation label.
It can be understood that, in some embodiments of the present application, if the text to be mixed in is used as a prefix of the ith original training sentence and the prompt part is A, then the ith target training sentence is: A + masking text + the ith original training sentence, and the target annotation label is: A + the jth original annotation label + the ith original training sentence. It should be noted that "+" is not a symbol actually contained in either sequence; it merely indicates that the different parts are concatenated.
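Under the prefix mixing mode just described, the pair (ith target training sentence, target annotation label) can be sketched as follows; the argument names are illustrative, while the `[MASK]` placeholder follows the Bert convention used later in the text:

```python
MASK = "[MASK]"

def build_target_pair(prompt_part, original_label, original_sentence):
    """Prefix mixing: A + masking text + sentence gives the target training
    sentence; A + original annotation label + sentence gives the label."""
    target_sentence = prompt_part + MASK + original_sentence
    target_label = prompt_part + original_label + original_sentence
    return target_sentence, target_label
```

With prompt part "I want to ", label "chat" and an original sentence, the masked sentence and its label are identical except at the masked span, which is exactly what the cloze-style training below relies on.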
The following describes the structure of a target training sentence and its corresponding target annotation label for two intention types. In some embodiments of the present application, the intention types include a first intention and a second intention; the original annotation label corresponding to the first intention is a first word, the original annotation label corresponding to the second intention is a second word, the training data corresponding to the first intention includes a kth original training sentence, and the training data corresponding to the second intention includes an mth original training sentence. The kth target training sentence in the target training text set is then: A + masking text + the kth original training sentence, with target annotation label: A + first word + the kth original training sentence; and the mth target training sentence is: A + masking text + the mth original training sentence, with target annotation label: A + second word + the mth original training sentence.
The following describes an exemplary process of obtaining the ith target training sentence and the target label in combination with a specific prompt template.
Assume that the prompt template is: "I want to (as a specific example of the prompt part) + blank part (to be filled with the original annotation label)".
In some embodiments of the present application, executing S202 fills the original annotation label of an original training sentence into the prompt template and then mixes the filled text into the original training sentence, either as a prefix or as a suffix of that sentence. If the chosen mixing mode is prefix, the filled prompt template is prepended to the corresponding original training sentence to form the target annotation label, and the prompt template with the label masked is prepended to form the target training sentence.
For example: one original training sentence is: "What is the company wifi password?"; the original annotation label corresponding to it is: "ask"; the prompt template is: "I want to + blank part"; and the mixing mode is: prefix. The target annotation label obtained is then: "I want to ask, what is the company wifi password?". Putting the masking text "[MASK]" into the prompt template to mask the original annotation label, the corresponding target training sentence is: "I want to [MASK], what is the company wifi password?". It can be understood that the masking text is "[MASK]".
S203, train the pre-training mask language model Bert on the data in the target training text set to obtain the target intention recognition model. Compared with the related art, the target intention recognition model is obtained by directly training the pre-training mask language model, without attaching a fully connected layer behind it as the model to be trained; this saves training time, simplifies the network structure, makes full use of the property of the pre-training mask language model, and reduces resource requirements.
The following exemplarily illustrates a process of obtaining a target intention recognition model by training a pre-training mask language model with a target training text set of the present application.
Step A: load the tokenizer of the pre-training mask language model Bert and segment the target annotation labels and target training sentences generated in S202 to character level; for Chinese, the text is split into single Chinese characters or punctuation marks, so that each segmented target training sentence and its corresponding target annotation label are converted into character sequences. Then the special character [CLS] is added at the beginning of each character sequence and the special character [SEP] at the end.
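Step A's character-level segmentation, with special tokens kept whole and [CLS]/[SEP] added, might be sketched like this; the toy splitter below stands in for Bert's actual tokenizer and is an assumption for illustration:

```python
def char_tokenize(text, specials=("[MASK]", "[CLS]", "[SEP]")):
    """Split text into single characters (Chinese characters / punctuation),
    keeping special tokens intact, then wrap with [CLS] ... [SEP]."""
    tokens, i = [], 0
    while i < len(text):
        for sp in specials:
            if text.startswith(sp, i):
                tokens.append(sp)  # keep the special token as one unit
                i += len(sp)
                break
        else:
            tokens.append(text[i])  # one Chinese character or punctuation mark
            i += 1
    return ["[CLS]"] + tokens + ["[SEP]"]
```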
Step B: the tokenizer then generates an input id sequence (the index of each character in the dictionary), a Token type sequence and a Token position sequence from the Bert dictionary and the character sequences generated in Step A.
For example, if the target annotation label corresponding to the ith target training sentence is: "I want to ask, what is the company wifi password?", the generated input id, Token type and Token position sequences are as follows:
Input id sequence: [101,2769,2682,2990,7309,8024,1062,1385,8306,2166,4772,3221,784,720,8043,102]
Token type sequence: [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
Token position sequence: [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
The ith target training sentence is: "I want to [MASK], what is the company wifi password?", and the generated input id, Token type and Token position sequences are as follows:
Input id sequence: [101,2769,2682,103,8024,1062,1385,8306,2166,4772,3221,784,720,8043,102]
Token type sequence: [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
Token position sequence: [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
Step C: feed the input id sequences, Token type sequences and Token position sequences generated in Step B into the pre-training mask language model Bert for training in batches (several samples per batch, for example 16); the training strategy uses epoch = 5, i.e. 5 rounds of training on the same training set, and the learning rate is 2e-5. On each forward pass, the model predicts the character at the "[MASK]" position in the target training sentence, the prediction is compared with the target annotation label, the loss is calculated with a loss function (cross entropy), backward propagation is then performed, and the parameters of the neural network (i.e., the pre-training mask language model Bert) are updated.
For example: the ith target training statement is: "I want [ MASK ], what is the company wifi password? "feed to Bert for forward conduction.
Bert outputs a character for "[MASK]", for example: "chat". The text inferred by the model is then: "I want to chat, what is the company wifi password?".
This is then compared with the target annotation label corresponding to the ith target training sentence, "I want to ask, what is the company wifi password?", the loss function (cross entropy) is calculated, and backward propagation is performed. After 5 rounds of training, a classification model is generated, namely the target intention recognition model of some embodiments of the present application.
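The loss in Step C, cross entropy evaluated only at the masked position, can be written out framework-free; the toy logits below stand in for Bert's vocabulary-sized output at the [MASK] position:

```python
import math

def masked_position_loss(logits, target_id):
    """Numerically stable -log softmax(logits)[target_id], i.e. the cross
    entropy between the model's [MASK] prediction and the label character."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target_id]
```

During backward propagation this scalar drives the parameter update described above; a confident correct prediction yields a loss near zero, a confident wrong one a large loss.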
It can be understood that some embodiments of the present application train the pre-training mask language model with target training sentences that contain masked portions, so that the trained target intention recognition model can infer the masked original annotation label. Because the pre-training mask language model was obtained by self-supervised training on a cloze (fill-in-the-blank) task over a large corpus, it already contains rich semantic knowledge about cloze completion. The embodiments of the present application exploit this property by constructing target training sentences in which the intention type is masked; the amount of training data required for the subsequent fine-tuning (i.e., retraining of the pre-training mask language model), namely the target training text set, is small, so that a more robust intention recognition model with good generalization can be trained even with few samples and unbalanced data.
From the above it can be seen that the model of some embodiments of the present application infers the masking character: the Chinese character at "[MASK]" corresponds to the original annotation label of each original training sentence (i.e., to a specific intention type). The model's inference of the masked character can be regarded as cloze completion (filling in the masked character), and since the inferred character is the original annotation label of the original training sentence, some embodiments of the present application in effect convert the conversational robot's intention recognition task (a text classification task) into a cloze task. As shown in fig. 2, the conventional way of obtaining an intention recognition model with the pre-training language model Bert is to fine-tune Bert with a fully connected layer added, that is, to fine-tune the pre-training language model on the data in the original training text set while training the attached fully connected layer. It can be understood that, in that approach, on each forward pass during fine-tuning the model infers a classification label from the semantic information of the whole text of a training sample (the whole sentence corresponds to a sentence vector whose starting position is represented by the special symbol [CLS] at the head of the character sequence), compares the label with the annotation label, and adjusts the model parameters by backward propagation.
Because the semantic information contained in a sentence vector is very limited, the traditional text classification approach needs a large number of training samples, and attention must also be paid to the balance of the training data, for the fine-tuned model to achieve a good effect. In short, Bert is trained in a self-supervised manner on a cloze task over a large corpus, so the resulting pre-training language model already contains rich semantic knowledge about cloze completion; the embodiments of the present application exploit this property to generate a new training text set (namely the target training text set) that requires less data for the subsequent fine-tuning, so that a model with better generalization and more robustness can be trained even with small samples and unbalanced data.
The following describes a process of acquiring a target intention recognition model according to an embodiment of the present application with reference to three specific application scenario examples.
Example 1
In the field of intelligent customer service, customers require that a customer service system have both the function of answering FAQ questions and the function of chatting. The system needs to be able to identify whether the user's input carries an FAQ intent (query business intent) or a chatting intent. When the system judges that the user's intention is FAQ, it forwards the request to the FAQ model; when it judges that the intention is chatting, it forwards the request to the chat model.
Firstly, design a prompt template. Assume that the corpus currently contains two types of original annotation labels (i.e., two intention types): "chat" and "ask". The prompt template can then be designed as: "I want to + [original annotation label]". Take one training sample as an example:
The ith original training sentence is: "Do you want to chat with me?"
The jth original annotation label corresponding to the ith original training sentence is: "chat".
Then the ith target training sentence after mixing in the prompt template is: "I want to [MASK], do you want to chat with me?", and the target annotation label corresponding to the ith target training sentence is: "I want to chat, do you want to chat with me?".
All sentences in the original training text set are converted into the above form by this method, and the tokenizer of the pre-training mask language model Bert then segments them to generate character sequences:
The segmented character sequence of the ith target training sentence ("I want to [MASK], do you want to chat with me?") is, character by character: ["我", "想", "[MASK]", "你", "想", "和", "我", "聊", "天", "吗", "？"].
The segmented character sequence of the target annotation label corresponding to the ith target training sentence ("I want to chat, do you want to chat with me?") is: ["我", "想", "聊", "天", "你", "想", "和", "我", "聊", "天", "吗", "？"].
From the generated character sequences, the Bert dictionary and tokenizer generate the input id sequence, Token type sequence and Token position sequence:
and (3) an input id sequence corresponding to the ith target training sentence: [101,2769,2682,103,872,2682,1469,2769,5464,1921,1408,8043,102]
The Token type sequence corresponding to the ith target training statement: [0,0,0,0,0,0,0,0,0,0,0,0,0]
The Token position sequence corresponding to the ith target training statement: [1,1,1,1,1,1,1,1,1,1,1,1,1]
Input id sequence of the target annotation label corresponding to the ith target training sentence: [101,2769,2682,5464,1921,872,2682,1469,2769,5464,1921,1408,8043,102]
The Token type sequence of the target label corresponding to the ith target training sentence is as follows: [0,0,0,0,0,0,0,0,0,0,0,0,0,0]
The Token position sequence of the target label corresponding to the ith target training sentence is as follows: [1,1,1,1,1,1,1,1,1,1,1,1,1,1]
The input id, Token type and Token position sequences of the ith target training sentence and of its corresponding target annotation label are input into the pre-training mask language model Bert as the training set. On the forward pass, the model infers "[MASK]" in the training text "I want to [MASK], do you want to chat with me?". If "[MASK]" is inferred as "ask", the inferred text is: "I want to ask, do you want to chat with me?", which is inconsistent with the annotation label: "I want to chat, do you want to chat with me?"; the loss is therefore calculated and the network parameters of the pre-training mask language model Bert are fine-tuned by backward propagation. In this way the original text classification task is converted into a cloze task, and the rich prior semantic knowledge of the pre-training mask language model is used to effectively solve the data imbalance problem, obtain a better classification effect and improve the user experience.
Example 2
In the field of intelligent customer service, customers require that a customer service system simultaneously answer FAQ questions and transfer requests to human agents. The system therefore needs to identify whether the user's input carries an FAQ intent (query business intent) or a transfer-to-manual intent. If the system often forwards user input that actually carries FAQ intent to human customer service as manual intent, the workload and operating cost of the human customer service are greatly increased. To alleviate this, a better intention classification model can be generated with a method similar to embodiment 1, i.e., prompt-template-based fine-tuning of the mask language model, thereby reducing the workload and operating cost of human customer service.
Example 3
In the field of RPA (Robotic Process Automation), one example is that a client requires an RPA human-computer interaction robot to be able both to answer RPA FAQ questions and to trigger RPA-related operations (e.g., execute a certain RPA process). Because the RPA FAQ corpus is much larger than the corpus for RPA-related operations, the same data imbalance problem arises: user input that should trigger an RPA-related operation is mistaken by the system for an RPA FAQ question, so the operation is never triggered. This problem, too, can be effectively solved, and the user experience greatly improved, by a method similar to embodiment 1 that fine-tunes the mask language model based on a prompt template.
It can be understood that, in some embodiments of the present application, fine-tuning the pre-training mask language model (i.e., retraining it) with the data in a target training text set constructed through a prompt template converts the intent classification task into a cloze task, and the rich prior semantic knowledge of the pre-training mask language model (which already has the property of predicting the actual content of a masked portion from the context of that portion) solves the problem of unbalanced training data across intention types well.
As shown in fig. 4, some embodiments of the present application provide a method of identifying an intention type, the method comprising: S301, obtaining a sentence to be recognized; and S302, inputting the sentence to be recognized into the target intention recognition model obtained by the above method, and obtaining the target intention corresponding to the sentence through that model. That is, the input to the fine-tuned Bert pre-training mask language model (i.e., the target intention recognition model obtained by training on the target training text set) is the text to be recognized, and the model's output is the intention of that text.
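The recognition steps can be sketched end to end as below; `predict_mask_word` stands in for the fine-tuned Bert model, and the prompt wording and label-to-intent mapping are illustrative assumptions:

```python
def recognize_intent(sentence, predict_mask_word,
                     prompt_part="I want to ",
                     label_to_intent=None):
    """Wrap the sentence to be recognized in the training-time prompt, let
    the model fill [MASK], and map the predicted label word to an intent."""
    if label_to_intent is None:
        label_to_intent = {"chat": "chatting intent",
                           "ask": "query business intent"}
    masked_input = prompt_part + "[MASK]" + sentence
    word = predict_mask_word(masked_input)
    return label_to_intent.get(word, "unknown intent")
```

The key point is that inference reuses exactly the prompt and masking layout used during fine-tuning, so the filled-in word is one of the original annotation labels.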
It can be understood that the data in the target training text set is used for training the Bert model, so that the overfitting problem caused by unbalance of training data corresponding to different intention types is effectively eliminated, the target intention obtained by using the target intention recognition model is more accurate, and the user experience is effectively improved.
Referring to fig. 5, fig. 5 shows an apparatus for obtaining a target intention recognition model provided in an embodiment of the present application. It should be understood that the apparatus corresponds to the method embodiment of fig. 3 above and can perform the steps of that method embodiment; its specific functions may be referred to in the description above, and detailed descriptions are omitted here where appropriate to avoid repetition. The apparatus comprises at least one software functional module that can be stored in a memory in the form of software or firmware or solidified in an operating system of the apparatus. The apparatus for obtaining the target intention recognition model comprises: an original training text set obtaining module 101, a target training text set obtaining module 102, and a training module 103.
An original training text set obtaining module 101, configured to obtain an original training text set, where the original training text set includes training data corresponding to each of at least one intention type, the training data includes a plurality of original training sentences and an original labeling label corresponding to each original training sentence, and the original labeling labels are used to represent the real intentions of the corresponding original training sentences.
A target training text set obtaining module 102, configured to obtain a target training text set according to the data in the original training text set, where each target training sentence in the target training text set is obtained by masking an original labeling label.
A training module 103, configured to fine-tune a pre-trained mask language model Bert based on the data in the target training text set, so as to obtain the target intention recognition model.
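The data preparation the training module performs before fine-tuning (character-level segmentation, then the input ID, token type, and token position sequences elaborated in claims 13-14) might be sketched roughly as follows; the toy vocabulary is an assumption, and a real word segmentation device uses a fixed dictionary:

```python
# Toy character-level tokenizer approximating the sequence-building step
# before fine-tuning Bert. The vocabulary below is illustrative only.
VOCAB = {"[PAD]": 0, "[CLS]": 1, "[SEP]": 2, "[MASK]": 3}

def encode(sentence):
    # Segment the sentence to character level and add Bert's special tokens.
    chars = ["[CLS]"] + list(sentence) + ["[SEP]"]
    # Input ID sequence (unseen characters get new IDs in this toy dictionary).
    input_ids = [VOCAB.setdefault(c, len(VOCAB)) for c in chars]
    token_type_ids = [0] * len(chars)       # single-segment (one sentence) input
    position_ids = list(range(len(chars)))  # token position sequence
    return input_ids, token_type_ids, position_ids
```

These three sequences are what would then be fed to the pre-trained mask language model, whose prediction at the masked positions drives the loss and back-propagation described in claim 15.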
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method, and will not be described in too much detail herein.
Referring to fig. 6, fig. 6 shows an apparatus for identifying an intent type provided by an embodiment of the present application. It should be understood that the apparatus corresponds to the method embodiment of fig. 4 and can perform the steps of that method embodiment; its specific functions may be referred to in the description above, and detailed descriptions are omitted here where appropriate to avoid repetition. The apparatus comprises at least one software functional module that can be stored in a memory in the form of software or firmware or solidified in an operating system of the apparatus. The apparatus for identifying the intent type comprises: a sentence to be recognized obtaining module 201 and a target intention recognition model 202.
A sentence to be recognized obtaining module 201 configured to obtain a sentence to be recognized.
The target intention recognition model 202 is configured to receive the input sentence to be recognized and obtain a target intention corresponding to the sentence to be recognized. The target intention recognition model 202 is obtained by the above-described method of obtaining a target intention recognition model.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method, and will not be described in too much detail herein.
As shown in fig. 7, some embodiments of the present application provide a robot comprising: an audio data acquisition unit 401, an update intention recognition unit 402 (which is different from the intention recognition unit of fig. 1 and operates based on a target intention recognition model obtained through the above training manner of the present application), at least one interaction module 403, and an output unit 404.
An audio data acquisition unit 401 configured to acquire a sentence to be recognized; an update intention identifying unit 402 configured to: obtaining a target intention type corresponding to the sentence to be recognized according to the sentence to be recognized and the target intention recognition model obtained by adopting the method of the embodiment of the application; an interaction module 403 which is arranged corresponding to each intention type and is configured to obtain an output statement corresponding to the statement to be recognized according to the statement to be recognized and the target intention type; an output unit 404 configured to provide the output statement.
It should be noted that, among the at least one interaction module 403 in fig. 7 (for example, fig. 1 includes two types of interaction modules, namely the chat mode processing module 200 and the question and answer mode processing module 300), a target interaction module (that is, the interaction module corresponding to the target intention confirmed in the current session, for example the chat mode processing module in fig. 1) is configured to obtain an output sentence corresponding to the sentence to be recognized according to the sentence to be recognized and the target intention type.
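The routing from the confirmed target intention to an interaction module can be sketched as a simple dispatch table; the module functions here are placeholders for the chat mode processing module 200 and question and answer mode processing module 300 of fig. 1, and the intent names are assumptions:

```python
def chat_module(sentence):
    # Placeholder for the chat mode processing module (200 in fig. 1).
    return "chat reply to: " + sentence

def qa_module(sentence):
    # Placeholder for the question and answer mode processing module (300 in fig. 1).
    return "faq answer for: " + sentence

INTERACTION_MODULES = {"chat": chat_module, "faq": qa_module}

def handle(sentence, target_intent):
    # Select the target interaction module for the confirmed intention and
    # produce the output sentence handed to the output unit.
    return INTERACTION_MODULES[target_intent](sentence)
```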
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working process of the robot described above may refer to the corresponding process in the foregoing method, and redundant description is not repeated here.
Some embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, which program, when executed by a processor, may implement the method as described above in relation to the corresponding embodiments of fig. 3 or 4.
Some embodiments of the present application provide a computer program product comprising a computer program, wherein the computer program when executed by a processor may implement the method according to the corresponding embodiment of fig. 3 or fig. 4 as described above.
As shown in fig. 8, some embodiments of the present application provide an electronic device 500, where the electronic device 500 includes a memory 510, a processor 520, a bus 530, and a computer program stored in the memory and executable on the processor, and where the processor 520 reads the program from the memory 510 through the bus 530 and executes the program, the method according to the embodiment of fig. 3 or fig. 4 can be implemented.
The processor may process digital signals and may include various computing structures, such as a complex instruction set computer (CISC) architecture, a reduced instruction set computer (RISC) architecture, or an architecture implementing a combination of multiple instruction sets. In some examples, the processor may be a microprocessor.
The memory may be used for storing instructions to be executed by the processor or data associated with the execution of the instructions. The instructions and/or data may include code for performing some or all of the functions of one or more of the modules described in embodiments of the application. The processor of the disclosed embodiments may be used to execute instructions in memory to implement the methods shown in fig. 3 or fig. 4. The memory may include dynamic random access memory, static random access memory, flash memory, optical memory, or other memory known to those skilled in the art.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative and, for example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist alone, or two or more modules may be integrated to form an independent part.
The functions may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, and various media capable of storing program codes.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (18)

1. A method of obtaining a target intent recognition model, the method comprising:
acquiring an original training text set, wherein the original training text set comprises training data respectively corresponding to each intention type of at least two intention types, the training data comprises a plurality of original training sentences and original labeling labels respectively corresponding to the original training sentences, and the original labeling labels are used for representing real intents of the corresponding original training sentences;
obtaining a target training text set according to the data in the original training text set, wherein each item label training sentence in the target training text set is obtained by covering an original label;
and fine-tuning a pre-training mask language model Bert based on the data in the target training text set to obtain a target intention recognition model.
2. The method of claim 1, wherein said deriving a target training text set from data in said original training text set comprises:
and extracting all original training sentences respectively corresponding to various intention types from the original training text set, and obtaining data in the target training text set according to all the original training sentences.
3. The method of any one of claims 1-2, wherein the set of target training texts comprises an ith target training sentence and a target annotation label corresponding to the ith target training sentence, wherein,
the obtaining of the target training text set according to the data in the original training text set includes:
acquiring an ith original training sentence with the same intention type as the ith target training sentence from the original training text set, and acquiring a jth original labeling label corresponding to the ith original training sentence;
acquiring a prompt template, wherein the prompt template comprises a prompt part and a blank part of content to be filled, the prompt part and the content filled in the blank part form a sentence with complete semantics, and the complete semantics means that the sentence comprises a subject, a predicate and an object;
and obtaining the ith target training sentence and the target labeling label according to the prompt template and the ith original training sentence.
4. The method of claim 3, wherein the deriving the ith target training sentence from the prompt template and the ith original training sentence comprises:
filling the jth original label into the blank part to obtain a text to be mixed;
covering the jth original labeling label included in the text to be mixed to obtain a target mixed text;
and obtaining the ith target training sentence according to the target mixed text and the ith original training sentence.
5. The method of claim 3, wherein the deriving the ith target training sentence from the prompt template and the ith original training sentence comprises:
covering the blank part to obtain a target mixed text;
and obtaining the ith target training sentence according to the target mixed text and the ith original training sentence.
6. The method of any one of claims 4-5, wherein the deriving the ith target training sentence from the target mixed-in text and the ith original training sentence comprises:
and taking the target mixed text as a prefix or suffix of the ith original training sentence to obtain the ith target training sentence.
7. The method of claim 3, wherein the ith target training sentence comprises: the prompt part, the covering text and the ith original training sentence, wherein the covering text is used for covering the jth original labeling label or the blank part.
8. The method of claim 3, wherein the deriving the target annotation label from the prompt template and the ith original training sentence comprises:
filling the jth original label into the blank part to obtain a text to be mixed;
and taking the text to be mixed as the prefix or suffix of the ith original training sentence to obtain the target labeling label.
9. The method of claim 3, wherein the target annotation tag comprises: the prompt part, the jth original labeling label and the ith original training sentence.
10. The method of claim 6, wherein if the target mixed-in text is prefixed to the ith original training sentence, the ith target training sentence is: the prompt part + the covering text + the ith original training sentence, and the target labeling label is: the prompt part + the jth original labeling label + the ith original training sentence.
11. The method of claim 10, wherein the at least two intent types include a first intent and a second intent, the training data corresponding to the first intent is first training data, the training data corresponding to the second intent is second training data, the first training data includes a first original training sentence, the original annotation label corresponding to the first original training sentence is a first word, the second training data includes a second original training sentence, the original annotation label corresponding to the second original training sentence is a second word; wherein,
the first target training sentence included in the target training text set is: the prompt part + the covering text + the first original training sentence, and the target labeling label corresponding to the first target training sentence is: the prompt part + the first word + the first original training sentence;
the second target training sentence included in the target training text set is: the prompt part + the covering text + the second original training sentence, and the target labeling label corresponding to the second target training sentence is: the prompt part + the second word + the second original training sentence.
12. The method of claim 6, wherein if the target mixed-in text is a suffix of the ith original training sentence, the ith target training sentence is: the ith original training sentence + the prompt part + a cover text, and the target labeling label is: the ith original training sentence + the prompt part + the jth original labeling label.
13. The method of any of claims 1-5 and 7-12, wherein the set of target training texts comprises: a plurality of target training sentences and target labeling labels corresponding to the entry labeling training sentences; wherein,
the fine-tuning of the pre-training mask language model Bert based on the data in the target training text set comprises the following steps:
loading a word segmentation device;
segmenting the target label and the target training sentence to character level according to the word segmentation device to obtain character sequences, wherein one target training sentence and one target label correspond to one character sequence respectively;
and fine-tuning the pre-training mask language model Bert according to the character sequence.
14. The method of claim 13, wherein the fine-tuning of the pre-trained mask language model Bert according to the sequence of characters comprises:
acquiring an input ID sequence, a Token type sequence and a Token position sequence corresponding to each character sequence according to the dictionary;
and fine-tuning the pre-training mask language model Bert according to the input ID sequence, the Token type sequence and the Token position sequence.
15. The method of claim 14, wherein the fine-tuning of the pre-training mask language model Bert according to the input ID sequence, the Token type sequence, and the Token position sequence comprises:
inputting the input ID sequence, the Token type sequence and the Token position sequence into the pre-training mask language model Bert, and obtaining a presumed result through the pre-training mask language model Bert, wherein the presumed result is prediction of the content of the covered part;
And obtaining a loss function value according to the guessed result, and conducting reverse conduction according to the loss function value so as to update the parameters of the pre-training mask language model Bert.
16. A method of identifying an intent type, the method comprising:
obtaining a sentence to be identified;
inputting the sentence to be recognized into a target intention recognition model obtained by adopting the method of any one of claims 1 to 14, and acquiring a target intention corresponding to the sentence to be recognized through the target intention recognition model.
17. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is adapted to carry out the method of any one of claims 1 to 16.
18. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program is operable to implement the method of any of claims 1-16.
CN202210571180.3A 2022-05-24 2022-05-24 Method for acquiring target intention recognition model and intention recognition method Active CN114757176B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210571180.3A CN114757176B (en) 2022-05-24 2022-05-24 Method for acquiring target intention recognition model and intention recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210571180.3A CN114757176B (en) 2022-05-24 2022-05-24 Method for acquiring target intention recognition model and intention recognition method

Publications (2)

Publication Number Publication Date
CN114757176A true CN114757176A (en) 2022-07-15
CN114757176B CN114757176B (en) 2023-05-02

Family

ID=82335821

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210571180.3A Active CN114757176B (en) 2022-05-24 2022-05-24 Method for acquiring target intention recognition model and intention recognition method

Country Status (1)

Country Link
CN (1) CN114757176B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018053502A1 (en) * 2016-09-19 2018-03-22 Promptu Systems Corporation Systems and methods for adaptive proper name entity recognition and understanding
CN111931513A (en) * 2020-07-08 2020-11-13 泰康保险集团股份有限公司 Text intention identification method and device
CN112686044A (en) * 2021-01-18 2021-04-20 华东理工大学 Medical entity zero sample classification method based on language model
CN112765978A (en) * 2021-01-14 2021-05-07 中山大学 Dialog diagram reconstruction method and system for multi-person multi-turn dialog scene
CN113571097A (en) * 2021-09-28 2021-10-29 之江实验室 Speaker self-adaptive multi-view dialogue emotion recognition method and system
CN113868380A (en) * 2021-06-21 2021-12-31 四川启睿克科技有限公司 Few-sample intention identification method and device
CN114048731A (en) * 2021-11-15 2022-02-15 唯品会(广州)软件有限公司 Text processing method and device, storage medium and computer equipment

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024041350A1 (en) * 2022-08-25 2024-02-29 马上消费金融股份有限公司 Intention recognition method and apparatus, electronic device, and storage medium
CN116756277A (en) * 2023-04-20 2023-09-15 海尔优家智能科技(北京)有限公司 Processing method of interactive statement based on target generation type pre-training GPT model
CN116756277B (en) * 2023-04-20 2023-11-24 海尔优家智能科技(北京)有限公司 Processing method of interactive statement based on target generation type pre-training GPT model
CN116467500A (en) * 2023-06-15 2023-07-21 阿里巴巴(中国)有限公司 Data relation identification, automatic question-answer and query sentence generation method
CN116467500B (en) * 2023-06-15 2023-11-03 阿里巴巴(中国)有限公司 Data relation identification, automatic question-answer and query sentence generation method
CN116911315A (en) * 2023-09-13 2023-10-20 北京中关村科金技术有限公司 Optimization method, response method and system of natural language processing model
CN116911314A (en) * 2023-09-13 2023-10-20 北京中关村科金技术有限公司 Training method of intention recognition model, conversation intention recognition method and system
CN116911314B (en) * 2023-09-13 2023-12-19 北京中关村科金技术有限公司 Training method of intention recognition model, conversation intention recognition method and system
CN116911315B (en) * 2023-09-13 2024-01-30 北京中关村科金技术有限公司 Optimization method, response method and system of natural language processing model
CN116955590A (en) * 2023-09-20 2023-10-27 成都明途科技有限公司 Training data screening method, model training method and text generation method
CN116955590B (en) * 2023-09-20 2023-12-08 成都明途科技有限公司 Training data screening method, model training method and text generation method

Also Published As

Publication number Publication date
CN114757176B (en) 2023-05-02

Similar Documents

Publication Publication Date Title
US11386271B2 (en) Mathematical processing method, apparatus and device for text problem, and storage medium
CN109657054B (en) Abstract generation method, device, server and storage medium
CN114757176B (en) Method for acquiring target intention recognition model and intention recognition method
CN111651996B (en) Digest generation method, digest generation device, electronic equipment and storage medium
CN112948534A (en) Interaction method and system for intelligent man-machine conversation and electronic equipment
US11636272B2 (en) Hybrid natural language understanding
CN110187780B (en) Long text prediction method, long text prediction device, long text prediction equipment and storage medium
CN111209363B (en) Corpus data processing method, corpus data processing device, server and storage medium
CN108304376B (en) Text vector determination method and device, storage medium and electronic device
CN110543637A (en) Chinese word segmentation method and device
CN108304387B (en) Method, device, server group and storage medium for recognizing noise words in text
CN112579733A (en) Rule matching method, rule matching device, storage medium and electronic equipment
CN112560510A (en) Translation model training method, device, equipment and storage medium
CN112380866A (en) Text topic label generation method, terminal device and storage medium
CN115186080A (en) Intelligent question-answering data processing method, system, computer equipment and medium
CN111368066B (en) Method, apparatus and computer readable storage medium for obtaining dialogue abstract
WO2024109597A1 (en) Training method for text merging determination model, and text merging determination method
CN113705207A (en) Grammar error recognition method and device
CN117370512A (en) Method, device, equipment and storage medium for replying to dialogue
CN116701604A (en) Question and answer corpus construction method and device, question and answer method, equipment and medium
CN109683727A (en) A kind of data processing method and device
CN111274813A (en) Language sequence marking method, device storage medium and computer equipment
CN115098665A (en) Method, device and equipment for expanding session data
CN111045836B (en) Search method, search device, electronic equipment and computer readable storage medium
CN114398482A (en) Dictionary construction method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant