CN113837370A - Method and apparatus for training a model based on contrast learning

Method and apparatus for training a model based on contrast learning

Info

Publication number
CN113837370A
Authority
CN
China
Prior art keywords
sentence
original
countermeasure
statement
samples
Prior art date
Legal status
Granted
Application number
CN202111221793.6A
Other languages
Chinese (zh)
Other versions
CN113837370B (en)
Inventor
窦辰晓
Current Assignee
Seashell Housing Beijing Technology Co Ltd
Original Assignee
Beijing Fangjianghu Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Fangjianghu Technology Co Ltd
Priority to CN202111221793.6A
Publication of CN113837370A
Application granted
Publication of CN113837370B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes


Abstract

The embodiment of the invention provides a method and an apparatus for training a model based on contrast learning, belonging to the technical field of computers. The method comprises the following steps: determining countermeasure sentence samples of original sentence samples; obtaining original sentence vectors corresponding to the original sentences in the original sentence samples of the current batch; obtaining countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the current batch; obtaining a contrast loss function value; and adjusting parameters of a first preset neural network model according to the obtained contrast loss function value, and repeating the processes of obtaining the contrast loss function value and adjusting the parameters accordingly, until the number of parameter adjustments reaches a first preset number of times, thereby completing the training process. In this way, problems such as changes in sentence semantics that can be caused by simple operations of randomly inserting or deleting words are avoided.

Description

Method and apparatus for training a model based on contrast learning
Technical Field
The invention relates to the technical field of computers, in particular to a method and an apparatus for training a model based on contrast learning.
Background
Contrast learning is a discriminative self-supervised learning approach: similar and dissimilar samples are automatically constructed through a pre-established data transformation strategy, and a contrast loss function is then used to pull similar samples (positive samples) closer together and push dissimilar samples (negative samples) further apart. This training approach is widely applied in the CV and NLP fields to obtain high-quality vector representation spaces, and the excellent performance obtained has attracted great attention. A model based on contrast learning has two important parts. One is the data transformation strategy: an effective data transformation strategy determines the invariances of the finally learned vector representation and improves the performance of the representation vectors on downstream tasks. The other is the design of the contrast loss: practice shows that increasing the number of negative samples can improve the quality of the representation vectors in the final representation space. However, in the NLP field, most text representation models based on contrast learning currently adopt an end-to-end structure, in which increasing the number of negative samples is equivalent to increasing the batch size, which is obviously limited by GPU memory. Furthermore, a great deal of research (particularly in the NLP field) focuses on applying contrast learning in a self-supervised manner, and the idea of contrast learning is rarely incorporated into supervised tasks.
The disadvantages of the prior art include the following. One aspect concerns data transformation. In the NLP field, currently popular data transformation methods include randomly deleting or inserting words, or using the dropout mechanism in the network as a data augmentation strategy; however, each method has its own problems. For example, randomly deleting words may delete a word that carries the semantics of the entire sentence, thereby changing the semantics of the original sentence. In addition, although the dropout mechanism is simple and effective, research shows that, since the dropout strategy does not change the sentence length, the length information of sentences can be used as a feature to distinguish positive and negative examples, so the model is biased towards treating sentences of consistent length as positive examples and sentences with large length differences as negative examples. Another aspect concerns the loss function. For the i-th sentence, the design of the contrast loss function includes one positive sample and N-1 negative samples, where N is the batch size. In an end-to-end architecture, such a contrast loss function currently contains only one positive sample, and the number of negative samples cannot be increased due to GPU memory limitations and other factors. In addition, the self-supervised approach inevitably samples sentences that are semantically similar to the original sentence as negative samples, which is detrimental to learning the vector representation. On the other hand, current research in the NLP field mainly focuses on training sentence vectors through contrast learning in a self-supervised manner; contrast learning is rarely used as an auxiliary task on supervised tasks to improve the performance of the model on those tasks.
Disclosure of Invention
It is an aim of embodiments of the present invention to provide a method and an apparatus for training a model based on contrast learning that address, or at least partially address, the above-mentioned problems.
To achieve the above object, one aspect of the embodiments of the present invention provides a method for training a model based on contrast learning, the method including: determining countermeasure sentence samples of original sentence samples; inputting the original sentence samples of the current batch into a first preset neural network model for obtaining sentence vectors, to obtain original sentence vectors corresponding to the original sentences in the original sentence samples of the current batch; inputting the countermeasure sentence samples of the original sentence samples of the current batch into the first preset neural network model, to obtain countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the current batch; obtaining a contrast loss function value, based on a preset contrast loss function, by combining the obtained original sentence vectors and the obtained countermeasure sentence vectors; and adjusting parameters of the first preset neural network model according to the obtained contrast loss function value, and repeating the processes of obtaining the contrast loss function value and adjusting the parameters of the first preset neural network model accordingly, until the number of times the parameters of the first preset neural network model have been adjusted reaches a first preset number of times, thereby completing the training process.
Optionally, the determining of the countermeasure sentence samples of the original sentence samples comprises: inputting the original sentence samples into a sentence classification model for sentence classification, and training a second preset neural network model, used in the sentence classification model for obtaining sentence vectors, so that the category predicted by the sentence classification model for an original sentence in the original sentence samples is the same as its real category; changing the sentence structure of the original sentence in the original sentence samples by synonym replacement; inputting the original sentence with the changed sentence structure into the sentence classification model again to predict its category; and, for any original sentence that is input into the sentence classification model again and whose predicted category is the same as the real category, repeating the process of changing the sentence structure and predicting the category until the category is predicted incorrectly or the number of times of changing the sentence structure and predicting the category reaches a second preset number, wherein a sentence whose category is predicted incorrectly after its sentence structure has been changed is the countermeasure sentence of the original sentence, or the sentence with the lowest category confidence among the second-preset-number of versions corresponding to the original sentence is the countermeasure sentence of the original sentence, and all the countermeasure sentences form the countermeasure sentence samples.
Optionally, the contrast loss function value is obtained by further combining the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of historical batches.
Optionally, in the case that the countermeasure sentence samples of the original sentence samples are determined based on the sentence classification model, the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the historical batches are obtained based on the trained second preset neural network model.
Optionally, the method further comprises: adjusting the parameters of the trained second preset neural network model according to the obtained contrast loss function value.
Optionally, for any one of the original sentences in the original sentence samples, in the countermeasure sentence samples of the current batch and the countermeasure sentence samples of the historical batch, the countermeasure sentence with the same category as the original sentence is a positive sample, and the countermeasure sentence with a different category from the original sentence is a negative sample.
Optionally, the preset contrast loss function includes:
$$L_{SCL}(i) = -\frac{1}{m}\sum_{j=1}^{N+Q}\mathbb{1}[y_i = y_j]\,\log\frac{\exp\!\big(\mathrm{Sim}(z_i,\,z_j)/\tau\big)}{\sum_{k=1}^{N+Q}\exp\!\big(\mathrm{Sim}(z_i,\,z_k)/\tau\big)}$$

$$L_{SCL} = \frac{1}{N}\sum_{i=1}^{N}L_{SCL}(i)$$

wherein m represents the number of countermeasure sentences, among the countermeasure sentence samples of the current batch and of the historical batches, that belong to the same category as the original sentence i in the original sentence samples of the current batch; $y_i$ represents the category of the original sentence i and $y_j$ the category of the countermeasure sentence j; $\mathbb{1}[y_i = y_j]$ equals 1 when $y_i$ and $y_j$ are the same and 0 when they differ; N represents the number of original sentences in the original sentence samples of the current batch (equivalently, the number of countermeasure sentences in the countermeasure sentence samples of the current batch); Q represents the capacity of the queue storing the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the historical batches; $L_{SCL}(i)$ represents the contrast loss function value of the original sentence i; $L_{SCL}$ represents the average of the contrast loss function values of all original sentences in the original sentence samples of the current batch; $z_i$ represents the original sentence vector of the original sentence i; $z_j$ and $z_k$ represent the countermeasure sentence vectors of the countermeasure sentences j and k; Sim represents cosine similarity; and τ represents the temperature.
Accordingly, another aspect of the embodiments of the present invention provides an apparatus for training a model based on comparative learning, the apparatus comprising: the countermeasure sentence sample determining module is used for determining a countermeasure sentence sample of the original sentence sample; the original statement vector determining module is used for inputting the original statement samples of the current batch into a first preset neural network model used for obtaining statement vectors to obtain original statement vectors corresponding to original statements in the original statement samples of the current batch; the countermeasure statement vector determination module is used for inputting the countermeasure statement samples of the original statement samples of the current batch into the first preset neural network model to obtain countermeasure statement vectors corresponding to countermeasure statements in the countermeasure statement samples of the current batch; the comparison loss function value determining module is used for combining the obtained original sentence vector and the obtained confrontation sentence vector to obtain a comparison loss function value based on a preset comparison loss function; and the adjusting module is used for adjusting the parameters of the first preset neural network model according to the obtained comparison loss function value, repeating the processes of obtaining the comparison loss function value and adjusting the parameters of the first preset neural network model according to the obtained comparison loss function value, so that the times of adjusting the parameters of the first preset neural network model reach a first preset value, and the training process is completed.
Optionally, the determining the countermeasure sentence sample of the original sentence sample by the countermeasure sentence sample determination module includes: inputting the original sentence sample into a sentence classification model for sentence classification, and training a second preset neural network model for obtaining a sentence vector in the sentence classification model to enable the category of an original sentence in the original sentence sample predicted by the sentence classification model to be the same as the real category; changing sentence structure of the original sentence in the original sentence sample by synonym replacement; inputting the original sentence with the changed sentence structure into the sentence classification model again for predicting the category; and for any original sentence which is input into the sentence classification model again and has the same predicted category as the real category, repeating the process of changing the sentence structure and the predicted category until the category prediction is wrong or the number of times of changing the sentence structure and the predicted category reaches a second preset value, wherein after the sentence structure of the original sentence is changed, the sentence with the category prediction error is the countermeasure sentence of the original sentence or the sentence with the lowest category confidence in the second preset value versions corresponding to the original sentence is the countermeasure sentence of the original sentence, and all the countermeasure sentences form a countermeasure sentence sample.
Optionally, the contrast loss function value determining module obtains the contrast loss function value in combination with the countermeasure sentence vector corresponding to the countermeasure sentence in the countermeasure sentence sample of the historical batch.
Optionally, in a case that the countermeasure sentence sample determination module determines that the countermeasure sentence sample of the original sentence sample is determined based on the sentence classification model, the countermeasure sentence vector corresponding to the countermeasure sentence in the countermeasure sentence sample of the historical batch is obtained based on the trained second preset neural network model.
Optionally, the adjusting module is further configured to adjust a parameter of the trained second preset neural network model according to the obtained contrast loss function value.
Optionally, for any one of the original sentences in the original sentence samples, in the countermeasure sentence samples of the current batch and the countermeasure sentence samples of the historical batch, the countermeasure sentence with the same category as the original sentence is a positive sample, and the countermeasure sentence with a different category from the original sentence is a negative sample.
Optionally, the preset contrast loss function includes:
$$L_{SCL}(i) = -\frac{1}{m}\sum_{j=1}^{N+Q}\mathbb{1}[y_i = y_j]\,\log\frac{\exp\!\big(\mathrm{Sim}(z_i,\,z_j)/\tau\big)}{\sum_{k=1}^{N+Q}\exp\!\big(\mathrm{Sim}(z_i,\,z_k)/\tau\big)}$$

$$L_{SCL} = \frac{1}{N}\sum_{i=1}^{N}L_{SCL}(i)$$

wherein m represents the number of countermeasure sentences, among the countermeasure sentence samples of the current batch and of the historical batches, that belong to the same category as the original sentence i in the original sentence samples of the current batch; $y_i$ represents the category of the original sentence i and $y_j$ the category of the countermeasure sentence j; $\mathbb{1}[y_i = y_j]$ equals 1 when $y_i$ and $y_j$ are the same and 0 when they differ; N represents the number of original sentences in the original sentence samples of the current batch (equivalently, the number of countermeasure sentences in the countermeasure sentence samples of the current batch); Q represents the capacity of the queue storing the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the historical batches; $L_{SCL}(i)$ represents the contrast loss function value of the original sentence i; $L_{SCL}$ represents the average of the contrast loss function values of all original sentences in the original sentence samples of the current batch; $z_i$ represents the original sentence vector of the original sentence i; $z_j$ and $z_k$ represent the countermeasure sentence vectors of the countermeasure sentences j and k; Sim represents cosine similarity; and τ represents the temperature.
In addition, another aspect of the embodiments of the present invention also provides a machine-readable storage medium, which stores instructions for causing a machine to execute the above-mentioned method.
In addition, another aspect of the embodiments of the present invention further provides a processor, configured to execute a program, where the program is executed to perform the above method.
Furthermore, another aspect of the embodiments of the present invention also provides a computer program product, which includes a computer program/instructions, and the computer program/instructions, when executed by a processor, implement the method described above.
Through the above technical scheme, countermeasure sentence samples are used to train the model based on contrast learning, and the countermeasure sentence samples serve as a data augmentation strategy, so that problems such as changes in sentence semantics that can be caused by simple operations of randomly inserting or deleting words are avoided.
Additional features and advantages of embodiments of the invention will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the embodiments of the invention without limiting the embodiments of the invention. In the drawings:
FIG. 1 is a schematic diagram of a BERT model;
FIG. 2 is a schematic diagram of data conversion;
FIG. 3 is a diagram of a structure of a text representation model based on contrast learning;
FIG. 4 is a flow diagram of a method for training a model based on comparative learning provided by an embodiment of the present invention;
FIG. 5 is a schematic illustration of a challenge sample provided in accordance with another embodiment of the present invention;
FIG. 6 is a schematic diagram of a model based on comparative learning provided by another embodiment of the present invention; and
fig. 7 is a block diagram of an apparatus for training a model based on comparative learning according to another embodiment of the present invention.
Description of the reference numerals
1 countermeasure sentence sample determination module
2 countermeasure sentence vector determination module
3 original sentence vector determination module
4 contrast loss function value determination module
5 adjusting module
Detailed Description
The following detailed description of embodiments of the invention refers to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating embodiments of the invention, are given by way of illustration and explanation only, not limitation.
Pre-trained language models (such as BERT) have achieved significant results on various NLP tasks and are the most commonly used encoders. BERT itself already achieves striking results on tasks such as text classification; when BERT is used for a text classification task, the vector corresponding to the token at the first position of the last layer (namely [CLS]) is generally taken as the semantic representation of the whole sentence for classification, as shown in fig. 1.
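By way of illustration, the following minimal Python sketch takes the last-layer [CLS] vector as the sentence representation and feeds it to a linear classification layer; it assumes PyTorch and the HuggingFace transformers library, and the model name, example sentence, and number of categories are illustrative placeholders.

```python
# Sketch only: assumes PyTorch and the HuggingFace `transformers` package;
# "bert-base-chinese" and the 4-class head are hypothetical placeholders.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")
classifier = torch.nn.Linear(encoder.config.hidden_size, 4)  # 4 categories, assumed

inputs = tokenizer("An example sentence.", return_tensors="pt")
with torch.no_grad():
    outputs = encoder(**inputs)
cls_vec = outputs.last_hidden_state[:, 0]   # [CLS] vector of the last layer
logits = classifier(cls_vec)                # category scores for the whole sentence
```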
Self-supervised learning is a kind of unsupervised learning: a model is trained on unlabeled data through a pre-established auxiliary task to learn effective vector representations, and the pre-trained model is then used for downstream tasks, which can greatly improve its performance on those tasks. Such training mechanisms have been widely used in various fields. Common auxiliary tasks include the masked language model task in the text field, which belongs to generative self-supervised learning, where the model reconstructs the original input after noise is added to it, and the rotation-angle prediction task in the image field, which belongs to discriminative self-supervised learning, where the model predicts the angle by which a picture has been rotated, this angle being the pseudo label required by most discriminative self-supervised learning tasks. Recently, a discriminative self-supervised learning method also referred to as contrast learning has attracted great attention because of its excellent performance in the CV and NLP fields. Unlike the above self-supervised learning tasks, it needs neither pseudo labels nor the reconstruction of the original input used in generative self-supervised learning. Specifically, the goal of contrastive self-supervised learning is to first generate transformed versions of the original input data using a data transformation strategy, randomly pick one sample from the original data and one sample from the transformed data, and let the model predict whether the two samples come from the same original data (i.e., whether one of the samples is obtained from the other through data transformation), as shown in fig. 2.
Network structures based on contrast learning are currently varied and can be roughly divided into four types according to how negative samples are obtained: end-to-end, using a memory pool, using momentum encoding, and introducing clustering. In the CV field, the four architectures have their respective advantages, while in the NLP field the end-to-end architecture is currently the most commonly used. Furthermore, in the CV field, commonly used data transformations include rotation, cropping, and so on, while in the NLP field, currently common data transformations include back-translation and randomly inserting or deleting words, among others. The structure of a current NLP text representation model based on contrast learning is roughly as shown in fig. 3. The loss function of contrast learning (the contrastive loss) generally adopts the following formula:
$$\ell_p = -\log\frac{\exp\!\big(\mathrm{Sim}(z_p,\,z'_p)/\tau\big)}{\sum_{q=1}^{U}\exp\!\big(\mathrm{Sim}(z_p,\,z'_q)/\tau\big)}$$

where U represents the number of input samples (e.g., U = 2 in fig. 3), τ represents the temperature, and Sim(a, b) represents the similarity between two vectors, generally measured by the cosine similarity between vector a and vector b, calculated as

$$\mathrm{Sim}(a,\,b) = \frac{a \cdot b}{\lVert a \rVert\,\lVert b \rVert}$$

$z_p$ represents the sentence vector obtained by BERT encoding of the p-th sentence, and $z'_p$ represents the sentence vector of its transformed version. The idea of contrast learning is to regard these two sentences as a positive pair, so as to shorten the distance between them in the representation space and push the p-th sentence away from the other sentences.
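As an illustration of a contrast loss of this form, the following sketch (assuming PyTorch; the temperature value and batch layout are illustrative assumptions) computes the loss for a batch of sentence vectors and the vectors of their transformed versions.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z, z_aug, tau=0.05):
    """z, z_aug: (U, d) sentence vectors of the original sentences and of their
    transformed versions; row p of z_aug is the positive sample of row p of z."""
    z = F.normalize(z, dim=-1)
    z_aug = F.normalize(z_aug, dim=-1)
    sim = z @ z_aug.t() / tau                            # Sim(z_p, z'_q) / tau
    labels = torch.arange(z.size(0), device=z.device)    # positive of sentence p is z_aug[p]
    return F.cross_entropy(sim, labels)                  # mean of -log softmax over U candidates
```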
One aspect of an embodiment of the present invention provides a method for training a model based on comparative learning.
FIG. 4 is a flowchart of a method for training a model based on comparative learning according to an embodiment of the present invention. As shown in fig. 4, the method includes the following.
In step S40, countermeasure sentence samples of the original sentence samples are determined. Countermeasure (adversarial) samples were originally used in the CV field and refer to samples that, after a data processing operation, look unchanged to the human eye compared with the original pictures but can cause the model to make a wrong judgment. In the NLP field, because sentences are composed of words and this discreteness means that even slightly changed words can be detected by the human eye, a countermeasure sample refers to a sentence that, after a data processing operation, keeps the semantics of the original sentence unchanged but causes the model to make a wrong judgment. The ways of generating countermeasure samples are divided into white-box attacks and black-box attacks. A white-box attack strategy needs to compute the gradient of the model parameters with respect to the label and adds a perturbation along the direction in which the gradient rises. A black-box attack strategy does not care about details such as the specific parameters of the model; instead, on the premise of changing the semantics of the original sentence as little as possible, it changes the sentence through operations such as synonym replacement so that the model makes a wrong judgment. The black-box attack method may employ TextFooler, as shown in fig. 5 below. Specifically, in the embodiment of the present invention, the countermeasure sentence samples of the original sentence samples may be determined by adversarially training a second preset neural network model that is used, in a sentence classification model for sentence classification, to obtain sentence vectors. The second preset neural network model may be BERT. In addition, in the embodiment of the present invention, because synonym replacement is used, the sentence length changes, which solves the problem caused by the dropout mechanism.
In step S41, the original sentence samples of the current batch are input into the first preset neural network model for obtaining sentence vectors, so as to obtain the original sentence vectors corresponding to the original sentences in the original sentence samples of the current batch. The first preset neural network model may adopt BERT.
In step S42, the countermeasure sentence samples of the original sentence samples of the current batch are input into the first preset neural network model, and countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the current batch are obtained.
In step S43, based on the preset contrast loss function, the obtained original sentence vector and the obtained countermeasure sentence vector are combined to obtain a contrast loss function value.
In step S44, the parameters of the first preset neural network model are adjusted according to the obtained contrast loss function value, and the processes of obtaining the contrast loss function value and adjusting the parameters of the first preset neural network model accordingly are repeated, until the number of times the parameters of the first preset neural network model have been adjusted reaches the first preset number of times, thereby completing the training process. For example, the parameters of the first preset neural network model may be adjusted by back-propagation.
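By way of illustration, the following sketch strings steps S41 to S44 together as a training loop; it assumes PyTorch, and the names bert_c (an encoder returning sentence vectors), contrast_loss_fn (a preset contrast loss, for example of the supervised form given later in this description), loader and first_preset_number are hypothetical placeholders.

```python
# Illustrative training loop for steps S41-S44 (assumed helper names).
optimizer = torch.optim.AdamW(bert_c.parameters(), lr=2e-5)

for step, (orig_batch, adv_batch, labels) in enumerate(loader):
    z_orig = bert_c(orig_batch)                      # step S41: original sentence vectors
    z_adv = bert_c(adv_batch)                        # step S42: countermeasure sentence vectors
    loss = contrast_loss_fn(z_orig, z_adv, labels)   # step S43: preset contrast loss value
    optimizer.zero_grad()
    loss.backward()                                  # step S44: adjust parameters by back-propagation
    optimizer.step()
    if step + 1 >= first_preset_number:              # stop after the first preset number of updates
        break
```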
Through the above technical scheme, countermeasure sentence samples are used to train the model based on contrast learning, and the countermeasure sentence samples serve as a data augmentation strategy, so that problems such as changes in sentence semantics that can be caused by simple operations of randomly inserting or deleting words are avoided.
Optionally, in the embodiment of the present invention, determining the countermeasure sentence samples of the original sentence samples may include the following. The original sentence samples are input into a sentence classification model for sentence classification, and a second preset neural network model, used in the sentence classification model for obtaining sentence vectors, is trained so that the category predicted by the sentence classification model for an original sentence in the original sentence samples is the same as its real category. Specifically, the second preset neural network model is trained by adjusting its parameters. The sentence structure of the original sentence in the original sentence samples is then changed by synonym replacement. The original sentence with the changed sentence structure is input into the sentence classification model again to predict its category; at this point, the second preset neural network model in the sentence classification model has already been trained. For any original sentence that is input into the sentence classification model again and whose predicted category is the same as the real category, the process of changing the sentence structure and predicting the category is repeated until the category is predicted incorrectly or the number of times of changing the sentence structure and predicting the category reaches a second preset number. A sentence whose category is predicted incorrectly after its sentence structure has been changed is the countermeasure sentence of the original sentence, or the sentence with the lowest category confidence among the second-preset-number of versions corresponding to the original sentence is the countermeasure sentence of the original sentence, and all the countermeasure sentences form the countermeasure sentence samples.
Optionally, in this embodiment of the present invention, the contrast loss function value is obtained by combining the countermeasure statement vector corresponding to the countermeasure statement in the history batch of countermeasure statement samples.
Optionally, in this embodiment of the present invention, in a case that it is determined that the countermeasure sentence sample of the original sentence sample is determined based on the sentence classification model, the countermeasure sentence vector corresponding to the countermeasure sentence in the history batch of countermeasure sentence samples is obtained based on the trained second preset neural network model.
Optionally, in an embodiment of the present invention, the method further includes: and adjusting the parameters of the trained second preset neural network model according to the obtained comparison loss function value.
Optionally, in the embodiment of the present invention, for any original sentence in the original sentence samples, among the countermeasure sentence samples of the current batch and the countermeasure sentence samples of the historical batches, a countermeasure sentence of the same category as the original sentence is a positive sample, and a countermeasure sentence of a different category from the original sentence is a negative sample. Taking countermeasure sentences of the same category as an original sentence as positive samples increases the number of sentences included among the positive samples. Adding the countermeasure sentence samples of the historical batches to the positive and negative samples further increases the number of sentences included among the positive and negative samples. In addition, by taking countermeasure sentences of the same category as the original sentence as positive samples and countermeasure sentences of different categories as negative samples, contrast learning is applied to the supervised task in a supervised manner.
Optionally, in the embodiment of the present invention, the preset contrast loss function includes:

$$L_{SCL}(i) = -\frac{1}{m}\sum_{j=1}^{N+Q}\mathbb{1}[y_i = y_j]\,\log\frac{\exp\!\big(\mathrm{Sim}(z_i,\,z_j)/\tau\big)}{\sum_{k=1}^{N+Q}\exp\!\big(\mathrm{Sim}(z_i,\,z_k)/\tau\big)}$$

$$L_{SCL} = \frac{1}{N}\sum_{i=1}^{N}L_{SCL}(i)$$

wherein m represents the number of countermeasure sentences, among the countermeasure sentence samples of the current batch and of the historical batches, that belong to the same category as the original sentence i in the original sentence samples of the current batch; $y_i$ represents the category of the original sentence i and $y_j$ the category of the countermeasure sentence j; $\mathbb{1}[y_i = y_j]$ equals 1 when $y_i$ and $y_j$ are the same and 0 when they differ; N represents the number of original sentences in the original sentence samples of the current batch (equivalently, the number of countermeasure sentences in the countermeasure sentence samples of the current batch); Q represents the capacity of the queue storing the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the historical batches; $L_{SCL}(i)$ represents the contrast loss function value of the original sentence i; $L_{SCL}$ represents the average of the contrast loss function values of all original sentences in the original sentence samples of the current batch; $z_i$ represents the original sentence vector of the original sentence i; $z_j$ and $z_k$ represent the countermeasure sentence vectors of the countermeasure sentences j and k; Sim represents cosine similarity; and τ represents the temperature.
Fig. 6 is a schematic diagram of a model based on contrast learning according to another embodiment of the present invention. In this embodiment, countermeasure samples may be generated by a black-box attack technique such as the TextFooler method: a sentence with a category label is input into the model and, through operations such as synonym replacement that do not change the sentence semantics, is repeatedly modified and re-input into the model until the model predicts the wrong category or outputs a low confidence. In the self-supervised form of the contrast loss function, for a given sentence, all the remaining sentences within a batch are typically pushed away in the representation space as negative samples. However, many sentences in a batch may belong to the same class, and pushing apart sentences of the same class reduces the accuracy of model classification. A contrast loss in supervised form is therefore used, i.e., the label information of the supervised task is utilized. Specifically, for a given original sentence, all countermeasure sentences in the batch that belong to the same category as the original sentence are taken as its positive samples, and countermeasure sentences of other categories are taken as negative samples. Thus, the supervised form of the contrast loss differs from the self-supervised form in that the positive samples of one original sentence can include multiple countermeasure sentences. The momentum contrast mechanism mainly comprises a queue and a slowly updated encoder. The queue stores the sentence vectors of the countermeasure sentences of each batch together with their corresponding categories, and the encoder adopts the model trained in the countermeasure training stage, because that model has a certain fault tolerance and is more robust to countermeasure samples. The technical scheme provided by the embodiment of the present invention can achieve the following: 1) countermeasure samples are used as a data augmentation strategy, which improves the quality of the positive samples and avoids problems such as changes in sentence semantics that can be caused by simple random insertion and deletion of words; 2) countermeasure sentences of the same category as the original sentence are taken as positive samples and countermeasure sentences of different categories as negative samples, so that contrast learning is applied, in a supervised manner, as an auxiliary task on the supervised task, improving the performance of the model on the supervised task; at the same time, the label category information of the supervised task serves as an extra learning signal that improves the contrast loss function and the quality of the finally learned sentence vectors; 3) a momentum contrast mechanism is used to increase the number of negative samples without enlarging the batch size, with the aim of improving the quality of the finally learned sentence vectors.
The method for training the model based on the comparative learning provided in the embodiment of the present invention is exemplarily described below with reference to fig. 6. In this embodiment, the first preset neural network model and the second preset neural network model both use BERT, the first preset neural network model is represented by BERTc, and the second preset neural network model is represented by BERTa.
First, the countermeasure sentence samples are determined by adversarially training BERTa. In the countermeasure training, a black-box attack technique is adopted; for example, operations such as synonym replacement are used to change sentence structures. Specifically, with the TextFooler method, the sentence classification model for sentence classification includes BERTa and a fully-connected neural network layer. BERTa is first fine-tuned with the original sentence samples so that it can learn the features of the original sentences in the original sentence samples; that is, by fine-tuning the parameters of BERTa, the category obtained by the sentence classification model for an original sentence in the original sentence samples becomes the same as the real category of the original sentence, where the real category is provided as the supervision label. After BERTa has learned the features of the original sentences in the original sentence samples, the sentence structures of the original sentences are changed through operations such as synonym replacement, the modified sentences are input into the sentence classification model again, the categories are predicted by the sentence classification model, and the sentence structure continues to be changed through operations such as synonym replacement until the category is predicted incorrectly; the original sentence with the changed sentence structure is then taken as the countermeasure sentence of the unchanged original sentence. The countermeasure sentences of the original sentences in the original sentence samples constitute the countermeasure sentence samples. In addition, if the model never makes a wrong prediction and the number of times the sentence structure has been changed reaches the second preset number, then, among the categories and corresponding category confidences obtained by the sentence classification model when predicting the categories of the second-preset-number of sentence versions, the version with the lowest category confidence is selected as the countermeasure sentence. Furthermore, the countermeasure sentence samples generated by this operation can be added to the training set to further fine-tune BERTa and enhance the robustness of the model.
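The following sketch illustrates this generation loop in simplified form; the classifier interface, the synonym_replace helper and max_attempts are hypothetical placeholders, and the sketch is a simplification of the full TextFooler procedure.

```python
def generate_countermeasure_sentence(sentence, true_label, classifier,
                                     synonym_replace, max_attempts):
    """Repeatedly perturb `sentence` by synonym replacement until the classifier
    mispredicts its category, or return the lowest-confidence version."""
    best_sentence, best_conf = sentence, 1.0
    current = sentence
    for _ in range(max_attempts):              # the second preset number of times
        current = synonym_replace(current)     # change structure, keep semantics
        pred_label, conf = classifier(current) # predict the category again
        if pred_label != true_label:           # wrong prediction: countermeasure sentence found
            return current
        if conf < best_conf:                   # track the lowest-confidence version
            best_sentence, best_conf = current, conf
    return best_sentence
```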
Secondly, a momentum contrast mechanism is used in the embodiment of the present invention, and it comprises a queue and a momentum encoding module. Specifically, the original sentence samples of the current batch are input into BERTc to obtain the original sentence vectors corresponding to the original sentences in the original sentence samples. The countermeasure sentence samples of the original sentence samples of the current batch are input into BERTc to obtain the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the current batch. The countermeasure sentence samples of historical batches are input into the momentum encoding module to obtain the corresponding countermeasure sentence vectors, where the encoder of the momentum encoding module adopts the adversarially trained BERTa. The countermeasure sentence vectors obtained by the momentum encoding module are stored in a queue. If the queue reaches its maximum capacity, the countermeasure sentence vectors put in by the earliest batch are removed. In addition, since a supervised form of contrast learning is adopted here, the category of the sentence corresponding to each sentence vector also needs to be saved.
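As an illustration of the queue part of the momentum contrast mechanism, the following minimal sketch (assuming PyTorch; the capacity, dimensions, and the assumption that all tensors live on the same device are placeholders) enqueues the countermeasure sentence vectors and categories of each batch and removes the earliest entries once the capacity Q is exceeded.

```python
import torch

class MomentumQueue:
    def __init__(self, capacity, dim):
        self.capacity = capacity                          # Q: queue capacity
        self.vectors = torch.empty(0, dim)                # stored countermeasure sentence vectors
        self.labels = torch.empty(0, dtype=torch.long)    # their categories

    def enqueue(self, adv_vectors, adv_labels):
        # Vectors come from the momentum encoder and are not back-propagated through.
        self.vectors = torch.cat([self.vectors, adv_vectors.detach()])
        self.labels = torch.cat([self.labels, adv_labels])
        if self.vectors.size(0) > self.capacity:          # drop the earliest batch's entries
            overflow = self.vectors.size(0) - self.capacity
            self.vectors = self.vectors[overflow:]
            self.labels = self.labels[overflow:]
```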
Then, based on the preset contrast loss function, the obtained original sentence vectors and countermeasure sentence vectors are combined to obtain the contrast loss function value. In the self-supervised form of the contrast loss, positive samples come from data-augmented samples, and negative samples are obtained by randomly selecting samples from all data-augmented samples of the current batch or from the data set. This is not suitable for supervised tasks, because a batch inevitably contains many samples of the same category, especially when the number of categories is small and the batch size is large. In the technical scheme provided by the embodiment of the present invention, a contrast loss in supervised form is adopted: for an original sentence, negative samples are taken from the countermeasure sentences in the same batch that belong to categories different from that of the original sentence, and, in order to increase the number of positive samples, the countermeasure sentences belonging to the same category as the original sentence can be taken as positive samples. Furthermore, by utilizing the momentum contrast mechanism, positive and negative samples are also taken from the countermeasure sentences in the countermeasure samples of historical batches, which further increases the number of positive and negative samples. Specifically, for any original sentence in the original sentence samples, among the countermeasure sentence samples of the current batch and of the historical batches, a countermeasure sentence of the same category as the original sentence is a positive sample, and a countermeasure sentence of a different category is a negative sample. The preset contrast loss function is designed according to the following formula:
$$L_{SCL}(i) = -\frac{1}{m}\sum_{j=1}^{N+Q}\mathbb{1}[y_i = y_j]\,\log\frac{\exp\!\big(\mathrm{Sim}(z_i,\,z_j)/\tau\big)}{\sum_{k=1}^{N+Q}\exp\!\big(\mathrm{Sim}(z_i,\,z_k)/\tau\big)}$$

$$L_{SCL} = \frac{1}{N}\sum_{i=1}^{N}L_{SCL}(i)$$

wherein m represents the number of countermeasure sentences, among the countermeasure sentence samples of the current batch and of the historical batches, that belong to the same category as the original sentence i in the original sentence samples of the current batch; $y_i$ represents the category of the original sentence i and $y_j$ the category of the countermeasure sentence j; $\mathbb{1}[y_i = y_j]$ equals 1 when $y_i$ and $y_j$ are the same and 0 when they differ; N represents the number of original sentences in the original sentence samples of the current batch (equivalently, the number of countermeasure sentences in the countermeasure sentence samples of the current batch); Q represents the capacity of the queue storing the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the historical batches; $L_{SCL}(i)$ represents the contrast loss function value of the original sentence i; $L_{SCL}$ represents the average of the contrast loss function values of all original sentences in the original sentence samples of the current batch; $z_i$ represents the original sentence vector of the original sentence i; $z_j$ and $z_k$ represent the countermeasure sentence vectors of the countermeasure sentences j and k; Sim represents cosine similarity; and τ represents the temperature. Furthermore, for the contrast loss $L_{SCL}(i)$ of the original sentence i, the countermeasure sentences, among the countermeasure sentence samples of the current batch and all the countermeasure sentence samples in the queue, that belong to the same category as the original sentence i are taken as positive samples so as to draw samples of the same category closer together, while the other countermeasure sentences of different categories are taken as negative samples so as to push samples of different categories further apart.
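As an illustration of the loss defined above, the following PyTorch sketch computes L_SCL from the current batch's original sentence vectors, the current batch's countermeasure sentence vectors, their categories, and the queued countermeasure sentence vectors of historical batches; the tensor layout and temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(z, z_adv, labels, queue_vecs, queue_labels, tau=0.05):
    """z: (N, d) original sentence vectors; z_adv: (N, d) current-batch countermeasure
    sentence vectors; queue_vecs: (Q, d) historical countermeasure sentence vectors."""
    cand = F.normalize(torch.cat([z_adv, queue_vecs]), dim=-1)             # N + Q candidates
    cand_labels = torch.cat([labels, queue_labels])
    z = F.normalize(z, dim=-1)
    sim = z @ cand.t() / tau                                               # Sim(z_i, .) / tau
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)             # log softmax over N + Q
    pos_mask = (labels.unsqueeze(1) == cand_labels.unsqueeze(0)).float()   # indicator 1[y_i == y_j]
    m = pos_mask.sum(dim=1).clamp(min=1)                                   # positives per original sentence
    loss_i = -(pos_mask * log_prob).sum(dim=1) / m                         # L_SCL(i)
    return loss_i.mean()                                                   # L_SCL
```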
Finally, the contrast loss function value of the original sentence samples and the cross-entropy loss function value of the original sentence samples are added together; the parameters of BERTc are adjusted by back-propagation according to this sum, and the parameters of BERTa used in the momentum encoding module are adjusted in a slow-update manner. The update mechanism of the BERTa parameters in the momentum encoding module may adopt the following formula:

$$\theta_a = \lambda\,\theta_a + (1-\lambda)\,\theta_c$$

where $\theta_a$ represents the parameters of BERTa in the momentum encoding module, and $\theta_c$ represents the parameters of BERTc obtained by parameter adjustment based on the sum of the contrast loss function value and the cross-entropy loss function value. Through this slow update, the sentence vectors of multiple batches stored in the queue keep a certain degree of consistency. Furthermore, the cross-entropy loss function may take the form:

$$L_{CE} = -\frac{1}{N}\sum_{i=1}^{N}\log p\big(y_i \mid z_i\big)$$

where p may be a softmax function. The processes of calculating the cross-entropy loss function value and the contrast loss function value, adjusting the parameters of BERTc according to the sum of the two, and adjusting the parameters of BERTa in the momentum encoding module in the slow-update manner are repeated, until the number of times the parameters of BERTc and BERTa have been adjusted reaches the first preset number, thereby completing the training process.
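By way of illustration, the following sketch performs this final update step; the names bert_a, bert_c, the optimizer, and the momentum coefficient λ (lam) are hypothetical placeholders, and the two loss values are assumed to have been computed as described above.

```python
# Illustrative joint update of BERTc (by gradient) and BERTa (by momentum); assumed names.
loss = cross_entropy_loss + contrast_loss_value   # sum of the two loss function values
optimizer.zero_grad()
loss.backward()                                   # adjust the parameters of BERTc by back-propagation
optimizer.step()

lam = 0.999                                       # momentum coefficient lambda (assumed value)
with torch.no_grad():
    for p_a, p_c in zip(bert_a.parameters(), bert_c.parameters()):
        p_a.mul_(lam).add_((1.0 - lam) * p_c)     # theta_a = lam * theta_a + (1 - lam) * theta_c
```

A λ close to 1 corresponds to the slow update described above, which keeps the sentence vectors of the batches stored in the queue reasonably consistent.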
In summary, in the technical scheme provided by the embodiment of the present invention, 1) the contrast loss is applied to the text classification task in a supervised manner, so that the model can learn the differences between the representation features of samples of different classes, thereby improving the classification accuracy, and the final model can be used for both the classification task and the semantic matching task; and 2) by introducing mechanisms such as countermeasure samples and momentum contrast, the quality of the finally learned text representations is improved.
Accordingly, another aspect of the embodiments of the present invention also provides an apparatus for training a model based on comparative learning.
Fig. 7 is a block diagram of an apparatus for training a model based on comparative learning according to another embodiment of the present invention. As shown in fig. 7, the apparatus includes a countermeasure sentence sample determination module 1, a countermeasure sentence vector determination module 2, an original sentence vector determination module 3, a contrast loss function value determination module 4, and an adjustment module 5. The confrontation statement sample determining module 1 is used for determining confrontation statement samples of original statement samples; the original statement vector determining module 3 is configured to input the original statement samples of the current batch into a first preset neural network model for obtaining statement vectors of statements, so as to obtain original statement vectors corresponding to original statements in the original statement samples of the current batch; the confrontation statement vector determination module 2 is configured to input the confrontation statement samples of the current batch of original statement samples into the first preset neural network model, and obtain confrontation statement vectors corresponding to the confrontation statements in the current batch of confrontation statement samples; the comparison loss function value determining module 4 is configured to obtain a comparison loss function value by combining the obtained original sentence vector and the obtained countermeasure sentence vector based on a preset comparison loss function; the adjusting module 5 is configured to adjust a parameter of the first preset neural network model according to the obtained comparison loss function value, and repeat the processes of obtaining the comparison loss function value and adjusting the parameter of the first preset neural network model according to the obtained comparison loss function value, so that the number of times of adjusting the parameter of the first preset neural network model reaches a first preset number of times, thereby completing a training process.
Optionally, in this embodiment of the present invention, the determining, by the confrontation statement sample determining module, a confrontation statement sample of the original statement sample includes: inputting the original sentence sample into a sentence classification model for sentence classification, and training a second preset neural network model for obtaining a sentence vector in the sentence classification model to enable the category of the original sentence in the original sentence sample predicted by the sentence classification model to be the same as the real category; changing the sentence structure of the original sentence in the original sentence sample through synonym replacement; inputting the original sentence with the changed sentence structure into the sentence classification model again to predict the category; and for any original sentence which is input into the sentence classification model again and has the same predicted category as the real category, repeating the process of changing the sentence structure and the predicted category until the category is predicted wrongly or the times of changing the sentence structure and the predicted category reach a second preset value, wherein after the sentence structure of the original sentence is changed, the sentence with the wrong category is the countermeasure sentence of the original sentence or the sentence with the lowest category confidence coefficient in the sentences of the second preset value versions corresponding to the original sentence is the countermeasure sentence of the original sentence, and all the countermeasure sentences form a countermeasure sentence sample.
Optionally, in this embodiment of the present invention, the comparison loss function value obtained by the comparison loss function value determining module is further combined with a countermeasure sentence vector corresponding to a countermeasure sentence in a history batch of countermeasure sentence samples.
Optionally, in this embodiment of the present invention, in a case that the countermeasure sentence sample determination module determines that the countermeasure sentence sample of the original sentence sample is determined based on the sentence classification model, the countermeasure sentence vector corresponding to the countermeasure sentence in the history batch of countermeasure sentence samples is obtained based on the trained second preset neural network model.
Optionally, in an embodiment of the present invention, the adjusting module is further configured to adjust a parameter of the trained second preset neural network model according to the obtained comparison loss function value.
Optionally, in this embodiment of the present invention, for any original sentence in the original sentence samples, in the countermeasure sentence samples of the current batch and the countermeasure sentence samples of the historical batch, a countermeasure sentence with the same category as the original sentence is a positive sample, and a countermeasure sentence with a different category from the original sentence is a negative sample.
Optionally, in the embodiment of the present invention, the preset contrast loss function includes:

$$L_{SCL}(i) = -\frac{1}{m}\sum_{j=1}^{N+Q}\mathbb{1}[y_i = y_j]\,\log\frac{\exp\!\big(\mathrm{Sim}(z_i,\,z_j)/\tau\big)}{\sum_{k=1}^{N+Q}\exp\!\big(\mathrm{Sim}(z_i,\,z_k)/\tau\big)}$$

$$L_{SCL} = \frac{1}{N}\sum_{i=1}^{N}L_{SCL}(i)$$

wherein m represents the number of countermeasure sentences, among the countermeasure sentence samples of the current batch and of the historical batches, that belong to the same category as the original sentence i in the original sentence samples of the current batch; $y_i$ represents the category of the original sentence i and $y_j$ the category of the countermeasure sentence j; $\mathbb{1}[y_i = y_j]$ equals 1 when $y_i$ and $y_j$ are the same and 0 when they differ; N represents the number of original sentences in the original sentence samples of the current batch (equivalently, the number of countermeasure sentences in the countermeasure sentence samples of the current batch); Q represents the capacity of the queue storing the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the historical batches; $L_{SCL}(i)$ represents the contrast loss function value of the original sentence i; $L_{SCL}$ represents the average of the contrast loss function values of all original sentences in the original sentence samples of the current batch; $z_i$ represents the original sentence vector of the original sentence i; $z_j$ and $z_k$ represent the countermeasure sentence vectors of the countermeasure sentences j and k; Sim represents cosine similarity; and τ represents the temperature.
The specific working principle and benefits of the device for training a model based on contrast learning provided by the embodiment of the present invention are similar to those of the method for training a model based on contrast learning provided by the embodiment of the present invention, and are not described in detail here.
The device for training a model based on contrast learning comprises a processor and a memory, wherein the confrontation sentence sample determination module, the original sentence vector determination module, the confrontation sentence vector determination module, the comparison loss function value determination module, the adjustment module, and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize the corresponding functions.
The processor comprises a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided, and by adjusting the kernel parameters, problems such as changes in sentence semantics that may be caused by simple random word insertion and deletion operations are addressed.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.
In addition, another aspect of the embodiments of the present invention also provides a machine-readable storage medium, on which instructions are stored, the instructions being used for causing a machine to execute the method described in the above embodiments.
In addition, another aspect of the embodiments of the present invention further provides a processor, where the processor is configured to execute a program, where the program executes the method described in the foregoing embodiments.
In addition, another aspect of the embodiments of the present invention further provides an apparatus, where the apparatus includes a processor, a memory, and a program stored in the memory and executable on the processor, and the processor executes the program to implement the method described in the foregoing embodiments. The device herein may be a server, a PC, a PAD, a mobile phone, etc.
Furthermore, another aspect of the embodiments of the present invention also provides a computer program product, which includes a computer program/instruction, and when the computer program/instruction is executed by a processor, the computer program/instruction implements the method described in the above embodiments.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method for training a model based on comparative learning, the method comprising:
determining an antagonistic statement sample of the original statement sample;
inputting the original statement samples of the current batch into a first preset neural network model for obtaining statement vectors, and obtaining original statement vectors corresponding to original statements in the original statement samples of the current batch;
inputting the countermeasure statement samples of the original statement samples of the current batch into the first preset neural network model to obtain countermeasure statement vectors corresponding to countermeasure statements in the countermeasure statement samples of the current batch;
based on a preset contrast loss function, combining the obtained original statement vector and the obtained countermeasure statement vector to obtain a contrast loss function value; and
adjusting parameters of the first preset neural network model according to the obtained contrast loss function value, and repeating the processes of obtaining the contrast loss function value and adjusting the parameters of the first preset neural network model according to the obtained contrast loss function value until the number of times the parameters of the first preset neural network model are adjusted reaches a first preset number of times, thereby completing the training process.
2. The method of claim 1, wherein determining the antagonistic sentence sample of the original sentence sample comprises:
inputting the original sentence sample into a sentence classification model for sentence classification, and training a second preset neural network model for obtaining a sentence vector in the sentence classification model to enable the category of an original sentence in the original sentence sample predicted by the sentence classification model to be the same as the real category;
changing sentence structure of the original sentence in the original sentence sample by synonym replacement;
inputting the original sentence with the changed sentence structure into the sentence classification model again for predicting the category; and
for any original sentence that is input into the sentence classification model again and whose predicted category is the same as the real category, repeating the process of changing the sentence structure and predicting the category until the category is predicted incorrectly or the number of times the sentence structure is changed and the category is predicted reaches a second preset value, wherein, after the sentence structure of the original sentence is changed, the sentence whose category is predicted incorrectly is the countermeasure sentence of the original sentence, or the sentence with the lowest category confidence among the second-preset-value versions corresponding to the original sentence is the countermeasure sentence of the original sentence, and all the countermeasure sentences form a countermeasure sentence sample.
3. The method of claim 1 or 2, wherein the derived contrast loss function value is further combined with the countermeasure statement vector corresponding to the countermeasure statement in the countermeasure statement sample of a historical batch.
4. The method of claim 3, wherein, in a case that the countermeasure sentence sample of the original sentence sample is determined based on the sentence classification model, the countermeasure sentence vector corresponding to the countermeasure sentence in the countermeasure sentence sample of the historical batch is obtained based on the trained second preset neural network model.
5. The method of claim 4, further comprising:
and adjusting the parameters of the trained second preset neural network model according to the obtained comparison loss function value.
6. The method according to claim 3, wherein for any one of the original sentences in the original sentence samples, in the countermeasure sentence samples of the current batch and the countermeasure sentence samples of the historical batch, the countermeasure sentence of the same category as the original sentence is a positive sample, and the countermeasure sentence of a different category from the original sentence is a negative sample.
7. The method of claim 6, wherein the predetermined contrast loss function comprises:
L_{SCL}(i) = -\frac{1}{m} \sum_{j=1}^{N+Q} \mathbb{1}_{[y_i = y_j]} \log \frac{\exp\left(\mathrm{Sim}(z_i, z_j)/\tau\right)}{\sum_{k=1}^{N+Q} \exp\left(\mathrm{Sim}(z_i, z_k)/\tau\right)}

L_{SCL} = \frac{1}{N} \sum_{i=1}^{N} L_{SCL}(i)

wherein m represents the number of countermeasure sentences, among the countermeasure sentence samples of the current batch and of the historical batch, that belong to the same category as the original sentence i in the original sentence samples of the current batch; y_i represents the category of the original sentence i and y_j represents the category of the countermeasure sentence j; \mathbb{1}_{[y_i = y_j]} is 1 when y_i and y_j are the same and 0 when they are different; N represents the number of original sentences in the original sentence samples of the current batch, or equivalently the number of countermeasure sentences in the countermeasure sentence samples of the current batch; Q represents the capacity of the queue storing the countermeasure sentence vectors corresponding to the countermeasure sentences in the countermeasure sentence samples of the historical batch; L_{SCL}(i) represents the contrast loss function value of the original sentence i; L_{SCL} represents the average of the contrast loss function values of all the original sentences in the original sentence samples of the current batch; z_i represents the original sentence vector of the original sentence i; z_j and z_k represent the countermeasure sentence vectors of the countermeasure sentences j and k; Sim represents cosine similarity; and τ represents the temperature.
8. A machine-readable storage medium having stored thereon instructions for causing a machine to perform the method of any one of claims 1-7.
9. A processor configured to execute a program, wherein the program is configured to perform the method of any one of claims 1-7 when executed.
10. A computer program product comprising computer programs/instructions, characterized in that the computer programs/instructions, when executed by a processor, implement the method of any of claims 1-7.
CN202111221793.6A 2021-10-20 2021-10-20 Method and apparatus for training a model based on contrast learning Active CN113837370B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111221793.6A CN113837370B (en) 2021-10-20 2021-10-20 Method and apparatus for training a model based on contrast learning

Publications (2)

Publication Number Publication Date
CN113837370A true CN113837370A (en) 2021-12-24
CN113837370B CN113837370B (en) 2023-12-05

Family

ID=78965486

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111221793.6A Active CN113837370B (en) 2021-10-20 2021-10-20 Method and apparatus for training a model based on contrast learning

Country Status (1)

Country Link
CN (1) CN113837370B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210319266A1 (en) * 2020-04-13 2021-10-14 Google Llc Systems and methods for contrastive learning of visual representations
CN111767405A (en) * 2020-07-30 2020-10-13 腾讯科技(深圳)有限公司 Training method, device and equipment of text classification model and storage medium
CN111914928A (en) * 2020-07-30 2020-11-10 南京大学 Method for defending confrontation sample for image classifier
CN112668325A (en) * 2020-12-18 2021-04-16 平安科技(深圳)有限公司 Machine translation enhancing method, system, terminal and storage medium
CN112906820A (en) * 2021-03-19 2021-06-04 中国石油大学(华东) Method for calculating sentence similarity of antithetical convolution neural network based on genetic algorithm
CN112711942A (en) * 2021-03-29 2021-04-27 贝壳找房(北京)科技有限公司 Training method, generation method, device and equipment of house source title generation model
CN113139479A (en) * 2021-04-28 2021-07-20 山东大学 Micro-expression recognition method and system based on optical flow and RGB modal contrast learning

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114549891B (en) * 2022-01-06 2024-03-08 中国人民解放军国防科技大学 Foundation cloud image cloud class identification method based on comparison self-supervision learning
CN114756677A (en) * 2022-03-21 2022-07-15 马上消费金融股份有限公司 Sample generation method, training method of text classification model and text classification method
CN114756677B (en) * 2022-03-21 2023-07-25 马上消费金融股份有限公司 Sample generation method, training method of text classification model and text classification method
CN114417794A (en) * 2022-03-29 2022-04-29 北京大学 Training method and device for scale problem generation model and computer equipment
CN114417794B (en) * 2022-03-29 2022-09-09 北京大学 Training method and device for scale problem generation model and computer equipment
CN114861637A (en) * 2022-05-18 2022-08-05 北京百度网讯科技有限公司 Method and device for generating spelling error correction model and method and device for spelling error correction
CN114970716A (en) * 2022-05-26 2022-08-30 支付宝(杭州)信息技术有限公司 Method and device for training representation model, readable storage medium and computing equipment
CN114742018A (en) * 2022-06-09 2022-07-12 成都晓多科技有限公司 Contrast learning level coding text clustering method and system based on confrontation training
CN115081446A (en) * 2022-07-08 2022-09-20 重庆大学 Text matching method based on dynamic multi-mask and enhanced countermeasure
CN116069903A (en) * 2023-03-02 2023-05-05 特斯联科技集团有限公司 Class search method, system, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN113837370B (en) 2023-12-05

Similar Documents

Publication Publication Date Title
CN113837370B (en) Method and apparatus for training a model based on contrast learning
US20220269707A1 (en) Method and system for analyzing entities
US20230222353A1 (en) Method and system for training a neural network model using adversarial learning and knowledge distillation
CN109313720B (en) Enhanced neural network with sparsely accessed external memory
CN111241851A (en) Semantic similarity determination method and device and processing equipment
CN111160000B (en) Composition automatic scoring method, device terminal equipment and storage medium
CN112527959B (en) News classification method based on pooling convolution embedding and attention distribution neural network
CN111753878A (en) Network model deployment method, equipment and medium
CN115080749B (en) Weak supervision text classification method, system and device based on self-supervision training
CN109597982B (en) Abstract text recognition method and device
CN110019784B (en) Text classification method and device
Roth et al. Training discrete-valued neural networks with sign activations using weight distributions
CN115700515A (en) Text multi-label classification method and device
CN111611796A (en) Hypernym determination method and device for hyponym, electronic device and storage medium
CN112084301A (en) Training method and device of text correction model and text correction method and device
CN112667803A (en) Text emotion classification method and device
CN116189208A (en) Method, apparatus, device and medium for text recognition
CN113704466B (en) Text multi-label classification method and device based on iterative network and electronic equipment
CN115062769A (en) Knowledge distillation-based model training method, device, equipment and storage medium
Julian Deep learning with pytorch quick start guide: learn to train and deploy neural network models in Python
CN116090538A (en) Model weight acquisition method and related system
CN115700555A (en) Model training method, prediction method, device and electronic equipment
CN113033212B (en) Text data processing method and device
Burciu et al. Sensing forest for pattern recognition
CN111737440B (en) Question generation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220425

Address after: 100085 Floor 101 102-1, No. 35 Building, No. 2 Hospital, Xierqi West Road, Haidian District, Beijing

Applicant after: Seashell Housing (Beijing) Technology Co.,Ltd.

Address before: 101309 room 24, 62 Farm Road, Erjie village, Yangzhen, Shunyi District, Beijing

Applicant before: Beijing fangjianghu Technology Co.,Ltd.

GR01 Patent grant