CN114169392A - Model training method and device, task processing method, storage medium and processor - Google Patents

Info

Publication number
CN114169392A
Authority
CN
China
Prior art keywords: data, model, training, pseudo, labeled
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111267999.2A
Other languages
Chinese (zh)
Inventor
李进锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN202111267999.2A priority Critical patent/CN114169392A/en
Publication of CN114169392A publication Critical patent/CN114169392A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/2155: Generating training patterns; Bootstrap methods, e.g. bagging or boosting, characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches, based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a model training method and device, a task processing method, a storage medium and a processor. The method comprises the following steps: training with first data to obtain a first model, wherein the first data is labeled data and the first model is a teacher model; applying adversarial perturbation to the first model based on the first data to obtain second data, wherein the second data is pseudo-labeled data; and training with the first data and the second data to obtain a second model, wherein the second model is a student model. The invention solves the technical problems that model training methods in the related art cannot train models with only a small amount of labeled data and have poor active perception of and resistance to adversarial risks.

Description

Model training method and device, task processing method, storage medium and processor
Technical Field
The invention relates to the technical field of deep learning, in particular to a model training method and device, a task processing method, a storage medium and a processor.
Background
With continuous breakthroughs in deep learning technology, artificial intelligence models based on deep neural networks have been widely applied in fields such as images, text, and speech. However, conventional artificial intelligence models usually rely on large amounts of high-quality labeled data.
On the one hand, obtaining high-quality labeled data in real scenarios is time-consuming and labor-intensive, requiring substantial labor and material costs, and label quality is greatly affected by the annotator's learning ability and prior knowledge and by the difficulty of the labeling task. On the other hand, existing artificial intelligence models are often built on clean, non-adversarial data; because of its intrinsic vulnerability to adversarial interference, a neural network model is sensitive to adversarial noise and cannot accurately handle adversarial samples elaborately constructed by malicious users in real adversarial scenarios. As a result, in adversarial scenarios the model suffers from rapid performance degradation and cannot maintain robustness, so the accuracy, robustness, and security of the neural network model in actual deployment and application cannot be guaranteed.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides a model training method and device, a task processing method, a storage medium and a processor, which are used at least to solve the technical problems that model training methods in the related art cannot train models with only a small amount of labeled data and have poor active perception of and resistance to adversarial risks.
According to an aspect of an embodiment of the present invention, there is provided a task processing method, including: acquiring a task to be processed; analyzing the task to be processed by using a target model to obtain a processing result, wherein the target model is a student model obtained by training with first data and second data, the first data is labeled data used to train an initial model, the initial model is a teacher model, and the second data is pseudo-labeled data obtained by applying adversarial perturbation to the initial model based on the first data; and displaying the processing result.
According to another aspect of the embodiments of the present invention, there is also provided a model training method, including: training with first data to obtain a first model, wherein the first data is labeled data and the first model is a teacher model; applying adversarial perturbation to the first model based on the first data to obtain second data, wherein the second data is pseudo-labeled data; and training with the first data and the second data to obtain a second model, wherein the second model is a student model.
According to another aspect of the embodiments of the present invention, there is also provided a task processing method, including: receiving a task to be processed from a client; analyzing the task to be processed by using a target model to obtain a processing result, wherein the target model is a student model obtained by training with first data and second data, the first data is labeled data used to train an initial model, the initial model is a teacher model, and the second data is pseudo-labeled data obtained by applying adversarial perturbation to the initial model based on the first data; and feeding back the processing result to the client.
According to another aspect of the embodiments of the present invention, there is also provided a model training apparatus, including: a first training module, configured to train a first model with first data, wherein the first data is labeled data and the first model is a teacher model; an adversarial perturbation module, configured to apply adversarial perturbation to the first model based on the first data to obtain second data, wherein the second data is pseudo-labeled data; and a second training module, configured to train a second model with the first data and the second data, wherein the second model is a student model.
According to another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium, where the computer-readable storage medium includes a stored program, and when the program runs, the apparatus in which the computer-readable storage medium is located is controlled to execute any one of the above model training methods.
According to another aspect of the embodiments of the present invention, there is also provided a processor, configured to run a program, where the program executes any one of the above model training methods.
According to another aspect of the embodiments of the present invention, there is also provided an electronic apparatus, including: a processor; and a memory, connected to the processor, for providing the processor with instructions for the following processing steps: step 1, training with first data to obtain a first model, wherein the first data is labeled data and the first model is a teacher model; step 2, applying adversarial perturbation to the first model based on the first data to obtain second data, wherein the second data is pseudo-labeled data; and step 3, training with the first data and the second data to obtain a second model, wherein the second model is a student model.
In the embodiment of the invention, a first model is obtained by training with first data, wherein the first data is labeled data and the first model is a teacher model; adversarial perturbation is applied to the first model based on the first data to obtain second data, wherein the second data is pseudo-labeled data; and a second model is obtained by training with the first data and the second data, wherein the second model is a student model.
A teacher model is trained on a small amount of high-quality labeled data; adversarial samples are actively generated from the labeled data and fused into a large amount of unlabeled data; the teacher model then generates pseudo labels based on the unlabeled data and the actively generated adversarial data, and abnormal samples are detected and removed through out-of-distribution anomaly detection; a student model with larger memory capacity is then trained on the mixed pseudo-labeled data and, once it performs better than the teacher model, is taken as the new teacher model. This solves the problem of model training with only a small amount of labeled data while improving the model's active perception of and resistance to adversarial risks, thereby realizing self-enhancement of model robustness, effectively countering model performance degradation in adversarial environments, and solving the technical problems that model training methods in the related art cannot train models with only a small amount of labeled data and have poor active perception of and resistance to adversarial risks.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:
fig. 1 shows a hardware configuration block diagram of a computer terminal (or mobile device) for implementing a model training method;
FIG. 2 is a flow chart of a model training method according to an embodiment of the present invention;
FIG. 3 is a flow diagram of a method of task processing according to an embodiment of the invention;
FIG. 4 is a flow diagram of a method of task processing according to an embodiment of the invention;
FIG. 5 is a schematic diagram of a model training apparatus according to an embodiment of the present invention;
fig. 6 is a block diagram of a computer terminal according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some terms appearing in the description of the embodiments of the present invention are explained as follows:
Noisy learning: training a model on data whose given labels may be erroneous. The aim is to make the model robust to erroneous labels during training when the true labels of the training data are corrupted.
Adversarial text: adding a few tiny perturbations to the input of a deep learning model can easily change the model's prediction. Such a perturbation is called an adversarial perturbation, the perturbed input is called an adversarial sample, and the process of feeding adversarial samples to a model to mislead it is called an adversarial attack. If the perturbed model input is text, it is referred to as adversarial text.
Robustness: a transliteration of "robust", in the sense of strong and sturdy. "Robustness" means that a control system maintains certain performance characteristics under certain changes of (structural, size) parameters.
Example 1
In accordance with an embodiment of the present invention, there is provided a model training method embodiment, it being noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order different than here.
The method provided by embodiment 1 of the present invention may be implemented in a mobile terminal, a computer terminal, or a similar computing device. Fig. 1 shows a hardware configuration block diagram of a computer terminal (or mobile device) for implementing the model training method. As shown in fig. 1, the computer terminal 10 (or mobile device 10) may include one or more processors 102 (shown as 102a, 102b, …, 102n; the processors 102 may include, but are not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), a memory 104 for storing data, and a transmission module 106 for communication functions. In addition, the computer terminal may further include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the BUS), a network interface, a power source, and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and does not limit the structure of the electronic device. For example, the computer terminal 10 may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
It should be noted that the one or more processors 102 and/or other data processing circuitry described above may be referred to generally herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuit may be a single stand-alone processing module, or incorporated in whole or in part into any of the other elements in the computer terminal 10 (or mobile device). As referred to in the embodiments of the invention, the data processing circuit acts as a processor control (e.g. selection of a variable resistance termination path connected to the interface).
The memory 104 may be used to store software programs and modules of application software, such as program instructions/data storage devices corresponding to the model training method in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104, so as to implement the model training method described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10. In one example, the transmission device 106 includes a Network adapter (NIC) that can be connected to other Network devices through a base station to communicate with the internet. In one example, the transmission device 106 can be a Radio Frequency (RF) module, which is used to communicate with the internet in a wireless manner.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computer terminal 10 (or mobile device).
With continuous new breakthroughs in deep learning technology, artificial intelligence models based on deep neural networks have been widely applied in fields such as images, text, and speech, giving rise to a series of industrial applications such as face recognition, content security audit, and trademark infringement detection; deep learning has thus achieved great success. However, artificial intelligence systems developed in industrial settings often need to rely on large amounts of high-quality labeled data. In real scenarios, obtaining high-quality labeled data is time-consuming and labor-intensive, consuming substantial labor and material costs, and label quality is greatly affected by the annotator's learning ability and prior knowledge and by the difficulty of the labeling task. On the other hand, existing artificial intelligence systems are often built on clean, non-adversarial data; because of its intrinsic vulnerability to adversarial interference, the neural network model is usually sensitive to adversarial noise and cannot accurately handle adversarial samples meticulously constructed by malicious users in real adversarial scenarios. Therefore, in adversarial scenarios, the artificial intelligence system generally suffers from rapid performance degradation and cannot maintain robustness, so its accuracy, robustness, and security in actual deployment and application cannot be guaranteed.
In the above operating environment, in order to solve the above technical problem, the present invention provides a model training method as shown in fig. 2, where fig. 2 is a flowchart of a model training method according to an embodiment of the present invention, and as shown in fig. 2, the method includes:
step S202, training by utilizing first data to obtain a first model;
step S204, applying adversarial perturbation to the first model based on the first data to obtain second data;
and step S206, training by using the first data and the second data to obtain a second model, wherein the second model is a student model.
It should be noted that the model training method provided in the embodiment of the present invention may be understood as a robustness self-enhancing model training method and may be applied to platforms such as e-commerce platforms, online social platforms, and online social media, for example, including but not limited to important tasks such as originality protection and content security audit; it can effectively ensure the accuracy, robustness, and security of an artificial intelligence system in actual deployment, especially in strongly adversarial scenarios, and eliminate the risk of adversarial variation in real adversarial scenarios.
As an alternative embodiment, the first data is labeled data, and the first model is a teacher model.
As an alternative embodiment, the second data is pseudo-labeled data; the second data includes: pseudo-labeled data with class labels, and pseudo-labeled data with class confidences.
In an optional embodiment, the teacher model is trained with labeled data. For example, the teacher model can be trained with a small amount of high-quality labeled data; adversarial samples are actively generated from the labeled data and fused with a large amount of unlabeled data; the teacher model generates pseudo labels based on the unlabeled data and the actively generated adversarial data, and abnormal samples are detected and removed through out-of-distribution anomaly detection; a student model with larger memory capacity is then trained on the mixed pseudo-labeled data and, once it performs better than the teacher model, is taken as the new teacher model. This solves the problem of model training with only a small amount of labeled data while improving the model's active perception of and resistance to adversarial risks, thereby realizing self-enhancement of model robustness, effectively countering model performance degradation in adversarial environments, and solving the technical problems that model training methods in the related art cannot train models with only a small amount of labeled data and have poor active perception of and resistance to adversarial risks.
In order to alleviate performance degradation of the trained model in adversarial scenarios and eliminate potential safety hazards of AI systems in actual deployment and application, the embodiment of the invention provides a robustness self-enhancing training method, i.e., a model training method, that integrates noisy learning, out-of-distribution (OOD) anomaly detection, and friendly adversarial training. A teacher model is first trained on a small amount of high-quality labeled data; adversarial samples are actively generated from the labeled data and fused into a large amount of unlabeled data; the teacher model then generates pseudo labels based on the unlabeled data and the actively generated adversarial data, and abnormal samples are removed through out-of-distribution anomaly detection. Next, a student model with larger memory capacity is trained on the mixed pseudo-labeled data until the student model performs better than the teacher model, and the student model is taken as the new teacher model. Finally, the pseudo labels are regenerated with the teacher model, the generated adversarial perturbation is gradually increased, and the training is iterated until convergence.
Through alternating iterative training, the problem of model training with only a small amount of labeled data in the related art can be solved, the model's active perception of and resistance to adversarial risks can be improved, self-enhancement of model robustness is realized, and the problem of model performance degradation in adversarial environments is effectively addressed.
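The alternating iterative training loop described above can be summarized in the following sketch (Python). It assumes that the individual stages, i.e. teacher training, adversarial sample generation, pseudo labeling, out-of-distribution screening, student distillation, and evaluation, are supplied as callables; the function and parameter names below are illustrative placeholders rather than an implementation prescribed by this embodiment.
```python
from typing import Callable, Sequence

def robustness_self_training(
    labeled_data: Sequence,
    unlabeled_data: Sequence,
    train_teacher: Callable,          # trains a model on the labeled data (step S202)
    generate_adversarial: Callable,   # (model, labeled_data, epsilon) -> adversarial samples
    pseudo_label: Callable,           # (model, mixed_data) -> pseudo-labeled data
    ood_filter: Callable,             # removes out-of-distribution / toxic samples
    train_student: Callable,          # (labeled_data, screened_data) -> student model (step S206)
    evaluate: Callable,               # model -> scalar quality score
    init_epsilon: float = 0.01,
    epsilon_step: float = 0.01,
    max_rounds: int = 10,
):
    """Alternating iterative self-training loop sketched from the description above."""
    teacher = train_teacher(labeled_data)
    epsilon = init_epsilon
    for _ in range(max_rounds):
        # Actively generate adversarial samples and fuse them into the unlabeled pool.
        adversarial = generate_adversarial(teacher, labeled_data, epsilon)
        mixed = list(unlabeled_data) + list(adversarial)
        # The teacher produces pseudo labels; out-of-distribution screening removes toxic data.
        screened = ood_filter(pseudo_label(teacher, mixed))
        # Train a larger-capacity student on labeled data plus screened pseudo-labeled data.
        student = train_student(labeled_data, screened)
        # Stop once the student no longer outperforms the teacher in this round;
        # otherwise the student becomes the new teacher and the perturbation grows.
        if evaluate(student) <= evaluate(teacher):
            break
        teacher = student
        epsilon += epsilon_step
    return teacher
```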
As an alternative embodiment, applying adversarial perturbation to the first model based on the first data to obtain the second data includes:
step S302, generating third data based on the first data, wherein the third data is an adversarial sample;
step S304, fusing the third data with fourth data to obtain fifth data, wherein the fourth data is unlabeled data;
and step S306, labeling the fifth data by using the first model to obtain the second data.
Optionally, in the embodiment of the present invention, an adversarial attack algorithm may be used to simulate attackers or malicious users in real scenarios and actively generate the third data, that is, the adversarial samples.
Optionally, in the embodiment of the present invention, the adversarial samples and the intermediate samples produced during the adversarial attack may be mixed into the unlabeled data.
As an alternative embodiment, based on the first data, that is, the high-quality labeled data, the third data may be actively generated with an adversarial attack algorithm and fused with the fourth data to obtain the fifth data; that is, the adversarial samples are fused into the unlabeled data to obtain mixed data, and the teacher model is then used to generate algorithmic pseudo labels on the mixed data to obtain the second data, that is, the pseudo-labeled training data.
Optionally, in the above optional embodiment, the teacher model is used to label the mixed data and generate algorithmic pseudo labels, thereby obtaining a large amount of pseudo-labeled data with hard labels (class labels) and soft labels (class confidences).
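A minimal PyTorch sketch of steps S302 to S306 follows. FGSM is used only as one illustrative attack algorithm, since the embodiment does not prescribe a specific adversarial attack; the toy linear teacher and random tensors stand in for a real trained teacher model and real data.
```python
import torch
import torch.nn.functional as F

def fgsm_adversarial(teacher: torch.nn.Module, x: torch.Tensor, y: torch.Tensor,
                     epsilon: float = 0.01) -> torch.Tensor:
    """Step S302: actively generate adversarial samples from labeled data (FGSM-style)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(teacher(x_adv), y)
    loss.backward()
    # Perturb the input along the sign of the input gradient.
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

@torch.no_grad()
def pseudo_label(teacher: torch.nn.Module, x_mixed: torch.Tensor):
    """Step S306: the teacher labels the mixed data with hard labels and class confidences."""
    probs = F.softmax(teacher(x_mixed), dim=-1)   # soft labels (class confidences)
    return probs.argmax(dim=-1), probs            # hard labels, soft labels

if __name__ == "__main__":
    teacher = torch.nn.Linear(16, 3)              # stand-in for a trained teacher model
    x_labeled = torch.randn(8, 16)
    y_labeled = torch.randint(0, 3, (8,))
    x_unlabeled = torch.randn(32, 16)

    x_adv = fgsm_adversarial(teacher, x_labeled, y_labeled)    # step S302
    x_mixed = torch.cat([x_unlabeled, x_adv], dim=0)           # step S304: fusion
    hard_labels, soft_labels = pseudo_label(teacher, x_mixed)  # step S306
    print(hard_labels.shape, soft_labels.shape)
```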
As an optional embodiment, the model training method further includes: performing out-of-distribution anomaly detection on the second data to obtain a screening result.
In the above optional embodiment, out-of-distribution outliers can be detected based on the distribution difference between normal data and abnormal data, so as to eliminate toxic data that is harmful to training; a student model with stronger learning ability can then be trained on the screened pseudo-labeled data together with a small amount of high-quality labeled data. The above steps are repeated for iterative self-training, and the adversarial noise added by the adversarial attack is gradually increased in each iteration, until the performance of the student model can no longer exceed that of the teacher model in the current iteration, at which point the training process ends.
As an alternative embodiment, performing out-of-distribution anomaly detection on the second data to obtain the screening result includes:
step S402, acquiring a target threshold for out-of-distribution anomaly detection;
step S404, removing out-of-distribution outliers from the second data by using the target threshold to obtain the screening result.
As an alternative embodiment, training the second model with the first data and the second data includes: training with the first data and the screening result to obtain the second model.
Optionally, the detection of out-of-distribution outliers may be implemented as follows: first, based on the maximum softmax probability output by a large-scale pre-trained model, statistically analyze the distributions of the softmax probabilities of out-of-distribution samples and in-distribution samples, and obtain the most reasonable threshold for distinguishing them, namely the target threshold, by maximizing the distribution difference between the two; second, for a given pseudo-labeled sample, judge through the target threshold whether it is an out-of-distribution outlier; if not, the pseudo-labeled sample is retained as a valid training sample, otherwise it is regarded as toxic data and must be removed from the candidate data.
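The screening step can be illustrated with the maximum softmax probability score as follows. The threshold-selection criterion used here (a simple sweep that best separates the two score populations) is an illustrative assumption, since the embodiment only requires a threshold obtained by maximizing the distribution difference; all names and data are hypothetical.
```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def max_softmax_score(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Maximum softmax probability of each sample under the (pre-trained) model."""
    return F.softmax(model(x), dim=-1).max(dim=-1).values

def choose_threshold(in_scores: torch.Tensor, out_scores: torch.Tensor) -> float:
    """Pick the threshold that best separates in- and out-of-distribution score populations."""
    best_t, best_sep = 0.5, -1.0
    for t in torch.linspace(0.0, 1.0, steps=101):
        # Fraction of in-distribution samples kept plus fraction of OOD samples rejected.
        sep = (in_scores >= t).float().mean() + (out_scores < t).float().mean()
        if sep > best_sep:
            best_sep, best_t = sep.item(), t.item()
    return best_t

def screen_pseudo_labeled(scores: torch.Tensor, samples: list, threshold: float) -> list:
    """Step S404: keep pseudo-labeled samples whose score is above the target threshold."""
    keep = scores >= threshold
    return [s for s, k in zip(samples, keep.tolist()) if k]
```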
In the student model training stage, the scheme of the invention takes a large-capacity network structure as the student model and uses a model distillation algorithm to train the student model by simultaneously minimizing the soft loss and the hard loss, based on a small amount of high-quality labeled data and the pseudo-labeled data generated by the teacher model. In the iterative self-training stage, the steps of training the student model on a small amount of high-quality labeled data and of generating pseudo-labeled data with the teacher model are repeated, again minimizing the soft loss and the hard loss simultaneously; the adversarial noise added by the adversarial attack is gradually increased in each iteration so that the student model's resistance to adversarial interference is gradually strengthened, until the performance of the student model can no longer exceed that of the teacher model in the current iteration, at which point the model training process ends.
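A minimal sketch of the combined distillation objective follows. The temperature and weighting hyperparameters are illustrative assumptions; the embodiment only requires that the soft loss (against the teacher's class confidences) and the hard loss (against the class labels) be minimized simultaneously.
```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      hard_labels: torch.Tensor,
                      teacher_probs: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Weighted sum of the hard loss (class labels) and the soft loss (class confidences)."""
    hard_loss = F.cross_entropy(student_logits, hard_labels)
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        teacher_probs,
        reduction="batchmean",
    ) * (temperature ** 2)
    return alpha * hard_loss + (1.0 - alpha) * soft_loss

if __name__ == "__main__":
    # One illustrative optimization step of the student on a pseudo-labeled batch.
    student = torch.nn.Linear(16, 3)
    optimizer = torch.optim.SGD(student.parameters(), lr=0.1)
    x = torch.randn(8, 16)
    hard = torch.randint(0, 3, (8,))
    soft = F.softmax(torch.randn(8, 3), dim=-1)
    loss = distillation_loss(student(x), hard, soft)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```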
As an optional embodiment, the embodiment of the present invention may also perform model training by manually labeling massive data and periodically updating the model.
With the model training method provided by the embodiment of the invention, the AI model training problem in scenarios with little labeled data is solved through noisy learning, the dependence of the training process on a large amount of labeled data can be effectively reduced, and the manual labeling cost can therefore be reduced. Secondly, the embodiment of the invention is the first to introduce adversarial generation techniques into a noisy-learning framework: real attackers and malicious users are simulated with an adversarial attack algorithm, and the generated adversarial samples are fused into the unlabeled data to participate in model training so as to improve the model's active defense capability against adversarial noise; through alternating iterative self-training, the robustness of the model is gradually strengthened, which can effectively solve the problem that the performance of artificial intelligence systems decays rapidly in adversarial environments.
According to an embodiment of the present invention, the present invention provides a task processing method as shown in fig. 3, and fig. 3 is a flowchart of a task processing method according to an embodiment of the present invention, and as shown in fig. 3, the method includes:
step S502, acquiring a task to be processed;
step S504, analyzing the task to be processed by using a target model to obtain a processing result, wherein the target model is a student model obtained by training with first data and second data, the first data is labeled data used to train an initial model, the initial model is a teacher model, and the second data is pseudo-labeled data obtained by applying adversarial perturbation to the initial model based on the first data;
step S506, the processing result is displayed.
It should be noted that the task processing method provided by the embodiment of the present invention may be applied to processing tasks on platforms such as e-commerce platforms, online social platforms, and online social media, for example, including but not limited to important tasks such as originality protection and content security audit; it can effectively ensure the accuracy, robustness, and security of an artificial intelligence system in actual deployment, especially in strongly adversarial scenarios, and eliminate the risk of adversarial variation in real adversarial scenarios.
It should be noted that an execution subject of the task processing method provided in the embodiment of the present invention may be a SaaS client, for example, a client running on a platform such as an e-commerce platform, an online social platform, and an online social media, or an e-commerce client, an online social client, and an online social media client.
As an alternative embodiment, the first data is labeled data, and the second data is pseudo-labeled data; the second data includes: pseudo-labeled data with class labels, and pseudo-labeled data with class confidences.
In an optional embodiment, the teacher model is trained with labeled data. For example, the teacher model can be trained with a small amount of high-quality labeled data; adversarial samples are actively generated from the labeled data and fused with a large amount of unlabeled data; the teacher model generates pseudo labels based on the unlabeled data and the actively generated adversarial data, and abnormal samples are detected and removed through out-of-distribution anomaly detection. A student model with larger memory capacity is then trained on the mixed pseudo-labeled data, and the task to be processed is analyzed with the student model to obtain a processing result. This solves the problem of model training with only a small amount of labeled data while improving the model's active perception of and resistance to adversarial risks, thereby realizing self-enhancement of model robustness, effectively countering model performance degradation in adversarial environments, and solving the technical problems that model training methods in the related art cannot train models with only a small amount of labeled data and have poor active perception of and resistance to adversarial risks.
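For illustration only, the task processing flow (acquire or receive a task, analyze it with the target model, then display or feed back the result) can be wrapped as in the following sketch; the preprocess and postprocess helpers are hypothetical, since the embodiment does not fix a concrete task type.
```python
import torch

class TaskProcessor:
    """Minimal sketch of the task processing flow: acquire or receive a task,
    analyze it with the target (student) model, then display or feed back the result.
    The preprocess/postprocess callables are hypothetical, task-specific helpers."""

    def __init__(self, target_model: torch.nn.Module, preprocess, postprocess):
        self.model = target_model.eval()
        self.preprocess = preprocess
        self.postprocess = postprocess

    @torch.no_grad()
    def process(self, task):
        features = self.preprocess(task)   # turn the task to be processed into model input
        outputs = self.model(features)     # analyze the task with the target model
        return self.postprocess(outputs)   # processing result to display or feed back
```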
In order to alleviate performance degradation of the trained model in adversarial scenarios and eliminate potential safety hazards of AI systems in actual deployment and application, the embodiment of the invention provides a robustness self-enhancing training method, i.e., a model training method, that integrates noisy learning, out-of-distribution (OOD) anomaly detection, and friendly adversarial training. A teacher model is first trained on a small amount of high-quality labeled data; adversarial samples are actively generated from the labeled data and fused into a large amount of unlabeled data; the teacher model then generates pseudo labels based on the unlabeled data and the actively generated adversarial data, and abnormal samples are removed through out-of-distribution anomaly detection. Next, a student model with larger memory capacity is trained on the mixed pseudo-labeled data until the student model performs better than the teacher model, and the student model is taken as the new teacher model. Finally, the pseudo labels are regenerated with the teacher model, the generated adversarial perturbation is gradually increased, and the training is iterated until convergence.
Through alternating iterative training, the problem of model training with only a small amount of labeled data in the related art can be solved, the model's active perception of and resistance to adversarial risks can be improved, self-enhancement of model robustness is realized, and the problem of model performance degradation in adversarial environments is effectively addressed.
According to an embodiment of the present invention, the present invention provides another task processing method as shown in fig. 4, and fig. 4 is a flowchart of a task processing method according to an embodiment of the present invention, and as shown in fig. 4, the method includes:
step S602, receiving a task to be processed from a client;
step S604, analyzing the task to be processed by using a target model to obtain a processing result, wherein the target model is a student model obtained by training with first data and second data, the first data is labeled data used to train an initial model, the initial model is a teacher model, and the second data is pseudo-labeled data obtained by applying adversarial perturbation to the initial model based on the first data;
step S606, feeding back the processing result to the client.
It should be noted that the task processing method provided by the embodiment of the present invention may be applied to processing tasks on platforms such as e-commerce platforms, online social platforms, and online social media, for example, including but not limited to important tasks such as originality protection and content security audit; it can effectively ensure the accuracy, robustness, and security of an artificial intelligence system in actual deployment, especially in strongly adversarial scenarios, and eliminate the risk of adversarial variation in real adversarial scenarios.
It should be noted that an execution subject of the task processing method provided in the embodiment of the present invention may be a SaaS server, and is in communication connection with a SaaS client, for example, the method is applied to platforms such as an e-commerce platform, an online social platform, and an online social media.
As an alternative embodiment, the first data is labeled data, and the second data is pseudo-labeled data; the second data includes: pseudo-labeled data with class labels, and pseudo-labeled data with class confidences.
In an optional embodiment, the teacher model is trained with labeled data. For example, the teacher model can be trained with a small amount of high-quality labeled data; adversarial samples are actively generated from the labeled data and fused with a large amount of unlabeled data; the teacher model generates pseudo labels based on the unlabeled data and the actively generated adversarial data, and abnormal samples are detected and removed through out-of-distribution anomaly detection. A student model with larger memory capacity is then trained on the mixed pseudo-labeled data, and the task to be processed is analyzed with the student model to obtain a processing result. This solves the problem of model training with only a small amount of labeled data while improving the model's active perception of and resistance to adversarial risks, thereby realizing self-enhancement of model robustness, effectively countering model performance degradation in adversarial environments, and solving the technical problems that model training methods in the related art cannot train models with only a small amount of labeled data and have poor active perception of and resistance to adversarial risks.
In order to alleviate performance degradation of the trained model in adversarial scenarios and eliminate potential safety hazards of AI systems in actual deployment and application, the embodiment of the invention provides a robustness self-enhancing training method, i.e., a model training method, that integrates noisy learning, out-of-distribution (OOD) anomaly detection, and friendly adversarial training. A teacher model is first trained on a small amount of high-quality labeled data; adversarial samples are actively generated from the labeled data and fused into a large amount of unlabeled data; the teacher model then generates pseudo labels based on the unlabeled data and the actively generated adversarial data, and abnormal samples are removed through out-of-distribution anomaly detection. Next, a student model with larger memory capacity is trained on the mixed pseudo-labeled data until the student model performs better than the teacher model, and the student model is taken as the new teacher model. Finally, the pseudo labels are regenerated with the teacher model, the generated adversarial perturbation is gradually increased, and the training is iterated until convergence.
Through alternating iterative training, the problem of model training with only a small amount of labeled data in the related art can be solved, the model's active perception of and resistance to adversarial risks can be improved, self-enhancement of model robustness is realized, and the problem of model performance degradation in adversarial environments is effectively addressed.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a computer-readable storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
Example 2
According to an embodiment of the present invention, there is further provided an embodiment of an apparatus for implementing the model training method. Fig. 5 is a schematic structural diagram of a model training apparatus according to an embodiment of the present invention; as shown in fig. 5, the apparatus includes: a first training module 500, an adversarial perturbation module 502, and a second training module 504, wherein:
a first training module 500, configured to train a first model with first data, where the first data is labeled data and the first model is a teacher model; an adversarial perturbation module 502, configured to apply adversarial perturbation to the first model based on the first data to obtain second data, where the second data is pseudo-labeled data; and a second training module 504, configured to train a second model with the first data and the second data, where the second model is a student model.
It should be noted here that the first training module 500, the adversarial perturbation module 502, and the second training module 504 correspond to steps S202 to S206 in embodiment 1; the three modules are the same as the corresponding steps in terms of implementation examples and application scenarios, but are not limited to the disclosure in embodiment 1. It should also be noted that the above modules may run in the computer terminal 10 provided in embodiment 1 as a part of the apparatus.
It should be noted that, reference may be made to the relevant description in embodiment 1 for a preferred implementation of this embodiment, and details are not described here again.
Example 3
According to an embodiment of the present invention, there is further provided an embodiment of an electronic apparatus, which may be any one of computing devices in a computing device group. The electronic device includes: a processor and a memory, wherein:
a processor; and a memory, connected to the processor, for providing the processor with instructions for the following processing steps: step 1, training with first data to obtain a first model, wherein the first data is labeled data and the first model is a teacher model; step 2, applying adversarial perturbation to the first model based on the first data to obtain second data, wherein the second data is pseudo-labeled data; and step 3, training with the first data and the second data to obtain a second model, wherein the second model is a student model.
A teacher model is trained on a small amount of high-quality labeled data; adversarial samples are actively generated from the labeled data and fused into a large amount of unlabeled data; the teacher model then generates pseudo labels based on the unlabeled data and the actively generated adversarial data, and abnormal samples are detected and removed through out-of-distribution anomaly detection; a student model with larger memory capacity is then trained on the mixed pseudo-labeled data and, once it performs better than the teacher model, is taken as the new teacher model. This solves the problem of model training with only a small amount of labeled data while improving the model's active perception of and resistance to adversarial risks, thereby realizing self-enhancement of model robustness, effectively countering model performance degradation in adversarial environments, and solving the technical problems that model training methods in the related art cannot train models with only a small amount of labeled data and have poor active perception of and resistance to adversarial risks.
It should be noted that, reference may be made to the relevant description in embodiment 1 for a preferred implementation of this embodiment, and details are not described here again.
Example 4
According to an embodiment of the present invention, there may be provided an embodiment of a computer terminal, which may be any one computer terminal device in a computer terminal group. Optionally, in this embodiment, the computer terminal may also be replaced with a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one network device of a plurality of network devices of a computer network.
In this embodiment, the computer terminal may execute the program code of the following steps in the model training method: training with first data to obtain a first model, wherein the first data is labeled data and the first model is a teacher model; applying adversarial perturbation to the first model based on the first data to obtain second data, wherein the second data is pseudo-labeled data; and training with the first data and the second data to obtain a second model, wherein the second model is a student model.
Alternatively, fig. 6 is a block diagram of a computer terminal according to an embodiment of the present invention. As shown in fig. 6, the computer terminal may include: one or more processors 122 (only one of which is shown), memory 124, and peripherals interface 126.
The memory may be configured to store software programs and modules, such as program instructions/modules corresponding to the model training method and apparatus in the embodiments of the present invention, and the processor executes various functional applications and data processing by running the software programs and modules stored in the memory, so as to implement the model training method. The memory may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, and these remote memories may be connected to the computer terminal through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor can call the information and application programs stored in the memory through the transmission device to execute the following steps: training with first data to obtain a first model, wherein the first data is labeled data and the first model is a teacher model; applying adversarial perturbation to the first model based on the first data to obtain second data, wherein the second data is pseudo-labeled data; and training with the first data and the second data to obtain a second model, wherein the second model is a student model.
Optionally, the processor may further execute the program code of the following steps: generating third data based on the first data, wherein the third data is an adversarial sample; fusing the third data with fourth data to obtain fifth data, wherein the fourth data is unlabeled data; and labeling the fifth data by using the first model to obtain the second data.
Optionally, the processor may further execute the program code of the following steps: performing out-of-distribution anomaly detection on the second data to obtain a screening result.
Optionally, the processor may further execute the program code of the following steps: acquiring a target threshold for out-of-distribution anomaly detection; and removing out-of-distribution outliers from the second data by using the target threshold to obtain the screening result.
Optionally, the processor may further execute the program code of the following steps: and training by adopting the first data and the screening result to obtain a second model.
Optionally, the processor may further execute the program code of the following steps: acquiring a task to be processed; analyzing the task to be processed by using a target model to obtain a processing result, wherein the target model is a student model obtained by training with first data and second data, the first data is labeled data used to train an initial model, the initial model is a teacher model, and the second data is pseudo-labeled data obtained by applying adversarial perturbation to the initial model based on the first data; and displaying the processing result.
Optionally, the processor may further execute the program code of the following steps: receiving a task to be processed from a client; analyzing the task to be processed by using a target model to obtain a processing result, wherein the target model is a student model obtained by training with first data and second data, the first data is labeled data used to train an initial model, the initial model is a teacher model, and the second data is pseudo-labeled data obtained by applying adversarial perturbation to the initial model based on the first data; and feeding back the processing result to the client.
The embodiment of the invention provides a scheme for model training. A teacher model is trained on a small amount of high-quality labeled data; adversarial samples are actively generated from the labeled data and fused into a large amount of unlabeled data; the teacher model then generates pseudo labels based on the unlabeled data and the actively generated adversarial data, and abnormal samples are detected and removed through out-of-distribution anomaly detection; a student model with larger memory capacity is then trained on the mixed pseudo-labeled data and, once it performs better than the teacher model, is taken as the new teacher model. This solves the problem of model training with only a small amount of labeled data while improving the model's active perception of and resistance to adversarial risks, thereby realizing self-enhancement of model robustness, effectively countering model performance degradation in adversarial environments, and solving the technical problems that model training methods in the related art cannot train models with only a small amount of labeled data and have poor active perception of and resistance to adversarial risks.
It can be understood by those skilled in the art that the structure shown in fig. 6 is only an illustration, and the computer terminal may also be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 6 does not limit the structure of the above electronic device. For example, the computer terminal may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in FIG. 6, or have a different configuration than shown in FIG. 6.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the computer-readable storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
Example 5
Embodiments of a computer-readable storage medium are also provided according to embodiments of the present invention. Optionally, in this embodiment, the computer-readable storage medium may be used to store the program code executed by the model training method provided in embodiment 1.
Optionally, in this embodiment, the computer-readable storage medium may be located in any one of a group of computer terminals in a computer network, or in any one of a group of mobile terminals.
Optionally, in this embodiment, the computer readable storage medium is configured to store program code for performing the following steps: training by utilizing first data to obtain a first model, wherein the first data is labeled data, and the first model is a teacher model; performing anti-disturbance on the first model based on the first data to obtain second data, wherein the second data is data with a pseudo label; and training by adopting the first data and the second data to obtain a second model, wherein the second model is a student model.
Optionally, in this embodiment, the computer readable storage medium is configured to store program code for performing the following steps: generating third data based on the first data, wherein the third data is a confrontation sample; performing fusion processing on the third data and the fourth data to obtain fifth data, wherein the fourth data is label-free data; and performing labeling processing on the fifth data by using the first model to obtain the second data.
Optionally, in this embodiment, the computer readable storage medium is configured to store program code for performing the following step: performing out-of-distribution anomaly detection on the second data to obtain a screening result.
Optionally, in this embodiment, the computer readable storage medium is configured to store program code for performing the following steps: acquiring a target threshold corresponding to the out-of-distribution anomaly detection; and removing out-of-distribution abnormal points from the second data by using the target threshold to obtain the screening result.
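A minimal sketch of this threshold-based screening is given below, assuming each pseudo-labeled sample already carries a confidence score (for example, the maximum softmax probability) that is used as the in-distribution score; the choice of score and the default threshold value are assumptions rather than requirements of the embodiment.

import numpy as np

def screen_pseudo_labels(samples, pseudo_labels, confidences, target_threshold=0.9):
    # Samples whose confidence falls below the target threshold are treated as
    # out-of-distribution abnormal points and removed from the second data.
    samples = np.asarray(samples)
    pseudo_labels = np.asarray(pseudo_labels)
    confidences = np.asarray(confidences)
    keep = confidences >= target_threshold
    return samples[keep], pseudo_labels[keep]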
Optionally, in this embodiment, the computer readable storage medium is configured to store program code for performing the following step: training by adopting the first data and the screening result to obtain the second model.
Optionally, in this embodiment, the computer readable storage medium is configured to store program code for performing the following steps: acquiring a task to be processed; analyzing the task to be processed by using a target model to obtain a processing result, wherein the target model is a student model obtained by training with first data and second data, the first data is labeled data, the first data is used for training to obtain an initial model, the initial model is a teacher model, and the second data is pseudo-labeled data obtained by performing anti-disturbance on the initial model based on the first data; and displaying the processing result.
Optionally, in this embodiment, the computer readable storage medium is configured to store program code for performing the following steps: receiving a task to be processed from a client; analyzing the task to be processed by using a target model to obtain a processing result, wherein the target model is a student model obtained by training with first data and second data, the first data is labeled data, the first data is used for training to obtain an initial model, the initial model is a teacher model, and the second data is pseudo-labeled data obtained by performing anti-disturbance on the initial model based on the first data; and feeding back the processing result to the client.
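For illustration, a schematic server-side handler for this receive, analyze, and feed-back flow is sketched below; the request and response structures and the preprocess/postprocess helpers are hypothetical, and only the overall pattern follows the description.

def handle_task(request, target_model):
    # Receive the task to be processed from the client request.
    task = request["task"]

    # Analyze the task with the trained student (target) model.
    features = preprocess(task)           # hypothetical feature extraction
    prediction = target_model.predict(features)
    result = postprocess(prediction)      # hypothetical result formatting

    # Feed the processing result back to the client.
    return {"status": "ok", "result": result}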
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present invention, it should be understood that the disclosed technical contents can be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a computer-readable storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned computer-readable storage media comprise: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and improvements can be made without departing from the principle of the present invention, and these modifications and improvements should also be regarded as falling within the protection scope of the present invention.

Claims (12)

1. A task processing method, comprising:
acquiring a task to be processed;
analyzing the task to be processed by using a target model to obtain a processing result, wherein the target model is a student model obtained by training with first data and second data, the first data is labeled data, the first data is used for training to obtain an initial model, the initial model is a teacher model, and the second data is pseudo-labeled data obtained by performing anti-disturbance on the initial model based on the first data;
and displaying the processing result.
2. A method of model training, comprising:
training by using first data to obtain a first model, wherein the first data is labeled data, and the first model is a teacher model;
performing anti-disturbance on the first model based on the first data to obtain second data, wherein the second data is data with a pseudo label;
and training by adopting the first data and the second data to obtain a second model, wherein the second model is a student model.
3. The model training method of claim 2, wherein performing the anti-disturbance on the first model based on the first data to obtain the second data comprises:
generating third data based on the first data, wherein the third data is an adversarial sample;
performing fusion processing on the third data and the fourth data to obtain fifth data, wherein the fourth data is label-free data;
and labeling the fifth data by adopting the first model to obtain the second data.
4. The model training method of claim 2, wherein the second data comprises:
pseudo-labeled data with class labels and pseudo-labeled data with class confidences.
5. The model training method of claim 2, further comprising:
and performing out-of-distribution anomaly detection on the second data to obtain a screening result.
6. The model training method of claim 5, wherein the performing of the out-of-distribution anomaly detection on the second data to obtain the screening result comprises:
acquiring a target threshold corresponding to the out-of-distribution anomaly detection;
and removing out-of-distribution abnormal points from the second data by using the target threshold to obtain the screening result.
7. The model training method of claim 5, wherein training with the first data and the second data to obtain a second model comprises:
and training by adopting the first data and the screening result to obtain the second model.
8. A task processing method, comprising:
receiving a task to be processed from a client;
analyzing the task to be processed by using a target model to obtain a processing result, wherein the target model is a student model obtained by training with first data and second data, the first data is labeled data, the first data is used for training to obtain an initial model, the initial model is a teacher model, and the second data is pseudo-labeled data obtained by performing anti-disturbance on the initial model based on the first data;
and feeding back the processing result to the client.
9. A model training apparatus, comprising:
the first training module is used for training by utilizing first data to obtain a first model, wherein the first data is labeled data, and the first model is a teacher model;
the anti-disturbance module is used for performing anti-disturbance on the first model based on the first data to obtain second data, wherein the second data are data with pseudo labels;
and the second training module is used for training by adopting the first data and the second data to obtain a second model, wherein the second model is a student model.
10. A computer-readable storage medium, comprising a stored program, wherein the program, when executed, controls an apparatus on which the computer-readable storage medium resides to perform the method of any one of claims 1 to 8.
11. A processor, characterized in that the processor is configured to run a program, wherein the program, when running, performs the method of any one of claims 1 to 8.
12. An electronic device, comprising:
a processor; and
a memory coupled to the processor for providing the processor with instructions for processing the following steps:
step 1, training by utilizing first data to obtain a first model, wherein the first data is labeled data, and the first model is a teacher model;
step 2, performing anti-disturbance on the first model based on the first data to obtain second data, wherein the second data is data with a pseudo label;
and 3, training by adopting the first data and the second data to obtain a second model, wherein the second model is a student model.
CN202111267999.2A 2021-10-29 2021-10-29 Model training method and device, task processing method, storage medium and processor Pending CN114169392A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111267999.2A CN114169392A (en) 2021-10-29 2021-10-29 Model training method and device, task processing method, storage medium and processor

Publications (1)

Publication Number Publication Date
CN114169392A true CN114169392A (en) 2022-03-11

Family

ID=80477815

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111267999.2A Pending CN114169392A (en) 2021-10-29 2021-10-29 Model training method and device, task processing method, storage medium and processor

Country Status (1)

Country Link
CN (1) CN114169392A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966701A (en) * 2019-12-12 2021-06-15 北京沃东天骏信息技术有限公司 Method and device for classifying objects
CN111475797A (en) * 2020-03-26 2020-07-31 深圳先进技术研究院 Method, device and equipment for generating confrontation image and readable storage medium
CN111461226A (en) * 2020-04-01 2020-07-28 深圳前海微众银行股份有限公司 Countermeasure sample generation method, device, terminal and readable storage medium
CN112199543A (en) * 2020-10-14 2021-01-08 哈尔滨工程大学 Confrontation sample generation method based on image retrieval model
CN112949786A (en) * 2021-05-17 2021-06-11 腾讯科技(深圳)有限公司 Data classification identification method, device, equipment and readable storage medium

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114708286A (en) * 2022-06-06 2022-07-05 珠海横琴圣澳云智科技有限公司 Cell instance segmentation method and device based on pseudo-label dynamic update
CN114708286B (en) * 2022-06-06 2022-08-26 珠海横琴圣澳云智科技有限公司 Cell instance segmentation method and device based on pseudo-label dynamic update
CN115099988A (en) * 2022-06-28 2022-09-23 腾讯科技(深圳)有限公司 Model training method, data processing method, device and computer medium
CN115293276A (en) * 2022-08-15 2022-11-04 国网智能电网研究院有限公司 Power grid equipment defect identification method and device, electronic equipment and storage medium
CN116030323A (en) * 2023-03-27 2023-04-28 阿里巴巴(中国)有限公司 Image processing method and device
CN116030323B (en) * 2023-03-27 2023-08-29 阿里巴巴(中国)有限公司 Image processing method and device
CN116935170A (en) * 2023-09-14 2023-10-24 腾讯科技(深圳)有限公司 Processing method and device of video processing model, computer equipment and storage medium
CN116935170B (en) * 2023-09-14 2024-05-28 腾讯科技(深圳)有限公司 Processing method and device of video processing model, computer equipment and storage medium
CN117151200A (en) * 2023-10-27 2023-12-01 成都合能创越软件有限公司 Method and system for improving YOLO detection model precision based on semi-supervised training


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination