CN112734039B - Virtual adversarial training method, device and equipment for deep neural network - Google Patents

Virtual adversarial training method, device and equipment for deep neural network

Info

Publication number
CN112734039B
CN112734039B CN202110352167.4A CN202110352167A CN112734039B CN 112734039 B CN112734039 B CN 112734039B CN 202110352167 A CN202110352167 A CN 202110352167A CN 112734039 B CN112734039 B CN 112734039B
Authority
CN
China
Prior art keywords
seed
vector
deep neural
value
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110352167.4A
Other languages
Chinese (zh)
Other versions
CN112734039A (en
Inventor
王滨
王星
张峰
万里
周少鹏
钱亚冠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202110352167.4A priority Critical patent/CN112734039B/en
Publication of CN112734039A publication Critical patent/CN112734039A/en
Application granted granted Critical
Publication of CN112734039B publication Critical patent/CN112734039B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a virtual adversarial training method, device and equipment for a deep neural network, wherein the method comprises the following steps: inputting a plurality of natural samples into a first sub-network of a deep neural network model to obtain a plurality of initial feature vectors corresponding to the natural samples; selecting seed feature vectors and non-seed feature vectors from the plurality of initial feature vectors; for each seed feature vector, generating a virtual adversarial feature vector corresponding to the seed feature vector based on the seed feature vector and a perturbation vector corresponding to the seed feature vector; inputting all the virtual adversarial feature vectors and all the non-seed feature vectors into a second sub-network of the deep neural network model to obtain a plurality of target feature vectors corresponding to the plurality of natural samples; and updating parameters of the deep neural network model based on the plurality of target feature vectors. According to this technical solution, the anti-interference capability of the deep neural network model against attack samples is improved, and the reliability of the deep neural network model is improved.

Description

Virtual adversarial training method, device and equipment for deep neural network
Technical Field
The application relates to the technical field of artificial intelligence security, and in particular to a virtual adversarial training method, device and equipment for a deep neural network.
Background
Deep learning is a new research direction in the field of machine learning; it was introduced to bring machine learning closer to its original goal, namely artificial intelligence. Deep learning learns the intrinsic laws and representation levels of sample data, and the information obtained during learning is very helpful for interpreting data such as text, images, and sound. Its ultimate goal is to give machines analytical learning capability, enabling them to recognize data such as text, images, and sound. Deep learning is a complex machine learning algorithm that has achieved remarkable results in fields such as image recognition, speech recognition, and natural language processing, and is therefore widely applied.
When implementing functions such as image recognition, speech recognition, and natural language processing using deep learning, it is necessary to first train a deep neural network model, for example a CNN (Convolutional Neural Network) model or a DNN (Deep Neural Network) model, and then implement functions such as image recognition, speech recognition, and natural language processing based on the trained deep neural network model.
However, in the deep learning field, deep neural network models, including CNN models and DNN models, are highly vulnerable to attack samples. For example, if an attacker adds a small perturbation to an input sample to form an attack sample, then after the attack sample is input to the deep neural network model, the model outputs a wrong conclusion with high confidence; that is, the output of the deep neural network model is wrong, which reduces the reliability of the deep neural network model.
Disclosure of Invention
The application provides a virtual adversarial training method for a deep neural network, which comprises the following steps:
inputting a plurality of natural samples into a first sub-network of a deep neural network model to obtain a plurality of initial feature vectors corresponding to the natural samples; wherein the deep neural network model comprises a first sub-network and a second sub-network, the second sub-network comprising the last network layer in the deep neural network model, and the first sub-network comprising the remaining network layers except the last network layer;
selecting seed feature vectors and non-seed feature vectors from the plurality of initial feature vectors;
for each seed feature vector, generating a virtual adversarial feature vector corresponding to the seed feature vector based on the seed feature vector and a perturbation vector corresponding to the seed feature vector;
inputting all the virtual adversarial feature vectors and all the non-seed feature vectors into the second sub-network of the deep neural network model to obtain a plurality of target feature vectors corresponding to the plurality of natural samples;
updating parameters of the deep neural network model based on the plurality of target feature vectors.
Illustratively, each initial feature vector of the plurality of initial feature vectors comprises a plurality of feature values. Selecting seed feature vectors and non-seed feature vectors from the plurality of initial feature vectors comprises: if the difference between the maximum feature value and the second-largest feature value in an initial feature vector is smaller than a virtual adversarial training threshold, selecting that initial feature vector as a seed feature vector; and if the difference between the maximum feature value and the second-largest feature value in an initial feature vector is not smaller than the virtual adversarial training threshold, selecting that initial feature vector as a non-seed feature vector.
Illustratively, the seed feature vector includes C feature values, and the perturbation vector corresponding to the seed feature vector includes C perturbation values, where the C perturbation values are in one-to-one correspondence with the C feature values.
The perturbation vector corresponding to the seed feature vector is determined as follows: determining the perturbation value corresponding to the maximum feature value in the perturbation vector based on the maximum feature value and the second-largest feature value in the seed feature vector; determining the perturbation value corresponding to the second-largest feature value in the perturbation vector based on the maximum feature value and the second-largest feature value in the seed feature vector; and for the remaining feature values other than the maximum feature value and the second-largest feature value in the seed feature vector, setting the corresponding perturbation values in the perturbation vector to 0.
Illustratively, the perturbation value corresponding to the maximum feature value and the perturbation value corresponding to the second-largest feature value are opposite numbers, and the perturbation value corresponding to the maximum feature value is negative.
Determining the perturbation value corresponding to the maximum feature value in the perturbation vector based on the maximum feature value and the second-largest feature value in the seed feature vector comprises:
determining the perturbation value corresponding to the maximum feature value based on the following formula:
r_max = -(z_max - z_sec) / 2
Determining the perturbation value in the perturbation vector corresponding to the second-largest feature value, based on the maximum feature value and the second-largest feature value in the seed feature vector, comprises:
determining the perturbation value corresponding to the second-largest feature value based on the following formula:
r_sec = (z_max - z_sec) / 2
where z_max represents the maximum feature value and z_sec represents the second-largest feature value.
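The perturbation-vector construction above can be sketched in Python (a minimal illustration, not part of the claims; it assumes the perturbation for the maximum feature value is -(z_max - z_sec)/2, the perturbation for the second-largest feature value is its opposite number, and all other entries are 0, consistent with the constraints stated above; the function name is ours):

```python
def perturbation_vector(seed):
    """Build the perturbation vector for one seed feature vector.

    Assumed rule: the entry for the maximum feature value is
    -(z_max - z_sec) / 2, the entry for the second-largest feature
    value is the opposite number, and all other entries are 0.
    """
    # Indices of the largest and second-largest feature values.
    order = sorted(range(len(seed)), key=lambda i: seed[i], reverse=True)
    i_max, i_sec = order[0], order[1]
    delta = (seed[i_max] - seed[i_sec]) / 2.0
    r = [0.0] * len(seed)
    r[i_max] = -delta  # negative perturbation on the maximum feature value
    r[i_sec] = delta   # opposite number on the second-largest feature value
    return r

# perturbation_vector([3.0, 2.0, 0.5]) -> [-0.5, 0.5, 0.0]
```

Adding such a vector to the seed feature vector moves the two largest feature values toward each other, which matches the selection rule that keys on their difference.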
Illustratively, the virtual adversarial feature vector corresponding to the seed feature vector comprises C virtual values, and the C virtual values are in one-to-one correspondence with the C feature values. Generating the virtual adversarial feature vector corresponding to the seed feature vector based on the seed feature vector and the perturbation vector corresponding to the seed feature vector comprises: for each virtual value in the virtual adversarial feature vector, determining the virtual value as the sum of the feature value corresponding to the virtual value and the perturbation value corresponding to that feature value.
For example, the virtual adversarial training threshold is determined as follows:
generating a perturbed sample corresponding to each natural sample in a classification data set, and inputting the perturbed sample into the deep neural network model to obtain a classification result of the perturbed sample; if the classification result does not match the label value of the natural sample, determining the natural sample to be a seed sample;
inputting all seed samples into the first sub-network of the deep neural network model to obtain an initial seed vector corresponding to each seed sample, wherein the initial seed vector comprises a plurality of feature values;
for each initial seed vector, determining the difference between the maximum feature value and the second-largest feature value in the initial seed vector, and determining a first average of the differences corresponding to all initial seed vectors;
determining the virtual adversarial training threshold based on the first average.
For example, after the perturbed sample is input into the deep neural network model and the classification result of the perturbed sample is obtained, the method further includes: if the classification result matches the label value of the natural sample, determining the natural sample to be a non-seed sample; and inputting all non-seed samples into the first sub-network of the deep neural network model to obtain an initial non-seed vector corresponding to each non-seed sample, wherein the initial non-seed vector comprises a plurality of feature values. Determining the virtual adversarial training threshold based on the first average then includes: for each initial non-seed vector, determining the difference between the maximum feature value and the second-largest feature value in the initial non-seed vector, and determining a second average of the differences corresponding to all initial non-seed vectors; and if the second average is larger than the first average and the difference between the second average and the first average is larger than a preset threshold, determining the first average to be the virtual adversarial training threshold.
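The threshold determination described above can be sketched as follows (a hedged illustration: the function and variable names are ours, and returning None when the condition on the two averages fails is an assumption, since the text does not specify a fallback):

```python
def virtual_adv_threshold(seed_vectors, non_seed_vectors, preset_gap):
    """Determine the virtual adversarial training threshold.

    seed_vectors / non_seed_vectors: lists of initial (non-)seed
    vectors, each a list of feature values. Returns the first average
    if the second average exceeds it by more than preset_gap,
    otherwise None (assumed fallback).
    """
    def top2_gap(vec):
        # Difference between the largest and second-largest feature values.
        top2 = sorted(vec, reverse=True)[:2]
        return top2[0] - top2[1]

    first_avg = sum(top2_gap(v) for v in seed_vectors) / len(seed_vectors)
    second_avg = sum(top2_gap(v) for v in non_seed_vectors) / len(non_seed_vectors)
    if second_avg > first_avg and second_avg - first_avg > preset_gap:
        return first_avg
    return None
```

The intuition is that correctly classified (non-seed) samples should show a clearly larger top-two gap than misclassified (seed) samples; only then is the seed-side average a meaningful threshold.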
Illustratively, updating the parameters of the deep neural network model based on the plurality of target feature vectors comprises: determining a loss value of a target loss function based on the plurality of target feature vectors; and updating the parameters of the deep neural network model based on the loss value.
The application provides a virtual adversarial training device for a deep neural network, comprising: an obtaining module for inputting a plurality of natural samples into a first sub-network of a deep neural network model to obtain a plurality of initial feature vectors corresponding to the natural samples, wherein the deep neural network model comprises a first sub-network and a second sub-network, the second sub-network comprises the last network layer in the deep neural network model, and the first sub-network comprises the remaining network layers except the last network layer; a selecting module for selecting seed feature vectors and non-seed feature vectors from the plurality of initial feature vectors; a generating module for generating, for each seed feature vector, a virtual adversarial feature vector corresponding to the seed feature vector based on the seed feature vector and a perturbation vector corresponding to the seed feature vector; the obtaining module being further configured to input all virtual adversarial feature vectors and all non-seed feature vectors into the second sub-network of the deep neural network model to obtain a plurality of target feature vectors corresponding to the plurality of natural samples; and an updating module for updating parameters of the deep neural network model based on the plurality of target feature vectors.
The application provides virtual adversarial training equipment for a deep neural network, comprising: a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; the processor is configured to execute the machine-executable instructions to perform the following steps:
inputting a plurality of natural samples into a first sub-network of a deep neural network model to obtain a plurality of initial feature vectors corresponding to the natural samples; wherein the deep neural network model comprises a first sub-network and a second sub-network, the second sub-network comprising the last network layer in the deep neural network model, and the first sub-network comprising the remaining network layers except the last network layer;
selecting seed feature vectors and non-seed feature vectors from the plurality of initial feature vectors;
for each seed feature vector, generating a virtual adversarial feature vector corresponding to the seed feature vector based on the seed feature vector and a perturbation vector corresponding to the seed feature vector;
inputting all the virtual adversarial feature vectors and all the non-seed feature vectors into the second sub-network of the deep neural network model to obtain a plurality of target feature vectors corresponding to the plurality of natural samples;
updating parameters of the deep neural network model based on the plurality of target feature vectors.
According to the above technical solution, in the embodiments of the application, virtual adversarial samples can be used when training the deep neural network model, so that the model's resistance to attack samples is improved and its anti-interference capability against attack samples is significantly strengthened. For example, if an attacker adds a small perturbation to an input sample to form an attack sample, then after the attack sample is input to the deep neural network model, the model can still output a correct conclusion rather than a wrong conclusion with high confidence; that is, the output of the deep neural network model is correct, which improves the model's reliability. When virtual adversarial samples are used to train the deep neural network model, no real adversarial samples need to be generated; that is, real adversarial samples do not participate in the adversarial training. Instead, the values of the seed feature vectors are changed at the penultimate layer of the deep neural network model to obtain virtual adversarial samples (i.e., perturbation vectors are added to the seed feature vectors to obtain virtual adversarial feature vectors), and the network parameters are then updated through back propagation. This avoids the repeated gradient computations on samples required to generate real adversarial samples, greatly increases the speed of adversarial training, and shortens the training time. Virtual adversarial training can enhance the reliability and defense effect of the deep neural network model while having little influence on natural-sample classification.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments described in the present application, and those skilled in the art can obtain other drawings from them.
FIG. 1 is a flow diagram of a virtual adversarial training method for a deep neural network in one embodiment of the present application;
FIG. 2 is a schematic diagram of a deep neural network model in one embodiment of the present application;
FIG. 3 is a flow diagram of a virtual adversarial training method for a deep neural network in one embodiment of the present application;
FIG. 4 is a structural diagram of a virtual adversarial training device for a deep neural network in one embodiment of the present application;
FIG. 5 is a structural diagram of virtual adversarial training equipment for a deep neural network in one embodiment of the present application.
Detailed Description
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein is meant to encompass any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in the embodiments of the present application to describe various information, the information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, without departing from the scope of the present application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Moreover, depending on the context, the word "if" as used herein may be interpreted as "upon", "when", or "in response to determining".
In order to improve the anti-interference capability of the deep neural network model against attack samples, in one possible implementation, a training data set may be constructed, and the training data set may include a plurality of natural samples. On this basis, adversarial samples may be added to the training data set, where an adversarial sample is a sample formed by adding a slight perturbation to a natural sample in the training data set.
In summary, the training data set includes natural samples and adversarial samples, and when the deep neural network model is trained using this training data set, it is trained using both the natural samples and the adversarial samples. Training the deep neural network model with adversarial samples realizes adversarial training of the model, which can improve its resistance to attack samples.
However, in the above method, real adversarial samples need to be generated based on natural samples, and generating adversarial samples takes a relatively long time, resulting in a long training time for adversarial training. Because the training data set includes a large number of adversarial samples, the computational complexity of adversarial training is high and its efficiency is low; moreover, after adversarial training, the deep neural network model's classification performance on natural samples can be seriously degraded.
In contrast to the above, in another possible implementation, a training data set may be constructed that includes a plurality of natural samples, without adding adversarial samples to the training data set (i.e., without generating adversarial samples based on the natural samples); that is, when the deep neural network model is trained using this training data set, it is trained using only natural samples. During training, the values of the seed feature vectors are changed at the penultimate layer of the deep neural network model to obtain virtual adversarial samples, and the network parameters are then updated through back propagation, realizing virtual adversarial training of the deep neural network model and improving its resistance to attack samples.
Illustratively, a virtual adversarial sample is generated by changing the feature vector output by the penultimate layer of the deep neural network model, i.e., adding a perturbation vector to the feature vector, after which back propagation is performed to update the network parameters (i.e., the network weights). This is called virtual adversarial training.
In this mode, no real adversarial samples need to be generated from natural samples, which saves the time required to generate adversarial samples, including the time spent computing sample gradients through back propagation to update the perturbations. The training speed of virtual adversarial training is therefore greatly improved, the training time is shortened, and the training efficiency is improved. Because the training data set includes only natural samples rather than a large number of adversarial samples, the computational complexity of virtual adversarial training is reduced. After virtual adversarial training, the deep neural network model's classification performance on natural samples is not seriously degraded, so the influence on natural-sample classification is small. Virtual adversarial training can enhance the reliability and defense effect of the deep neural network model.
The technical solutions of the embodiments of the present application are described below with reference to specific embodiments.
The embodiment of the application provides a virtual adversarial training method for a deep neural network, which can be applied to any electronic device. As shown in fig. 1, which is a schematic flow diagram of the method, the method includes:
step 101, inputting a plurality of natural samples into a first sub-network of a deep neural network model to obtain a plurality of initial feature vectors corresponding to the plurality of natural samples. Illustratively, the deep neural network model includes a first sub-network and a second sub-network, the second sub-network may include a last network layer in the deep neural network model, and the first sub-network may include the remaining network layers except the last network layer.
For example, a deep neural network model may be constructed in advance; this deep neural network model is untrained and may be trained using natural samples, and its structure is not limited. For example, the deep neural network model may be a CNN model, a DNN model, an RNN (Recurrent Neural Network) model, or a fully-connected network model, etc., and may include a plurality of network layers, which may include, but are not limited to, convolutional layers (Conv), pooling layers (Pool), activation layers, fully-connected layers (FC), etc.
For example, all network layers of the deep neural network model may be divided into a first sub-network and a second sub-network, the second sub-network including the last network layer in the deep neural network model and the first sub-network including the remaining network layers except the last network layer. For example, if the deep neural network model includes M network layers, the second sub-network includes the M-th network layer, and the first sub-network includes the 1st through (M-1)-th network layers. If M is 5, the first sub-network includes the 1st network layer (denoted network layer a1), the 2nd network layer (denoted network layer a2), the 3rd network layer (denoted network layer a3), and the 4th network layer (denoted network layer a4), and the second sub-network includes the 5th network layer (denoted network layer a5).
For example, when the deep neural network model is trained, a training data set may be obtained, where the training data set includes a plurality of natural samples (for convenience of distinguishing, sample data in the training data set is recorded as natural samples), and the plurality of natural samples in the training data set needs to be input to the deep neural network model.
When a plurality of natural samples are input into the deep neural network model, the natural samples are input into a first sub-network of the deep neural network model to obtain a plurality of initial feature vectors corresponding to the natural samples.
For example, for each natural sample, the natural sample is input to the network layer a1 of the deep neural network model, the network layer a1 processes the natural sample to obtain a feature vector b1, the feature vector b1 is input to the network layer a2 of the deep neural network model, the feature vector b1 is processed by the network layer a2 to obtain a feature vector b2, the feature vector b2 is input to the network layer a3 of the deep neural network model, the feature vector b2 is processed by the network layer a3 to obtain a feature vector b3, the feature vector b3 is input to the network layer a4 of the deep neural network model, and the feature vector b3 is processed by the network layer a4 to obtain a feature vector b 4. So far, the last network layer of the first subnetwork, which is also the penultimate network layer of the deep neural network model, has been reached, and the feature vector b4 output by this network layer is recorded as the initial feature vector.
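The division into two sub-networks and the layer-by-layer forward pass described above can be sketched as follows (a minimal illustration in which plain callables stand in for network layers a1 to a5; the function names are ours, not the patent's):

```python
def forward_first_subnetwork(layers, sample):
    """Run a sample through the first sub-network (all layers but the
    last); the output is the initial feature vector, i.e. the output
    of the penultimate layer of the full model."""
    x = sample
    for layer in layers[:-1]:  # first sub-network: layers a1..a(M-1)
        x = layer(x)
    return x

def forward_second_subnetwork(layers, feature_vector):
    """The second sub-network is just the last layer a_M."""
    return layers[-1](feature_vector)

# Toy stand-ins for a five-layer model: four layers that add 1,
# and a final layer that doubles its input.
toy_layers = [lambda x: x + 1] * 4 + [lambda x: x * 2]
# forward_first_subnetwork(toy_layers, 0) -> 4
# forward_second_subnetwork(toy_layers, 4) -> 8
```

Splitting the forward pass this way is what lets the training loop intercept and perturb the penultimate-layer feature vector before the final layer runs.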
Obviously, after each natural sample is input into the first sub-network of the deep neural network model, the initial feature vector corresponding to the natural sample can be obtained, i.e. a plurality of initial feature vectors are obtained.
Step 102, selecting seed feature vectors and non-seed feature vectors from a plurality of initial feature vectors.
For example, after obtaining the plurality of initial feature vectors corresponding to the plurality of natural samples, the initial feature vectors may be divided into seed feature vectors and non-seed feature vectors. A seed feature vector is an initial feature vector that needs to be changed (i.e., changed into a virtual adversarial feature vector), and a non-seed feature vector is an initial feature vector that does not need to be changed (i.e., it is kept unchanged and is not changed into a virtual adversarial feature vector).
In one possible implementation, each initial feature vector of the plurality of initial feature vectors may include a plurality of feature values; e.g., the initial feature vector is a matrix of C1 × C2 dimensions, and each value in the matrix is referred to as a feature value. It is subsequently assumed that the initial feature vector includes C feature values.
On this basis, if the difference between the maximum feature value and the second-largest feature value in the initial feature vector is smaller than the virtual adversarial training threshold, the initial feature vector is selected as a seed feature vector. Alternatively, if the difference is not smaller than the virtual adversarial training threshold, the initial feature vector is selected as a non-seed feature vector.
For example, for each initial feature vector, the largest feature value and the second-largest feature value are first found among its C feature values, and the difference between them is calculated. If the difference is smaller than the virtual adversarial training threshold, the initial feature vector is a seed feature vector; otherwise it is a non-seed feature vector.
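The selection rule of step 102 can be sketched as (function name illustrative):

```python
def split_seed_vectors(initial_vectors, threshold):
    """Split initial feature vectors into seed and non-seed vectors.

    A vector is a seed vector when the gap between its largest and
    second-largest feature values is below the threshold.
    """
    seeds, non_seeds = [], []
    for vec in initial_vectors:
        top2 = sorted(vec, reverse=True)[:2]
        if top2[0] - top2[1] < threshold:
            seeds.append(vec)
        else:
            non_seeds.append(vec)
    return seeds, non_seeds
```

A small top-two gap means the model is nearly undecided between two classes, so those vectors are the ones worth perturbing.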
Step 103, aiming at each seed feature vector, based on the seed feature vector and the perturbation vector corresponding to the seed feature vector, generating a virtual countermeasure feature vector corresponding to the seed feature vector.
For example, for each seed feature vector, the seed feature vector may include C feature values (e.g., the seed feature vector is a matrix of C1 × C2 dimensions, each value in the matrix is referred to as a feature value). The perturbation vector corresponding to the seed feature vector may include C perturbation values (e.g., the perturbation vector is also a matrix of C1 × C2 dimensions, each value in the matrix is referred to as a perturbation value), and the virtual countermeasure feature vector corresponding to the seed feature vector may include C virtual values (e.g., the virtual countermeasure feature vector is also a matrix of C1 × C2 dimensions, each value in the matrix is referred to as a virtual value). To sum up, the C virtual values in the virtual countermeasure feature vector correspond one-to-one to the C feature values in the seed feature vector, and the C perturbation values in the perturbation vector correspond one-to-one to the C feature values in the seed feature vector.
Based on this, generating a virtual countermeasure feature vector corresponding to the seed feature vector based on the seed feature vector and the perturbation vector corresponding to the seed feature vector may include: for each virtual value in the virtual confrontation feature vector, determining the virtual value based on the sum of the feature value corresponding to the virtual value and the disturbance value corresponding to the feature value. A virtual confrontation feature vector corresponding to the seed feature vector may be determined based on all virtual values.
For example, assume the seed feature vector includes feature value d11, feature value d12, feature value d13, and feature value d14; the perturbation vector includes perturbation value d21 (corresponding to feature value d11), perturbation value d22 (corresponding to feature value d12), perturbation value d23 (corresponding to feature value d13), and perturbation value d24 (corresponding to feature value d14); and the virtual confrontation feature vector includes virtual value d31 (corresponding to feature value d11), virtual value d32 (corresponding to feature value d12), virtual value d33 (corresponding to feature value d13), and virtual value d34 (corresponding to feature value d14). The virtual value d31 may be determined based on the sum of feature value d11 and perturbation value d21, i.e., d31 = d11 + d21; the virtual value d32 based on the sum of d12 and d22, i.e., d32 = d12 + d22; the virtual value d33 based on the sum of d13 and d23, i.e., d33 = d13 + d23; and the virtual value d34 based on the sum of d14 and d24, i.e., d34 = d14 + d24. Obviously, after obtaining d31, d32, d33 and d34, these four virtual values constitute the virtual confrontation feature vector.
In summary, for each seed feature vector, a virtual confrontation feature vector corresponding to the seed feature vector may be obtained, and the virtual confrontation feature vector may be used as a virtual confrontation sample.
Illustratively, for each seed feature vector, the seed feature vector includes C feature values, the perturbation vector corresponding to the seed feature vector includes C perturbation values, and the C perturbation values in the perturbation vector correspond one-to-one to the C feature values in the seed feature vector. Based on this, the manner of determining the perturbation vector corresponding to the seed feature vector may include, but is not limited to: determining the perturbation value in the perturbation vector corresponding to the maximum feature value based on the maximum feature value and the second-largest feature value in the seed feature vector; determining the perturbation value in the perturbation vector corresponding to the second-largest feature value based on the maximum feature value and the second-largest feature value in the seed feature vector; and, for the remaining feature values other than the maximum feature value and the second-largest feature value in the seed feature vector, determining the corresponding perturbation values in the perturbation vector to be 0.
For example, assume that the seed feature vector includes feature value d11, feature value d12, feature value d13 and feature value d14, and that feature value d11 is the largest feature value and feature value d13 is the second-largest feature value. The perturbation vector includes perturbation value d21 (corresponding to feature value d11), perturbation value d22 (corresponding to feature value d12), perturbation value d23 (corresponding to feature value d13), and perturbation value d24 (corresponding to feature value d14).
Based on this, the perturbation value d21 (i.e., the perturbation value in the perturbation vector corresponding to the maximum feature value) can be determined based on feature value d11 and feature value d13. The perturbation value d23 (i.e., the perturbation value in the perturbation vector corresponding to the second-largest feature value) may also be determined based on feature value d11 and feature value d13. The perturbation values d22 and d24 are both determined to be 0 (i.e., the perturbation values corresponding to the remaining feature values in the perturbation vector are 0).
In summary, for each seed feature vector, a perturbation vector corresponding to the seed feature vector may be obtained, and then a virtual countermeasure feature vector may be obtained based on the perturbation vector and the seed feature vector.
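Putting the two steps together, a sketch of building the perturbation vector and the resulting virtual confrontation feature vector; the half-gap magnitude here matches the opposite-number perturbation of formula (1) in the text, and all names are illustrative:

```python
import numpy as np

def perturbation_vector(seed_vec):
    """Non-zero perturbation only at the largest and second-largest
    feature values; the two entries are opposite numbers (negative at
    the maximum, positive at the second maximum)."""
    seed_vec = np.asarray(seed_vec, dtype=float)
    order = np.argsort(seed_vec)
    y, s = order[-1], order[-2]            # indices of max / second max
    eta = np.zeros_like(seed_vec)
    half_gap = (seed_vec[y] - seed_vec[s]) / 2.0
    eta[y] = -half_gap                     # pull the maximum down
    eta[s] = +half_gap                     # push the runner-up up
    return eta

seed = np.array([0.9, 3.0, 2.6, 0.4])      # max at index 1, second max at 2
eta = perturbation_vector(seed)
virtual = seed + eta                        # virtual confrontation feature vector
# The two largest feature values are now approximately tied at 2.8.
```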
In a possible implementation manner, the perturbation value corresponding to the maximum feature value and the perturbation value corresponding to the second-largest feature value may be opposite numbers; the perturbation value corresponding to the maximum feature value may be a negative number, and the perturbation value corresponding to the second-largest feature value may be a positive number. For example, determining the perturbation value in the perturbation vector corresponding to the maximum feature value based on the maximum feature value and the second-largest feature value in the seed feature vector may include, but is not limited to: determining the perturbation value corresponding to the maximum feature value based on the following formula: $-\frac{z_y - z_s}{2}$. Determining the perturbation value in the perturbation vector corresponding to the second-largest feature value based on the maximum feature value and the second-largest feature value in the seed feature vector may include, but is not limited to: determining the perturbation value corresponding to the second-largest feature value based on the following formula: $\frac{z_y - z_s}{2}$. Illustratively, $z_y$ represents the maximum feature value and $z_s$ represents the second-largest feature value.
For example, when determining the perturbation value d21 based on feature value d11 and feature value d13, the perturbation value d21 may be $-\frac{d11 - d13}{2}$; and when determining the perturbation value d23 based on feature value d11 and feature value d13, the perturbation value d23 may be $\frac{d11 - d13}{2}$, where d11 is the maximum feature value and d13 is the second-largest feature value.
Step 104, inputting all the virtual confrontation feature vectors and all the non-seed feature vectors into a second sub-network of the deep neural network model to obtain a plurality of target feature vectors corresponding to the plurality of natural samples.
For example, for each natural sample, after the natural sample is input to the first sub-network of the deep neural network model, the initial feature vector corresponding to the natural sample can be obtained. And if the initial characteristic vector is a non-seed characteristic vector, inputting the non-seed characteristic vector to a second sub-network of the deep neural network model, namely the last network layer of the deep neural network model, processing the non-seed characteristic vector by the network layer, and marking the characteristic vector processed by the network layer as a target characteristic vector to obtain the target characteristic vector corresponding to the natural sample. If the initial feature vector is a seed feature vector, generating a virtual countermeasure feature vector corresponding to the seed feature vector, inputting the virtual countermeasure feature vector to a second sub-network of the deep neural network model, namely the last network layer of the deep neural network model, and processing the virtual countermeasure feature vector by the network layer to obtain a target feature vector corresponding to the natural sample.
Obviously, after each natural sample is input to the first sub-network of the deep neural network model, the target feature vector corresponding to the natural sample can be obtained, i.e. a plurality of target feature vectors are obtained.
Step 105, updating parameters (i.e., network parameters) of the deep neural network model based on the plurality of target feature vectors, that is, updating the network weights of the deep neural network model.
For example, a loss value of the target loss function may be determined based on the plurality of target feature vectors and a parameter of the deep neural network model may be updated based on the loss value. For example, a target loss function may be configured in advance, an input of the target loss function may be a feature vector output by the last network layer of the deep neural network model (that is, a target feature vector may be a target feature vector corresponding to the virtual confrontation feature vector or a target feature vector corresponding to the non-seed feature vector), and an output of the target loss function may be a loss value, and the target loss function is not limited as long as the input-output relationship is satisfied. Based on this, after the target feature vector corresponding to the natural sample is obtained, the target feature vector can be substituted into the target loss function to obtain the loss value of the target loss function.
Then, parameters of the deep neural network model can be updated based on the loss value of the target loss function, that is, network parameters (namely, network weights) of the deep neural network model are updated through a back propagation algorithm, so that the updated deep neural network model is obtained, and the updating process is not limited. An example of a back propagation algorithm may be a gradient descent method, i.e. the network weights of the deep neural network model are updated by a gradient descent method.
After the updated deep neural network model is obtained, it is determined whether the training of the deep neural network model is complete. If so, the updated deep neural network model is taken as the finally used deep neural network model, and detection is performed based on this deep neural network model: in the detection process, an input sample can be input into the deep neural network model, and the deep neural network model gives the detection result of the input sample. If not, the updated deep neural network model is used as the deep neural network model to be trained, and the process returns to step 101, i.e., a plurality of natural samples are input to the first sub-network of the (updated) deep neural network model to obtain a plurality of initial feature vectors corresponding to the plurality of natural samples, and so on.
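A compact, self-contained sketch of one training iteration (steps 101 to 105) on a toy model, with a single linear layer standing in for the first sub-network and softmax cross-entropy for the second; the half-gap perturbation and all names here are illustrative, and the perturbation is treated as a constant during back propagation:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def train_step(x, label, W, threshold=0.5, lr=0.1):
    z = x @ W                              # first sub-network: Logit vector
    order = np.argsort(z)
    y, s = order[-1], order[-2]
    if z[y] - z[s] < threshold:            # seed vector: perturb the Logit
        eta = np.zeros_like(z)
        eta[y] = -(z[y] - z[s]) / 2.0
        eta[s] = (z[y] - z[s]) / 2.0
        z = z + eta                        # virtual confrontation feature vector
    p = softmax(z)                         # second sub-network (last layer)
    loss = -np.log(p[label] + 1e-12)       # target loss function (cross-entropy)
    onehot = np.eye(len(p))[label]
    grad_W = np.outer(x, p - onehot)       # back propagation through the linear layer
    return W - lr * grad_W, loss           # gradient descent update

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 3))
x = np.array([1.0, 0.5, -0.3, 0.2])
W, loss = train_step(x, label=1, W=W)
```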
In one possible embodiment, the virtual confrontation training threshold may be configured empirically, or may be determined in the following manner (the manner of determining the virtual confrontation training threshold is not limited to this). Generate a disturbance sample corresponding to each natural sample in a classification data set, and input the disturbance sample to the deep neural network model to obtain a classification result of the disturbance sample; if the classification result does not match the label value of the natural sample, determine the natural sample as a seed sample. Input all the seed samples into the first sub-network of the deep neural network model to obtain an initial seed vector corresponding to each seed sample, wherein the initial seed vector includes a plurality of feature values. For each initial seed vector, determine the difference between the maximum feature value and the second-largest feature value in the initial seed vector, and determine a first average value of the differences corresponding to all the initial seed vectors; the virtual confrontation training threshold is then determined based on the first average value.
For example, a classification dataset may be constructed, which may be different from the training dataset, or some natural samples (a small number of natural samples) may be selected from the training dataset, and the classification dataset may be constructed using these natural samples. Each sample data in the classified data set is marked as a natural sample, and each natural sample in the classified data set can have a label value, and the label value represents a classification result of the natural sample.
For each natural sample in the classification dataset, a perturbation sample (i.e., a challenge sample) corresponding to the natural sample may be generated, and the generation manner of the challenge sample is not limited, for example, a slight perturbation is added to the natural sample to form a perturbation sample corresponding to the natural sample.
After obtaining the perturbation sample corresponding to the natural sample, the perturbation sample may be input to the deep neural network model (i.e., sequentially pass through the first sub-network and the second sub-network), a feature vector is output by a last network layer (i.e., the second sub-network) of the deep neural network model, and the deep neural network model may obtain a classification result of the perturbation sample based on the feature vector and output the classification result of the perturbation sample.
After the classification result of the disturbance sample is obtained, if the classification result is not matched with the label value of the natural sample, it indicates that the classification result of the deep neural network model on the natural sample is wrong, and the natural sample is determined as a seed sample. And if the classification result is matched with the label value of the natural sample, the classification result of the deep neural network model on the natural sample is correct, and the natural sample is determined to be a non-seed sample.
In summary, after the disturbance samples corresponding to all the natural samples in the classification data set are input to the deep neural network model, all the natural samples in the classification data set can be divided into seed samples and non-seed samples based on the classification result of each disturbance sample, so as to obtain a plurality of seed samples and a plurality of non-seed samples.
After all the natural samples in the classification dataset are divided into seed samples and non-seed samples, all the seed samples (which are not the disturbance samples corresponding to the natural samples) may be input to the first sub-network of the deep neural network model, so as to obtain an initial seed vector corresponding to each seed sample, where the initial seed vector may include a plurality of feature values. For example, referring to step 101, the deep neural network model includes a first sub-network and a second sub-network, all the seed samples may be input into the first sub-network of the deep neural network model, a feature vector corresponding to each seed sample is output by the first sub-network, and the feature vector is recorded as an initial seed vector.
Based on each initial seed vector output by the first sub-network, the largest feature value and the second-largest feature value can be found from all feature values of the initial seed vector, and the difference between them is calculated, i.e., one difference for each initial seed vector. Then, the average of the differences corresponding to all the initial seed vectors is determined (this average is taken as the first average). A virtual confrontation training threshold is then determined based on the first average, e.g., the virtual confrontation training threshold is the first average.
In summary, the virtual confrontation training threshold can be obtained. In another possible implementation, the virtual confrontation training threshold can be determined as follows: after all the natural samples in the classification data set are divided into seed samples and non-seed samples, all the non-seed samples (the non-seed samples themselves, not the disturbance samples corresponding to them) can be input to the first sub-network of the deep neural network model to obtain an initial non-seed vector corresponding to each non-seed sample, where the initial non-seed vector may include a plurality of feature values.
Based on each initial non-seed vector output by the first sub-network, the largest feature value and the second-largest feature value may be found from all feature values of the initial non-seed vector, and the difference between them is calculated, i.e., one difference for each initial non-seed vector. Then, the average of the differences corresponding to all the initial non-seed vectors is determined (this average is taken as the second average).
If the second average value is greater than the first average value and the difference between the second average value and the first average value is greater than a preset threshold (which may be configured empirically), the first average value is determined as the virtual confrontation training threshold. If the second average value is not greater than the first average value, or the second average value is greater than the first average value, but the difference between the second average value and the first average value is not greater than a preset threshold, not determining the first average value as a virtual confrontation training threshold, constructing a new classification data set, and repeating the steps to determine the virtual confrontation training threshold.
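The threshold-selection procedure above can be sketched as follows (function and variable names are illustrative; `preset` stands for the empirically configured preset threshold):

```python
import numpy as np

def top2_gap_mean(logit_vectors):
    """Mean difference between the largest and second-largest feature
    value over a batch of penultimate-layer vectors."""
    gaps = []
    for z in logit_vectors:
        top2 = np.sort(np.asarray(z, dtype=float).ravel())[-2:]
        gaps.append(top2[1] - top2[0])
    return float(np.mean(gaps))

def estimate_threshold(seed_logits, non_seed_logits, preset=1.0):
    first_avg = top2_gap_mean(seed_logits)         # misclassified (seed) samples
    second_avg = top2_gap_mean(non_seed_logits)    # correctly classified samples
    if second_avg > first_avg and second_avg - first_avg > preset:
        return first_avg       # accept as the virtual confrontation threshold
    return None                # build a new classification data set and retry

seed_vecs = [np.array([2.0, 1.9, 0.0]), np.array([1.5, 1.3, 0.2])]
non_seed_vecs = [np.array([5.0, 1.0, 0.0]), np.array([6.0, 2.0, 0.1])]
threshold = estimate_threshold(seed_vecs, non_seed_vecs)
```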
According to the technical scheme, in the embodiment of the application, virtual confrontation samples can be used when training the deep neural network model, so that the robustness of the deep neural network model against attack samples can be improved and its anti-interference capability against attack samples is significantly enhanced. For example, if an attacker adds a small disturbance to an input sample to form an attack sample, then after the attack sample is input to the deep neural network model, the model can still output a correct conclusion rather than an incorrect conclusion with high confidence; that is, the output of the deep neural network model remains correct, which improves its reliability.

When virtual confrontation samples are used for training, no real confrontation samples need to be generated, i.e., real confrontation samples do not participate in the confrontation training. Instead, the values of the seed feature vectors are changed at the second-to-last layer of the deep neural network model to obtain virtual confrontation samples (i.e., a perturbation vector is added to each seed feature vector to obtain a virtual confrontation feature vector), and the network parameters are then updated through back propagation. This avoids the repeated computations that generating real confrontation samples would require, greatly increases the speed of confrontation training, and shortens the training time. The virtual confrontation training can enhance the reliability and defense effect of the deep neural network model while having little influence on natural sample classification.
The following describes the technical solution of the embodiment of the present application with reference to a specific application scenario.
Before the technical solution of the embodiment of the present application is introduced, concepts related to the embodiment of the present application are introduced:
Training data set: define a training data set $D = \{(x_i, y_i)\}_{i=1}^{N}$, where $x_i$ represents a natural sample, $y_i$ is the class label of $x_i$, i.e., the label value of the natural sample $x_i$, and $N$ represents the total number of natural samples.
Deep neural network model: define $f_{\theta}$ as the deep neural network model with parameter $\theta$; the deep neural network model is a pre-trained neural network model, i.e., the deep neural network model that needs to be trained. The deep neural network model can be expressed as $f_{\theta}(x)$. The feature vector (Logit) output by the second-to-last layer of the deep neural network model is recorded as $z(x)$, and the feature vector (i.e., probability vector) output by the last layer of the deep neural network model is recorded as $p(x)$. Regarding the feature vector (i.e., probability vector) output by the last layer: $p(x) = \mathrm{softmax}(z(x))$, $k \in \{1, 2, \ldots, C\}$, where $p_k(x)$ indicates the confidence that the sample is discriminated as the $k$-th class, and $\hat{y} = \arg\max_k p_k(x)$ is the predicted classification label of the natural sample $x$.
The challenge sample: for a natural sample $x$, the correct class label of the natural sample is $y$. If there is a disturbance $\delta$ with $\|\delta\| \le \epsilon$ such that $x' = x + \delta$ satisfies $f_{\theta}(x') \ne y$, then $x'$ is called a challenge (adversarial) sample.
Virtual confrontation sample: for a natural sample $x$, the correct class label of the natural sample is $y$. If the feature vector (Logit) of the natural sample output by the second-to-last layer of the deep neural network model is $z(x)$, a perturbation is added on the Logit, i.e., $z(x) + \eta$. If there is a corresponding $x^{*}$ in the input space (e.g., the natural sample space) satisfying $z(x^{*}) = z(x) + \eta$, then $x^{*}$ is referred to as a virtual confrontation sample.
Illustratively, the training data set $D$ may be divided into $K$ sub data sets (minibatch), each sub data set comprising $M$ natural samples. Set T epochs (one complete training pass of the deep neural network model using all data of the training data set is called one generation of training, i.e., an epoch; T epochs represent T complete training passes of the deep neural network model), and set the virtual confrontation training threshold as $\epsilon$.

All natural samples in the first sub data set are input into the deep neural network model with parameter $\theta_{0}$. After the natural samples pass through the second-to-last layer of the deep neural network model, the second-to-last layer outputs a plurality of initial feature vectors, recorded as $\{z_{j}\}_{j=1}^{M}$; the $M$ natural samples correspond to $M$ initial feature vectors. For each initial feature vector, the initial feature vector is noted as $z_{j}$, representing the initial feature vector corresponding to the $j$-th natural sample, and the initial feature vector includes C feature values.
From the $M$ initial feature vectors, the initial feature vectors $z_{j}$ satisfying the condition $z_{y} - z_{s} < \epsilon$ (where $z_{y}$ and $z_{s}$ are the largest and second-largest feature values of $z_{j}$) are selected to compose $Z_{seed}$, which contains the seed feature vectors; the remaining initial feature vectors among the $M$ initial feature vectors, other than the seed feature vectors, are non-seed feature vectors, recorded as $Z_{non}$.
Illustratively, the condition $z_{y} - z_{s} < \epsilon$ satisfied by an initial feature vector $z_{j}$ means: the difference between the largest and second-largest feature values of the initial feature vector $z_{j}$ is less than the virtual countermeasure training threshold $\epsilon$. Then, $Z_{non}$ is directly input to the last network layer of the deep neural network model, and the corresponding perturbations $\eta$ are added to the vectors in $Z_{seed}$ to obtain virtual confrontation samples, which are likewise input to the last network layer of the deep neural network model. The last network layer determines the loss value based on $Z_{non}$ and the virtual confrontation samples, and back propagation then updates the parameter $\theta_{0}$ to $\theta_{1}$, thus completing the training process of one sub data set.
For the remaining K−1 sub data sets, the above training process is repeated until the parameter is updated from $\theta_{1}$ to $\theta_{K}$, completing the training of one epoch, i.e., one complete training pass of the deep neural network model using all data of the training data set. After repeating this training process T times, the training of the deep neural network model is complete, and the deep neural network model that finally needs to be output is obtained.
In one possible implementation, the training process of the deep neural network model may be as shown in fig. 2, and the deep neural network model may include a plurality of network layers, for example, 9 network layers in fig. 2. The first network layer is conv1, the second network layer is conv2, the third network layer is conv3, the fourth network layer is conv4, the fifth network layer is conv5, the sixth network layer is fc6, the seventh network layer is fc7, the eighth network layer is fc8, and the ninth network layer is softmax. Obviously, fc8 is the penultimate network layer of the deep neural network model, and softmax is the last network layer of the deep neural network model.
Referring to fig. 2, for the initial feature vector (Logit) output from fc8, it is determined whether the Logit satisfies the condition $z_{y} - z_{s} < \epsilon$. If so, the corresponding perturbation $\eta$ is added to the Logit to obtain a virtual confrontation sample (Logit $+\,\eta$), and (Logit $+\,\eta$) is input to softmax. If not, the Logit is directly input to softmax.
In the above application scenario, referring to fig. 3, a schematic flow chart of a virtual confrontation training method for a deep neural network is shown, where the method may be applied to any electronic device, and the method may include:
Step 301, dividing the training data set D into K sub data sets (minibatch), where each sub data set includes M natural samples; T epochs are set, and the virtual countermeasure training threshold is set as $\epsilon$.

Step 302, inputting all natural samples in the first sub data set to the deep neural network model; after the natural samples pass through the second-to-last layer of the deep neural network model, the second-to-last layer outputs a plurality of initial feature vectors (Logit), recorded as $\{z_{j}\}_{j=1}^{M}$. For each initial feature vector, the initial feature vector is denoted as $z_{j}$, representing the initial feature vector corresponding to the $j$-th natural sample; the initial feature vector may include C feature values.

Step 303, selecting the initial feature vectors satisfying the condition from all initial feature vectors (e.g., M initial feature vectors) as seed feature vectors, and using the remaining initial feature vectors except the seed feature vectors as non-seed feature vectors. Illustratively, a seed feature vector satisfies the condition that the difference between its largest feature value and second-largest feature value is less than the virtual countermeasure training threshold $\epsilon$, which may be represented as $z_{y} - z_{s} < \epsilon$. All seed feature vectors can be recorded as $Z_{seed}$, and all non-seed feature vectors as $Z_{non}$.
Step 304, adding the corresponding perturbation vector to each seed feature vector to obtain the virtual countermeasure feature vectors.
Step 305, inputting all the virtual confrontation feature vectors to the last network layer of the deep neural network model, and inputting all the non-seed feature vectors to the last network layer of the deep neural network model.
Step 306, determining a loss value based on all the virtual confrontation feature vectors (i.e. virtual confrontation samples) and all the non-seed feature vectors through the last network layer of the deep neural network model, and then performing back propagation to update the network parameters of the deep neural network model, thereby completing the training process of one sub data set.
For the remaining K−1 sub data sets, the training process of steps 302 to 306 is repeated, continuously updating the network parameters of the deep neural network model and finishing the training of one epoch.

Then, for all the K sub data sets, the training process of steps 302 to 306 is repeated again, continuously updating the network parameters of the deep neural network model and completing the training of another epoch.

This process is repeated by analogy until the training of T epochs is finished.
In step 304, for each seed feature vector, a corresponding perturbation vector needs to be added to the seed feature vector to obtain a virtual countermeasure feature vector, so that before step 304, a perturbation vector needs to be determined for the seed feature vector, and a determination manner of the perturbation vector can be shown in formula (1).
    r_j = (z_s - z_y) / 2, if j = y;
    r_j = (z_y - z_s) / 2, if j = s;
    r_j = 0, otherwise.
Formula (1)
In formula (1), r = (r_1, ..., r_C) represents a perturbation vector comprising C perturbation values, wherein the C perturbation values in the perturbation vector correspond one-to-one to the C eigenvalues in the seed eigenvector z = (z_1, ..., z_C). z_y represents the largest eigenvalue in the seed eigenvector, and z_s represents the second largest eigenvalue in the seed eigenvector. The perturbation value for j = y corresponds to the largest eigenvalue in the seed eigenvector, the perturbation value for j = s corresponds to the second largest eigenvalue in the seed eigenvector, and the perturbation values for j not equal to y and j not equal to s correspond to the remaining eigenvalues in the seed eigenvector other than the largest eigenvalue and the second largest eigenvalue.
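Formula (1) can be transcribed directly as follows; the feature vector is a plain Python list, and ties between equal values are broken arbitrarily, an assumption the patent does not address:

```python
def perturbation_vector(z):
    """Formula (1): r_y = (z_s - z_y)/2, r_s = (z_y - z_s)/2, all other entries 0,
    where z_y and z_s are the largest and second largest values of z."""
    order = sorted(range(len(z)), key=z.__getitem__, reverse=True)
    y, s = order[0], order[1]          # indices of the largest / second largest eigenvalue
    r = [0.0] * len(z)
    r[y] = (z[s] - z[y]) / 2.0         # negative: pulls the top value down
    r[s] = (z[y] - z[s]) / 2.0         # positive: pushes the runner-up value up
    return r
```

Adding r to z makes the two largest entries tie at (z_y + z_s)/2, which is exactly the relaxed constraint z_y + r_y = z_s + r_s used in problem (3) below.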
The derivation process of formula (1) is described below with reference to a specific embodiment.
Assume a natural sample x has a true label y, and the predictive label ŷ of the natural sample x is consistent with the true label, i.e., ŷ = y. Assume the output of the virtual confrontation sample x̂ at Softmax (i.e., the last network layer of the deep neural network model) is p(x̂) = Softmax(z + r), and the output label of the virtual confrontation sample is ŷ'. From the definition of the confrontation sample, ŷ' ≠ y; that is to say, the probability assigned to the true label y is no longer the largest. Therefore, Softmax(z_y + r_y) < max over j ≠ y of Softmax(z_j + r_j). Since the Softmax function of the last network layer of the deep neural network model is monotonically increasing in each logit, the above inequality is equivalent to z_y + r_y < max over j ≠ y of (z_j + r_j).
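The monotonicity step can be checked numerically: since exp is strictly increasing and the normalizer is shared across coordinates, the ordering of Softmax outputs matches the ordering of the logits. A minimal sketch with toy logits (not values from the patent):

```python
import math

def softmax(z):
    """Numerically stable Softmax over a list of logits."""
    m = max(z)                              # subtract the max to avoid overflow
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]
```

Because the normalizer `total` is the same for every coordinate, p_i > p_j holds exactly when z_i > z_j, which justifies replacing the inequality on probabilities with the inequality on logits.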
Thus, the process of generating a virtual confrontation sample can be represented as the following optimization problem (1):
    min ||r||_2
    s.t. z_y + r_y ≤ max over j ≠ y of (z_j + r_j)
Problem (1)
In order to minimize the objective function in problem (1), the minimum perturbation satisfying the constraint condition is sought. To this end, the second largest value in (z_1, ..., z_C) is found and noted as z_s, so that the constraint can be written as z_y + r_y ≤ z_s + r_s. Obviously, when r_j = 0 for all j not equal to y and not equal to s, ||r||_2 reaches its minimum; in this case, the constraint of problem (1) reduces to z_y + r_y ≤ z_s + r_s.
Therefore, problem (1) can be equated to problem (2):
    min sqrt(r_y^2 + r_s^2)
    s.t. z_y + r_y ≤ z_s + r_s
         r_j = 0, for j ≠ y and j ≠ s
Problem (2)
Illustratively, by relaxing the constraint and requiring z_y + r_y = z_s + r_s, the r_y and r_s found under this equality can be an approximately optimal solution to problem (2). Obviously, since minimizing sqrt(r_y^2 + r_s^2) is also solved by minimizing r_y^2 + r_s^2, the above problem (2) can be expressed as problem (3):
    min r_y^2 + r_s^2
    s.t. z_y + r_y = z_s + r_s
Problem (3)
Illustratively, when solving problem (3) using the Lagrangian multiplier method, the Lagrangian function can be written as:
    L(r_y, r_s, λ) = r_y^2 + r_s^2 + λ (z_y + r_y - z_s - r_s)
and the optimality conditions are:
    ∂L/∂r_y = 2 r_y + λ = 0,  ∂L/∂r_s = 2 r_s - λ = 0,  z_y + r_y = z_s + r_s
In conclusion, the optimality conditions give r_y = -λ/2 and r_s = λ/2. When λ = 0, r_y = 0 and r_s = 0, and the constraint requires z_y = z_s; and when λ = z_y - z_s, r_y = (z_s - z_y)/2 and r_s = (z_y - z_s)/2, and the constraint is satisfied. It is obvious that λ = 0 with r_y = 0 and r_s = 0 is not satisfactory as a solution to problem (3), since in general z_y > z_s. In summary, for problem (3), the near-optimal solution may be the perturbation vector r with r_y = (z_s - z_y)/2, r_s = (z_y - z_s)/2, and all other components equal to 0, which is exactly the perturbation vector shown in formula (1). In physical space, there is a virtual confrontation sample x̂ whose feature vector satisfies z(x̂) = z(x) + r.
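The closed-form stationary point can be sanity-checked against a brute-force grid search over the constrained problem; the toy logits and grid parameters below are illustrative assumptions, not values from the patent:

```python
def closed_form(z_y, z_s):
    """Stationary point of problem (3): r_y = (z_s - z_y)/2, r_s = (z_y - z_s)/2."""
    return (z_s - z_y) / 2.0, (z_y - z_s) / 2.0

def brute_force(z_y, z_s, step=0.001, span=3.0):
    """Smallest r_y^2 + r_s^2 with z_y + r_y = z_s + r_s (scan r_s, derive r_y)."""
    best = None
    n = int(2 * span / step)
    for i in range(n + 1):
        r_s = -span + i * step
        r_y = z_s + r_s - z_y              # enforce the equality constraint exactly
        cost = r_y * r_y + r_s * r_s
        if best is None or cost < best[0]:
            best = (cost, r_y, r_s)
    return best[1], best[2]
```

For z_y = 2 and z_s = 1, both methods agree on r_y = -0.5 and r_s = 0.5, i.e., the two largest logits meet halfway.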
In step 303, it is necessary to select seed feature vectors and non-seed feature vectors from all the initial feature vectors: the initial feature vectors meeting the condition are used as seed feature vectors, and the remaining initial feature vectors, i.e., the initial feature vectors not meeting the condition, are used as non-seed feature vectors. For example, if the difference between the largest eigenvalue and the second largest eigenvalue in an initial eigenvector is smaller than the virtual confrontation training threshold ε, then the initial feature vector may be used as a seed feature vector. If the difference between the largest eigenvalue and the second largest eigenvalue in the initial eigenvector is not less than the virtual confrontation training threshold ε, then the initial feature vector may be used as a non-seed feature vector.
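The selection rule of step 303 can be sketched as follows; feature vectors are plain Python lists, and `select_vectors` is an illustrative helper name, not from the patent:

```python
def select_vectors(initial_vectors, eps):
    """Step 303: a vector z is a seed when z_y - z_s < eps
    (largest value minus second largest value below the threshold)."""
    seeds, non_seeds = [], []
    for z in initial_vectors:
        z_y, z_s = sorted(z, reverse=True)[:2]   # two largest eigenvalues
        (seeds if z_y - z_s < eps else non_seeds).append(z)
    return seeds, non_seeds
```

A vector with a narrow top-two gap sits close to the decision boundary, which is exactly why it is worth perturbing.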
For example, for virtual confrontation training, if all natural samples become virtual confrontation samples and participate in training, the classification effect of the deep neural network model on natural samples is affected. Therefore, in this embodiment, constraint conditions may be given so that only part of the natural samples become virtual confrontation samples for training, while the remaining natural samples participate in training directly without becoming virtual confrontation samples.
Based on the above, for the initial feature vector corresponding to a natural sample, if the initial feature vector satisfies the constraint condition, the initial feature vector is used as a seed feature vector, and the perturbation vector r is added to it, making it a virtual confrontation sample that participates in training. If the initial feature vector does not satisfy the constraint condition, the initial feature vector is used as a non-seed feature vector that directly participates in training, without adding the perturbation vector r. Therefore, not all natural samples are changed into virtual confrontation samples for training, which improves the classification effect of the deep neural network model on natural samples.
For the initial feature vector corresponding to a natural sample, the difference between the largest eigenvalue and the second largest eigenvalue in the initial feature vector is recorded as LD = z_y - z_s, and LD < ε can be used as the constraint condition (ε being the virtual confrontation training threshold). That is, if the initial feature vector satisfies LD < ε, the initial feature vector is used as a seed feature vector; if not, the initial feature vector is used as a non-seed feature vector.
The above embodiment involves the selection of the virtual confrontation training threshold: different thresholds lead to different training durations, and deep neural network models trained with different thresholds have different prediction effects on natural samples. Generally, the smaller the virtual confrontation training threshold, the shorter the training time required by the virtual confrontation training and the better the prediction effect of the deep neural network model on natural samples. A smaller virtual confrontation training threshold can therefore be selected empirically, but it cannot be too small, since enough virtual confrontation samples must still participate in training.
In one possible embodiment, in addition to empirically selecting the virtual confrontation training threshold, the virtual confrontation training threshold may be determined in the following manner, although the determination is not limited thereto.
A classification dataset, such as a reference dataset like CIFAR-10 or ImageNet, may be obtained first, and a disturbance sample corresponding to each natural sample in the classification dataset is generated.
Whether each disturbance sample is classified correctly is then judged: the natural samples corresponding to incorrectly classified disturbance samples are taken as seed samples, and the natural samples corresponding to correctly classified disturbance samples are taken as non-seed samples. For example, for each disturbance sample, the disturbance sample is input to the deep neural network model to obtain a classification result of the disturbance sample; if the classification result does not match the label value of the natural sample, the natural sample is used as a seed sample; and if the classification result matches the label value of the natural sample, the natural sample is used as a non-seed sample.
All the seed samples are input into the first sub-network of the deep neural network model to obtain an initial seed vector corresponding to each seed sample; the maximum characteristic value and the second largest characteristic value are found among all characteristic values of each initial seed vector, and the difference value between them is calculated, i.e., each initial seed vector corresponds to one difference value; a first average value of the difference values corresponding to all the initial seed vectors is then determined. Similarly, all the non-seed samples are input into the first sub-network of the deep neural network model to obtain an initial non-seed vector corresponding to each non-seed sample; the maximum characteristic value and the second largest characteristic value are found among all characteristic values of each initial non-seed vector, and the difference value between them is calculated, i.e., each initial non-seed vector corresponds to one difference value; a second average value of the difference values corresponding to all the initial non-seed vectors is then determined.
And if the second average value is larger than the first average value and the difference value between the second average value and the first average value is larger than a preset threshold value, determining the first average value as a virtual confrontation training threshold value. If the second average value is not greater than the first average value, or the second average value is greater than the first average value, but the difference between the second average value and the first average value is not greater than a preset threshold, the first average value is not determined as the virtual confrontation training threshold.
Obviously, the first average value represents the average of the LDs of the initial seed vectors corresponding to all seed samples, and the virtual confrontation training does not require that all seed samples be able to generate virtual confrontation samples. By the central limit theorem, when there are enough natural samples, the distribution of the LDs approximately follows a normal distribution, so when LD smaller than the first average value (i.e., the virtual confrontation training threshold) is used as the condition for selecting seed samples, most of the seed samples can be selected; that is, the virtual confrontation training threshold determined in the above manner is relatively reliable.
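The threshold-determination procedure above can be sketched as follows; `top_two_gap`, `pick_threshold`, and the toy vectors are illustrative assumptions, not the patent's implementation:

```python
def top_two_gap(z):
    """LD of a feature vector: largest value minus second largest value."""
    z_y, z_s = sorted(z, reverse=True)[:2]
    return z_y - z_s

def pick_threshold(seed_vectors, non_seed_vectors, min_gap):
    """Return the first average (mean seed LD) as the virtual confrontation
    training threshold, or None when the averages fail the acceptance check."""
    first = sum(top_two_gap(z) for z in seed_vectors) / len(seed_vectors)
    second = sum(top_two_gap(z) for z in non_seed_vectors) / len(non_seed_vectors)
    if second > first and second - first > min_gap:
        return first               # accepted as the training threshold
    return None                    # averages too close: threshold not determined
```

The check requires the non-seed LDs to sit clearly above the seed LDs, so that LD < first-average separates the two populations.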
According to the above technical scheme, the anti-interference capability of the deep neural network model against attack samples can be improved, and the reliability of the deep neural network model is improved. The method avoids repeatedly calculating the gradient of the loss with respect to the sample, as would be required to generate real confrontation samples, which greatly improves the speed of confrontation training and shortens training time. The virtual confrontation training enhances the reliability and defense effect of the deep neural network model while having little influence on natural sample classification: the deep neural network model retains high classification precision on natural samples, the defense effect of the confrontation training is good, the training time required by the confrontation training is short, and the prediction accuracy on correctly classified natural samples remains good. Virtual confrontation training does not seriously degrade the prediction of natural samples because it converts natural samples satisfying LD < ε into correctly classified samples and virtual confrontation samples with a probability of 50%; these samples are located on the decision boundary, so when the deep neural network model updates its parameters it is equivalent to fine-tuning the decision boundary, and the virtual confrontation training makes the change amplitude of the decision boundary smaller, i.e., the classification effect on natural samples is less affected.
Based on the same application concept as the above method, in the embodiment of the present application, a virtual confrontation training apparatus for a deep neural network is provided, as shown in fig. 4, which is a schematic structural diagram of the apparatus, and the apparatus may include: an obtaining module 41, configured to input a plurality of natural samples into a first sub-network of a deep neural network model, to obtain a plurality of initial feature vectors corresponding to the plurality of natural samples; wherein the deep neural network model may include a first sub-network and a second sub-network, the second sub-network including a last network layer in the deep neural network model, the first sub-network including the remaining network layers except the last network layer; a selecting module 42, configured to select a seed feature vector and a non-seed feature vector from the multiple initial feature vectors; a generating module 43, configured to generate, for each seed feature vector, a virtual countermeasure feature vector corresponding to the seed feature vector based on the seed feature vector and a perturbation vector corresponding to the seed feature vector; the obtaining module 41 is further configured to input all virtual confrontation feature vectors and all non-seed feature vectors to a second sub-network of the deep neural network model, so as to obtain a plurality of target feature vectors corresponding to the plurality of natural samples; an updating module 44, configured to update parameters of the deep neural network model based on the plurality of target feature vectors.
In one possible implementation, for each initial feature vector of the plurality of initial feature vectors, the initial feature vector comprises a plurality of feature values; the selecting module 42 is specifically configured to, when selecting the seed feature vector and the non-seed feature vector from the plurality of initial feature vectors:
if the difference value between the maximum characteristic value and the second largest characteristic value in the initial characteristic vector is smaller than the virtual confrontation training threshold value, selecting the initial characteristic vector as a seed characteristic vector;
and if the difference value between the maximum characteristic value and the second largest characteristic value in the initial characteristic vector is not less than the virtual confrontation training threshold value, selecting the initial characteristic vector as a non-seed characteristic vector.
In a possible implementation manner, the seed eigenvector includes C eigenvalues, and the perturbation vector corresponding to the seed eigenvector includes C perturbation values, where the C perturbation values are in one-to-one correspondence with the C eigenvalues; the generating module 43 is further configured to determine the perturbation vector corresponding to the seed feature vector by: determining a perturbation value corresponding to the maximum eigenvalue in the perturbation vector based on the maximum eigenvalue and the second largest eigenvalue in the seed eigenvector; determining a perturbation value corresponding to the second largest eigenvalue in the perturbation vector based on the maximum eigenvalue and the second largest eigenvalue in the seed eigenvector; and determining, for the remaining eigenvalues in the seed eigenvector other than the maximum eigenvalue and the second largest eigenvalue, that the corresponding perturbation values in the perturbation vector are 0.
In a possible implementation manner, the perturbation value corresponding to the maximum eigenvalue and the perturbation value corresponding to the second maximum eigenvalue are opposite numbers, and the perturbation value corresponding to the maximum eigenvalue is a negative number;
the generating module 43 is specifically configured to, based on the maximum eigenvalue and the second maximum eigenvalue in the seed eigenvector, determine a perturbation value corresponding to the maximum eigenvalue in the perturbation vector:
determining a disturbance value corresponding to the maximum eigenvalue based on the following formula: r_y = (z_s - z_y) / 2.
The generating module 43 is specifically configured to, when determining the perturbation value corresponding to the second largest eigenvalue in the perturbation vector based on the maximum eigenvalue and the second largest eigenvalue in the seed eigenvector:
determine a disturbance value corresponding to the second largest eigenvalue based on the following formula: r_s = (z_y - z_s) / 2;
wherein z_y represents the maximum eigenvalue, and z_s represents the second largest eigenvalue.
In a possible implementation, the virtual confrontation eigenvector corresponding to the seed eigenvector includes C virtual values, and the C virtual values are in one-to-one correspondence with the C eigenvalues;
the generating module 43 is specifically configured to, based on the seed feature vector and the perturbation vector corresponding to the seed feature vector, generate a virtual countermeasure feature vector corresponding to the seed feature vector:
for each virtual value in the virtual confrontation feature vector, determining the virtual value based on the sum of the feature value corresponding to the virtual value and the disturbance value corresponding to the feature value.
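The sum described here is an elementwise addition of the seed vector and its perturbation vector; `virtual_vector` is a hypothetical helper name for illustration, not from the patent:

```python
def virtual_vector(z, r):
    """Each virtual value is the feature value plus its matching perturbation value."""
    assert len(z) == len(r)          # C perturbation values, one per eigenvalue
    return [z_j + r_j for z_j, r_j in zip(z, r)]
```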
In a possible embodiment, the selecting module 42 is further configured to determine the virtual confrontation training threshold by: generating a disturbance sample corresponding to each natural sample in a classification data set, and inputting the disturbance sample to a deep neural network model to obtain a classification result of the disturbance sample; if the classification result is not matched with the label value of the natural sample, determining the natural sample as a seed sample; inputting all the seed samples into a first sub-network of a deep neural network model to obtain an initial seed vector corresponding to each seed sample, wherein the initial seed vector comprises a plurality of characteristic values;
for each initial seed vector, determining the difference value between the maximum characteristic value and the second largest characteristic value in the initial seed vector, and determining a first average value of the difference values corresponding to all the initial seed vectors;
determining the virtual confrontation training threshold based on the first average.
Based on the same application concept as the method, the embodiment of the present application proposes a virtual confrontation training device (i.e. an electronic device) of a deep neural network, and as shown in fig. 5, the device includes: a processor 51 and a machine-readable storage medium 52, the machine-readable storage medium 52 storing machine-executable instructions executable by the processor 51; the processor 51 is operable to execute machine executable instructions to perform the steps of:
inputting a plurality of natural samples into a first sub-network of a deep neural network model to obtain a plurality of initial feature vectors corresponding to the natural samples; wherein the deep neural network model comprises a first sub-network and a second sub-network, the second sub-network comprising a last network layer in the deep neural network model, the first sub-network comprising the remaining network layers except the last network layer;
selecting a seed feature vector and a non-seed feature vector from the plurality of initial feature vectors;
for each seed feature vector, generating a virtual countermeasure feature vector corresponding to the seed feature vector based on the seed feature vector and a perturbation vector corresponding to the seed feature vector;
inputting all the virtual confrontation feature vectors and all the non-seed feature vectors into a second sub-network of the deep neural network model to obtain a plurality of target feature vectors corresponding to the plurality of natural samples;
updating parameters of the deep neural network model based on the plurality of target feature vectors.
Based on the same application concept as the method, embodiments of the present application further provide a machine-readable storage medium, on which a plurality of computer instructions are stored, and when the computer instructions are executed by a processor, the method for training virtual confrontation of a deep neural network disclosed in the above example of the present application can be implemented.
The machine-readable storage medium may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions, data, and the like. For example, the machine-readable storage medium may be: a RAM (random Access Memory), a volatile Memory, a non-volatile Memory, a flash Memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk, a dvd, etc.), or similar storage medium, or a combination thereof.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A virtual confrontation training method for a deep neural network, the method comprising:
acquiring a training data set, wherein the training data set comprises a plurality of image data;
inputting the plurality of image data into a first sub-network of a deep neural network model to obtain a plurality of initial feature vectors corresponding to the plurality of image data; wherein the deep neural network model comprises a first sub-network and a second sub-network, the second sub-network comprising a last network layer in the deep neural network model, the first sub-network comprising the remaining network layers except the last network layer;
selecting a seed feature vector and a non-seed feature vector from the plurality of initial feature vectors;
for each seed feature vector, generating a virtual countermeasure feature vector corresponding to the seed feature vector based on the seed feature vector and a perturbation vector corresponding to the seed feature vector;
inputting all the virtual confrontation feature vectors and all the non-seed feature vectors into a second sub-network of the deep neural network model to obtain a plurality of target feature vectors corresponding to the plurality of image data;
and updating parameters of the deep neural network model based on the target feature vectors so as to detect input data through the updated deep neural network model.
2. The method of claim 1, wherein for each initial feature vector of the plurality of initial feature vectors, the initial feature vector comprises a plurality of feature values; the selecting a seed feature vector and a non-seed feature vector from the plurality of initial feature vectors comprises:
if the difference value between the maximum characteristic value and the second largest characteristic value in the initial characteristic vector is smaller than the virtual confrontation training threshold value, selecting the initial characteristic vector as a seed characteristic vector;
and if the difference value between the maximum characteristic value and the second largest characteristic value in the initial characteristic vector is not less than the virtual confrontation training threshold value, selecting the initial characteristic vector as a non-seed characteristic vector.
3. The method according to claim 1 or 2,
the seed characteristic vector comprises C characteristic values, the disturbance vector corresponding to the seed characteristic vector comprises C disturbance values, and the C disturbance values are in one-to-one correspondence with the C characteristic values;
the determining method of the disturbance vector corresponding to the seed feature vector comprises the following steps:
determining a perturbation value corresponding to the maximum eigenvalue in the perturbation vector based on the maximum eigenvalue and the second largest eigenvalue in the seed eigenvector;
determining a perturbation value corresponding to the second largest eigenvalue in the perturbation vector based on the maximum eigenvalue and the second largest eigenvalue in the seed eigenvector;
and for the remaining eigenvalues in the seed eigenvector other than the maximum eigenvalue and the second largest eigenvalue, determining the corresponding perturbation values in the perturbation vector to be 0.
4. The method of claim 3,
the disturbance value corresponding to the maximum characteristic value and the disturbance value corresponding to the second largest characteristic value are opposite numbers, and the disturbance value corresponding to the maximum characteristic value is a negative number;
the determining a perturbation value corresponding to the maximum eigenvalue in the perturbation vector based on the maximum eigenvalue and a second largest eigenvalue in the seed eigenvector comprises:
determining a disturbance value corresponding to the maximum eigenvalue based on the following formula: r_y = (z_s - z_y) / 2;
the determining, based on the maximum eigenvalue and the second largest eigenvalue in the seed eigenvector, a perturbation value in the perturbation vector corresponding to the second largest eigenvalue includes:
determining a disturbance value corresponding to the second largest eigenvalue based on the following formula: r_s = (z_y - z_s) / 2;
wherein z_y represents the maximum eigenvalue, and z_s represents the second largest eigenvalue.
5. The method of claim 3, wherein the virtual confrontation eigenvector corresponding to the seed eigenvector includes C virtual values, and the C virtual values are in one-to-one correspondence with the C eigenvalues;
the generating a virtual countermeasure feature vector corresponding to the seed feature vector based on the seed feature vector and a perturbation vector corresponding to the seed feature vector includes:
for each virtual value in the virtual confrontation feature vector, determining the virtual value based on the sum of the feature value corresponding to the virtual value and the disturbance value corresponding to the feature value.
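The construction in claims 3-5 can be sketched in numpy. The patent's magnitude formulas are not reproduced in this text, so half the gap between the two largest feature values is used here as an illustrative assumption; it satisfies claim 4's constraints (the two nonzero perturbation values are opposite numbers, and the one at the maximum feature value is negative):

```python
import numpy as np

def perturbation_vector(seed):
    """Claims 3-4: only the positions of the largest and second-largest
    feature values carry nonzero perturbation values; they are opposite
    numbers and the one at the maximum is negative. The magnitude
    (z_max - z_second) / 2 is an illustrative assumption, not the
    patent's formula."""
    seed = np.asarray(seed, dtype=float)
    order = np.argsort(seed)                  # ascending order of feature values
    i_max, i_second = order[-1], order[-2]
    gap = seed[i_max] - seed[i_second]
    r = np.zeros_like(seed)
    r[i_max] = -gap / 2.0                     # negative perturbation for the maximum
    r[i_second] = gap / 2.0                   # opposite number for the second largest
    return r

def virtual_adversarial_vector(seed):
    """Claim 5: each virtual value is the sum of the corresponding
    feature value and its perturbation value."""
    seed = np.asarray(seed, dtype=float)
    return seed + perturbation_vector(seed)

v = virtual_adversarial_vector([3.0, 0.5, 2.0])
print(v)  # the two largest feature values meet at their midpoint: [2.5 0.5 2.5]
```

With this choice of magnitude, the two largest entries of the seed feature vector become equal, which is what pushes the classifier toward the decision boundary.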
6. The method of claim 2, wherein the virtual confrontation training threshold is determined as follows:
for each image data in a classification data set, generating a perturbation sample corresponding to the image data, and inputting the perturbation sample to the deep neural network model to obtain a classification result of the perturbation sample; if the classification result does not match the label value of the image data, determining the image data as a seed sample;
inputting all the seed samples into the first sub-network of the deep neural network model to obtain an initial seed vector corresponding to each seed sample, wherein the initial seed vector comprises a plurality of feature values;
for each initial seed vector, determining a difference value between the maximum feature value and the second largest feature value in the initial seed vector, and determining a first average value of the difference values corresponding to all the initial seed vectors;
and determining the virtual confrontation training threshold based on the first average value.
7. The method of claim 6, wherein after inputting the perturbation sample to the deep neural network model and obtaining the classification result of the perturbation sample, the method further comprises: if the classification result matches the label value of the image data, determining the image data as a non-seed sample; and inputting all the non-seed samples into the first sub-network of the deep neural network model to obtain an initial non-seed vector corresponding to each non-seed sample, wherein the initial non-seed vector comprises a plurality of feature values;
the determining the virtual confrontation training threshold based on the first average value comprises:
for each initial non-seed vector, determining a difference value between the maximum feature value and the second largest feature value in the initial non-seed vector, and determining a second average value of the difference values corresponding to all the initial non-seed vectors;
and if the second average value is larger than the first average value and the difference between the second average value and the first average value is larger than a preset threshold, determining the first average value as the virtual confrontation training threshold.
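The threshold procedure of claims 6-7 can be sketched as follows. The preset threshold (`margin` here) and the fallback when the acceptance condition fails are not fixed by the claims, so both are assumptions in this sketch:

```python
import numpy as np

def top_two_gap(vec):
    """Difference between the maximum and second largest feature values."""
    v = np.sort(np.asarray(vec, dtype=float))
    return v[-1] - v[-2]

def virtual_adversarial_threshold(seed_vectors, non_seed_vectors, margin=0.1):
    """Claims 6-7: the first average is taken over initial seed vectors
    (samples fooled by the perturbation), the second average over initial
    non-seed vectors. The first average becomes the threshold only if the
    second average exceeds it by more than a preset margin; returning
    None otherwise is an assumption, as the claims state no fallback."""
    first_avg = float(np.mean([top_two_gap(v) for v in seed_vectors]))
    second_avg = float(np.mean([top_two_gap(v) for v in non_seed_vectors]))
    if second_avg > first_avg and second_avg - first_avg > margin:
        return first_avg
    return None

seeds = [[2.0, 1.8, 0.1], [1.5, 1.4, 0.2]]      # fooled samples: small top-two gaps
non_seeds = [[5.0, 1.0, 0.1], [4.0, 0.5, 0.2]]  # robust samples: large top-two gaps
print(virtual_adversarial_threshold(seeds, non_seeds))  # first average, ~0.15
```

Intuitively, a small top-two gap marks a feature vector that sits close to the decision boundary, which is why the average gap over fooled samples serves as the seed-selection threshold.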
8. The method of claim 1, wherein the updating the parameters of the deep neural network model based on the plurality of target feature vectors comprises:
determining a loss value of a target loss function based on the plurality of target feature vectors;
and updating the parameters of the deep neural network model based on the loss values.
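Claim 8 leaves the target loss function unspecified; a minimal sketch using softmax cross-entropy over a target feature vector (logits of the second sub-network) and a plain gradient step, both illustrative choices:

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    """Loss value for one target feature vector; cross-entropy is an
    illustrative stand-in for the patent's unspecified target loss."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()                              # numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return float(-log_probs[label])

def sgd_step(params, grads, lr=0.01):
    """Update the parameters of the model based on the loss gradients."""
    return [p - lr * g for p, g in zip(params, grads)]

loss = softmax_cross_entropy([2.0, 1.0, 0.1], label=0)
print(loss)  # ~0.417
```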
9. A virtual confrontation training device for a deep neural network, the device comprising:
an acquisition module for acquiring a training data set, the training data set comprising a plurality of image data; inputting the plurality of image data into a first sub-network of a deep neural network model to obtain a plurality of initial feature vectors corresponding to the plurality of image data; wherein the deep neural network model comprises a first sub-network and a second sub-network, the second sub-network comprising a last network layer in the deep neural network model, the first sub-network comprising the remaining network layers except the last network layer;
a selecting module for selecting seed feature vectors and non-seed feature vectors from the plurality of initial feature vectors;
a generating module for generating, for each seed feature vector, a virtual confrontation feature vector corresponding to the seed feature vector based on the seed feature vector and a perturbation vector corresponding to the seed feature vector;
the obtaining module is further configured to input all virtual confrontation feature vectors and all non-seed feature vectors to a second sub-network of the deep neural network model to obtain a plurality of target feature vectors corresponding to the plurality of image data;
and an updating module for updating the parameters of the deep neural network model based on the plurality of target feature vectors, so as to detect input data through the updated deep neural network model.
10. A virtual confrontation training device of a deep neural network, comprising: a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; the processor is configured to execute machine executable instructions to perform the steps of:
acquiring a training data set, wherein the training data set comprises a plurality of image data;
inputting the plurality of image data into a first sub-network of a deep neural network model to obtain a plurality of initial feature vectors corresponding to the plurality of image data; wherein the deep neural network model comprises a first sub-network and a second sub-network, the second sub-network comprising a last network layer in the deep neural network model, the first sub-network comprising the remaining network layers except the last network layer;
selecting a seed feature vector and a non-seed feature vector from the plurality of initial feature vectors;
for each seed feature vector, generating a virtual confrontation feature vector corresponding to the seed feature vector based on the seed feature vector and a perturbation vector corresponding to the seed feature vector;
inputting all the virtual confrontation feature vectors and all the non-seed feature vectors into a second sub-network of the deep neural network model to obtain a plurality of target feature vectors corresponding to the plurality of image data;
and updating parameters of the deep neural network model based on the target feature vectors so as to detect input data through the updated deep neural network model.
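The sub-network split used throughout the claims (second sub-network = last layer, first sub-network = all remaining layers) can be sketched with a toy numpy network; the layer shapes are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy deep neural network as a list of (weight, activation) layers.
# Shapes are hypothetical; the claims only require that the second
# sub-network is the last layer and the first sub-network is the rest.
layers = [
    (rng.standard_normal((4, 8)), np.tanh),      # hidden layer
    (rng.standard_normal((8, 8)), np.tanh),      # hidden layer
    (rng.standard_normal((8, 3)), lambda x: x),  # last layer (logits)
]

def run(sub_layers, x):
    """Apply a list of layers in sequence."""
    for w, act in sub_layers:
        x = act(x @ w)
    return x

first_subnet, second_subnet = layers[:-1], layers[-1:]

x = rng.standard_normal((2, 4))                  # two image-like inputs
initial_features = run(first_subnet, x)          # initial feature vectors
targets = run(second_subnet, initial_features)   # target feature vectors

# The split is exact: composing the two sub-networks matches the full network.
assert np.allclose(targets, run(layers, x))
print(initial_features.shape, targets.shape)
```

Perturbations are applied to `initial_features` (the output of the first sub-network) before the last layer, which is what distinguishes this feature-space scheme from input-space adversarial training.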
CN202110352167.4A 2021-03-31 2021-03-31 Virtual confrontation training method, device and equipment for deep neural network Active CN112734039B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110352167.4A CN112734039B (en) 2021-03-31 2021-03-31 Virtual confrontation training method, device and equipment for deep neural network


Publications (2)

Publication Number Publication Date
CN112734039A CN112734039A (en) 2021-04-30
CN112734039B true CN112734039B (en) 2021-07-23

Family

ID=75596224

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110352167.4A Active CN112734039B (en) 2021-03-31 2021-03-31 Virtual confrontation training method, device and equipment for deep neural network

Country Status (1)

Country Link
CN (1) CN112734039B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019106878A1 (en) * 2017-11-28 2019-06-06 桂太 杉原 Information processing system, information processing method, and computer program
CN110348475B (en) * 2019-05-29 2023-04-18 广东技术师范大学 Confrontation sample enhancement method and model based on spatial transformation
CN112529179A (en) * 2020-12-10 2021-03-19 鹏城实验室 Genetic algorithm-based confrontation training method and device and computer storage medium


Similar Documents

Publication Publication Date Title
US20230222353A1 (en) Method and system for training a neural network model using adversarial learning and knowledge distillation
CN111523422B (en) Key point detection model training method, key point detection method and device
US20180018555A1 (en) System and method for building artificial neural network architectures
CN111126134B (en) Radar radiation source deep learning identification method based on non-fingerprint signal eliminator
WO2016026063A1 (en) A method and a system for facial landmark detection based on multi-task
US20210224647A1 (en) Model training apparatus and method
Yue et al. Effective, efficient and robust neural architecture search
WO2022217853A1 (en) Methods, devices and media for improving knowledge distillation using intermediate representations
Boo et al. Stochastic precision ensemble: self-knowledge distillation for quantized deep neural networks
Jie et al. Anytime recognition with routing convolutional networks
CN112446888A (en) Processing method and processing device for image segmentation model
CN114091597A (en) Countermeasure training method, device and equipment based on adaptive group sample disturbance constraint
CN114091594A (en) Model training method and device, equipment and storage medium
Jang et al. Adaptive weapon-to-target assignment model based on the real-time prediction of hit probability
JP7331937B2 (en) ROBUST LEARNING DEVICE, ROBUST LEARNING METHOD, PROGRAM AND STORAGE DEVICE
Anumasa et al. Improving robustness and uncertainty modelling in neural ordinary differential equations
Hurtado et al. Overcoming catastrophic forgetting using sparse coding and meta learning
Putra et al. Multilevel neural network for reducing expected inference time
CN112734039B (en) Virtual confrontation training method, device and equipment for deep neural network
CN114254686A (en) Method and device for identifying confrontation sample
CN116644798A (en) Knowledge distillation method, device, equipment and storage medium based on multiple teachers
CN114049539B (en) Collaborative target identification method, system and device based on decorrelation binary network
CN111788582A (en) Electronic device and control method thereof
CN115761343A (en) Image classification method and image classification device based on continuous learning
CN114970732A (en) Posterior calibration method and device for classification model, computer equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant