CN109948658B - Feature-map attention mechanism-oriented adversarial attack defense method and application - Google Patents
- Publication number: CN109948658B (application number CN201910138087.1A)
- Authority: CN (China)
- Legal status: Active
Landscapes
- Image Analysis (AREA)
Abstract
The invention discloses a feature-map attention mechanism-oriented adversarial attack defense method, which comprises the following steps: (1) extracting the contour features of the target with an attention mechanism, adding a small perturbation designed from the contour features to obtain an adversarial sample, and optimizing the perturbation variable by momentum iteration to update the adversarial sample, thereby realizing an adversarial attack on the deep model; (2) performing adversarial training on the deep model with the adversarial samples under a multi-strength adversarial training strategy, thereby realizing defense of the deep model against adversarial attacks. The method improves the robustness and generalization ability of the classifier against adversarial sample attacks, making the classifier more reliable and stable and improving the security of the deep learning model in practical application. The application of the feature-map attention mechanism-oriented adversarial attack defense method to image classification is also disclosed.
Description
Technical Field
The invention belongs to the field of security research on deep learning algorithms for the image classification task in artificial intelligence, and in particular relates to a feature-map attention mechanism-oriented adversarial attack defense method and the application of that method to image classification.
Background
In recent years, deep learning has been widely applied across industries by virtue of its strong feature learning ability and has achieved good results in, for example, computer vision, bioinformatics, complex networks, and natural language processing. However, with this wide application, its weaknesses have gradually been revealed; one of the main weaknesses is that deep learning models are vulnerable to adversarial example attacks. For example, a normal picture taken under natural conditions may be classified with the correct label at high confidence, but once a carefully designed small perturbation is added to produce an adversarial image, that image is misclassified by the deep learning model. Worse still, the human visual system cannot distinguish these carefully designed adversarial examples, because the added perturbations are very small.
As research has progressed, adversarial attack patterns against deep models have gradually been systematized. According to the attacker's degree of knowledge of the deep model, attacks can be divided into black-box, white-box, and gray-box attacks: a black-box attack mounts the adversarial attack without knowing any parameters or structure of the model; a white-box attack knows all attributes of the model; and a gray-box attack knows only part of the model's parameters and structure. According to the misclassification result achieved by the adversarial example, attacks can be divided into non-targeted attacks and targeted attacks: a non-targeted adversarial example only needs to be misclassified, while a targeted attack must not only cause misclassification but also force the sample into a target class preset by the attacker. Generally, the optimization objective function differs according to whether the attack is non-targeted or targeted. Furthermore, these attack methods exist not only in the digital space but also in the physical world. An attacker wearing carefully designed glasses can impersonate another person and deceive a face recognition system; an attacker can also attach small stickers to license plates or road signs, causing false identifications that fool a license plate recognition system or the road sign recognition system of an autonomous vehicle. It can be seen that adversarial attacks seriously damage the performance of deep learning models, threatening the security of systems built on them and even the safety of people's lives and property. Therefore, it is necessary to study the vulnerabilities of deep models and to defend against them.
Meanwhile, research on defense methods against adversarial attacks on deep models has gradually drawn attention. Current defenses fall mainly into three categories: modifying the input data, for example adding random noise to the input image to be recognized or flipping and scaling it, so as to destroy the added adversarial perturbation; modifying the model's network structure, such as changing the convolution kernel size or pooling range, increasing the number of network layers, or modifying the activation function; and attaching an external network to the model, for example adding an external model at test time to detect or recover adversarial examples. Although most defense methods have some effect against adversarial attacks, their transferability is limited, and they cannot defend well against novel adversarial attacks.
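As an illustration of the first category (input-transformation defenses), the following is a minimal NumPy sketch; the particular transformations and all parameter values are illustrative assumptions, not the invention's method:

```python
import numpy as np

def randomized_input_defense(img, noise_std=0.02, max_pad=4, rng=None):
    """Input-transformation defense sketch: random noise, flip, and padding.

    Random transformations disrupt the precise pixel alignment that an
    adversarial perturbation relies on. Parameters are illustrative.
    """
    rng = rng or np.random.default_rng()
    out = img + rng.normal(0.0, noise_std, img.shape)     # random noise
    if rng.random() < 0.5:
        out = out[:, ::-1, :]                             # random horizontal flip
    pad = int(rng.integers(0, max_pad + 1))
    out = np.pad(out, ((pad, pad), (pad, pad), (0, 0)))   # random zero padding
    return np.clip(out, 0.0, 1.0)

defended = randomized_input_defense(np.random.rand(32, 32, 3))
```

The defended image is then passed to the classifier in place of the raw input.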
Meanwhile, the latest research shows that modifying the model's training data set, i.e., adding adversarial examples to the training data to perform adversarial training, is currently one of the more effective defenses. However, the defensive effect of adversarial training depends on the quality of the generated adversarial examples, and the transferability of adversarial examples generated by current attack methods is weak, so a good adversarial training effect is difficult to achieve.
Disclosure of Invention
The invention aims to provide a feature-map attention mechanism-oriented adversarial attack defense method, which focuses on the contour features of the target in an image through the feature-map attention mechanism and adds the perturbation onto those focused contour features, so that adversarial examples for attacking the deep model are easy to generate, and which then trains the deep model with the adversarial and normal samples to improve the robustness of the classification model against adversarial attacks.
The invention also aims to provide an application of the feature-map attention mechanism-oriented adversarial attack defense method to image classification: with this method an attack-resistant image classification model can be obtained, which can greatly improve the accuracy of image classification.
To achieve the above objects, the invention provides the following technical solutions:
A feature-map attention mechanism-oriented adversarial attack defense method comprises the following steps:
(1) extracting the contour features of the target contour in an image with an attention mechanism, designing a small perturbation based on the extracted contour features and adding it to the original normal sample to obtain an adversarial sample, and optimizing the perturbation variable by momentum iteration to update the adversarial sample, thereby realizing an adversarial attack on the deep model;
(2) performing adversarial training on the deep model, based on a multi-strength adversarial training strategy, with the data set obtained by mixing the adversarial samples and the normal samples, thereby realizing defense of the deep model against adversarial attacks.
The method uses the feature map of the spatial attention mechanism to locate the spatial key information of the target contour that is decisive for correct classification, obtains the positions where the adversarial perturbation should be added through the gradient of the output loss function value, and optimizes each perturbation value with a momentum iteration method, so as to generate high-quality adversarial examples that attack effectively. Multi-strength adversarial training is then performed on the deep model to realize robust and transferable defense against adversarial attacks.
Extracting the contour features of the target contour in the image with an attention mechanism, designing a small perturbation based on the extracted contour features, and adding it to the original normal sample to obtain an adversarial sample comprises:
a reconstructed feature extraction step: extracting a shallow feature image of the input original image as the feature image, based on the shallow network features of the deep model, and up-sampling the feature image to obtain a reconstructed feature image;
a channel spatial attention weight calculation step: calculating a channel spatial attention weight matrix from the original image and the reconstructed feature image;
a pixel spatial attention weight calculation step: calculating a pixel spatial attention weight matrix from the reconstructed channel spatial attention weight matrix and the original image;
an adversarial sample generation step: calculating the perturbation to be added from the pixel spatial attention weight matrix, and adding the perturbation to the original image to obtain the adversarial sample.
Attention mechanisms can be divided into soft and hard attention: hard attention is a stochastic weight assignment process based on a Bernoulli distribution, while soft attention is an embeddable weighting method parameterized by a neural network, which can exploit global information in the deep model and achieve better results with end-to-end training. Therefore, the invention uses a soft attention mechanism for the adversarial perturbation computation.
In a deep model classifier, deep features have a larger receptive field than shallow features, but much of the spatial information of the deep feature map is lost. Therefore, the shallow feature output of the deep neural network is reconstructed by bilinear interpolation to the same H and W as the input sample, where H is the number of pixels in the vertical direction of the image and W the number in the horizontal direction. The attention mechanisms used to search for the perturbation distribution include channel spatial attention, which focuses on the channel feature distribution by weighting the feature maps of different channels, and pixel spatial attention, which focuses on the pixel feature distribution by weighting different pixel regions.
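The bilinear reconstruction of the shallow feature map can be sketched in NumPy as follows; the align-corners sampling convention and the concrete sizes are illustrative assumptions:

```python
import numpy as np

def bilinear_upsample(feat, H, W):
    """Bilinearly resize a [h, w, c] feature map to [H, W, c]."""
    h, w, c = feat.shape
    # Sample positions in the source grid (align-corners convention)
    ys = np.linspace(0, h - 1, H)
    xs = np.linspace(0, w - 1, W)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]; wx = (xs - x0)[None, :, None]
    # Interpolate along x on the two bracketing rows, then along y
    top = feat[y0][:, x0] * (1 - wx) + feat[y0][:, x1] * wx
    bot = feat[y1][:, x0] * (1 - wx) + feat[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

# Reconstruct an 8x8x16 shallow feature map to a 32x32 input resolution
f_m = bilinear_upsample(np.random.rand(8, 8, 16), 32, 32)
```

In a real pipeline the framework's own resize operator (e.g. a bilinear interpolation layer) would be used instead.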
Specifically, in the channel spatial attention weight calculation step:
the original image x of size [H, W, 3] is converted by a reshape operation into a matrix x_re of size [3, l], where H is the number of pixels in the vertical direction of the image, W the number in the horizontal direction, 3 the number of RGB channels, and l = H × W;
the up-sampled shallow hidden layer output, i.e. the reconstructed feature image f_m of size [H, W, c], is converted by a reshape operation into a matrix f_mm of size [c, l];
the channel spatial attention weight matrix W_c of size [3, c] is then obtained by the formula W_c = softmax(x_re ⊗ f_mm^T), where softmax(·) is the activation function and ⊗ denotes matrix multiplication.
In the pixel spatial attention weight calculation step:
the reconstructed channel spatial attention weight W'_c of size [3, l] is calculated by the formula W'_c = W_c ⊗ f_mm, where ⊗ denotes matrix multiplication;
the pixel spatial attention weight W_p of size [1, l] is calculated by the formula W_p = softmax(Σ(x_re ⊙ W'_c)), where ⊙ denotes multiplication of corresponding matrix elements, Σ sums the [3, l] product over the channel dimension to size [1, l], and softmax(·) is the activation function.
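The two attention computations can be sketched in NumPy; the softmax axes and the channel-direction sum are inferred from the stated matrix sizes, and all concrete dimensions are illustrative assumptions:

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

H, W, c = 32, 32, 16
l = H * W
# Reshape image [H, W, 3] -> [3, l] and feature map [H, W, c] -> [c, l]
x_re = np.random.rand(H, W, 3).transpose(2, 0, 1).reshape(3, l)
f_mm = np.random.rand(H, W, c).transpose(2, 0, 1).reshape(c, l)

# Channel spatial attention: [3, l] x [l, c] -> [3, c]
W_c = softmax(x_re @ f_mm.T)
# Reconstructed channel attention: [3, c] x [c, l] -> [3, l]
W_c_rec = W_c @ f_mm
# Pixel spatial attention: element-wise product, summed over channels -> [1, l]
W_p = softmax((x_re * W_c_rec).sum(axis=0, keepdims=True))
```

`W_p` is subsequently reshaped to [H, W, 1] to serve as the attention mapping weight.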
In the adversarial sample generation step:
the pixel spatial attention weight W_p of size [1, l] is reshaped by a reshape function operation into the attention mapping weight W_map of size [H, W, 1];
the perturbation ρ to be added is calculated channel by channel by the formula ρ_i = W_map ⊙ (∇_{x_i} J(x, y) / ‖∇_{x_i} J(x, y)‖_1),
where ⊙ denotes multiplication of corresponding elements of the two matrices; y is the correct class label of the original image x; ‖·‖_1 is the 1-norm of the calculated gradient, i.e., the sum of the absolute values of the vector elements; and x_i is the pixel matrix of the i-th channel;
finally the adversarial sample x* is obtained by the formula x* = x ⊕ ρ, where ⊕ denotes addition of corresponding matrix elements.
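A minimal NumPy sketch of the attention-weighted perturbation and sample update; the per-channel 1-norm normalization follows the description above, while names and sizes are illustrative:

```python
import numpy as np

def finefool_perturbation(grad, W_map):
    """Attention-weighted, per-channel L1-normalised perturbation (sketch).

    grad:  [H, W, 3] gradient of the loss w.r.t. the input image
    W_map: [H, W, 1] attention mapping weights
    """
    rho = np.empty_like(grad)
    for i in range(grad.shape[2]):            # per channel, as in the text
        g = grad[:, :, i]
        rho[:, :, i] = W_map[:, :, 0] * g / (np.abs(g).sum() + 1e-12)
    return rho

x = np.random.rand(32, 32, 3)                 # original image
grad = np.random.randn(32, 32, 3)             # stand-in for the true gradient
W_map = np.random.rand(32, 32, 1)             # stand-in attention weights
x_adv = x + finefool_perturbation(grad, W_map)   # x* = x (+) rho
```

The attention weighting concentrates the perturbation budget on contour pixels rather than spreading it uniformly.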
Specifically, optimizing the perturbation variable by momentum iteration to update the adversarial sample comprises:
the maximum number of iterations of the trained deep learning classifier f is set to T, the original image is x, and the correct class label of x is y; at the start of the iteration, let x*_0 = x and set the initial velocity vector g_0 = 0;
Defining an attack optimization objective function of an iterative process as follows:
the over parameter kappa is more than or equal to 0, the confidence coefficient of the misclassification class target of the generated countermeasure sample is represented, the requirement for producing the countermeasure sample is higher when the value of kappa is larger, and the obtained sample attack performance is more reliable; x is the number of0Representing an initial image without added disturbance, i.e. an original image x; z (x)yConfidence that the sample is classified as y, Z (x)y′Represents the confidence that the sample is classified as y';represents x-x0Is used to limit the magnitude of the counterdisturbance, i.e. the sum of the squares of the absolute values of the vector elements is then root-opened, yt' represents a specific target label preset by an attacker;
(1) the image x*_i is input to the deep learning classifier f, the gradient ∇_{x*_i} J of the classifier with respect to the input is calculated, and the shallow feature image x_f^i of x*_i in the network is captured; x_f^i is up-sampled by bilinear interpolation to obtain the reconstructed feature image f_m^i; the pixel spatial attention weight W_p^i is then obtained by the calculation formula:
W_p^i = softmax(Σ(x_re^i ⊙ (W_c^i ⊗ f_mm^i))), with W_c^i = softmax(x_re^i ⊗ (f_mm^i)^T)
where W_c^i ⊗ f_mm^i is the reconstructed channel spatial attention weight and W_c^i the channel spatial attention weight before reconstruction; f_mm^i is obtained from f_m^i by a reshape function operation; ⊗ denotes matrix multiplication, softmax(·) is the activation function, x_re^i is the reshaped image matrix, and ⊙ denotes multiplication of corresponding matrix elements; before the outer softmax(·) is executed, the calculated matrix is summed once in the column direction so that W_p^i has size [1, l];
(2) the pixel spatial attention weight W_p^i is reshaped by a reshape operation into the attention mapping weight W_map^i;
(3) the velocity vector g_{i+1} is updated along the gradient direction:
g_{i+1} = μ · g_i + W_map^i ⊙ (∇_{x*_i} J / ‖∇_{x*_i} J‖_1)
where μ is the momentum decay factor;
(4) the perturbation ρ_i to be added is calculated from the velocity vector g_{i+1}:
ρ_i = g_{i+1} × α
where α is the perturbation step size added at each iteration;
(5) steps (1) to (4) are repeated until the perturbation exceeds the preset bound, ‖x* − x‖_∞ > ε, or the attack succeeds, f(x*) ≠ y, i.e., the adversarial sample has been successfully generated, where ‖·‖_∞ is the infinity norm, i.e., the maximum absolute value of the elements of x* − x, ε is the preset perturbation size, and y is the correct class label of the original image x.
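The momentum iteration loop above can be sketched as follows; `grad_fn`, `attention_fn`, and `predict_fn` are assumed helper callables standing in for the classifier, and the momentum decay factor `mu` is an assumption not fixed by the text:

```python
import numpy as np

def momentum_attack(x, y, grad_fn, attention_fn, alpha=0.01, eps=0.05,
                    mu=0.9, T=50, predict_fn=None):
    """Momentum-iterative, attention-weighted attack loop (hedged sketch).

    grad_fn(x)      -> gradient of the loss w.r.t. x        (assumed helper)
    attention_fn(x) -> [H, W, 1] attention mapping weights  (assumed helper)
    predict_fn(x)   -> predicted class label or None        (assumed helper)
    """
    x_adv, g = x.copy(), np.zeros_like(x)
    for _ in range(T):
        grad = grad_fn(x_adv)
        # Attention-weighted, L1-normalised gradient
        grad = attention_fn(x_adv) * grad / (np.abs(grad).sum() + 1e-12)
        g = mu * g + grad                    # velocity update g_{i+1}
        x_adv = x_adv + g * alpha            # rho_i = g_{i+1} * alpha
        if np.abs(x_adv - x).max() > eps:    # perturbation budget exceeded
            break                            # (sketch: stop without projecting)
        if predict_fn is not None and predict_fn(x_adv) != y:
            return x_adv                     # attack succeeded
    return x_adv
```

A full implementation would additionally track the iteration counter against T and report failure, as described below.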
Performing adversarial training on the deep model with the adversarial samples, based on a multi-strength adversarial training strategy, comprises:
(1) based on a preset perturbation amplitude parameter ε, generating a batch adversarial sample subset {x_adv1} with step (1) of the feature-map attention mechanism-oriented adversarial attack defense method, and then successively adjusting the perturbation amplitude to ε/2, ε/3, and ε/4 to obtain the adversarial sample subsets {x_adv2}, {x_adv3}, and {x_adv4};
(2) mixing all adversarial sample subsets obtained in step (1) into a total adversarial sample set with different attack capabilities, and mixing the adversarial samples with normal samples at attack strength values AIn of 0.1, 0.2, 0.3, …, 1.0 to obtain new training data sets of different attack strengths;
(3) fine-tuning the weight parameters of the deep model with the new training data sets of different attack strengths obtained in step (2).
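The multi-strength mixing strategy can be sketched as follows; the dataset representation, sample counts, and sampling details are illustrative assumptions:

```python
import random

def build_mixed_datasets(normal, adv_subsets, ains=(0.1, 0.2, 0.3, 1.0)):
    """Mix adversarial and normal samples at several attack strengths.

    AIn = Num(Adv) / Num(Nor): for each strength value, draw
    round(AIn * len(normal)) adversarial samples from the pooled subsets.
    """
    pool = [s for subset in adv_subsets for s in subset]  # total adversarial set
    datasets = {}
    for ain in ains:
        n_adv = min(round(ain * len(normal)), len(pool))
        datasets[ain] = normal + random.sample(pool, n_adv)
    return datasets

# Toy data: 100 normal samples, four adversarial subsets of 60 samples each
normal = [("img%d" % i, "clean") for i in range(100)]
subsets = [[("adv%d_%d" % (j, i), "adv") for i in range(60)] for j in range(4)]
mixed = build_mixed_datasets(normal, subsets)
```

Each resulting data set is then used for one round of fine-tuning of the deep model.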
The application of the feature-map attention mechanism-oriented adversarial attack defense method to image classification is characterized by comprising the following process:
first, an image set with features similar to the images to be classified is taken as the original images, a deep neural network is taken as the image classification model, a large number of adversarial samples are generated with the above feature-map attention mechanism-oriented adversarial attack defense method, and multi-strength adversarial training is performed on the trained image classification model with these adversarial samples to discover and repair existing vulnerabilities, obtaining an image classification model able to defend against adversarial samples;
then, the images to be classified are classified with the trained, defense-capable image classification model to obtain reliable classification results.
The invention provides a feature-map attention mechanism-oriented adversarial attack defense method: adversarial samples with smaller perturbations that nevertheless reliably mislead the classifier are obtained through the feature-map attention mechanism, and multi-strength adversarial training of the original classifier with these samples improves the robustness and generalization ability of the classifier against adversarial sample attacks, making the classifier more reliable and stable and improving the security of the deep learning model in practical application.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the invention; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic diagram of the adversarial sample generation method FineFool based on the feature-map attention mechanism;
FIG. 2 shows the adversarial samples generated against the deep model ResNet-v2 under the MI-FGSM, PGD, and FineFool attack methods;
FIG. 3 shows the adversarial samples generated against the deep model Inception-v3 under the MI-FGSM, PGD, and FineFool attack methods;
FIG. 4 shows the confidence reduction curves of the original correct class labels of the adversarial samples generated against the deep model ResNet-v2 under the MI-FGSM, PGD, and FineFool attack methods;
FIG. 5 shows the confidence curves of the misclassified class labels of the adversarial samples generated against the deep model Inception-v3 under the MI-FGSM, PGD, and FineFool attack methods.
Detailed Description
In order to make the objects, technical solutions, and advantages of the invention clearer, the invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the invention.
In order to improve the robustness of deep learning models, this embodiment provides a feature-map attention mechanism-oriented adversarial attack defense method, which mainly comprises two stages: an adversarial sample generation stage and an adversarial training stage for the deep learning model. The specific process is as follows:
For the adversarial sample generation stage:
In this stage, an attention mechanism is used to extract the contour features of the target contour, a small perturbation is added based on the contour features, and the perturbation variable is optimized by momentum iteration, thereby realizing an adversarial attack on the deep model. This attack method, named FineFool, can generate adversarial samples and, as shown in FIG. 1, specifically comprises a reconstructed feature extraction step, a channel spatial attention weight calculation step, a pixel spatial attention weight calculation step, and an adversarial sample generation step.
The reconstructed feature extraction step is mainly used to extract the shallow network feature map of the deep learning model. An original image x of size [H, W, 3], where H is the number of pixels in the vertical direction of the image, W the number in the horizontal direction, and 3 the number of RGB channels, is input to the deep classification model (i.e., classifier f), and a shallow feature image x_f of size [H1, W1, c] is extracted by computation. This shallow feature image, which retains better spatial features, serves as the feature image; it is then bilinearly up-sampled, i.e., x_f is up-sampled by bilinear interpolation to obtain the reconstructed feature image f_m of size [H, W, c].
The channel spatial attention weight calculation step is mainly used to calculate the channel spatial attention weight W_c. The specific process is as follows: the original image of size [H, W, 3] is converted by a reshape operation into a matrix x_re of size [3, l], where H is the number of pixels in the vertical direction, W the number in the horizontal direction, and l = H × W; the reconstructed feature image f_m of size [H, W, c] is converted by a reshape operation into a matrix f_mm of size [c, l]; then the channel spatial attention weight matrix W_c of size [3, c] is obtained by the formula W_c = softmax(x_re ⊗ f_mm^T), where softmax(·) is the activation function and ⊗ denotes matrix multiplication.
The pixel spatial attention weight calculation step is mainly used to calculate the pixel spatial attention weight W_p. The specific process is as follows: first, the reconstructed channel spatial attention weight W'_c of size [3, l] is calculated by the formula W'_c = W_c ⊗ f_mm, where ⊗ denotes matrix multiplication; then the pixel spatial attention weight W_p of size [1, l] is calculated by the formula W_p = softmax(Σ(x_re ⊙ W'_c)), where ⊙ denotes multiplication of corresponding matrix elements, Σ sums over the channel dimension, and softmax(·) is the activation function.
The adversarial sample generation step is mainly used to generate the adversarial sample x*. The specific process is as follows: first, the pixel spatial attention weight W_p of size [1, l] is reshaped by a reshape function operation into the attention mapping weight W_map of size [H, W, 1]; then the perturbation ρ to be added is calculated channel by channel by the formula ρ_i = W_map ⊙ (∇_{x_i} J(x, y) / ‖∇_{x_i} J(x, y)‖_1), where ⊙ denotes multiplication of corresponding elements of the two matrices, y is the correct class label of the original image x, ‖·‖_1 is the 1-norm of the calculated gradient (the sum of the absolute values of the vector elements), and x_i is the pixel matrix of the i-th channel; finally the adversarial sample x* is obtained by the formula x* = x ⊕ ρ, where ⊕ denotes addition of corresponding matrix elements.
On the basis of adversarial sample generation, the specific process of updating the adversarial sample by the momentum iteration method is as follows:
the maximum number of iterations of the trained deep learning classifier f is set to T, the original image is x, and the correct class label of x is y; at the start of the iteration, let x*_0 = x and set the initial velocity vector g_0 = 0.
The attack optimization objective function of the iterative process is defined as:
min ‖x − x_0‖_2 + max(Z(x)_y − Z(x)_{y'}, −κ)
where the hyperparameter κ ≥ 0 denotes the confidence of the misclassification target of the generated adversarial sample: the larger κ, the stricter the requirement on the produced adversarial sample and the more reliable its attack performance; x_0 is the initial image without added perturbation, i.e., the original image x; Z(x)_y is the confidence that the sample is classified as y, and Z(x)_{y'} the confidence that it is classified as y'; ‖x − x_0‖_2 is the 2-norm of x − x_0, used to limit the magnitude of the adversarial perturbation (the square root of the sum of the squared absolute values of the vector elements); y'_t denotes a specific target label preset by the attacker, in which case y' = y'_t.
On this basis, the iteration process is as follows:
(1) the image x*_i is input to the deep learning classifier f, the gradient ∇_{x*_i} J of the classifier with respect to the input is calculated, and the shallow feature image x_f^i of x*_i in the network is captured; x_f^i is up-sampled by bilinear interpolation to obtain the reconstructed feature image f_m^i; the pixel spatial attention weight W_p^i is then obtained by the calculation formula:
W_p^i = softmax(Σ(x_re^i ⊙ (W_c^i ⊗ f_mm^i))), with W_c^i = softmax(x_re^i ⊗ (f_mm^i)^T)
where W_c^i ⊗ f_mm^i is the reconstructed channel spatial attention weight and W_c^i the channel spatial attention weight before reconstruction; f_mm^i is obtained from f_m^i by a reshape function operation; ⊗ denotes matrix multiplication, softmax(·) is the activation function, x_re^i is the reshaped image matrix, and ⊙ denotes multiplication of corresponding matrix elements; before the outer softmax(·) is executed, the calculated matrix is summed once in the column direction so that W_p^i has size [1, l];
(2) the pixel spatial attention weight W_p^i is reshaped by a reshape operation into the attention mapping weight W_map^i;
(3) the velocity vector g_{i+1} is updated along the gradient direction:
g_{i+1} = μ · g_i + W_map^i ⊙ (∇_{x*_i} J / ‖∇_{x*_i} J‖_1)
where μ is the momentum decay factor;
(4) the perturbation ρ_i to be added is calculated from the velocity vector g_{i+1}:
ρ_i = g_{i+1} × α
where α is the perturbation step size added at each iteration;
(5) steps (1) to (4) are repeated until the perturbation exceeds the preset bound, ‖x* − x‖_∞ > ε, or the attack succeeds, f(x*) ≠ y, i.e., the adversarial sample has been successfully generated, where ‖·‖_∞ is the infinity norm, i.e., the maximum absolute value of the elements of x* − x, ε is the preset perturbation size, and y is the correct class label of the original image x;
if the adversarial sample is successfully generated, the iteration ends and the adversarial sample is output; otherwise, it is judged whether the current iteration count i exceeds the maximum iteration count T: if not, the momentum iteration continues; if so, the iteration stops and attack failure is output.
The resulting adversarial sample visualizations are shown in the last columns of FIGS. 2 and 3, where ρ_FineFool denotes the visualization of the adversarial perturbation produced by the FineFool method, and Adv_FineFool denotes the adversarial sample after the adversarial perturbation is added to the original normal sample.
Adversarial training stage for the deep model:
In this stage, the adversarial samples generated in the adversarial sample generation stage are used to perform multi-strength adversarial training on the deep model. The multi-strength adversarial training method is specifically as follows:
With other conditions unchanged, different upper limits for the adversarial perturbation, i.e., different ε values, are set to obtain adversarial samples with attack capabilities of different strengths. Adversarial samples of different strengths are mixed with normal samples in certain proportions to obtain different training data sets for adversarial training, and the deep model is adversarially trained on these training data sets batch by batch, so that the deep model improves its generalization ability in defending against adversarial attacks while reducing the classification accuracy on normal samples as little as possible, i.e., it can defend against adversarial samples generated by different attack methods.
The attack strength (AIn) of a training data set is defined as:
AIn = Num(Adv) / Num(Nor)
where Num(Adv) and Num(Nor) are the numbers of adversarial and normal samples, respectively. Generally, the number of normal image samples in the training data set is fixed, while adversarial samples can be generated under different attack method parameters, so their number can far exceed that of the normal samples; the value range of AIn is AIn ≥ 0.
The specific process of performing countermeasure training on the depth model comprises the following steps:
(1) based on a preset disturbance amplitude parameter epsilon, generating a batch of confrontation sample subsets { x ] through the confrontation attack method attack depth model based on the feature map attention machine mechanismadv1And then continuously adjusting the disturbance amplitude values to be epsilon/2, epsilon/3 and epsilon/4 to obtain more data sample subset { x }adv2}、{xadv3}、{xadv4And as the preset disturbance amplitude is reduced, the attack success rate is reduced, the number of corresponding confrontation samples is reduced, and the overall attack capability of the confrontation samples of each set is weakened.
(2) Mixing all the confrontation samples obtained in the step (1) to obtain a total set of the confrontation samples with different attack capabilities, ensuring the balance and diversity of data distribution, and then mixing the confrontation samples with normal samples from 0.1, 0.2, 0.3, … and 1.0 according to the value of AIn to obtain a new training data set with different attack strengths; the normal samples in these new training data sets are all the same, and the challenge samples have a certain randomness.
(3) Fine-tuning training is performed on the weight parameters of the depth model using the training data sets of different attack intensities obtained in step (2), so that the depth model is more robust to adversarial-sample attacks and the reliability of the depth model in application is improved.
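Steps (1)-(3) above can be sketched end to end as follows; `attack_fn` and `fine_tune_fn` are hypothetical stand-ins for the feature-map-attention attack and the model's weight fine-tuning routine, which the patent does not specify at code level:

```python
import random

def multi_strength_adversarial_training(model, normal_set, attack_fn,
                                        fine_tune_fn, epsilon, seed=0,
                                        ain_values=tuple(i / 10 for i in range(1, 11))):
    """Multi-strength adversarial training (steps (1)-(3)).

    attack_fn(model, samples, eps) -> list of adversarial samples;
    fine_tune_fn(model, mixed_set) fine-tunes the model weights in place.
    """
    rng = random.Random(seed)
    # Step (1): generate subsets at eps, eps/2, eps/3, eps/4.
    pool = []
    for divisor in (1, 2, 3, 4):
        pool.extend(attack_fn(model, normal_set, epsilon / divisor))
    # Steps (2)-(3): one mixed set per attack intensity AIn, then fine-tune.
    for ain in ain_values:
        n_adv = min(round(ain * len(normal_set)), len(pool))
        mixed = list(normal_set) + rng.sample(pool, n_adv)
        rng.shuffle(mixed)
        fine_tune_fn(model, mixed)
    return model
```

In this sketch every AIn value yields one fine-tuning pass over a freshly mixed data set, matching the "batch-wise" training described above.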
Application example
The feature-map-attention-oriented adversarial attack defense method is applied to image classification; specifically, it can be used to classify animal images and target images such as face images.
When the method is applied, an image set with features similar to the images to be classified is first taken as the original images, and a deep learning network (which may be ResNet-v2 or Inception-v3) is taken as the image classification model. A large number of adversarial samples are generated using the feature-map-attention-oriented adversarial attack defense method, and multi-strength adversarial training is performed on the trained image classification model with these adversarial samples to discover and repair existing vulnerabilities, yielding an image classification model capable of defending against adversarial samples. The trained image classification model with this defense capability is then used to classify the images to be classified, obtaining reliable classification results.
Specific experiments are as follows:
The image dataset used in this experiment is a subset of the ImageNet image dataset from http://www.image-net.org/. Its basic characteristics are: (a) the image dataset has 130,000 training image samples, 100,000 test image samples and 50,000 validation samples, each image sample having a size of 64 × 64 pixels; (b) the dataset is divided into 1,000 classes, each class having the same number of image samples, namely 130 samples per class in the training set, 50 per class in the validation set and 100 per class in the test set; (c) a simple normalization operation was performed on each picture for ease of experiment.
Parameter fine-tuning training is performed on the trained image classification model using the training set, and adversarial samples are generated with the FineFool method.
The image classification models used in this experiment are ResNet-v2 and Inception-v3. The resulting adversarial-sample visualizations are shown in FIG. 2 and the last column of FIG. 3. In FIG. 2, "original" denotes the original normal image, and ρ_MI-FGSM, Adv_MI-FGSM, ρ_PGD, Adv_PGD, ρ_FineFool and Adv_FineFool denote the perturbations and adversarial samples obtained by the MI-FGSM, PGD and FineFool attack methods, respectively. FIG. 2 and FIG. 3 show the results of attacking the depth models ResNet-v2 and Inception-v3, respectively. FIG. 4 and FIG. 5 show, for the adversarial samples of FIG. 2 and FIG. 3, the decreasing confidence curve of the original correct class and the increasing confidence curve of the misclassified class during the attack.
Here, PGD and MI-FGSM are the comparison attack methods. PGD applies one step of standard gradient descent and then clips all coordinates back into the allowed region; research shows that the local maxima found by PGD have similar loss values whether the network is normally trained or adversarially trained, which indicates that the adversarial samples generated by this method are robust. The MI-FGSM attack method introduces a generalized momentum-iteration algorithm to enhance attack capability: embedding a momentum term into the attack iteration stabilizes the update direction of the perturbation and avoids poor local optima.
The MI-FGSM, PGD and FineFool attack methods are used to attack the ResNet-v2 and Inception-v3 depth models, and the generated adversarial samples are then used for the multi-strength adversarial training defense; the resulting defense effect is shown in Table 1. Table 1 reports the attack success rate: the smaller the value, the less often the model is successfully attacked and the better its defense capability. It can be seen that the FineFool method proposed by the invention generates better adversarial samples, so the model defends better after adversarial training. In Table 1, the different attack methods attack the model after it has been adversarially trained with adversarial samples generated by the FineFool attack method.
TABLE 1 Attack success rates after adversarial training based on the FineFool attack method
The above-mentioned embodiments are intended to illustrate the technical solutions and advantages of the present invention, and it should be understood that the above-mentioned embodiments are only the most preferred embodiments of the present invention, and are not intended to limit the present invention, and any modifications, additions, equivalents, etc. made within the scope of the principles of the present invention should be included in the scope of the present invention.
Claims (2)
1. An attention-mechanism-oriented adversarial attack defense method, comprising the following steps:
Step 1: extracting contour features of a target contour in an image using an attention mechanism, and designing a small perturbation to be added to an original normal sample based on the extracted contour features, obtaining an adversarial sample; this comprises the following steps:
a reconstructed feature extraction step: based on the shallow network features of the depth classification model, using an attention mechanism to extract a shallow feature image of the input original image as the feature image, and performing an up-sampling operation on the feature image to obtain a reconstructed feature image;
a channel-space attention weight calculation step, for calculating a channel-space attention weight matrix from the original image and the reconstructed feature image, comprising: converting the original image x of size [H, W, 3] into an image x_re of size [3, l] by a reshape operation, where H is the number of pixels in the vertical direction of the image, W is the number of pixels in the horizontal direction, 3 denotes the three RGB channels of a color image, and l = H × W; converting the up-sampled shallow hidden-layer reconstructed feature image f_m of size [H, W, c] into a reconstructed feature image f_mm of size [c, l] by a reshape operation; and obtaining the channel-space attention weight matrix W_c of size [3, c] by the formula W_c = softmax(x_re ⊗ f_mm^T), where softmax(·) is an activation function;
a pixel-space attention weight calculation step, for calculating a pixel-space attention weight matrix from the reconstructed channel-space attention weight matrix and the original image, comprising: computing the reconstructed channel-space attention weight W_c_rec = W_c ⊗ f_mm of size [3, l], where ⊗ denotes matrix multiplication; and computing the pixel-space attention weight W_p of size [1, l] as W_p = softmax(Σ_k (W_c_rec ⊙ x_re)), where ⊙ denotes element-wise multiplication of corresponding matrix elements, the sum runs over the channel dimension, and softmax(·) is an activation function;
an adversarial sample generation step, for calculating the perturbation to be added according to the pixel-space attention weight matrix and adding the perturbation to the original image to obtain the adversarial sample, comprising: reshaping the pixel-space attention weight W_p of size [1, l] into the attention mapping weight W_map of size [H, W, 1] by a reshape operation; calculating the added perturbation ρ by formula (1); and obtaining the adversarial sample x* = x ⊕ ρ, where ⊕ denotes element-wise addition of corresponding matrix elements;
where ⊙ denotes element-wise multiplication of two matrices; y denotes the correct class label of the original image x; ‖·‖_1 denotes the 1-norm of the computed gradient, i.e. the sum of the absolute values of the vector elements; and x_k denotes the pixel matrix of the k-th channel;
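The attention-weight pipeline of step 1 can be sketched in NumPy as follows; the softmax axes are an assumption inferred from the stated matrix sizes, since the claim's formula images are not reproduced in this text:

```python
import numpy as np

def softmax(a, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_mapping_weight(x, f_m):
    """Channel-space and pixel-space attention weights of step 1.

    x:   original image, shape [H, W, 3]
    f_m: up-sampled shallow feature image, shape [H, W, c]
    Returns the attention mapping weight W_map of shape [H, W, 1].
    """
    h, w, _ = x.shape
    l = h * w
    x_re = x.reshape(l, 3).T                    # [3, l]
    f_mm = f_m.reshape(l, f_m.shape[2]).T       # [c, l]
    w_c = softmax(x_re @ f_mm.T)                # [3, c] channel-space weights
    w_c_rec = w_c @ f_mm                        # [3, l] reconstructed weights
    w_p = softmax((w_c_rec * x_re).sum(axis=0, keepdims=True))  # [1, l]
    return w_p.reshape(h, w, 1)                 # W_map
```

Because the final softmax runs over all l = H × W positions, the returned weights are non-negative and sum to one over the image.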
Step 2: optimizing the perturbation variable by momentum iteration to update the adversarial sample, thereby realizing the adversarial attack on the depth classification model, comprising:
first, setting the maximum number of iterations of the trained depth classification model f as T, the original image as x, and the correct class label of x as y; at the start of the iteration, initializing the adversarial sample to the original image and setting the initial velocity vector g_0 = 0;
then, defining the attack optimization objective function of the iterative process as:
the over parameter kappa is more than or equal to 0, the confidence coefficient of the misclassification class target of the generated countermeasure sample is represented, the requirement for producing the countermeasure sample is higher when the value of kappa is larger, and the obtained sample attack performance is more reliable; x is the number of0Representing an initial image without added disturbance, i.e. an original image; z (x)yConfidence that the sample is classified as y, Z (x)y′Represents the confidence that the sample is classified as y';represents x-x0Is used to limit the magnitude of the counterdisturbance, i.e. the sum of the squares of the absolute values of the vector elements is then root-opened, yt' represents a specific target label preset by an attacker;
finally, the process of optimizing the perturbation variable by momentum iteration to update the adversarial sample is as follows:
(1) inputting the image x*_i to the depth classification model f, calculating the gradient of f with respect to the input, and capturing the shallow feature image of x*_i in the network; performing an up-sampling operation on the shallow feature image by bilinear interpolation to obtain the reconstructed feature image; and obtaining the pixel-space attention weight W_p,i by the following calculation formula:
where W_c_rec,i denotes the reconstructed channel-space attention weight and W_c,i the channel-space attention weight before reconstruction; a reconstruction operation is performed by the reshape function to obtain the required shapes; ⊗ denotes matrix multiplication, softmax(·) is an activation function, x_re,i denotes the reshaped image matrix, and ⊙ denotes element-wise multiplication of corresponding matrix elements; before the softmax(·) function is executed, the computed matrix is summed once in the column direction, so that W_p,i has size [1, l];
(2) reshaping the pixel-space attention weight W_p,i into the attention mapping weight W_map,i by a reconstruction operation;
(3) updating the velocity vector g_{i+1} based on the gradient direction:
(4) based on the velocity vector g_{i+1}, calculating the perturbation ρ_i to be added:
ρ_i = g_{i+1} × α (5)
where α denotes the perturbation step size added at each iteration;
repeating steps (1) to (4) until the perturbation exceeds the preset bound, i.e. ‖x* − x‖_∞ > ε, or the attack succeeds, i.e. f(x*) ≠ y, meaning the adversarial sample has been successfully generated; where ‖·‖_∞ denotes the infinity norm, i.e. the maximum absolute value of the elements of x* − x, ε is the preset perturbation size, and y is the correct class label of the original image x;
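The momentum-iteration loop of step 2 can be sketched as follows; `grad_fn` and `attention_fn` are hypothetical stand-ins for the model gradient and the attention-weight computation, and the sign step is an assumption in the spirit of MI-FGSM, since the claim's formulas (2)-(5) are partly images not reproduced in this text:

```python
import numpy as np

def momentum_iterative_attack(x, grad_fn, attention_fn,
                              epsilon, alpha, max_iter, mu=1.0):
    """Momentum-iterative update of the adversarial sample (steps (1)-(4)).

    grad_fn(x) -> gradient of the loss w.r.t. x (same shape as x);
    attention_fn(x) -> attention mapping weight W_map, shape [H, W, 1].
    Stops after max_iter iterations or once the L_inf perturbation
    exceeds epsilon.
    """
    x0 = x.copy()
    g = np.zeros_like(x)                      # velocity vector g_0 = 0
    for _ in range(max_iter):
        grad = grad_fn(x)
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)  # 1-norm-normalised momentum
        rho = attention_fn(x) * np.sign(g) * alpha        # attention-weighted step rho_i
        x = x + rho
        if np.abs(x - x0).max() > epsilon:    # perturbation bound reached
            break
    return x
```

The attention weight concentrates each step on the target contour, while the momentum term keeps the update direction stable across iterations.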
Step 3: performing adversarial training on the depth classification model, based on a multi-strength adversarial training strategy, with a data set obtained by mixing adversarial samples and normal samples, to realize the defense of the depth classification model against adversarial attacks, comprising:
(1) generating a batch subset of adversarial samples {x_adv1} using steps 1 and 2 of the attention-mechanism-oriented adversarial attack defense method with the preset perturbation size ε, and then successively adjusting the perturbation amplitude to ε/2, ε/3 and ε/4 to obtain the adversarial sample subsets {x_adv2}, {x_adv3} and {x_adv4};
(2) mixing all the adversarial sample subsets to obtain a total adversarial sample set with different attack capabilities, and mixing the adversarial samples with normal samples according to attack intensity AIn values of 0.1, 0.2, 0.3, …, 1.0 to obtain new training data sets of different attack intensities, where the attack intensity is the ratio of the number of adversarial samples to the number of normal samples;
(3) performing fine-tuning training on the weight parameters of the depth classification model using the new training data sets of different attack intensities.
2. Use of the attention-mechanism-oriented adversarial attack defense method according to claim 1 in image classification, characterized by comprising the following processes:
first, taking an image set with features similar to the images to be classified as the original images and a deep neural network as the image classification model, generating a large number of adversarial samples using the attention-mechanism-oriented adversarial attack defense method of claim 1, and performing multi-strength adversarial training on the trained image classification model with these adversarial samples to discover and repair existing vulnerabilities, obtaining an image classification model capable of defending against adversarial samples;
then, classifying the images to be classified using the trained image classification model with adversarial-sample defense capability, obtaining reliable classification results.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910138087.1A CN109948658B (en) | 2019-02-25 | 2019-02-25 | Feature diagram attention mechanism-oriented anti-attack defense method and application |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910138087.1A CN109948658B (en) | 2019-02-25 | 2019-02-25 | Feature diagram attention mechanism-oriented anti-attack defense method and application |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109948658A CN109948658A (en) | 2019-06-28 |
CN109948658B true CN109948658B (en) | 2021-06-15 |
Family
ID=67006468
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910138087.1A Active CN109948658B (en) | 2019-02-25 | 2019-02-25 | Feature diagram attention mechanism-oriented anti-attack defense method and application |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109948658B (en) |
Families Citing this family (55)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110472672B (en) * | 2019-07-25 | 2023-04-18 | 创新先进技术有限公司 | Method and apparatus for training machine learning models |
CN110444208A (en) * | 2019-08-12 | 2019-11-12 | 浙江工业大学 | A kind of speech recognition attack defense method and device based on gradient estimation and CTC algorithm |
CN110674938B (en) * | 2019-08-21 | 2021-12-21 | 浙江工业大学 | Anti-attack defense method based on cooperative multi-task training |
CN110633655A (en) * | 2019-08-29 | 2019-12-31 | 河南中原大数据研究院有限公司 | Attention-attack face recognition attack algorithm |
CN110782420A (en) * | 2019-09-19 | 2020-02-11 | 杭州电子科技大学 | Small target feature representation enhancement method based on deep learning |
CN110705652B (en) * | 2019-10-17 | 2020-10-23 | 北京瑞莱智慧科技有限公司 | Countermeasure sample, generation method, medium, device and computing equipment thereof |
CN110852363B (en) * | 2019-10-31 | 2022-08-02 | 大连理工大学 | Anti-sample defense method based on deception attacker |
CN110941794B (en) * | 2019-11-27 | 2023-08-22 | 浙江工业大学 | Challenge attack defense method based on general inverse disturbance defense matrix |
CN111046673B (en) * | 2019-12-17 | 2021-09-03 | 湖南大学 | Training method for defending text malicious sample against generation network |
CN111191717B (en) * | 2019-12-30 | 2022-05-10 | 电子科技大学 | Black box confrontation sample generation algorithm based on hidden space clustering |
CN111046847A (en) * | 2019-12-30 | 2020-04-21 | 北京澎思科技有限公司 | Video processing method and device, electronic equipment and medium |
CN111275106B (en) * | 2020-01-19 | 2022-07-01 | 支付宝(杭州)信息技术有限公司 | Countermeasure sample generation method and device and computer equipment |
CN111325319B (en) * | 2020-02-02 | 2023-11-28 | 腾讯云计算(北京)有限责任公司 | Neural network model detection method, device, equipment and storage medium |
CN111340180B (en) * | 2020-02-10 | 2021-10-08 | 中国人民解放军国防科技大学 | Countermeasure sample generation method and device for designated label, electronic equipment and medium |
CN111325341B (en) * | 2020-02-18 | 2023-11-14 | 中国空间技术研究院 | Countermeasure training method with self-adaptive countermeasure intensity |
CN111368908B (en) * | 2020-03-03 | 2023-12-19 | 广州大学 | HRRP non-target countermeasure sample generation method based on deep learning |
CN111368725B (en) * | 2020-03-03 | 2023-10-03 | 广州大学 | HRRP targeted countermeasure sample generation method based on deep learning |
CN111488916B (en) * | 2020-03-19 | 2023-01-24 | 天津大学 | Anti-attack method based on training set data |
CN111414964A (en) * | 2020-03-23 | 2020-07-14 | 上海金桥信息股份有限公司 | Image security identification method based on defense sample |
CN111476228A (en) * | 2020-04-07 | 2020-07-31 | 海南阿凡题科技有限公司 | White-box confrontation sample generation method for scene character recognition model |
CN115063790A (en) * | 2020-05-11 | 2022-09-16 | 北京航空航天大学 | Anti-attack method and device based on three-dimensional dynamic interaction scene |
CN112115761B (en) * | 2020-05-12 | 2022-09-13 | 吉林大学 | Countermeasure sample generation method for detecting vulnerability of visual perception system of automatic driving automobile |
CN111754519B (en) * | 2020-05-27 | 2024-04-30 | 浙江工业大学 | Class activation mapping-based countermeasure method |
CN111625820A (en) * | 2020-05-29 | 2020-09-04 | 华东师范大学 | Federal defense method based on AIoT-oriented security |
CN111783085B (en) * | 2020-06-29 | 2023-08-22 | 浙大城市学院 | Defense method and device for resisting sample attack and electronic equipment |
CN111783629B (en) * | 2020-06-29 | 2023-04-07 | 浙大城市学院 | Human face in-vivo detection method and device for resisting sample attack |
CN111860681B (en) * | 2020-07-30 | 2024-04-30 | 江南大学 | Deep network difficulty sample generation method under double-attention mechanism and application |
CN111881436A (en) * | 2020-08-04 | 2020-11-03 | 公安部第三研究所 | Method and device for generating black box face anti-attack sample based on feature consistency and storage medium thereof |
CN112016686B (en) * | 2020-08-13 | 2023-07-21 | 中山大学 | Antagonistic training method based on deep learning model |
CN112085069B (en) * | 2020-08-18 | 2023-06-20 | 中国人民解放军战略支援部队信息工程大学 | Multi-target countermeasure patch generation method and device based on integrated attention mechanism |
CN112035834A (en) * | 2020-08-28 | 2020-12-04 | 北京推想科技有限公司 | Countermeasure training method and device, and application method and device of neural network model |
CN112215151B (en) * | 2020-10-13 | 2022-10-25 | 电子科技大学 | Method for enhancing anti-interference capability of target detection system by using 3D (three-dimensional) countermeasure sample |
CN112541404A (en) * | 2020-11-22 | 2021-03-23 | 同济大学 | Physical attack counterattack sample generation method facing traffic information perception |
CN112507811A (en) * | 2020-11-23 | 2021-03-16 | 广州大学 | Method and system for detecting face recognition system to resist masquerading attack |
CN112488321B (en) * | 2020-12-07 | 2022-07-01 | 重庆邮电大学 | Antagonistic machine learning defense method oriented to generalized nonnegative matrix factorization algorithm |
CN112580822B (en) * | 2020-12-16 | 2023-10-17 | 北京百度网讯科技有限公司 | Countermeasure training method device for machine learning model, electronic equipment and medium |
CN112804231B (en) * | 2021-01-13 | 2021-09-24 | 广州大学 | Distributed construction method, system and medium for attack graph of large-scale network |
CN112949678B (en) * | 2021-01-14 | 2023-05-02 | 西安交通大学 | Deep learning model countermeasure sample generation method, system, equipment and storage medium |
CN115019050A (en) * | 2021-03-05 | 2022-09-06 | 腾讯科技(深圳)有限公司 | Image processing method, device, equipment and storage medium |
CN113076980B (en) * | 2021-03-24 | 2023-11-14 | 中山大学 | Method for detecting images outside distribution based on attention enhancement and input disturbance |
CN113611323B (en) * | 2021-05-07 | 2024-02-20 | 北京至芯开源科技有限责任公司 | Voice enhancement method and system based on double-channel convolution attention network |
CN113344090B (en) * | 2021-06-18 | 2022-11-22 | 成都井之丽科技有限公司 | Image processing method for resisting attack by target in middle layer |
CN113571067B (en) * | 2021-06-21 | 2023-12-26 | 浙江工业大学 | Voiceprint recognition countermeasure sample generation method based on boundary attack |
CN113485313A (en) * | 2021-06-25 | 2021-10-08 | 杭州玳数科技有限公司 | Anti-interference method and device for automatic driving vehicle |
CN113392932B (en) * | 2021-07-06 | 2024-01-30 | 中国兵器工业信息中心 | Anti-attack system for deep intrusion detection |
CN113780557B (en) * | 2021-11-11 | 2022-02-15 | 中南大学 | Method, device, product and medium for resisting image attack based on immune theory |
CN114092856B (en) * | 2021-11-18 | 2024-02-06 | 西安交通大学 | Video weak supervision abnormality detection system and method for antagonism and attention combination mechanism |
CN114241268A (en) * | 2021-12-21 | 2022-03-25 | 支付宝(杭州)信息技术有限公司 | Model training method, device and equipment |
CN114492832A (en) * | 2021-12-24 | 2022-05-13 | 北京航空航天大学 | Selective attack method and device based on associative learning |
CN114332569B (en) * | 2022-03-17 | 2022-05-27 | 南京理工大学 | Low-disturbance attack resisting method based on attention mechanism |
CN114742170B (en) * | 2022-04-22 | 2023-07-25 | 马上消费金融股份有限公司 | Countermeasure sample generation method, model training method, image recognition method and device |
CN114978654B (en) * | 2022-05-12 | 2023-03-10 | 北京大学 | End-to-end communication system attack defense method based on deep learning |
CN114612688B (en) * | 2022-05-16 | 2022-09-09 | 中国科学技术大学 | Countermeasure sample generation method, model training method, processing method and electronic equipment |
CN114943641B (en) * | 2022-07-26 | 2022-10-28 | 北京航空航天大学 | Method and device for generating confrontation texture image based on model sharing structure |
CN116450187B (en) * | 2023-05-05 | 2024-06-25 | 北京慧和伙科技有限公司 | Digital online application processing method and AI application system applied to AI analysis |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108322349A (en) * | 2018-02-11 | 2018-07-24 | 浙江工业大学 | The deep learning antagonism attack defense method of network is generated based on confrontation type |
CN108446765A (en) * | 2018-02-11 | 2018-08-24 | 浙江工业大学 | The multi-model composite defense method of sexual assault is fought towards deep learning |
CN108932527A (en) * | 2018-06-06 | 2018-12-04 | 上海交通大学 | Using cross-training model inspection to the method for resisting sample |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10636141B2 (en) * | 2017-02-09 | 2020-04-28 | Siemens Healthcare Gmbh | Adversarial and dual inverse deep learning networks for medical image analysis |
- 2019-02-25 CN CN201910138087.1A patent/CN109948658B/en active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108322349A (en) * | 2018-02-11 | 2018-07-24 | 浙江工业大学 | The deep learning antagonism attack defense method of network is generated based on confrontation type |
CN108446765A (en) * | 2018-02-11 | 2018-08-24 | 浙江工业大学 | The multi-model composite defense method of sexual assault is fought towards deep learning |
CN108932527A (en) * | 2018-06-06 | 2018-12-04 | 上海交通大学 | Using cross-training model inspection to the method for resisting sample |
Non-Patent Citations (1)
Title |
---|
FineFool: Fine Object Contour Attack via Attention; Jinyin Chen et al.; arXiv:1812.01713v1; 2018-12-01; pp. 1-8 *
Also Published As
Publication number | Publication date |
---|---|
CN109948658A (en) | 2019-06-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109948658B (en) | Feature diagram attention mechanism-oriented anti-attack defense method and application | |
CN111475797B (en) | Method, device and equipment for generating countermeasure image and readable storage medium | |
CN110941794A (en) | Anti-attack defense method based on universal inverse disturbance defense matrix | |
CN111460426B (en) | Deep learning resistant text verification code generation system and method based on antagonism evolution framework | |
CN111753881A (en) | Defense method for quantitatively identifying anti-attack based on concept sensitivity | |
CN112364915A (en) | Imperceptible counterpatch generation method and application | |
CN113255816B (en) | Directional attack countermeasure patch generation method and device | |
CN111754519B (en) | Class activation mapping-based countermeasure method | |
CN111178504B (en) | Information processing method and system of robust compression model based on deep neural network | |
CN113283599A (en) | Anti-attack defense method based on neuron activation rate | |
CN114387449A (en) | Image processing method and system for coping with adversarial attack of neural network | |
Yu et al. | A multi-task learning CNN for image steganalysis | |
CN113435264A (en) | Face recognition attack resisting method and device based on black box substitution model searching | |
CN113221388A (en) | Method for generating confrontation sample of black box depth model constrained by visual perception disturbance | |
CN117011508A (en) | Countermeasure training method based on visual transformation and feature robustness | |
CN115510986A (en) | Countermeasure sample generation method based on AdvGAN | |
CN115238271A (en) | AI security detection method based on generative learning | |
CN112766401B (en) | Countermeasure sample defense method based on significance countermeasure training | |
CN115270891A (en) | Method, device, equipment and storage medium for generating signal countermeasure sample | |
CN114693973A (en) | Black box confrontation sample generation method based on Transformer model | |
CN114359653A (en) | Attack resisting method, defense method and device based on reinforced universal patch | |
CN114842242A (en) | Robust countermeasure sample generation method based on generative model | |
CN111340066B (en) | Confrontation sample generation method based on geometric vector | |
CN113222480A (en) | Training method and device for confrontation sample generation model | |
Zhu et al. | Adversarial example defense via perturbation grading strategy |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||