CN110942094B - Norm-based adversarial sample detection and classification method - Google Patents

Norm-based adversarial sample detection and classification method

Info

Publication number
CN110942094B
CN110942094B (Application CN201911174658.3A)
Authority
CN
China
Prior art keywords
attack
adversarial
sample
image sample
norm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911174658.3A
Other languages
Chinese (zh)
Other versions
CN110942094A (en)
Inventor
江维
詹瑾瑜
何致远
吴俊廷
龚子成
潘唯迦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201911174658.3A priority Critical patent/CN110942094B/en
Publication of CN110942094A publication Critical patent/CN110942094A/en
Application granted granted Critical
Publication of CN110942094B publication Critical patent/CN110942094B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a norm-based method for detecting and classifying adversarial samples, comprising the following steps: S1, generating adversarial samples by selecting attack methods with different attack strengths; S2, calculating the norms of the adversarial samples to obtain a classification threshold and a grading threshold; S3, determining a detector; S4, compressing the target sample and calculating its norm triple L = (l∞, l2, l0), then comparing the calculated norm values with the thresholds obtained in step S2 to judge whether the target sample is an adversarial sample; if so, further obtaining the attack classification and attack strength of the adversarial sample, otherwise leaving the sample unprocessed; and S5, verifying the rationality and effectiveness of the overall detector. The method can detect the specific classification and attack strength of adversarial samples while detecting the samples themselves accurately.

Description

Norm-based adversarial sample detection and classification method
Technical Field
The invention relates to a norm-based method for detecting and classifying adversarial samples.
Background
The adversarial sample was first described by Christian Szegedy, Ian Goodfellow, and others: by slightly perturbing the original data, a machine learning algorithm can be made to output an erroneous result, so adversarial samples achieve high deception rates in image recognition, natural language processing, speech recognition, and other fields. Such minor adjustments are not easily perceived and are highly damaging to AI systems, which makes detecting adversarial samples essential. The basic problem of adversarial sample detection is: without affecting system function, effectively detect the existence of adversarial samples and determine their classification and attack strength from their classification characteristics. The problem to be solved is therefore: while maintaining model accuracy, effectively detect adversarial samples and give their classification and attack strength.
There are three traditional detection approaches: sample statistics, adding a sub-network, and prediction inconsistency. Sample statistics collects a large amount of normal data and detects adversarial samples by comparing the difference between target samples and normal samples. Common comparison methods include statistical tests of maximum mean discrepancy, the K-nearest-neighbor algorithm, and kernel density estimation. The sample-statistics approach requires a large number of adversarial and legitimate inputs and fails to detect a single adversarial example. It is also computationally expensive, and only adversarial examples far from the legitimate population can be detected. Because adversarial examples are inherently imperceptible, separating them from legitimate input with sample statistics appears to be of limited effectiveness. Adding a sub-network attaches an adversarial-sample detector to the model as a sub-network. Similar to adversarial training, adversarial examples can be used to train the detector. However, this strategy also requires a large number of adversarial examples, and is therefore expensive and prone to overfitting to the attacks that generated the training examples. Detection accuracy depends on how complete the training data is; if a new adversarial attack method appears, its samples cannot be detected, because the detection sub-network has not learned the corresponding adversarial features. The basic idea of prediction inconsistency is to measure the divergence between several models when predicting unknown inputs, since an adversarial example may target only a certain class of models and cannot fool every model. Prediction inconsistency can be implemented in several ways: detecting adversarial samples through the prediction differences of different models on the same data; predicting the input data multiple times with the same model using Dropout, with the Dropout layers using different probabilities in each prediction; or producing interpretable outputs for the input data with several interpretable models and comparing the different interpretations to judge whether the input is adversarial.
Traditional adversarial-sample detection methods can accurately detect adversarial samples by preprocessing the input data or modifying the model, but they cannot detect detailed information about the adversarial sample. This detailed information comprises the characteristic classification of the adversarial sample and a definition of its attack strength, and it prepares well for defending against adversarial samples. Traditional classification of adversarial samples is subjective and based on the effects they produce: adversarial attacks are divided into white-box and black-box attacks according to whether the adversary knows the model; into targeted and non-targeted attacks according to whether the adversarial sample misleads the data into a specific target class; and into one-step and iterative attacks according to whether a one-step or an iterative method is used to generate the adversarial sample. These classifications are not perfect and may overlap in some cases; for example, an attack may be white-box on one model and black-box on another. Moreover, such classifications lack a description of the characteristics of the adversarial samples, so the features of each attack type cannot be measured. Traditional detection methods also do not reflect the attack strength of the adversarial sample well.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a norm-based method for detecting and classifying adversarial samples that can detect the specific classification and attack strength of adversarial samples while detecting the samples themselves accurately.
The purpose of the invention is realized by the following technical scheme: a norm-based method for detecting and classifying adversarial samples, comprising the following steps:
S1, generating adversarial samples: selecting attack methods with different attack strengths to generate adversarial samples;
S2, calculating the norms of the adversarial samples to obtain a classification threshold and a grading threshold;
S3, determining a detector: determining a detector using a detection method based on improved prediction inconsistency;
S4, compressing the target sample with the detector and calculating the norm triple L = (l∞, l2, l0) of the sample; comparing the calculated norm values with the thresholds obtained in step S2 to judge whether the target sample is an adversarial sample; if so, further obtaining the attack classification and attack strength of the adversarial sample; otherwise the sample is not processed;
S5, verifying the rationality and validity of the whole detector: inputting a test sample and comparing whether the detected classification and grade are consistent with those of the original label; if so, the operation ends, otherwise return to step S1.
Further, the step S1 includes the following sub-steps:
S11, determining an adversarial attack method to generate adversarial samples;
S12, calculating a loss function L(x') during the generation of the adversarial sample for each attack, and generating adversarial samples according to the loss function:
x* = arg min L(x')  s.t.  d(x*, x') ≤ ε
where d(x*, x') ≤ ε constrains the distance between the adversarial sample x* and x' to be within a preset minimum ε;
S13, selecting different limiting conditions to constrain each generated adversarial sample to be not easily perceived; the limiting conditions comprise the L-0 norm, the L-2 norm and the L-∞ norm;
S14, obtaining different adversarial samples by adjusting the attack iteration count and confidence of the adversarial attack method, observing the attack success rate and confidence of each adversarial sample, and determining the attack strength;
S15, dividing the obtained adversarial samples into two parts: one part is stored classified by limiting condition and attack strength, and the other part is used as test data after being labeled with limiting condition and attack strength.
Further, the step S2 includes the following sub-steps:
S21, taking the adversarial samples stored by class in step S15 as input, and calculating the L-∞, L-2 and L-0 norm values of all adversarial samples; the calculation formulas are:
L-∞: ||x||∞ = max(|x1|, |x2|, ..., |xn|)
L-2: ||x||2 = √(x1² + x2² + ... + xn²)
L-0: ||x||0 = Count(xi ≠ 0);
S22, according to the classes into which the adversarial samples are divided, obtaining through statistical analysis the classification threshold γc = (c∞, c2, c0) of the adversarial samples under each norm constraint, where c∞, c2 and c0 are the thresholds of the L-∞, L-2 and L-0 norms, respectively;
S23, outputting the attack success rate and confidence of the adversarial samples, comparing attack strengths, and obtaining the grading threshold γg through statistical analysis; the attack success rate and confidence of the adversarial samples are calculated cyclically, and the grading threshold γg is updated.
Further, the step S3 includes the following sub-steps:
S31, initializing the detector: setting the compressed color depth to 1 bit and the spatial smoother size to 2×2;
S32, inputting adversarial samples to obtain the detection rate and model accuracy of the detector, and storing them;
S33, changing the detector combination by gradually increasing the compressed color-depth bit count and the spatial-smoother size, repeating step S32, and updating the best detection rate and model accuracy;
S34, determining the optimal detector combination according to the best detection rate and model accuracy.
Further, the step S4 includes the following sub-steps:
S41, according to the optimal detector obtained in step S3, inputting the adversarial sample into the optimal detector to obtain a compressed version of the original data, and calculating the L-∞, L-2 and L-0 norm values of the data;
S42, calculating the differences between the norm values obtained in S41 and the classification threshold γc = (c∞, c2, c0), and selecting the item with the smallest difference as the classification of the adversarial sample;
S43, calculating the sum of the norm differences;
S44, comparing whether the sum of the differences is larger than the grading threshold γg; if so, judging the attack strength to be weak, otherwise judging it to be strong;
S45, outputting the classification and attack strength of the adversarial sample.
Further, the step S5 includes the following sub-steps:
S51, inputting the test data obtained in step S15 into the detector obtained in step S3;
S52, comparing the detected classification and attack level with the classification and level of the original labels of the test set; if they are consistent, the verification passes and the operation ends; otherwise, return to step S1, modify the parameters, and adjust the detection method.
The invention has the following beneficial effects: unlike traditional adversarial-sample detection methods, the method can detect the specific classification and attack strength of adversarial samples while detecting the samples themselves accurately.
Drawings
FIG. 1 is a flow chart of the norm-based adversarial sample detection and classification method of the present invention;
FIG. 2 shows the method of generating adversarial samples with different optimization methods and different constraints according to the present invention;
FIG. 3 shows the statistical analysis method for the attack-classification and strength-grading thresholds of the present invention;
FIG. 4 shows the method of determining the optimal detector of the present invention;
FIG. 5 shows the classification and grading method for adversarial samples of the present invention;
FIG. 6 shows the verification module of the present invention.
Detailed Description
The technical scheme of the invention is further explained below with reference to the accompanying drawings.
As shown in FIG. 1, the norm-based adversarial sample detection and classification method of the present invention includes the following steps:
S1, generating adversarial samples: selecting attack methods with different attack strengths to generate adversarial samples; as shown in FIG. 2, this specifically includes the following sub-steps:
S11, determining an adversarial attack method to generate adversarial samples;
S12, calculating a loss function L(x') during the generation of the adversarial sample for each attack, and generating adversarial samples according to the loss function:
x* = arg min L(x')  s.t.  d(x*, x') ≤ ε
where d(x*, x') ≤ ε constrains the distance between the adversarial sample x* and x' to be within a preset minimum ε;
S13, selecting different limiting conditions to constrain each generated adversarial sample to be not easily perceived; the limiting conditions comprise the L-0 norm, the L-2 norm and the L-∞ norm. Adversarial samples are collected using different optimization methods and different limiting conditions, the parameters are adjusted, and the attack strength is determined according to the attack success rate and confidence.
Common L-∞ attacks are FGSM, BIM and CW-L∞. FGSM (Fast Gradient Sign Method) is a fast and efficient adversarial attack method that performs only one gradient-update step at each pixel along the direction of the gradient sign. FGSM generation can be expressed as:
η = ε · sign(∇x Jθ(x, y))
x* = x + η
where sign(∇x Jθ(x, y)) is the sign of the gradient of the loss function Jθ(x, y) with respect to x, along which the adversarial sample is obtained.
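By way of illustration only, a minimal FGSM sketch in PyTorch follows; the names `model`, `x`, `y` and `epsilon` are hypothetical placeholders, and the clamp to a [0, 1] pixel range is an added assumption not stated in the formula above.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, epsilon):
    """One-step L-infinity attack: x* = x + epsilon * sign(grad_x J_theta(x, y))."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)   # J_theta(x, y)
    loss.backward()
    eta = epsilon * x.grad.sign()         # perturbation eta along the gradient sign
    return (x + eta).clamp(0.0, 1.0).detach()  # assumed [0, 1] pixel range
```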
BIM (Basic Iterative Method) generates an adversarial sample through multiple iterations based on FGSM and clips the pixel values in each iteration, thereby avoiding a large change to any single pixel. The process is:
x*_0 = x
x*_{i+1} = Clip_{x,ξ}( x*_i + ε · sign(∇x Jθ(x*_i, y)) )
where the function Clip_{x,ξ}(·) clips the adversarial sample generated in each iteration into a preset ξ-neighborhood of x, so as to keep the adversarial sample imperceptible.
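A corresponding BIM sketch under the same assumptions; the step size `alpha`, the step count `steps`, and the epsilon-ball clipping used to realize Clip_{x,ξ} are illustrative choices, not values from the patent.

```python
import torch
import torch.nn.functional as F

def bim(model, x, y, epsilon, alpha=0.01, steps=10):
    """Iterative FGSM; each iterate is clipped back into the epsilon-ball around x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)  # Clip_{x, xi}
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv
```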
The CW (Carlini & Wagner) attack is a strong adversarial attack method and has been shown to be effective against most existing adversarial-detection defenses. The CW attack defines a new objective function g(x) and turns the generation of the adversarial sample into the following problem:
min_η ||η||_p + c · g(x + η)
s.t. x + η ∈ [0, 1]^n
where ||η||_p is the p-norm constraint on the perturbation η, c is a constant, and x + η ∈ [0, 1]^n restricts the adversarial perturbation η to a valid range.
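A sketch of evaluating the CW objective for a targeted attack with p = 2; the margin form of g and the confidence parameter `kappa` follow the published CW formulation rather than anything specified in this patent, and all names are placeholders.

```python
import torch

def cw_objective(model, x, eta, target, c, kappa=0.0):
    """||eta||_2 + c * g(x + eta), where g stays positive until the target
    class dominates every other logit by at least kappa."""
    logits = model((x + eta).clamp(0.0, 1.0))
    target_logit = logits.gather(1, target.view(-1, 1)).squeeze(1)
    other_best = logits.scatter(1, target.view(-1, 1), float('-inf')).max(dim=1).values
    g = torch.clamp(other_best - target_logit + kappa, min=0.0)  # g(x + eta)
    return eta.flatten(1).norm(p=2, dim=1) + c * g
```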
The CW-L∞ attack is an iterative attack that adds a new penalty term in each iteration:
min c · g(x + η) + Σ_i max(0, η_i - τ)
where η is the adversarial perturbation; the constraint term in the objective function is replaced by a penalty on every component that exceeds τ (τ is initially 1 and decreases in each iteration).
Common L-2 attacks are DeepFool and CW-L2. The basic idea of DeepFool is to find the closest distance from the original input to the decision boundary as the adversarial perturbation. To overcome the problem of high-dimensional non-linearity, DeepFool uses an iterative attack with a linear approximation. For an affine binary classifier f(x) = w^T x + b, each step solves:
arg min_η ||η||_2  s.t.  f(x + η) = 0
whose closed-form solution is η*(x) = -( f(x) / ||w||_2² ) · w; for a general classifier, w is replaced by the local gradient ∇f(x).
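A single DeepFool step as a minimal sketch, assuming `f` returns one decision score per sample and that the affine closed form above is applied to the local linearization; the small constant added to the denominator is a numerical-stability assumption.

```python
import torch

def deepfool_step(f, x):
    """eta = -(f(x) / ||w||^2) * w, with w the gradient of f at x."""
    x = x.clone().detach().requires_grad_(True)
    fx = f(x)                                  # shape: (batch,)
    w = torch.autograd.grad(fx.sum(), x)[0]    # local linearization weights
    w_sq = w.flatten(1).pow(2).sum(dim=1) + 1e-12
    scale = (-fx / w_sq).view(-1, *([1] * (x.dim() - 1)))
    return (x + scale * w).detach()
```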
CW-L2 adds the following constraint to the problem posed by CW:
x_i + η_i = ( tanh(ω_i) + 1 ) / 2
where a new variable ω is introduced. Since -1 ≤ tanh(ω_i) ≤ 1, the formula above limits 0 ≤ x_i + η_i ≤ 1.
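A minimal sketch of this change of variables; optimizing the unconstrained `omega` with any gradient optimizer then keeps the perturbed pixels inside [0, 1] automatically.

```python
import torch

def tanh_reparam(omega):
    """x + eta = (tanh(omega) + 1) / 2, always inside [0, 1]."""
    return 0.5 * (torch.tanh(omega) + 1.0)

# e.g. set omega = torch.zeros_like(x, requires_grad=True) and minimize the
# CW objective over omega instead of over the raw perturbation eta.
```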
Common L-0 attacks are JSMA and CW-L0. JSMA (Jacobian-based Saliency Map Attack) designs a potent saliency map, hence the name Jacobian-based saliency map attack. First, the Jacobian matrix of a given sample x is calculated:
J_F(x) = ∂F(x)/∂x = [ ∂F_j(x)/∂x_i ]
Then an adversarial saliency map is defined based on the Jacobian matrix, and the features to modify in each iteration are selected. Because the L-0 norm is non-differentiable, CW-L0 performs the L-0 attack iteratively: in each iteration, pixels that contribute little to generating the adversarial example are removed, with the importance of a pixel determined by the gradient of the L-2 distance; if the remaining pixels fail to generate an adversarial example, the iteration stops.
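A sketch of the Jacobian computation that JSMA starts from, for a single input; looping over output classes is the simplest (not the fastest) way to assemble the matrix, and a single-sample input without a batch dimension is assumed.

```python
import torch

def jacobian(model, x):
    """J[j, i] = dF_j(x)/dx_i for the softmax outputs F of a single sample x."""
    x = x.clone().detach().requires_grad_(True)
    probs = torch.softmax(model(x.unsqueeze(0)), dim=1).squeeze(0)
    rows = [torch.autograd.grad(probs[j], x, retain_graph=True)[0].flatten()
            for j in range(probs.shape[0])]
    return torch.stack(rows)  # shape: (num_classes, num_input_features)
```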
S14, obtaining different adversarial samples by adjusting the attack iteration count and confidence of the adversarial attack method, observing the attack success rate and confidence of each adversarial sample, and determining the attack strength;
S15, dividing the obtained adversarial samples into two parts: one part is stored classified by limiting condition and attack strength, and the other part is used as test data after being labeled with limiting condition and attack strength.
S2, calculating the norms of the adversarial samples to obtain a classification threshold and a grading threshold; as shown in FIG. 3, this specifically includes the following sub-steps (a sketch of these computations follows the list):
S21, taking the adversarial samples stored by class in step S15 as input, and calculating the L-∞, L-2 and L-0 norm values of all adversarial samples; the calculation formulas are:
L-∞: ||x||∞ = max(|x1|, |x2|, ..., |xn|)
L-2: ||x||2 = √(x1² + x2² + ... + xn²)
L-0: ||x||0 = Count(xi ≠ 0);
S22, according to the classes into which the adversarial samples are divided, obtaining through statistical analysis the classification threshold γc = (c∞, c2, c0) of the adversarial samples under each norm constraint, where c∞, c2 and c0 are the thresholds of the L-∞, L-2 and L-0 norms, respectively;
S23, outputting the attack success rate and confidence of the adversarial samples, comparing attack strengths, and obtaining the grading threshold γg through statistical analysis; the attack success rate and confidence of the adversarial samples are calculated cyclically, and the grading threshold γg is updated.
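A sketch of the three norm values and of one plausible reading of the "statistical analysis" for γc; the patent does not fix the exact statistic, so taking the per-class mean of the norm triples is an assumption.

```python
import numpy as np

def norm_triple(x):
    """(L-infinity, L-2, L-0) of a flattened array, matching the formulas above."""
    v = np.asarray(x, dtype=float).ravel()
    return np.max(np.abs(v)), np.sqrt(np.sum(v ** 2)), float(np.count_nonzero(v))

def classification_threshold(samples):
    """gamma_c = (c_inf, c_2, c_0): here the mean norm triple over the stored
    adversarial samples of one norm-constrained class (assumed statistic)."""
    return np.array([norm_triple(s) for s in samples]).mean(axis=0)
```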
S3, determining a detector: determining a detector using a detection method based on improved prediction inconsistency. A prediction-inconsistency-based detection method yields different versions of a sample, which facilitates the norm calculation; at the same time it does not rely on a large number of adversarial samples, which greatly saves time and computational cost. Feature squeezing is used as the way of handling prediction inconsistency. Feature squeezing takes many forms; the invention focuses on two simple types of squeezing: reducing the color depth of the image, and using smoothing to reduce the differences between pixels. A standard digital image is represented by an array of pixels, each pixel typically represented as a number denoting a particular color.
Common image representations use color bit depths that introduce irrelevant features, so it is assumed that reducing the bit depth can reduce the adversary's opportunity without harming classifier accuracy. There are two common image representations: 8-bit grayscale and 24-bit color. A grayscale image provides 2^8 = 256 possible values per pixel; the 8-bit value represents the intensity of the pixel, where 0 is black, 255 is white, and intermediate numbers represent different shades of gray. Spatial smoothing is a family of techniques widely used in image processing to reduce image noise; a common method is median filtering. Different color-depth compression values and different spatial smoothers are selected, the detection rate for adversarial samples is weighed against the change in model accuracy, and the most suitable detector is determined. A sketch of the two squeezers follows.
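The two squeezers sketched below follow the feature-squeezing scheme cited in the non-patent literature (Xu et al.); the scipy median filter stands in for the spatial smoother, and pixel values are assumed to lie in [0, 1].

```python
import numpy as np
from scipy.ndimage import median_filter

def squeeze_bit_depth(x, bits):
    """Reduce color depth to `bits` bits per channel (1 bit -> 2 levels)."""
    levels = 2 ** bits - 1
    return np.round(np.asarray(x) * levels) / levels

def spatial_smooth(x, size=2):
    """Median smoothing over a size x size neighborhood."""
    return median_filter(x, size=size)
```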
As shown in FIG. 4, this specifically includes the following sub-steps (a sketch of this search follows the list):
S31, initializing the detector: setting the compressed color depth to 1 bit and the spatial smoother size to 2×2;
S32, inputting adversarial samples to obtain the detection rate and model accuracy of the detector, and storing them;
S33, changing the detector combination by gradually increasing the compressed color-depth bit count and the spatial-smoother size, repeating step S32, and updating the best detection rate and model accuracy;
S34, determining the optimal detector combination according to the best detection rate and model accuracy.
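A sketch of the S31-S34 search; `evaluate` is a hypothetical callback that squeezes the stored adversarial samples with a given (bit depth, smoother size) pair and returns the measured detection rate and model accuracy, and the candidate ranges are illustrative.

```python
def select_detector(evaluate, bit_depths=range(1, 9), smoother_sizes=(2, 3, 4)):
    """Grid-search detector combinations; prefer a higher detection rate,
    breaking ties by model accuracy."""
    best_combo, best_score = None, (-1.0, -1.0)
    for bits in bit_depths:
        for size in smoother_sizes:
            rate, acc = evaluate(bits, size)
            if (rate, acc) > best_score:
                best_combo, best_score = (bits, size), (rate, acc)
    return best_combo, best_score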
S4, compressing the target sample with the detector and calculating the norm triple L = (l∞, l2, l0) of the sample; comparing the calculated norm values with the thresholds obtained in step S2 to judge whether the target sample is an adversarial sample; if so, further obtaining the attack classification and attack strength of the adversarial sample; otherwise the sample is not processed. As shown in FIG. 5, this includes the following sub-steps (a sketch of the S42-S45 decision rule follows the list):
S41, according to the optimal detector obtained in step S3, inputting the adversarial sample into the optimal detector to obtain a compressed version of the original data, and calculating the L-∞, L-2 and L-0 norm values of the data;
S42, calculating the differences between the norm values obtained in S41 and the classification threshold γc = (c∞, c2, c0), and selecting the item with the smallest difference as the classification of the adversarial sample:
Δnorm = (|l∞ - c∞|, |l2 - c2|, |l0 - c0|)
classindex = argmin(Δnorm)
class = {L-∞, L-2, L-0}[classindex];
S43, calculating the sum of the norm differences:
sum = Σi Δnorm,i;
S44, comparing whether the sum of the differences is larger than the grading threshold γg; if so, judging the attack strength to be weak, otherwise judging it to be strong:
strength = weak if sum > γg, otherwise strong;
S45, outputting the classification and attack strength of the adversarial sample.
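A sketch of the S42-S45 decision; the comparison direction (a sum of differences greater than γg means a weak attack, per S44) is taken directly from the steps above, while the label tuple and vector layout are assumptions.

```python
import numpy as np

def classify_and_grade(l_triple, gamma_c, gamma_g):
    """Return (attack class, strength) from measured norms and thresholds."""
    labels = ('L-inf', 'L-2', 'L-0')
    delta = np.abs(np.asarray(l_triple) - np.asarray(gamma_c))  # Delta_norm
    attack_class = labels[int(np.argmin(delta))]                # smallest difference
    strength = 'weak' if delta.sum() > gamma_g else 'strong'    # S44 rule
    return attack_class, strength
```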
S5, verifying the rationality and validity of the whole detector: inputting a test sample and comparing whether the detected classification and grade are consistent with those of the original label; if so, the operation ends, otherwise return to step S1. As shown in FIG. 6, this includes the following sub-steps (a minimal sketch follows the list):
S51, inputting the test data obtained in step S15 into the detector obtained in step S3;
S52, comparing the detected classification and attack level with the classification and level of the original labels of the test set; if they are consistent, the verification passes and the operation ends; otherwise, return to step S1, modify the parameters, and adjust the detection method.
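A minimal sketch of the S51-S52 check; `detector` is assumed to return a (classification, grade) pair and `test_data` to carry the labels stored in step S15.

```python
def verify(detector, test_data):
    """Fraction of labeled test samples whose detected (class, grade)
    matches the stored labels; 1.0 means the verification passes."""
    hits = sum(detector(x) == (cls, grade) for x, cls, grade in test_data)
    return hits / len(test_data)
```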
It will be appreciated by those of ordinary skill in the art that the embodiments described here are intended to help the reader understand the principles of the invention, and that the scope of the invention is not limited to the specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and these changes and combinations remain within the scope of the invention.

Claims (5)

1. A norm-based method for detecting and classifying adversarial image samples, characterized by comprising the following steps:
S1, generating adversarial image samples: selecting attack methods with different attack strengths to generate adversarial image samples;
S2, calculating the norms of the adversarial image samples to obtain a classification threshold and a grading threshold;
S3, determining a detector: determining a detector using a detection method based on improved prediction inconsistency; comprising the following sub-steps:
S31, initializing the detector: setting the compressed color depth to 1 bit and the spatial smoother size to 2×2;
S32, inputting adversarial image samples, compressing the image samples with the detector to obtain the detection rate and model accuracy of the detector, and storing them;
S33, changing the detector combination by gradually increasing the compressed color-depth bit count and the spatial-smoother size, repeating step S32, and updating the best detection rate and model accuracy;
S34, determining the optimal detector according to the best detection rate and model accuracy;
S4, compressing the target image sample with the optimal detector and calculating the norm triple L = (l∞, l2, l0) of the image sample; comparing the calculated norm values with the thresholds obtained in step S2 to judge whether the target image sample is an adversarial image sample; if so, further obtaining the attack classification and attack strength of the adversarial image sample; otherwise the image sample is not processed;
S5, verifying the rationality and effectiveness of the optimal detector: inputting a test image sample and comparing whether the detected classification and grade are consistent with those of the original label; if so, the operation ends, otherwise return to step S1.
2. The norm-based adversarial image sample detection and classification method according to claim 1, wherein the step S1 includes the following sub-steps:
S11, determining an adversarial attack method to generate adversarial image samples;
S12, calculating a loss function L(x') during the generation of the adversarial image sample for each attack, and generating adversarial image samples according to the loss function:
x* = arg min L(x')  s.t.  d(x*, x') ≤ ε
where d(x*, x') ≤ ε constrains the distance between the adversarial image sample x* and x' to be within a preset minimum ε;
S13, selecting different limiting conditions to constrain each generated adversarial image sample to be not easily perceived; the limiting conditions comprise the L-0 norm, the L-2 norm and the L-∞ norm;
S14, obtaining different adversarial image samples by adjusting the attack iteration count and confidence of the adversarial attack method, observing the attack success rate and confidence of each adversarial image sample, and determining the attack strength;
S15, dividing the obtained adversarial image samples into two parts: one part is stored classified by limiting condition and attack strength, and the other part is used as test data after being labeled with limiting condition and attack strength.
3. The norm-based adversarial image sample detection and classification method according to claim 2, wherein the step S2 includes the following sub-steps:
S21, taking the adversarial image samples stored by class in step S15 as input, and calculating the L-∞, L-2 and L-0 norm values of all adversarial image samples; the calculation formulas are:
L-∞: ||x||∞ = max(|x1|, |x2|, ..., |xn|)
L-2: ||x||2 = √(x1² + x2² + ... + xn²)
L-0: ||x||0 = Count(xi ≠ 0);
S22, according to the classes into which the adversarial image samples are divided, obtaining through statistical analysis the classification threshold γc = (c∞, c2, c0) of the adversarial image samples under each norm constraint, where c∞, c2 and c0 are the thresholds of the L-∞, L-2 and L-0 norms, respectively;
S23, outputting the attack success rate and confidence of the adversarial image samples, comparing attack strengths, and obtaining the grading threshold γg through statistical analysis; the attack success rate and confidence of the adversarial image samples are calculated cyclically, and the grading threshold γg is updated.
4. The norm-based adversarial image sample detection and classification method according to claim 1, wherein the step S4 includes the following sub-steps:
S41, according to the optimal detector obtained in step S3, inputting the adversarial image sample into the optimal detector to obtain a compressed version of the original data, and calculating the L-∞, L-2 and L-0 norm values of the data;
S42, calculating the differences between the norm values obtained in S41 and the classification threshold γc = (c∞, c2, c0), and selecting the item with the smallest difference as the classification of the adversarial image sample;
S43, calculating the sum of the norm differences;
S44, comparing whether the sum of the differences is larger than the grading threshold γg; if so, judging the attack strength to be weak, otherwise judging it to be strong;
S45, outputting the classification and attack strength of the adversarial image sample.
5. The norm-based adversarial image sample detection and classification method according to claim 1, wherein the step S5 includes the following sub-steps:
S51, inputting the test data obtained in step S15 into the detector obtained in step S3;
S52, comparing the detected classification and attack level with the classification and level of the original labels of the test set; if they are consistent, the verification passes and the operation ends; otherwise, return to step S1, modify the parameters, and adjust the detection method.
CN201911174658.3A 2019-11-26 2019-11-26 Norm-based adversarial sample detection and classification method Active CN110942094B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911174658.3A CN110942094B (en) 2019-11-26 2019-11-26 Norm-based adversarial sample detection and classification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911174658.3A CN110942094B (en) 2019-11-26 2019-11-26 Norm-based adversarial sample detection and classification method

Publications (2)

Publication Number Publication Date
CN110942094A CN110942094A (en) 2020-03-31
CN110942094B true CN110942094B (en) 2022-04-01

Family

ID=69908132

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911174658.3A Active CN110942094B (en) 2019-11-26 2019-11-26 Norm-based adversarial sample detection and classification method

Country Status (1)

Country Link
CN (1) CN110942094B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111753880B (en) * 2020-05-27 2023-06-27 华东师范大学 Image classification method for avoiding challenge sample attack
CN112241532B (en) * 2020-09-17 2024-02-20 北京科技大学 Method for generating and detecting malignant countermeasure sample based on jacobian matrix
CN112488321B (en) * 2020-12-07 2022-07-01 重庆邮电大学 Antagonistic machine learning defense method oriented to generalized nonnegative matrix factorization algorithm
CN115205608B (en) * 2022-09-15 2022-12-09 杭州涿溪脑与智能研究所 Adaptive image countermeasure sample detection and defense method based on compressed sensing

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110175646A (en) * 2019-05-27 2019-08-27 浙江工业大学 Multichannel confrontation sample testing method and device based on image transformation
EP3543917A1 (en) * 2018-03-19 2019-09-25 SRI International Inc. Dynamic adaptation of deep neural networks
CN110334749A (en) * 2019-06-20 2019-10-15 浙江工业大学 Confrontation attack defending model, construction method and application based on attention mechanism

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190266483A1 (en) * 2018-02-27 2019-08-29 Facebook, Inc. Adjusting a classification model based on adversarial predictions

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3543917A1 (en) * 2018-03-19 2019-09-25 SRI International Inc. Dynamic adaptation of deep neural networks
CN110175646A (en) * 2019-05-27 2019-08-27 浙江工业大学 Multichannel confrontation sample testing method and device based on image transformation
CN110334749A (en) * 2019-06-20 2019-10-15 浙江工业大学 Confrontation attack defending model, construction method and application based on attention mechanism

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks";Xu W;《arXiv》;20171205;第1-15页 *
"深度学习中的对抗样本问题";张思思;《计算机学报》;20190831;第1-19页 *

Also Published As

Publication number Publication date
CN110942094A (en) 2020-03-31


Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant