CN115063654A - Black box attack method based on sequential meta-learning, storage medium and electronic equipment - Google Patents

Black box attack method based on sequential meta-learning, storage medium and electronic equipment

Info

Publication number
CN115063654A
CN115063654A (application CN202210639789.XA)
Authority
CN
China
Prior art keywords
delta
disturbance
directional
model
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210639789.XA
Other languages
Chinese (zh)
Other versions
CN115063654B (en)
Inventor
Weng Juanjuan (翁娟娟)
Luo Zhiming (罗志明)
Cao Donglin (曹冬林)
Jiang Min (江敏)
Li Shaozi (李绍滋)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen University filed Critical Xiamen University
Priority to CN202210639789.XA
Publication of CN115063654A
Application granted
Publication of CN115063654B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/778 - Active pattern-learning, e.g. online learning of image or video features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a black-box attack method based on sequential meta-learning, a storage medium and electronic equipment. The method samples a batch of images from a proxy data set and, based on these images and a set of classifier models, optimizes a first targeted perturbation for each model f_i in the set one by one in sequence, obtaining a group of targeted adversarial perturbations, one per model. A second, model-agnostic targeted perturbation δ is then generated from these targeted adversarial perturbations: the update direction of δ is determined from the per-model perturbations, and δ is optimized along that direction. When the number of iterations reaches a preset number, the current second targeted perturbation is output. By mining all of the observed models to optimize a shared perturbation, the scheme effectively improves the transferability of the targeted universal perturbation.

Description

Black box attack method based on sequential meta-learning, storage medium and electronic equipment
Technical Field
The invention relates to the technical field of artificial intelligence security, and in particular to a black-box attack method based on sequential meta-learning, a storage medium and electronic equipment.
Background
Artificial intelligence techniques typified by deep learning are now applied across many industries, for example in image classification, object detection and image segmentation. However, as research in artificial intelligence deepens, the convenience it brings is accompanied by corresponding security risks. For example, a new image generated by adding a subtle additive perturbation, imperceptible to the human eye, to an original image can cause a deep neural network (DNN) to produce a wrong prediction. Because deep neural networks are highly susceptible to such adversarial examples, adversarial attacks have become a research hotspot in the field of artificial intelligence security.
In recent years, many algorithms for adversarial example generation have been proposed.
First, these algorithms can be divided into white-box attacks and black-box attacks according to whether the attacker knows the structure, parameters, gradients, etc. of the deep neural network. In a white-box attack the information of the target model is known; in a black-box attack the structure and parameters of the target model are unknown and only the output classes or confidences can be obtained, and common algorithms either rely on gradient estimation or exploit the transferability of adversarial examples to fool the target model. In real-world applications an attacker usually cannot obtain the target model's information, so research on black-box attacks is of practical significance.
Second, these algorithms can be divided into targeted attacks and untargeted attacks according to whether the target network is required to misclassify the adversarial example into a specified class. An untargeted attack succeeds as soon as the target classifier misjudges the adversarial example as any incorrect class, whereas a targeted attack is a special, more difficult case in which the adversarial example must be misclassified into a specified class.
Untargeted black-box attacks have been studied extensively, but targeted black-box attacks (which correspond to more realistic scenarios) remain a significant challenge. Existing targeted black-box attacks often suffer from the following drawback: a perturbation must be trained separately for each image, which is computationally expensive.
Disclosure of Invention
Therefore, a technical scheme for black-box attack based on sequential meta-learning needs to be provided, so that, without any source-domain training data, a more essential targeted universal adversarial perturbation shared by multiple models can be mined; such a perturbation has better transferability and can be reused to attack unknown models, solving the problems of heavy computation and poor transferability of existing black-box attack algorithms when generating targeted perturbations.
In a first aspect, the present invention provides a black-box attack method based on sequential meta-learning, including the following steps:
S1: obtaining a proxy data set D and a set of classifier models F = {f_1, f_2, …, f_n};
S2: sampling a batch of images x_b from the proxy data set D, and, based on the images x_b and the classifier model set, optimizing a first targeted perturbation δ_i for each model f_i in the classifier model set one by one in sequence, obtaining a set of targeted adversarial perturbations {δ_1, δ_2, …, δ_n} corresponding to the classifier models {f_1, f_2, …, f_n}; each targeted adversarial perturbation δ_i is associated with the corresponding model f_i in the classifier model set;
S3: generating a second, model-agnostic targeted perturbation δ based on the targeted adversarial perturbations, determining the update direction of the second targeted perturbation δ from {δ_1, δ_2, …, δ_n}, and optimizing the second targeted perturbation δ along the determined update direction;
repeating steps S2-S3 to iteratively update the second targeted perturbation δ;
S4: when the number of iterations reaches a preset number, outputting the current second targeted perturbation δ;
S5: adding the second targeted perturbation δ obtained in step S4 to all images in the verification set, and feeding all verification-set images with the second targeted perturbation δ added into the unknown black-box model under attack.
Further, step S1 further includes: acquiring training parameters;
the training parameters include an upper bound ε that controls the perturbation magnitude, the number of images x_b sampled each time, and the preset number of iterations.
Further, step S2 is implemented by an inner-loop module, which, when executing step S2, specifically includes:
S21: randomly shuffling the classifier model set F, and then selecting a model f_i in the shuffled order;
S22: initializing the perturbation δ_i of the current attack task on model f_i with the targeted perturbation δ, i.e. setting δ_i = δ; the initial value of the targeted perturbation δ is 0;
S23: using the perturbation δ_{i-1} produced for the previous model f_{i-1} and the perturbation δ_i of the current attack task on model f_i to perturb the images x_b separately, producing two partial adversarial batches x_b + δ_{i-1} and x_b + δ_i, and concatenating the two partial batches into an overall adversarial batch x_b^adv = [x_b + δ_{i-1}; x_b + δ_i];
S24: feeding the overall adversarial batch x_b^adv obtained in step S23 into the current classification model f_i, and computing a loss value with a cross-entropy loss function so that the overall adversarial batch x_b^adv is misclassified as the target class t; the cross-entropy loss is computed as in equation (1):
ℒ(x_b^adv, t) = - l_t · log f_i(x_b^adv)    (1)
where l_t is the one-hot encoding of the target class t;
S25: computing the gradient and updating the perturbation δ_i of the current attack task on model f_i, as in equation (2):
δ_i ← δ_i - β ∇_{δ_i} ℒ(x_b^adv, t)    (2)
where β is the inner-loop learning rate.
Further, step S24 is further followed by:
S25: constraining the perturbation δ_i of the current attack task on model f_i with an L_p-norm bound so that δ_i does not exceed the maximum allowed magnitude ε, with the constraint computed as in equation (3):
δ_i ← Proj_{‖δ‖_p ≤ ε}(δ_i)    (3)
if the currently computed δ_i is the one corresponding to the last model f_n in the classifier model set F, go to step S3; otherwise, return to step S21.
Further, step S3 is implemented by an outer-loop module, which, when executing step S3, specifically includes:
S31: denoting the perturbation before the inner-loop update as δ_base, computing for each model f_i the difference δ_base - δ_i between the perturbation before and after its update, and determining the outer-loop gradient update from all of the computed differences, namely:
g = (1/n) Σ_{i=1}^{n} (δ_base - δ_i)
S32: driving the resulting perturbation δ from δ_base towards the average gradient of attacking each model f_i, computed as in equation (4):
δ ← δ_base - α · (1/n) Σ_{i=1}^{n} (δ_base - δ_i)    (4)
where α is the outer-loop step size and n is the number of models in the classifier model set F.
in a second aspect, the present invention provides a storage medium storing a computer program which when executed implements a method comprising:
s1: obtaining a proxy dataset
Figure RE-GDA0003803944990000047
And a set of classifier models
Figure RE-GDA0003803944990000045
S2: from the proxy data set
Figure RE-GDA0003803944990000048
Middle sampling of several images x b Based on said number of images x b And the classifier model set optimizes each f in the classifier model set one by one according to the order i First directional disturbance δ i Obtaining a set of parameters corresponding to the set of classifier models { f 1 ,f 2 ,…,f n Relative directional countermeasure perturbation [ delta ] 1 ,δ 2 ,…,δ n }; each of the orientations opposes the disturbance δ i All corresponding to f in the set of classifier models i Associating;
s3: generating a second model-independent directional disturbance δ based on the directional counterdisturbance, according to which the directional counterdisturbance { δ } 1 ,δ 2 ,…,δ n Determining the updating direction of the second directional disturbance delta, and optimizing the second directional disturbance delta according to the determined updating direction of the second directional disturbance delta;
repeating steps S2-S3 to iteratively update the second directional perturbation δ;
s4: when the iteration times reach preset times, outputting a current second directional disturbance delta;
s5: and adding the second directional disturbance delta obtained in the step S4 to all the images in the verification set, and inputting all the images in the verification set to which the second directional disturbance delta is added into the attacked unknown black box model.
Further, the computer program includes a first computer program which, when executed, implements step S2, specifically including:
S21: randomly shuffling the classifier model set F, and then selecting a model f_i in the shuffled order;
S22: initializing the perturbation δ_i of the current attack task on model f_i with the targeted perturbation δ, i.e. setting δ_i = δ; the initial value of the targeted perturbation δ is 0;
S23: using the perturbation δ_{i-1} produced for the previous model f_{i-1} and the perturbation δ_i of the current attack task on model f_i to perturb the images x_b separately, producing two partial adversarial batches x_b + δ_{i-1} and x_b + δ_i, and concatenating the two partial batches into an overall adversarial batch x_b^adv = [x_b + δ_{i-1}; x_b + δ_i];
S24: feeding the overall adversarial batch x_b^adv obtained in step S23 into the current classification model f_i, and computing a loss value with a cross-entropy loss function so that the overall adversarial batch x_b^adv is misclassified as the target class t; the cross-entropy loss is computed as in equation (1):
ℒ(x_b^adv, t) = - l_t · log f_i(x_b^adv)    (1)
where l_t is the one-hot encoding of the target class t;
S25: computing the gradient and updating the perturbation δ_i of the current attack task on model f_i, as in equation (2):
δ_i ← δ_i - β ∇_{δ_i} ℒ(x_b^adv, t)    (2)
where β is the inner-loop learning rate.
Further, the first computer program, when executed, further implements, after step S24, the following step:
S25: constraining the perturbation δ_i of the current attack task on model f_i with an L_p-norm bound so that δ_i does not exceed the maximum allowed magnitude ε, with the constraint computed as in equation (3):
δ_i ← Proj_{‖δ‖_p ≤ ε}(δ_i)    (3)
if the currently computed δ_i is the one corresponding to the last model f_n in the classifier model set F, go to step S3; otherwise, return to step S21.
Further, the computer program includes a second computer program which, when executed, implements step S3, specifically including:
S31: denoting the perturbation before the inner-loop update as δ_base, computing for each model f_i the difference δ_base - δ_i between the perturbation before and after its update, and determining the outer-loop gradient update from all of the computed differences, namely:
g = (1/n) Σ_{i=1}^{n} (δ_base - δ_i)
S32: driving the resulting perturbation δ from δ_base towards the average gradient of attacking each model f_i, computed as in equation (4):
δ ← δ_base - α · (1/n) Σ_{i=1}^{n} (δ_base - δ_i)    (4)
where α is the outer-loop step size and n is the number of models in the classifier model set F.
In a third aspect, the present invention also provides an electronic device comprising a processor and a storage medium, the storage medium being the storage medium of the second aspect;
the processor is configured to execute the computer program stored in the storage medium to perform the method steps of the first aspect.
Different from the prior art, the invention provides a black-box attack method based on sequential meta-learning, a storage medium and electronic equipment, wherein the method includes the following steps: S1: obtaining a proxy data set D and a set of classifier models F = {f_1, f_2, …, f_n}; S2: sampling a batch of images x_b from the proxy data set D, and, based on the images x_b and the classifier model set, optimizing a first targeted perturbation δ_i for each model f_i in the classifier model set one by one in sequence, obtaining a set of targeted adversarial perturbations {δ_1, δ_2, …, δ_n} corresponding to the classifier models {f_1, f_2, …, f_n}; S3: generating a second, model-agnostic targeted perturbation δ based on the targeted adversarial perturbations, determining the update direction of the second targeted perturbation δ from {δ_1, δ_2, …, δ_n}, and optimizing the second targeted perturbation δ along the determined update direction; repeating steps S2-S3 to iteratively update the second targeted perturbation δ; S4: when the number of iterations reaches a preset number, outputting the current second targeted perturbation δ; S5: adding the second targeted perturbation δ obtained in step S4 to all images in the verification set, and feeding all verification-set images with the second targeted perturbation δ added into the unknown black-box model under attack. By mining all of the observed models to optimize a shared perturbation, the invention effectively improves the transferability of the targeted universal perturbation.
Drawings
Fig. 1 is a flowchart of a black-box attack method based on sequential meta-learning according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a black-box attack method based on sequential meta-learning according to a second embodiment of the present invention;
Fig. 3 is a flowchart of a black-box attack method based on sequential meta-learning according to a third embodiment of the present invention;
Fig. 4 is a schematic diagram of the algorithm modules of a black-box attack method based on sequential meta-learning according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of an application scenario according to an embodiment of the present invention;
Fig. 6 is a schematic illustration of the universal adversarial perturbations of the class "owl" generated by different algorithms;
Fig. 7 is a comparison of the classification results of the attacked unknown model before and after adding the universal perturbation according to the present invention;
Fig. 8 is a block diagram of an electronic device according to an embodiment of the present invention;
reference numerals are as follows:
10. an electronic device;
101. a processor;
102. a storage medium.
Detailed Description
In order to explain in detail possible application scenarios, technical principles, practical embodiments, and the like of the present application, the following detailed description is given with reference to the accompanying drawings in conjunction with the listed embodiments. The embodiments described herein are merely for more clearly illustrating the technical solutions of the present application, and therefore, the embodiments are only used as examples, and the scope of the present application is not limited thereby.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase "an embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or related to other embodiments specifically defined. In principle, in the present application, the technical features mentioned in the embodiments can be combined in any manner to form a corresponding implementable technical solution as long as there is no technical contradiction or conflict.
Unless defined otherwise, technical terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the use of relational terms herein is intended only to describe particular embodiments and is not intended to limit the present application.
In the description of the present application, the term "and/or" is an expression describing a logical relationship between objects, meaning that three relationships may exist; for example, A and/or B means: A alone, B alone, or both A and B. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship.
In this application, terms such as "first" and "second" are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
In the present application, without further limitation, the words "comprise," "include," "have" or other similar expressions used in the language of the claims are intended to cover non-exclusive inclusions, which do not exclude the presence of additional elements in a process, method or article comprising elements, such that a process, method or article comprising a list of elements may include not only those elements but also other elements not expressly listed or inherent to such process, method or article.
Consistent with the understanding in the examination guidelines, terms such as "greater than", "less than" and "more than" in this application are understood to exclude the stated number, while "above", "below" and "within" are understood to include it. In addition, in the description of the embodiments of the present application, "a plurality" means two or more (including two), and related expressions such as "a plurality of groups" and "a plurality of times" are understood in the same way unless specifically defined otherwise.
As shown in fig. 1, in a first aspect, the present invention provides a black-box attack method based on sequential meta-learning, including the following steps:
S1: acquiring a proxy data set and a set of classifier models;
S2: sampling a batch of images from the proxy data set, and, based on the images and the classifier model set, optimizing a first targeted perturbation for each model f_i in the classifier model set one by one in order, obtaining a group of targeted adversarial perturbations corresponding to the models f_i in the classifier model set;
S3: generating a second, model-agnostic targeted perturbation based on the targeted adversarial perturbations, determining the update direction of the second targeted perturbation from the targeted adversarial perturbations, and optimizing the second targeted perturbation along the determined update direction;
S4: when the number of iterations reaches a preset number, outputting the current second targeted perturbation;
S5: adding the second targeted perturbation obtained in step S4 to all images in the verification set, and feeding all verification-set images with the second targeted perturbation added into the unknown black-box model under attack.
Preferably, step S1 further includes: acquiring training parameters. The training parameters include an upper bound ε that controls the perturbation magnitude, the number of images x_b sampled each time, and the preset number of iterations. By setting the upper bound ε, the magnitude of the perturbation can be limited so that the maximum change of each image pixel does not exceed this bound. By setting the number of images x_b sampled each time, the images are trained in batches, so that training proceeds in an orderly way. By setting the preset number of iterations, the current targeted perturbation is output once the number of iterations reaches the preset number.
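As an illustration only, these training parameters might be grouped into a single configuration object; the field names below are hypothetical (the patent only specifies the quantities), and the default values follow the experimental settings given later in this description:

```python
# Hypothetical configuration holding the training parameters of step S1.
from dataclasses import dataclass

@dataclass
class AttackConfig:
    epsilon: float = 10.0     # upper bound on the perturbation magnitude (0-255 pixel range)
    batch_size: int = 32      # number of images x_b sampled per iteration
    iterations: int = 4000    # preset number of training iterations
    target_class: int = 24    # index of the target class t (e.g. "grey owl")
    inner_lr: float = 0.005   # inner-loop learning rate beta (Adam in the experiments)
    outer_lr: float = 1.0     # outer-loop step size (assumed symbol alpha, not given in the text)

cfg = AttackConfig()
```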
As shown in fig. 2, in some embodiments, step S2 is implemented by an inner-loop module, which, when executing step S2, specifically includes:
S21: randomly shuffling the models in the classifier model set, and then selecting a model f_i in the shuffled order;
S22: initializing the perturbation δ_i of the current attack task on model f_i with the targeted perturbation δ, i.e. setting δ_i = δ; the initial value of the targeted perturbation δ is 0; in this way, the adversarial perturbation is guaranteed to start from the same point for every model (task);
S23: using the perturbation δ_{i-1} produced for the previous model f_{i-1} and the perturbation δ_i of the current attack task on model f_i to perturb the images x_b separately, producing two partial adversarial batches x_b + δ_{i-1} and x_b + δ_i, and concatenating the two partial batches into an overall adversarial batch x_b^adv = [x_b + δ_{i-1}; x_b + δ_i];
S24: feeding the overall adversarial batch x_b^adv obtained in step S23 into the current classification model f_i, and computing a loss value with a cross-entropy loss function so that the overall adversarial batch x_b^adv is misclassified as the target class t; the cross-entropy loss is computed as in equation (1):
ℒ(x_b^adv, t) = - l_t · log f_i(x_b^adv)    (1)
where l_t is the one-hot encoding of the target class t;
S25: computing the gradient and updating the perturbation δ_i of the current attack task on model f_i, as in equation (2):
δ_i ← δ_i - β ∇_{δ_i} ℒ(x_b^adv, t)    (2)
where β is the inner-loop learning rate.
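A minimal PyTorch-style sketch of one inner-loop update (steps S23 to S25) under the reconstruction above; the function and variable names are assumptions, the loss is the targeted cross-entropy of equation (1), and a single plain gradient step stands in for whatever optimizer the implementation actually uses:

```python
import torch
import torch.nn.functional as F

def inner_step(f_i, x_b, delta_i, delta_prev, target_class, beta):
    """One inner-loop update of the task perturbation delta_i for model f_i (sketch)."""
    delta_i = delta_i.clone().detach().requires_grad_(True)
    # S23: perturb the batch with the previous task's perturbation and the current one,
    # then concatenate the two partial adversarial batches into one overall batch.
    x_adv = torch.cat([x_b + delta_prev, x_b + delta_i], dim=0)
    # S24: targeted cross-entropy loss of equation (1), pushing predictions towards class t.
    logits = f_i(x_adv)
    t = torch.full((x_adv.size(0),), target_class, dtype=torch.long, device=x_adv.device)
    loss = F.cross_entropy(logits, t)
    # S25: gradient step on delta_i, equation (2); valid-pixel-range clipping omitted for brevity.
    loss.backward()
    with torch.no_grad():
        delta_i = delta_i - beta * delta_i.grad
    return delta_i.detach()
```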
further, in order to make the disturbance small enough to be imperceptible to human beings, in some embodiments, step S24 is followed by:
s25: by means of L p Norm measurement for constraining current attack model f i Delta of task i Make delta i The maximum allowable limit e is not exceeded, and the specific constraint calculation mode is shown as the formula (3):
Figure RE-GDA0003803944990000111
if the currently calculated delta i As a set of said classifier models
Figure RE-GDA0003803944990000115
Last model f i Corresponding delta i Jumping to perform step S3; otherwise, step S21 is executed.
As shown in fig. 3, in some embodiments, step S3 is implemented by an outer-loop module, which, when executing step S3, specifically includes:
S31: computing for each model f_i the difference between the perturbation before and after its update, and determining the outer-loop gradient update from all of the computed differences; denoting the perturbation before the inner-loop update as δ_base, the difference for model f_i is δ_base - δ_i, and the gradient update is computed as:
g = (1/n) Σ_{i=1}^{n} (δ_base - δ_i)
S32: driving the second targeted perturbation to be optimized towards the average gradient of attacking each model, i.e. driving the second targeted perturbation δ from δ_base towards attacking each model f_i, computed as in equation (4):
δ ← δ_base - α · (1/n) Σ_{i=1}^{n} (δ_base - δ_i)    (4)
where α is the outer-loop step size and n is the number of models in the classifier model set F.
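Under this reconstruction, the outer-loop step is a Reptile-style first-order meta-update; a minimal sketch, where the step-size symbol alpha is an assumption not spelled out in the text:

```python
import torch

def outer_update(delta_base, task_deltas, alpha):
    """Equation (4): move the shared perturbation towards the mean per-model update (sketch)."""
    mean_diff = torch.stack([d - delta_base for d in task_deltas]).mean(dim=0)
    # Equivalent to delta_base - alpha * mean(delta_base - delta_i).
    return delta_base + alpha * mean_diff
```

Averaging the per-task differences, rather than following any single model's gradient, is what keeps the shared perturbation from being biased towards one surrogate model.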
The scheme of the present application has the following beneficial effects:
(1) The invention adopts a meta-learning training strategy, optimizes the shared universal adversarial perturbation (UAP) by mining all of the observed models, improves the transferability of the targeted universal perturbation, and avoids the problem that a traditional ensemble strategy is biased towards one model.
(2) The invention provides a novel sequential meta-learning framework that trains the targeted universal perturbation sequentially while retaining the knowledge learned from the current and previous models, so that targeted transferability is further improved.
(3) The invention does not need any data from the source domain, and the algorithm is compatible with a variety of existing loss functions.
Aimed at targeted black-box attacks, the invention provides a targeted black-box attack algorithm based on sequential meta-learning. Compared with other targeted black-box attack methods, it exploits the advantage of meta-learning over an ensemble of models to generate a single targeted universal adversarial perturbation without using any source data, improving the transferability of targeted adversarial examples: the universal adversarial perturbation trained on the source models can be reused and can successfully attack unknown models.
The method of the present invention is further described below with reference to specific examples, which are provided for illustration and explanation only and are not intended to be limiting.
First, for the proxy data set D in step S1, the MS-COCO data set from the object-detection domain may be selected, and several (e.g. 3) models pre-trained on the ImageNet data set may be selected. The other training parameters include: the batch size (the number of pictures input at a time) N_b = 32, the number of training iterations I = 4000, and the perturbation magnitude ε = 10 (the pixel values of the images range from 0 to 255). The pre-trained models may be, among others, the GoogleNet, VGG16 and ResNet50 models.
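For reference, the three surrogate models mentioned above can be loaded as ImageNet-pretrained networks from torchvision; a sketch (the `pretrained=True` keyword matches the torchvision 0.5.0 version cited in the simulation settings below, and the helper name is an assumption):

```python
import torchvision.models as models

def load_surrogates(device="cuda"):
    """Load the white-box surrogate models used to train the universal perturbation (sketch)."""
    surrogates = [
        models.googlenet(pretrained=True),
        models.vgg16(pretrained=True),
        models.resnet50(pretrained=True),
    ]
    for f in surrogates:
        f.eval().to(device)
        for p in f.parameters():
            p.requires_grad_(False)   # only the perturbation is optimised, never the models
    return surrogates
```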
Next, in step S23, when the first model f_1 is selected to optimize δ_1, both partial adversarial batches generated at this point are x_b + δ_1 (with δ_1 initialized to δ_base), and they are concatenated into a whole, i.e. x_b^adv = [x_b + δ_1; x_b + δ_1]. In step S24 the target class may be t = 24 (grey owl); in step S25 an Adam optimizer may be used to update the gradient, with the learning rate β of the optimizer set to 0.005. Finally, a set of targeted universal adversarial perturbations can be optimized sequentially according to the Inner loop module of fig. 4: three corresponding universal perturbations of the owl class, {δ_1, δ_2, δ_3}, are generated for the three models (GoogleNet, VGG16 and ResNet50). The process is as follows: the adversarial examples generated from δ_1, initialized with δ_base, attack the GoogleNet model so that δ_1 is updated for the current model; then δ_2, initialized with δ_base, together with the δ_1 produced for the previous GoogleNet model, generates the two-part adversarial batch that attacks the current VGG16 model, so that δ_2 is updated; and so on.
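Putting the walk-through above into code, one pass of the inner loop over the shuffled surrogates might look like the following sketch; `inner_step` is the hypothetical helper from the earlier sketch, every task starts from the shared δ_base, and the previous task's perturbation is carried along to build the two-part batch:

```python
import random

def inner_loop(surrogates, x_b, delta_base, target_class, beta):
    """Sequentially optimise one task perturbation per surrogate model (sketch)."""
    order = list(range(len(surrogates)))
    random.shuffle(order)                    # S21: random ordering of the models
    task_deltas = []
    delta_prev = delta_base.clone()          # for the first model both parts use delta_base
    for i in order:
        delta_i = delta_base.clone()         # S22: each task starts from the shared perturbation
        delta_i = inner_step(surrogates[i], x_b, delta_i, delta_prev, target_class, beta)
        task_deltas.append(delta_i)
        delta_prev = delta_i                 # the next task also sees this perturbation (S23)
    return task_deltas
```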
Third, following the Outer loop module of fig. 4, a model-agnostic perturbation δ of the owl class is obtained: given the three perturbations {δ_1, δ_2, δ_3} from step S2 and δ_base, the perturbation δ shared by the three models is optimized using equation (4) (without biasing towards any of the three models):
δ ← δ_base - α · (1/3) Σ_{i=1}^{3} (δ_base - δ_i)
Finally, the output is the perturbation δ for the owl class.
The effect of the method of the present invention is further described below with reference to a simulation experiment.
1. Simulation experiment settings:
The hardware platform used in the simulation experiment may be two NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] GPUs, each with 11 GB of video memory. The Python version used in the simulation experiment is Python 3.8.12, and the libraries and corresponding versions used are torch 1.4.0 and torchvision 0.5.0.
2. Simulation content and results:
The simulation scenario of the invention is shown in fig. 5 and mainly performs a targeted adversarial attack on an actual image classification system. First, a proxy data set (MS-COCO) and a set of white-box models (GoogleNet, VGG16 and ResNet50) are used; then, a universal adversarial perturbation (UAP) of a specific class (owl) is trained with the sequential meta-learning algorithm of the invention; finally, the UAP is added to the verification-set pictures of all ImageNet classes except the target class to form targeted adversarial examples, which are input into the attacked unknown black-box model (VGG19_BN), causing it to mistake all input pictures for the owl class.
The simulation results are shown in fig. 6 and fig. 7. In fig. 6, the left picture (SMAML) is the universal adversarial perturbation (UAP) of the owl class generated with the sequential meta-learning algorithm of the invention, the middle picture (MAML) is the owl-class universal adversarial perturbation generated with a standard meta-learning algorithm, and the right picture (MIM) is the owl-class universal adversarial perturbation generated with a conventional ensemble algorithm. An owl pattern can clearly be observed in all three perturbations, but the owl pattern of the left SMAML picture is at the center of the image and contains denser semantic features (texture information of the owl) than the middle MAML and right MIM pictures.
The first row of fig. 7 lists original images of three different classes (green snake, spider, homing pigeon); when input directly into the VGG19_BN model, they are classified correctly with high confidence (0.898, 0.995, 1.000). The second row of fig. 7 shows the images after adding the owl-class UAP (i.e. the left SMAML perturbation in fig. 6) to the original images; the resulting adversarial examples are fed into the VGG19_BN model, and the adversarial images are all classified as the target class "owl" (with confidences of 0.941, 1.000 and 0.507, respectively), further showing that the invention can carry out a targeted adversarial attack on an unknown black-box model.
According to the above scheme, a meta-learning framework is introduced, and the defect of traditional multi-model ensemble methods being biased towards a particular model is overcome by deriving a shared perturbation from multiple deep models, thereby improving targeted transferability. In addition, to further improve transferability, a novel sequential meta-learning method is devised, which trains the targeted universal adversarial perturbation while retaining the information of both the current and the previous models, thereby improving the success rate of adversarial examples attacking unknown models.
In a second aspect, the present invention provides a storage medium storing a computer program which, when executed, implements a method comprising the following steps:
S1: obtaining a proxy data set D and a set of classifier models F = {f_1, f_2, …, f_n};
S2: sampling a batch of images x_b from the proxy data set D, and, based on the images x_b and the classifier model set, optimizing a first targeted perturbation δ_i for each model f_i in the classifier model set one by one in sequence, obtaining a set of targeted adversarial perturbations {δ_1, δ_2, …, δ_n} corresponding to the classifier models {f_1, f_2, …, f_n}; each targeted adversarial perturbation δ_i is associated with the corresponding model f_i in the classifier model set;
S3: generating a second, model-agnostic targeted perturbation δ based on the targeted adversarial perturbations, determining the update direction of the second targeted perturbation δ from {δ_1, δ_2, …, δ_n}, and optimizing the second targeted perturbation δ along the determined update direction;
repeating steps S2-S3 to iteratively update the second targeted perturbation δ;
S4: when the number of iterations reaches a preset number, outputting the current second targeted perturbation δ;
S5: adding the second targeted perturbation δ obtained in step S4 to all images in the verification set, and feeding all verification-set images with the second targeted perturbation δ added into the unknown black-box model under attack.
Further, the computer program includes a first computer program which, when executed, implements step S2, specifically including:
S21: randomly shuffling the classifier model set F, and then selecting a model f_i in the shuffled order;
S22: initializing the perturbation δ_i of the current attack task on model f_i with the targeted perturbation δ, i.e. setting δ_i = δ; the initial value of the targeted perturbation δ is 0;
S23: using the perturbation δ_{i-1} produced for the previous model f_{i-1} and the perturbation δ_i of the current attack task on model f_i to perturb the images x_b separately, producing two partial adversarial batches x_b + δ_{i-1} and x_b + δ_i, and concatenating the two partial batches into an overall adversarial batch x_b^adv = [x_b + δ_{i-1}; x_b + δ_i];
S24: feeding the overall adversarial batch x_b^adv obtained in step S23 into the current classification model f_i, and computing a loss value with a cross-entropy loss function so that the overall adversarial batch x_b^adv is misclassified as the target class t; the cross-entropy loss is computed as in equation (1):
ℒ(x_b^adv, t) = - l_t · log f_i(x_b^adv)    (1)
where l_t is the one-hot encoding of the target class t;
S25: computing the gradient and updating the perturbation δ_i of the current attack task on model f_i, as in equation (2):
δ_i ← δ_i - β ∇_{δ_i} ℒ(x_b^adv, t)    (2)
where β is the inner-loop learning rate.
Further, the first computer program, when executed, further implements, after step S24, the following step:
S25: constraining the perturbation δ_i of the current attack task on model f_i with an L_p-norm bound so that δ_i does not exceed the maximum allowed magnitude ε, with the constraint computed as in equation (3):
δ_i ← Proj_{‖δ‖_p ≤ ε}(δ_i)    (3)
if the currently computed δ_i is the one corresponding to the last model f_n in the classifier model set F, go to step S3; otherwise, return to step S21.
Further, the computer program includes a second computer program which, when executed, implements step S3, specifically including:
S31: computing for each model f_i the difference between the perturbation before and after its update, and determining the outer-loop gradient update from all of the computed differences; denoting the perturbation before the inner-loop update as δ_base, the difference for model f_i is δ_base - δ_i, and the gradient update is computed as:
g = (1/n) Σ_{i=1}^{n} (δ_base - δ_i)
S32: driving the second targeted perturbation to be optimized towards the average gradient of attacking each model, i.e. driving the second targeted perturbation δ from δ_base towards attacking each model f_i, computed as in equation (4):
δ ← δ_base - α · (1/n) Σ_{i=1}^{n} (δ_base - δ_i)    (4)
where α is the outer-loop step size and n is the number of models in the classifier model set F.
as shown in fig. 8, in a third aspect, the present invention further provides an electronic device 10, comprising a processor 101 and a storage medium 102, wherein the storage medium 102 is the storage medium according to the second aspect; the processor 101 is adapted to execute a computer program stored in the storage medium 102 to implement the method steps as the first aspect.
In this embodiment, the electronic device is a computer device, including but not limited to: personal computer, server, general-purpose computer, special-purpose computer, network equipment, embedded equipment, programmable equipment, intelligent mobile terminal, intelligent home equipment, wearable intelligent equipment, vehicle-mounted intelligent equipment, etc. Storage media include, but are not limited to: RAM, ROM, magnetic disk, magnetic tape, optical disk, flash memory, U disk, removable hard disk, memory card, memory stick, network server storage, network cloud storage, etc. Processors include, but are not limited to, a CPU (Central processing Unit), a GPU (image processor), an MCU (Microprocessor), and the like.
As will be appreciated by one of skill in the art, the various embodiments described above may be provided as a method, apparatus, or computer program product. These embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. All or part of the steps of the methods related to the above embodiments may be implemented by relevant hardware instructed by a program, and the program may be stored in a storage medium readable by a computer device and used for executing all or part of the steps of the methods related to the above embodiments.
The various embodiments described above are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a computer apparatus to produce a machine, such that the instructions, which execute via the processor of the computer apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer device to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer apparatus to cause a series of operational steps to be performed on the computer apparatus to produce a computer implemented process such that the instructions which execute on the computer apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Although embodiments have been described, once the basic inventive concept is known, those skilled in the art can make other variations and modifications to these embodiments. The embodiments are therefore only examples of the present invention and do not limit its scope; all equivalent structures or equivalent processes derived from the contents of this specification and the drawings, whether used directly or indirectly in other related technical fields, are likewise encompassed by the present invention.

Claims (10)

1. A black-box attack method based on sequential meta-learning, characterized by comprising the following steps:
S1: obtaining a proxy data set D and a set of classifier models F = {f_1, f_2, …, f_n};
S2: sampling a batch of images x_b from the proxy data set D, and, based on the images x_b and the classifier model set, optimizing a first targeted perturbation δ_i for each model f_i in the classifier model set one by one in sequence, obtaining a set of targeted adversarial perturbations {δ_1, δ_2, …, δ_n} corresponding to the classifier models {f_1, f_2, …, f_n}; each targeted adversarial perturbation δ_i is associated with the corresponding model f_i in the classifier model set;
S3: generating a second, model-agnostic targeted perturbation δ based on the targeted adversarial perturbations, determining the update direction of the second targeted perturbation δ from {δ_1, δ_2, …, δ_n}, and optimizing the second targeted perturbation δ along the determined update direction;
repeating steps S2-S3 to iteratively update the second targeted perturbation δ;
S4: when the number of iterations reaches a preset number, outputting the current second targeted perturbation δ;
S5: adding the second targeted perturbation δ obtained in step S4 to all images in the verification set, and feeding all verification-set images with the second targeted perturbation δ added into the unknown black-box model under attack.
2. The black-box attack method based on sequential meta-learning according to claim 1, wherein step S1 further comprises: acquiring training parameters;
the training parameters comprise an upper bound ε that controls the perturbation magnitude, the number of images x_b sampled each time, and the preset number of iterations.
3. The black-box attack method based on sequential meta-learning according to claim 1 or 2, wherein step S2 is implemented by an inner-loop module, which, when executing step S2, specifically comprises:
S21: randomly shuffling the classifier model set F, and then selecting a model f_i in the shuffled order;
S22: initializing the perturbation δ_i of the current attack task on model f_i with the targeted perturbation δ, i.e. setting δ_i = δ; the initial value of the targeted perturbation δ is 0;
S23: using the perturbation δ_{i-1} produced for the previous model f_{i-1} and the perturbation δ_i of the current attack task on model f_i to perturb the images x_b separately, producing two partial adversarial batches x_b + δ_{i-1} and x_b + δ_i, and concatenating the two partial batches into an overall adversarial batch x_b^adv = [x_b + δ_{i-1}; x_b + δ_i];
S24: feeding the overall adversarial batch x_b^adv obtained in step S23 into the current classification model f_i, and computing a loss value with a cross-entropy loss function so that the overall adversarial batch x_b^adv is misclassified as the target class t; the cross-entropy loss is computed as in equation (1):
ℒ(x_b^adv, t) = - l_t · log f_i(x_b^adv)    (1)
where l_t is the one-hot encoding of the target class t;
S25: computing the gradient and updating the perturbation δ_i of the current attack task on model f_i, as in equation (2):
δ_i ← δ_i - β ∇_{δ_i} ℒ(x_b^adv, t)    (2)
where β is the inner-loop learning rate.
4. The black-box attack method based on sequential meta-learning according to claim 3, wherein step S24 is further followed by:
S25: constraining the perturbation δ_i of the current attack task on model f_i with an L_p-norm bound so that δ_i does not exceed the maximum allowed magnitude ε, with the constraint computed as in equation (3):
δ_i ← Proj_{‖δ‖_p ≤ ε}(δ_i)    (3)
if the currently computed δ_i is the one corresponding to the last model f_n in the classifier model set F, go to step S3; otherwise, return to step S21.
5. The black-box attack method based on sequential meta-learning according to claim 3, wherein step S3 is implemented by an outer-loop module, which, when executing step S3, specifically comprises:
S31: denoting the perturbation before the inner-loop update as δ_base, computing for each model f_i the difference δ_base - δ_i between the perturbation before and after its update, and determining the outer-loop gradient update from all of the computed differences, namely:
g = (1/n) Σ_{i=1}^{n} (δ_base - δ_i)
S32: driving the resulting perturbation δ from δ_base towards the average gradient of attacking each model f_i, computed as in equation (4):
δ ← δ_base - α · (1/n) Σ_{i=1}^{n} (δ_base - δ_i)    (4)
where α is the outer-loop step size and n is the number of models in the classifier model set F.
6. a storage medium storing a computer program that when executed implements steps comprising:
s1: obtaining a proxy dataset
Figure RE-FDA0003803944980000034
And a set of classifier models
Figure RE-FDA0003803944980000035
S2: from the proxy data set
Figure RE-FDA0003803944980000036
Middle sampling of several images x b Based on said number of images x b And the classifier model set optimizes each f in the classifier model set one by one according to the order i First directional disturbance δ i Obtaining a set of parameters corresponding to the set of classifier models { f 1 ,f 2 ,…,f n Relative directional countermeasure perturbation [ delta ] 1 ,δ 2 ,…,δ n }; each of the orientations opposes the disturbance δ i All corresponding to f in the set of classifier models i Associating;
s3: generating a second model-independent directional disturbance δ based on the directional counterdisturbance, { δ } according to the directional counterdisturbance 1 ,δ 2 ,…,δ n Determining the second directional disturbanceThe updating direction of the dynamic delta is determined, and the second directional disturbance delta is optimized according to the determined updating direction of the second directional disturbance delta;
repeating steps S2-S3 to iteratively update the second directional perturbation δ;
s4: when the iteration times reach preset times, outputting a current second directional disturbance delta;
s5: and adding the second directional disturbance delta obtained in the step S4 to all the images in the verification set, and inputting all the images in the verification set to which the second directional disturbance delta is added into the attacked unknown black box model.
7. The storage medium according to claim 6, wherein the computer program comprises a first computer program which, when executed, implements step S2, specifically comprising:
S21: randomly shuffling the classifier model set F, and then selecting a model f_i in the shuffled order;
S22: initializing the perturbation δ_i of the current attack task on model f_i with the targeted perturbation δ, i.e. setting δ_i = δ; the initial value of the targeted perturbation δ is 0;
S23: using the perturbation δ_{i-1} produced for the previous model f_{i-1} and the perturbation δ_i of the current attack task on model f_i to perturb the images x_b separately, producing two partial adversarial batches x_b + δ_{i-1} and x_b + δ_i, and concatenating the two partial batches into an overall adversarial batch x_b^adv = [x_b + δ_{i-1}; x_b + δ_i];
S24: feeding the overall adversarial batch x_b^adv obtained in step S23 into the current classification model f_i, and computing a loss value with a cross-entropy loss function so that the overall adversarial batch x_b^adv is misclassified as the target class t; the cross-entropy loss is computed as in equation (1):
ℒ(x_b^adv, t) = - l_t · log f_i(x_b^adv)    (1)
where l_t is the one-hot encoding of the target class t;
S25: computing the gradient and updating the perturbation δ_i of the current attack task on model f_i, as in equation (2):
δ_i ← δ_i - β ∇_{δ_i} ℒ(x_b^adv, t)    (2)
where β is the inner-loop learning rate.
8. The storage medium according to claim 7, wherein the first computer program, when executed, further implements, after step S24, the following step:
S25: constraining the perturbation δ_i of the current attack task on model f_i with an L_p-norm bound so that δ_i does not exceed the maximum allowed magnitude ε, with the constraint computed as in equation (3):
δ_i ← Proj_{‖δ‖_p ≤ ε}(δ_i)    (3)
if the currently computed δ_i is the one corresponding to the last model f_n in the classifier model set F, go to step S3; otherwise, return to step S21.
9. The storage medium according to claim 7, wherein the computer program comprises a second computer program which, when executed, implements step S3, specifically comprising:
S31: denoting the perturbation before the inner-loop update as δ_base, computing for each model f_i the difference δ_base - δ_i between the perturbation before and after its update, and determining the outer-loop gradient update from all of the computed differences, namely:
g = (1/n) Σ_{i=1}^{n} (δ_base - δ_i)
S32: driving the resulting perturbation δ from δ_base towards the average gradient of attacking each model f_i, computed as in equation (4):
δ ← δ_base - α · (1/n) Σ_{i=1}^{n} (δ_base - δ_i)    (4)
where α is the outer-loop step size and n is the number of models in the classifier model set F.
10. an electronic device comprising a processor and a storage medium according to claim 9;
the processor is configured to execute a computer program stored in the storage medium to implement the method of any one of claims 1 to 5.
CN202210639789.XA 2022-06-08 2022-06-08 Black box attack method based on sequential meta-learning, storage medium and electronic equipment Active CN115063654B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210639789.XA CN115063654B (en) Black box attack method based on sequential meta-learning, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210639789.XA CN115063654B (en) Black box attack method based on sequential meta-learning, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN115063654A true CN115063654A (en) 2022-09-16
CN115063654B CN115063654B (en) 2024-07-16

Family

ID=83201105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210639789.XA Active CN115063654B (en) 2022-06-08 2022-06-08 Black box attack method based on sequence element learning, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN115063654B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108446765A (en) * 2018-02-11 2018-08-24 浙江工业大学 The multi-model composite defense method of sexual assault is fought towards deep learning
CN110276377A (en) * 2019-05-17 2019-09-24 杭州电子科技大学 A kind of confrontation sample generating method based on Bayes's optimization
US10783401B1 (en) * 2020-02-23 2020-09-22 Fudan University Black-box adversarial attacks on videos
CN112149609A (en) * 2020-10-09 2020-12-29 中国人民解放军空军工程大学 Black box anti-sample attack method for electric energy quality signal neural network classification model
US20210089866A1 (en) * 2019-09-24 2021-03-25 Robert Bosch Gmbh Efficient black box adversarial attacks exploiting input data structure
CN112949678A (en) * 2021-01-14 2021-06-11 西安交通大学 Method, system, equipment and storage medium for generating confrontation sample of deep learning model
CN113487870A (en) * 2021-07-19 2021-10-08 浙江工业大学 Method for generating anti-disturbance to intelligent single intersection based on CW (continuous wave) attack
CN113625666A (en) * 2020-05-08 2021-11-09 通用电气公司 Operation optimization distribution control system with coupled subsystem model and digital twinning
CN113674140A (en) * 2021-08-20 2021-11-19 燕山大学 Physical countermeasure sample generation method and system

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108446765A (en) * 2018-02-11 2018-08-24 浙江工业大学 The multi-model composite defense method of sexual assault is fought towards deep learning
CN110276377A (en) * 2019-05-17 2019-09-24 杭州电子科技大学 A kind of confrontation sample generating method based on Bayes's optimization
US20210089866A1 (en) * 2019-09-24 2021-03-25 Robert Bosch Gmbh Efficient black box adversarial attacks exploiting input data structure
US10783401B1 (en) * 2020-02-23 2020-09-22 Fudan University Black-box adversarial attacks on videos
CN113625666A (en) * 2020-05-08 2021-11-09 通用电气公司 Operation optimization distribution control system with coupled subsystem model and digital twinning
CN112149609A (en) * 2020-10-09 2020-12-29 中国人民解放军空军工程大学 Black box anti-sample attack method for electric energy quality signal neural network classification model
CN112949678A (en) * 2021-01-14 2021-06-11 西安交通大学 Method, system, equipment and storage medium for generating confrontation sample of deep learning model
CN113487870A (en) * 2021-07-19 2021-10-08 浙江工业大学 Method for generating anti-disturbance to intelligent single intersection based on CW (continuous wave) attack
CN113674140A (en) * 2021-08-20 2021-11-19 燕山大学 Physical countermeasure sample generation method and system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZHIMING LUO et al.: "Learning transferable targeted universal adversarial perturbations by sequential meta-learning", Computers & Security, 1 November 2023 (2023-11-01), pages 1 - 13 *
LIU Heng; WU Dexin; XU Jian: "Universal adversarial perturbation generation method based on generative adversarial networks", Netinfo Security, no. 05, 10 May 2020 (2020-05-10) *
LIU Ximeng; XIE Lehui; WANG Yaopeng; LI Xuru: "Adversarial attacks and defenses in deep learning", Chinese Journal of Network and Information Security, no. 05, 13 October 2020 (2020-10-13) *

Also Published As

Publication number Publication date
CN115063654B (en) 2024-07-16

Similar Documents

Publication Publication Date Title
Marra et al. Incremental learning for the detection and classification of gan-generated images
CN110276377B (en) Confrontation sample generation method based on Bayesian optimization
US11995155B2 (en) Adversarial image generation method, computer device, and computer-readable storage medium
CN113674140A (en) Physical countermeasure sample generation method and system
US20220067432A1 (en) Robustness assessment for face recognition
CN111598210B (en) Anti-attack defense method for anti-attack based on artificial immune algorithm
CN112001488A (en) Training generative antagonistic networks
CN114241569A (en) Face recognition attack sample generation method, model training method and related equipment
CN111507384A (en) Method for generating confrontation sample of black box depth model
CN113435264A (en) Face recognition attack resisting method and device based on black box substitution model searching
CN113033822A (en) Antagonistic attack and defense method and system based on prediction correction and random step length optimization
Mu et al. Sparse adversarial video attacks with spatial transformations
Li et al. Toward visual distortion in black-box attacks
CN114387449A (en) Image processing method and system for coping with adversarial attack of neural network
CN114463798A (en) Training method, device and equipment of face recognition model and storage medium
Shen et al. Fooling neural networks in face attractiveness evaluation: Adversarial examples with high attractiveness score but low subjective score
Deng et al. Frequency-tuned universal adversarial perturbations
CN112861759B (en) Method and device for generating confrontation sample
Xu et al. Sparse black-box inversion attack with limited information
Fu et al. An adaptive self-correction joint training framework for person re-identification with noisy labels
Shamshad et al. Evading Forensic Classifiers with Attribute-Conditioned Adversarial Faces
CN115063654A (en) Black box attack method based on sequential meta-learning, storage medium and electronic equipment
CN114638356B (en) Static weight guided deep neural network back door detection method and system
CN115730316A (en) Method for improving transferability of countermeasure sample based on experience accurate Nesterov momentum
CN114925699A (en) High-mobility confrontation text generation method based on style transformation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant