CN114428954A

CN114428954A - Black box attack system based on dynamic network structure learning

Info

Publication number: CN114428954A
Application number: CN202111629855.7A
Authority: CN
Inventors: 薛向阳; 王文萱; 钱学林; 付彦伟
Original assignee: Fudan University
Current assignee: Fudan University
Priority date: 2021-12-28
Filing date: 2021-12-28
Publication date: 2022-05-03

Abstract

The invention relates to the field of computer vision and image processing, and in particular to a black box attack method based on dynamic network structure learning. It provides a method for attacking a target black box model in an unknown scenario without the participation of any real data. To handle diverse target black box models, a substitute model training method based on dynamic network structure learning is proposed, which autonomously generates an optimal substitute model structure. Optimization constraints based on a structured information graph are also proposed to improve the learning quality and efficiency of the substitute model, and thereby further improve the attack performance of the generated adversarial examples. The method requires few queries, learns efficiently, and achieves a high attack success rate, making it well suited to black box attack scenarios in which no prior knowledge is available.

Description

Black box attack system based on dynamic network structure learning

Technical Field

The invention belongs to the field of computer vision image processing, and particularly relates to a black box attack system based on dynamic network structure learning.

Background

With the wide application of deep network models to real-world tasks, more and more researchers are focusing on the security and robustness of these models. It has gradually been discovered that adversarial examples, generated by adding perturbations to clean images, can successfully attack a deep network model and cause it to make wrong predictions.

Adversarial attacks against deep models fall mainly into two categories. The first is the white box attack, in which the attacker can obtain the specific network structure and parameters of the target deep model; existing methods usually back-propagate through the gradient of the target model to generate adversarial examples directly, and thereby achieve a high attack success rate. The second is the black box attack, in which the attacker cannot directly obtain the specific network structure and parameters of the target deep model; existing methods either improve the transferability of adversarial examples generated on other white box models, or train a substitute model similar to the target model and use it to generate adversarial examples. However, existing black box attack techniques still have a low attack success rate and rely on some prior knowledge of the target model, such as the model task, the training data, and the number of classes.

In order to implement practical and strong black box attacks, the prior art proposes black box attacks under a data-free condition, that is, the black box attack task additionally requires that the substitute model be trained without using real data. The first approach feeds noise into a generator to produce a large number of samples and, using a knowledge distillation network structure, improves the quality of the attack samples by constraining the outputs of the target model and the substitute model to be consistent. The second approach focuses on the quality of the samples produced by the generator and increases their diversity, thereby further improving the efficiency of knowledge distillation training. However, both approaches rely on prior knowledge of the number of classes of the target model and need to select the best one from a number of substitute models with different network structures.

Disclosure of Invention

The invention provides a black box attack system based on dynamic network structure learning that operates without the participation of real data. Given a target black box model in an unknown scenario and its privacy protection requirements, the system trains a high-quality substitute model with a dynamic structure even though the specific network structure and parameters of the target deep model cannot be obtained directly and prior knowledge such as the task, the training data, and the number of classes of the target black box model is unknown, and thereby completes the black box attack task with a high success rate.

As analyzed above, the adversarial example generation algorithms in the prior art often rely on prior knowledge of the number of classes of the target model and need to select the best one from a number of substitute models with different network structures, which greatly reduces the practicality of these algorithms and consumes more computational resources. Therefore, how to obtain the optimal substitute model directly in a single training run, without any prior knowledge of the target model, is the technical focus of the invention.

To achieve the above purposes, the invention provides a substitute model training method based on dynamic network structure learning, which removes the limitation of a fixed, static substitute model network structure, so that the network structure is generated and optimized autonomously for each target model. In addition, to further improve the quality and efficiency of knowledge distillation training, a structured information graph is constructed from a plurality of outputs of the target model, which prompts the substitute model to learn more of the key and hidden knowledge contained in the structured features among the outputs and improves the attack strength of the adversarial examples generated from the substitute model.

The specific scheme is as follows:

A black box attack system based on dynamic network structure learning comprises optimization constraints based on a structured information graph and substitute training with dynamic network structure learning, wherein the substitute training specifically comprises the following steps:

S1: generating substitute training data;

S2: performing substitute training with optimization constraints based on the structured information graph;

S3: generating a substitute model through dynamic network structure learning;

S4: feeding test data into the substitute model, generating adversarial examples by means of a white box attack, and carrying out an attack test on the target black box model.

Step S1 specifically includes the following steps:

S11: randomly generating noise samples according to a Gaussian distribution;

S12: feeding the noise samples into a generator to generate substitute training data.

Step S2 specifically includes the following steps:

Step 21: feeding the substitute training data generated by the generator into the target model and the substitute model respectively to obtain the corresponding outputs;

Step 22: calculating point nodes and edge features from a plurality of outputs of the target model and the substitute model, and constructing the corresponding structured information graphs;

Step 23: calculating an optimization loss function based on the structured information graphs of the target model and the substitute model;

Step 24: reducing the distance between the outputs of the target model and the substitute model according to the optimization loss function based on the structured information graph, and updating the network parameters of the substitute model;

Step 25: enlarging the distance between the outputs of the target model and the substitute model according to the optimization loss function based on the structured information graph, and updating the network parameters of the generator.

The structured information graph of step 22 includes point nodes, which are expressed by the outputs of the model itself, and edge features, which are the Euclidean feature distances between pairs of point nodes.

Step 23 measures the distance between point nodes using the Kullback-Leibler divergence, and measures the distance between edge features using an MSE loss function.

Step S3 specifically includes the following steps:

Step 31: applying simple processing to the input feature vector through an average pooling layer and a fully connected layer;

Step 32: predicting, by means of a gate function, whether to skip the current residual branch.

In step 32, a Hard-Sigmoid function is used as the gate function H and a threshold of 0.5 is set, so that the output of the dynamic gate is binarized.

In summary, the innovations of the invention are as follows:

(1) For diverse target black box models, a substitute model training method based on dynamic network structure learning is proposed. By learning a dynamic gate structure, the optimal substitute model structure is generated autonomously for each target model, which avoids the computational cost of training several substitute models with different network structures and selecting the best one.

(2) For knowledge-distillation-based substitute model training, optimization constraints based on a structured information graph are proposed. The structured information constraints among a plurality of outputs improve the learning quality and efficiency of the substitute model, and thereby further improve the attack performance of the generated adversarial examples.

Drawings

FIG. 1 is a diagram of a black box attack system based on dynamic network structure learning according to the present invention;

FIG. 2 is a schematic diagram of the substitute model training method based on dynamic network structure learning according to the present invention;

FIG. 3 is a flow chart of the black box attack based on dynamic network structure learning according to the present invention.

Detailed description of the preferred embodiments

In order to make the technical means, inventive features, objectives, and effects of the invention easy to understand, the technical scheme of the invention is explained in detail below with reference to the accompanying drawings.

FIG. 1 is a diagram of a black box attack system based on dynamic network structure learning for a classification task model according to the present invention. The system 100 comprises media data 101, a computing device 110, and a presentation device 191. The media data 101 may be video content, such as movies, or image content, and may be distributed via television or the internet. In some embodiments, the media data 101 may also be picture data covering a variety of categories. The computing device 110 processes the media data 101 and mainly includes a computer processor 120 and a memory 130. The processor 120 is a hardware processor of the computing device 110, such as a central processing unit (CPU) or a graphics processing unit (GPU). The memory 130 is a non-volatile storage device that stores the computer code executed by the processor 120 as well as various intermediate data and parameters. The memory 130 contains Gaussian random noise 135 and associated data, and executable code 140. The executable code 140 includes one or more software modules for performing computations on the computer processor 120. As shown in FIG. 1, the executable code 140 includes a training data generation module 141, a knowledge distillation substitute training module 143, a structured information graph learning module 144, and a dynamic network structure learning module 147.

The training data generation module 141 is a code module that processes the media data 101 and generates data. Using a training data generation algorithm, it can generate large-scale data for the subsequent knowledge distillation training of the substitute model with only Gaussian random noise as input.

The knowledge distillation substitute training module 143 feeds the training data generated by the training data generation module 141 into the target model and the substitute model simultaneously, and constrains the outputs of the two models to be as close as possible, thereby achieving the training objective of an optimal substitute model.

The structured information graph learning module 144 constructs structured information graphs from a plurality of outputs of the target model and the substitute model, and calculates an optimization loss function based on these graphs, so that the substitute training objective is achieved efficiently and accurately on the basis of structured outputs.

The dynamic network structure learning module 147 autonomously generates the optimal substitute model network structure for each target model through the adaptive learning of the dynamic gates during network training, so as to improve the performance of the black box attack.

Presentation device 191 is a device suitable for playing media data 101 and displaying the predicted scores output by computing device 110, and may be a computer, television, or mobile device.

The specific embodiment of the invention is mainly implemented in the following steps, detailed as follows:

Step 1: generating substitute training data. From input random noise z drawn from a Gaussian distribution, the generator network model G generates the corresponding substitute training data x, expressed as follows,

$$z \sim \mathcal{N}(0, 1)$$

$$x = G(z) \in \mathbb{R}^{3 \times h \times w}$$

where h and w denote the height and width of the generated data, respectively.
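A minimal sketch of step 1 in plain Python may help make the shapes concrete. The patent does not specify the architecture of G, so `toy_generator` below is a hypothetical stand-in (a fixed random linear map followed by tanh), and the names `sample_noise`, `toy_generator`, and `weight_seed` are illustrative assumptions rather than part of the described system.

```python
import math
import random

def sample_noise(dim, rng):
    """Draw a noise vector z whose entries are i.i.d. N(0, 1)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def toy_generator(z, h, w, weight_seed=0):
    """Hypothetical stand-in for G: maps z to a 3 x h x w array in [-1, 1]."""
    weight_rng = random.Random(weight_seed)  # fixed weights, so G is deterministic
    image = []
    for _channel in range(3):
        channel = []
        for _row in range(h):
            row = []
            for _col in range(w):
                weights = [weight_rng.uniform(-1.0, 1.0) for _ in z]
                row.append(math.tanh(sum(wk * zk for wk, zk in zip(weights, z))))
            channel.append(row)
        image.append(channel)
    return image

rng = random.Random(42)
z = sample_noise(dim=8, rng=rng)   # z ~ N(0, 1)
x = toy_generator(z, h=4, w=4)     # x = G(z), shape 3 x h x w
shape = (len(x), len(x[0]), len(x[0][0]))
```

In a real system G would be a trainable network; the only property the sketch preserves is that the sole input is Gaussian noise and the output has shape 3 × h × w.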

Step 2: performing substitute training with optimization constraints based on the structured information graph. The substitute training data x generated by the generator is fed into the target model T and the substitute model S respectively to obtain the corresponding model outputs. Constraining the outputs of the target model T and the substitute model S to be consistent helps the substitute model S learn rich and accurate knowledge from the target model T, and the parameters of the substitute model S are updated accordingly,

$$L_S = d(T(x), S(x))$$

where d is a measure of the distance between the outputs of the target model T and the substitute model S.

Step 3: optimizing and training the generator. Inspired by generative adversarial networks, the substitute model S is trained so that the distance between the output of the target model T and the output of the substitute model S is as small as possible, while the generator G is trained to enlarge that distance as much as possible, so that G keeps generating substitute training data that is more valuable and more difficult to learn from. The generator G is therefore optimized and updated with,

$$L_G = -d(T(x), S(x))$$
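The opposing objectives of steps 2 and 3 can be illustrated with a short sketch. The distance `distance` used here is a placeholder (mean absolute difference) chosen only for illustration; the patent defines the actual distance through the structured information graph loss in the later steps. The function names are hypothetical.

```python
def distance(t_out, s_out):
    """Placeholder d: mean absolute difference between two output vectors."""
    return sum(abs(t - s) for t, s in zip(t_out, s_out)) / len(t_out)

def substitute_loss(t_out, s_out):
    """L_S = d(T(x), S(x)): minimized when updating the substitute model S."""
    return distance(t_out, s_out)

def generator_loss(t_out, s_out):
    """L_G = -d(T(x), S(x)): minimizing it maximizes the disagreement."""
    return -distance(t_out, s_out)

t_out = [0.7, 0.2, 0.1]  # example output of the target model T
s_out = [0.5, 0.3, 0.2]  # example output of the substitute model S
l_s = substitute_loss(t_out, s_out)
l_g = generator_loss(t_out, s_out)
```

Because the two losses are exact negatives of each other, each gradient step on S pulls the outputs together while each gradient step on G pushes them apart, which is the min-max game described above.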

Step 4: constructing the structured information graph. A structured information graph is constructed for each of the target model T and the substitute model S from a plurality of its outputs, using the pairwise relations between those outputs. The graph consists of point nodes and edge features: the point nodes are expressed by the outputs of the model, and the edge features are expressed by the Euclidean feature distance between each pair of point nodes. The edge features can be expressed as follows,

$$A(j, k) = \lVert x_j - x_k \rVert_E, \quad j, k = 1, \ldots, B$$

where B is the number of training samples in each training iteration, and the subscript E indicates that the Euclidean distance is used to measure the feature distance between two point nodes as the edge feature.
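As an illustration of the graph construction in step 4, the following sketch builds the point nodes and the B × B matrix of pairwise Euclidean distances from a batch of model outputs. The function name `build_graph` and the example batch are assumptions made for this sketch.

```python
import math

def build_graph(outputs):
    """Return (nodes, edges) with edges[j][k] = ||outputs[j] - outputs[k]||_2."""
    nodes = [list(o) for o in outputs]  # point nodes: the model outputs themselves
    b = len(nodes)                      # B: number of samples in the batch
    edges = [[math.dist(nodes[j], nodes[k]) for k in range(b)] for j in range(b)]
    return nodes, edges

# Example batch of B = 3 two-dimensional outputs.
outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
nodes, edges = build_graph(outputs)
```

The edge matrix is symmetric with a zero diagonal, and identical outputs (samples 0 and 2 in the example) are connected by an edge of length zero.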

Step 5: calculating the optimization loss function based on the structured information graph. Based on the structured information graphs $Graph^T$ and $Graph^S$ obtained for the target model T and the substitute model S, the losses of step 2 and step 3 are computed and optimized with the proposed Graph-based Structured Information Learning (GSIL) optimization loss function, which is expressed as follows,

[GSIL loss equation: not reproduced in this text]

where [symbol not reproduced] and [symbol not reproduced] are the point nodes of the target model T and the substitute model S, respectively, and $A^T$ and $A^S$ are the edge features of the target model T and the substitute model S, respectively. The distance between point nodes is measured by the Kullback-Leibler divergence, and the distance between edge features is measured by an MSE loss function.
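The following sketch shows one way a loss of this kind could be computed. Several details are assumptions, since the exact equation is not available here: the two terms are combined as a weighted sum with weights `alpha1` and `alpha2` (suggested by the parameters α₁ and α₂ listed in the training settings), the KL divergence is taken from the target node to the substitute node, and both terms are averaged over the batch. The function names are hypothetical.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete probability vectors."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def edge_matrix(nodes):
    """Pairwise Euclidean distances between point nodes."""
    return [[math.dist(a, b) for b in nodes] for a in nodes]

def gsil_loss(nodes_t, nodes_s, alpha1=1.0, alpha2=1.0):
    """Assumed form: alpha1 * mean KL over nodes + alpha2 * MSE over edges."""
    b = len(nodes_t)
    node_term = sum(kl_divergence(t, s) for t, s in zip(nodes_t, nodes_s)) / b
    a_t, a_s = edge_matrix(nodes_t), edge_matrix(nodes_s)
    edge_term = sum(
        (a_t[j][k] - a_s[j][k]) ** 2 for j in range(b) for k in range(b)
    ) / (b * b)
    return alpha1 * node_term + alpha2 * edge_term

nodes_t = [[0.9, 0.1], [0.2, 0.8]]  # example outputs of the target model T
nodes_s = [[0.6, 0.4], [0.5, 0.5]]  # example outputs of the substitute model S
loss_mismatch = gsil_loss(nodes_t, nodes_s)
loss_match = gsil_loss(nodes_t, nodes_t)
```

Under these assumptions the loss is zero when the substitute reproduces the target outputs exactly and positive otherwise, so it can serve as the distance d in the losses of steps 2 and 3.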

Step 6: dynamically learning the network structure of the substitute model. In order to generate a corresponding optimal substitute model network structure for each target model, a dynamic gate DG is designed that predicts a one-hot vector controlling whether the current residual branch is skipped. The dynamic gate is composed of a series of simple operations,

$$DG(f) = H(W P(f) + b)$$

where f is the input feature vector, P denotes the global average pooling layer, and W and b are the network parameters of the fully connected layer. To binarize the output for path selection, the Hard-Sigmoid function is selected as the gate function H, expressed as,

[Hard-Sigmoid equation: not reproduced in this text]

where the threshold is set to 0.5 so that the output of H is reduced to 0 or 1, and k is a parameter that makes H approximate a step function; k is set to 10 in the experiments.
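A minimal sketch of such a gate is given below. Because the exact form of H is not available here, the sketch assumes the piecewise-linear form H(x) = max(0, min(1, k·x + 0.5)); it also simplifies the one-hot output to a single 0/1 decision per residual block. Both choices, along with the function names, are assumptions made for illustration.

```python
def global_avg_pool(feature_map):
    """P(f): average each channel of a C x H x W feature map to one value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

def hard_sigmoid(x, k=10.0):
    """Assumed form of H: a steep linear ramp clipped to the range [0, 1]."""
    return max(0.0, min(1.0, k * x + 0.5))

def dynamic_gate(feature_map, weights, bias, k=10.0, threshold=0.5):
    """DG(f) = H(W P(f) + b), binarized: 1 keeps the branch, 0 skips it."""
    pooled = global_avg_pool(feature_map)
    logit = sum(w * p for w, p in zip(weights, pooled)) + bias
    return 1 if hard_sigmoid(logit, k) > threshold else 0

# Example: a 2-channel, 2 x 2 feature map with hand-picked gate parameters.
fmap = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
gate_open = dynamic_gate(fmap, weights=[1.0, -1.0], bias=-0.5)    # logit = 0.5
gate_closed = dynamic_gate(fmap, weights=[1.0, -1.0], bias=-1.5)  # logit = -0.5
```

With k = 10 the ramp saturates for any logit beyond ±0.05, so the gate behaves almost like a step function while remaining piecewise linear, which is what allows W and b to be learned during substitute training.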

Step 7: training the neural network model to generate the substitute model through dynamic network structure learning. The network is trained with the SGD optimizer. In the initial state, the learning rate of the generator G is lr = 0.0001, the learning rate of the substitute model S is lr = 0.001, the coefficients beta are (0.9, 0.999), the weight decay coefficient is 0.1, the batch size is 500, and the parameters are α₁ = 1 and α₂ = 1. The network converges after about 80 epochs of training.

Step 8: evaluating the attack strength. The test data is fed into the substitute model obtained in step 7, adversarial examples are generated by means of a white box attack, and an attack test is carried out on the target black box model.
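The evaluation step can be sketched as follows. The patent does not name a specific white box attack, so this example assumes a single FGSM step (x + ε · sign of the loss gradient) on a toy linear-softmax substitute; the weights of both models, the input, and ε are made-up values for illustration, and the target is only queried for its predicted label.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear_logits(weights, bias, x):
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def predict(weights, bias, x):
    logits = linear_logits(weights, bias, x)
    return logits.index(max(logits))

def sign(g):
    return (g > 0) - (g < 0)

def fgsm(weights, bias, x, label, eps):
    """One FGSM step on a linear-softmax model using the cross-entropy gradient."""
    probs = softmax(linear_logits(weights, bias, x))
    err = [p - (1.0 if c == label else 0.0) for c, p in enumerate(probs)]
    # d(loss)/dx_i = sum_c (p_c - y_c) * W[c][i]
    grad = [sum(err[c] * weights[c][i] for c in range(len(weights))) for i in range(len(x))]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Substitute S is a white box; target T is only queried through predict().
sub_w, sub_b = [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]
tgt_w, tgt_b = [[0.9, -1.1], [-0.8, 1.2]], [0.05, -0.05]

x = [0.6, 0.4]
label = predict(tgt_w, tgt_b, x)               # label assigned by the target
x_adv = fgsm(sub_w, sub_b, x, label, eps=0.3)  # crafted on the substitute only
attack_succeeded = predict(tgt_w, tgt_b, x_adv) != label
```

The adversarial example is computed without any access to the target's weights; the attack transfers here because the toy substitute closely approximates the toy target, which is the property the substitute training is meant to achieve.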

FIG. 2 illustrates the substitute model training method based on dynamic network structure learning. The figure shows that, during substitute training, the learning and updating of the dynamic gate structure generates a corresponding optimal substitute model network structure for each target model.

FIG. 3 shows the black box attack method based on dynamic network structure learning. The figure depicts the data flow of the network model in detail, including the generation of training data for substitute training, the substitute training based on knowledge distillation, and the optimization constraints based on the structured information graph.

Claims

1. A black box attack system based on dynamic network structure learning, characterized by comprising optimization constraints based on a structured information graph and substitute training with dynamic network structure learning, wherein the substitute training specifically comprises the following steps:

S1: generating substitute training data;

S3: generating a substitute model through dynamic network structure learning;

2. The black box attack system according to claim 1, wherein the step S1 specifically comprises the steps of:

S11: randomly generating noise samples according to a Gaussian distribution;

3. The black box attack system according to claim 1, wherein the step S2 specifically comprises the following steps:

Step 21: feeding the substitute training data generated by the generator into the target model and the substitute model respectively to obtain the corresponding outputs;

Step 25: enlarging the distance between the outputs of the target model and the substitute model according to the optimization loss function based on the structured information graph, and updating the network parameters of the generator.

4. The black box attack system according to claim 3, wherein the structured information graph of step 22 comprises point nodes, which are expressed by the outputs of the model itself, and edge features, which are the Euclidean feature distances between pairs of point nodes.

5. A black box attack system as claimed in claim 3, wherein the step 23 uses Kullback-Leibler divergence to measure the distance between point nodes, and uses MSE loss function to represent the distance between edge features.

6. The black box attack system according to claim 1, wherein the step S3 specifically comprises the following steps:

7. The black-box attack system according to claim 6, wherein the step 32 implements binarization of the dynamic gate output by using a Hard-Sigmoid function as the gate function H and setting a threshold value to 0.5.