CN113936140B

CN113936140B - Incremental learning-based evaluation method for challenge sample attack model

Info

Publication number: CN113936140B
Application number: CN202111367546.7A
Authority: CN
Inventors: 温蜜; 吕欢欢; 王亮亮; 张凯; 魏敏捷
Original assignee: Shanghai Electric Power University
Current assignee: Shanghai Electric Power University
Priority date: 2021-11-18
Filing date: 2021-11-18
Publication date: 2024-06-18
Anticipated expiration: 2041-11-18
Also published as: CN113936140A

Abstract

The invention provides an incremental learning-based evaluation method for an attack resisting model, which is characterized in that a Deeplab v semantic segmentation model is used for carrying out feature extraction on sample data by combining with a knowledge distillation incremental learning method to obtain a semantic segmentation graph, then different attack algorithms are used for carrying out attack on the models adopting different learning methods under different disturbance values to obtain attack success rates, finally the attack success rates of the models adopting different learning methods are compared to obtain new knowledge which can be learned under the condition of not storing old task images by the incremental learning method, so that the waste in time and space is reduced, the catastrophic forgetting problem generated when the deep learning model uses batch learning is solved, and meanwhile, the influence of the attack resisting sample on the deep learning model when the deep learning model in an unmanned scene executes the incremental learning task is also obtained.

Description

Incremental learning-based evaluation method for challenge sample attack model

Technical Field

The invention relates to an incremental learning-based evaluation method for a model for resisting sample attack.

Background

With the rise of artificial intelligence, the appearance of unmanned automobiles can relieve road traffic jams and reduce the risk of traffic accidents, wherein deep learning has become one of the key technologies. But deep learning models have been demonstrated by many workers to be vulnerable and vulnerable to challenge samples. An attacker can cause the error of the classification result output by the classification model by adding some fine disturbance on the original picture, thereby achieving the purpose of attack. For unmanned systems, safety is critical. Therefore, an attack against a sample will affect the deployment of artificial intelligence in an unmanned scenario, and there is also a great potential safety hazard. In addition, deep learning models also suffer from catastrophic forgetfulness. As unmanned vehicles travel on roads, they need to learn new categories and their different representations. When systems require models to learn new knowledge while not forgetting old knowledge, they can exhibit serious performance degradation. Recently, incremental learning techniques have been observed to address the challenges described above. However, previous studies on countering sample attacks in unmanned scenarios have focused mainly on batch learning. It is not clear how much the use of countersample attacks has impact on the deep learning model when performing incremental learning tasks. This problem exposes the potential safety hazards of the unmanned system and also increases the opportunities for research.

Disclosure of Invention

In order to solve the problems, the invention provides an incremental learning-based evaluation method for a model for resisting sample attack, which adopts the following technical scheme:

The invention provides an incremental learning-based evaluation method for a model for resisting sample attack, which is characterized by comprising the following steps: step S1, training data is obtained based on a preset data set, wherein the training data comprises a plurality of categories; step S2, adopting a preset semantic segmentation model to respectively perform non-incremental learning, L' _D type incremental learning and E _qL'_D type incremental learning on training data; step S3, extracting features of the learned training data based on a preset semantic segmentation model, and respectively obtaining a first semantic segmentation map for non-incremental learning, a second semantic segmentation map for L' _D type incremental learning and a third semantic segmentation map for E _qL'_D type incremental learning; step S4, adopting a plurality of predetermined attack algorithms to attack the first semantic segmentation map, the second semantic segmentation map and the third semantic segmentation map under different disturbance values, and respectively obtaining corresponding attack success rates; and S5, comparing the attack success rate, so as to evaluate the robustness of the model based on incremental learning.

The evaluation method of the model for resisting the sample attack based on incremental learning can also have the technical characteristics that a preset semantic segmentation model is DeepLab v model, the model comprises cavity convolution, cavity space pyramid pooling and conditional random fields, the DeepLab v model obtains an approximate semantic segmentation result by using DCNN, a feature map is restored to the resolution of an original image according to bilinear difference values, and the semantic segmentation result is perfected by adopting a fully connected conditional random field.

The evaluation method of the anti-sample attack model based on incremental learning provided by the invention can also have the technical characteristics that the predetermined attack algorithm comprises an FGSM attack algorithm, a DeepFool attack algorithm and an MI-FGSM attack algorithm.

The method for evaluating the model for resisting the sample attack based on incremental learning provided by the invention can also have the technical characteristics that the preset data set is a Pascal VOC2012 data set, and the sample data contains 21 categories.

The evaluation method of the model for resisting the sample attack based on incremental learning provided by the invention can also have the technical characteristics that two groups of experimental data are respectively as follows: the 21 categories of sample data are divided into a first set of experimental data of the first 20 categories and the last 1 category, and the 21 categories of sample data are divided into a second set of experimental data of the first 16 categories and the last 5 categories.

The evaluation method of the model for resisting the sample attack based on incremental learning provided by the invention can also have the technical characteristics that the learning process based on the first group of experimental data in the step S2 is as follows: non-incremental learning is conducted on the first 20 categories in the first group of experimental data, and non-incremental learning, L' _D type incremental learning and E _qL'_D type incremental learning are conducted on the last 1 categories in the first group of experimental data respectively; the learning process based on the second set of experimental data in step S2 is: non-incremental learning is performed on the first 16 categories in the second set of experimental data, and non-incremental learning, L' _D incremental learning and E _qL'_D incremental learning are performed on the last 5 categories in the second set of experimental data, respectively.

The evaluation method for the model against the sample attack based on incremental learning provided by the invention can also have the technical characteristics that the L '_D incremental learning is to perform knowledge distillation on the output layer of the preset semantic segmentation model to obtain distillation loss L' _D,E_qL'_D incremental learning is to freeze the encoder while performing knowledge distillation on the output layer of the preset semantic segmentation model, and the distillation loss E _qL'_D is obtained when the encoder is frozen.

The evaluation method of the model for resisting the sample attack based on incremental learning provided by the invention can also have the technical characteristics that the distillation loss L' _D is as follows:

In the method, in the process of the invention, Referring to a new training sample for each step, k is the incremental step of the index, k=1, 2, …, so that each time the model learns a new set of classes, M _k(X_n [ c ]) the evaluation score for class c is reflected, and S _k-1 is the combination of all classes previously learned.

The evaluation method of the model for resisting the sample attack based on incremental learning provided by the invention can also have the technical characteristics that the step S5 further comprises the step of performing the resisting training on the model based on the incremental learning to improve the robustness of the model, wherein the resisting training is as follows: and generating a countering sample aiming at the attacked model based on incremental learning by adopting a countering sample algorithm, inputting the countering sample and sample data into the model based on incremental learning for training, and learning by adopting a supervised learning mode.

The actions and effects of the invention

According to the incremental learning-based evaluation method for the anti-sample attack model, the Deeplab v semantic segmentation model is combined with the incremental learning method of knowledge distillation to perform feature extraction on sample data to obtain the semantic segmentation graph, then different attack algorithms are used for attacking the models adopting different learning methods under different disturbance values to obtain attack success rates, finally the incremental learning method can learn new knowledge under the condition of not storing old task images by comparing the attack success rates of the models adopting different learning methods, so that waste in time and space is reduced, and the problem of catastrophic forgetting generated when the deep learning model adopts batch learning can be solved. Meanwhile, the influence of the attack of the countersample on the deep learning model is also obtained when the incremental learning task is executed aiming at the deep learning model in the unmanned scene.

Drawings

FIG. 1 is a flow chart of a method of evaluating a model of challenge sample attack based on incremental learning in an embodiment of the present invention;

FIG. 2 is a schematic diagram of ASPP module in DeepLab v model in accordance with an embodiment of the invention;

FIG. 3 is a schematic diagram of the framework of the kth incremental learning step in an embodiment of the present invention;

FIG. 4 is a schematic diagram of a freezing scheme of an encoder in a kth incremental step in an embodiment of the present invention;

FIG. 5 is a schematic diagram of semantic segmentation results of a first set of experimental data according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of semantic segmentation results of a second set of experimental data according to an embodiment of the present invention;

FIG. 7 is an attack success rate based on a first set of experimental data in an embodiment of the present invention;

fig. 8 is an attack success rate based on a second set of experimental data in an embodiment of the present invention.

Detailed Description

In order to make the technical means, the creation characteristics, the achievement purposes and the effects of the implementation of the invention easy to understand, the following describes an evaluation method of the model for resisting the sample attack based on incremental learning with reference to the embodiment and the attached drawings.

< Example >

The evaluation method of the anti-sample attack model based on incremental learning provided by the invention is used for carrying out experiments by taking unmanned as a scene. The experimental environment is set as follows: the experimental hardware was configured as Intel (R) Core (TM) i7-7800X CPU,NVIDIA GeForce RTX 2080Ti GPU and 24GB RAM, running on the Tensorflow framework in the Ubuntu environment, with the primary environment configured as python 3.6.

FIG. 1 is a flow chart of a method of evaluating a model of challenge sample attack based on incremental learning in an embodiment of the present invention.

As shown in fig. 1, the method for evaluating the model against the sample attack based on incremental learning includes the following steps:

step S1, sample data containing a plurality of categories is acquired based on a preset data set.

In this embodiment, the predetermined dataset is a Pascal VOC 2012 dataset containing a total of 10582 training images and 1449 Zhang Yanzheng images of 21 categories (background、aeroplane、bicycle、bird、boat、bottle、bus、car、cat、chair、cow、dining table、dog、horse、motorbike、person、potted plant、sheep、sofa、train、tv/monitor),.

Furthermore, the dataset contains 6 categories (bicycle, bus, car, motorbike, person, train) which are common categories in unmanned scenarios. It can therefore be used not only to evaluate the performance of a sample attack resistance algorithm based on incremental technology in an unmanned scenario, but also more generally.

And S2, dividing the sample data into two groups to serve as experimental data, and respectively performing two groups of non-incremental learning, L' _D type incremental learning and E _qL'_D type incremental learning on the experimental data by adopting a preset semantic segmentation model.

In this embodiment, the predetermined semantic segmentation model is DeepLab v's 2 model, which includes hole convolution, hole space pyramid pooling (ASPP), and Conditional Random Field (CRF).

Fig. 2 is a schematic diagram of ASPP modules in DeepLab v model in an embodiment of the invention.

The ASPP module (shown in fig. 2) is inspired by the SPP module, and replaces a common convolution layer in the SPP module with parallel hole convolutions with different expansion rates to extract image features, so as to acquire global and local feature information with different scales and acquire various receptive fields, thereby improving the final segmentation accuracy.

Wherein, two groups of experimental data are respectively:

A first set of experimental data that classifies 21 categories of sample data into a first 20 categories and a last 1 category; the 21 categories of sample data were divided into a second set of experimental data of the first 16 categories and the last 5 categories.

The learning process based on the first set of experimental data in this step S2 is:

Non-incremental learning is performed on the first 20 categories in the first set of experimental data, and non-incremental learning, L' _D incremental learning and E _qL'_D incremental learning are performed on the last 1 categories (i.e., tv/monitor) in the first set of experimental data, respectively.

The learning process based on the second set of experimental data in this step S2 is:

Non-incremental learning is performed on the first 16 categories in the second set of experimental data, and non-incremental learning, L' _D incremental learning, and E _qL'_D incremental learning are performed on the last 5 categories (potted plant, sheep, sofa, train, tv/monitor) in the second set of experimental data, respectively.

FIG. 3 is a schematic diagram of a frame of a kth incremental learning step in an embodiment of the present invention.

As shown in fig. 3, the L '_D delta learning is a knowledge distillation on the output layer of the DeepLab v model to obtain distillation loss L' _D,L'_D is to mask the log-cross entropy loss resulting from the output of the previous ZENsoftmax layer and the output of the softmax layer in the current model M _k (assuming currently in the kth delta learning step). This is because we want to save them by guiding the learning process, so the cross entropy is masked, which is very useful for considering the class that has been seen.

Wherein, distillation loss L' _D is:

FIG. 4 is a schematic diagram of the freezing scheme of the encoder in the kth increment step in an embodiment of the invention.

E _qL'_D delta learning was to freeze the encoder while performing knowledge distillation on the output layer of the DeepLab v model, obtaining a distillation loss E _qL'_D when the encoder is frozen.

The incremental learning method is a modification based on the first incremental learning method L' _D, on which the encoder aims to extract some representation of the intermediate features. This approach allows the network to learn the new class only through the decoder. It retains the same feature extraction function as compared to the previous training phase, as shown in fig. 4, where M _k-1 is the whole model of the previous step.

Wherein knowledge distillation is the migration of knowledge learned from a complex model or models onto another simple model. The two incremental learning methods described above are the most challenging arrangements, do not store (do not waste storage space), the previous performance does not degrade the image from the old task and cannot be used to aid the incremental process, which is particularly suitable for systems like unmanned cars, where both privacy concerns and storage requirements are involved.

And step S3, extracting features of the two groups of experimental data after learning based on DeepLab v model, and respectively obtaining a first semantic segmentation map of non-incremental learning, a second semantic segmentation map of L' _D type incremental learning and a third semantic segmentation map of E _qL'_D type incremental learning which correspond to the two groups of experimental data.

In an embodiment, deepLab v model obtains an approximate semantic segmentation result by using DCNN, restores the feature map to the original image resolution according to bilinear difference, and perfects the semantic segmentation result by using fully connected conditional random fields.

Fig. 5 is a schematic diagram of a semantic segmentation result of a first set of experimental data in an embodiment of the present invention, and fig. 6 is a schematic diagram of a semantic segmentation result of a second set of experimental data in an embodiment of the present invention.

As shown in fig. 5, the DeepLab v model performs feature extraction on the tv/monitor (RGB column in the figure) which is the last 1 categories based on non-incremental learning (GT column in the figure), L '_D incremental learning (L' _D column in the figure) and incremental learning (E _qL'_D column in the figure), respectively, to obtain a corresponding semantic segmentation example graph.

As shown in fig. 6, deepLab v models are used for the last 5 categories based on non-incremental learning (column GT in the figure), L '_D incremental learning (column L' _D in the figure), and incremental learning (column E _qL'_D in the figure), respectively: potted plant, sheep, sofa, train, tv/monitor (RGB columns in the figure) to perform feature extraction to obtain a corresponding semantic segmentation example graph.

And S4, respectively attacking the first semantic segmentation map, the second semantic segmentation map and the third semantic segmentation map in the two groups of experimental data by adopting a plurality of predetermined attack algorithms under different disturbance values, and respectively acquiring two groups of corresponding attack success rates.

Research on combating sample attacks in recent years can be largely divided into the following three types: white box attacks, black box attacks, and physical attacks.

The precondition of white-box attack is that the architecture of the model, including the parameter values of each layer and the composition of the model, can be fully obtained, the input of the model can be completely controlled, and the granularity of control on the input can even reach the bit level. Its advantage is high calculation speed, but the gradient information of target network is needed. The white-box attack algorithm mainly comprises the following algorithms: a fast gradient algorithm (FGSM), a saliency map attack algorithm (JSMA), a DeepFool algorithm, a momentum iterative fast gradient algorithm (MI-FGSM), a C & W algorithm, and the like.

In this embodiment, an FGSM attack algorithm, a DeepFool attack algorithm and an MI-FGSM attack algorithm are adopted to respectively attack semantic segmentation graphs obtained by respectively learning the experimental data of the base group when the disturbance value epsilon is set to epsilon=0.3, epsilon=0.2 and epsilon=0.1 so as to obtain corresponding attack success rates.

And S5, comparing the attack success rates in each group, so as to evaluate the robustness of the model based on incremental learning.

Fig. 7 is an attack success rate based on a first set of experimental data in an embodiment of the present invention.

In this example, in the learning based on the first set of experimental data, the model of non-incremental learning was designated as M (0-20), the model of L '_D -type incremental learning was designated as M (0-19) +M (20) (L' _D), and the model of E _qL'_D -type incremental learning was designated as M (0-19) +M (20) (E _qL'_D).

As shown in fig. 7, the FGSM attack algorithm at the disturbance value epsilon=0.3 is first selected for detailed analysis:

when L' _D incremental learning is adopted, the attack success rate can reach 94.55%;

when E _qL'_D incremental learning is adopted, the attack success rate can reach 92.10%;

when non-incremental learning is adopted, the attack success rate only reaches 86.12 percent.

Thus, incremental learning can increase the attack success rate of the model by 8.43%, and from the attack results of only the first 20 classes, it can be found that the attack success rate is indeed increased after incremental learning.

Then, the attack success rate of the perturbation value epsilon=0.2 is analyzed for DeepFool attack algorithms:

when L' _D incremental learning is adopted, the attack success rate can reach 83.71%;

when E _qL'_D incremental learning is adopted, the attack success rate can reach 81.52%;

When non-incremental learning is used, the attack success rate reaches only 80.18%.

Thus, when the disturbance value ε=0.2, incremental learning can improve the attack success rate of the model by 3.53%.

Similarly, analysis of MI-FGSM attack algorithm when the perturbation value epsilon=0.3, the attack success rate can be improved by 2.59% by adopting L' _D incremental learning.

In addition, E _qL'_D type increment learning can also improve certain attack success rate, but L '_D type increment learning is not improved much, so that E _qL'_D type increment learning can be obtained to be better than L' _D type increment learning in robustness resistance, namely when the model adopts increment type learning, the attack success rate of resisting a sample to attack the model is higher than that of a model of non-increment type learning.

In this example, in the learning based on the second set of experimental data, the model of non-incremental learning was designated as M (0-15), the model of L '_D type incremental learning was designated as M (0-15) +M (16-20) (L' _D), and the model of E _qL'_D type incremental learning was designated as M (0-15) +M (16-20) (E _qL'_D).

As shown in fig. 8, the FGSM attack algorithm at the disturbance value epsilon=0.3 is first selected for detailed analysis:

When L' _D incremental learning is adopted, the attack success rate can reach 92.14%;

when E _qL'_D incremental learning is adopted, the attack success rate can reach 93.75%;

Thus, incremental learning may improve the success rate of attacks on models.

Then, the attack success rate of the perturbation value epsilon=0.3 is analyzed for DeepFool attack algorithms:

when L' _D incremental learning is adopted, the attack success rate can reach 82.23%;

When E _qL'_D incremental learning is adopted, the attack success rate can reach 83.39%;

When non-incremental learning is adopted, the attack success rate only reaches 81.16 percent.

At this time, the E _qL'_D incremental learning can improve the attack success rate by 2.23%.

Similarly, analysis of MI-FGSM attack algorithm E _qL'_D incremental learning can increase attack success rate by 3.08% when perturbation value epsilon=0.1.

Therefore, the incremental learning-based challenge sample attack model has higher attack success rate. In the second group of experiments, the attack success rate of the E _qL'_D type increment learning method is higher than that of the L' _D type increment learning method.

Therefore, when the model adopts an incremental learning method to learn samples, new knowledge can be learned without storing old task images, so that waste in time and space is reduced, the problem of catastrophic forgetting of a deep learning architecture can be solved, and the robustness of the model is reduced.

In this embodiment, in order to further improve the robustness of the model using incremental learning, countermeasure training is also added. Specifically:

firstly, common challenge sample algorithms such as an FGSM attack algorithm, a DeepFool attack algorithm, an MI-FGSM attack algorithm and the like are used for generating a large number of challenge samples aiming at an attacked model, and then the challenge samples and the original data are put into the model for retraining, and supervised learning is carried out, so that a reinforced model is obtained.

Example operation and Effect

According to the incremental learning-based evaluation method for the anti-sample attack model, the Deeplab v semantic segmentation model is combined with the incremental learning method of knowledge distillation to perform feature extraction on sample data to obtain a semantic segmentation graph, then different attack algorithms are used for attacking the models adopting different learning methods under different disturbance values to obtain attack success rates, finally the incremental learning method can learn new knowledge under the condition of not storing old task images by comparing the attack success rates of the models adopting different learning methods, so that time and space waste is reduced, and the problem of catastrophic forgetting caused when the deep learning model uses batch learning can be solved. Meanwhile, the influence of the attack of the countersample on the deep learning model is also obtained when the incremental learning task is executed aiming at the deep learning model in the unmanned scene.

In an embodiment, the predetermined dataset is a Pascal VOC 2012 dataset that contains 6 categories (bicycle, bus, car, motorbike, person, train) that are common categories in unmanned scenes. It can therefore be used not only to evaluate the performance of a sample attack resistance algorithm based on incremental technology in an unmanned scenario, but also more generally.

In the embodiment, the incremental learning-based model is reinforced by adopting the countermeasure training, so that the robustness of the model is effectively improved, and the influence of the countermeasure sample attack on the incremental learning-based model is reduced.

The above examples are only for illustrating the specific embodiments of the present invention, and the present invention is not limited to the description scope of the above examples.

Claims

1. An incremental learning-based evaluation method for a challenge sample attack model is characterized by comprising the following steps:

Step S1, acquiring sample data containing a plurality of categories based on a preset data set;

Step S2, dividing the sample data into two groups to serve as experimental data, and respectively carrying out two groups of non-incremental learning, L' _D type incremental learning and E _qL'_D type incremental learning on the experimental data by adopting a preset semantic segmentation model;

Step S3, extracting features of the two groups of learned experimental data based on the preset semantic segmentation model, and respectively obtaining a first semantic segmentation map of non-incremental learning, a second semantic segmentation map of L' _D incremental learning and a third semantic segmentation map of E _qL'_D incremental learning which correspond to the two groups of experimental data;

Step S4, adopting a plurality of predetermined attack algorithms to attack the first semantic segmentation map, the second semantic segmentation map and the third semantic segmentation map in the two groups of experimental data under different disturbance values, and respectively obtaining two groups of corresponding attack success rates;

step S5, evaluating the robustness of the model based on incremental learning by comparing the attack success rates in each group,

Wherein the predetermined semantic segmentation model is DeepLab v < 2 > model which comprises cavity convolution, cavity space pyramid pooling and conditional random field,

The DeepLab v model obtains an approximate semantic segmentation result by using DCNN, restores the feature map to the original image resolution according to bilinear difference, perfects the semantic segmentation result by adopting a fully connected conditional random field,

The L '_D delta learning is to perform knowledge distillation on the output layer of the predetermined semantic segmentation model to obtain distillation loss L' _D,

The E _qL'_D delta learning is to freeze an encoder while performing knowledge distillation on an output layer of the predetermined semantic segmentation model, obtain a distillation loss E _qL'_D when the encoder is frozen,

The distillation loss L' _D is:

2. The method for evaluating a model of challenge sample attack based on incremental learning of claim 1, wherein:

Wherein the predetermined attack algorithm comprises an FGSM attack algorithm, deepFool attack algorithm and an MI-FGSM attack algorithm.

3. The method for evaluating a model of challenge sample attack based on incremental learning of claim 1, wherein:

wherein the predetermined data set is PascalVOC 2012 data set and the sample data contains 21 categories.

4. A method of evaluating a model of challenge sample attack based on incremental learning as claimed in claim 3 wherein:

wherein, the two groups of experimental data are respectively:

The 21 categories of sample data are divided into a first set of experimental data of the first 20 categories and the last 1 category,

The 21 categories of sample data were divided into a second set of experimental data of the first 16 categories and the last 5 categories.

5. The method for evaluating a model for combating a sample attack based on incremental learning according to claim 4, wherein:

the learning process based on the first set of experimental data in the step S2 is as follows:

The non-incremental learning is performed on the first 20 categories in the first set of experimental data,

Performing the non-incremental learning, the L' _D -type incremental learning, and the E _qL'_D -type incremental learning on the last 1 categories in the first set of experimental data, respectively;

the learning process based on the second set of experimental data in the step S2 is:

the non-incremental learning is performed on the first 16 categories in the second set of experimental data,

The non-incremental learning, the L' _D -incremental learning, and the E _qL'_D -incremental learning are performed on the last 5 categories in the second set of experimental data, respectively.

6. The method for evaluating a model of challenge sample attack based on incremental learning of claim 1, wherein:

Wherein the step S5 further comprises performing countermeasure training on the model based on incremental learning to promote the robustness of the model,

The countermeasure training is as follows:

generating a challenge sample aiming at the attacked model based on incremental learning by adopting a challenge sample algorithm, inputting the challenge sample and sample data into the model based on incremental learning for training, and adopting a supervised learning mode for learning.