CN112115963B

CN112115963B - Method for generating unbiased deep learning model based on transfer learning

Info

Publication number: CN112115963B
Application number: CN202010750897.5A
Authority: CN
Inventors: 陈晋音; 陈治清; 徐国宁; 徐思雨; 缪盛欢; 郑海斌
Original assignee: Zhejiang University of Technology ZJUT
Current assignee: Zhejiang University of Technology ZJUT
Priority date: 2020-07-30
Filing date: 2020-07-30
Publication date: 2024-02-20
Anticipated expiration: 2040-07-30
Also published as: CN112115963A

Abstract

The invention discloses a method for generating an unbiased deep learning model based on transfer learning, comprising the following steps: (1) constructing an original data set of sample images annotated with task labels and bias labels; (2) training a biased deep learning model on the original data set; (3) constructing and training an adversarial attack network, and using the trained adversarial attack network to attack the original data set to obtain an unbiased data set; (4) training, on the unbiased data set, an initial unbiased deep learning model with the same structure as the biased deep learning model; (5) preparing a third feature extractor whose parameters are determined by a transfer learning strategy, the third feature extractor and the second classifier of the trained initial unbiased deep learning model together forming the unbiased deep learning model. The method ensures the fairness of the deep learning model when it makes automatic decisions from an input image and improves the accuracy of image recognition.

Description

Method for generating unbiased deep learning model based on transfer learning

Technical Field

The invention belongs to the field of deep learning, and particularly relates to a method for generating an unbiased deep learning model based on transfer learning.

Background

Deep learning helps people make decisions automatically and solves many complex pattern recognition problems, owing to its strong ability to learn the inherent regularities of a sample data set and to abstract features at a high level. It has therefore been applied with good results in medical diagnosis, speech recognition, image recognition, natural language understanding, advertising, credit, employment, education, criminal justice and other fields. With continued exploration and innovation by researchers, the performance of deep learning keeps improving, its applications keep widening, and it has a profound effect on people's daily lives.

Although deep learning can help people obtain more accurate predictions, recent studies have shown that deep learning models can also exhibit bias in automatic decision making. For example: the predicted probability that a Black person will reoffend is far higher than that for a white person; when predicting the gender of a person in a photo, accuracy for white people is far higher than for Black people; and a search for "software engineer" returns a far higher proportion of men than women. In important settings such as enterprises, using a deep learning model to make decisions may expose the enterprise to a high-risk business environment because of this bias, whereas abandoning deep learning models may cost it its competitive advantage and lead to its elimination, since automatic decision support based on deep learning is the trend of the times. Bias in deep learning models can have many negative effects on society and reaches into many fields, so research on deep learning algorithms and on measuring their fairness is particularly important.

Deep learning models are biased mainly because the sample data set itself is biased, the deep learning model amplifies this bias, and the evaluation of the deep learning model is biased. Current work on eliminating bias from deep learning models therefore mainly includes preprocessing the sample data set to remove bias, making small-scale modifications to the model parameters to remove bias present in the model, and evaluating the model fairly. However, existing debiasing methods often consider only one of these sources of bias. For example, when the sample data set is preprocessed directly, the trained model never learns from the biased data set, so it may be sensitive to biased or irrelevant features when it later recognizes original, biased data; and because the amplification of bias by the deep learning model is not taken into account, the trained model can remove only part of the bias.

In view of the bias of deep learning models and the limitations of current debiasing methods, research on a method for generating an unbiased deep learning model based on transfer learning, so that an unbiased model can assist people in automatic decision making, is of great theoretical and practical significance.

Disclosure of Invention

The invention aims to provide a method for generating an unbiased deep learning model based on transfer learning. Through knowledge transfer, the deep learning model automatically filters out biased features when learning from sample data, which ensures the fairness of the model when it makes automatic decisions from input images and improves the accuracy of image recognition.

In order to achieve the above object, the present invention provides the following technical solutions:

a method for generating an unbiased deep learning model based on transfer learning, comprising the steps of:

(1) Acquiring a sample image, marking task labels and bias labels of the sample image, and constructing an original data set;

(2) Training a biased deep learning model consisting of a first feature extractor and a first classifier by utilizing image data and task labels in an original dataset to obtain a trained biased deep learning model;

(3) Constructing and training an adversarial attack network, and attacking the original data set with the trained adversarial attack network to obtain an unbiased data set corresponding to the original data set, so that the bias labels in the unbiased data set cannot be predicted;

(4) Training an initial unbiased deep learning model with the same structure as the biased deep learning model by using an unbiased data set;

(5) Preparing a third feature extractor, constructing a loss function by utilizing the feature distribution extracted by the third feature extractor from the original sample image and the feature distribution extracted by the second feature extractor from the unbiased image corresponding to the original sample image, optimizing parameters of the third feature extractor by utilizing the loss function, and forming an unbiased deep learning model by utilizing the third feature extractor with determined parameters and a second classifier contained in the trained initial unbiased deep learning model.

Preferably, constructing and training the adversarial attack network comprises:

constructing an adversarial attack network comprising convolution layers and fully connected layers with ReLU activation functions, wherein the input of the adversarial attack network is the feature distribution extracted from an original sample image by the trained first feature extractor, the output is a logits layer, and the predicted probability distribution of the bias label of the original sample image is obtained through a softmax function;

constructing a loss function Loss_NAdv of the adversarial attack network, wherein the loss function aims to enable the adversarial attack network to predict the probability distribution of the bias label from the feature distribution corresponding to the original sample image, and the calculation formula is as follows:

Loss_NAdv＝(1/N)∑_{i=1}^{N} L(NAdv(z_i),B_i)

where z_i is the output of the first feature extractor of the trained biased deep learning model for the original sample image x_i; B_i is the true bias label of the original sample image x_i; NAdv(·) denotes the output of the adversarial attack network; L(·) denotes the cross-entropy function; i is the index of the original sample image; and N is the total number of original sample images;

training the adversarial attack network with the loss function Loss_NAdv to optimize the model parameters of the adversarial attack network.
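As a minimal sketch of how Loss_NAdv could be evaluated, the snippet below computes a cross-entropy between softmax outputs and integer bias labels. The function names, the use of class indices for B_i, and the averaging over the N samples are assumptions made for illustration, not details confirmed by the text.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Cross-entropy L(NAdv(z_i), B_i) for one sample; label is a class index."""
    return -math.log(softmax(logits)[label])

def loss_nadv(batch_logits, bias_labels):
    """Cross-entropy averaged over the samples (the averaging is an assumption)."""
    n = len(batch_logits)
    return sum(cross_entropy(l, b) for l, b in zip(batch_logits, bias_labels)) / n

# Hypothetical logits of the adversarial attack network for two samples
batch_logits = [[2.0, 0.0], [0.0, 0.0]]
bias_labels = [0, 1]
print(loss_nadv(batch_logits, bias_labels))
```

A low value means the network recovers the bias label from the features easily, which is the situation the later perturbation step tries to undo.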

Preferably, attacking the original data set with the trained adversarial attack network to obtain an unbiased data set corresponding to the original data set includes:

(a) Designing a perturbation variable r;

(b) Adding the perturbation variable r to the original sample image x_i to obtain a perturbed sample image, and extracting the perturbed feature distribution of the perturbed sample image with the first feature extractor of the trained biased deep learning model;

(c) Feeding the perturbed feature distribution to the trained adversarial attack network to obtain a predicted probability distribution, and calculating the loss Loss_Adv from the predicted probability distribution; when the number of iterations has not reached the maximum number of iterations, updating the perturbation variable r according to Loss_Adv and returning to step (b); once the maximum number of iterations is reached, outputting the perturbed sample image obtained with the latest perturbation variable r as the unbiased image, the unbiased images forming the unbiased data set;

the calculation formula of the loss Loss_Adv is:

Loss_Adv＝-αLoss_NAdv+Loss_Y

where α is a hyperparameter with a value in the range 0 to 1, and Loss_Y is the loss function value of the task label (as opposed to the bias label), calculated as follows:

Loss_Y＝(1/N)∑_{i=1}^{N} L(c_1(z_i),y_i)

where c_1(·) denotes the predicted output of the first classifier of the biased deep learning model, and y_i denotes the task label of the original sample image x_i.
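One way to read the sign convention of Loss_Adv is as a per-sample objective that the perturbation r minimizes. The symbol f_1 for the first feature extractor, the notation z_i(r), and the explicit arg min are illustrative additions that do not appear in the text:

```latex
r^{*} = \arg\min_{r}\; \Bigl[\, -\alpha\, L\bigl(\mathrm{NAdv}(z_i(r)),\, B_i\bigr)
        + L\bigl(c_1(z_i(r)),\, y_i\bigr) \Bigr],
\qquad z_i(r) = f_1(x_i + r), \quad 0 \le \alpha \le 1
```

Under this reading, lowering the first term raises the bias-prediction loss of the adversarial attack network, while lowering the second term keeps the task prediction of the first classifier intact; α sets the trade-off between the two.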

The loss function Loss_tl for optimizing the parameters of the third feature extractor is:

Loss_tl＝∑L(h,h')

where h denotes the feature distribution output by the third feature extractor for the original sample image x_i, and h' denotes the feature distribution extracted by the second feature extractor of the initial unbiased deep learning model from the unbiased image corresponding to the original sample image x_i.

After the sample images are acquired, they are rotated, flipped, color-enhanced, augmented with Gaussian noise and randomly scaled to expand the sample set. The bias labels include race labels, region labels and gender labels.

Preferably, the first feature extractor and the second feature extractor employ a ResNet-50 model;

the first classifier and the second classifier of the initial unbiased deep learning model employ a network consisting of fully connected layers.

The first classifier and the second classifier of the initial unbiased deep learning model adopt a network consisting of 4 fully connected layers;

the adversarial attack network adopts a network consisting of 3 convolution layers and 4 fully connected layers, with ReLU activation functions.

Preferably, the training parameters of the adversarial attack network, the initial unbiased deep learning model and the third feature extractor are set as follows: the batch size is 32, the maximum number of training iterations is 60, the optimizer is Adam, the learning rate is 0.001, and the exponential decay rates of the first and second moment estimates are 0.9 and 0.999, respectively.
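To show where the learning rate 0.001 and the decay rates 0.9 and 0.999 enter, the following sketch applies the standard Adam update rule to a single scalar parameter. The toy objective (w - 3)^2 and the function name are illustrative assumptions; a real implementation would use the optimizer of a deep learning framework.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; returns (param, m, v)."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)                # bias-corrected second moment
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy objective f(w) = (w - 3)^2, run for 60 steps to mirror the 60-iteration setting
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 61):
    grad = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t)
print(w)
```

With a learning rate of 0.001, each early step moves the parameter by roughly 0.001 regardless of the gradient magnitude, which is why 60 iterations on this toy problem only cover a small distance.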

Compared with the prior art, the invention has at least the following beneficial effects:

with the method for generating an unbiased deep learning model based on transfer learning provided by the embodiments of the invention, the deep learning model acquires, through the transfer learning strategy, the ability to automatically filter out biased features of the sample data, which ensures fair model decisions and improves the accuracy of image recognition.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person skilled in the art may obtain other drawings from them without inventive effort.

FIG. 1 is a schematic flow chart of a method for generating an unbiased deep learning model based on transfer learning according to an embodiment of the present invention;

FIG. 2 is a system framework diagram of an unbiased deep learning model based on transfer learning according to an embodiment of the present invention;

FIG. 3 is a flow chart of the construction of generating unbiased data sets according to an embodiment of the present invention;

FIG. 4 is a training flowchart of an unbiased deep learning model based on transfer learning according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the detailed description is presented by way of example only and is not intended to limit the scope of the invention.

To solve the problem of inaccurate image recognition caused by the bias of deep learning models, this embodiment provides a method for generating an unbiased deep learning model based on transfer learning. As shown in FIG. 1, the method comprises the following steps:

(1) Definition of bias of deep learning model.

The invention defines the decision bias of a deep learning model as an erroneous decision caused by spuriously correlated features during automatic decision making. The spuriously correlated features, i.e., the bias features, may be race, region, gender and so on. For example, when the bias feature is gender and a deep learning model is used to screen résumés of software engineers, the rejection rate for women is often higher, which results in occupational discrimination against women and unfair model decisions. The invention therefore eliminates model bias by first filtering out irrelevant features and then classifying.

(2) Data set preparation and preprocessing.

This embodiment selects an image data set with multi-label annotations, such as the COCO data set, and takes one of the labels as the bias label B, i.e., the bias feature, such as gender. One or more of the other labels are selected as task labels; the task label may be, for example, an occupation label. The data set is preprocessed and, to improve the recognition accuracy of the deep learning model, data augmentation operations may be added to expand it, including rotation, flipping, color enhancement, Gaussian noise addition and random scaling. The augmented images form the original data set, which is divided into a training set and a test set at a ratio of 7:3, used respectively for transfer learning and for testing the unbiased model.
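The 7:3 division could be sketched as follows. The record fields `image`, `task` and `bias`, the shuffling before the split, and the fixed seed are all assumptions for illustration; the text only specifies the ratio.

```python
import random

def split_dataset(samples, train_ratio=0.7, seed=0):
    """Shuffle a copy of samples and split it into (train, test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical records: each image carries a task label and a bias label
records = [{"image": f"img_{i}.jpg", "task": i % 3, "bias": i % 2} for i in range(10)]
train, test = split_dataset(records)
print(len(train), len(test))  # 7 3
```

Keeping the task label and the bias label on the same record makes it straightforward to train the biased model on one and the adversarial attack network on the other.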

(3) Building and training a biased deep learning model.

In this embodiment, the constructed biased deep learning model includes a first feature extractor and a first classifier, wherein the first feature extractor adopts a ResNet-50 model structure and the first classifier adopts a network formed by 4 fully connected layers. The biased deep learning model is trained on the training set of the original data set, and tested and optimized on the test set, until it reaches a preset recognition accuracy. Because the original data set used for training contains bias features, the trained model is referred to as the biased deep learning model.

(4) Constructing and training the adversarial attack network.

The adversarial attack network aims to predict the bias label B from the output features of the first feature extractor of the biased deep learning model, and is mainly used to generate the unbiased data set. The specific process is as follows:

(4.1) Constructing the structure of the adversarial attack network: the network consists of 3 convolution layers and 4 fully connected layers, with ReLU activation functions. Its input is the output z obtained by passing the original data set through the first feature extractor of the biased deep learning model in step (3), where z denotes the feature distribution of the original data set. Its output is a logits layer, from which the predicted probability distribution of the bias labels of the original data set is obtained through a softmax function.

(4.2) Designing the loss function of the adversarial attack network. The loss function aims to enable the adversarial attack network to predict the probability distribution of the bias label from the feature distribution corresponding to the original sample image, and is calculated according to the following formula:

Loss_NAdv＝(1/N)∑_{i=1}^{N} L(NAdv(z_i),B_i) (1)

where z_i is the output obtained by passing the original sample image x_i through the first feature extractor of the biased deep learning model in step (3); B_i is the true bias label of the original sample image x_i; NAdv(·) denotes the output of the adversarial attack network; L(·) denotes the cross-entropy function; and N is the total number of original sample images.

(4.3) Training the adversarial attack network: the batch size is set to 32, the maximum number of training iterations to 60, the optimizer to Adam, the learning rate to 0.001, and the exponential decay rates of the first and second moment estimates to 0.9 and 0.999, respectively. The adversarial attack network is trained on the training set of the original data set and its bias labels B, and tested and optimized on the test set, until it reaches a preset recognition accuracy.

(5) Generating the unbiased data set. The original data set is attacked with the adversarial attack network by adding a perturbation to it, so that the bias label B of the generated unbiased data set cannot be predicted while the other labels are unaffected. The samples of the original data set correspond one-to-one with those of the unbiased data set.

The loss function of the adversarial attack is designed as follows:

Loss_Adv＝-αLoss_NAdv+Loss_Y (2)

where α is a hyperparameter and Loss_Y is the loss function value of the classification labels other than the bias label B, calculated as follows:

Loss_Y＝(1/N)∑_{i=1}^{N} L(c_1(z_i),y_i) (3)

where c_1(·) denotes the output of the first classifier of the biased deep learning model in step (3), and y_i denotes the true label of the original sample image x_i.

The perturbation variable r is designed as a matrix of size width × height × 3, where width and height are the width and height of the sample image and 3 corresponds to its three RGB channels; r is initialized as a zero matrix. The perturbation variable r is optimized with an Adam optimizer whose parameters are the same as in step (4.3).

The maximum number of iterations for a single sample is set to 1000. As shown in FIG. 3, unbiased data is generated for each sample of the original data set as follows:

(5.1) Input the original sample image x_i and go to step (5.2);

(5.2) Superimpose the perturbation variable r on the original sample image x_i to obtain the perturbed sample image x_i', and go to step (5.3);

(5.3) Input the perturbed sample image x_i' into the first feature extractor of the biased deep learning model in step (3) to output z_i, and go to step (5.4);

(5.4) Input z_i into the adversarial attack network of step (4) to output NAdv(z_i), and go to step (5.5);

(5.5) Calculate the loss function value according to formulas (1) to (3), and go to step (5.6);

(5.6) Judge whether the maximum number of iterations has been reached; if so, output the perturbed sample image x_i' as the unbiased image, which becomes part of the unbiased data set, and end the iteration; if not, go to step (5.7);

(5.7) Update the perturbation variable r with Adam according to the loss function value, and go to step (5.2).
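The loop of steps (5.1) to (5.7) can be sketched on a toy problem. Everything concrete here is an assumption for illustration: the "image" is a 2-element vector, the feature extractor is the identity, the adversarial attack network and the first classifier are fixed linear maps, gradients are taken by finite differences, and plain gradient descent with 200 iterations replaces Adam with 1000 iterations.

```python
import math

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    return -math.log(softmax(logits)[label])

def extract(x):   # toy stand-in for the first feature extractor
    return list(x)

def nadv(z):      # toy adversarial attack network: bias read from feature 0
    return [z[0], -z[0]]

def c1(z):        # toy first classifier: task read from feature 1
    return [z[1], -z[1]]

def loss_adv(x, r, bias_label, task_label, alpha):
    z = extract([xi + ri for xi, ri in zip(x, r)])   # perturb, then extract
    return (-alpha * cross_entropy(nadv(z), bias_label)
            + cross_entropy(c1(z), task_label))

def generate_unbiased(x, bias_label, task_label, alpha=1.0, lr=0.1,
                      max_iter=200, h=1e-5):
    r = [0.0] * len(x)                       # r starts as a zero perturbation
    for _ in range(max_iter):                # stop after max_iter updates
        grad = []
        for k in range(len(r)):              # central finite differences
            r_hi = r[:k] + [r[k] + h] + r[k + 1:]
            r_lo = r[:k] + [r[k] - h] + r[k + 1:]
            grad.append((loss_adv(x, r_hi, bias_label, task_label, alpha)
                         - loss_adv(x, r_lo, bias_label, task_label, alpha)) / (2 * h))
        r = [rk - lr * gk for rk, gk in zip(r, grad)]   # descend on Loss_Adv
    return [xi + ri for xi, ri in zip(x, r)]  # sample perturbed with the latest r

x = [2.0, 2.0]
x_unbiased = generate_unbiased(x, bias_label=0, task_label=0)
print(softmax(nadv(extract(x)))[0], softmax(nadv(extract(x_unbiased)))[0])
print(softmax(c1(extract(x)))[0], softmax(c1(extract(x_unbiased)))[0])
```

In this toy setting the probability assigned to the true bias label drops after the attack, while the probability of the true task label does not decrease, which is the qualitative behavior the loss is designed to produce.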

(6) Building and training the initial unbiased deep learning model.

The structure of the initial unbiased deep learning model is the same as that of the biased deep learning model in step (3). That is, the initial unbiased deep learning model includes a second feature extractor that is structurally identical to the first feature extractor and a second classifier that is structurally identical to the first classifier.

The batch size is set to 32, the maximum number of training iterations to 60, the optimizer to Adam, the learning rate to 0.001, and the exponential decay rates of the first and second moment estimates to 0.9 and 0.999, respectively. The initial unbiased deep learning model is trained on the unbiased data set generated in step (5), and tested and optimized on the test set, until it reaches a preset recognition accuracy. The output of the second feature extractor of the initial unbiased deep learning model is denoted h', and the second classifier of the initial unbiased deep learning model is denoted c_2(·).

(7) Designing the training framework of the unbiased deep learning model based on transfer learning.

As shown in fig. 2, the specific process of step (7) is as follows:

(7.1) Designing the structure of the unbiased deep learning model based on transfer learning: a third feature extractor is adopted, whose input is the original sample image x_i and whose output is the intermediate feature h; the third classifier adopts the trained second classifier c_2(·) of the initial unbiased deep learning model;

(7.2) Designing the loss function of the unbiased deep learning model based on transfer learning. The loss function is intended to let the unbiased deep learning model learn the knowledge of the initial unbiased deep learning model, so that it can automatically filter out biased features when facing the original data set. The loss function is calculated as:

Loss_tl＝∑L(h,h') (4)

where h represents the output of the third feature extractor of the unbiased deep learning model and h' represents the output of the second feature extractor of the initial unbiased deep learning model.
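The text does not state which distance L(h, h') is used in formula (4). As a minimal sketch under the assumption that L is a squared L2 distance between feature vectors (a cross-entropy or KL divergence between normalized features would be an equally plausible reading), the loss could be computed as:

```python
def loss_tl(student_feats, teacher_feats):
    """Sum over samples of the squared L2 distance between h and h'.

    student_feats: features h from the third feature extractor (original images)
    teacher_feats: features h' from the second feature extractor (unbiased images)
    """
    total = 0.0
    for h, h_prime in zip(student_feats, teacher_feats):
        total += sum((a - b) ** 2 for a, b in zip(h, h_prime))
    return total

# Hypothetical 2-dimensional features for two samples
h = [[1.0, 2.0], [0.0, 1.0]]
h_prime = [[1.0, 0.0], [0.0, 1.0]]
print(loss_tl(h, h_prime))  # 4.0
```

The pairing matters: each h comes from an original image and each h' from its corresponding unbiased image, which relies on the one-to-one correspondence between the two data sets.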

(7.3) Setting the initial training parameters: the batch size is set to 32, the maximum number of training iterations to 60, the optimizer to Adam, the learning rate to 0.001, and the exponential decay rates of the first and second moment estimates to 0.9 and 0.999, respectively.

(7.4) Designing the training process of the unbiased deep learning model. With reference to FIG. 4, the specific process is as follows:

(7.4.1) Input the original sample image x_i into the third feature extractor to output h, and go to step (7.4.2);

(7.4.2) Input the unbiased image x_i' corresponding to the original sample image x_i into the second feature extractor of the initial unbiased deep learning model to output h', and go to step (7.4.3);

(7.4.3) Calculate the loss function value according to formula (4), and go to step (7.4.4);

(7.4.4) Update the parameters of the unbiased deep learning model according to the loss function value, and go to step (7.4.5);

(7.4.5) Judge whether the maximum number of iterations has been reached; if so, save the model and end training; if not, go to step (7.4.1).
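The training flow of step (7.4) can be sketched with a deliberately tiny model. The assumptions are loud ones: the third feature extractor is a single scalar weight w, the features h' of the frozen second feature extractor are given as precomputed numbers, a squared error stands in for L, plain gradient descent stands in for Adam, and `c2` is a hypothetical threshold classifier.

```python
def train_third_extractor(originals, teacher_feats, lr=0.05, max_iter=60):
    """Fit w so that h = w * x_i matches the frozen teacher feature h'_i."""
    w = 0.0
    for _ in range(max_iter):                         # iterate up to the maximum
        grad = sum(2.0 * (w * x - hp) * x             # d/dw of (w*x - h')^2
                   for x, hp in zip(originals, teacher_feats))
        w -= lr * grad / len(originals)               # update student only
    return w

def c2(h):
    """Hypothetical second classifier: thresholds the feature at zero."""
    return 1 if h > 0 else 0

originals = [1.0, 2.0, -1.0]          # toy "original images" x_i
teacher_feats = [0.5, 1.0, -0.5]      # toy h' from the unbiased images x_i'
w = train_third_extractor(originals, teacher_feats)

def unbiased_model(x):
    """Third feature extractor followed by the second classifier."""
    return c2(w * x)

print(w, unbiased_model(2.0), unbiased_model(-1.0))
```

Only the student weight is updated; the teacher features and the classifier stay fixed, which mirrors the way the final model reuses the trained second classifier unchanged.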

(8) Training and testing the unbiased deep learning model.

Following the training process of step (7.4), the third feature extractor of the unbiased deep learning model is trained with the original data set and the unbiased data set. After training, the third feature extractor is connected to the second classifier trained in the initial unbiased deep learning model of step (6); together they form the unbiased deep learning model generated based on transfer learning. The bias degree λ of this model is tested on the test set of the original data set; the smaller the value of λ, the fairer the model's decisions. The calculation formula is as follows:

λ＝(1/n)∑_{i=1}^{n} I[NAdv(h_i)=B_i] (5)

where n denotes the total number of test-set samples in the original data set; NAdv(h_i) denotes the output of the adversarial attack network when given the features that the feature extractor of the transfer-learned unbiased model extracts from sample x_i; and I[·] denotes the indicator function, whose value is 1 when the equation in the brackets holds and 0 otherwise.
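A minimal sketch of the bias degree, assuming λ is the fraction of test samples for which the adversarial attack network still predicts the true bias label, and assuming the predicted label is the arg max of the network's output. The function name and the sample values are hypothetical.

```python
def bias_degree(nadv_outputs, bias_labels):
    """Fraction of test samples whose bias label is still predicted correctly."""
    hits = 0
    for scores, label in zip(nadv_outputs, bias_labels):
        predicted = max(range(len(scores)), key=lambda k: scores[k])  # arg max class
        hits += 1 if predicted == label else 0                        # indicator I[.]
    return hits / len(bias_labels)

# Hypothetical outputs NAdv(h_i) for four test samples and their true bias labels
outputs = [[2.0, 1.0], [0.0, 3.0], [1.0, 0.0], [0.2, 0.9]]
labels = [0, 1, 1, 1]
print(bias_degree(outputs, labels))  # 0.75
```

For a binary bias label, a value near 0.5 would indicate that the features carry little usable information about the bias attribute.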

The method for generating an unbiased deep learning model based on transfer learning provides a new training framework for unbiased models. It generates the unbiased deep learning model through knowledge transfer from the initial unbiased model and the adversarial attack network, addresses bias in both the training data and the model parameters, and thereby ensures fair model decisions. The transfer-learned unbiased model may use a simple structure, which greatly reduces the computation time of the deep learning model in practical applications. The proposed transfer-learning strategy enables the deep learning model to filter out bias features automatically while preserving accuracy on the original target classification task, ensures fairness of the model's decisions, and provides guidance for research on eliminating bias in deep learning models.

The foregoing embodiments describe the technical solutions and advantages of the invention in detail. It should be understood that they are merely the presently preferred embodiments of the invention and are not intended to limit it; any modifications, additions, substitutions and equivalents made within the scope of the principles of the invention shall fall within the protection scope of the invention.

Claims

1. A method for generating an unbiased deep learning model based on transfer learning, comprising the steps of:

2. The method for generating an unbiased deep learning model based on transfer learning as claimed in claim 1, wherein constructing and training the adversarial attack network includes:

3. The method for generating an unbiased deep learning model based on transfer learning as claimed in claim 2, wherein attacking the original data set with the trained adversarial attack network to obtain an unbiased data set corresponding to the original data set includes:

(a) Designing a perturbation variable r;

(b) Adding the perturbation variable r to the original sample image x_i to obtain a perturbed sample image, and extracting the perturbed feature distribution of the perturbed sample image with the first feature extractor of the trained biased deep learning model;

the calculation formula of the loss Loss_Adv is:

Loss_Adv＝-αLoss_NAdv+Loss_Y

4. The method for generating an unbiased deep learning model based on transfer learning as claimed in claim 1, wherein the loss function Loss_tl for optimizing the parameters of the third feature extractor is:

Loss_tl＝∑L(h,h')

5. The method for generating an unbiased deep learning model based on transfer learning as claimed in claim 1, wherein after the sample images are acquired, they are rotated, flipped, color-enhanced, augmented with Gaussian noise and randomly scaled to expand the sample set, and the bias labels include race labels, region labels and gender labels.

6. The method for generating an unbiased deep learning model based on transfer learning as claimed in claim 1, wherein the first feature extractor and the second feature extractor employ a ResNet-50 model;

7. The method for generating an unbiased deep learning model based on transfer learning as claimed in claim 1, in which the first classifier and the second classifier of the initial unbiased deep learning model employ a network of 4 fully connected layers;

8. The method for generating an unbiased deep learning model based on transfer learning as claimed in claim 1, wherein the training parameters of the adversarial attack network, the initial unbiased deep learning model and the third feature extractor are set as follows: the batch size is 32, the maximum number of training iterations is 60, the optimizer is Adam, the learning rate is 0.001, and the exponential decay rates of the first and second moment estimates are 0.9 and 0.999, respectively.