CN109858506A

CN109858506A - A kind of visualized algorithm towards convolutional neural networks classification results

Info

Publication number: CN109858506A
Application number: CN201810519569.7A
Authority: CN
Inventors: 周连科; 谢晓东; 褚慈; 王红滨; 李秀明; 王念滨; 赵昱杰; 薛冬梅; 王勇军; 何茜茜
Original assignee: Harbin Engineering University
Current assignee: Harbin Three Cup Tea Technology Co.,Ltd.
Priority date: 2018-05-28
Filing date: 2018-05-28
Publication date: 2019-06-07
Anticipated expiration: 2038-05-28
Also published as: CN109858506B

Abstract

The invention discloses a kind of visualized algorithms towards convolutional neural networks classification results, belong to computer vision and digital image processing techniques field.The present invention uses relevance propagation algorithm in full articulamentum, obtains each neuron in the last layer convolutional layer and goes out the Class Activation mapping graph of the convolutional neural networks according to its contribution calculation to the size of the contribution of final output.After obtaining Class Activation mapping graph, also it just obtains in the last layer convolutional layer to the position of the contributive neuron of classification results, according to the propagation algorithm based on location information of proposition, the neuron position of classification will be supported successively to redirect forward in convolutional layer, until input layer, to obtain in input picture to the output contributive location of pixels set of result, finally obtain the visual image that can explain convolutional neural networks classification foundation, the present invention has higher accuracy in terms of the classification for explaining convolutional neural networks, and the feature between classification can be distinguished in interpretive classification decision.

Description

A kind of visualized algorithm towards convolutional neural networks classification results

Technical field

The invention belongs to computer visions and digital image processing techniques field, and in particular to one kind is towards convolutional Neural net The visualized algorithm of network classification results.

Background technique

Currently, convolutional neural networks have been widely applied to many fields as the most commonly used deep learning model, Such as image classification, speech recognition and natural language processing etc., and all achieve good effect, certain fields Reach or even surmounted the performance of the mankind.As it is using more and more extensive, the understanding of network model is become more and more important, More particularly to when the fields such as automatic driving and medical diagnosis, understanding and verifying to Decision of Neural Network are very heavy The ring wanted, people are not intended to help all decisions of do-it-yourself by a uncomprehending black box model, thus to convolutional Neural The research that network class result explains has very important significance.

Explain that convolutional neural networks classification results visualized algorithm mainly has Class Activation mapping algorithm and sensitivity analysis Algorithm.The former is only used in the full convolutional neural networks without containing full articulamentum, the advantage is that the product that can stop reading with Exact Solutions The classification foundation of neural network；And the latter's usability is wider, can be used in most of convolutional neural networks, but solving When releasing classification foundation, for that can not be applicable in well containing the case where input for being more than a kind of target, accuracy need to be mentioned It is high.

General structure convolutional neural networks can not be applied to for above-mentioned Class Activation mapping algorithm and sensitivity analysis is calculated The accuracy deficiency both of these problems of method interpretation classification results are calculated using different propagation respectively in full articulamentum and convolutional layer Method proposes the Class Activation mapping graph visualized algorithm based on correlation.This method is passed in full articulamentum using correlation first Algorithm is broadcast, obtains each neuron in the last layer convolutional layer to the size of the contribution of final output, then according to its tribute It offers advice and calculates the Class Activation mapping graph of the convolutional neural networks.After obtaining Class Activation mapping graph, last is also just obtained It will according to the propagation algorithm based on location information of proposition to the position of the contributive neuron of classification results in layer convolutional layer The neuron position of classification is supported successively to redirect forward in convolutional layer, until input layer, to obtain in input picture to defeated The contributive location of pixels set of result out, finally obtains the visual image that can explain convolutional neural networks classification foundation.

Summary of the invention

The purpose of the present invention is to provide a kind of solution prior arts to pay no attention in the effect for explaining convolutional neural networks classification The visualized algorithm towards convolutional neural networks classification results thought.

The object of the present invention is achieved like this:

The invention discloses a kind of visualized algorithms towards convolutional neural networks classification results, especially by following steps It realizes:

(1) data set for extracting input picture, is trained convolutional neural networks using data set as training set, obtains Trained model parameter；

(2) calculation method according to Rel-CAM algorithm in full articulamentum is successively counted using output result and model parameter Each neural unit in full articulamentum is calculated to the contribution of output, until convolutional layer；

(3) according to contribution of all neural units to output in the last layer convolutional layer obtained in step (2), calculating should The respective weights in each channel of layer and output result, to obtain the Class Activation mapping graph of the network model；

(4) neural unit that is positive of record Class Activation mapping graph intermediate value, the position of the neural unit as in the layer to defeated These neural units are added to the layer in the contributive neuronal ensemble of output by the position of the contributive pixel of result out；

(5) each neuron in set is successively taken out, all neurons and corresponding are calculated in its preceding layer in receptive field The Hadamard product of weight, and sum to the Hadamard in each channel product, take it to neutralize maximum channel as contribution channel, and take The neuron that its intermediate value is positive is added to the layer to exporting in contributive neuronal ensemble, and removes wherein duplicate nerve Member；

(6) communication process of step (5) is repeated, until the neuronal ensemble in input layer is obtained, the nerve in the set Unit shows the pixel under the position and contributes to output result.

For a kind of visualized algorithm towards convolutional neural networks classification results, the step (2) passes through following step It is rapid to realize:

(2.1) assume that a trained CNN model and a given input picture, model have been divided into c class, Assuming that C node is such output node in output layer, and S is scored at the node_c, Softmax layers are selected in algorithm Output before is mapped to the position of only feature relevant to c class as class score, output,

Wherein,Indicate the correlation that c class neuron is predicted as in output layer, i.e. prediction result correlation is on output layer Distribution；

(2.2) assume that output layer preceding layer is l, then contribution degree of this layer of each neural unit to last output, It is exactly each neuron with prediction result correlation is defined as:

Wherein,Indicate the activation value of i-th of neuron in l layers, andIndicate the neural unit and next layer, i.e., it is defeated Weight connection between layer neuron out；

(2.3) since the last layer only has c class output node to have correlation, only consider each neuron to node C Correlation；If the transmitting between middle layer, consider each neuron of preceding layer to the phase between all neurons of later layer At this moment Guan Xing has:

Wherein,Indicate the correlation in l layers between j-th of neuron and the prediction output of c class,Indicate l-1 Correlation in layer between i-th of neuron and j-th of neuron in next layer of l；

(2.4) according to law of conservation, the sum of l layers correlation of all neurons is equal to the correlation of output layer, institute With i neuron and next layer of correlation, it is equal to its correlation with prediction result, has:

Wherein,Indicate the correlation of i-th of neuron and prediction result in l-1 layers；

Meanwhile being had according to the law of conservation of transmitting:

For a kind of visualized algorithm towards convolutional neural networks classification results, in the step (3), in order to obtain The CAM of classification schemes, and needs first by the correlation elder generation back transfer of prediction result to last one layer of convolutional layer, because in convolutional layer The spatial information in input picture is saved, therefore correlation is first successively transmitted to the last one layer of convolutional layer, in next step It calculates CAM figure to prepare, in general CNN structure, the output of the last layer convolutional layer can be converted into one-dimensional by three-dimensional tensor Vector, in order to connect subsequent full articulamentum, concrete implementation step includes:

(3.1) it is assumed that the output of the last layer convolutional layer is located at m layers of network, according to the law of conservation of correlation:

The sum of each neuron correlation for having the last layer convolutional layer to export according to the law of conservation of correlation is equal to last Class score:

(3.2) when forward-propagating carries out classification prediction, the Feature Mapping figure of m layers of output in convolutional layer, that is, phase The three-dimensional tensor answered has been converted to one-dimensional vector, and the spatial information extracted in feature has been given up in this conversion, so carrying out When reversed correlation backpropagation calculates CAM figure, need first to indicate the one-dimensional vector of m layers of neuron correlation, it is corresponding The space structure of three-dimensional tensor namely this layer of Feature Mapping figure when ground is converted into forward-propagating；

Algorithm is converted into the correlation with Feature Mapping map space structure first by m layers of one-dimensional correlation vector Three-dimensional tensor, and since its value is one-to-one, so its summation remains unchanged, to the correlation tensor after conversion,

Wherein,Indicate in network in m layer correlation tensor neuron that coordinate in k-th of channel is (i, j) and Predict the correlation between classification results；

(3.3) if carrying out global average pond, obtained result to the output feature in each channel are as follows:

Wherein, f_kCoordinate is (i, j) in k-th of channel in the Feature Mapping figure of (i, j) expression the last layer convolutional layer The activation value of neuron, thus:

It can be obtained compared with the calculation formula of CAM figure:

I.e. each Feature Mapping figure carry out global average Chi Huahou with the weight that finally exportsAs shown in above formula, this Sample can obtain the CAM figure of the CNN model comprising full articulamentum after weighted sum, are as follows:

For a kind of visualized algorithm towards convolutional neural networks classification results, in the step (4), if used CNN model have N layers of convolutional layer, each layer of index is 1,2 ..., and N uses matrix A in l layers_lIndicate this layer of all nerve First activation value, W_lIndicate the weight matrix of connection this layer and preceding layer,Indicate k-th of neuron in l layers, X_lIndicate l Position in layer to the contributive neuron of last decision in Feature Mapping figure, that is, it is related to last output result Property for positive value neuron position, m indicates the number of wherein neuron；Hereinafter, by being schemed based on the CAM that is previously obtained, in conjunction with mentioning New transmission method out, inputted in support the CNN decision pixel position.

For a kind of visualized algorithm towards convolutional neural networks classification results, the step (5) specifically include with Lower step:

(5.1) in X_lEach of neuron index, extract activation in l-1 layers in corresponding receptive field Value

(5.2) the Hadamard product (Hadamard product) of these activation values and corresponding convolution kernel weight is calculated

(5.3) available maximum logical to next layer of neuron contribution by summing to Hadamard product in each channel Road, the neuron that Hadamard product is positive in the channel are just recorded by algorithm in contributive neuronal ensemble of classifying；

(5.4) wherein duplicate neuron is removed.

The beneficial effects of the present invention are: Rel-CAM algorithm of the present invention is in the classification for explaining convolutional neural networks Aspect has higher accuracy, and the feature between classification can be distinguished in interpretive classification decision, to help people more The classification foundation of good geography deconvolution neural network solves the prior art and pays no attention in the effect for explaining convolutional neural networks classification The problem of thinking.

Detailed description of the invention

Fig. 1 is the flow diagram of the visualized algorithm towards convolutional neural networks classification results in the present invention；

Fig. 2 is Rel-CAM algorithm and the qualitative relatively figure of Backprop and LRP algorithm in the present invention；

Fig. 3 is that Rel-CAM algorithm and Backprop and LRP algorithm decline in classification confidence, classification confidence rise and Decline the result figure of minimum scale.

Fig. 4 is the structural schematic diagram of image procossing of the present invention.

Specific embodiment

The present invention is described further with reference to the accompanying drawing.

In conjunction with Fig. 1, the invention discloses a kind of visualized algorithms towards convolutional neural networks classification results, by following Step is realized:

Step 1: the data set comprising the input picture to be explained is used to instruct as training set to convolutional neural networks Practice, obtains trained model parameter；

Step 2: according to calculation method of the Rel-CAM algorithm in full articulamentum, using output result and model parameter by Layer calculates each neural unit in full articulamentum to the contribution of output, until convolutional layer；

Step 3: according to contribution of all neural units to output in the last layer convolutional layer obtained in step 2, meter It calculates this layer of each channel and exports the respective weights of result, to obtain the Class Activation mapping graph of the network model；

Step 4: the neural unit that record Class Activation mapping graph intermediate value is positive, the position of the neural unit is as in the layer To the position of the output contributive pixel of result, these neural units are added to the layer to the contributive neuronal ensemble of output In；

Step 5: successively taking out each neuron in set, calculate in its preceding layer in receptive field all neurons with The Hadamard product of respective weights, and sum to the Hadamard in each channel product, it takes it to neutralize maximum channel and is used as contribution channel, And the neuron for taking its intermediate value to be positive is added to the layer to exporting in contributive neuronal ensemble, and removes wherein duplicate mind Through member；

Step 6: repeating the communication process of step 5, until the neuronal ensemble in input layer is obtained, the mind in the set The pixel under the position is shown through unit to contribute output result.

Currently, explaining that the method for visualizing of convolutional neural networks classification results is a hot topic of current machine learning research Direction, domestic and foreign scholars propose a variety of model methods and corresponding algorithm, they are for different network models and specifically Practical problem, respectively have a feature, the present invention exists on the basis of forefathers study for existing sensitivity analysis visualized algorithm The accuracy of interpretive classification result is insufficient and efficiency of algorithm it is beneath, in conjunction with the advantages of Class Activation mapping algorithm and innovation, The Class Activation mapping graph visualized algorithm based on correlation is proposed, main points of view and content are as follows:

(1) Rel-CAM algorithm calculation method in full articulamentum.Relevance propagation algorithm is to explain convolutional neural networks point One of algorithms most in use of class, the overall thought of relevance propagation algorithm are the contributions for understanding each pixel to last prediction result Size, it carries out backpropagation to correlation using the structure of network.Algorithm is reversed along network since the output layer of network The direction of propagation redistributes the score of prediction classification at every layer, until input layer.And the process redistributed is kept Constant rule, i.e., every layer of correlation summation remain unchanged.Herein correlation with R (x) indicate, wherein x indicate single pixel or One neuron of person's middle layer.Certain a kind of CAM figure in order to obtain, it is necessary first to be transmitted to the prediction result of the last layer On the last layer convolutional layer.

First, it is assumed that there is a trained CNN model, and there is a given input picture, model is divided into C class, it is assumed that C node is such output node in output layer, and be scored at S at the node_c, select in algorithm Output before Softmax layers is as class score, because in this case, output can be mapped to the position of only feature relevant to c class It sets；If selecting Softmax layers of output, the output after normalization can be mapped to the position of the feature comprising other class, produce in this way Raw visualization result will be inaccurate, because it contains the feature for being classified as other classes, although the probability meeting of only very little Assign to other classes.So taking into account the above, algorithm uses the output valve before Softmax as the beginning of relevance propagation.In this way, Have:

Wherein,Indicate the correlation for being predicted as c class neuron in output layer namely prediction result correlation in output layer On distribution because only one node is related to c class in output layer, only one valueIn this way, preceding layers The sum of the correlation of each neuron is alsoAssuming that output layer preceding layer is l, then this layer of each neural unit is to last Output contribution degree, that is, the correlation with prediction result of each neuron is defined as:

In formula,Indicate the activation value of i-th of neuron in l layers, andIndicate the neural unit and the (output of next layer Layer) weight connection between neuron, because the last layer only has c, class output node has correlation, need to only consider each Correlation of the neuron to node C.But if it is the transmitting between middle layer, each nerve of preceding layer just must be taken into consideration At this moment member has to the correlation between all neurons of later layer:

WhereinIndicate the correlation in l layers between j-th of neuron and the prediction output of c class,Indicate l-1 layers In correlation between i-th of neuron and j-th of neuron in next layer of l, that is, contribution of the i neuron to j neuron Size, then i neuron is exactly its contribution to next layer to the sum of the contribution of all neurons of next layer, and fixed according to conservation Rule, the sum of l layers correlation of all neurons is equal to the correlation of output layer, so i neuron is related to next layer Property, it is equal to its correlation with prediction result, has:

Wherein,Indicate the correlation of i-th of neuron and prediction result in l-1 layers.Simultaneously according to the conservation of transmitting Law has:

The CAM figure of classification in order to obtain, needs the correlation elder generation back transfer of prediction result to last one layer of convolution first Correlation is first successively transmitted to the last one because saving the spatial information in input picture in convolutional layer by layer Layer convolutional layer is prepared to calculate CAM figure in next step.In general CNN structure, the output of the last layer convolutional layer can be by three Dimension tensor is converted into one-dimensional vector, in order to connect subsequent full articulamentum.It is assumed that the output of the last layer convolutional layer is located at net M layers of network, according to the law of conservation of correlation:

Because when forward-propagating carries out classification prediction, the Feature Mapping figure of m layers of output in convolutional layer, that is, accordingly Three-dimensional tensor be converted to one-dimensional vector, in order in the propagated forward of full articulamentum.It is special that extraction has been given up in this conversion Spatial information in sign, so needing first indicate m layers of mind when carrying out reversed correlation backpropagation calculating CAM figure One-dimensional vector through first correlation, the sky of the three-dimensional tensor being accordingly converted into when forward-propagating namely this layer of Feature Mapping figure Between structure.

Algorithm is converted into the correlation with Feature Mapping map space structure first by m layers of one-dimensional correlation vector Three-dimensional tensor, and due to its value be it is one-to-one, so its summation remains unchanged.Therefore to the correlation after conversion Amount, equally has:

Wherein,Indicate in network in m layer correlation tensor neuron that coordinate in k-th of channel is (i, j) and Predict the correlation between classification results.If carrying out global average pond, obtained result to the output feature in each channel Are as follows:

It can be obtained compared with the calculation formula of CAM figure:

(2) calculation method of the Rel-CAM algorithm in convolutional layer.In convolutional layer, Rel-CAM algorithm, which uses, is based on position The algorithm that information is propagated.The core concept of the algorithm is: assuming that in current layer, if a neuron supports last classification knot Fruit, that is to say, that it with last output the result is that positively related, then in preceding layer with the positively related mind of the neuron It all can be considered the evidence for supporting current layer neuron through member, should also be deemed to be the evidence for supporting last classification results.Correlation It is positive, that is, the product of the neuron in preceding layer and the weight between them is positive value.Here it is Rel-CAM algorithms to roll up The core concept that lamination is successively propagated.

Firstly, if the CNN model used has N layers of convolutional layer, each layer of index is 1,2 ... N.In l layers, use Matrix A_lIndicate this layer of all neuronal activation value, W_lIndicate the weight matrix of connection this layer and preceding layer,Indicate l layers In k-th of neuron, X_lIndicate the position in l layers to the contributive neuron of last decision in Feature Mapping figure, also It is the position with the neuron that the correlation of last output result is positive value, m indicates the number of wherein neuron.Hereinafter, by base In be previously obtained CAM figure, in conjunction with the new transmission method of proposition, inputted in support the CNN decision pixel position.

The figure of CAM obtained in a upper section is located at m layers of network, and the neuron that CAM figure intermediate value is positive is in the layer to most Judgement result afterwards plays the neuron of contribution, therefore X_lAsThe set of element position of the intermediate value greater than 0.It wants below These location informations are successively transmitted until input layer.

After reaching convolutional layer, X_lIt is the set of a three-dimensional index, it is right that each index identifies one, the layer The position of the contributive neuron of last categorised decision.Being explained below methods herein and how reverse-locating in preceding layer has The neuron of identification.It should be noted that the receptive field of the progress pondization operation of typical pond layer is a two-dimensional surface, And receptive field when convolutional layer progress convolution operation is three-dimensional space, therefore, in X_lEach of neuron rope Draw, needs to extract the activation value in l-1 layers in corresponding receptive fieldThen these activation values and corresponding volume are calculated The Hadamard product (Hadamard product) of product core weightBy summing to Hadamard product in each channel, Available to contribute maximum channel to next layer of neuron, the neuron that Hadamard product is positive in the channel is just recorded by algorithm To in contributive neuronal ensemble of classifying.

Following algorithm 1 explains the process for obtaining in convolutional layer and supporting classification neuron position.In algorithm 1,Indicate the activation value in the receptive field of preceding layer, therefore it is a three-dimensional tensor.When the weight of it and corresponding neuronWhen carrying out Hadamard product, obtained result is also the three-dimensional tensor of a same size.Algorithm is first along x-axis and y-axis Direction sums to output, so that positioning most has the Feature Mapping figure of identification.If convolutional layer be not carried out it is any under adopt Sample operation, then the spatial position of deterministic neuron is will not be changed during this conversion, that is, Say, the position (x, y) in succeeding layer, which can be transferred in current layer, contributes on maximum channel, this completes layer and layer it Between location information propagation.Algorithm can also further select the maximum neuron of activation value in contributing maximum channel, but It is that result both in an experiment is almost the same, therefore algorithm has still selected the element in the maximum channel of contribution refreshing as determining Through member.

The algorithm steps of location updating are as follows:

Algorithm 1: the neuron position propagation algorithm of categorised decision is supported in convolutional layer

Input: X_l, to contributive neuron position of classifying from high level obtained in CAM: X_l[1]...X_l[m]

W_l, l layers of weight

A_l-1, the activation value of l-1 layers of neuron

Output: X_l-1, there is in l-1 layers the position for the neuron for supporting classification

1 enables X_l-1=φ

2for i=1:m do

3 neuronsCorresponding convolution kernel weight

4 neuronsActivation value in corresponding receptive field

5 pairs of activation values and weight seek Hadamard product

6 find out the contribution margin in each channel according to Hadamard product, that is, sum to channel each in long-pending tensor, and assign It is worth to C, i.e.,Wherein S (x) is summed to planar element

7 are saved in the neuron position that Hadamard product is positive in channel the position concentration that neuron is determined in current layer, I.e. in X_l-1Middle increase (X_l-1,argmax(C))

8end for

9 for X_l-1The identical position of intermediate value retains one of those.

In the case where containing pond layer, neuron in two-dimentional receptive field range in algorithm extraction preceding layer, and therefrom look for The position of maximum activation value is provided, this is because most of CNN structure reflects feature usually using the method in maximum pond It penetrates figure and carries out down-sampling.In this way, the activation in later layer corresponds to the maximum activation occurred in receptive field in preceding layer.Cause This can select the maximum that possesses in the receptive field of corresponding neuron to swash when activation of the algorithm in down-sampling layer backtracking preceding layer The neuron of value living.

In this way, Rel-CAM algorithm can be first since the prediction result of the last layer when training CNN is to be identified The algorithm based on relevance propagation first is used in full articulamentum, the Class Activation that characteristic of division is positioned can be mapped by generating Figure.Then the set comprising position collection is converted by mapping graph, the propagation algorithm using another kind based on location information, Convolutional layer recalls the position with deterministic neuron, until input layer.Finally, it obtains over an input image certainly Surely the positioning for the feature classified.Although input picture usually contains tri- channels RGB, algorithm only considers that the plane of x and y is empty Between, that is to say, that pixel is only concerned in the positioning of two-dimensional space.

The method of the present invention has used for reference the concept of Class Activation mapping graph when explaining the classification results of convolutional neural networks, because It is shown in an input picture for Class Activation mapping graph and determines that it is divided into the region of generic, thus it is of the invention Method also has the advantages that this, by the qualitative experiment result figure of Fig. 2 can be seen that come, when image is classified as cat or dog, Rel- CAM algorithm all only identifies the pixel region of respective class in image, and not to the pixel region of other classes or environment in image into Line identifier；And Backprop method in both cases all identifies the feature of all classes, illustrates that this method is being explained It cannot distinguish between the feature between classification when categorised decision.LRP method is close with Rel-CAM algorithm proposed in this paper, compares The region of Backprop method positioning has class distinction, but the algorithm is labelled with more not crucial characteristic areas and ring Pixel in border, and it is bigger than Rel-CAM algorithm in calculation amount, therefore the Rel-CAM algorithm in the present invention is explaining CNN Categorised decision on possess better effect, especially when in image containing more than in the presence of a type objects.

Additionally by quantitative experiment compare three kinds of methods under same data set (a) classification confidence decline drop and (b) classification confidence increases increase, it was demonstrated that the Rel-CAM algorithm in the present invention can more accurately explain convolutional Neural net The classification of network.Interpretational criteria therein has used the concept of explanation figure, define image explain E as generate thermodynamic chart H with it is defeated Enter picture I and carry out the corresponding multiplication of element:

In formula,Indicate the corresponding Hadamard product being multiplied of element, I is input picture, and H is the thermodynamic chart that can determine that classification.? In the experiment of each picture, c indicates the classification being divided by model.In simple terms, explanation figure is exactly to obtain each pixel according to algorithm The importance to model decision, covered a part of input picture.

The average decline of classification confidence: a good explanation figure should be marked to most important part of classifying.Depth CNN model can make final decision, therefore a part of shielded image according to all features of input picture in prediction Confidence level of the model in decision will necessarily be reduced.On the other hand, it is influenced due to being remained in explanation figure in entire input picture The most important part of the categorised decision, so this decline should be smaller.Therefore, this measurement compares a picture and exists To the decline of certain kinds model confidence after covering.For example, if an image prediction is by known models with 0.8 confidence level Tiger, when being predicted using explanation figure, this image classification is dropped to 0.4 at the confidence level of tiger by model.So mould Type confidence level falls to 50%.Every a kind of highest 50 images of prediction have carried out Experimental comparison in experimental selection data set, Compare the average decline of several algorithms.

Classification confidence increases: however, the portion that all features that sometimes possible CNN is found are identified all in explanation figure Point, and the feature of other parts is all unnecessary feature, does not play help to categorised decision.In this case, mould Type has increase to the confidence level of certain kinds instead.The entire data of this metric calculation are concentrated use in explanation figure when being predicted, The increased number of model confidence, this number are expressed as a percentage.

Classification confidence decline is minimum: the explanation figure that the first two criterion can evaluate the generation of a method for visualizing is correctly marked Know the ability that the region of classification is influenced in image, and this criterion is explicitly compare explanation figure that different method generates excellent It is bad.It refers in given data set, the explanation figure that more each method for visualizing generates classification confidence on every image The size of decline, which kind of method declines minimum, then its decline minimum number adds 1.Confidence level declines minimum, illustrates this Such prior characteristic of division is identified in the explanation figure that method generates, that is to say, that it explanatory more preferable.Finally Output is a percentage, i.e., the decline minimum number of every kind method accounts for the ratio of all algorithms.

By experimental analysis, as shown in figure 3, Rel-CAM algorithm wants low in terms of the average decline decline of classification confidence In other two kinds of existing algorithms, in terms of confidence level increase, Rel-CAM algorithm is equally matched with other algorithms, only slightly has A little advantages.On the other hand the experimental results showed that can help the positioning of feature to be promoted the performance of classifier, this may be depth The researcher of study provides a new angle and goes to promote the performance of neural network, and as model increases specific identification component, Then it instructs to train further according to another characteristic is known, and then promotes network performance.

Decline minimum aspect in confidence level, it can be seen that ratio shared by Rel-CAM algorithm is maximum, that is to say, that entire In data set, in many cases, Rel-CAM algorithm, which can be identified, influences maximum feature to classification, illustrates that Rel-CAM is wanted Better than other two methods.

To sum up, by qualitative and quantitative analysis, Rel-CAM algorithm of the present invention is in point for explaining convolutional neural networks There is higher accuracy in terms of class, and the feature between classification can be distinguished in interpretive classification decision, to help people More fully understand the classification foundation of convolutional neural networks.

The foregoing is only a preferred embodiment of the present invention, is not intended to restrict the invention, for the skill of this field For art personnel, the invention may be variously modified and varied.All within the spirits and principles of the present invention, made any to repair Change, equivalent replacement, improvement etc., should all be included in the protection scope of the present invention.

Claims

1. a kind of visualized algorithm towards convolutional neural networks classification results, which is characterized in that especially by following steps reality It is existing:

(1) data set for extracting input picture, is trained convolutional neural networks using data set as training set, is trained Good model parameter；

(2) calculation method according to Rel-CAM algorithm in full articulamentum is successively calculated entirely using output result and model parameter Each neural unit is to the contribution of output in articulamentum, until convolutional layer；

(3) according to contribution of all neural units to output in the last layer convolutional layer obtained in step (2), it is every to calculate the layer The respective weights in a channel and output result, to obtain the Class Activation mapping graph of the network model；

(4) output is tied as in the layer position of the neural unit that record Class Activation mapping graph intermediate value is positive, the neural unit These neural units are added to the layer in the contributive neuronal ensemble of output by the position of the contributive pixel of fruit；

(5) each neuron in set is successively taken out, all neurons and respective weights in receptive field are calculated in its preceding layer Hadamard product, and sum to the Hadamard in each channel product, take it to neutralize maximum channel as contribution channel, and take wherein The neuron that value is positive is added to the layer to exporting in contributive neuronal ensemble, and removes wherein duplicate neuron；

(6) communication process of step (5) is repeated, until the neuronal ensemble in input layer is obtained, the neural unit in the set The pixel shown under the position contributes output result.

2. a kind of visualized algorithm towards convolutional neural networks classification results according to claim 1, which is characterized in that The step (2) is realized by following steps:

(2.1) assume that a trained CNN model and a given input picture, model have been divided into c class, it is assumed that C node is such output node in output layer, and is scored at S at the node_c, in algorithm before Softmax layers of selection Output as class score, output is mapped to the position of only feature relevant to c class,

Wherein,Indicate the correlation that c class neuron is predicted as in output layer, i.e. point of the prediction result correlation on output layer Cloth；

(2.2) assume that output layer preceding layer is l, then contribution degree of this layer of each neural unit to last output, that is, The correlation with prediction result of each neuron is defined as:

Wherein,Indicate the activation value of i-th of neuron in l layers, andIndicate the neural unit and next layer, i.e. output layer Weight connection between neuron；

(2.3) since the last layer only has c class output node to have correlation, only consider each neuron to node C phase Guan Xing；If the transmitting between middle layer, consider each neuron of preceding layer to the correlation between all neurons of later layer, At this moment have:

Wherein,Indicate the correlation in l layers between j-th of neuron and the prediction output of c class,It indicates in l-1 layers Correlation between i-th of neuron and j-th of neuron in next layer of l；

(2.4) according to law of conservation, the sum of l layers correlation of all neurons is equal to the correlation of output layer, so i is refreshing Correlation through member with next layer, is equal to its correlation with prediction result, has:

Meanwhile being had according to the law of conservation of transmitting:

3. a kind of visualized algorithm towards convolutional neural networks classification results according to claim 1, which is characterized in that In the step (3), in order to obtain classification CAM figure, need first by the correlation elder generation back transfer of prediction result to the end One layer of convolutional layer first successively transmits correlation straight because saving the spatial information in input picture in convolutional layer To last one layer of convolutional layer, prepare to calculate CAM figure in next step, in general CNN structure, the last layer convolutional layer it is defeated One-dimensional vector can be converted by three-dimensional tensor out, in order to connect subsequent full articulamentum, concrete implementation step includes:

(3.2) when forward-propagating carries out classification prediction, the Feature Mapping figure of m layers of output in convolutional layer, that is, accordingly Three-dimensional tensor has been converted to one-dimensional vector, and the spatial information extracted in feature has been given up in this conversion, so carrying out reversely Correlation backpropagation calculate CAM figure when, need first will indicate m layers of neuron correlation one-dimensional vector, accordingly turn Turn to the space structure of the three-dimensional tensor namely this layer of Feature Mapping figure when forward-propagating；

Algorithm is converted into three of the correlation with Feature Mapping map space structure first by m layers of one-dimensional correlation vector Tensor is tieed up, and since its value is one-to-one, so its summation remains unchanged, to the correlation tensor after conversion,

Wherein,Indicate that coordinate is neuron and the prediction point of (i, j) in k-th of channel in m layers of correlation tensor in network Correlation between class result；

Wherein, f_kCoordinate is the neuron of (i, j) in k-th of channel in the Feature Mapping figure of (i, j) expression the last layer convolutional layer Activation value, thus:

It can be obtained compared with the calculation formula of CAM figure:

I.e. each Feature Mapping figure carry out global average Chi Huahou with the weight that finally exportsAs shown in above formula, in this way, plus The CAM figure of the CNN model comprising full articulamentum can be obtained after power summation, are as follows:

4. a kind of visualized algorithm towards convolutional neural networks classification results according to claim 1, it is characterised in that: In the step (4), if the CNN model used has N layers of convolutional layer, each layer of index is 1,2 ... N, in l layers, Use matrix A_lIndicate this layer of all neuronal activation value, W_lIndicate the weight matrix of connection this layer and preceding layer,Indicate l K-th of neuron in layer, X_lIndicate the position in l layers to the contributive neuron of last decision in Feature Mapping figure, It is exactly the position with the neuron that the correlation of last output result is positive value, m indicates the number of wherein neuron；Hereinafter, will Based on be previously obtained CAM figure, in conjunction with the new transmission method of proposition, inputted in support the CNN decision pixel position It sets.

5. a kind of visualized algorithm towards convolutional neural networks classification results according to claim 1, which is characterized in that The step (5) specifically includes the following steps:

(5.1) in X_lEach of neuron index, extract activation value in l-1 layers in corresponding receptive field

(5.3) available that maximum channel is contributed to next layer of neuron by summing to Hadamard product in each channel, it should The neuron that Hadamard product is positive in channel is just recorded by algorithm in contributive neuronal ensemble of classifying；

(5.4) wherein duplicate neuron is removed.