CN107145904A - Method, device and storage medium for determining image category - Google Patents

Method, device and storage medium for determining image category

Info

Publication number
CN107145904A
CN107145904A (application CN201710296095.XA)
Authority
CN
China
Prior art keywords
parameter
convolutional neural network
deep convolution
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710296095.XA
Other languages
Chinese (zh)
Inventor
万韶华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd filed Critical Beijing Xiaomi Mobile Software Co Ltd
Priority to CN201710296095.XA priority Critical patent/CN107145904A/en
Publication of CN107145904A publication Critical patent/CN107145904A/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to a method, device and storage medium for determining the category of an image. The method includes: obtaining image data of a target image; determining target parameters of a deep convolutional neural network by a backward-propagation algorithm, the deep convolutional neural network being used to recognize the image data; after the target parameters of the deep convolutional neural network are determined, taking the image data as input of the deep convolutional neural network and obtaining a preset number of two-class probabilities output by the deep convolutional neural network; and determining the category of the target image according to all of the two-class probabilities. By improving the structure of the deep convolutional neural network so that it outputs mutually independent two-class probabilities, the technical scheme provided by the disclosure enables multi-label classification of images.

Description

Method, device and storage medium for determining image category
Technical field
The present disclosure relates to the field of image processing, and in particular to a method, device and storage medium for determining the category of an image.
Background
In the related art, deep convolutional neural networks (Convolutional Neural Networks, abbreviated CNN) are widely used in image recognition and support single-label classification of images. For example, for a classification problem with 1000 categories, a current CNN recognizes an image and outputs a 1000-dimensional probability vector representing the confidence of each of the corresponding 1000 categories; the confidences of the 1000 categories sum to 1, and the category with the highest confidence is taken as the class label of the image to be recognized. In other words, such a CNN can determine only one class label for the image out of the 1000 categories.
Summary
To overcome the problems in the related art, the present disclosure provides a method, device and storage medium for determining the category of an image.
According to a first aspect of the embodiments of the present disclosure, a method for determining the category of an image is provided, including:
obtaining image data of a target image;
determining target parameters of a deep convolutional neural network by a backward-propagation algorithm, the deep convolutional neural network being used to recognize the image data;
after the target parameters of the deep convolutional neural network are determined, taking the image data as input of the deep convolutional neural network and obtaining a preset number of two-class probabilities output by the deep convolutional neural network, wherein each two-class probability corresponds to one category and the two-class probabilities are mutually independent; and
determining the category of the target image according to all of the two-class probabilities.
Optionally, the deep convolutional neural network includes: a fully connected layer of two-dimensional neurons, at least one fully connected layer of one-dimensional neurons, and at least one convolutional layer, wherein the fully connected layer of two-dimensional neurons lies below the at least one fully connected layer of one-dimensional neurons, and the at least one fully connected layer of one-dimensional neurons lies below the at least one convolutional layer.
Optionally, determining the target parameters of the deep convolutional neural network by the backward-propagation algorithm includes:
establishing a temporary storage for each parameter of every layer of the deep convolutional neural network; and
training the parameters of the deep convolutional neural network by the backward-propagation algorithm using the temporary storages established for the parameters of every layer;
wherein, after all layers of the deep convolutional neural network have performed the step of training the parameters of the deep convolutional neural network once, the step of training the parameters is performed again, until the gradient values of all parameters of every layer of the deep convolutional neural network satisfy a preset convergence condition, at which point the current values of the parameters of every layer are determined to be the target parameters of the deep convolutional neural network.
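The outer loop implied by this step can be sketched as follows. This is an illustrative sketch only: `step_fn` and `grads_fn` are hypothetical stand-ins for the per-layer training step and gradient computation, and the tolerance is an assumed form of the preset convergence condition.

```python
def train_until_converged(params, step_fn, grads_fn, tol=1e-6, max_iters=10000):
    """Repeat the training step until every gradient satisfies the
    convergence condition, then return the current parameter values
    as the target parameters."""
    for _ in range(max_iters):
        params = step_fn(params)
        grads = grads_fn(params)
        if all(abs(g) <= tol for g in grads):
            return params  # target parameters
    raise RuntimeError("convergence condition not met")

# toy check: minimizing f(p) = p**2 by plain gradient descent
target = train_until_converged(
    [1.0],
    step_fn=lambda ps: [p - 0.1 * (2 * p) for p in ps],  # descent step
    grads_fn=lambda ps: [2 * p for p in ps],             # gradient of p**2
)
```

The loop terminates when the gradients, not the parameters themselves, are within tolerance, mirroring the convergence condition stated above.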
Optionally, the step of training the parameters of the deep convolutional neural network includes:
when the current layer of the deep convolutional neural network has no next layer, adding each parameter of the current layer to the gradient value stored in the temporary storage corresponding to that parameter, to obtain the updated parameters of the current layer, wherein the gradient value stored in the temporary storage corresponding to each parameter was obtained the last time the step of training the parameters of the deep convolutional neural network was performed, and the initial value of the gradient stored in the temporary storage corresponding to each parameter of every layer is zero;
obtaining the current gradient values of the updated parameters of the current layer; and
storing the current gradient values of the updated parameters of the current layer in the temporary storages corresponding to the parameters of the current layer.
Optionally, the step of training the parameters of the deep convolutional neural network includes:
when the current layer of the deep convolutional neural network has a next layer, adding the gradient value stored in the temporary storage corresponding to each parameter of the current layer to the gradient values, stored in the temporary storages of the next layer, of the parameters associated with that parameter of the current layer, to obtain a new gradient value for each parameter of the current layer, wherein the gradient value stored in the temporary storage corresponding to each parameter was obtained the last time the step of training the parameters of the deep convolutional neural network was performed, and the initial value of the gradient stored in the temporary storage corresponding to each parameter of every layer is zero;
adding the new gradient value of each parameter of the current layer to that parameter, to obtain the updated parameters of the current layer;
obtaining the current gradient values of the updated parameters of the current layer; and
storing the current gradient values of the updated parameters of the current layer in the temporary storages corresponding to the parameters of the current layer;
wherein the current layer and the next layer are any two adjacent layers of the deep convolutional neural network.
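The buffered per-layer update described in the two optional steps above can be sketched schematically. This is not the patented implementation: `grad_fn` is a hypothetical stand-in for the real gradient computation performed by the backward-propagation algorithm, layers are flattened to plain parameter lists, and the rule for combining a parameter's stored gradient with the gradients of its associated next-layer parameters is simplified, as an assumption, to a sum over the whole next-layer buffer.

```python
def train_step(layers, buffers, grad_fn):
    """One pass of the buffered update, walking from the output layer
    (index 0, the layer with 'no next layer') back toward the input.
    `buffers[i]` holds the gradient stored for each parameter of layer i
    on the previous pass (initially all zeros)."""
    for i, params in enumerate(layers):
        if i == 0:
            # output layer: add each stored gradient to its parameter
            updated = [p + g for p, g in zip(params, buffers[i])]
        else:
            # inner layer: fold in the gradients buffered for the next
            # layer (closer to the output), then update the parameters
            folded = [g + sum(buffers[i - 1]) for g in buffers[i]]
            updated = [p + g for p, g in zip(params, folded)]
        layers[i] = updated
        buffers[i] = grad_fn(i, updated)  # store the current gradients
    return layers, buffers

# toy pass with a gradient function that returns zeros
layers = [[1.0, 2.0], [3.0]]   # index 0: output layer
buffers = [[0.1, 0.2], [0.3]]  # gradients stored on the previous pass
layers, buffers = train_step(layers, buffers, lambda i, ps: [0.0] * len(ps))
```

Because each layer's buffer is overwritten before the layer above it is processed, the folded value already reflects the downstream gradients of the current pass, which matches the layer-by-layer backward order described above.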
Optionally, each two-class probability includes a positive probability value and a negative probability value, the positive probability value being the probability that the image belongs to the category corresponding to that two-class probability; determining the category of the target image according to all of the two-class probabilities includes:
determining, among all of the two-class probabilities, the two-class probabilities whose positive probability value exceeds a preset probability threshold as target two-class probabilities; and
determining the categories corresponding to all of the target two-class probabilities as the categories of the target image.
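This selection step amounts to keeping every category whose positive probability exceeds the threshold. A minimal sketch follows; the category names and the 0.5 threshold are illustrative assumptions, not values from the disclosure.

```python
def select_labels(two_class_probs, categories, threshold=0.5):
    """Multi-label decision: a category is kept when the positive
    ('belongs') value of its two-class probability exceeds the
    threshold, so an image can receive zero, one, or many labels."""
    return [cat for (p_yes, _p_no), cat in zip(two_class_probs, categories)
            if p_yes > threshold]

# illustrative output of three two-dimensional neurons: (positive, negative)
pairs = [(0.9, 0.1), (0.2, 0.8), (0.7, 0.3)]
labels = select_labels(pairs, ["cat", "dog", "person"])
```

Here both "cat" and "person" pass the threshold, so the image receives two labels at once, which a softmax output could not produce.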
Optionally, the fully connected layer of two-dimensional neurons determines the preset number of two-dimensional neurons using a logistic regression function.
According to a second aspect of the embodiments of the present disclosure, a device for determining the category of an image is provided, including:
an image data acquisition module configured to obtain image data of a target image;
an image data recognition module configured to determine target parameters of a deep convolutional neural network by a backward-propagation algorithm, the deep convolutional neural network being used to recognize the image data;
a two-class probability output module configured to, after the target parameters of the deep convolutional neural network are determined, take the image data as input of the deep convolutional neural network and obtain a preset number of two-class probabilities output by the deep convolutional neural network, wherein each two-class probability corresponds to one category and the two-class probabilities are mutually independent; and
an image category determining module configured to determine the category of the target image according to all of the two-class probabilities.
Optionally, the deep convolutional neural network includes: a fully connected layer of two-dimensional neurons, at least one fully connected layer of one-dimensional neurons, and at least one convolutional layer, wherein the fully connected layer of two-dimensional neurons lies below the at least one fully connected layer of one-dimensional neurons, and the at least one fully connected layer of one-dimensional neurons lies below the at least one convolutional layer.
Optionally, the image data recognition module includes a memory establishing submodule and a parameter training submodule:
the memory establishing submodule is configured to establish a temporary storage for each parameter of every layer of the deep convolutional neural network;
the parameter training submodule is configured to train the parameters of the deep convolutional neural network by the backward-propagation algorithm using the temporary storages established for the parameters of every layer;
wherein, after all layers of the deep convolutional neural network have performed the step of training the parameters of the deep convolutional neural network once, the step of training the parameters is performed again, until the gradient values of all parameters of every layer of the deep convolutional neural network satisfy a preset convergence condition, at which point the current values of the parameters of every layer are determined to be the target parameters of the deep convolutional neural network.
Optionally, the parameter training submodule is configured to:
when the current layer of the deep convolutional neural network has no next layer, add each parameter of the current layer to the gradient value stored in the temporary storage corresponding to that parameter, to obtain the updated parameters of the current layer, wherein the gradient value stored in the temporary storage corresponding to each parameter was obtained the last time the step of training the parameters of the deep convolutional neural network was performed, and the initial value of the gradient stored in the temporary storage corresponding to each parameter of every layer is zero;
obtain the current gradient values of the updated parameters of the current layer; and
store the current gradient values of the updated parameters of the current layer in the temporary storages corresponding to the parameters of the current layer.
Optionally, the parameter training submodule is configured to:
when the current layer of the deep convolutional neural network has a next layer, add the gradient value stored in the temporary storage corresponding to each parameter of the current layer to the gradient values, stored in the temporary storages of the next layer, of the parameters associated with that parameter of the current layer, to obtain a new gradient value for each parameter of the current layer, wherein the gradient value stored in the temporary storage corresponding to each parameter was obtained the last time the step of training the parameters of the deep convolutional neural network was performed, and the initial value of the gradient stored in the temporary storage corresponding to each parameter of every layer is zero;
add the new gradient value of each parameter of the current layer to that parameter, to obtain the updated parameters of the current layer;
obtain the current gradient values of the updated parameters of the current layer; and
store the current gradient values of the updated parameters of the current layer in the temporary storages corresponding to the parameters of the current layer;
wherein the current layer and the next layer are any two adjacent layers of the deep convolutional neural network.
Optionally, each two-class probability includes a positive probability value and a negative probability value, the positive probability value being the probability that the image belongs to the category corresponding to that two-class probability; the image category determining module includes:
a two-class probability recognition submodule configured to determine, among all of the two-class probabilities, the two-class probabilities whose positive probability value exceeds a preset probability threshold as target two-class probabilities; and
an image category determining submodule configured to determine the categories corresponding to all of the target two-class probabilities as the categories of the target image.
Optionally, the fully connected layer of two-dimensional neurons determines the preset number of two-dimensional neurons using a logistic regression function.
According to a third aspect of the embodiments of the present disclosure, a device for determining the category of an image is provided, the device including:
a processor; and
a memory for storing processor-executable instructions;
wherein the processor is configured to: obtain image data of a target image; determine target parameters of a deep convolutional neural network by a backward-propagation algorithm, the deep convolutional neural network being used to recognize the image data; after the target parameters of the deep convolutional neural network are determined, take the image data as input of the deep convolutional neural network and obtain a preset number of two-class probabilities output by the deep convolutional neural network, wherein each two-class probability corresponds to one category and the two-class probabilities are mutually independent; and determine the category of the target image according to all of the two-class probabilities.
According to a fourth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided; when the computer program instructions in the storage medium are executed by a processor, the steps of the method of the first aspect can be realized.
The technical schemes provided by the embodiments of the present disclosure can include the following beneficial effects:
After the image data of a target image is obtained, the above technical scheme determines target parameters of a deep convolutional neural network by a backward-propagation algorithm, the deep convolutional neural network being used to recognize the image data. After the target parameters of the deep convolutional neural network are determined, the image data is taken as input of the deep convolutional neural network, and a preset number of two-class probabilities output by the deep convolutional neural network are obtained, wherein each two-class probability corresponds to one category and the two-class probabilities are mutually independent; the category of the target image is then determined according to all of the two-class probabilities. By improving the structure of the deep convolutional neural network so that it outputs mutually independent two-class probabilities, the technical scheme provided by the disclosure realizes multi-label classification of images.
It should be appreciated that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the disclosure.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a schematic structural diagram of a CNN based on the AlexNet network structure in the related art;
Fig. 2 is a schematic structural diagram of a CNN based on the AlexNet network structure according to an exemplary embodiment;
Fig. 3 is a flow chart of a method for determining an image category according to an exemplary embodiment;
Fig. 4 is a flow chart of a method for determining an image category according to an exemplary embodiment;
Fig. 5 is a flow chart of a method for determining an image category according to an exemplary embodiment;
Fig. 6 is a flow chart of a method for determining an image category according to an exemplary embodiment;
Fig. 7 is a block diagram of a device for determining an image category according to an exemplary embodiment;
Fig. 8 is a block diagram of an image data recognition module according to an exemplary embodiment;
Fig. 9 is a block diagram of an image category determining module according to an exemplary embodiment;
Fig. 10 is a block diagram of a device for determining an image category according to another exemplary embodiment.
Detailed description
Exemplary embodiments will be described in detail here, with examples illustrated in the accompanying drawings. Where the following description refers to the drawings, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the disclosure as recited in the appended claims.
Before the embodiments of the disclosure are introduced, deep convolutional neural networks are briefly described.
In the embodiments of the disclosure, the CNN structure involved can be based on the AlexNet network structure. In a CNN based on this structure, image data passes through a series of convolutional layers (Convolution Layer), activation layers (Activation Layer), pooling layers (Pooling Layer) and fully connected layers (Fully Connected Layer), and the result is output through a class-probability layer (Softmax Layer). Using a CNN to recognize image data yields salient features of the observed data that are invariant to translation, scaling and rotation, and the local receptive fields over the image data allow neurons (or processing units) to access basic image features, for example oriented edges or corners. However, the CNN in the related art supports only single-label classification, not multi-label classification: it can identify only one category for an image, so for an image containing objects of multiple categories it still identifies only one of them. For example, if a cat and a dog both appear in an image, the confidences of "cat" and "dog" in the recognition result output by the related-art CNN trade off against each other, i.e. the result is either cat or dog but not both, so objects of multiple categories in the same image cannot all be identified.
Therefore, the network structure of the CNN proposed by the embodiments of the disclosure is an improvement on the AlexNet network structure. AlexNet is the deep-convolutional-neural-network structure that was first applied to large-scale image recognition with positive results; in the related art, the AlexNet network structure has 8 layers, as shown in Fig. 1. The first to fifth layers are convolutional layers. Taking the first layer as an example, when the input picture is 224*224*3, the first layer uses 96 convolution filters of size 11*11 with a convolution stride of 4, so the first layer outputs 96 images of size 55*55. After the convolutional filtering, a ReLU (Rectified Linear Unit) activation operation and a max-pooling operation can also be applied, to speed up training of the deep convolutional neural network and improve its robustness, making overfitted training results less likely. From the first to the fifth layer, each layer has filters of different sizes and numbers: as shown in Fig. 1, the second layer has 256 convolution filters of size 5*5, the third and fourth layers each have 384 convolution filters of size 3*3, and the fifth layer has 256 convolution filters of size 3*3. The input image data is convolved layer by layer in the same manner as in the first layer described above, and finally 256 feature images of size 6*6 are output. The sixth to eighth layers are fully connected layers, a three-layer fully connected neural-network classifier added on top of the first five convolutional layers. Taking the sixth layer as an example, it has 4096 neurons and fully connects the 256 feature images of size 6*6 output by the fifth layer: the 256 feature images are convolved down to a single feature point, and each of the 4096 neurons multiplies the feature points obtained from certain of the 256 feature maps by the corresponding weights and adds a bias. The number of neurons in the eighth layer is set to 1000 (the number can be set as required), for training on the 1000 categories of target images, so the final output of the eighth layer is a 1000-dimensional probability vector, each element of which represents the confidence of one of the 1000 categories; the confidences of the 1000 categories sum to 1, and the category with the highest confidence is the category of the target image.
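The layer sizes quoted above follow the standard convolution output-size formula, floor((input + 2*padding - kernel) / stride) + 1. Note that obtaining a 55*55 output from an 11*11 filter with stride 4 requires a 227-pixel input (or equivalent padding of a 224-pixel one), a well-known quirk of the published AlexNet figures. A quick check:

```python
def conv_out(size, kernel, stride, padding=0):
    """Spatial size of a square convolution output."""
    return (size + 2 * padding - kernel) // stride + 1

first = conv_out(227, 11, 4)  # 55, matching the 55*55 images cited above
plain = conv_out(224, 11, 4)  # 54 without padding
```

The same formula applies to every convolutional and pooling layer in the stack, so it is a convenient sanity check when reproducing such architectures.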
In one embodiment of the disclosure, the CNN structure is as shown in Fig. 2. It differs from the related-art CNN structure in that the eighth layer uses 1000 mutually independent two-dimensional neurons (the number of two-dimensional neurons can be set according to actual needs) in place of the original one-dimensional neurons, where each two-dimensional neuron corresponds to one category. For example, in one embodiment of the disclosure, the output of the CNN is denoted output = [c_1, c_2, ..., c_k], where k = 1000 and c_1, c_2, ..., c_k are the two-class probabilities output by the 1000 two-dimensional neurons respectively, c_i numbering the two-class probability of the i-th category. Since the output of a two-dimensional neuron is a two-class probability, c_i takes the values c_i = 0 and c_i = 1, where c_i = 0 indicates that the sample image to be recognized does not belong to the category corresponding to c_i, and c_i = 1 indicates that the image does belong to that category. The probability values of all the c_i do not sum to 1, but for each i the probabilities of c_i = 0 and c_i = 1 sum to 1. To achieve this, the output of the CNN provided in the embodiment of the disclosure is no longer a k-dimensional probability vector but k mutually non-interfering two-dimensional probability vectors, so that the image can be recognized against k categories and multi-label classification of the image can therefore be realized.
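A minimal sketch of this output layer follows, assuming each two-dimensional neuron applies a logistic (sigmoid) function to its pre-activation and emits the pair (P(c_i = 1), P(c_i = 0)); the toy feature vector and weights are arbitrary assumptions. Each pair sums to 1, while the positive probabilities across categories are unconstrained, which is what permits multiple labels:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def two_class_outputs(features, weights, biases):
    """One independent (positive, negative) probability pair per category."""
    pairs = []
    for w, b in zip(weights, biases):
        z = sum(f * wi for f, wi in zip(features, w)) + b
        p = sigmoid(z)
        pairs.append((p, 1.0 - p))  # (c_i = 1, c_i = 0)
    return pairs

# toy example: 3 categories over a 3-dimensional feature vector
features = [0.5, -1.0, 2.0]
weights = [[0.1, 0.2, 0.3], [-0.5, 0.5, 0.0], [1.0, 0.0, -1.0]]
biases = [0.0, 0.1, -0.2]
output = two_class_outputs(features, weights, biases)
```

Unlike a softmax layer, nothing couples the positive probabilities of different categories, so several of them can simultaneously be close to 1.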
Fig. 3 is a flow chart of a method for determining an image category according to an exemplary embodiment. As shown in Fig. 3, the method includes the following steps:
In step 301, image data of a target image is obtained.
In step 302, target parameters of a deep convolutional neural network are determined by a backward-propagation algorithm; the deep convolutional neural network is used to recognize the image data.
Illustratively, the deep convolutional neural network includes: a fully connected layer of two-dimensional neurons, at least one fully connected layer of one-dimensional neurons, and at least one convolutional layer, wherein the fully connected layer of two-dimensional neurons lies below the at least one fully connected layer of one-dimensional neurons, which in turn lies below the at least one convolutional layer. That is, from top to bottom the structure of the deep convolutional neural network is: convolutional layers, fully connected layers of one-dimensional neurons, and a fully connected layer of two-dimensional neurons. Illustratively, there can be 5 convolutional layers and 2 fully connected layers of one-dimensional neurons; the last layer of the deep convolutional neural network is the fully connected layer of two-dimensional neurons, and its structure can be as shown in Fig. 2.
Back-propagation (abbreviated BP) is a neural-network learning algorithm for multilayer feedforward neural networks. It iteratively processes the data involved in training the convolutional neural network (the set of parameters and their gradients) until a preset convergence condition is met. Briefly, starting from the last layer, i.e. the output layer, the parameters of each layer are trained forward until the first layer, passing layer by layer through the fully connected layers and convolutional layers; the gradient values of the parameters of every layer are obtained in turn, and each obtained gradient value can be used to update the parameters of that layer and of the layer above it, hence the name backward propagation. After the parameters of all layers have been updated once, the above procedure is performed again, until the gradients of all parameters of every layer satisfy the convergence condition; it can therefore also be regarded as an algorithm that computes gradient values layer by layer.
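The layer-by-layer gradient computation can be illustrated on a toy two-layer network with one weight per layer; the squared-error loss and sigmoid activations are assumptions chosen for brevity, not taken from the disclosure. The output layer's local gradient is computed first and reused for the layer before it, which is the backward order described above:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def backward_pass(x, target, w1, w2):
    """Gradients of loss = 0.5 * (y - target)**2 for the toy network
    y = sigmoid(w2 * sigmoid(w1 * x)), computed from the output backward."""
    h = sigmoid(w1 * x)                  # forward pass, kept for reuse
    y = sigmoid(w2 * h)
    dy = (y - target) * y * (1.0 - y)    # output-layer local gradient
    g2 = dy * h                          # dLoss/dw2
    dh = dy * w2 * h * (1.0 - h)         # gradient propagated one layer back
    g1 = dh * x                          # dLoss/dw1
    return g1, g2

g1, g2 = backward_pass(0.7, 1.0, 0.5, -0.3)
```

The intermediate quantity dy is computed once at the output and reused by the earlier layer, which is exactly why the gradients are obtained in backward order rather than independently per layer.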
In addition, the order in which steps 301 and 302 are performed is not fixed; step 302 can also be performed first, followed by step 301.
In step 303, after the target parameters of the deep convolutional neural network are determined, the image data is taken as input of the deep convolutional neural network, and a preset number of two-class probabilities output by the deep convolutional neural network are obtained, where each two-class probability corresponds to one category and the two-class probabilities are mutually independent.
Here, the fully connected layer of two-dimensional neurons of the deep convolutional neural network determines the preset number of two-dimensional neurons using a logistic regression function.
Logistic regression (Logistic Regression) is current the more commonly used machine learning method, for estimating certain The possibility of things is planted, is a Nonlinear Classification model, the difference that it is returned from normal linear is linearly to return Return a wide range of number of output, for example from it is negative it is infinite be compressed to just infinite between 0 and 1, using 0 to 1 between floating number as Output value table is shown as the probable value of the function.Logistic regression function is frequently used for classification, and mainly two classify or many points Class.So-called two classification, comprising positive class and negative class, positive class represents proposition represented by the sample to be true, and negative class represents the sample institute table It is false to show proposition, therefore two class probabilities just include sample and belong to the probable value of positive class and belong to the probable value of negative class, example, When the probable value that sample belongs to positive class is more than 0.5, then it is determined that it is positive class, otherwise it is exactly negative class, wherein positive class The probable value sum of probable value and negative class be 1, you can be interpreted as two class probability include the target image belong to correspondence The probability of classification, and the probability of the correspondence classification is not belonging to, when the target image belongs to (two class probability) correspondence class When other probability is more than certain threshold value, it may be determined that the target image belongs to this classification, when the target image is not belonging to this pair When answering the probability of classification more than certain threshold value, it may be determined that the target image is not belonging to this classification.
Therefore, after the two-dimensional-neuron fully connected layer, the deep convolutional neural network can output 1000 mutually independent binary classification probabilities. Each binary classification probability corresponds to one category and indicates, respectively, the probability that the target image belongs to that category and the probability that it does not. All of the class labels of the target image can then be determined from the output [c_1, c_2, ..., c_k].
In step 304, the category of the target image is determined according to all of the binary classification probabilities.
Illustratively, because the 1000 values c_i output in step 303 are mutually independent, anywhere from 1 to 1000 categories of the target image may be determined from these 1000 mutually independent binary classification probabilities. The category of the target image determined from all of the binary classification probabilities is therefore no longer necessarily unique.
In summary, in the image category determination method provided by the embodiments of the present disclosure, the method obtains the image data of a target image; determines the target parameters of a deep convolutional neural network by a backward propagation algorithm, the deep convolutional neural network being used to recognize the image data; after the target parameters of the deep convolutional neural network are determined, takes the image data as the input of the deep convolutional neural network and obtains the preset number of binary classification probabilities output by the deep convolutional neural network; and then determines the category of the target image according to all of the binary classification probabilities. By improving the structure of the deep convolutional neural network so that it outputs mutually independent binary classification probabilities, the technical solution provided by the embodiments of the present disclosure enables multi-label classification of images.
Fig. 4 is a flowchart of an image category determination method according to an exemplary embodiment. As shown in Fig. 4, the determination of the target parameters of the deep convolutional neural network by the backward propagation algorithm described in step 302 of Fig. 3, the deep convolutional neural network being used to recognize the image data, comprises the following steps:
In step 3021, a temporary memory is established for each parameter of each layer of the deep convolutional neural network.
To prevent parameters from being successively overwritten during the iterative chain-rule derivation that determines the parameters, a corresponding temporary memory can be established for each parameter of each layer, to store the gradient value of the corresponding parameter in each round of calculation and to ensure that the output results of the deep convolutional neural network remain mutually independent. It should be noted that this embodiment takes the network structure shown in Fig. 2 as an example, and one round of calculation as mentioned above refers to completing one parameter update from the input layer (the first convolutional layer) to the output layer (the two-dimensional-neuron fully connected layer).
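As a minimal sketch of step 3021 (the data-structure names are assumptions, not from the patent), the temporary memories can be modeled as one zero-initialized gradient buffer per parameter of each layer:

```python
def build_temporary_storage(layer_params):
    # One independent gradient buffer per parameter of each layer, so a
    # round of updates never overwrites a gradient value an adjacent
    # layer still needs; initial gradient values are zero, as the text
    # specifies for the first update.
    return {layer: {name: 0.0 for name in names}
            for layer, names in layer_params.items()}
```

For the eight-layer network of the example, `layer_params` would map layer indices 1 to 8 to their parameter names.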
In step 3022, the parameters of the deep convolutional neural network are trained by the backward propagation algorithm, using the temporary memories established for each parameter of each layer of the deep convolutional neural network.
After all layers of the deep convolutional neural network have performed the step of training the parameters of the deep convolutional neural network once, the step of training the parameters of the deep convolutional neural network is performed again, until the gradient value of each parameter of each layer of the deep convolutional neural network satisfies a preset convergence condition, at which point the current value of each parameter of each layer of the deep convolutional neural network is determined as a target parameter of the deep convolutional neural network.
That is, the parameters of the deep convolutional neural network are trained by the backward propagation algorithm in an iterative manner; see the embodiment of Fig. 5 below for details of this process. Training begins with step 30221 at the last layer of the deep convolutional neural network, that is, the first two-dimensional-neuron fully connected layer; the other layers, apart from the last layer, perform the operation of step 30222. The current layer described in this embodiment can be understood as the layer at which the parameter update calculation is being performed, and may be any layer of the deep convolutional neural network. In addition, when the deep convolutional neural network is being trained, the input data may be a preset number of labeled images.
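The iterative training described above can be sketched as the following loop. This is a simplified sketch under assumed names: `update_round` stands in for one full pass of steps 30221 to 30226 and is supplied by the caller, and the convergence tolerance is an assumption:

```python
def train_until_converged(params, grads, update_round,
                          epsilon=1e-6, max_rounds=10000):
    # Repeat the per-round parameter update until every stored gradient
    # value satisfies the preset convergence condition |grad| < epsilon;
    # the values then held in params are the target parameters.
    for _ in range(max_rounds):
        update_round(params, grads)
        if all(abs(g) < epsilon
               for layer in grads.values()
               for g in layer.values()):
            break
    return params
```

Both `params` and `grads` are nested dicts keyed by layer and then by parameter name, matching the per-parameter temporary memories of step 3021.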
Fig. 5 is a flowchart of an image category determination method according to an exemplary embodiment. As shown in Fig. 5, taking any layer of the deep convolutional neural network as an example, the step of training the parameters of the deep convolutional neural network described in step 3022 of Fig. 4 includes:
In step 30221, when no next layer exists for the current layer of the deep convolutional neural network, each parameter of the current layer is added to the gradient value stored in the temporary memory corresponding to that parameter of the current layer, to obtain the updated parameters of the current layer.
For example, in the above CNN, the output layer (the two-dimensional-neuron fully connected layer) is the eighth layer and is also the last layer, so each parameter of the output layer can be updated by performing step 30221. Taking a parameter A of this layer as an example, if the initial value of parameter A (also called its current value, i.e., the value of parameter A before this update) is A0, then the updated parameter is the sum of the gradient value Ad in the temporary memory corresponding to parameter A and the initial value A0 of parameter A, expressed as Anew = Ad + A0; the operation of step 30224 is then performed. Here, parameter A may be any parameter of the output layer. The gradient value Ad in the temporary memory corresponding to parameter A is the gradient value of parameter A calculated after the previous update of parameter A; accordingly, if parameter A is being updated for the first time, the gradient value Ad in the temporary memory corresponding to parameter A is zero.
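The last-layer update of step 30221 reduces to a single addition; as a sketch (the numeric values are assumptions for illustration, and the sketch follows the text's rule literally, which adds the stored gradient rather than subtracting a scaled gradient as conventional gradient descent would):

```python
def update_last_layer_param(current_value, stored_gradient):
    # Step 30221: for a layer with no next layer, the updated parameter
    # is the stored gradient plus the current value (Anew = Ad + A0).
    return stored_gradient + current_value

A0 = 1.5    # current value of parameter A (assumed)
Ad = 0.25   # gradient value in A's temporary memory (assumed)
Anew = update_last_layer_param(A0, Ad)   # 1.75
```

On the very first update, Ad is zero, so Anew equals A0, consistent with the zero-initialized temporary memories.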
In step 30222, when a next layer exists for the current layer of the deep convolutional neural network, the gradient value stored in the temporary memory corresponding to each parameter of the current layer is added to the gradient values, stored in the temporary memories of the next layer, of the parameters associated with that parameter of the current layer, to obtain a new gradient value for each parameter of the current layer.
Here, the current layer and the next layer are any two adjacent layers of the deep convolutional neural network. The gradient value stored in the temporary memory corresponding to each parameter was obtained the last time the step of training the parameters of the deep convolutional neural network was performed, and the initial value of the gradient value stored in the temporary memory corresponding to each parameter of each layer of the deep convolutional neural network is zero.
For example, each of the first to seventh layers of the above CNN has a next layer. Taking the seventh layer as an example, its next layer is the eighth layer, that is, the output layer. Suppose that for a parameter B of the seventh layer, the initial value is B0 and the gradient value stored in the temporary memory corresponding to parameter B is Bd, where Bd is the gradient value of parameter B calculated after the previous update of parameter B; accordingly, if parameter B is being updated for the first time, the gradient value Bd in the temporary memory corresponding to parameter B is zero. Suppose the eighth layer has parameters D, E and F associated with parameter B, whose corresponding temporary memories respectively store the gradient values Dd', Ed' and Fd' of parameters D, E and F (the gradient values Dd', Ed' and Fd' were obtained after the eighth layer performed step 30221 and steps 30224 to 30226). Then the new gradient value BdNew of parameter B is the sum of Bd, Dd', Ed' and Fd'.
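Steps 30222 and 30223 for the seventh-layer example can be sketched as follows (the numeric values are assumptions for illustration):

```python
def new_gradient(own_stored_grad, associated_next_layer_grads):
    # Step 30222: the new gradient of a current-layer parameter is its
    # own stored gradient plus the stored gradients of the associated
    # parameters in the next layer (BdNew = Bd + Dd' + Ed' + Fd').
    return own_stored_grad + sum(associated_next_layer_grads)

Bd = 0.1                                   # stored gradient of B (assumed)
BdNew = new_gradient(Bd, [0.2, 0.3, 0.4])  # Dd', Ed', Fd' (assumed)
B_updated = 1.0 + BdNew                    # step 30223: B0 + BdNew, with B0 assumed 1.0
```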
In step 30223, the new gradient value of each parameter of the current layer is added to that parameter of the current layer, to obtain the updated parameters of the current layer.
For example, for the parameter B of the seventh layer, the updated parameter is the sum of the initial value B0 of parameter B and the new gradient value BdNew of parameter B.
In summary, step 30221 above updates the parameters of the last layer of the CNN, while steps 30222 and 30223 update the parameters of all layers other than the last. After the parameter update of each layer is completed, the operation of step 30224 is performed on the updated parameters to determine their gradient values, thereby completing the update of the gradient values in the temporary memories.
In step 30224, the current gradient value of each updated parameter of the current layer is obtained.
For example, taking the parameter A of step 30221, after the updated parameter Anew of parameter A is obtained, the current gradient value of Anew is calculated and denoted Ad', and step 30225 is performed to store Ad' in the temporary memory corresponding to parameter A, replacing the previously stored Ad. The gradient values Dd', Ed' and Fd' involved in step 30222 are likewise obtained by the same method and stored in the corresponding temporary memories. In this way, in the layer-by-layer solution of the gradient values in the backward propagation algorithm, the current gradient value of each parameter of each layer is determined from the updated parameter, ensuring that the parameters change in real time.
In step 30225, the current gradient value of each updated parameter of the current layer is stored in the temporary memory corresponding to that parameter of the current layer.
The current gradient value of each updated parameter of the current layer is stored in the corresponding temporary memory so that, when the step of training the parameters of the deep convolutional neural network described in steps 30221 to 30226 is performed again, the parameters of each layer can be updated — that is, the parameter update performed in step 30221 or step 30222. It can thus be seen that, by setting a temporary memory for each parameter of each layer, the updated gradient value of each parameter of each layer is stored independently, so that when updating its parameters an upper layer can use the gradient values of the associated parameters in the adjacent lower layer. The gradient values after a parameter update are therefore not directly overwritten during the iteration, but are passed on to the upper layer.
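Steps 30224 and 30225 can be sketched as recomputing the gradient of the updated parameter and overwriting only that parameter's own buffer, leaving every other buffer intact (`grad_fn` is an assumed stand-in for the layer's actual gradient computation):

```python
def refresh_stored_gradient(storage, layer, name, grad_fn, updated_value):
    # Steps 30224-30225: compute the current gradient of the updated
    # parameter and store it in that parameter's temporary memory,
    # replacing the previously stored value (Ad is replaced by Ad').
    storage[layer][name] = grad_fn(updated_value)
    return storage[layer][name]
```

Because each parameter owns its buffer, the replacement never clobbers the gradients that an adjacent layer's step 30222 still needs to read.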
In step 30226, it is judged whether the gradient value of each parameter of each layer of the deep convolutional neural network satisfies the preset convergence condition.
When the preset convergence condition is satisfied, the training of the parameters of the deep convolutional neural network ends and the operation of step 303 continues. If parameters that do not satisfy the convergence condition still exist in the deep convolutional neural network, the step of training the parameters of the deep convolutional neural network described in steps 30221 to 30226 is performed again, until the gradient value of each parameter of each layer of the deep convolutional neural network satisfies the preset convergence condition, at which point the current value of each parameter of each layer of the deep convolutional neural network is determined as a target parameter of the deep convolutional neural network, thus completing the training of the parameters of the deep convolutional neural network.
For example, the preset convergence condition may be set such that the gradient value of each parameter of each layer of the deep convolutional neural network is 0, or smaller than a relatively small preset value. When the preset convergence condition is satisfied, each parameter of each layer of the deep convolutional neural network is already close to the desired value, so the training of the parameters can end and the classification and recognition of images can proceed.
When the preset convergence condition is still not satisfied, the parameters of the deep convolutional neural network still need to be trained; the process returns to step 30221 and parameter training starts again from the last layer, proceeding from back to front, until the judgment condition of this step is satisfied.
Fig. 6 is a flowchart of an image category determination method according to an exemplary embodiment. As shown in Fig. 6, the determination of the category of the target image according to all of the binary classification probabilities described in step 304 of Fig. 3 includes:
In step 3041, among all of the binary classification probabilities, the binary classification probabilities whose affirmative probability value is greater than a preset probability threshold are determined as target binary classification probabilities.
Taking the 1000 values c_i output in step 303 as an example, a probability threshold can be preset to filter out the categories corresponding to binary classification probabilities whose affirmative probability value is too low. For example, the probability threshold can be set to 0.5: when the probability value of c_i = 1 is less than or equal to 0.5, it can be determined that the target image does not belong to the category corresponding to that c_i; when the probability value of c_i = 1 is greater than 0.5, it can be determined that the target image belongs to the category corresponding to that c_i. The category of the target image can, of course, also be determined from the negative probability: for example, among all of the binary classification probabilities, the binary classification probabilities whose negative probability value is smaller than the preset probability threshold may be determined as the target binary classification probabilities.
In step 3042, the categories corresponding to all of the target binary classification probabilities are determined as the categories of the target image.
The categories corresponding to all of the target binary classification probabilities determined in step 3041 form the category set to which the target image may belong; the category set may include one or more class labels. Because all of the binary classification probabilities are mutually independent, they do not interfere with one another, so the probabilities that the target image belongs to the categories corresponding to the respective target binary classification probabilities are no longer mutually exclusive but are independent of one another. Multiple categories of the target image can therefore be determined on the basis of the above method, thereby realizing the multi-label classification of the target image.
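Steps 3041 and 3042 can be sketched as a single filter over the affirmative probabilities (the function name and the example probabilities are assumptions for illustration):

```python
def target_categories(c, threshold=0.5):
    # Step 3041: keep the binary classification probabilities whose
    # affirmative probability exceeds the preset threshold; step 3042:
    # the corresponding category indices form the (possibly multi-label)
    # category set of the target image.
    return [i for i, p in enumerate(c) if p > threshold]

labels = target_categories([0.9, 0.3, 0.7, 0.5])  # indices 0 and 2
```

Note that a probability equal to the threshold is excluded, matching the text's rule that a value less than or equal to 0.5 means the image does not belong to that category.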
In summary, in the image category determination method provided by the embodiments of the present disclosure, the method obtains the image data of a target image; determines the target parameters of a deep convolutional neural network by a backward propagation algorithm, the deep convolutional neural network being used to recognize the image data; after the target parameters of the deep convolutional neural network are determined, takes the image data as the input of the deep convolutional neural network and obtains the preset number of binary classification probabilities output by the deep convolutional neural network; and then determines the category of the target image according to all of the binary classification probabilities. By improving the structure of the deep convolutional neural network so that it outputs mutually independent binary classification probabilities, the technical solution provided by the embodiments of the present disclosure enables multi-label classification of images.
Fig. 7 is a block diagram of an image category determination apparatus according to an exemplary embodiment. As shown in Fig. 7, the apparatus 700 includes:
Image data acquisition module 710, configured to acquire the image data of a target image.
Image data recognition module 720, configured to determine the target parameters of a deep convolutional neural network by a backward propagation algorithm, the deep convolutional neural network being used to recognize the image data.
Binary classification probability output module 730, configured to, after the target parameters of the deep convolutional neural network are determined, take the image data as the input of the deep convolutional neural network and obtain the preset number of binary classification probabilities output by the deep convolutional neural network, wherein each binary classification probability corresponds to one category and the binary classification probabilities are mutually independent.
Image category determination module 740, configured to determine the category of the target image according to all of the binary classification probabilities.
Optionally, the deep convolutional neural network includes: a two-dimensional-neuron fully connected layer, at least one one-dimensional-neuron fully connected layer and at least one convolutional layer, wherein the two-dimensional-neuron fully connected layer is located in the layer below the at least one one-dimensional-neuron fully connected layer, and the at least one one-dimensional-neuron fully connected layer is located in the layer below the at least one convolutional layer.
Optionally, Fig. 8 is a block diagram of an image data recognition module according to an exemplary embodiment. As shown in Fig. 8, the image data recognition module 720 includes a memory establishment submodule 721 and a parameter training submodule 722:
Memory establishment submodule 721, configured to establish a temporary memory for each parameter of each layer of the deep convolutional neural network.
Parameter training submodule 722, configured to train the parameters of the deep convolutional neural network by the backward propagation algorithm, using the temporary memories established for each parameter of each layer of the deep convolutional neural network.
After all layers of the deep convolutional neural network have performed the step of training the parameters of the deep convolutional neural network once, the step of training the parameters of the deep convolutional neural network is performed again, until the gradient value of each parameter of each layer of the deep convolutional neural network satisfies the preset convergence condition, at which point the current value of each parameter of each layer of the deep convolutional neural network is determined as a target parameter of the deep convolutional neural network.
Optionally, the parameter training submodule 722 is configured to: when no next layer exists for the current layer of the deep convolutional neural network, add each parameter of the current layer to the gradient value stored in the temporary memory corresponding to that parameter of the current layer, to obtain the updated parameters of the current layer; wherein the gradient value stored in the temporary memory corresponding to each parameter was obtained the last time the step of training the parameters of the deep convolutional neural network was performed, and the initial value of the gradient value stored in the temporary memory corresponding to each parameter of each layer of the deep convolutional neural network is zero;
obtain the current gradient value of each updated parameter of the current layer; and
store the current gradient value of each updated parameter of the current layer in the temporary memory corresponding to that parameter of the current layer.
Optionally, the parameter training submodule 722 is further configured to: when a next layer exists for the current layer of the deep convolutional neural network, add the gradient value stored in the temporary memory corresponding to each parameter of the current layer to the gradient values, stored in the temporary memories of the next layer, of the parameters associated with that parameter of the current layer, to obtain a new gradient value for each parameter of the current layer; wherein the gradient value stored in the temporary memory corresponding to each parameter was obtained the last time the step of training the parameters of the deep convolutional neural network was performed, and the initial value of the gradient value stored in the temporary memory corresponding to each parameter of each layer of the deep convolutional neural network is zero;
add the new gradient value of each parameter of the current layer to that parameter of the current layer, to obtain the updated parameters of the current layer;
obtain the current gradient value of each updated parameter of the current layer; and
store the current gradient value of each updated parameter of the current layer in the temporary memory corresponding to that parameter of the current layer;
wherein the current layer and the next layer are any two adjacent layers of the deep convolutional neural network.
Optionally, a binary classification probability includes an affirmative probability value and a negative probability value, the affirmative probability value being the probability value that the image belongs to the category corresponding to the binary classification probability, and the sum of the affirmative probability value and the negative probability value being 1.
Fig. 9 is a block diagram of an image category determination module according to an exemplary embodiment. As shown in Fig. 9, the image category determination module 740 includes:
Binary classification probability identification submodule 741, configured to determine, among all of the binary classification probabilities, the binary classification probabilities whose affirmative probability value is greater than the preset probability threshold as the target binary classification probabilities.
Image category determination submodule 742, configured to determine the categories corresponding to all of the target binary classification probabilities as the categories of the target image.
Optionally, the two-dimensional-neuron fully connected layer determines the preset number of two-dimensional neurons using a logistic regression function.
In summary, the image category determination apparatus provided by the present disclosure obtains the image data of a target image; determines the target parameters of a deep convolutional neural network by a backward propagation algorithm, the deep convolutional neural network being used to recognize the image data; after the target parameters of the deep convolutional neural network are determined, takes the image data as the input of the deep convolutional neural network and obtains the preset number of binary classification probabilities output by the deep convolutional neural network; and then determines the category of the target image according to all of the binary classification probabilities. By improving the structure of the deep convolutional neural network so that it outputs mutually independent binary classification probabilities, the technical solution provided by the present disclosure enables multi-label classification of images.
With regard to the apparatus of the above embodiment, the specific manner in which the respective modules perform their operations has been described in detail in the embodiments of the related method, and will not be elaborated here.
Figure 10 is a block diagram of an image category determination apparatus 1000 according to another exemplary embodiment. For example, the apparatus 1000 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to Fig. 10, the apparatus 1000 may include one or more of the following components: a processing component 1002, a memory 1004, a power component 1006, a multimedia component 1008, an audio component 1010, an input/output (I/O) interface 1012, a sensor component 1014, and a communication component 1016.
The processing component 1002 generally controls the overall operation of the apparatus 1000, such as operations associated with display, telephone calls, data communication, camera operation and recording operation. The processing component 1002 may include one or more processors 1020 to execute instructions, so as to complete all or part of the steps of the determination of the image category. In addition, the processing component 1002 may include one or more modules to facilitate interaction between the processing component 1002 and other components. For example, the processing component 1002 may include a multimedia module to facilitate interaction between the multimedia component 1008 and the processing component 1002.
The memory 1004 is configured to store various types of data to support operation of the apparatus 1000. Examples of such data include instructions for any application or method operated on the apparatus 1000, contact data, phonebook data, messages, images, videos and the like. The memory 1004 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as a static random access memory (SRAM), an electrically erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a programmable read-only memory (PROM), a read-only memory (ROM), a magnetic memory, a flash memory, a magnetic disk or an optical disk.
The power component 1006 provides power to the various components of the apparatus 1000. The power component 1006 may include a power management system, one or more power supplies, and other components associated with generating, managing and distributing power for the apparatus 1000.
The multimedia component 1008 includes a screen providing an output interface between the apparatus 1000 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe action. In some embodiments, the multimedia component 1008 includes a front camera and/or a rear camera. When the apparatus 1000 is in an operating mode, such as a photographing mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each of the front camera and the rear camera may be a fixed optical lens system or have focusing and optical zoom capabilities.
The audio component 1010 is configured to output and/or input audio signals. For example, the audio component 1010 includes a microphone (MIC) configured to receive external audio signals when the apparatus 1000 is in an operating mode, such as a call mode, a recording mode or a speech recognition mode. The received audio signals may be further stored in the memory 1004 or transmitted via the communication component 1016. In some embodiments, the audio component 1010 further includes a speaker for outputting audio signals.
The I/O interface 1012 provides an interface between the processing component 1002 and peripheral interface modules, which may be a keyboard, a click wheel, buttons and the like. These buttons may include, but are not limited to: a home button, a volume button, a start button and a lock button.
The sensor component 1014 includes one or more sensors for providing status assessments of various aspects of the apparatus 1000. For example, the sensor component 1014 may detect the open/closed state of the apparatus 1000 and the relative positioning of components, such as the display and keypad of the apparatus 1000; the sensor component 1014 may also detect a change in position of the apparatus 1000 or of a component of the apparatus 1000, the presence or absence of user contact with the apparatus 1000, the orientation or acceleration/deceleration of the apparatus 1000, and a change in temperature of the apparatus 1000. The sensor component 1014 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 1014 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 1014 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
The communication component 1016 is configured to facilitate wired or wireless communication between the apparatus 1000 and other devices. The apparatus 1000 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one exemplary embodiment, the communication component 1016 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 1016 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
In an exemplary embodiment, the apparatus 1000 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components, for performing the determination of the image category.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as the memory 1004 including instructions, the instructions being executable by the processor 1020 of the apparatus 1000 to complete the determination of the image category. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device or the like.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the disclosure. The present application is intended to cover any variations, uses or adaptations of the present disclosure that follow the general principles of the present disclosure and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be considered exemplary only, and the true scope and spirit of the present disclosure are indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (16)

1. A method for determining an image category, comprising:
acquiring image data of a target image;
determining target parameters of a deep convolutional neural network by a backward transfer algorithm, the deep convolutional neural network being used to recognize the image data;
after the target parameters of the deep convolutional neural network have been determined, taking the image data as input to the deep convolutional neural network, and obtaining a predetermined number of two-class probabilities output by the deep convolutional neural network, wherein each two-class probability corresponds to one category, and the two-class probabilities are mutually independent; and
determining the category of the target image according to all of the two-class probabilities.
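As a loose, non-authoritative illustration of the output stage described in claim 1 (not the patented implementation), each category can be given an independent two-class probability by applying a logistic function per category rather than a softmax over all categories. The logit values below are hypothetical:

```python
import numpy as np

def sigmoid(x):
    # Logistic function mapping a logit to an affirmative probability in (0, 1).
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

def two_class_probabilities(logits):
    """Turn one logit per category into a predetermined number of mutually
    independent two-class probabilities: each (affirmative, negative) pair
    sums to 1, and pairs for different categories do not compete with each
    other (unlike a softmax over all categories)."""
    p_yes = sigmoid(logits)
    return np.stack([p_yes, 1.0 - p_yes], axis=1)

# Hypothetical logits for four categories.
probs = two_class_probabilities([2.0, -1.0, 0.5, -3.0])
print(probs.shape)                          # (4, 2)
print(np.allclose(probs.sum(axis=1), 1.0))  # True: each pair sums to 1
```

Because the pairs are independent, any number of categories (zero, one, or several) can simultaneously have a high affirmative probability, which is what makes this output suitable for multi-label image classification.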
2. The method according to claim 1, wherein the deep convolutional neural network comprises: a fully connected layer of two-dimensional neurons, at least one fully connected layer of one-dimensional neurons, and at least one convolutional layer, wherein the fully connected layer of two-dimensional neurons is located below the at least one fully connected layer of one-dimensional neurons, and the at least one fully connected layer of one-dimensional neurons is located below the at least one convolutional layer.
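The layer ordering of claim 2 (convolutional layers feeding at least one fully connected layer of one-dimensional neurons, which feeds a fully connected layer of two-dimensional neurons) can be sketched at the shape level. In this sketch the flattening stand-in for the convolutional layers, the hidden size, the ReLU, and the random placeholder weights are all assumptions for illustration, not taken from the disclosure:

```python
import numpy as np

rng = np.random.default_rng(0)

def sketch_forward(image, num_categories=5, hidden_size=64):
    # Stand-in for the convolutional layers: flatten the image to features.
    features = image.reshape(-1).astype(float)
    # At least one fully connected layer of one-dimensional neurons (ReLU assumed).
    w1 = rng.standard_normal((hidden_size, features.size)) * 0.01
    hidden = np.maximum(w1 @ features, 0.0)
    # Fully connected layer of two-dimensional neurons: one (yes, no) logit
    # pair per category, normalized per category with a logistic/softmax pair.
    w2 = rng.standard_normal((num_categories, 2, hidden_size)) * 0.01
    logits = w2 @ hidden                       # shape (num_categories, 2)
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)    # each row sums to 1

probs = sketch_forward(np.zeros((8, 8, 3)))
print(probs.shape)                          # (5, 2)
print(np.allclose(probs.sum(axis=1), 1.0))  # True
```

The key structural point is the final layer's output shape: one two-dimensional neuron per category, so the network emits the predetermined number of independent two-class probabilities rather than a single distribution over categories.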
3. The method according to claim 1 or 2, wherein determining the target parameters of the deep convolutional neural network by the backward transfer algorithm comprises:
establishing a temporary storage for each parameter of each layer of the deep convolutional neural network;
training the parameters of the deep convolutional neural network by the backward transfer algorithm, using the temporary storages established for the parameters of each layer of the deep convolutional neural network;
wherein, after every layer of the deep convolutional neural network has performed the step of training the parameters of the deep convolutional neural network once, the step of training the parameters of the deep convolutional neural network is performed again, until the gradient values of all parameters of every layer of the deep convolutional neural network satisfy a preset convergence condition, at which point the current values of all parameters of every layer of the deep convolutional neural network are determined as the target parameters of the deep convolutional neural network.
4. The method according to claim 3, wherein the step of training the parameters of the deep convolutional neural network comprises:
when no next layer exists for the current layer of the deep convolutional neural network, adding each parameter of the current layer to the gradient value stored in the temporary storage corresponding to that parameter, to obtain the updated parameters of the current layer; wherein the gradient value stored in the temporary storage corresponding to each parameter was obtained the last time the step of training the parameters of the deep convolutional neural network was performed, and the initial value of the gradient value stored in the temporary storage corresponding to each parameter of every layer of the deep convolutional neural network is zero;
obtaining the current gradient value of each updated parameter of the current layer; and
storing the current gradient value of each updated parameter of the current layer in the temporary storage corresponding to that parameter.
5. The method according to claim 3, wherein the step of training the parameters of the deep convolutional neural network further comprises, when a next layer exists for the current layer of the deep convolutional neural network:
adding the gradient value stored in the temporary storage corresponding to each parameter of the current layer to the gradient values, stored in the temporary storage of the next layer, of the parameters associated with that parameter of the current layer, to obtain a new gradient value for each parameter of the current layer; wherein the gradient value stored in the temporary storage corresponding to each parameter was obtained the last time the step of training the parameters of the deep convolutional neural network was performed, and the initial value of the gradient value stored in the temporary storage corresponding to each parameter of every layer of the deep convolutional neural network is zero;
adding the new gradient value of each parameter of the current layer to that parameter, to obtain the updated parameters of the current layer;
obtaining the current gradient value of each updated parameter of the current layer; and
storing the current gradient value of each updated parameter of the current layer in the temporary storage corresponding to that parameter;
wherein the current layer and the next layer are any two adjacent layers of the deep convolutional neural network.
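Claims 3 to 5 describe a training pass in which each parameter keeps its previous gradient in a temporary storage, the layer with no next layer is updated directly from its stored gradients (claim 4), and every other layer first combines its stored gradients with the associated gradients stored for the adjacent layer (claim 5). The sketch below is a simplified reading under stated assumptions, not the patented training procedure: all layers share one shape, the "association" between adjacent layers' parameters is reduced to same-index coupling, and a stand-in gradient function replaces real back propagation:

```python
import numpy as np

def training_pass(layers, grad_fn, buffers):
    """One pass of the temporary-storage update sketched in claims 3-5.

    layers  : list of same-shaped parameter arrays, starting with the layer
              that has no next layer, followed by its adjacent layers.
    grad_fn : stand-in for back propagation; returns the current gradient of
              a layer's updated parameters.
    buffers : one temporary storage per layer; per the claims, the stored
              gradient values start at zero.
    """
    prev_stored = None
    for i, params in enumerate(layers):
        if prev_stored is None:
            # Claim 4: no next layer -> use this layer's stored gradient only.
            g = buffers[i]
        else:
            # Claim 5: combine the stored gradient with the associated
            # gradient stored for the adjacent layer (same-index assumption).
            g = buffers[i] + prev_stored
        params += g                       # updated parameters of this layer
        buffers[i] = grad_fn(i, params)   # store the current gradient
        prev_stored = buffers[i]          # available to the adjacent layer
    return layers, buffers

# Toy usage: two 3-parameter layers and a constant stand-in gradient of 0.1.
layers = [np.zeros(3), np.zeros(3)]
buffers = [np.zeros(3), np.zeros(3)]
layers, buffers = training_pass(layers, lambda i, p: np.full_like(p, 0.1), buffers)
# On the first pass the stored gradients start at zero, so the first layer's
# parameters are unchanged while the second layer already receives the
# gradient carried over from its neighbor.
```

In claim 3 this pass would be repeated until the stored gradient values of every parameter satisfy the preset convergence condition, at which point the current parameter values become the target parameters.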
6. The method according to claim 1 or 2, wherein each two-class probability comprises an affirmative probability value and a negative probability value, the affirmative probability value being the probability that the image belongs to the category corresponding to that two-class probability; and determining the category of the target image according to all of the two-class probabilities comprises:
determining, among all of the two-class probabilities, each two-class probability whose affirmative probability value is greater than a preset probability threshold as a target two-class probability; and
determining the categories corresponding to all of the target two-class probabilities as the categories of the target image.
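The thresholding step of claim 6 can be sketched in a few lines; the category names and the 0.5 threshold below are illustrative assumptions, not values from the disclosure:

```python
def select_categories(two_class_probs, category_names, threshold=0.5):
    """Keep every category whose affirmative probability value exceeds the
    preset probability threshold. Because the two-class probabilities are
    mutually independent, zero, one, or several categories may be selected."""
    return [name
            for (p_yes, _p_no), name in zip(two_class_probs, category_names)
            if p_yes > threshold]

# Hypothetical (affirmative, negative) pairs for four illustrative categories.
pairs = [(0.92, 0.08), (0.10, 0.90), (0.64, 0.36), (0.05, 0.95)]
print(select_categories(pairs, ["person", "outdoor", "food", "text"]))
# → ['person', 'food']
```

This is the step that turns the network's independent per-category outputs into a (possibly multi-label) set of categories for the target image.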
7. The method according to claim 2, wherein the fully connected layer of two-dimensional neurons comprises the predetermined number of two-dimensional neurons determined using a logistic regression function.
8. A device for determining an image category, comprising:
an image data acquisition module configured to acquire image data of a target image;
an image data recognition module configured to determine target parameters of a deep convolutional neural network by a backward transfer algorithm, the deep convolutional neural network being used to recognize the image data;
a two-class probability output module configured to, after the target parameters of the deep convolutional neural network have been determined, take the image data as input to the deep convolutional neural network and obtain a predetermined number of two-class probabilities output by the deep convolutional neural network, wherein each two-class probability corresponds to one category, and the two-class probabilities are mutually independent; and
an image category determination module configured to determine the category of the target image according to all of the two-class probabilities.
9. The device according to claim 8, wherein the deep convolutional neural network comprises: a fully connected layer of two-dimensional neurons, at least one fully connected layer of one-dimensional neurons, and at least one convolutional layer, wherein the fully connected layer of two-dimensional neurons is located below the at least one fully connected layer of one-dimensional neurons, and the at least one fully connected layer of one-dimensional neurons is located below the at least one convolutional layer.
10. The device according to claim 8 or 9, wherein the image data recognition module comprises a memory establishing sub-module and a parameter training sub-module:
the memory establishing sub-module being configured to establish a temporary storage for each parameter of each layer of the deep convolutional neural network;
the parameter training sub-module being configured to train the parameters of the deep convolutional neural network by the backward transfer algorithm, using the temporary storages established for the parameters of each layer of the deep convolutional neural network;
wherein, after every layer of the deep convolutional neural network has performed the step of training the parameters of the deep convolutional neural network once, the step of training the parameters of the deep convolutional neural network is performed again, until the gradient values of all parameters of every layer of the deep convolutional neural network satisfy a preset convergence condition, at which point the current values of all parameters of every layer of the deep convolutional neural network are determined as the target parameters of the deep convolutional neural network.
11. The device according to claim 10, wherein the parameter training sub-module is configured to:
when no next layer exists for the current layer of the deep convolutional neural network, add each parameter of the current layer to the gradient value stored in the temporary storage corresponding to that parameter, to obtain the updated parameters of the current layer; wherein the gradient value stored in the temporary storage corresponding to each parameter was obtained the last time the step of training the parameters of the deep convolutional neural network was performed, and the initial value of the gradient value stored in the temporary storage corresponding to each parameter of every layer of the deep convolutional neural network is zero;
obtain the current gradient value of each updated parameter of the current layer; and
store the current gradient value of each updated parameter of the current layer in the temporary storage corresponding to that parameter.
12. The device according to claim 10, wherein the parameter training sub-module is further configured to:
when a next layer exists for the current layer of the deep convolutional neural network, add the gradient value stored in the temporary storage corresponding to each parameter of the current layer to the gradient values, stored in the temporary storage of the next layer, of the parameters associated with that parameter of the current layer, to obtain a new gradient value for each parameter of the current layer; wherein the gradient value stored in the temporary storage corresponding to each parameter was obtained the last time the step of training the parameters of the deep convolutional neural network was performed, and the initial value of the gradient value stored in the temporary storage corresponding to each parameter of every layer of the deep convolutional neural network is zero;
add the new gradient value of each parameter of the current layer to that parameter, to obtain the updated parameters of the current layer;
obtain the current gradient value of each updated parameter of the current layer; and
store the current gradient value of each updated parameter of the current layer in the temporary storage corresponding to that parameter;
wherein the current layer and the next layer are any two adjacent layers of the deep convolutional neural network.
13. The device according to claim 8 or 9, wherein each two-class probability comprises an affirmative probability value and a negative probability value, the affirmative probability value being the probability that the image belongs to the category corresponding to that two-class probability; and the image category determination module comprises:
a two-class probability recognition sub-module configured to determine, among all of the two-class probabilities, each two-class probability whose affirmative probability value is greater than a preset probability threshold as a target two-class probability; and
an image category determination sub-module configured to determine the categories corresponding to all of the target two-class probabilities as the categories of the target image.
14. The device according to claim 9, wherein the fully connected layer of two-dimensional neurons comprises the predetermined number of two-dimensional neurons determined using a logistic regression function.
15. A device for determining an image category, the device comprising:
a processor; and
a memory for storing processor-executable instructions;
wherein the processor is configured to:
acquire image data of a target image;
determine target parameters of a deep convolutional neural network by a backward transfer algorithm, the deep convolutional neural network being used to recognize the image data;
after the target parameters of the deep convolutional neural network have been determined, take the image data as input to the deep convolutional neural network, and obtain a predetermined number of two-class probabilities output by the deep convolutional neural network, wherein each two-class probability corresponds to one category, and the two-class probabilities are mutually independent; and
determine the category of the target image according to all of the two-class probabilities.
16. A computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the steps of the method according to any one of claims 1 to 7.
CN201710296095.XA 2017-04-28 2017-04-28 Determination method, device and the storage medium of image category Pending CN107145904A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710296095.XA CN107145904A (en) 2017-04-28 2017-04-28 Determination method, device and the storage medium of image category


Publications (1)

Publication Number Publication Date
CN107145904A true CN107145904A (en) 2017-09-08

Family

ID=59774558

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710296095.XA Pending CN107145904A (en) 2017-04-28 2017-04-28 Determination method, device and the storage medium of image category

Country Status (1)

Country Link
CN (1) CN107145904A (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105426908A (en) * 2015-11-09 2016-03-23 国网冀北电力有限公司信息通信分公司 Convolutional neural network based substation attribute classification method
CN106250921A (en) * 2016-07-26 2016-12-21 北京小米移动软件有限公司 Image processing method and device
CN106548201A (en) * 2016-10-31 2017-03-29 北京小米移动软件有限公司 The training method of convolutional neural networks, image-recognizing method and device


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YAN HUANG ET AL: "Multi-task deep neural network for multi-label learning", 2013 IEEE International Conference on Image Processing *
CHEN ZHI (陈智): "Multi-label scene classification based on convolutional neural networks", China Master's Theses Full-text Database, Information Science and Technology Series *

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2020534600A (en) * 2017-09-22 2020-11-26 コンティネンタル・テーベス・アクチエンゲゼルシヤフト・ウント・コンパニー・オッフェネ・ハンデルスゲゼルシヤフト Lane recognition methods and devices, driver assistance systems and vehicles
CN107729924A (en) * 2017-09-25 2018-02-23 平安科技(深圳)有限公司 Picture review probability interval generation method and picture review decision method
CN107590534A (en) * 2017-10-17 2018-01-16 北京小米移动软件有限公司 Train the method, apparatus and storage medium of depth convolutional neural networks model
CN107590534B (en) * 2017-10-17 2021-02-09 北京小米移动软件有限公司 Method and device for training deep convolutional neural network model and storage medium
CN108229535A (en) * 2017-12-01 2018-06-29 百度在线网络技术(北京)有限公司 Relate to yellow image audit method, apparatus, computer equipment and storage medium
CN108229535B (en) * 2017-12-01 2019-07-23 百度在线网络技术(北京)有限公司 Relate to yellow image audit method, apparatus, computer equipment and storage medium
US11624149B2 (en) 2017-12-22 2023-04-11 Koninklijke Philips N.V. Portable device with image sensor and illumination system for textile classification
WO2019122273A1 (en) * 2017-12-22 2019-06-27 Koninklijke Philips N.V. Portable textile treatment device with control of operating parameter based on textile classification
EP3822404A1 (en) * 2017-12-22 2021-05-19 Koninklijke Philips N.V. Portable textile treatment device with control of operating parameter based on textile classification
CN111527256B (en) * 2017-12-22 2021-06-04 皇家飞利浦有限公司 Portable textile treatment apparatus with control of operating parameters based on textile classification
CN111527256A (en) * 2017-12-22 2020-08-11 皇家飞利浦有限公司 Portable textile treatment apparatus with control of operating parameters based on textile classification
CN110503181A (en) * 2018-05-18 2019-11-26 百度在线网络技术(北京)有限公司 Method and apparatus for generating multilayer neural network
CN110503181B (en) * 2018-05-18 2022-03-01 百度在线网络技术(北京)有限公司 Method and apparatus for generating a multi-layer neural network
CN108921040A (en) * 2018-06-08 2018-11-30 Oppo广东移动通信有限公司 Image processing method and device, storage medium, electronic equipment
CN108984689A (en) * 2018-07-02 2018-12-11 广东睿江云计算股份有限公司 More copies synchronized method and devices in a kind of union file system
CN108984689B (en) * 2018-07-02 2021-08-03 广东睿江云计算股份有限公司 Multi-copy synchronization method and device in combined file system
CN109086811A (en) * 2018-07-19 2018-12-25 南京旷云科技有限公司 Multi-tag image classification method, device and electronic equipment
CN109086742A (en) * 2018-08-27 2018-12-25 Oppo广东移动通信有限公司 scene recognition method, scene recognition device and mobile terminal
CN110163301A (en) * 2019-05-31 2019-08-23 北京金山云网络技术有限公司 A kind of classification method and device of image
CN111178263A (en) * 2019-12-30 2020-05-19 湖北美和易思教育科技有限公司 Real-time expression analysis method and device
CN111178263B (en) * 2019-12-30 2023-09-05 武汉美和易思数字科技有限公司 Real-time expression analysis method and device
CN111275121A (en) * 2020-01-23 2020-06-12 北京百度网讯科技有限公司 Medical image processing method and device and electronic equipment
CN111275121B (en) * 2020-01-23 2023-07-18 北京康夫子健康技术有限公司 Medical image processing method and device and electronic equipment
CN113705594A (en) * 2020-05-21 2021-11-26 北京沃东天骏信息技术有限公司 Method and device for identifying image
CN113705594B (en) * 2020-05-21 2024-05-21 北京沃东天骏信息技术有限公司 Image identification method and device

Similar Documents

Publication Publication Date Title
CN107145904A (en) Determination method, device and the storage medium of image category
CN105809704B (en) Identify the method and device of image definition
CN107798669A (en) Image defogging method, device and computer-readable recording medium
CN107909113A (en) Traffic-accident image processing method, device and storage medium
CN106548468B (en) The method of discrimination and device of image definition
CN107492115A (en) The detection method and device of destination object
CN106651955A (en) Method and device for positioning object in picture
CN107193983A (en) Image search method and device
CN106446782A (en) Image identification method and device
CN107220667A (en) Image classification method, device and computer-readable recording medium
CN106331504A (en) Shooting method and device
CN108009600A (en) Model optimization, quality determining method, device, equipment and storage medium
CN108010060A (en) Object detection method and device
CN107563994A (en) The conspicuousness detection method and device of image
CN106778531A (en) Face detection method and device
CN106295511A (en) Face tracking method and device
CN107832836A (en) Model-free depth enhancing study heuristic approach and device
CN108230232A (en) The method and relevant apparatus of a kind of image procossing
CN107748867A (en) The detection method and device of destination object
CN106228158A (en) The method and apparatus of picture detection
CN107527024A (en) Face face value appraisal procedure and device
CN107194464A (en) The training method and device of convolutional neural networks model
CN106295515A (en) Determine the method and device of human face region in image
CN108062547A (en) Character detecting method and device
CN107967459A (en) convolution processing method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20170908