CN107229968A - Gradient parameter determination method, device and computer-readable storage medium - Google Patents


Info

Publication number
CN107229968A
CN107229968A (application CN201710373287.6A)
Authority
CN
China
Prior art keywords
gradient
fully connected layer
neural network
convolutional
parameter
Prior art date
Legal status (the legal status is an assumption and is not a legal conclusion)
Granted
Application number
CN201710373287.6A
Other languages
Chinese (zh)
Other versions
CN107229968B (en)
Inventor
万韶华
Current Assignee (the listed assignees may be inaccurate)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (the priority date is an assumption and is not a legal conclusion)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd
Priority to CN201710373287.6A
Publication of CN107229968A
Application granted; publication of CN107229968B
Legal status: Active


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/04 — Architecture, e.g. interconnection topology
    • G06N 3/045 — Combinations of networks
    • G06N 3/08 — Learning methods
    • G06N 3/084 — Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to a gradient parameter determination method, device, and computer-readable storage medium in the field of image processing technology. The method includes: through a designated fully connected layer in a convolutional neural network model to be trained — a layer located at a designated position among the convolutional layers included in the model — receiving a first gradient transmitted by the convolutional layer that follows the designated fully connected layer, and determining a second gradient through the designated fully connected layer. The first gradient and the second gradient are summed to obtain a third gradient, and the third gradient is determined as the gradient parameter used to train the convolutional neural network model. Because summing the first gradient and the second gradient strengthens the gradient parameter, the resulting gradient parameter can propagate deeper, which accelerates the convergence of the algorithm.

Description

Gradient parameter determination method, device and computer-readable storage medium
Technical field
The present disclosure relates to the field of image processing technology, and in particular to a gradient parameter determination method, a device, and a computer-readable storage medium.
Background
With the rapid development of image processing technology, convolutional neural network models have been widely applied to image recognition. For example, if an image to be recognized is input into a convolutional neural network model whose training has been completed, the model can identify the category of the image: input an image of a cat into such a trained model, and the model identifies the category of the image as "cat".
To recognize images successfully, a convolutional neural network model usually needs to be trained in advance on training images. Such a model is typically a series connection of multiple convolutional layers, activation layers, pooling layers, and fully connected layers. The training process is as follows: a training image is fed to the input layer of the convolutional neural network model to be trained; after the model has recognized the training image, the output layer emits a predicted class probability. Then, based on the class-probability error between the predicted class probability and the initial class probability, a gradient parameter is determined for each layer, and each layer's gradient parameter is used to adjust the initial model parameters of that layer. In practice, to increase recognition accuracy the model usually needs to be trained at greater depth, most commonly by increasing the number of convolutional layers in the model.
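The training loop described above can be sketched in a few lines; `forward`, `backward`, and `update` are hypothetical placeholders standing in for the model-specific computations, not anything fixed by this disclosure:

```python
def training_step(forward, backward, update, image, target):
    """One training iteration as described in the background: forward
    pass, class-probability error, per-layer gradients, parameter
    update. The three callables are hypothetical placeholders."""
    predicted = forward(image)                          # predicted class probability
    error = [p - t for p, t in zip(predicted, target)]  # class-probability error
    gradients = backward(error)                         # gradient parameter per layer
    update(gradients)                                   # adjust each layer's parameters
    return error

# Trivial stand-ins, just to show the call shape:
err = training_step(lambda img: [0.5, 0.5],   # forward
                    lambda e: [e],            # backward
                    lambda g: None,           # update
                    image=None, target=[1.0, 0.0])
```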
Summary of the invention
To overcome problems in the related art, the present disclosure provides a gradient parameter determination method, a device, and a computer-readable storage medium.
According to a first aspect, a gradient parameter determination method is provided, the method including:
receiving, through a designated fully connected layer in a convolutional neural network model to be trained, a first gradient transmitted by the convolutional layer that follows the designated fully connected layer, where the designated fully connected layer is located at a designated position among the convolutional layers included in the convolutional neural network model, and the convolutional layer following the designated fully connected layer is the one closer to the model's output layer;
determining a second gradient through the designated fully connected layer, where the second gradient is determined based on a first class-probability error, the first class-probability error is the error between a first predicted class probability and the initial class probability, and the first predicted class probability is obtained by recognizing the training image through the layers located above the designated fully connected layer in the model;
summing the first gradient and the second gradient to obtain a third gradient; and
determining the third gradient as the gradient parameter used to train the convolutional neural network model.
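As a minimal sketch (gradients represented here as plain Python lists, which is an assumption of this illustration, not the patent's data layout), the last two steps reduce to an element-wise addition:

```python
def combine_gradients(first_gradient, second_gradient):
    """Sum the first gradient (received from the following convolutional
    layer) and the second gradient (computed locally by the designated
    fully connected layer) to obtain the third gradient."""
    if len(first_gradient) != len(second_gradient):
        raise ValueError("gradient shapes must match")
    return [a + b for a, b in zip(first_gradient, second_gradient)]

# The summed gradient is what then serves as the gradient parameter.
third_gradient = combine_gradients([0.5, -0.25, 0.125], [0.25, 0.5, -0.5])
# third_gradient == [0.75, 0.25, -0.375]
```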
Optionally, determining the second gradient through the designated fully connected layer includes:
recognizing the training image through the layers located above the designated fully connected layer in the convolutional neural network model to obtain the first predicted class probability;
determining the difference between the first predicted class probability and the initial class probability to obtain the first class-probability error; and
based on the first class-probability error, determining the second gradient through the designated fully connected layer using a designated gradient descent method.
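The disclosure does not fix which "designated gradient descent method" is used. One common instantiation, shown purely as an illustrative assumption, is a softmax output trained with cross-entropy loss, where the gradient with respect to the layer's pre-activations is exactly the difference between the predicted and true class probabilities:

```python
def second_gradient_from_error(predicted, target):
    """Class-probability error, which for a softmax + cross-entropy head
    coincides with the loss gradient w.r.t. the pre-activations. This
    pairing is an illustrative assumption, not fixed by the patent."""
    return [p - t for p, t in zip(predicted, target)]

predicted = [0.75, 0.125, 0.125]  # first predicted class probability
target    = [1.0, 0.0, 0.0]       # initial (true) class probability
second_gradient = second_gradient_from_error(predicted, target)
# second_gradient == [-0.25, 0.125, 0.125]
```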
Optionally, before receiving, through the designated fully connected layer in the convolutional neural network model to be trained, the first gradient transmitted by the convolutional layer following the designated fully connected layer, the method further includes:
recognizing the training image through all layers included in the convolutional neural network model to obtain a second predicted class probability;
determining the difference between the second predicted class probability and the initial class probability to obtain a second class-probability error; and
based on the second class-probability error, determining the first gradient through the convolutional layer following the designated fully connected layer in the model using a designated gradient descent method.
Optionally, after determining the third gradient as the gradient parameter used to train the convolutional neural network model, the method further includes:
determining the product of the gradient length of the third gradient and a designated coefficient to obtain a moving step length, and moving the model parameters of the designated fully connected layer along the gradient direction of the third gradient by the moving step length, where the designated coefficient is any preset coefficient; and
passing the third gradient to the convolutional layer preceding the designated fully connected layer, so that the gradient parameter continues to propagate.
Optionally, when the model parameters included in the convolutional neural network model are the initial model parameters, the initial model parameters are any preset parameters.
According to a second aspect, a gradient parameter determination device is provided, the device including:
a receiving module, configured to receive, through a designated fully connected layer in a convolutional neural network model to be trained, a first gradient transmitted by the convolutional layer following the designated fully connected layer, where the designated fully connected layer is located at a designated position among the convolutional layers included in the model, and the convolutional layer following the designated fully connected layer is the one closer to the model's output layer;
a first determining module, configured to determine a second gradient through the designated fully connected layer, where the second gradient is determined based on a first class-probability error, the first class-probability error is the error between a first predicted class probability and the initial class probability, and the first predicted class probability is obtained by recognizing the training image through the layers located above the designated fully connected layer in the model;
a computing module, configured to sum the first gradient received by the receiving module and the second gradient determined by the first determining module to obtain a third gradient; and
a second determining module, configured to determine the third gradient obtained by the computing module as the gradient parameter used to train the convolutional neural network model.
Optionally, the first determining module is configured to:
recognize the training image through the layers located above the designated fully connected layer in the convolutional neural network model to obtain the first predicted class probability;
determine the difference between the first predicted class probability and the initial class probability to obtain the first class-probability error; and
based on the first class-probability error, determine the second gradient through the designated fully connected layer using a designated gradient descent method.
Optionally, the device further includes:
a recognition processing module, configured to recognize the training image through all layers included in the convolutional neural network model to obtain a second predicted class probability;
a third determining module, configured to determine the difference between the second predicted class probability and the initial class probability to obtain a second class-probability error; and
a fourth determining module, configured to determine, based on the second class-probability error, the first gradient through the convolutional layer following the designated fully connected layer in the model using a designated gradient descent method.
Optionally, the device further includes:
a fifth determining module, configured to determine the product of the gradient length of the third gradient and a designated coefficient to obtain a moving step length, and to move the model parameters of the designated fully connected layer along the gradient direction of the third gradient by the moving step length, where the designated coefficient is any preset coefficient; and
a transfer module, configured to pass the third gradient to the convolutional layer preceding the designated fully connected layer, so that the gradient parameter continues to propagate.
Optionally, when the model parameters included in the convolutional neural network model are the initial model parameters, the initial model parameters are any preset parameters.
According to a third aspect, a gradient parameter determination device is provided, the device including:
a processor; and
a memory for storing processor-executable instructions;
wherein the processor is configured to:
receive, through a designated fully connected layer in a convolutional neural network model to be trained, a first gradient transmitted by the convolutional layer following the designated fully connected layer, where the designated fully connected layer is located at a designated position among the convolutional layers included in the model, and the convolutional layer following the designated fully connected layer is the one closer to the model's output layer;
determine a second gradient through the designated fully connected layer, where the second gradient is determined based on a first class-probability error, the first class-probability error is the error between a first predicted class probability and the initial class probability, and the first predicted class probability is obtained by recognizing the training image through the layers located above the designated fully connected layer in the model;
sum the first gradient and the second gradient to obtain a third gradient; and
determine the third gradient as the gradient parameter used to train the convolutional neural network model.
According to a fourth aspect, a computer-readable storage medium is provided, storing instructions that, when executed by a processor, implement the following steps:
receiving, through a designated fully connected layer in a convolutional neural network model to be trained, a first gradient transmitted by the convolutional layer following the designated fully connected layer, where the designated fully connected layer is located at a designated position among the convolutional layers included in the model, and the convolutional layer following the designated fully connected layer is the one closer to the model's output layer;
determining a second gradient through the designated fully connected layer, where the second gradient is determined based on a first class-probability error, the first class-probability error is the error between a first predicted class probability and the initial class probability, and the first predicted class probability is obtained by recognizing the training image through the layers located above the designated fully connected layer in the model;
summing the first gradient and the second gradient to obtain a third gradient; and
determining the third gradient as the gradient parameter used to train the convolutional neural network model.
The technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects:
In the embodiments of the present disclosure, a designated fully connected layer is inserted among the convolutional layers included in a convolutional neural network model. The layers located above the designated fully connected layer recognize the training image to obtain a first predicted class probability; the designated fully connected layer determines the error between the first predicted class probability and the initial class probability and, based on that error, determines a second gradient. When the designated fully connected layer receives a first gradient transmitted by the following convolutional layer, which is closer to the model's output layer, the first gradient and the second gradient are summed to obtain a third gradient, and the third gradient is determined as the gradient parameter used to train the model. Because summing the first gradient and the second gradient strengthens the gradient parameter, the resulting gradient parameter can propagate deeper, which accelerates the convergence of the algorithm.
It should be understood that the foregoing general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the principles of the disclosure.
Fig. 1 is a flowchart of a gradient parameter determination method according to an exemplary embodiment.
Fig. 2A is a flowchart of a gradient parameter determination method according to another exemplary embodiment.
Fig. 2B is a schematic diagram of the connections between the layers of a convolutional neural network model involved in the embodiment of Fig. 2A.
Fig. 3A is a structural block diagram of a gradient parameter determination device according to an exemplary embodiment.
Fig. 3B is a structural block diagram of another gradient parameter determination device according to an exemplary embodiment.
Fig. 3C is a structural block diagram of another gradient parameter determination device according to an exemplary embodiment.
Fig. 4 is a block diagram of a gradient parameter determination device 400 according to an exemplary embodiment.
Detailed description
Exemplary embodiments will be described in detail here, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of devices and methods consistent with some aspects of the disclosure as detailed in the appended claims.
Before the embodiments of the present disclosure are described in detail, the terms involved are briefly introduced:
Convolutional neural network model: a feedforward neural network, typically composed of multiple convolutional layers and multiple fully connected layers; in addition, it also includes multiple activation layers and multiple pooling layers. In a specific implementation, the backpropagation algorithm can be used to train such a model.
Predicted class probability: the probability that a training image belongs to a preset category. The preset categories can be customized by a technician according to actual requirements; for example, they may include "cat", "dog", "bear", "lion", "tiger", and so on. The predicted class probability is obtained after the convolutional neural network model to be trained has recognized the training image.
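How raw output-layer scores become a predicted class probability is not spelled out here; a softmax over the preset categories is the usual choice and is shown only as an illustrative sketch (the scores below are made up):

```python
import math

def softmax(scores):
    """Map raw output-layer scores to class probabilities that sum to 1
    (a typical, but here assumed, choice for the output layer)."""
    m = max(scores)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

categories = ["cat", "dog", "bear", "lion", "tiger"]  # the preset categories
probs = softmax([3.1, 0.4, -1.2, 0.0, 0.5])
predicted_category = categories[probs.index(max(probs))]  # "cat"
```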
Initial class probability: typically customized by a technician according to actual requirements; the initial class probability is also commonly referred to as the true class probability of the training image.
Model parameters: the parameters of the convolutional neural network model, generally including the convolution kernels of the convolutional layers, the weight matrices of the fully connected layers, and so on; they are mainly used in recognizing the training image.
Next, the application scenario of the embodiments of the present disclosure is introduced. At present, to improve the accuracy with which a convolutional neural network model recognizes images, the number of convolutional layers in the model is typically increased so that the model can be trained at greater depth. However, as the number of convolutional layers grows, the gradient parameter becomes smaller and smaller as it is propagated, which slows the update of the model parameters in the lower layers and can even prevent convergence. The embodiments of the present disclosure therefore provide a gradient parameter determination method that inserts a designated fully connected layer at a designated position among the convolutional layers and uses it to strengthen the gradient parameter, so that the gradient parameter can propagate farther; this accelerates the convergence of the algorithm and avoids the problem of the algorithm failing to converge due to deep training. The method provided by the embodiments of the present disclosure can be executed by a terminal, which may be a device such as a tablet computer or a personal computer; the embodiments of the present disclosure are not limited in this respect.
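The shrinking-gradient problem motivating the method can be illustrated numerically: if each layer scales the back-propagated gradient by a factor below one (the factor 0.5 here is purely illustrative), the magnitude decays geometrically with depth:

```python
def backprop_magnitude(initial, per_layer_factor, num_layers):
    """Gradient magnitude after the chain rule multiplies it by
    per_layer_factor at each of num_layers layers."""
    g = initial
    for _ in range(num_layers):
        g *= per_layer_factor
    return g

shallow = backprop_magnitude(1.0, 0.5, 5)   # 0.03125: still usable
deep = backprop_magnitude(1.0, 0.5, 30)     # ~9.3e-10: effectively vanished
```

This is exactly the regime in which adding a locally computed second gradient partway through the network keeps the summed gradient parameter from vanishing.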
Fig. 1 is a flowchart of a gradient parameter determination method according to an exemplary embodiment. As shown in Fig. 1, the method is used in a terminal and includes the following steps:
In step 101, through a designated fully connected layer in a convolutional neural network model to be trained, a first gradient transmitted by the convolutional layer following the designated fully connected layer is received. The designated fully connected layer is located at a designated position among the convolutional layers included in the model, and the convolutional layer following the designated fully connected layer is the one closer to the model's output layer.
In step 102, a second gradient is determined through the designated fully connected layer. The second gradient is determined based on a first class-probability error, which is the error between a first predicted class probability and the initial class probability; the first predicted class probability is obtained by recognizing the training image through the layers located above the designated fully connected layer in the model.
In step 103, the first gradient and the second gradient are summed to obtain a third gradient.
In step 104, the third gradient is determined as the gradient parameter used to train the convolutional neural network model.
In the disclosed embodiments, a designated fully connected layer is inserted among the convolutional layers included in the convolutional neural network model. The layers located above the designated fully connected layer recognize the training image to obtain a first predicted class probability; the designated fully connected layer determines the error between the first predicted class probability and the initial class probability and determines a second gradient based on that error. When the designated fully connected layer receives the first gradient transmitted by the following convolutional layer, which is closer to the model's output layer, the first gradient and the second gradient are summed to obtain a third gradient, and the third gradient is determined as the gradient parameter used to train the model. Because summing the first gradient and the second gradient strengthens the gradient parameter, the resulting gradient parameter can propagate deeper, which accelerates the convergence of the algorithm.
Optionally, determining the second gradient through the designated fully connected layer includes:
recognizing the training image through the layers located above the designated fully connected layer in the convolutional neural network model to obtain the first predicted class probability;
determining the difference between the first predicted class probability and the initial class probability to obtain the first class-probability error; and
based on the first class-probability error, determining the second gradient through the designated fully connected layer using a designated gradient descent method.
Optionally, before receiving, through the designated fully connected layer in the convolutional neural network model to be trained, the first gradient transmitted by the convolutional layer following the designated fully connected layer, the method further includes:
recognizing the training image through all layers included in the convolutional neural network model to obtain a second predicted class probability;
determining the difference between the second predicted class probability and the initial class probability to obtain a second class-probability error; and
based on the second class-probability error, determining the first gradient through the convolutional layer following the designated fully connected layer in the model using a designated gradient descent method.
Optionally, after the third gradient is determined as the gradient parameter used to train the convolutional neural network model, the method further includes:
determining the product of the gradient length of the third gradient and a designated coefficient to obtain a moving step length, and moving the model parameters of the designated fully connected layer along the gradient direction of the third gradient by the moving step length, where the designated coefficient is any preset coefficient; and
passing the third gradient to the convolutional layer preceding the designated fully connected layer, so that the gradient parameter continues to propagate.
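The update rule just described — a step length equal to the third gradient's length times a preset coefficient, applied along the gradient direction — can be sketched as follows (the coefficient value is arbitrary, as the text says, and the vector values are made up):

```python
import math

def update_parameters(params, third_gradient, coefficient):
    """Move the designated fully connected layer's parameters along the
    direction of the third gradient by (gradient length * coefficient)."""
    length = math.sqrt(sum(g * g for g in third_gradient))
    if length == 0.0:
        return list(params)                  # zero gradient: nothing to move
    step = length * coefficient              # the moving step length
    direction = [g / length for g in third_gradient]
    # Stepping along the unit gradient direction by step is algebraically
    # the familiar update params - coefficient * gradient.
    return [p - step * d for p, d in zip(params, direction)]

new_params = update_parameters([1.0, -2.0], [0.6, -0.8], 0.1)
# gradient length 1.0, step 0.1; new_params is approximately [0.94, -1.92]
```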
Optionally, when the model parameters included in the convolutional neural network model are the initial model parameters, the initial model parameters are any preset parameters.
All of the above optional technical solutions may be combined in any way to form alternative embodiments of the present disclosure, which will not be described one by one here.
Fig. 2A is a flowchart of a gradient parameter determination method according to another exemplary embodiment. As shown in Fig. 2A, the method is applied in a terminal and can be implemented by the following steps:
In step 201, the training image is recognized through all layers included in the convolutional neural network model to obtain a second predicted class probability.
The training image can be stored in the terminal in advance by a technician. When the convolutional neural network model needs to be trained, the terminal obtains the training image locally. As noted above, a convolutional neural network model mainly includes multiple convolutional layers and multiple fully connected layers. Referring to Fig. 2B, during training the terminal feeds the obtained training image to the input layer, and all layers included in the model recognize the training image to obtain the second predicted class probability; in a specific implementation, the second predicted class probability is output by the model's output layer. Here, "all layers" includes all convolutional layers and all fully connected layers.
Of course, "all layers" also contains the activation layers and pooling layers; since these contain constants rather than model parameters, they are not discussed further here.
The process by which all layers of the convolutional neural network model recognize the training image can be found in the related art, and the embodiments of the present disclosure are not limited in this respect.
In addition, it should be noted that the input layer can in fact be regarded as the first convolutional layer of the convolutional neural network model, and the output layer as its last fully connected layer; they are called the input layer and the output layer merely for convenience in distinguishing upper and lower layers.
In step 202, the difference between the second predicted class probability and the initial class probability is determined to obtain a second class-probability error.
In a practical implementation, after determining the second predicted class probability, the terminal can compare it with the initial class probability to decide whether the convolutional neural network model needs further training. For example, when the terminal determines that the difference between the second predicted class probability and the initial class probability is greater than or equal to some preset threshold, the recognition ability of the model does not yet meet actual demands; in that case, the terminal continues to adjust the model parameters iteratively based on the resulting second class-probability error, as described below.
Conversely, if the difference between the second predicted class probability and the initial class probability is less than the preset threshold, the recognition ability of the model meets actual demands, i.e., the model can accurately recognize the category of an image; in that case, the training of the convolutional neural network model can be determined to be complete.
The preset threshold can be customized by a technician according to actual requirements or set by default by the terminal; the embodiments of the present disclosure are not limited in this respect.
It should be noted that the above-mentioned difference according between the second prediction class probability and initial category probability is determined Whether completion training is only exemplary, in another embodiment, can also be judged whether to complete training according to iterations, For example, when iterations reaches preset times, it is determined that completing the training to convolutional neural networks model, otherwise, continuing to volume Product neural network model is trained.
Wherein, the preset times can be set by technical staff is self-defined according to the actual requirements, or, can also be by the end Default setting is held, the embodiment of the present disclosure is not limited this.
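The two stopping criteria described above (error below a preset threshold, or iteration count reaching a preset number) can be sketched as follows. The function name, threshold and iteration cap are illustrative values assumed here, not taken from the patent:

```python
def should_stop(pred_prob, target_prob, iteration,
                threshold=0.01, max_iters=10000):
    """Return True when training may be considered complete: either the
    probability error is below the preset threshold, or the iteration
    count has reached the preset number."""
    error = abs(pred_prob - target_prob)
    return error < threshold or iteration >= max_iters
```

Either condition alone suffices to end training, matching the two exemplary criteria given in the text.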
In step 203, based on the second class probability error, the first gradient is determined using a specified gradient descent method by the next convolutional layer of the specified fully connected layer in the convolutional neural network model.
If it is determined according to the above method that training of the convolutional neural network model is not yet complete, the terminal needs to adjust the model parameters of the convolutional neural network model by the specified gradient descent method based on the second class probability error.
That is, after the terminal determines the second class probability error through the convolutional neural network model, it back-propagates the second class probability error to the output layer of the convolutional neural network model, and the output layer determines its gradient parameter using the specified gradient descent method. The output layer then adjusts its model parameters using this gradient parameter, and passes the determined gradient parameter to the previous convolutional layer.
It should be noted that, in a specific implementation, the specified gradient descent method may be SGD (Stochastic Gradient Descent). Of course, the specified gradient descent method may also be another gradient descent method; the embodiments of the present disclosure do not limit this.
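As a point of reference for the SGD method named above, a minimal sketch of one stochastic gradient descent update is given below, assuming a plain learning-rate rule (the learning rate value is illustrative and not specified by the patent):

```python
import numpy as np

def sgd_step(params, grad, lr=0.1):
    """One stochastic gradient descent update: move the parameters
    against the gradient by the learning rate times the gradient."""
    return params - lr * grad

w = np.array([1.0, -2.0])
g = np.array([0.5, 0.5])
w_new = sgd_step(w, g)  # -> array([ 0.95, -2.05])
```

Each layer that receives a gradient parameter applies an update of this form to its own model parameters before passing a gradient on.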
After receiving the gradient parameter, the previous convolutional layer of the output layer again determines a gradient parameter based on the received one using the specified gradient descent method, and adjusts its own model parameters based on the newly determined gradient parameter. It then passes the determined gradient parameter on to the convolutional layer before it.
This process continues according to the above procedure until the gradient parameter is delivered to the next convolutional layer of the specified fully connected layer. Based on the received gradient parameter, that convolutional layer determines the first gradient using the specified gradient descent method, and passes the first gradient to the specified fully connected layer.
For example, referring to Fig. 2B, the next convolutional layer of the specified fully connected layer, i.e. the 20th convolutional layer, receives the gradient parameter passed by the 21st convolutional layer and, based on it, determines the first gradient using the specified gradient descent method. The 20th convolutional layer then adjusts its model parameters based on the first gradient, and passes the first gradient to the specified fully connected layer.
It should be noted that, in an actual implementation, a technician may add a specified fully connected layer at a specified position between the multiple convolutional layers according to actual requirements, for example between the 20th and the 19th convolutional layers. Generally, the specified position is a position at which the gradient parameter falls below some threshold. That is, so that a small gradient parameter can continue to propagate to the lower layers, a specified fully connected layer may be added at that position to enhance the small gradient parameter. In practical applications, the specified fully connected layer is also commonly known as a branch supervisor.
It should be noted that the embodiments of the present disclosure do not limit the number of specified positions. That is, in an actual implementation, specified fully connected layers may be added at multiple specified positions between the convolutional layers; for example, one may be added between the 100th and the 99th convolutional layers, and another between the 20th and the 19th convolutional layers.
In addition, it should also be noted that the embodiments of the present disclosure do not limit the number of specified fully connected layers added at a specified position either; for example, three specified fully connected layers may be connected at the specified position.
Referring to Fig. 2B, which schematically illustrates the connection relationship between the layers included in the convolutional neural network model, the model includes multiple convolutional layers. Assume that specified fully connected layers are added between the 20th and the 19th convolutional layers, and that the number of specified fully connected layers 21a is three.
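The layer arrangement in the example above can be sketched as a simple layer list. The total depth of 24 convolutional layers and the layer names are assumptions made for illustration only; the patent does not fix the network depth:

```python
conv_layers = [f"conv{i}" for i in range(1, 25)]   # assumed 24 convolutional layers
aux_fc = ["aux_fc1", "aux_fc2", "aux_fc3"]          # three specified FC layers (21a)

# Insert the three specified fully connected layers between conv19 and conv20,
# matching the example position described above.
network = conv_layers[:19] + aux_fc + conv_layers[19:]
```

In this ordering, conv20 is the "next convolutional layer" of the specified fully connected layers (closer to the output layer), and conv19 is the "previous convolutional layer" (closer to the input layer).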
In step 204, the specified fully connected layer in the convolutional neural network model to be trained receives the first gradient passed by the next convolutional layer of the specified fully connected layer.
As noted above, the specified fully connected layer is located at a specified position between the multiple convolutional layers included in the convolutional neural network model, and the next convolutional layer of the specified fully connected layer is the one closer to the output layer of the convolutional neural network model.
In step 205, the second gradient is determined by the specified fully connected layer. The second gradient is determined based on a first class probability error, which is the error between the first prediction class probability and the initial class probability; the first prediction class probability is obtained after the multiple layers located above the specified fully connected layer in the convolutional neural network model perform recognition processing on the training image.
Because the first gradient has been passed through many layers, it is usually very small. If it continued to be passed downward unchanged, the algorithm might fail to converge. The first gradient therefore needs to be adjusted so that the gradient parameter to be passed downward can be redetermined.
Accordingly, when the specified fully connected layer receives the first gradient, the second gradient is determined. In a specific implementation, the terminal performs recognition processing on the training image through the multiple layers located above the specified fully connected layer in the convolutional neural network model to obtain the first prediction class probability, determines the difference between the first prediction class probability and the initial class probability to obtain the first class probability error, and, based on the first class probability error, determines the second gradient by the specified fully connected layer using the specified gradient descent method.
For example, referring to Fig. 2B, the terminal performs recognition processing on the training image through the multiple layers above the specified fully connected layer 21a (those closer to the input layer) to obtain the first prediction class probability. The specified fully connected layer 21a computes the first class probability error between the first prediction class probability and the initial class probability and back-propagates this error; based on the first class probability error, the specified fully connected layer 21a determines the second gradient using the specified gradient descent method.
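The second-gradient computation just described can be sketched as follows. The linear head and squared-error loss are assumptions made for illustration; the patent does not specify the form of the specified fully connected layer or of its loss:

```python
import numpy as np

def aux_second_gradient(features, weights, target_prob):
    """Sketch: the specified FC layer maps intermediate features to a first
    prediction class probability, compares it with the target (initial)
    class probability, and derives the second gradient from that error,
    here via a squared-error loss as one possible choice."""
    pred = float(features @ weights)      # first prediction class probability
    error = pred - target_prob            # first class probability error
    return error * features               # gradient of 0.5 * error**2 w.r.t. weights

f = np.array([1.0, 2.0])
w = np.array([0.2, 0.1])
g2 = aux_second_gradient(f, w, 1.0)       # pred = 0.4, error = -0.6
```

The key point is that this second gradient is computed from a loss evaluated at the specified fully connected layer itself, so it has not decayed through the upper layers of the network.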
In fact, it can be seen from the above description that the specified fully connected layer is equivalent to an output layer of the convolutional neural network model. That is, in the embodiments of the present disclosure, two class probability errors need to be determined: the first class probability error and the second class probability error. The first class probability error is determined by the specified fully connected layer arranged between the convolutional layers, and the second class probability error is determined by the output layer of the convolutional neural network model.
In step 206, the first gradient and the second gradient are summed to obtain a third gradient.
Summing the first gradient and the second gradient enhances the first gradient; that is, the back-propagated gradient parameter is reinforced by the specified fully connected layer. In this way, the problem of the gradient parameter shrinking as the training depth increases is solved.
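The summation in step 206 is a plain element-wise addition; the numbers below are illustrative values chosen to show how a tiny back-propagated first gradient is reinforced by the fresh second gradient:

```python
import numpy as np

first_gradient = np.array([1e-6, 2e-6])    # tiny after passing through many layers
second_gradient = np.array([0.3, -0.1])    # fresh from the specified FC layer

# Step 206: sum the first and second gradients to obtain the third gradient.
third_gradient = first_gradient + second_gradient
```

The third gradient is dominated by the second gradient, so the value passed on to the lower layers no longer vanishes.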
It should be noted that the implementation of summing the first gradient and the second gradient may refer to gradient algorithms in the related art; the embodiments of the present disclosure do not limit this.
In step 207, the third gradient is determined as the gradient parameter for training the convolutional neural network model.
That is, in subsequent gradient parameter propagation, the terminal determines the third gradient as the gradient parameter to be passed on. In this way, the gradient parameter can propagate well to the lower layers of the network, thereby accelerating the convergence of the convolutional neural network model.
At this point, the gradient parameter determination method provided by the embodiments of the present disclosure is realized. Further, for ease of deeper understanding, the embodiments of the present disclosure additionally provide the following steps 208 and 209.
In step 208, the product of the gradient length of the third gradient and a specified coefficient is determined to obtain a moving step length, and the model parameters of the specified fully connected layer are moved by the moving step length in the gradient direction of the third gradient; the specified coefficient is any coefficient set in advance.
As noted above, in subsequent gradient parameter propagation the third gradient is passed on. In an actual implementation, the model parameters of the specified fully connected layer may be adjusted based on the third gradient. That is, after the third gradient is determined, its gradient length may be multiplied by the specified coefficient to obtain the moving step length, and the model parameters of the specified fully connected layer are moved by the moving step length in the gradient direction of the third gradient, thereby adjusting the model parameters of the specified fully connected layer.
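The step-length rule of step 208 can be sketched directly. Moving "in the gradient direction" is taken here in the usual descent sense (against the gradient), which is an assumed convention rather than a detail stated by the patent; note the rule is then algebraically the same as subtracting the coefficient times the gradient:

```python
import numpy as np

def move_by_step(params, grad, coefficient=0.01):
    """Moving step length = gradient length * specified coefficient;
    the parameters are moved by that step along the (descending)
    gradient direction."""
    length = np.linalg.norm(grad)          # gradient length of the third gradient
    if length == 0.0:
        return params
    step = coefficient * length            # moving step length
    direction = grad / length              # unit gradient direction
    return params - step * direction       # same as params - coefficient * grad

p = np.array([1.0, 1.0])
g = np.array([3.0, 4.0])                   # gradient length 5, so step = 0.05
p_new = move_by_step(p, g)                 # -> array([0.97, 0.96])
```

The specified coefficient plays the role of a learning rate and, as the text says, may be any coefficient set in advance.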
It should be noted that when the model parameters included in the convolutional neural network model are the initial model parameters, the initial model parameters are any parameters set in advance.
That is, in the embodiments of the present disclosure, since specified fully connected layers can be added at specified positions between the convolutional layers included in the convolutional neural network model, the number of convolutional layers included in the model need not be limited, i.e. the convolutional neural network model can be trained at depth. Therefore, the initial model parameters of the convolutional neural network model need not be limited here, and may be any parameters.
Similarly, the embodiments of the present disclosure do not limit the above specified coefficient either; the specified coefficient may be any coefficient set in advance.
In step 209, the third gradient is passed to the previous convolutional layer of the specified fully connected layer, so that the gradient parameter continues to be transmitted.
After the specified fully connected layer obtains the third gradient by adjusting the first gradient, it may continue to pass the third gradient to the lower layers of the network. For example, still referring to Fig. 2B, the specified fully connected layer 21a may pass the third gradient to the previous convolutional layer of the specified fully connected layer, i.e. to the 19th convolutional layer.
Further, after receiving the third gradient, the 19th convolutional layer determines a fourth gradient based on the third gradient using the specified gradient descent method. It then adjusts its model parameters based on the fourth gradient and passes the fourth gradient to the 18th convolutional layer, and so on until the gradient parameter is transmitted to the input layer, completing one adjustment of the parameters of the convolutional neural network model.
Further, the convolutional neural network model continues to perform recognition processing on the training image based on the adjusted model parameters, and the model parameters included in the convolutional neural network model are adjusted again according to the above implementation process. As described above, this continues until the difference between the second prediction class probability and the initial class probability is less than some preset threshold, or until the number of iterations reaches the preset number, at which point training of the convolutional neural network model is determined to be complete.
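The overall iterate-until-converged loop just described can be illustrated with a toy one-parameter model. This is not the patent's network; the linear model, learning rate and thresholds are assumptions made purely to show the loop structure (forward pass, error, gradient update, stopping test):

```python
def train_toy(x, target, w=0.0, lr=0.1, threshold=1e-3, max_iters=1000):
    """Repeat: forward pass, error computation, gradient update, until
    the error falls below the threshold or the iteration cap is hit."""
    for i in range(max_iters):
        pred = w * x                 # forward pass (recognition processing)
        error = pred - target        # prediction error vs. the target
        if abs(error) < threshold:   # stopping criterion from the text
            break
        w -= lr * error * x          # gradient step on 0.5 * error**2
    return w, i

w_final, n_iters = train_toy(2.0, 4.0)   # converges to w close to 2
```

In the patent's setting, each pass through this loop corresponds to one full forward pass plus one backward pass in which the third gradient carries the update down past the specified fully connected layer.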
In the embodiments of the present disclosure, a specified fully connected layer is added between the multiple convolutional layers included in the convolutional neural network model. A first prediction class probability is obtained by performing recognition processing on the training image through the multiple layers above the specified fully connected layer; the specified fully connected layer determines the error between the first prediction class probability and the initial class probability, and determines the second gradient based on that error. When the specified fully connected layer receives the first gradient passed by its next convolutional layer, the one closer to the output layer of the convolutional neural network model, the first gradient and the second gradient are summed to obtain the third gradient, which is determined as the gradient parameter for training the convolutional neural network model. Since summing the first gradient with the second gradient enhances the gradient parameter, the determined gradient parameter can be transmitted deeper, thereby accelerating the convergence of the algorithm.
Fig. 3A is a structural block diagram of a gradient parameter determining device according to an exemplary embodiment. Referring to Fig. 3A, the device includes a receiving module 310, a first determining module 312, a computing module 314 and a second determining module 316.
The receiving module 310 is configured to receive, through the specified fully connected layer in the convolutional neural network model to be trained, the first gradient passed by the next convolutional layer of the specified fully connected layer; the specified fully connected layer is located at a specified position between the multiple convolutional layers included in the convolutional neural network model, and the next convolutional layer of the specified fully connected layer is closer to the output layer of the convolutional neural network model.
The first determining module 312 is configured to determine the second gradient through the specified fully connected layer; the second gradient is determined based on the first class probability error, which is the error between the first prediction class probability and the initial class probability, the first prediction class probability being obtained after the multiple layers above the specified fully connected layer in the convolutional neural network model perform recognition processing on the training image.
The computing module 314 is configured to sum the first gradient received by the receiving module 310 and the second gradient determined by the first determining module 312 to obtain the third gradient.
The second determining module 316 is configured to determine the third gradient obtained by the computing module 314 as the gradient parameter for training the convolutional neural network model.
Optionally, the first determining module 312 is configured to:
perform recognition processing on the training image through the multiple layers above the specified fully connected layer in the convolutional neural network model to obtain the first prediction class probability;
determine the difference between the first prediction class probability and the initial class probability to obtain the first class probability error;
based on the first class probability error, determine the second gradient by the specified fully connected layer using the specified gradient descent method.
Optionally, referring to Fig. 3B, the device further includes:
a recognition processing module 318, configured to perform recognition processing on the training image through all layers included in the convolutional neural network model to obtain the second prediction class probability;
a third determining module 320, configured to determine the difference between the second prediction class probability and the initial class probability to obtain the second class probability error;
a fourth determining module 322, configured to determine, based on the second class probability error, the first gradient using the specified gradient descent method by the next convolutional layer of the specified fully connected layer in the convolutional neural network model.
Optionally, referring to Fig. 3C, the device further includes:
a fifth determining module 324, configured to determine the product of the gradient length of the third gradient and the specified coefficient to obtain the moving step length, and to move the model parameters of the specified fully connected layer by the moving step length in the gradient direction of the third gradient, the specified coefficient being any coefficient set in advance;
a transfer module 326, configured to pass the third gradient to the previous convolutional layer of the specified fully connected layer, so that the gradient parameter is transmitted.
Optionally, when the model parameters included in the convolutional neural network model are the initial model parameters, the initial model parameters are any parameters set in advance.
In the embodiments of the present disclosure, a specified fully connected layer is added between the multiple convolutional layers included in the convolutional neural network model. A first prediction class probability is obtained by performing recognition processing on the training image through the multiple layers above the specified fully connected layer; the specified fully connected layer determines the error between the first prediction class probability and the initial class probability, and determines the second gradient based on that error. When the specified fully connected layer receives the first gradient passed by its next convolutional layer, the one closer to the output layer of the convolutional neural network model, the first gradient and the second gradient are summed to obtain the third gradient, which is determined as the gradient parameter for training the convolutional neural network model. Since summing the first gradient with the second gradient enhances the gradient parameter, the determined gradient parameter can be transmitted deeper, thereby accelerating the convergence of the algorithm.
Regarding the device in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiment of the related method, and will not be elaborated here.
Fig. 4 is a block diagram of a gradient parameter determining device 400 according to an exemplary embodiment. For example, the device 400 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, etc.
Referring to Fig. 4, the device 400 may include one or more of the following components: a processing component 402, a memory 404, a power component 406, a multimedia component 408, an audio component 410, an input/output (I/O) interface 412, a sensor component 414, and a communication component 416.
The processing component 402 typically controls the overall operation of the device 400, such as operations associated with display, telephone calls, data communication, camera operation and recording. The processing component 402 may include one or more processors 420 to execute instructions to complete all or part of the steps of the above method. In addition, the processing component 402 may include one or more modules to facilitate interaction between the processing component 402 and other components; for example, the processing component 402 may include a multimedia module to facilitate interaction between the multimedia component 408 and the processing component 402.
The memory 404 is configured to store various types of data to support operation on the device 400. Examples of such data include instructions for any application or method operated on the device 400, contact data, phonebook data, messages, pictures, videos, etc. The memory 404 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
The power component 406 provides power to the various components of the device 400. The power component 406 may include a power management system, one or more power supplies, and other components associated with generating, managing and distributing power for the device 400.
The multimedia component 408 includes a screen providing an output interface between the device 400 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide. In some embodiments, the multimedia component 408 includes a front camera and/or a rear camera. When the device 400 is in an operating mode, such as a photographing mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front or rear camera may be a fixed optical lens system or have focusing and optical zoom capability.
The audio component 410 is configured to output and/or input audio signals. For example, the audio component 410 includes a microphone (MIC), which is configured to receive external audio signals when the device 400 is in an operating mode such as a call mode, a recording mode or a speech recognition mode. The received audio signal may be further stored in the memory 404 or sent via the communication component 416. In some embodiments, the audio component 410 also includes a speaker for outputting audio signals.
The I/O interface 412 provides an interface between the processing component 402 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, etc. These buttons may include, but are not limited to, a home button, volume buttons, a start button and a lock button.
The sensor component 414 includes one or more sensors for providing status assessments of various aspects of the device 400. For example, the sensor component 414 may detect the open/closed state of the device 400 and the relative positioning of components (for example, the display and keypad of the device 400), and may also detect a change in position of the device 400 or a component of the device 400, the presence or absence of user contact with the device 400, the orientation or acceleration/deceleration of the device 400, and a change in temperature of the device 400. The sensor component 414 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact, and may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 414 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
The communication component 416 is configured to facilitate wired or wireless communication between the device 400 and other equipment. The device 400 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 416 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 416 also includes a near field communication (NFC) module to facilitate short-range communication; for example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
In an exemplary embodiment, the device 400 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components, for performing the method provided by the embodiments shown in Fig. 1 or Fig. 2A above.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as the memory 404 including instructions, the instructions being executable by the processor 420 of the device 400 to complete the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
A non-transitory computer-readable storage medium: when the instructions in the storage medium are executed by the processor of a mobile terminal, the mobile terminal is enabled to perform a gradient parameter determination method, the method including:
receiving, by the specified fully connected layer in the convolutional neural network model to be trained, the first gradient passed by the next convolutional layer of the specified fully connected layer, the specified fully connected layer being located at a specified position between the multiple convolutional layers included in the convolutional neural network model, and the next convolutional layer of the specified fully connected layer being closer to the output layer of the convolutional neural network model;
determining the second gradient by the specified fully connected layer, the second gradient being determined based on the first class probability error, the first class probability error being the error between the first prediction class probability and the initial class probability, and the first prediction class probability being obtained after the multiple layers above the specified fully connected layer in the convolutional neural network model perform recognition processing on the training image;
summing the first gradient and the second gradient to obtain the third gradient;
determining the third gradient as the gradient parameter for training the convolutional neural network model.
Optionally, determining the second gradient by the specified fully connected layer includes:
performing recognition processing on the training image through the multiple layers above the specified fully connected layer in the convolutional neural network model to obtain the first prediction class probability;
determining the difference between the first prediction class probability and the initial class probability to obtain the first class probability error;
based on the first class probability error, determining the second gradient by the specified fully connected layer using the specified gradient descent method.
Optionally, before receiving, by the specified fully connected layer in the convolutional neural network model to be trained, the first gradient passed by the next convolutional layer of the specified fully connected layer, the method further includes:
performing recognition processing on the training image through all layers included in the convolutional neural network model to obtain the second prediction class probability;
determining the difference between the second prediction class probability and the initial class probability to obtain the second class probability error;
based on the second class probability error, determining the first gradient using the specified gradient descent method by the next convolutional layer of the specified fully connected layer in the convolutional neural network model.
Optionally, after determining the third gradient as the gradient parameter for training the convolutional neural network model, the method further includes:
determining the product of the gradient length of the third gradient and the specified coefficient to obtain the moving step length, and moving the model parameters of the specified fully connected layer by the moving step length in the gradient direction of the third gradient, the specified coefficient being any coefficient set in advance;
passing the third gradient to the previous convolutional layer of the specified fully connected layer, so that the gradient parameter is transmitted.
Optionally, when the model parameters included in the convolutional neural network model are the initial model parameters, the initial model parameters are any parameters set in advance.
Other embodiments of the disclosure will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses or adaptations of the disclosure that follow its general principles and include common knowledge or conventional technical means in the art not disclosed by the disclosure. The specification and embodiments are to be considered exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It should be understood that the disclosure is not limited to the precise structure described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the disclosure is limited only by the appended claims.

Claims (12)

1. A gradient parameter determination method, characterized in that the method comprises:
receiving, through a specified fully connected layer in a convolutional neural network model to be trained, a first gradient transmitted by the next convolutional layer of the specified fully connected layer, wherein the specified fully connected layer is located at a specified position among a plurality of convolutional layers included in the convolutional neural network model, and the next convolutional layer of the specified fully connected layer is close to the output layer of the convolutional neural network model;
determining a second gradient through the specified fully connected layer, wherein the second gradient is determined based on a first class probability error, the first class probability error is the error between a first prediction class probability and an initial class probability, and the first prediction class probability is obtained after the layers located above the specified fully connected layer in the convolutional neural network model perform recognition processing on a training image;
summing the first gradient and the second gradient to obtain a third gradient;
determining the third gradient as a gradient parameter for training the convolutional neural network model.
2. The method according to claim 1, characterized in that determining the second gradient through the specified fully connected layer comprises:
performing recognition processing on the training image through the layers located above the specified fully connected layer in the convolutional neural network model to obtain the first prediction class probability;
determining the difference between the first prediction class probability and the initial class probability to obtain the first class probability error;
determining, based on the first class probability error, the second gradient through the specified fully connected layer using a specified gradient descent method.
3. The method according to claim 1, characterized in that, before receiving, through the specified fully connected layer in the convolutional neural network model to be trained, the first gradient transmitted by the next convolutional layer of the specified fully connected layer, the method further comprises:
performing recognition processing on the training image through all the layers included in the convolutional neural network model to obtain a second prediction class probability;
determining the difference between the second prediction class probability and the initial class probability to obtain a second class probability error;
determining, based on the second class probability error, the first gradient through the next convolutional layer of the specified fully connected layer in the convolutional neural network model using a specified gradient descent method.
4. The method according to claim 1, characterized in that, after determining the third gradient as the gradient parameter for training the convolutional neural network model, the method further comprises:
determining the product of the gradient length of the third gradient and a specified coefficient to obtain a moving step, and moving the model parameters of the specified fully connected layer by the moving step in the gradient direction of the third gradient, the specified coefficient being any coefficient set in advance;
passing the third gradient to the previous convolutional layer of the specified fully connected layer, so that the gradient parameter is propagated.
5. The method according to claim 4, characterized in that, when the model parameters included in the convolutional neural network model are initial model parameters, the initial model parameters are any parameters set in advance.
6. A gradient parameter determination device, characterized in that the device comprises:
a receiving module, configured to receive, through a specified fully connected layer in a convolutional neural network model to be trained, a first gradient transmitted by the next convolutional layer of the specified fully connected layer, wherein the specified fully connected layer is located at a specified position among a plurality of convolutional layers included in the convolutional neural network model, and the next convolutional layer of the specified fully connected layer is close to the output layer of the convolutional neural network model;
a first determining module, configured to determine a second gradient through the specified fully connected layer, wherein the second gradient is determined based on a first class probability error, the first class probability error is the error between a first prediction class probability and an initial class probability, and the first prediction class probability is obtained after the layers located above the specified fully connected layer in the convolutional neural network model perform recognition processing on a training image;
a computing module, configured to sum the first gradient received by the receiving module and the second gradient determined by the first determining module to obtain a third gradient;
a second determining module, configured to determine the third gradient obtained by the computing module as a gradient parameter for training the convolutional neural network model.
7. The device according to claim 6, characterized in that the first determining module is configured to:
perform recognition processing on the training image through the layers located above the specified fully connected layer in the convolutional neural network model to obtain the first prediction class probability;
determine the difference between the first prediction class probability and the initial class probability to obtain the first class probability error;
determine, based on the first class probability error, the second gradient through the specified fully connected layer using a specified gradient descent method.
8. The device according to claim 6, characterized in that the device further comprises:
a recognition processing module, configured to perform recognition processing on the training image through all the layers included in the convolutional neural network model to obtain a second prediction class probability;
a third determining module, configured to determine the difference between the second prediction class probability and the initial class probability to obtain a second class probability error;
a fourth determining module, configured to determine, based on the second class probability error, the first gradient through the next convolutional layer of the specified fully connected layer in the convolutional neural network model using a specified gradient descent method.
9. The device according to claim 6, characterized in that the device further comprises:
a fifth determining module, configured to determine the product of the gradient length of the third gradient and a specified coefficient to obtain a moving step, and to move the model parameters of the specified fully connected layer by the moving step in the gradient direction of the third gradient, the specified coefficient being any coefficient set in advance;
a transfer module, configured to pass the third gradient to the previous convolutional layer of the specified fully connected layer, so that the gradient parameter is propagated.
10. The device according to claim 9, characterized in that, when the model parameters included in the convolutional neural network model are initial model parameters, the initial model parameters are any parameters set in advance.
11. A gradient parameter determination device, characterized in that the device comprises:
a processor; and
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method according to any one of claims 1-5.
12. A computer-readable storage medium storing instructions, characterized in that the instructions, when executed by a processor, implement the method according to any one of claims 1-5.
CN201710373287.6A 2017-05-24 2017-05-24 Gradient parameter determination method, gradient parameter determination device and computer-readable storage medium Active CN107229968B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710373287.6A CN107229968B (en) 2017-05-24 2017-05-24 Gradient parameter determination method, gradient parameter determination device and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710373287.6A CN107229968B (en) 2017-05-24 2017-05-24 Gradient parameter determination method, gradient parameter determination device and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN107229968A true CN107229968A (en) 2017-10-03
CN107229968B CN107229968B (en) 2021-06-29

Family

ID=59933968

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710373287.6A Active CN107229968B (en) 2017-05-24 2017-05-24 Gradient parameter determination method, gradient parameter determination device and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN107229968B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160174902A1 (en) * 2013-10-17 2016-06-23 Siemens Aktiengesellschaft Method and System for Anatomical Object Detection Using Marginal Space Deep Neural Networks
CN104794527A (en) * 2014-01-20 2015-07-22 Fujitsu Ltd. Method and equipment for constructing classification model based on convolutional neural network
CN104102919A (en) * 2014-07-14 2014-10-15 Tongji University Image classification method capable of effectively preventing convolutional neural network from overfitting
CN106156807A (en) * 2015-04-02 2016-11-23 Huazhong University of Science and Technology Training method and device of convolutional neural network model
CN105069413A (en) * 2015-07-27 2015-11-18 University of Electronic Science and Technology of China Human body gesture identification method based on deep convolutional neural network
WO2017058479A1 (en) * 2015-09-29 2017-04-06 Qualcomm Incorporated Selective backpropagation
CN105469041A (en) * 2015-11-19 2016-04-06 Shanghai Jiao Tong University Facial point detection system based on multi-task regularization and layer-by-layer supervision neural network
CN106250931A (en) * 2016-08-03 2016-12-21 Wuhan University High-resolution image scene classification method based on random convolutional neural network
CN106548201A (en) * 2016-10-31 2017-03-29 Beijing Xiaomi Mobile Software Co., Ltd. Training method of convolutional neural network, image recognition method and device
CN106650721A (en) * 2016-12-28 2017-05-10 Wu Xiaojun Industrial character recognition method based on convolutional neural network

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
ANISH SHAH et al.: "Deep Residual Networks with Exponential Linear Unit", arXiv *
RAZVAN PASCANU et al.: "On the difficulty of training Recurrent Neural Networks", arXiv *
ZHANG Yuping et al.: "Supervised learning of multilayer spiking neural networks based on convolution computation", Computer Engineering & Science *
AO Daogan: "Unsupervised feature learning combined with neural networks for image recognition", China Master's Theses Full-text Database, Information Science and Technology *
FEI Jianchao et al.: "Gradient-based multi-input convolutional neural network", Opto-Electronic Engineering *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107590534A (en) * 2017-10-17 2018-01-16 Beijing Xiaomi Mobile Software Co., Ltd. Method, apparatus and storage medium for training deep convolutional neural network model
CN107590534B (en) * 2017-10-17 2021-02-09 Beijing Xiaomi Mobile Software Co., Ltd. Method and device for training deep convolutional neural network model and storage medium
CN110033019A (en) * 2019-03-06 2019-07-19 Tencent Technology (Shenzhen) Co., Ltd. Human body abnormality detection method, device and storage medium
CN110033019B (en) * 2019-03-06 2021-07-27 Tencent Technology (Shenzhen) Co., Ltd. Method and device for detecting abnormality of human body part and storage medium
CN111506104A (en) * 2020-04-03 2020-08-07 Beijing University of Posts and Telecommunications Method and device for planning position of unmanned aerial vehicle
CN111506104B (en) * 2020-04-03 2021-10-01 Beijing University of Posts and Telecommunications Method and device for planning position of unmanned aerial vehicle

Also Published As

Publication number Publication date
CN107229968B (en) 2021-06-29

Similar Documents

Publication Publication Date Title
CN108171254A Image tag determination method, apparatus and terminal
CN107329742A SDK calling method and device
CN107527059A Character recognition method, device and terminal
CN107194464A Training method and device of convolutional neural network model
CN106528709A Social information recommendation method and apparatus
CN107220667A Image classification method, device and computer-readable recording medium
CN106603667A Screen information sharing method and device
CN107145904A Method, device and storage medium for determining image category
CN107679483A License plate recognition method and device
CN107943266A Power consumption control method, device and equipment
CN106778531A Face detection method and device
CN105049219B Traffic ordering method and system, mobile terminal and server
CN104407924B Memory optimization method and device
CN106775224A Remark information setting method and device
CN107229968A Gradient parameter determination method, device and computer-readable recording medium
CN107341509A Training method and device of convolutional neural network
CN107527024A Face attractiveness evaluation method and device
CN107748867A Target object detection method and device
CN106203306A Age prediction method, device and terminal
CN107563994A Image saliency detection method and device
CN106203275A Fingerprint unlocking method, device and electronic equipment
CN107590534A Method, apparatus and storage medium for training deep convolutional neural network model
CN107491681A Fingerprint information processing method and device
CN104063424B Web page picture display method and device
CN104035764B Object control method and related apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant