CN107229968A - Gradient parameter determination method, device and computer-readable storage medium - Google Patents
Gradient parameter determination method, device and computer-readable storage medium
- Publication number
- CN107229968A CN107229968A CN201710373287.6A CN201710373287A CN107229968A CN 107229968 A CN107229968 A CN 107229968A CN 201710373287 A CN201710373287 A CN 201710373287A CN 107229968 A CN107229968 A CN 107229968A
- Authority
- CN
- China
- Prior art keywords
- gradient
- fully connected layer
- neural network
- convolutional neural network
- parameter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
The present disclosure relates to a gradient parameter determination method, a device, and a computer-readable storage medium, and belongs to the technical field of image processing. The method includes: receiving, through a specified fully connected layer in a convolutional neural network model to be trained, a first gradient transmitted by the next convolutional layer after the specified fully connected layer, where the specified fully connected layer is located at a specified position between the multiple convolutional layers included in the convolutional neural network model; determining a second gradient through the specified fully connected layer; summing the first gradient and the second gradient to obtain a third gradient; and determining the third gradient as the gradient parameter used for training the convolutional neural network model. Because the first gradient and the second gradient are summed, the gradient parameter is strengthened; as a result, the determined gradient parameter can be propagated deeper into the network, which speeds up the convergence of the algorithm.
Description
Technical field
The present disclosure relates to the technical field of image processing, and in particular to a gradient parameter determination method, a device, and a computer-readable storage medium.
Background
With the rapid development of image processing technology, convolutional neural network models have been widely used in image recognition. For example, if an image to be identified is input into a convolutional neural network model that has completed training, the model can identify the category of the image. For instance, when the image of a "cat" is input into a trained convolutional neural network model, the model can identify the category of the image as "cat".
To successfully perform image recognition, a convolutional neural network model usually needs to be trained in advance on training images. A convolutional neural network model is generally composed of multiple convolutional layers, multiple activation layers, multiple pooling layers, and multiple fully connected layers connected in series. Its training process includes: inputting a training image at the input layer of the model; after the model to be trained identifies the training image, outputting a predicted class probability from the output layer; then, based on the class-probability error between the predicted class probability and the initial class probability, determining the gradient parameter of each layer; and adjusting the initial model parameters of each layer based on that layer's gradient parameter. In practice, to increase the accuracy of image recognition, the convolutional neural network model usually needs deep training, and the commonly used approach is to increase the number of convolutional layers in the model.
Summary
To overcome the problems in the related art, the present disclosure provides a gradient parameter determination method, a device, and a computer-readable storage medium.
According to a first aspect, a gradient parameter determination method is provided, the method including:
receiving, through a specified fully connected layer in a convolutional neural network model to be trained, a first gradient transmitted by the next convolutional layer after the specified fully connected layer, where the specified fully connected layer is located at a specified position between the multiple convolutional layers included in the convolutional neural network model, and the next convolutional layer after the specified fully connected layer is close to the output layer of the convolutional neural network model;
determining a second gradient through the specified fully connected layer, where the second gradient is determined based on a first category probability error, the first category probability error is the error between a first predicted class probability and an initial class probability, and the first predicted class probability is obtained after a training image is identified by the layers located above the specified fully connected layer in the convolutional neural network model;
summing the first gradient and the second gradient to obtain a third gradient;
determining the third gradient as the gradient parameter used for training the convolutional neural network model.
Optionally, determining the second gradient through the specified fully connected layer includes:
identifying the training image by the layers located above the specified fully connected layer in the convolutional neural network model, to obtain the first predicted class probability;
determining the difference between the first predicted class probability and the initial class probability, to obtain the first category probability error;
based on the first category probability error, determining the second gradient through the specified fully connected layer using a specified gradient descent method.
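The patent does not spell out how the specified fully connected layer turns the class-probability error into the second gradient. A common concrete choice, shown here purely as an illustrative sketch, is a softmax classifier trained with cross-entropy loss, whose gradient with respect to the layer's logits reduces to `predicted - true`; all names and values below are assumptions, not taken from the patent.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    e = np.exp(logits - logits.max())
    return e / e.sum()

def second_gradient(logits, true_probs):
    """Gradient of softmax cross-entropy w.r.t. the logits of the
    specified fully connected layer (the 'second gradient').
    For softmax + cross-entropy this is predicted_probs - true_probs."""
    pred = softmax(logits)       # first predicted class probability
    return pred - true_probs     # driven by the first category probability error

logits = np.array([2.0, 1.0, 0.1])      # output of the specified FC layer (illustrative)
true_probs = np.array([1.0, 0.0, 0.0])  # initial (true) class probability
g2 = second_gradient(logits, true_probs)
```

Since both probability vectors sum to one, the components of `g2` sum to zero, and the gradient pushes the correct class's logit up and the others down.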
Optionally, before receiving, through the specified fully connected layer in the convolutional neural network model to be trained, the first gradient transmitted by the next convolutional layer after the specified fully connected layer, the method further includes:
identifying the training image by all the layers included in the convolutional neural network model, to obtain a second predicted class probability;
determining the difference between the second predicted class probability and the initial class probability, to obtain a second category probability error;
based on the second category probability error, determining the first gradient through the next convolutional layer after the specified fully connected layer in the convolutional neural network model, using the specified gradient descent method.
Optionally, after determining the third gradient as the gradient parameter used for training the convolutional neural network model, the method further includes:
determining the product of the gradient length of the third gradient and a specified coefficient to obtain a moving step length, and moving the model parameters of the specified fully connected layer by the moving step length in the gradient direction of the third gradient, where the specified coefficient is any preset coefficient;
passing the third gradient to the previous convolutional layer before the specified fully connected layer, so that the gradient parameter continues to be propagated.
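The update above (step length = gradient length × specified coefficient, moved along the gradient direction of the third gradient) can be sketched as follows. The coefficient value is illustrative, and the sign convention is an assumption: the patent literally says the parameters are moved toward the gradient direction, but a training step normally descends, so the descent sign is applied here.

```python
import numpy as np

def update_parameters(params, third_gradient, coefficient=0.1):
    """Move the specified FC layer's parameters by
    step = ||third_gradient|| * coefficient along the gradient direction
    (descent sign applied; see the note in the lead-in)."""
    length = np.linalg.norm(third_gradient)      # gradient length
    step = length * coefficient                  # moving step length
    direction = third_gradient / (length + 1e-12)
    return params - step * direction             # descend along the gradient direction

params = np.array([0.5, -0.3])
g3 = np.array([3.0, 4.0])                        # ||g3|| = 5, so step = 0.5
new_params = update_parameters(params, g3, coefficient=0.1)
```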
Optionally, when the model parameters included in the convolutional neural network model are initial model parameters, the initial model parameters are any preset parameters.
According to a second aspect, a gradient parameter determination device is provided, the device including:
a receiving module, configured to receive, through a specified fully connected layer in a convolutional neural network model to be trained, a first gradient transmitted by the next convolutional layer after the specified fully connected layer, where the specified fully connected layer is located at a specified position between the multiple convolutional layers included in the convolutional neural network model, and the next convolutional layer after the specified fully connected layer is close to the output layer of the convolutional neural network model;
a first determining module, configured to determine a second gradient through the specified fully connected layer, where the second gradient is determined based on a first category probability error, the first category probability error is the error between a first predicted class probability and an initial class probability, and the first predicted class probability is obtained after a training image is identified by the layers located above the specified fully connected layer in the convolutional neural network model;
a computing module, configured to sum the first gradient received by the receiving module and the second gradient determined by the first determining module, to obtain a third gradient;
a second determining module, configured to determine the third gradient obtained by the computing module as the gradient parameter used for training the convolutional neural network model.
Optionally, the first determining module is configured to:
identify the training image by the layers located above the specified fully connected layer in the convolutional neural network model, to obtain the first predicted class probability;
determine the difference between the first predicted class probability and the initial class probability, to obtain the first category probability error;
based on the first category probability error, determine the second gradient through the specified fully connected layer using a specified gradient descent method.
Optionally, the device further includes:
a recognition processing module, configured to identify the training image by all the layers included in the convolutional neural network model, to obtain a second predicted class probability;
a third determining module, configured to determine the difference between the second predicted class probability and the initial class probability, to obtain a second category probability error;
a fourth determining module, configured to, based on the second category probability error, determine the first gradient through the next convolutional layer after the specified fully connected layer in the convolutional neural network model, using the specified gradient descent method.
Optionally, the device further includes:
a fifth determining module, configured to determine the product of the gradient length of the third gradient and a specified coefficient to obtain a moving step length, and to move the model parameters of the specified fully connected layer by the moving step length in the gradient direction of the third gradient, where the specified coefficient is any preset coefficient;
a transfer module, configured to pass the third gradient to the previous convolutional layer before the specified fully connected layer, so that the gradient parameter continues to be propagated.
Optionally, when the model parameters included in the convolutional neural network model are initial model parameters, the initial model parameters are any preset parameters.
According to a third aspect, a gradient parameter determination device is provided, the device including:
a processor;
a memory for storing processor-executable instructions;
where the processor is configured to:
receive, through a specified fully connected layer in a convolutional neural network model to be trained, a first gradient transmitted by the next convolutional layer after the specified fully connected layer, where the specified fully connected layer is located at a specified position between the multiple convolutional layers included in the convolutional neural network model, and the next convolutional layer after the specified fully connected layer is close to the output layer of the convolutional neural network model;
determine a second gradient through the specified fully connected layer, where the second gradient is determined based on a first category probability error, the first category probability error is the error between a first predicted class probability and an initial class probability, and the first predicted class probability is obtained after a training image is identified by the layers located above the specified fully connected layer in the convolutional neural network model;
sum the first gradient and the second gradient to obtain a third gradient;
determine the third gradient as the gradient parameter used for training the convolutional neural network model.
According to a fourth aspect, a computer-readable storage medium is provided, the computer-readable storage medium storing instructions that, when executed by a processor, implement the following steps:
receiving, through a specified fully connected layer in a convolutional neural network model to be trained, a first gradient transmitted by the next convolutional layer after the specified fully connected layer, where the specified fully connected layer is located at a specified position between the multiple convolutional layers included in the convolutional neural network model, and the next convolutional layer after the specified fully connected layer is close to the output layer of the convolutional neural network model;
determining a second gradient through the specified fully connected layer, where the second gradient is determined based on a first category probability error, the first category probability error is the error between a first predicted class probability and an initial class probability, and the first predicted class probability is obtained after a training image is identified by the layers located above the specified fully connected layer in the convolutional neural network model;
summing the first gradient and the second gradient to obtain a third gradient;
determining the third gradient as the gradient parameter used for training the convolutional neural network model.
The technical solutions provided by the embodiments of the present disclosure can have the following beneficial effects:
The embodiments of the present disclosure add a specified fully connected layer between the multiple convolutional layers included in a convolutional neural network model. The layers above the specified fully connected layer identify a training image and obtain a first predicted class probability; the specified fully connected layer determines the error between the first predicted class probability and the initial class probability, and determines a second gradient based on that error. When the specified fully connected layer receives the first gradient transmitted by the next convolutional layer, which is close to the output layer of the convolutional neural network model, it sums the first gradient and the second gradient to obtain a third gradient, and determines the third gradient as the gradient parameter used for training the convolutional neural network model. Because the first gradient and the second gradient are summed, the gradient parameter is strengthened; as a result, the determined gradient parameter can be propagated deeper into the network, which speeds up the convergence of the algorithm.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flow chart of a gradient parameter determination method according to an exemplary embodiment.
Fig. 2A is a flow chart of a gradient parameter determination method according to another exemplary embodiment.
Fig. 2B is a schematic diagram of the connections between the layers of a convolutional neural network model involved in the embodiment of Fig. 2A.
Fig. 3A is a structural block diagram of a gradient parameter determination device according to an exemplary embodiment.
Fig. 3B is a structural block diagram of another gradient parameter determination device according to an exemplary embodiment.
Fig. 3C is a structural block diagram of another gradient parameter determination device according to an exemplary embodiment.
Fig. 4 is a block diagram of a gradient parameter determination device 400 according to an exemplary embodiment.
Detailed description
Exemplary embodiments will be described in detail here, with examples illustrated in the accompanying drawings. In the following description, when drawings are referred to, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; on the contrary, they are merely examples of devices and methods consistent with some aspects of the disclosure as detailed in the appended claims.
Before explaining the embodiments of the present disclosure in detail, the terms involved are briefly introduced:
Convolutional neural network model: a feedforward neural network, typically composed of multiple convolutional layers and multiple fully connected layers. In addition, a convolutional neural network model also includes multiple activation layers and multiple pooling layers. In a specific implementation, the convolutional neural network model can be trained using a back-propagation algorithm.
Predicted class probability: the probability that a training image belongs to a preset category. The preset categories can be customized by a technician according to actual requirements; for example, they may include "cat", "dog", "bear", "lion", "tiger", and so on. The predicted class probability is obtained after the convolutional neural network model to be trained identifies the training image.
Initial class probability: usually customized by a technician according to actual requirements; the initial class probability is also generally referred to as the true class probability of the training image.
Model parameters: the parameters of the convolutional neural network model, generally including the convolution kernels of the convolutional layers, the weight matrices of the fully connected layers, and so on; they are mainly used to identify the training image.
Next, the application scenario of the embodiments of the present disclosure is introduced. At present, to improve the accuracy with which a convolutional neural network model recognizes images, the number of convolutional layers in the model is usually increased so that the model can be trained deeply. However, as the number of convolutional layers increases, the gradient parameter becomes smaller and smaller during propagation, which slows down the update of the model parameters of the lower layers of the network and may even prevent convergence. Therefore, the embodiments of the present disclosure provide a gradient parameter determination method that adds a specified fully connected layer at a specified position between the multiple convolutional layers and strengthens the gradient parameter through this layer, so that the gradient parameter can be propagated farther, the convergence of the algorithm is accelerated, and the problem of the algorithm failing to converge due to deep training is avoided. The method provided by the embodiments of the present disclosure can be executed by a terminal; the terminal can be a device such as a tablet computer or a desktop computer, which is not limited by the embodiments of the present disclosure.
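The vanishing-gradient problem described above can be illustrated numerically: if each layer scales the backpropagated gradient by a factor below one, the gradient magnitude decays geometrically with depth. The per-layer factor of 0.5 below is an arbitrary illustrative value, not taken from the patent.

```python
# Illustrative only: the gradient magnitude reaching a layer scales roughly
# like the product of per-layer factors accumulated during backpropagation.
# With factors below 1, the gradient vanishes as depth grows.
per_layer_factor = 0.5
depths = [5, 20, 50]
magnitudes = [per_layer_factor ** d for d in depths]
# At depth 50 the gradient is ~1e-15 of its original size: the lower
# layers' parameters barely move, which is what the auxiliary FC layer
# in this disclosure is meant to counteract.
```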
Fig. 1 is a flow chart of a gradient parameter determination method according to an exemplary embodiment. As shown in Fig. 1, the method is used in a terminal and includes the following steps:
In step 101, a first gradient transmitted by the next convolutional layer after a specified fully connected layer is received through the specified fully connected layer in a convolutional neural network model to be trained, where the specified fully connected layer is located at a specified position between the multiple convolutional layers included in the convolutional neural network model, and the next convolutional layer after the specified fully connected layer is close to the output layer of the convolutional neural network model.
In step 102, a second gradient is determined through the specified fully connected layer, where the second gradient is determined based on a first category probability error, the first category probability error is the error between a first predicted class probability and an initial class probability, and the first predicted class probability is obtained after a training image is identified by the layers located above the specified fully connected layer in the convolutional neural network model.
In step 103, the first gradient and the second gradient are summed to obtain a third gradient.
In step 104, the third gradient is determined as the gradient parameter used for training the convolutional neural network model.
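At their core, steps 101-104 reduce to an element-wise sum of the two gradients. A minimal numpy sketch, with illustrative gradient values (the small first gradient standing in for a gradient attenuated by backpropagation from the output layer):

```python
import numpy as np

def determine_gradient_parameter(first_gradient, second_gradient):
    """Steps 101-104: the specified FC layer receives the first gradient
    from its next convolutional layer, determines its own second gradient
    from the auxiliary class-probability error, sums the two (step 103),
    and uses the result as the gradient parameter (step 104)."""
    third_gradient = first_gradient + second_gradient
    return third_gradient

g1 = np.array([0.01, -0.02])   # first gradient, attenuated during backpropagation
g2 = np.array([0.30, 0.10])    # second gradient, from the specified FC layer's own loss
g3 = determine_gradient_parameter(g1, g2)
```

The summed gradient is larger in magnitude than the attenuated first gradient alone, which is the "strengthening" effect the disclosure relies on.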
In the embodiments of the present disclosure, a specified fully connected layer is added between the multiple convolutional layers included in the convolutional neural network model. The layers above the specified fully connected layer identify the training image and obtain the first predicted class probability; the specified fully connected layer determines the error between the first predicted class probability and the initial class probability, and determines the second gradient based on that error. When the specified fully connected layer receives the first gradient transmitted by the next convolutional layer close to the output layer of the convolutional neural network model, it sums the first gradient and the second gradient to obtain the third gradient, and determines the third gradient as the gradient parameter used for training the convolutional neural network model. Because the first gradient and the second gradient are summed, the gradient parameter is strengthened; as a result, the determined gradient parameter can be propagated deeper, which speeds up the convergence of the algorithm.
Optionally, determining the second gradient through the specified fully connected layer includes:
identifying the training image by the layers located above the specified fully connected layer in the convolutional neural network model, to obtain the first predicted class probability;
determining the difference between the first predicted class probability and the initial class probability, to obtain the first category probability error;
based on the first category probability error, determining the second gradient through the specified fully connected layer using a specified gradient descent method.
Optionally, before receiving, through the specified fully connected layer in the convolutional neural network model to be trained, the first gradient transmitted by the next convolutional layer after the specified fully connected layer, the method further includes:
identifying the training image by all the layers included in the convolutional neural network model, to obtain a second predicted class probability;
determining the difference between the second predicted class probability and the initial class probability, to obtain a second category probability error;
based on the second category probability error, determining the first gradient through the next convolutional layer after the specified fully connected layer in the convolutional neural network model, using the specified gradient descent method.
Optionally, after the third gradient is determined as the gradient parameter used for training the convolutional neural network model, the method further includes:
determining the product of the gradient length of the third gradient and a specified coefficient to obtain a moving step length, and moving the model parameters of the specified fully connected layer by the moving step length in the gradient direction of the third gradient, where the specified coefficient is any preset coefficient;
passing the third gradient to the previous convolutional layer before the specified fully connected layer, so that the gradient parameter continues to be propagated.
Optionally, when the model parameters included in the convolutional neural network model are initial model parameters, the initial model parameters are any preset parameters.
All of the above optional technical solutions can be combined in any manner to form optional embodiments of the present disclosure, which are not described one by one here.
Fig. 2A is a flow chart of a gradient parameter determination method according to another exemplary embodiment. As shown in Fig. 2A, the method is applied in a terminal and can include the following implementation steps:
In step 201, the training image is identified by all the layers included in the convolutional neural network model, to obtain a second predicted class probability.
The training image can be stored in the terminal in advance by a technician. When the convolutional neural network model needs to be trained, the terminal obtains the training image locally. As mentioned above, a convolutional neural network model mainly includes multiple convolutional layers and multiple fully connected layers. Referring to Fig. 2B, during training the terminal inputs the obtained training image at the input layer, and all the layers included in the convolutional neural network model identify the training image, obtaining the second predicted class probability; in a specific implementation, the second predicted class probability is output by the output layer of the convolutional neural network model. Here, "all the layers" includes all the convolutional layers and all the fully connected layers.
Of course, all the layers also include the activation layers and the pooling layers; however, because what the activation layers and pooling layers contain is constant and does not include model parameters, they are not specially emphasized or introduced here.
The implementation process by which all the layers of the convolutional neural network model identify the training image can be found in the related art, and the embodiments of the present disclosure do not limit it.
In addition, it should also be noted that the input layer can in fact be regarded as the first convolutional layer of the convolutional neural network model, and the output layer can be regarded as the last fully connected layer; they are referred to as the input layer and the output layer simply to make the upper/lower layer relationship easier to distinguish.
In step 202., the difference between the second prediction class probability and initial category probability is determined, Equations of The Second Kind is obtained
Other probable error.
In practical implementations, terminal determine this second prediction class probability after, can by this second prediction class probability with
Initial category probability is compared, to judge whether to need to continue to be trained convolutional neural networks model.For example, terminal is worked as
When determining that the difference between the second prediction class probability and the initial category probability is more than or equal to some predetermined threshold value, explanation
The ability of the convolutional neural networks Model Identification does not meet actual demand also, in that case, and terminal should based on resulting
Second category probable error, adjustment is proceeded to the model parameter of convolutional neural networks model by iterative method, specific as follows
It is literary described.
, whereas if the difference between the second prediction class probability and the initial category probability is less than the predetermined threshold value,
Then illustrate that the ability of the convolutional neural networks Model Identification also is compliant with actual demand, that is, illustrate that the convolutional neural networks model can
The classification of image is recognized accurately, in that case, it may be determined that complete the training to the convolutional neural networks model.
Wherein, the predetermined threshold value can be set by technical staff is self-defined according to the actual requirements, can also be given tacit consent to by terminal
Set, the embodiment of the present disclosure is not limited this.
It should be noted that determining whether training is complete according to the difference between the second predicted category probability and the initial category probability is merely exemplary. In another embodiment, whether training is complete may also be judged according to the number of iterations: for example, when the number of iterations reaches a preset count, it is determined that training of the convolutional neural network model is complete; otherwise, training of the convolutional neural network model continues.
The preset count may likewise be customized by a technician according to actual requirements, or may be set by default by the terminal; the embodiments of the present disclosure do not limit this.
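The two stopping criteria above can be sketched as a single check. The threshold and iteration values here are illustrative assumptions, not values taken from the disclosure:

```python
def should_stop(second_pred_prob, initial_prob, iteration,
                prob_threshold=0.05, max_iterations=10000):
    """Return True when training of the model may be considered complete."""
    # Criterion 1: the difference between the second predicted category
    # probability and the initial category probability is below the threshold.
    if abs(second_pred_prob - initial_prob) < prob_threshold:
        return True
    # Criterion 2: the preset iteration count has been reached.
    if iteration >= max_iterations:
        return True
    return False
```

Either criterion alone suffices; in practice both are usually combined, as in this sketch.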
In step 203, based on the second category probability error, the first gradient is determined, using the specified gradient descent method, by the next convolutional layer of the specified fully connected layer in the convolutional neural network model.
If it is determined according to the above method that training of the convolutional neural network model is not yet complete, the terminal needs to adjust the model parameters of the convolutional neural network model through the specified gradient descent method, based on the second category probability error.
That is, after the second category probability error is determined by the convolutional neural network model, the terminal back-propagates the second category probability error to the output layer of the model, and the output layer determines its gradient parameter using the specified gradient descent method. The output layer then adjusts its own model parameters using that gradient parameter, and passes the determined gradient parameter to the preceding convolutional layer.
It should be noted that, in a specific implementation, the specified gradient descent method may be SGD (Stochastic Gradient Descent). Of course, the specified gradient descent method may also be another gradient descent method; the embodiments of the present disclosure do not limit this.
After receiving the gradient parameter, the preceding convolutional layer again determines a gradient parameter based on the received one, using the specified gradient descent method, and adjusts its own model parameters based on the newly determined gradient parameter. It then passes the gradient parameter it determined on to the convolutional layer preceding it.
The above process continues until a gradient parameter reaches the next convolutional layer of the specified fully connected layer. That convolutional layer determines the first gradient, based on the gradient parameter passed to it, using the specified gradient descent method, and passes the first gradient to the specified fully connected layer.
For example, referring to Fig. 2B, after the next convolutional layer of the specified fully connected layer, namely the 20th convolutional layer, receives the gradient parameter passed by the 21st convolutional layer, it determines the first gradient based on that gradient parameter using the specified gradient descent method. The 20th convolutional layer then adjusts its own model parameters based on the first gradient, and passes the first gradient to the specified fully connected layer.
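The layer-by-layer back-propagation described above can be sketched in miniature. Each layer here is reduced to one weight and one local derivative, which is a deliberate simplification of a real convolutional layer; the learning rate of 0.1 is an illustrative assumption:

```python
def backward_pass(layers, output_error):
    """Propagate a gradient parameter from the output layer toward the input.

    `layers` is ordered from the output layer down to the layer just above
    the specified fully connected layer; the value returned is the "first
    gradient" that would be handed to that fully connected layer.
    """
    grad = output_error
    for layer in layers:
        # Each layer determines its gradient from the one passed down ...
        grad = grad * layer["local_derivative"]
        # ... adjusts its own model parameter with it (plain SGD step) ...
        layer["weight"] -= 0.1 * grad
        # ... and passes the gradient on to the preceding layer.
    return grad
```

Note how the returned gradient shrinks geometrically with depth when the local derivatives are below one, which is exactly the situation the specified fully connected layer is introduced to remedy.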
It should be noted that, in a practical implementation, a technician may insert the specified fully connected layer at a specified location between the multiple convolutional layers according to actual requirements, for example between the 20th convolutional layer and the 19th convolutional layer. Generally, the specified location is a position at which the gradient parameter has fallen below some threshold; that is, in order that a small gradient parameter can continue to be passed toward the lower layers, the specified fully connected layer is inserted at that position to perform enhancement processing on the small gradient parameter. In practical applications, such a specified fully connected layer is also commonly called a branch supervisor.
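The disclosure does not give a concrete procedure for choosing the specified locations; a minimal illustration of the "gradient parameter below some threshold" criterion, with an assumed threshold value, might look like:

```python
def insertion_points(gradient_norms, threshold=1e-3):
    """Given the gradient norm observed at each convolutional layer (indexed
    from the input), return the layer indices after which a branch supervisor
    (specified fully connected layer) could be inserted, i.e. the positions
    where the back-propagated gradient has decayed below the threshold."""
    return [i for i, g in enumerate(gradient_norms) if g < threshold]
```

In the example of Fig. 2B this check would place the supervisor where the gradient first becomes too small, e.g. between the 20th and 19th convolutional layers.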
It should also be noted that the embodiments of the present disclosure do not limit the number of specified locations; that is, in a practical implementation, the specified fully connected layer may be inserted at multiple specified locations between the convolutional layers, for example both between the 100th convolutional layer and the 99th convolutional layer and between the 20th convolutional layer and the 19th convolutional layer.
In addition, the embodiments of the present disclosure also do not limit the number of specified fully connected layers inserted at a given specified location; for example, three specified fully connected layers may be connected in series at the specified location.
Referring to Fig. 2B, which schematically illustrates the connection relations between the layers included in the convolutional neural network model: the model includes multiple convolutional layers, and it is assumed that the specified fully connected layer 21a is inserted between the 20th convolutional layer and the 19th convolutional layer, and that the number of such specified fully connected layers 21a is three.
In step 204, the specified fully connected layer in the convolutional neural network model to be trained receives the first gradient passed by the next convolutional layer of the specified fully connected layer.
As noted above, the specified fully connected layer is located at a specified location between the multiple convolutional layers included in the convolutional neural network model, and the next convolutional layer of the specified fully connected layer is the one closer to the output layer of the convolutional neural network model.
In step 205, the second gradient is determined by the specified fully connected layer. The second gradient is determined based on the first category probability error, which is the error between the first predicted category probability and the initial category probability; the first predicted category probability is obtained after the training image is recognized by the layers located before the specified fully connected layer (on the input side) in the convolutional neural network model.
Because the first gradient has been passed through many layers, it is usually very small; if it were passed further down unchanged, the algorithm might fail to converge. The first gradient therefore needs to be adjusted, so that the gradient parameter to be passed downward can be re-determined.
Accordingly, when the specified fully connected layer receives the first gradient, the second gradient is determined. In a specific implementation, the terminal recognizes the training image through the layers located before the specified fully connected layer in the convolutional neural network model to obtain the first predicted category probability, determines the difference between the first predicted category probability and the initial category probability to obtain the first category probability error, and, based on the first category probability error, determines the second gradient through the specified fully connected layer using the specified gradient descent method.
For example, referring to Fig. 2B, the terminal recognizes the training image through the layers between the input layer and the specified fully connected layer 21a to obtain the first predicted category probability. The specified fully connected layer 21a computes the first category probability error between the first predicted category probability and the initial category probability and back-propagates that error, and the specified fully connected layer 21a determines the second gradient based on the first category probability error using the specified gradient descent method.
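The disclosure does not fix the classifier head or loss used by the branch supervisor; the following sketch assumes, purely for illustration, a sigmoid output with a squared-error loss to show how the second gradient falls out of the first category probability error:

```python
import math

def second_gradient(aux_logit, target_prob):
    """Second gradient produced by the specified fully connected layer.

    `aux_logit` is the raw output of the branch supervisor for the true
    class; `target_prob` is the initial (label) category probability.
    """
    pred = 1.0 / (1.0 + math.exp(-aux_logit))   # first predicted category probability
    error = pred - target_prob                  # first category probability error
    # Chain rule through the sigmoid: d(error^2/2)/d(logit).
    return error * pred * (1.0 - pred)
```

Because this gradient is computed only a few layers away from its own loss, it has not decayed, which is what makes it useful for enhancing the shrunken first gradient.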
In fact, as described above, the specified fully connected layer is equivalent to an output layer of the convolutional neural network model. That is, in the embodiments of the present disclosure, two category probability errors need to be determined: the first category probability error and the second category probability error. The first category probability error is determined by the specified fully connected layer arranged between the multiple convolutional layers, and the second category probability error is determined by the output layer of the convolutional neural network model.
In step 206, a summation operation is performed on the first gradient and the second gradient to obtain the third gradient.
Performing the summation operation on the first gradient and the second gradient enhances the first gradient; that is, the back-propagated gradient parameter is reinforced by the specified fully connected layer, which solves the problem of gradient parameters shrinking as the training depth grows.
It should be noted that, for the implementation of performing the summation operation on the first gradient and the second gradient, reference may be made to gradient algorithms; the embodiments of the present disclosure do not limit this.
In step 207, the third gradient is determined as the gradient parameter for training the convolutional neural network model.
That is, during subsequent gradient-parameter passing, the terminal determines the third gradient as the gradient parameter that needs to be passed. In this way, the gradient parameter can propagate well to the lower layers of the network, thereby accelerating the convergence of the convolutional neural network model.
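The summation operation of step 206 can be sketched directly; treating the gradients as vectors of matching shape (an assumption, since the disclosure leaves the representation open), the third gradient is the element-wise sum:

```python
def third_gradient(first_grad, second_grad):
    """Element-wise sum of the first and second gradients; the result is the
    gradient parameter that continues to propagate toward the input layer."""
    return [a + b for a, b in zip(first_grad, second_grad)]
```

Since the second gradient is freshly computed at the branch supervisor, the sum is dominated by it whenever the first gradient has decayed, which is precisely the enhancement effect described above.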
Thus far, the gradient parameter determination method provided by the embodiments of the present disclosure has been realized. Further, for ease of a deeper understanding, the embodiments of the present disclosure also provide the following steps 208 and 209.
In step 208, the product of the gradient length of the third gradient and a specified coefficient is determined to obtain a moving step length, and the model parameters of the specified fully connected layer are moved by the moving step length in the gradient direction of the third gradient, the specified coefficient being any coefficient set in advance.
As noted above, it is the third gradient that is passed during subsequent gradient-parameter passing. In a practical implementation, the model parameters of the specified fully connected layer may be adjusted based on the third gradient; that is, after the third gradient is determined, its gradient length may be multiplied by the specified coefficient to obtain the moving step length. The model parameters of the specified fully connected layer are then moved by the moving step length in the gradient direction of the third gradient, thereby adjusting the model parameters of the specified fully connected layer.
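The update of step 208 can be sketched as follows. The text says the parameters are moved "in the gradient direction"; for the loss to decrease the move must be along the descent direction, so the sign convention below is an assumption, as is the coefficient value. Note that moving by (coefficient × gradient length) along the unit gradient direction reduces, component-wise, to the familiar `p - coeff * g`:

```python
import math

def sgd_step(params, third_grad, coeff=0.01):
    """Adjust the fully connected layer's parameters with the third gradient."""
    length = math.sqrt(sum(g * g for g in third_grad))  # gradient length
    if length == 0.0:
        return list(params)
    step = coeff * length                               # moving step length
    unit = [g / length for g in third_grad]             # gradient direction
    return [p - step * u for p, u in zip(params, unit)]  # == p - coeff * g
```

Here the specified coefficient plays the role of a learning rate, which is consistent with it being "any coefficient set in advance".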
It should be noted that, when the model parameters included in the convolutional neural network model are the initial model parameters, the initial model parameters may be any parameters set in advance.
That is, in the embodiments of the present disclosure, since the specified fully connected layer may be inserted at specified locations between the convolutional layers included in the convolutional neural network model, the number of convolutional layers that the model includes need not be limited; in other words, the convolutional neural network model can be trained at depth. Therefore, the initial model parameters of the convolutional neural network model need not be restricted here, and the initial model parameters may be any parameters.
Similarly, the embodiments of the present disclosure do not restrict the above specified coefficient either; that is, the specified coefficient may be any coefficient set in advance.
In step 209, the third gradient is passed to the preceding convolutional layer of the specified fully connected layer, so that the gradient parameter continues to be passed.
After the specified fully connected layer adjusts the first gradient to obtain the third gradient, the third gradient may continue to be passed to the lower layers of the network. For example, continuing with Fig. 2B, the specified fully connected layer 21a may pass the third gradient to its preceding convolutional layer, i.e., the 19th convolutional layer.
Further, after the 19th convolutional layer receives the third gradient, it determines a fourth gradient based on the third gradient using the specified gradient descent method. It then adjusts its own model parameters based on the fourth gradient and passes the fourth gradient to the 18th convolutional layer, and so on, until the gradient parameter is passed to the input layer, completing one adjustment of the parameters of the convolutional neural network model.
Further, the convolutional neural network model continues to recognize the training image based on the adjusted model parameters, and the model parameters included in the convolutional neural network model are adjusted again according to the above procedure. As described above, training of the convolutional neural network model is determined to be complete when the difference between the second predicted category probability and the initial category probability falls below the preset threshold, or when the number of iterations reaches the preset count.
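The overall effect of the branch supervisor on back-propagated gradient magnitudes can be caricatured with scalars. The per-layer derivatives, the supervisor's position, and the auxiliary error value below are all illustrative assumptions:

```python
def trace_gradients(local_derivs, fc_position, output_error, aux_error):
    """Trace the gradient magnitude layer by layer (output -> input).

    At index `fc_position` the specified fully connected layer adds its own
    second gradient (`aux_error`) to the shrunken back-propagated first
    gradient, producing the third gradient that continues downward.
    """
    grads = []
    g = output_error
    for i, d in enumerate(local_derivs):
        g = g * d                     # gradient determined by this layer
        if i == fc_position:          # specified fully connected layer
            g = g + aux_error         # first gradient + second gradient
        grads.append(g)
    return grads
```

Without the addition at `fc_position`, the trace would decay as 0.5, 0.25, 0.125; with it, the layers below the supervisor receive a usable gradient again, which is the claimed acceleration of convergence.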
In the embodiments of the present disclosure, a specified fully connected layer is inserted between the multiple convolutional layers included in a convolutional neural network model. A first predicted category probability is obtained by recognizing the training image through the layers before the specified fully connected layer; the specified fully connected layer determines the error between the first predicted category probability and the initial category probability, and determines a second gradient based on that error. When the specified fully connected layer receives the first gradient passed by its next convolutional layer, the one closer to the output layer of the convolutional neural network model, it performs a summation operation on the first gradient and the second gradient to obtain a third gradient, and the third gradient is determined as the gradient parameter for training the convolutional neural network model. Because the summation of the first gradient with the second gradient enhances the gradient parameter, the determined gradient parameter can be passed deeper, thereby accelerating the convergence of the algorithm.
Fig. 3A is a structural block diagram of a gradient parameter determining device according to an exemplary embodiment. Referring to Fig. 3A, the device includes a receiving module 310, a first determining module 312, a computing module 314 and a second determining module 316.
The receiving module 310 is configured to receive, through the specified fully connected layer in the convolutional neural network model to be trained, the first gradient passed by the next convolutional layer of the specified fully connected layer; the specified fully connected layer is located at a specified location between the multiple convolutional layers included in the convolutional neural network model, and the next convolutional layer of the specified fully connected layer is the one closer to the output layer of the convolutional neural network model.
The first determining module 312 is configured to determine the second gradient through the specified fully connected layer; the second gradient is determined based on the first category probability error, which is the error between the first predicted category probability and the initial category probability, the first predicted category probability being obtained after the training image is recognized by the layers located before the specified fully connected layer in the convolutional neural network model.
The computing module 314 is configured to perform a summation operation on the first gradient received by the receiving module 310 and the second gradient determined by the first determining module 312, to obtain the third gradient.
The second determining module 316 is configured to determine the third gradient obtained by the computing module 314 as the gradient parameter for training the convolutional neural network model.
Optionally, the first determining module 312 is configured to:
recognize the training image through the layers located before the specified fully connected layer in the convolutional neural network model, to obtain the first predicted category probability;
determine the difference between the first predicted category probability and the initial category probability, to obtain the first category probability error;
based on the first category probability error, determine the second gradient through the specified fully connected layer using the specified gradient descent method.
Optionally, referring to Fig. 3B, the device further includes:
a recognition processing module 318, configured to recognize the training image through all the layers included in the convolutional neural network model, to obtain the second predicted category probability;
a third determining module 320, configured to determine the difference between the second predicted category probability and the initial category probability, to obtain the second category probability error;
a fourth determining module 322, configured to determine, based on the second category probability error, the first gradient through the next convolutional layer of the specified fully connected layer in the convolutional neural network model, using the specified gradient descent method.
Optionally, referring to Fig. 3C, the device further includes:
a fifth determining module 324, configured to determine the product of the gradient length of the third gradient and the specified coefficient to obtain the moving step length, and to move the model parameters of the specified fully connected layer by the moving step length in the gradient direction of the third gradient, the specified coefficient being any coefficient set in advance;
a transfer module 326, configured to pass the third gradient to the preceding convolutional layer of the specified fully connected layer, so that the gradient parameter continues to be passed.
Optionally, when the model parameters included in the convolutional neural network model are the initial model parameters, the initial model parameters are any parameters set in advance.
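The wiring of the modules of Fig. 3A can be sketched in miniature; scalar gradients stand in for real tensors here, and the class below is only an illustration of how the modules hand data to one another, not an implementation from the disclosure:

```python
class GradientParameterDevice:
    """Receive -> determine -> sum, mirroring modules 310, 312, 314/316."""

    def receive(self, first_grad):
        # Receiving module 310: first gradient from the next convolutional layer.
        self.first = first_grad

    def determine_second(self, second_grad):
        # First determining module 312: second gradient from the branch supervisor.
        self.second = second_grad

    def compute(self):
        # Computing module 314 + second determining module 316:
        # the summed result is the gradient parameter used for training.
        return self.first + self.second
```

A usage sequence would call `receive`, then `determine_second`, then read the third gradient from `compute`.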
In the embodiments of the present disclosure, a specified fully connected layer is inserted between the multiple convolutional layers included in a convolutional neural network model. A first predicted category probability is obtained by recognizing the training image through the layers before the specified fully connected layer; the specified fully connected layer determines the error between the first predicted category probability and the initial category probability, and determines a second gradient based on that error. When the specified fully connected layer receives the first gradient passed by its next convolutional layer, the one closer to the output layer of the convolutional neural network model, it performs a summation operation on the first gradient and the second gradient to obtain a third gradient, and the third gradient is determined as the gradient parameter for training the convolutional neural network model. Because the summation of the first gradient with the second gradient enhances the gradient parameter, the determined gradient parameter can be passed deeper, thereby accelerating the convergence of the algorithm.
With regard to the device in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the related method, and will not be elaborated here.
Fig. 4 is a block diagram of a gradient parameter determining device 400 according to an exemplary embodiment. For example, the device 400 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, and the like.
Referring to Fig. 4, the device 400 may include one or more of the following components: a processing component 402, a memory 404, a power supply component 406, a multimedia component 408, an audio component 410, an input/output (I/O) interface 412, a sensor component 414, and a communication component 416.
The processing component 402 generally controls the overall operations of the device 400, such as operations associated with display, telephone calls, data communication, camera operations and recording operations. The processing component 402 may include one or more processors 420 to execute instructions, so as to complete all or part of the steps of the above method. In addition, the processing component 402 may include one or more modules to facilitate interaction between the processing component 402 and other components; for example, the processing component 402 may include a multimedia module to facilitate interaction between the multimedia component 408 and the processing component 402.
The memory 404 is configured to store various types of data to support the operation of the device 400. Examples of such data include instructions for any application or method operated on the device 400, contact data, phonebook data, messages, pictures, videos, and so on. The memory 404 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as a static random access memory (SRAM), an electrically erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a programmable read-only memory (PROM), a read-only memory (ROM), a magnetic memory, a flash memory, a magnetic disk or an optical disk.
The power supply component 406 provides power for the various components of the device 400. The power supply component 406 may include a power management system, one or more power supplies, and other components associated with generating, managing and distributing power for the device 400.
The multimedia component 408 includes a screen providing an output interface between the device 400 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 408 includes a front camera and/or a rear camera. When the device 400 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each of the front camera and the rear camera may be a fixed optical lens system or have focusing and optical zoom capabilities.
The audio component 410 is configured to output and/or input audio signals. For example, the audio component 410 includes a microphone (MIC); when the device 400 is in an operating mode, such as a call mode, a recording mode or a speech recognition mode, the microphone is configured to receive external audio signals. The received audio signals may be further stored in the memory 404 or sent via the communication component 416. In some embodiments, the audio component 410 also includes a speaker for outputting audio signals.
The I/O interface 412 provides an interface between the processing component 402 and peripheral interface modules, which may be a keyboard, a click wheel, buttons and the like. These buttons may include, but are not limited to: a home button, a volume button, a start button and a lock button.
The sensor component 414 includes one or more sensors for providing state assessments of various aspects of the device 400. For example, the sensor component 414 may detect the open/closed state of the device 400 and the relative positioning of components, e.g., the display and keypad of the device 400; the sensor component 414 may also detect a change in position of the device 400 or a component of the device 400, the presence or absence of user contact with the device 400, the orientation or acceleration/deceleration of the device 400, and a change in temperature of the device 400. The sensor component 414 may include a proximity sensor, configured to detect the presence of nearby objects without any physical contact. The sensor component 414 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 414 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
The communication component 416 is configured to facilitate wired or wireless communication between the device 400 and other devices. The device 400 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 416 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 416 also includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
In an exemplary embodiment, the device 400 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components, for performing the method provided by the embodiment shown in Fig. 1 or Fig. 2A.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, for example the memory 404 including instructions, which may be executed by the processor 420 of the device 400 to complete the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device and the like.
A non-transitory computer-readable storage medium, wherein, when the instructions in the storage medium are executed by the processor of a mobile terminal, the mobile terminal is enabled to perform a gradient parameter determination method, the method including:
receiving, through the specified fully connected layer in the convolutional neural network model to be trained, the first gradient passed by the next convolutional layer of the specified fully connected layer, the specified fully connected layer being located at a specified location between the multiple convolutional layers included in the convolutional neural network model, and the next convolutional layer of the specified fully connected layer being the one closer to the output layer of the convolutional neural network model;
determining the second gradient through the specified fully connected layer, the second gradient being determined based on the first category probability error, the first category probability error being the error between the first predicted category probability and the initial category probability, and the first predicted category probability being obtained after the training image is recognized by the layers located before the specified fully connected layer in the convolutional neural network model;
performing a summation operation on the first gradient and the second gradient to obtain the third gradient;
determining the third gradient as the gradient parameter for training the convolutional neural network model.
Optionally, determining the second gradient through the specified fully connected layer includes:
recognizing the training image through the layers located before the specified fully connected layer in the convolutional neural network model, to obtain the first predicted category probability;
determining the difference between the first predicted category probability and the initial category probability, to obtain the first category probability error;
based on the first category probability error, determining the second gradient through the specified fully connected layer using the specified gradient descent method.
Optionally, before receiving, through the specified fully connected layer in the convolutional neural network model to be trained, the first gradient passed by the next convolutional layer of the specified fully connected layer, the method further includes:
recognizing the training image through all the layers included in the convolutional neural network model, to obtain the second predicted category probability;
determining the difference between the second predicted category probability and the initial category probability, to obtain the second category probability error;
based on the second category probability error, determining the first gradient through the next convolutional layer of the specified fully connected layer in the convolutional neural network model, using the specified gradient descent method.
Optionally, after determining the third gradient as the gradient parameter for training the convolutional neural network model, the method further includes:
determining the product of the gradient length of the third gradient and the specified coefficient to obtain the moving step length, and moving the model parameters of the specified fully connected layer by the moving step length in the gradient direction of the third gradient, the specified coefficient being any coefficient set in advance;
passing the third gradient to the preceding convolutional layer of the specified fully connected layer, so that the gradient parameter continues to be passed.
Optionally, when the model parameters included in the convolutional neural network model are the initial model parameters, the initial model parameters are any parameters set in advance.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present application is intended to cover any variations, uses or adaptations of the present disclosure that follow the general principles of the present disclosure and include common knowledge or conventional technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
Claims (12)
1. A gradient parameter determination method, characterized in that the method comprises:
receiving, through a specified fully connected layer in a convolutional neural network model to be trained, a first gradient passed by a next convolutional layer of the specified fully connected layer, the specified fully connected layer being located at a specified location between multiple convolutional layers included in the convolutional neural network model, and the next convolutional layer of the specified fully connected layer being the one closer to an output layer of the convolutional neural network model;
determining a second gradient through the specified fully connected layer, the second gradient being determined based on a first category probability error, the first category probability error being the error between a first predicted category probability and an initial category probability, and the first predicted category probability being obtained after a training image is recognized by layers located before the specified fully connected layer in the convolutional neural network model;
performing a summation operation on the first gradient and the second gradient to obtain a third gradient;
determining the third gradient as the gradient parameter for training the convolutional neural network model.
2. The method according to claim 1, characterized in that determining the second gradient through the specified fully connected layer comprises:
performing recognition processing on the training image through the layers located above the specified fully connected layer in the convolutional neural network model to obtain the first predicted category probability;
determining the difference between the first predicted category probability and the initial category probability to obtain the first category probability error;
determining the second gradient through the specified fully connected layer using a specified gradient descent method, based on the first category probability error.
3. The method according to claim 1, characterized in that before receiving, through the specified fully connected layer in the convolutional neural network model to be trained, the first gradient transmitted by the next convolutional layer of the specified fully connected layer, the method further comprises:
performing recognition processing on the training image through all layers included in the convolutional neural network model to obtain a second predicted category probability;
determining the difference between the second predicted category probability and the initial category probability to obtain a second category probability error;
determining the first gradient through the next convolutional layer of the specified fully connected layer in the convolutional neural network model using a specified gradient descent method, based on the second category probability error.
4. The method according to claim 1, characterized in that after determining the third gradient as the gradient parameter for training the convolutional neural network model, the method further comprises:
determining the product of the gradient length of the third gradient and a prescribed coefficient to obtain a moving step, and moving the model parameters of the specified fully connected layer by the moving step in the gradient direction of the third gradient, the prescribed coefficient being any preset coefficient;
passing the third gradient to the previous convolutional layer of the specified fully connected layer, so that the gradient parameter continues to propagate.
5. The method according to claim 4, characterized in that when the model parameters included in the convolutional neural network model are initial model parameters, the initial model parameters are any preset parameters.
6. A gradient parameter determination apparatus, characterized in that the apparatus comprises:
a receiving module, configured to receive, through a specified fully connected layer in a convolutional neural network model to be trained, a first gradient transmitted by the next convolutional layer of the specified fully connected layer, wherein the specified fully connected layer is located at a specified position between multiple convolutional layers included in the convolutional neural network model, and the next convolutional layer of the specified fully connected layer is closer to the output layer of the convolutional neural network model;
a first determining module, configured to determine a second gradient through the specified fully connected layer, wherein the second gradient is determined based on a first category probability error, the first category probability error is the error between a first predicted category probability and an initial category probability, and the first predicted category probability is obtained by performing recognition processing on a training image through the layers located above the specified fully connected layer in the convolutional neural network model;
a computing module, configured to sum the first gradient received by the receiving module and the second gradient determined by the first determining module to obtain a third gradient;
a second determining module, configured to determine the third gradient obtained by the computing module as the gradient parameter for training the convolutional neural network model.
7. The apparatus according to claim 6, characterized in that the first determining module is configured to:
perform recognition processing on the training image through the layers located above the specified fully connected layer in the convolutional neural network model to obtain the first predicted category probability;
determine the difference between the first predicted category probability and the initial category probability to obtain the first category probability error;
determine the second gradient through the specified fully connected layer using a specified gradient descent method, based on the first category probability error.
8. The apparatus according to claim 6, characterized in that the apparatus further comprises:
a recognition processing module, configured to perform recognition processing on the training image through all layers included in the convolutional neural network model to obtain a second predicted category probability;
a third determining module, configured to determine the difference between the second predicted category probability and the initial category probability to obtain a second category probability error;
a fourth determining module, configured to determine the first gradient through the next convolutional layer of the specified fully connected layer in the convolutional neural network model using a specified gradient descent method, based on the second category probability error.
9. The apparatus according to claim 6, characterized in that the apparatus further comprises:
a fifth determining module, configured to determine the product of the gradient length of the third gradient and a prescribed coefficient to obtain a moving step, and to move the model parameters of the specified fully connected layer by the moving step in the gradient direction of the third gradient, the prescribed coefficient being any preset coefficient;
a transfer module, configured to pass the third gradient to the previous convolutional layer of the specified fully connected layer, so that the gradient parameter continues to propagate.
10. The apparatus according to claim 9, characterized in that when the model parameters included in the convolutional neural network model are initial model parameters, the initial model parameters are any preset parameters.
11. A gradient parameter determination apparatus, characterized in that the apparatus comprises:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method according to any one of claims 1-5.
12. A computer-readable storage medium storing instructions, characterized in that the instructions, when executed by a processor, implement the method according to any one of claims 1-5.
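The first-gradient computation of claim 3 can also be sketched. Below, the whole network is hypothetically collapsed to a single linear classifier `W` over the convolutional features, and the "specified gradient descent method" is again assumed to be the softmax cross-entropy gradient; all names are illustrative, not from the patent:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

def first_gradient(features, W, initial_probs):
    """Run the (reduced) network on the conv-layer features, form the
    second category probability error, and backpropagate one linear step
    to obtain the first gradient w.r.t. the features produced by the
    next convolutional layer."""
    logits = features @ W                  # forward pass through all layers (reduced)
    second_pred = softmax(logits)          # second predicted category probability
    error = second_pred - initial_probs    # second category probability error
    return W @ error                       # gradient at the conv-layer features
```

In the full method this gradient is what the next convolutional layer transmits upward to the specified fully connected layer, where claim 1 sums it with the locally computed second gradient.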
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710373287.6A CN107229968B (en) | 2017-05-24 | 2017-05-24 | Gradient parameter determination method, gradient parameter determination device and computer-readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107229968A true CN107229968A (en) | 2017-10-03 |
CN107229968B CN107229968B (en) | 2021-06-29 |
Family
ID=59933968
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710373287.6A Active CN107229968B (en) | 2017-05-24 | 2017-05-24 | Gradient parameter determination method, gradient parameter determination device and computer-readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107229968B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104102919A (en) * | 2014-07-14 | 2014-10-15 | 同济大学 | Image classification method capable of effectively preventing convolutional neural network from being overfit |
CN104794527A (en) * | 2014-01-20 | 2015-07-22 | 富士通株式会社 | Method and equipment for constructing classification model based on convolutional neural network |
CN105069413A (en) * | 2015-07-27 | 2015-11-18 | 电子科技大学 | Human body gesture identification method based on depth convolution neural network |
CN105469041A (en) * | 2015-11-19 | 2016-04-06 | 上海交通大学 | Facial point detection system based on multi-task regularization and layer-by-layer supervision neural networ |
US20160174902A1 (en) * | 2013-10-17 | 2016-06-23 | Siemens Aktiengesellschaft | Method and System for Anatomical Object Detection Using Marginal Space Deep Neural Networks |
CN106156807A (en) * | 2015-04-02 | 2016-11-23 | 华中科技大学 | The training method of convolutional neural networks model and device |
CN106250931A (en) * | 2016-08-03 | 2016-12-21 | 武汉大学 | A kind of high-definition picture scene classification method based on random convolutional neural networks |
CN106548201A (en) * | 2016-10-31 | 2017-03-29 | 北京小米移动软件有限公司 | The training method of convolutional neural networks, image-recognizing method and device |
WO2017058479A1 (en) * | 2015-09-29 | 2017-04-06 | Qualcomm Incorporated | Selective backpropagation |
CN106650721A (en) * | 2016-12-28 | 2017-05-10 | 吴晓军 | Industrial character identification method based on convolution neural network |
Non-Patent Citations (5)
Title |
---|
ANISH SHAH et al.: "Deep Residual Networks with Exponential Linear Unit", arXiv *
RAZVAN PASCANU et al.: "On the difficulty of training Recurrent Neural Networks", arXiv *
张玉平 et al.: "Supervised learning for multilayer spiking neural networks based on convolution computation", Computer Engineering and Science (《计算机工程与科学》) *
敖道敢: "Unsupervised feature learning combined with neural networks for image recognition", China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》) *
费建超 et al.: "Gradient-based multi-input convolutional neural network", Opto-Electronic Engineering (《光电工程》) *
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107590534A (en) * | 2017-10-17 | 2018-01-16 | 北京小米移动软件有限公司 | Train the method, apparatus and storage medium of depth convolutional neural networks model |
CN107590534B (en) * | 2017-10-17 | 2021-02-09 | 北京小米移动软件有限公司 | Method and device for training deep convolutional neural network model and storage medium |
CN110033019A (en) * | 2019-03-06 | 2019-07-19 | 腾讯科技(深圳)有限公司 | Method for detecting abnormality, device and the storage medium of human body |
CN110033019B (en) * | 2019-03-06 | 2021-07-27 | 腾讯科技(深圳)有限公司 | Method and device for detecting abnormality of human body part and storage medium |
CN111506104A (en) * | 2020-04-03 | 2020-08-07 | 北京邮电大学 | Method and device for planning position of unmanned aerial vehicle |
CN111506104B (en) * | 2020-04-03 | 2021-10-01 | 北京邮电大学 | Method and device for planning position of unmanned aerial vehicle |
Also Published As
Publication number | Publication date |
---|---|
CN107229968B (en) | 2021-06-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108171254A (en) | Image tag determination method, device and terminal | |
CN107329742A (en) | SDK calling method and device | |
CN107527059A (en) | Character recognition method, device and terminal | |
CN107194464A (en) | Training method and device for convolutional neural network model | |
CN106528709A (en) | Social information recommendation method and device | |
CN107220667A (en) | Image classification method, device and computer-readable storage medium | |
CN106603667A (en) | Screen information sharing method and device | |
CN107145904A (en) | Image category determination method, device and storage medium | |
CN107679483A (en) | License plate recognition method and device | |
CN107943266A (en) | Power consumption control method, device and equipment | |
CN106778531A (en) | Face detection method and device | |
CN105049219B (en) | Data traffic ordering method and system, mobile terminal and server | |
CN104407924B (en) | Memory optimization method and device | |
CN106775224A (en) | Remark information setting method and device | |
CN107229968A (en) | Gradient parameter determination method, device and computer-readable storage medium | |
CN107341509A (en) | Training method and device for convolutional neural networks | |
CN107527024A (en) | Face attractiveness evaluation method and device | |
CN107748867A (en) | Target object detection method and device | |
CN106203306A (en) | Age prediction method, device and terminal | |
CN107563994A (en) | Image saliency detection method and device | |
CN106203275A (en) | Fingerprint unlocking method, device and electronic equipment | |
CN107590534A (en) | Method, device and storage medium for training a deep convolutional neural network model | |
CN107491681A (en) | Fingerprint information processing method and device | |
CN104063424B (en) | Web page image display method and display device | |
CN104035764B (en) | Object control method and related device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||