CN112668716A - Training method and device of neural network model

Training method and device of neural network model

Info

Publication number
CN112668716A
Authority
CN
China
Prior art keywords
loss value, neural network, network model, preset, characteristic information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011602983.8A
Other languages
Chinese (zh)
Inventor
辛冠希
黄源浩
肖振中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orbbec Inc
Original Assignee
Orbbec Inc
Application filed by Orbbec Inc filed Critical Orbbec Inc
Priority to CN202011602983.8A
Publication of CN112668716A


Landscapes

  • Image Analysis (AREA)

Abstract

The application is suitable for the technical field of machine learning, and provides a training method of a neural network model, which comprises the following steps: inputting the training sample set into a preset student neural network model for processing to obtain first characteristic information, and calculating a first loss value according to the first characteristic information and a first loss function; inputting the training sample set into a preset teacher neural network model for processing to obtain fourth characteristic information, and calculating a second loss value according to the first characteristic information, the fourth characteristic information and a second loss function; determining a target loss value according to the first loss value and the second loss value; if the target loss value does not meet the preset suspension condition, updating the preset student neural network according to the target loss value; and if the target loss value meets the preset suspension condition, outputting the trained student neural network model. The method and the device can effectively improve the training convergence condition of the neural network model, enable the neural network to have better learning characteristics, and improve the recognition accuracy and generalization capability of the neural network.

Description

Training method and device of neural network model
Technical Field
The application belongs to the technical field of machine learning, and particularly relates to a training method and equipment of a neural network model.
Background
With the development of artificial intelligence, deep learning increasingly exhibits its irreplaceability. In the training process, a loss function is preset, the training data are fed into a preset neural network and forward-propagated to extract features of the data, the extracted features are substituted into the loss function to calculate a loss value, and finally a back-propagation algorithm is used to update the parameters of the preset neural network model according to the loss value.
However, the prior art generally adopts a single loss function, which may leave the trained model with insufficiently learned features, or with slow learning speed and low generalization capability.
Disclosure of Invention
The embodiment of the application provides a training method and equipment of a neural network model, and can solve the problems that the model obtained by current training is insufficient in learning characteristics or low in learning speed and generalization capability.
In a first aspect, an embodiment of the present application provides a method for training a neural network model, including:
acquiring a training sample set;
inputting the training sample set into a preset student neural network model for processing to obtain first characteristic information, and calculating a first loss value according to the first characteristic information and a first loss function;
inputting the training sample set into a preset teacher neural network model for processing to obtain fourth characteristic information, and calculating a second loss value according to the first characteristic information, the fourth characteristic information and a second loss function;
determining a target loss value according to the first loss value and the second loss value;
if the target loss value does not meet the preset suspension condition, updating the preset student neural network according to the target loss value, and returning to execute the process of inputting the training sample set into a preset student neural network model for processing;
and if the target loss value meets the preset suspension condition, outputting the trained student neural network model.
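Read as pseudocode, the first-aspect steps above amount to a simple training loop. The sketch below is an illustrative reading only; all names (`student`, `teacher`, `first_loss_fn`, `second_loss_fn`, `update`) are hypothetical placeholders, not the patent's implementation.

```python
def train(student, teacher, samples, first_loss_fn, second_loss_fn,
          stop_threshold=0.01, max_iters=1000):
    """Sketch of the first-aspect training loop (all names hypothetical)."""
    for _ in range(max_iters):
        feat_first = student(samples)                 # first characteristic information
        loss_first = first_loss_fn(feat_first)        # first loss value
        feat_fourth = teacher(samples)                # fourth characteristic information
        loss_second = second_loss_fn(feat_first, feat_fourth)  # second loss value
        target_loss = loss_first + loss_second        # target loss (weighting omitted)
        if target_loss < stop_threshold:              # preset suspension condition met
            break                                     # output the trained student model
        student.update(target_loss)                   # update student and loop again
    return student
```

Under this reading, the suspension condition is a simple threshold on the target loss; the patent leaves both the condition and the update rule abstract.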
Further, the preset student neural network model comprises a first feature extraction structure, a second feature extraction structure and a third feature extraction structure;
the inputting the training sample set into a preset student neural network model for processing to obtain first characteristic information, and calculating a first loss value according to the first characteristic information and a first loss function, including:
inputting the training sample set into the first feature extraction structure for processing to obtain second feature information, and calculating a first auxiliary loss value according to the second feature information;
inputting the second feature information into the second feature extraction structure for processing to obtain third feature information, and calculating a second auxiliary loss value according to the third feature information;
and inputting the third feature information into the third feature extraction structure for processing to obtain first feature information, and calculating a third auxiliary loss value according to the first feature information.
Further, the first feature extraction structure comprises a convolutional layer;
the inputting the training sample set into the first feature extraction structure for processing to obtain second feature information, and calculating a first auxiliary loss value according to the second feature information, including:
inputting the training sample set into the convolutional layer for feature extraction to obtain second feature information;
and calculating a first auxiliary loss value according to the second characteristic information and preset ideal data.
Further, the first feature extraction structure further comprises a full connection layer and an output layer; the activation function of the output layer is a soft-max function;
the calculating a first auxiliary loss value according to the second characteristic information and preset ideal data comprises:
inputting the second feature information into the full-connection layer for processing to obtain a feature vector;
inputting the feature vectors into the output layer, and determining the probability that the classes of the training samples in the training sample set belong to the target classes corresponding to the preset ideal data according to the preset ideal data;
and calculating a first auxiliary loss value according to the probability and a preset cross entropy loss function.
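The full-connection/output-layer step above can be sketched with a generic soft-max followed by cross entropy. This is a minimal stdlib illustration, not the patent's code, and the function names are ours; the probabilities returned by `softmax` sum to 1 across the target classes.

```python
import math

def softmax(logits):
    """Map the fully connected layer's feature vector to class probabilities."""
    m = max(logits)                              # subtract max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_index):
    """Cross entropy for one sample: -log of the target-class probability."""
    return -math.log(probs[target_index])
```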
Further, the preset teacher neural network model includes a Resnet152 network.
Further, the calculating a second loss value according to the first characteristic information, the fourth characteristic information and a second loss function includes:
performing square error calculation on the first characteristic information and the fourth characteristic information to obtain a second loss value;
alternatively,
performing mean square error calculation on the first characteristic information and the fourth characteristic information to obtain a second loss value;
alternatively,
and performing absolute error calculation on the first characteristic information and the fourth characteristic information to obtain a second loss value.
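The three alternatives just listed can be written out for flattened feature tensors; a stdlib sketch under the assumption that both pieces of characteristic information are flattened to equal-length lists:

```python
def squared_error(first, fourth):
    """Sum of element-wise squared differences."""
    return sum((a - b) ** 2 for a, b in zip(first, fourth))

def mean_squared_error(first, fourth):
    """Squared error averaged over the number of elements."""
    return squared_error(first, fourth) / len(first)

def absolute_error(first, fourth):
    """Sum of element-wise absolute differences."""
    return sum(abs(a - b) for a, b in zip(first, fourth))
```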
Further, the determining a target loss value according to the first loss value and the second loss value includes:
and carrying out weighted summation on the first auxiliary loss value, the second auxiliary loss value, the third auxiliary loss value and the second loss value to obtain a target loss value.
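The weighted summation can be sketched in one function; the weight values shown are hypothetical, since the patent does not fix them:

```python
def target_loss(aux1, aux2, aux3, second_loss, weights=(0.3, 0.3, 0.4, 1.0)):
    """Weighted sum of the three auxiliary loss values and the second loss value."""
    w1, w2, w3, w4 = weights
    return w1 * aux1 + w2 * aux2 + w3 * aux3 + w4 * second_loss
```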
In a second aspect, an embodiment of the present application provides an object identification method, including:
acquiring an object picture;
inputting the object picture into a preset object recognition model for processing to obtain an object recognition result corresponding to the object picture; wherein the object recognition model is obtained by the method for training a neural network model according to the first aspect.
In a third aspect, an embodiment of the present application provides a training apparatus for a neural network model, including:
a first obtaining unit, configured to obtain a training sample set;
the first processing unit is used for inputting the training sample set into a preset student neural network model for processing to obtain first characteristic information, and calculating a first loss value according to the first characteristic information and a first loss function;
the second processing unit is used for inputting the training sample set into a preset teacher neural network model for processing to obtain fourth characteristic information, and calculating a second loss value according to the first characteristic information, the fourth characteristic information and a second loss function;
a first determining unit, configured to determine a target loss value according to the first loss value and the second loss value;
the third processing unit is used for updating the preset student neural network according to the target loss value if the target loss value does not meet the preset suspension condition, and returning to execute the process of inputting the training sample set into a preset student neural network model;
and the fourth processing unit is used for outputting the trained student neural network model if the target loss value meets the preset suspension condition.
Further, the preset student neural network model comprises a first feature extraction structure, a second feature extraction structure and a third feature extraction structure;
the first processing unit is specifically configured to:
inputting the training sample set into the first feature extraction structure for processing to obtain second feature information, and calculating a first auxiliary loss value according to the second feature information;
inputting the second feature information into the second feature extraction structure for processing to obtain third feature information, and calculating a second auxiliary loss value according to the third feature information;
and inputting the third feature information into the third feature extraction structure for processing to obtain first feature information, and calculating a third auxiliary loss value according to the first feature information.
Further, the first feature extraction structure comprises a convolutional layer;
the first processing unit is specifically further configured to:
inputting the training sample set into the convolutional layer for feature extraction to obtain second feature information;
and calculating a first auxiliary loss value according to the second characteristic information and preset ideal data.
Further, the first feature extraction structure further comprises a full connection layer and an output layer; the activation function of the output layer is a soft-max function;
the first processing unit is specifically further configured to:
inputting the second feature information into the full-connection layer for processing to obtain a feature vector;
inputting the feature vectors into the output layer, and determining the probability that the classes of the training samples in the training sample set belong to the target classes corresponding to the preset ideal data according to the preset ideal data;
and calculating a first auxiliary loss value according to the probability and a preset cross entropy loss function.
Further, the preset teacher neural network model includes a Resnet152 network.
Further, the second processing unit is specifically configured to:
performing square error calculation on the first characteristic information and the fourth characteristic information to obtain a second loss value;
alternatively,
performing mean square error calculation on the first characteristic information and the fourth characteristic information to obtain a second loss value;
alternatively,
and performing absolute error calculation on the first characteristic information and the fourth characteristic information to obtain a second loss value.
Further, the first determining unit is specifically configured to:
and carrying out weighted summation on the first auxiliary loss value, the second auxiliary loss value, the third auxiliary loss value and the second loss value to obtain a target loss value.
In a fourth aspect, an embodiment of the present application provides an object recognition apparatus, including:
an acquisition unit for acquiring a picture of an object;
the processing unit is used for inputting the object picture into a preset object recognition model for processing to obtain an object recognition result corresponding to the object picture; wherein the object recognition model is obtained by the method for training a neural network model according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a training apparatus for a neural network model, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the training method for the neural network model according to the first aspect.
In a sixth aspect, an embodiment of the present application provides an object identification device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor executes the computer program to implement the object identification method according to the second aspect.
In a seventh aspect, the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the method for training a neural network model according to the first aspect.
In an eighth aspect, the present application provides a computer-readable storage medium, which stores a computer program, and the computer program, when executed by a processor, implements the object identification method according to the second aspect.
On one hand, in the embodiment of the application, a training sample set is obtained; inputting the training sample set into a preset student neural network model for processing to obtain first characteristic information, and calculating a first loss value according to the first characteristic information and a first loss function; inputting the training sample set into a preset teacher neural network model for processing to obtain fourth characteristic information, and calculating a second loss value according to the first characteristic information, the fourth characteristic information and a second loss function; determining a target loss value according to the first loss value and the second loss value; if the target loss value does not meet the preset suspension condition, updating the preset student neural network according to the target loss value, and returning to execute the process of inputting the training sample set into the preset student neural network model for processing; and if the target loss value meets the preset suspension condition, outputting the trained student neural network model. In the application, the target loss value is determined by presetting the first loss value obtained by the student neural network model and the second loss value obtained by the teacher neural network model, so that the training convergence condition of the neural network model is effectively improved, the neural network can better perform feature learning, and the recognition accuracy and the generalization capability of the neural network are improved.
On the other hand, the object identification method provided by the application acquires an object picture and inputs the object picture into a preset object recognition model for processing to obtain an object recognition result corresponding to the object picture, wherein the object recognition model is obtained through the above training method of the neural network model. Because the object recognition model is obtained by this training method, the object identification method can better extract the features of the object picture, improving the accuracy of object recognition.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
FIG. 1 is a schematic flow chart diagram of a method for training a neural network model provided in a first embodiment of the present application;
FIG. 2 is a schematic flowchart of a refinement of S102 in a training method of a neural network model provided in a first embodiment of the present application;
FIG. 3 is a schematic flow chart diagram of an object identification method according to a second embodiment of the present application;
FIG. 4 is a schematic diagram of a training apparatus for neural network models provided in a third embodiment of the present application;
fig. 5 is a schematic view of an object recognition apparatus according to a fourth embodiment of the present application;
FIG. 6 is a schematic diagram of a training apparatus for a neural network model provided in a fifth embodiment of the present application;
fig. 7 is a schematic diagram of an object recognition apparatus according to a sixth embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
Referring to fig. 1, fig. 1 is a schematic flow chart of a training method of a neural network model according to a first embodiment of the present application. In this embodiment, an execution subject of the training method of the neural network model is a device having a training function of the neural network model, for example, a desktop computer, a server, and the like. The training method of the neural network model shown in fig. 1 may include:
s101: a training sample set is obtained.
The method comprises the steps that a device obtains a training sample set, wherein the training sample set comprises training samples and sample labels corresponding to the training samples, and each training sample corresponds to one sample label.
In one embodiment, the training samples may include one of RGB images, IR images, depth images, or other images. The training samples may be used to train a neural network model for image processing, and are not limited herein.
In one embodiment, to accelerate computation, the number of training samples in the training sample set may be a multiple of 32, which aligns well with common hardware batch processing.
S102: inputting the training sample set into a preset student neural network model for processing to obtain first characteristic information, and calculating a first loss value according to the first characteristic information and a first loss function.
The device is pre-stored with a preset student neural network model, and the preset student neural network model is used for training to obtain a finally needed neural network model. The preset student neural network model is simple in structure and general in performance. And the preset student neural network model learns the behavior of the preset teacher neural network model.
The device inputs the training sample set into a preset student neural network model for processing to obtain first characteristic information, and calculates a first loss value according to the first characteristic information and a first loss function.
Wherein the first loss function may be a cross entropy loss function.
In one embodiment, in order to more accurately learn the behavior of the preset teacher neural network model, the preset student neural network model comprises a first feature extraction structure, a second feature extraction structure and a third feature extraction structure; the first feature extraction structure, the second feature extraction structure and the third feature extraction structure are connected in series to form a preset student neural network model. S102 may include S1021 to S1023, and as shown in fig. 2, S1021 to S1023 are as follows:
s1021: and inputting the training sample set into the first feature extraction structure for processing to obtain second feature information, and calculating a first auxiliary loss value according to the second feature information.
The device inputs the training sample set into the first feature extraction structure for processing to obtain second feature information, and calculates a first auxiliary loss value according to the second feature information.
Specifically, the first feature extraction structure includes convolutional layers; the number of convolutional layers may be one, two, or more, and three layers are preferred in this embodiment for extracting global features, which is not limited herein.
The convolutional layer can be used for extracting second characteristic information of the training samples in the training sample set, and can also be used for reducing the dimension in the process of extracting the characteristics of the training samples so as to reduce the calculated amount.
The device inputs the training sample set into the convolutional layer for feature extraction to obtain second feature information. In one embodiment, the first feature extraction structure includes at least 3 convolutional layers, to which activation layers are connected. Preferably, the activation layer is a ReLU activation function layer. The activation layer applies a nonlinear activation to the features extracted by the linear computation of the convolutional layer, introducing nonlinearity to improve the nonlinear expressive capability of the network to be trained.
It can be understood that, before the first feature extraction structure, a batch normalization layer may be provided to normalize the training samples, so as to prevent shifts in the data distribution during training, which may cause the gradient to vanish or explode.
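Batch normalization as described above shifts and scales each batch to zero mean and unit variance; a stdlib sketch for scalar features (the learnable scale and shift parameters of a full batch-norm layer are omitted):

```python
import math

def batch_normalize(batch, eps=1e-5):
    """Normalize a batch of scalar features to roughly zero mean, unit variance."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n   # biased variance, as in batch norm
    return [(x - mean) / math.sqrt(var + eps) for x in batch]
```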
And after the second characteristic information is obtained, the equipment calculates a first auxiliary loss value according to the second characteristic information and preset ideal data. The device calculates a first auxiliary loss value according to the second characteristic information, preset ideal data and a preset loss function.
In one embodiment, the first feature extraction structure further comprises a fully connected layer and an output layer; the activation function of the output layer is a soft-max function; inputting second feature information into the full-connection layer for processing to obtain a feature vector; the fully-connected layer is composed of a plurality of neurons and is in a fully-connected state with the last convolution layer, so that the second characteristic information can obtain a characteristic vector with the dimension same as the number of the neurons after passing through the fully-connected layer.
The activation function of the output layer is a soft-max function; the device inputs the feature vectors into an output layer, and determines the probability that the class of the training samples in the training sample set belongs to the target class corresponding to the preset ideal data according to the preset ideal data. The output layer is connected with the full connection layer, the output layer comprises neurons which are consistent with the number of the training samples, each neuron is endowed with the characteristic vector of the full connection layer by combining with different weight matrixes and offsets, and each neuron is respectively corresponding to the probability that the category of the training sample belongs to a certain category in the preset ideal data.
It can be understood that the preset ideal data includes the same class and number as the training samples, and the sum of the probabilities corresponding to each neuron is 1.
And finally, the equipment calculates a first auxiliary loss value according to the probability and a preset cross entropy loss function.
In one embodiment, the classification result takes the class with the maximum probability. Assuming that the first auxiliary loss value is L1, then

L_1 = -\sum_{i}\sum_{k=1}^{N} y_{ik}\,\log(p_{ik})

where N is the number of categories; y_{ik} is an indicator variable that equals 1 if the category of training sample i is the same as category k in the preset ideal data and 0 otherwise; and p_{ik} is the predicted probability that the sample belongs to category k in the preset ideal data.
It can be understood that, when the category of the training sample is the same as a category in the preset ideal data, the probability that the category of the training sample belongs to that category of the preset ideal data is highest at the output layer; and the calculation of the loss function is not limited to cross entropy, but may also be, for example, a logarithmic loss function, which is not limited here.
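The cross-entropy formula above can be checked numerically with explicit indicator variables; a sketch in the same notation, where `y[i][k]` and `p[i][k]` play the roles of y_{ik} and p_{ik} (all probabilities are assumed strictly positive):

```python
import math

def first_auxiliary_loss(y, p):
    """L1 = -sum over samples i and categories k of y[i][k] * log(p[i][k]).

    y: one-hot indicator rows; p: predicted probability rows (all > 0).
    """
    return -sum(y_ik * math.log(p_ik)
                for y_i, p_i in zip(y, p)
                for y_ik, p_ik in zip(y_i, p_i))
```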
S1022: and inputting the second characteristic information into a second characteristic extraction structure for processing to obtain third characteristic information, and calculating a second auxiliary loss value according to the third characteristic information.
In this embodiment, for the specific details of inputting the second feature information into the second feature extraction structure for processing to obtain third feature information and calculating the second auxiliary loss value according to the third feature information, refer to the description in S1021 of inputting the training sample set into the first feature extraction structure to obtain the second feature information and calculating the first auxiliary loss value according to the second feature information; details are not repeated here.
The number of convolution layers of the second feature extraction structure and the first feature extraction structure may be the same or different.
Preferably, the number of convolutional layers of the second feature extraction structure is greater than the number of convolutional layers of the first feature extraction structure, so as to extract more global features.
S1023: and inputting the third feature information into a third feature extraction structure for processing to obtain first feature information, and calculating a third auxiliary loss value according to the first feature information.
In this embodiment, for the specific details of inputting the third feature information into the third feature extraction structure for processing to obtain the first feature information and calculating the third auxiliary loss value according to the first feature information, refer to the description in S1021 of inputting the training sample set into the first feature extraction structure to obtain the second feature information and calculating the first auxiliary loss value according to the second feature information; details are not repeated here.
The number of convolution layers of the third feature extraction structure and the first feature extraction structure may be the same or different.
S103: and inputting the training sample set into a preset teacher neural network model for processing to obtain fourth characteristic information, and calculating a second loss value according to the first characteristic information, the fourth characteristic information and a second loss function.
A preset teacher neural network model is pre-stored in the device, and the function realized by the teacher neural network model is the same as that realized by the preset student neural network model. For example, both the teacher neural network model and the preset student neural network model may be used for computer image processing and output processed image data.
In one embodiment, the preset teacher neural network model includes a Resnet152 network. Resnet152 is a large, highly complex network model containing 152 convolutional layers; the deeper the network, the richer and more complete the features it can extract. However, Resnet152 trains slowly and requires high-performance computation. The teacher network model is therefore used to teach the student network model to imitate its accurate behavior, so that a smaller and faster student network model whose output is comparable to that of the teacher network model can be obtained; this method is known as knowledge distillation.
The device calculates a second loss value based on the first characteristic information, the fourth characteristic information, and a second loss function.
In one embodiment, a squared-error calculation is performed on the first feature information and the fourth feature information to obtain the second loss value. The training sample set is transmitted to the preset teacher network model to acquire the fourth feature information, and a squared-error calculation is performed on the fourth feature information and the first feature information of the student network model, i.e., the squared loss function:
L(Y, f(x)) = (Y − f(x))²

where Y is the matrix of the fourth feature information, f(x) is the matrix of the first feature information, and n represents the number of samples contained in the training sample set.
The squared difference L(Y, f(x)) calculated from the squared loss function is a matrix, and the second loss value is obtained by summing all the elements of this matrix.
In an embodiment, the device may also perform a mean-square-error calculation on the first feature information and the fourth feature information to obtain the second loss value; or, an absolute-error calculation is performed on the first feature information and the fourth feature information to obtain the second loss value, which is not limited herein.
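The three variants of the second loss value described above can be sketched as follows, with the first and fourth feature information flattened into lists of floats. This is an illustrative sketch, not the patented implementation; the function names are ours.

```python
# Sketch of the three second-loss variants: summed squared error,
# mean squared error, and absolute (L1) error between the student's
# first feature information and the teacher's fourth feature information.

def squared_error_loss(student_feat, teacher_feat):
    # L(Y, f(x)) = (Y - f(x))^2, summed over all matrix elements
    return sum((y - fx) ** 2 for y, fx in zip(teacher_feat, student_feat))

def mean_squared_error_loss(student_feat, teacher_feat):
    return squared_error_loss(student_feat, teacher_feat) / len(student_feat)

def absolute_error_loss(student_feat, teacher_feat):
    return sum(abs(y - fx) for y, fx in zip(teacher_feat, student_feat))
```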
S104: a target loss value is determined based on the first loss value and the second loss value.
The device determines a target loss value according to the first loss value and the second loss value. Specifically, weight coefficients for the first loss value and the second loss value may be set in the device and used to calculate the target loss value; alternatively, the device may calculate the target loss value from the first loss value and the second loss value according to a preset calculation rule.
In one embodiment, the apparatus may perform a weighted summation of the first auxiliary penalty value, the second auxiliary penalty value, the third auxiliary penalty value, and the second penalty value to obtain a target penalty value.
The target loss value L may be:
L=L1+L2+L3+L4
where L1 is the first auxiliary loss value, L2 is the second auxiliary loss value, L3 is the third auxiliary loss value, and L4 is the second loss value.
It should be understood that the expression of the target loss value is not limited thereto; the different loss values may also be combined using preset weights to obtain the target loss value, which is not limited herein.
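The weighted combination of loss values described above can be sketched as follows. The weight values are hypothetical; with all weights set to 1.0 this reduces to the unweighted sum L = L1 + L2 + L3 + L4 given above.

```python
# Sketch of the target-loss combination: a weighted sum of the three
# auxiliary loss values (L1, L2, L3) and the second loss value (L4).

def target_loss(l1, l2, l3, l4, weights=(1.0, 1.0, 1.0, 1.0)):
    return sum(w * l for w, l in zip(weights, (l1, l2, l3, l4)))
```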
S105: and if the target loss value does not meet the preset suspension condition, updating the preset student neural network according to the target loss value, and returning to execute the process of inputting the training sample set into the preset student neural network model for processing.
A preset suspension condition is pre-stored in the device. If the target loss value does not meet the preset suspension condition, the device updates the preset student neural network according to the target loss value: back propagation is performed using a batch gradient descent method to update the learnable parameters of the model, and the process of inputting the training sample set into the preset student neural network model is executed again.
S106: and if the target loss value meets the preset suspension condition, outputting the trained student neural network model.
And if the target loss value meets the preset suspension condition, outputting the trained student neural network model.
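The iterate-until-suspension logic of S105 and S106 can be sketched as a simple loop. The suspension condition here — a loss threshold with an iteration cap — is a hypothetical example; the patent does not fix a particular condition, and `step_fn` stands in for one full forward pass plus parameter update.

```python
# Sketch of the training loop in S105-S106: keep updating the student
# model until the target loss value meets the preset suspension
# condition (here, a hypothetical loss threshold or iteration cap).

def train(step_fn, loss_threshold=1e-3, max_iters=1000):
    """step_fn() performs one forward pass plus parameter update and
    returns the current target loss value."""
    loss = float("inf")
    for i in range(max_iters):
        loss = step_fn()
        if loss <= loss_threshold:   # preset suspension condition met
            return loss, i + 1      # output the trained model's state
    return loss, max_iters
```

A usage example: with a `step_fn` that halves the loss each call starting from 1.0 and a threshold of 0.01, the loop stops after seven updates.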
In the embodiment of the application, a training sample set is obtained; inputting the training sample set into a preset student neural network model for processing to obtain first characteristic information, and calculating a first loss value according to the first characteristic information and a first loss function; inputting the training sample set into a preset teacher neural network model for processing to obtain fourth characteristic information, and calculating a second loss value according to the first characteristic information, the fourth characteristic information and a second loss function; determining a target loss value according to the first loss value and the second loss value; if the target loss value does not meet the preset suspension condition, updating the preset student neural network according to the target loss value, and returning to execute the process of inputting the training sample set into the preset student neural network model for processing; and if the target loss value meets the preset suspension condition, outputting the trained student neural network model. In the application, the target loss value is determined by presetting the first loss value obtained by the student neural network model and the second loss value obtained by the teacher neural network model, so that the training convergence condition of the neural network model is effectively improved, the neural network can learn better characteristics, and the recognition accuracy and generalization capability of the neural network are improved.
Referring to fig. 3, fig. 3 is a schematic flow chart of an object identification method according to a second embodiment of the present application. The object recognition method in this embodiment is executed by a device having an object recognition function, for example, a desktop computer, a server, a robot, or the like. The object recognition method as shown in fig. 3 may include:
S201: and acquiring an object picture.
When the device detects an image recognition instruction, it acquires an object picture. The manner of acquiring the object picture is not limited here: the object picture may be shot by an apparatus having a camera function and then sent to the device, or the device itself may have a camera function and acquire the object picture directly; for example, a robot having a camera module may directly acquire the object picture through its own camera function.
S202: inputting the object picture into a preset object recognition model for processing to obtain an object recognition result corresponding to the object picture; wherein the object recognition model is obtained by a training method of the neural network model according to any one of claims 1 to 7.
According to the object identification method, an object picture is obtained; inputting the object picture into a preset object recognition model for processing to obtain an object recognition result corresponding to the object picture; the object recognition model is obtained by training a neural network model through a training method. The object recognition model is obtained by training of a neural network model, so that the object recognition method can better extract the characteristics of the object image, and the accuracy of object recognition is improved.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Referring to fig. 4, fig. 4 is a schematic diagram of a training apparatus for a neural network model according to a third embodiment of the present application. The units are used for executing the steps in the embodiment corresponding to the figures 1-2. Please refer to the related descriptions of the embodiments corresponding to fig. 1-2. For convenience of explanation, only the portions related to the present embodiment are shown. Referring to fig. 4, the training device 4 for neural network model includes:
a first obtaining unit 410, configured to obtain a training sample set;
the first processing unit 420 is configured to input the training sample set into a preset student neural network model for processing, obtain first feature information, and calculate a first loss value according to the first feature information and a first loss function;
the second processing unit 430 is configured to input the training sample set into a preset teacher neural network model for processing, to obtain fourth feature information, and calculate a second loss value according to the first feature information, the fourth feature information, and a second loss function;
a first determining unit 440 for determining a target loss value according to the first loss value and the second loss value;
the third processing unit 450 is configured to update the preset student neural network according to the target loss value if the target loss value does not meet the preset suspension condition, and return to execute the process of inputting the training sample set into the preset student neural network model;
and a fourth processing unit 460, configured to output the trained student neural network model if the target loss value meets the preset suspension condition.
Further, the preset student neural network model comprises a first feature extraction structure, a second feature extraction structure and a third feature extraction structure;
the first processing unit 420 is specifically configured to:
inputting the training sample set into a first feature extraction structure for processing to obtain second feature information, and calculating a first auxiliary loss value according to the second feature information;
inputting the second characteristic information into a second characteristic extraction structure for processing to obtain third characteristic information, and calculating a second auxiliary loss value according to the third characteristic information;
and inputting the third feature information into a third feature extraction structure for processing to obtain first feature information, and calculating a third auxiliary loss value according to the first feature information.
Further, the first feature extraction structure comprises a convolution layer;
the first processing unit 420 is further specifically configured to:
inputting the training sample set into the convolutional layer for feature extraction to obtain second feature information;
and calculating a first auxiliary loss value according to the second characteristic information and preset ideal data.
Further, the first feature extraction structure further comprises a full connection layer and an output layer; the activation function of the output layer is a soft-max function;
the first processing unit 420 is further specifically configured to:
inputting the second feature information into the full-connection layer for processing to obtain a feature vector;
inputting the feature vectors into an output layer, and determining the probability that the classes of the training samples in the training sample set belong to target classes corresponding to preset ideal data according to the preset ideal data;
and calculating a first auxiliary loss value according to the probability and a preset cross entropy loss function.
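The soft-max output layer and the cross-entropy auxiliary loss described above can be sketched as follows. The feature-vector logits and target-class index in the usage are hypothetical examples; the fully connected layer that produces the logits is omitted.

```python
import math

# Sketch of the output layer with soft-max activation followed by a
# cross-entropy auxiliary loss against the target class of the
# preset ideal data.

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    probs = softmax(logits)
    # negative log of the probability assigned to the target class
    return -math.log(probs[target_index])
```

For instance, two equal logits yield probabilities of 0.5 each, and the cross-entropy against either class is ln 2.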
Further, the pre-set teacher neural network model includes a Resnet152 network.
Further, the second processing unit 430 is specifically configured to:
performing square error calculation on the first characteristic information and the fourth characteristic information to obtain a second loss value;
alternatively,
performing mean square error calculation on the first characteristic information and the fourth characteristic information to obtain a second loss value;
alternatively,
and performing absolute-error calculation on the first characteristic information and the fourth characteristic information to obtain a second loss value.
Further, the first determining unit 440 is specifically configured to:
and carrying out weighted summation on the first auxiliary loss value, the second auxiliary loss value, the third auxiliary loss value and the second loss value to obtain a target loss value.
Referring to fig. 5, fig. 5 is a schematic view of an object recognition device according to a fourth embodiment of the present application. The units are included for performing the steps in the corresponding embodiment of fig. 3. Please refer to the related description of the embodiment in fig. 3. For convenience of explanation, only the portions related to the present embodiment are shown. Referring to fig. 5, the object recognition device 5 includes:
an obtaining unit 510, configured to obtain a picture of an object;
the processing unit 520 is configured to input the object picture into a preset object recognition model for processing, so as to obtain an object recognition result corresponding to the object picture; wherein the object recognition model is obtained by a training method of the neural network model of any one of claims 1 to 7.
Fig. 6 is a schematic diagram of a training apparatus for a neural network model provided in a fifth embodiment of the present application. As shown in fig. 6, the training device 6 of the neural network model of this embodiment includes: a processor 60, a memory 61, and a computer program 62, such as a training program for a neural network model, stored in the memory 61 and executable on the processor 60. The processor 60, when executing the computer program 62, implements the steps in the above-described embodiments of the training method for the neural network model, such as steps S101 to S106 shown in fig. 1. Alternatively, the processor 60, when executing the computer program 62, implements the functions of the modules/units in the above-mentioned device embodiments, such as the functions of the modules 410 to 460 shown in fig. 4.
Illustratively, the computer program 62 may be partitioned into one or more modules/units that are stored in the memory 61 and executed by the processor 60 to accomplish the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions for describing the execution of the computer program 62 in the training device 6 of the neural network model. For example, the computer program 62 may be divided into a first acquiring unit, a first processing unit, a second processing unit, a first determining unit, a third processing unit, and a fourth processing unit, and each unit has the following specific functions:
a first obtaining unit, configured to obtain a training sample set;
the first processing unit is used for inputting the training sample set into a preset student neural network model for processing to obtain first characteristic information, and calculating a first loss value according to the first characteristic information and a first loss function;
the second processing unit is used for inputting the training sample set into a preset teacher neural network model for processing to obtain fourth characteristic information, and calculating a second loss value according to the first characteristic information, the fourth characteristic information and a second loss function;
a first determining unit for determining a target loss value according to the first loss value and the second loss value;
the third processing unit is used for updating the preset student neural network according to the target loss value if the target loss value does not meet the preset suspension condition, and returning to execute the process of inputting the training sample set into the preset student neural network model for processing;
and the fourth processing unit is used for outputting the trained student neural network model if the target loss value meets the preset suspension condition.
The training device of the neural network model may include, but is not limited to, the processor 60 and the memory 61. It will be appreciated by those skilled in the art that fig. 6 is merely an example of the training device 6 of the neural network model and does not constitute a limitation of the training device 6; the device may include more or fewer components than shown, a combination of certain components, or different components. For example, the training device of the neural network model may also include input-output devices, network access devices, buses, and the like.
The Processor 60 may be a Central Processing Unit (CPU), another general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 61 may be an internal storage unit of the training device 6 of the neural network model, for example, a hard disk or a memory of the training device 6 of the neural network model. The memory 61 may also be an external storage device of the training device 6 of the neural network model, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like, which is equipped on the training device 6 of the neural network model. Further, the training device 6 of the neural network model may also comprise both an internal memory unit and an external memory device of the training device 6 of the neural network model. The memory 61 is used for storing the computer program and other programs and data required by the training device of the neural network model. The memory 61 may also be used to temporarily store data that has been output or is to be output.
Fig. 7 is a schematic diagram of an object recognition apparatus according to a sixth embodiment of the present application. As shown in fig. 7, the object recognition apparatus 7 of this embodiment includes: a processor 70, a memory 71, and a computer program 72, such as an object identification program, stored in the memory 71 and executable on the processor 70. The processor 70, when executing the computer program 72, implements the steps in the above-described embodiments of the object recognition method, such as steps S201 to S202 shown in fig. 3. Alternatively, the processor 70, when executing the computer program 72, implements the functions of the modules/units in the above-mentioned device embodiments, such as the functions of the modules 510 to 520 shown in fig. 5.
Illustratively, the computer program 72 may be partitioned into one or more modules/units that are stored in the memory 71 and executed by the processor 70 to accomplish the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 72 in the object identification device 7. For example, the computer program 72 may be divided into an acquisition unit and a processing unit, and the specific functions of each unit are as follows:
an acquisition unit for acquiring a picture of an object;
the processing unit is used for inputting the object picture into a preset object recognition model for processing to obtain an object recognition result corresponding to the object picture; wherein the object recognition model is obtained by a training method of the neural network model of any one of claims 1 to 7.
The object recognition device may include, but is not limited to, the processor 70 and the memory 71. It will be appreciated by those skilled in the art that fig. 7 is merely an example of the object recognition device 7 and does not constitute a limitation of the object recognition device 7; the device may include more or fewer components than shown, a combination of certain components, or different components. For example, the object recognition device may also include an input-output device, a network access device, a bus, and the like.
The Processor 70 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 71 may be an internal storage unit of the object recognition device 7, such as a hard disk or a memory of the object recognition device 7. The memory 71 may also be an external storage device of the object recognition device 7, such as a plug-in hard disk provided on the object recognition device 7, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like. Further, the object recognition device 7 may also comprise both an internal storage unit and an external storage device of the object recognition device 7. The memory 71 is used for storing the computer program and other programs and data required by the object identification device. The memory 71 may also be used to temporarily store data that has been output or is to be output.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and specific reference may be made to the part of the embodiment of the method, which is not described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
An embodiment of the present application further provides a terminal device, which includes: at least one processor, a memory, and a computer program stored in the memory and executable on the at least one processor, the processor implementing the steps of any of the various method embodiments described above when executing the computer program.
The embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps in the above-mentioned method embodiments.
The embodiments of the present application provide a computer program product, which when running on a mobile terminal, enables the mobile terminal to implement the steps in the above method embodiments when executed.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and can implement the steps of the embodiments of the methods described above when the computer program is executed by a processor. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer readable medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal apparatus, a recording medium, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium. Such as a usb-disk, a removable hard disk, a magnetic or optical disk, etc. In certain jurisdictions, computer-readable media may not be an electrical carrier signal or a telecommunications signal in accordance with legislative and patent practice.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/device and method may be implemented in other ways. For example, the above-described apparatus/device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A training method of a neural network model is characterized by comprising the following steps:
acquiring a training sample set;
inputting the training sample set into a preset student neural network model for processing to obtain first characteristic information, and calculating a first loss value according to the first characteristic information and a first loss function;
inputting the training sample set into a preset teacher neural network model for processing to obtain fourth characteristic information, and calculating a second loss value according to the first characteristic information, the fourth characteristic information and a second loss function;
determining a target loss value according to the first loss value and the second loss value;
if the target loss value does not meet the preset suspension condition, updating the preset student neural network according to the target loss value, and returning to execute the process of inputting the training sample set into a preset student neural network model for processing;
and if the target loss value meets the preset suspension condition, outputting the trained student neural network model.
2. The training method of the neural network model according to claim 1, wherein the preset student neural network model comprises a first feature extraction structure, a second feature extraction structure and a third feature extraction structure;
the inputting the training sample set into a preset student neural network model for processing to obtain first characteristic information, and calculating a first loss value according to the first characteristic information and a first loss function, including:
inputting the training sample set into the first feature extraction structure for processing to obtain second feature information, and calculating a first auxiliary loss value according to the second feature information;
inputting the second feature information into the second feature extraction structure for processing to obtain third feature information, and calculating a second auxiliary loss value according to the third feature information;
and inputting the third feature information into the third feature extraction structure for processing to obtain first feature information, and calculating a third auxiliary loss value according to the first feature information.
3. The method of training a neural network model of claim 2, wherein the first feature extraction structure comprises convolutional layers;
the inputting the training sample set into the first feature extraction structure for processing to obtain second feature information, and calculating a first auxiliary loss value according to the second feature information, including:
inputting the training sample set into the convolutional layer for feature extraction to obtain second feature information;
and calculating a first auxiliary loss value according to the second characteristic information and preset ideal data.
4. A method of training a neural network model according to claim 3, wherein the first feature extraction structure further comprises a fully connected layer and an output layer; the activation function of the output layer is a soft-max function;
the calculating a first auxiliary loss value according to the second feature information and preset ideal data comprises:
inputting the second feature information into the fully connected layer for processing to obtain feature vectors;
inputting the feature vectors into the output layer, and determining, according to the preset ideal data, the probability that each training sample in the training sample set belongs to the target class corresponding to the preset ideal data;
and calculating a first auxiliary loss value according to the probability and a preset cross entropy loss function.
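A minimal sketch of the claim 4 computation (fully connected features, softmax probabilities, cross-entropy against the target classes). The function names, and the use of integer class labels as the "preset ideal data", are illustrative assumptions:

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability before exponentiating.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def first_aux_loss(feature_vectors, target_classes):
    """feature_vectors: (N, C) outputs of the fully connected layer;
    target_classes: (N,) integer labels derived from the preset ideal data."""
    probs = softmax(feature_vectors)
    # Probability assigned to each sample's own target class.
    picked = probs[np.arange(len(target_classes)), target_classes]
    # Cross-entropy averaged over the batch; epsilon guards against log(0).
    return float(-np.mean(np.log(picked + 1e-12)))

# With uniform logits over 4 classes the loss is close to ln(4) ≈ 1.386.
loss = first_aux_loss(np.zeros((2, 4)), np.array([0, 1]))
```

The same shape of computation would yield the second and third auxiliary loss values at the deeper feature extraction stages.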
5. The training method of the neural network model according to claim 1, wherein the preset teacher neural network model comprises a ResNet152 network.
6. The training method of the neural network model according to claim 1, wherein the calculating a second loss value according to the first feature information, the fourth feature information and a second loss function comprises:
performing a squared-error calculation on the first feature information and the fourth feature information to obtain a second loss value;
or,
performing a mean-squared-error calculation on the first feature information and the fourth feature information to obtain a second loss value;
or,
performing an absolute-error calculation on the first feature information and the fourth feature information to obtain a second loss value.
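The three alternatives of claim 6 correspond to familiar distillation losses between student and teacher features: sum of squared errors, mean squared error, and absolute (L1) error. A sketch, with the exact reductions (sum versus mean) assumed:

```python
import numpy as np

def squared_error_loss(student_feat, teacher_feat):
    # Sum of element-wise squared differences.
    return float(np.sum((student_feat - teacher_feat) ** 2))

def mse_loss(student_feat, teacher_feat):
    # Mean of element-wise squared differences.
    return float(np.mean((student_feat - teacher_feat) ** 2))

def absolute_error_loss(student_feat, teacher_feat):
    # Mean of element-wise absolute differences (L1).
    return float(np.mean(np.abs(student_feat - teacher_feat)))

s = np.array([1.0, 2.0])   # first feature information (student output)
t = np.array([0.0, 0.0])   # fourth feature information (teacher output)
```

Any of the three produces a scalar second loss value that pulls the student's features toward the teacher's.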
7. The training method of the neural network model according to claim 2, wherein the determining a target loss value according to the first loss value and the second loss value comprises:
performing a weighted summation of the first auxiliary loss value, the second auxiliary loss value, the third auxiliary loss value and the second loss value to obtain a target loss value.
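Claim 7's target loss is a weighted summation of the three auxiliary loss values and the teacher-student (second) loss value. The weight values below are hypothetical; the claim does not specify them:

```python
def target_loss(aux1, aux2, aux3, second_loss,
                weights=(0.25, 0.25, 0.5, 1.0)):
    """Weighted summation of the first, second and third auxiliary loss
    values and the second (distillation) loss value. The weights are
    illustrative defaults, not values taken from the patent."""
    w1, w2, w3, w4 = weights
    return w1 * aux1 + w2 * aux2 + w3 * aux3 + w4 * second_loss

total = target_loss(1.0, 1.0, 1.0, 2.0)
```

During training, this scalar would be checked against the preset suspension condition to decide between updating the student network and outputting the trained model.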
8. An object recognition method, comprising:
acquiring an object picture;
inputting the object picture into a preset object recognition model for processing to obtain an object recognition result corresponding to the object picture; wherein the object recognition model is obtained by a training method of the neural network model according to any one of claims 1 to 7.
9. A training apparatus for a neural network model, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the training method for a neural network model according to any one of claims 1 to 7 when executing the computer program.
10. An apparatus for object recognition comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the method for object recognition according to claim 8.
CN202011602983.8A 2020-12-29 2020-12-29 Training method and device of neural network model Pending CN112668716A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011602983.8A CN112668716A (en) 2020-12-29 2020-12-29 Training method and device of neural network model

Publications (1)

Publication Number Publication Date
CN112668716A 2021-04-16

Family

ID=75410552

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011602983.8A Pending CN112668716A (en) 2020-12-29 2020-12-29 Training method and device of neural network model

Country Status (1)

Country Link
CN (1) CN112668716A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180268292A1 (en) * 2017-03-17 2018-09-20 Nec Laboratories America, Inc. Learning efficient object detection models with knowledge distillation
CN110163344A (en) * 2019-04-26 2019-08-23 北京迈格威科技有限公司 Neural network training method, device, equipment and storage medium
CN110880036A (en) * 2019-11-20 2020-03-13 腾讯科技(深圳)有限公司 Neural network compression method and device, computer equipment and storage medium
CN111681059A (en) * 2020-08-14 2020-09-18 支付宝(杭州)信息技术有限公司 Training method and device of behavior prediction model
CN111950638A (en) * 2020-08-14 2020-11-17 厦门美图之家科技有限公司 Image classification method and device based on model distillation and electronic equipment


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GUO Junlun et al., "Knowledge Distillation Method Based on Feature Reconstruction", Modern Computer (《现代计算机》), vol. 29, no. 7, pages 43-47 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516227A (en) * 2021-06-08 2021-10-19 华为技术有限公司 Neural network training method and device based on federal learning
CN113850012A (en) * 2021-06-11 2021-12-28 腾讯科技(深圳)有限公司 Data processing model generation method, device, medium and electronic equipment
CN113850012B (en) * 2021-06-11 2024-05-07 腾讯科技(深圳)有限公司 Data processing model generation method, device, medium and electronic equipment
CN113344415A (en) * 2021-06-23 2021-09-03 中国平安财产保险股份有限公司 Deep neural network-based service distribution method, device, equipment and medium
CN113723471B (en) * 2021-08-09 2024-05-07 北京工业大学 Nanoparticle concentration and particle size estimation method and device
CN113723471A (en) * 2021-08-09 2021-11-30 北京工业大学 Method and device for estimating concentration and particle size of nano-particles
CN114359993A (en) * 2021-09-29 2022-04-15 北京百度网讯科技有限公司 Model training method, face recognition device, face recognition equipment, face recognition medium and product
CN114417717A (en) * 2022-01-17 2022-04-29 上海季丰电子股份有限公司 Simulation method and device of printed circuit board
CN114417717B (en) * 2022-01-17 2022-12-09 上海季丰电子股份有限公司 Simulation method and device of printed circuit board
CN114700957A (en) * 2022-05-26 2022-07-05 北京云迹科技股份有限公司 Robot control method and device with low computational power requirement of model
CN114700957B (en) * 2022-05-26 2022-08-26 北京云迹科技股份有限公司 Robot control method and device with low computational power requirement of model
CN117971399A (en) * 2024-03-28 2024-05-03 广东琴智科技研究院有限公司 Container number adjusting method, intelligent computing cloud operating system and computing platform
CN117971399B (en) * 2024-03-28 2024-06-28 广东琴智科技研究院有限公司 Container number adjusting method, intelligent computing cloud operating system and computing platform

Similar Documents

Publication Publication Date Title
CN112668716A (en) Training method and device of neural network model
CN110020620B (en) Face recognition method, device and equipment under large posture
CN111950638B (en) Image classification method and device based on model distillation and electronic equipment
CN109522942B (en) Image classification method and device, terminal equipment and storage medium
CN111860398B (en) Remote sensing image target detection method and system and terminal equipment
CN109902548B (en) Object attribute identification method and device, computing equipment and system
CN110321761B (en) Behavior identification method, terminal equipment and computer readable storage medium
CN112183491A (en) Expression recognition model, training method, recognition method, device and computing equipment
CN114925320B (en) Data processing method and related device
CN111738403A (en) Neural network optimization method and related equipment
CN114170484B (en) Picture attribute prediction method and device, electronic equipment and storage medium
CN115222061A (en) Federal learning method based on continuous learning and related equipment
CN107729885B (en) Face enhancement method based on multiple residual error learning
CN112037174B (en) Chromosome abnormality detection method, chromosome abnormality detection device, chromosome abnormality detection apparatus, and computer-readable storage medium
CN116612500B (en) Pedestrian re-recognition model training method and device
CN113627421A (en) Image processing method, model training method and related equipment
CN111611917A (en) Model training method, feature point detection device, feature point detection equipment and storage medium
CN110633630A (en) Behavior identification method and device and terminal equipment
CN115713669A (en) Image classification method and device based on inter-class relation, storage medium and terminal
CN113920511A (en) License plate recognition method, model training method, electronic device and readable storage medium
CN111898465B (en) Method and device for acquiring face recognition model
CN112464958A (en) Multi-modal neural network information processing method and device, electronic equipment and medium
CN113936141B (en) Image semantic segmentation method and computer-readable storage medium
CN118015404A (en) Visual task processing method and device, readable storage medium and terminal equipment
CN114359547A (en) Image recognition model training method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination