
CN109766872B - Image recognition method and device

Info

Publication number: CN109766872B
Application number: CN201910101257.9A
Authority: CN
Inventors: 张玉兵
Original assignee: Guangzhou Shiyuan Electronics Thecnology Co Ltd
Current assignee: Guangzhou Shiyuan Electronics Thecnology Co Ltd
Priority date: 2019-01-31
Filing date: 2019-01-31
Publication date: 2021-07-09
Anticipated expiration: 2039-01-31
Also published as: CN109766872A; WO2020155939A1

Abstract

The invention discloses an image recognition method and device. The method comprises the following steps: acquiring an image to be recognized; acquiring a pre-established image recognition model, wherein the image recognition model is obtained by training an initial model with a plurality of training sets, the initial model is a recognition model established based on a branch training algorithm, each training set is extracted from a single data set, and different training sets are extracted from different data sets; and recognizing the image to be recognized by using the image recognition model to obtain a recognition result. The invention solves the technical problem of low recognition accuracy of image recognition methods in the prior art.

Description

Image recognition method and device

Technical Field

The invention relates to the field of image recognition, in particular to an image recognition method and device.

Background

In the existing field of image recognition, and especially in the mainstream field of face recognition, recognition is mainly performed by an image recognition model obtained by training a deep learning model, so the quality of that training strongly affects recognition accuracy. Throughout the training process, the data set used for training is the most important factor and decisively influences the final performance of the deep learning model.

Currently, deep learning models are generally trained on a single training data set. In the field of face recognition, for example, the training data set may be face data acquired in a certain scene or a public face database downloaded from the internet. Different data sets may cover the same person, and because naming rules are not uniform, it is difficult to merge the face pictures of the same person according to their file names. Face recognition classification training requires that all face pictures of the same person share the same label class number, so several face data sets whose people may overlap cannot be used at the same time. A deep learning model trained on only a single training data set has low image recognition accuracy and cannot meet the requirements of different application scenarios.

No effective solution has yet been proposed for the problem of low recognition accuracy of image recognition methods in the prior art.

Disclosure of Invention

The embodiment of the invention provides an image identification method and device, which at least solve the technical problem of low identification accuracy of the image identification method in the prior art.

According to an aspect of an embodiment of the present invention, there is provided an image recognition method including: acquiring an image to be identified; acquiring a pre-established image recognition model, wherein the image recognition model is obtained by training an initial model through a plurality of training sets, the initial model is a recognition model established based on a branch training algorithm, the same training set is extracted from the same data set, and different training sets are extracted from different data sets; and identifying the image to be identified by using the image identification model to obtain an identification result.

Further, the method further comprises: acquiring a plurality of data sets; classifying each image in the plurality of data sets to obtain a label of each image, wherein the label is used for representing the classification result of each image, and the labels of at least two images in the plurality of data sets are the same; and extracting sample images from each classified data set to obtain a plurality of training sets.

Further, before extracting the sample image from each classified data set to obtain a plurality of training sets, the method further includes: extracting preset features of each image in each classified data set; performing alignment operation on each image based on the preset characteristics of each image; and extracting a sample image from each operated data set to obtain a plurality of training sets.

Further, in the case that each image is a face image, the preset features at least include one of the following: eyes, eyebrows, nose tip, and corners of the mouth.

Further, extracting a sample image from each of the operated data sets to obtain a plurality of training sets, including: randomly extracting a sample image from each data set after the operation; and acquiring a storage path and a label of the sample image to obtain a plurality of training sets.

Further, acquiring a plurality of data sets includes: acquiring a video image captured by an acquisition device, and a preset data set; and performing detection on the video image and the preset data set to obtain the plurality of data sets.

Further, the method further comprises: establishing an initial model based on a branch training algorithm, wherein the initial model at least comprises: a plurality of loss functions, the plurality of loss functions corresponding to the plurality of training sets one to one; inputting a plurality of training sets into an initial model in parallel, and training the initial model; judging whether the trained model meets a preset condition or not; and if the model obtained by training meets the preset condition, determining the model obtained by training as an image recognition model.

Further, inputting a plurality of training sets into the initial model in parallel, and training the initial model, including: inputting a plurality of training sets into the initial model in parallel to obtain function values of a plurality of loss functions; obtaining a gradient value of each parameter in the initial model according to the function values of the plurality of loss functions and the chain rule of differentiation; and updating each parameter from its gradient value according to a stochastic gradient descent algorithm to obtain a trained model.

Further, judging whether the trained model meets a preset condition or not includes: acquiring a verification set; verifying the trained model by using a verification set to obtain the precision of the trained model; judging whether the precision of the model obtained by training is the same as the historical precision, wherein the historical precision is the precision of the model obtained by training in the last verification process; and if the precision of the model obtained by training is the same as the historical precision, determining that the model obtained by training meets the preset condition.

Further, if the precision of the trained model is different from the historical precision, the precision of the trained model is determined to be the historical precision, and the initial model is continuously trained.

Further, the precision is used for characterizing the proportion of the sum of the verification results of all the verification samples in the verification set to the total number of all the verification samples.

Further, obtaining a validation set, comprising: acquiring other images except for the sample images in the plurality of data sets; and randomly extracting image verification pairs from other images to obtain a verification set.

Further, the image verification pair includes: a positive sample pair containing two images with the same label and a negative sample pair containing two images with different labels.

Further, the loss function is a squared loss function.

According to another aspect of the embodiments of the present invention, there is also provided an image recognition apparatus including: the first acquisition module is used for acquiring an image to be identified; the second acquisition module is used for acquiring a pre-established image recognition model, wherein the image recognition model is obtained by training an initial model through a plurality of training sets, the initial model is a recognition model established based on a branch training algorithm, the same training set is extracted from the same data set, and different training sets are extracted from different data sets; and the identification module is used for identifying the image to be identified by using the image identification model to obtain an identification result.

According to another aspect of the embodiments of the present invention, there is also provided a storage medium including a stored program, wherein when the program runs, an apparatus where the storage medium is located is controlled to execute the above-mentioned image recognition method.

According to another aspect of the embodiments of the present invention, there is also provided a processor, configured to execute a program, where the program executes the image recognition method described above.

In the embodiment of the invention, an initial model can be established based on a branch training algorithm, the initial model is trained through a plurality of training sets generated by different data sets to obtain an image recognition model, and further an image to be recognized input by a user is recognized through the image recognition model to obtain a final recognition result. Compared with the prior art, the accuracy of the image recognition model combining the branch training of the multiple data sets is higher than that of the image recognition model trained based on a single data set, the technical effect of improving the recognition accuracy is achieved, and the technical problem that the recognition accuracy of the image recognition method in the prior art is low is solved.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:

FIG. 1 is a flow chart of an image recognition method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of an alternative face picture according to an embodiment of the invention;

FIG. 3 is a schematic diagram of an alternative aligned face picture according to an embodiment of the invention;

FIG. 4 is a schematic diagram of an alternative face recognition deep neural network model based on a single data set input, according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of an alternative face recognition deep neural network model based on multiple data set inputs, in accordance with embodiments of the present invention;

FIG. 6 is a flow diagram of an alternative image recognition method according to an embodiment of the present invention; and

FIG. 7 is a schematic diagram of an image recognition apparatus according to an embodiment of the present invention.

Detailed Description

In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Example 1

In accordance with an embodiment of the present invention, an embodiment of an image recognition method is provided. It should be noted that the steps illustrated in the flowchart of the accompanying drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in a different order.

Fig. 1 is a flowchart of an image recognition method according to an embodiment of the present invention, as shown in fig. 1, the method including the steps of:

Step S102, acquiring an image to be recognized.

Specifically, the image to be recognized may be any image on which recognition is required; in the embodiment of the present invention, a face image is taken as an example for detailed description.

Step S104, acquiring a pre-established image recognition model, wherein the image recognition model is obtained by training an initial model through a plurality of training sets, the initial model is a recognition model established based on a branch training algorithm, the same training set is extracted from the same data set, and different training sets are extracted from different data sets.

Specifically, in order to improve the image recognition accuracy, a plurality of training sets may be constructed in advance through a plurality of different data sets, and the initial model may be trained through the training sets, so as to obtain a final image recognition model.

In the field of face recognition, different data sets cannot simply be merged into a single data set, because different data sets may contain face pictures of the same person and a user cannot determine which persons appear in more than one data set. Instead, a deep neural network model can be established in combination with the branch training method to obtain the initial model, and the different data sets are trained separately in branches, so that a trained image recognition model is obtained and can be deployed in an application scene.

Step S106, recognizing the image to be recognized by using the image recognition model to obtain a recognition result.

Specifically, in the field of face recognition, the face recognition process can be performed by comparing face features, namely feat-IDs, using the Euclidean distance.
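As a non-limiting illustration, the Euclidean-distance comparison of face features might be sketched as follows. The gallery of registered features, the function names, and the decision threshold are assumptions introduced for this sketch only; the embodiment does not specify them.

```python
import math

def euclidean_distance(feat_a, feat_b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(feat_a) != len(feat_b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)))

def identify(query_feat, gallery, threshold):
    """Return the ID whose registered feature is closest to query_feat,
    or None when even the closest one is farther than `threshold`
    (a hypothetical tuning parameter, not taken from the embodiment)."""
    best_id, best_dist = None, float("inf")
    for person_id, feat in gallery.items():
        dist = euclidean_distance(query_feat, feat)
        if dist < best_dist:
            best_id, best_dist = person_id, dist
    return best_id if best_dist <= threshold else None

# Toy 3-dimensional features; a real model would output far longer vectors.
gallery = {"alice": [0.0, 1.0, 0.0], "bob": [1.0, 0.0, 0.0]}
result = identify([0.1, 0.9, 0.0], gallery, threshold=0.5)
```

In this sketch a query is rejected when no registered feature is close enough, which is one plausible way to handle faces of unregistered persons.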

In the embodiment of the application, an initial model can be established based on a branch training algorithm, the initial model is trained through a plurality of training sets generated by different data sets to obtain an image recognition model, and further an image to be recognized input by a user is recognized through the image recognition model to obtain a final recognition result. Compared with the prior art, the accuracy of the image recognition model combining the branch training of the multiple data sets is higher than that of the image recognition model trained based on a single data set, the technical effect of improving the recognition accuracy is achieved, and the technical problem that the recognition accuracy of the image recognition method in the prior art is low is solved.

Optionally, in the above embodiment of the present invention, the method further includes: acquiring a plurality of data sets; classifying each image in the plurality of data sets to obtain a label of each image, wherein the label is used for representing the classification result of each image, and the labels of at least two images in the plurality of data sets are the same; and extracting sample images from each classified data set to obtain a plurality of training sets.

Specifically, in the field of face recognition, in order to construct a plurality of training sets, face pictures in different application scenes can be obtained in advance to obtain a plurality of data sets. Because the public face data sets downloaded from the internet are generally labeled, for the data sets which are not labeled, face pictures can be manually detected and extracted, classification and labeling are carried out, the face pictures belonging to the same person are put together and labeled, and the label of each picture is obtained. Suppose the total number of people is N, and each person has M face pictures. A certain number of face pictures can be randomly extracted from each labeled data set to obtain each training set.

Optionally, in the foregoing embodiment of the present invention, before extracting the sample image from each classified data set to obtain a plurality of training sets, the method further includes: extracting preset features of each image in each classified data set; performing alignment operation on each image based on the preset characteristics of each image; and extracting a sample image from each operated data set to obtain a plurality of training sets.

Optionally, in a case that each image is a face image, the preset features at least include one of the following: eyes, eyebrows, nose tip, and corners of the mouth.

Specifically, in the field of face recognition, the face angle and the face position in a face picture are not consistent, and in order to ensure that stable features are extracted and obtain a good face recognition effect, the face picture needs to be aligned, so as to remove the influence of the face angle on the face recognition. The key points include the positions of the eyes, nose tip, mouth corners, etc., as shown in fig. 2. The aligned faces are shown in fig. 3.
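One possible way to realize such an alignment is a similarity transform (rotation, uniform scale, and translation) that moves the detected eye centers onto fixed template positions. The sketch below assumes that only the two eye key points are used and that the template coordinates are chosen by the implementer; neither assumption comes from the embodiment, which may use more key points.

```python
import math

def eye_alignment_transform(left_eye, right_eye, target_left, target_right):
    """Return (a, b, tx, ty) of the similarity transform
        x' = a*x - b*y + tx
        y' = b*x + a*y + ty
    that maps the detected eye centers onto the target positions."""
    sx, sy = right_eye[0] - left_eye[0], right_eye[1] - left_eye[1]
    dx, dy = target_right[0] - target_left[0], target_right[1] - target_left[1]
    src_len2 = sx * sx + sy * sy
    if src_len2 == 0:
        raise ValueError("eye centers must be distinct")
    # (a + i*b) is the complex ratio (dx + i*dy) / (sx + i*sy):
    # it encodes rotation and scale in one number.
    a = (dx * sx + dy * sy) / src_len2
    b = (dy * sx - dx * sy) / src_len2
    # Translation chosen so that the left eye lands exactly on its target.
    tx = target_left[0] - (a * left_eye[0] - b * left_eye[1])
    ty = target_left[1] - (b * left_eye[0] + a * left_eye[1])
    return a, b, tx, ty

def apply_transform(transform, point):
    """Apply the similarity transform to one (x, y) point."""
    a, b, tx, ty = transform
    x, y = point
    return (a * x - b * y + tx, b * x + a * y + ty)

# Detected eye centers of a tilted face (made-up pixel coordinates).
left_eye, right_eye = (30.0, 50.0), (70.0, 60.0)
# Hypothetical template: both eyes on one horizontal line.
target_left, target_right = (38.0, 52.0), (74.0, 52.0)

t = eye_alignment_transform(left_eye, right_eye, target_left, target_right)
rotation_degrees = math.degrees(math.atan2(t[1], t[0]))
```

The same transform would then be applied to every pixel (or to the remaining key points such as the nose tip and mouth corners) to produce the aligned picture.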

Optionally, in the foregoing embodiment of the present invention, extracting a sample image from each operated data set to obtain a plurality of training sets includes: randomly extracting a sample image from each data set after the operation; and acquiring a storage path and a label of the sample image to obtain a plurality of training sets.

Specifically, sample images can be randomly extracted from the face pictures that have been labeled and aligned, each containing both face identity information and verification information. Each extracted training sample has the form: face picture img_1, identity information (class number) of img_1.

The face picture img_1 refers to the storage path of the 1st face picture, and the class number refers to the label assigned to the person in advance; class numbers generally start from 0. Within the same data set, different labels are the numerical codes of different persons. For example, if there are 100 people in the first data set, the class numbers are 1-0, 1-1, 1-2, ..., 1-99; if the second data set or scene covers 50 persons, the class numbers are 2-0, 2-1, 2-2, ..., 2-49. The two groups of class numbers do not overlap, because they come from different data sets.
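The per-data-set labeling scheme described above can be illustrated with a short sketch. The file paths, the record layout (storage path, class number), and the sample sizes are hypothetical; only the "data set index, hyphen, class number" label form follows the example in the text.

```python
import random

def build_training_set(dataset_index, labeled_images, sample_size, seed=0):
    """Randomly draw `sample_size` records from one labeled data set.

    labeled_images: list of (storage_path, class_number) pairs, where
    class_number starts from 0 inside this data set.
    Returns (storage_path, "<dataset_index>-<class_number>") records, so
    labels from different data sets can never collide.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    picked = rng.sample(labeled_images, sample_size)
    return [(path, f"{dataset_index}-{cls}") for path, cls in picked]

# First data set: 100 persons with 3 pictures each (made-up paths).
dataset_1 = [(f"set1/img_{k}.jpg", k % 100) for k in range(300)]
# Second data set: 50 persons with 3 pictures each.
dataset_2 = [(f"set2/img_{k}.jpg", k % 50) for k in range(150)]

training_sets = [
    build_training_set(1, dataset_1, sample_size=200),
    build_training_set(2, dataset_2, sample_size=100),
]
```

Because each training set keeps its own label space, the same real person may appear under one label in the first set and another label in the second set without causing a labeling conflict.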

Optionally, in the above embodiment of the present invention, acquiring a plurality of data sets includes: acquiring a video image captured by an acquisition device, and a preset data set; and performing detection on the video image and the preset data set to obtain the plurality of data sets.

Specifically, in the field of face recognition, the capture device may be a camera installed in different application scenarios, the camera is used to capture Video pictures, and the Video pictures are stored in the computer system through network transmission and a data line, and the application scenarios may be usage scenarios corresponding to engineering projects, such as VTM (Video Teller Machine) verification in a bank, VIP identification in a jewelry store, and the like. The preset data set may be a public face data set downloaded from the internet.

The face data sets obtained in this way may cover the same person. For example, a customer photographed by cameras in both a bank and a jewelry shop may also have photos on the internet that were collected into a public face data set. Likewise, two public face data sets A and B on the internet may both contain face pictures of the same person.

For the video pictures collected by the camera, face detection is carried out on them, and the extracted face pictures are stored on a hard disk of the computer system.

Optionally, in the above embodiment of the present invention, the method further includes: establishing an initial model based on a branch training algorithm, wherein the initial model at least comprises: a plurality of loss functions, the plurality of loss functions corresponding to the plurality of training sets one to one; inputting a plurality of training sets into an initial model in parallel, and training the initial model; judging whether the trained model meets a preset condition or not; and if the model obtained by training meets the preset condition, determining the model obtained by training as an image recognition model.

It should be noted that a conventional image recognition model uses only one Softmax Loss function as the training target. The image recognition model based on a single data set input shown in fig. 4 includes only one classification loss function, where Loss = Softmax Loss 1.

Different data sets can be trained separately in branches while being input in parallel into the same image recognition model. After forward propagation extracts features from the aligned face pictures of the i-th data set, those features are connected to the corresponding loss function Softmax Loss i, which is optimized as an independent objective function. As shown in fig. 5, when a face picture from the i-th face data set is input into the initial model for branch training, the corresponding loss function is Loss = Softmax Loss i.
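The branch structure can be illustrated with a deliberately tiny model: one shared feature extractor and one classification head per data set, where a sample from the i-th data set only ever reaches head i. In this sketch a single linear layer stands in for the residual network of fig. 5, and the class name, layer shapes, and weights are all assumptions made for illustration.

```python
import math

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]

def softmax_cross_entropy(logits, label):
    """Softmax loss of one sample, computed in a numerically stable way."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

class BranchModel:
    """Shared backbone plus one softmax head per data set (toy version)."""

    def __init__(self, backbone, heads):
        self.backbone = backbone  # shared weights: feature_dim x input_dim
        self.heads = heads        # heads[i]: num_classes_i x feature_dim

    def features(self, x):
        # Forward propagation through the shared part only.
        return matvec(self.backbone, x)

    def branch_loss(self, i, x, label):
        # A sample from data set i is scored only by head i, so its label
        # is interpreted in the label space of data set i.
        logits = matvec(self.heads[i], self.features(x))
        return softmax_cross_entropy(logits, label)

model = BranchModel(
    backbone=[[1.0, 0.0], [0.0, 1.0]],
    heads=[
        [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],  # data set 0: 3 classes
        [[0.0, 0.0], [0.0, 0.0]],              # data set 1: 2 classes
    ],
)
loss_0 = model.branch_loss(0, [0.5, -0.5], label=2)
loss_1 = model.branch_loss(1, [0.5, -0.5], label=0)
```

With all-zero heads every class is equally likely, so each branch loss equals the logarithm of its own class count, which shows that the two label spaces stay independent even though the backbone is shared.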

It should be noted that the image recognition models shown in fig. 4 and fig. 5 are simplified schematic diagrams of a general residual network.

Optionally, the loss function is a squared loss function.

Specifically, in the field of face recognition, in order to perform a face recognition process using euclidean distance, the plurality of loss functions in the initial model may be square loss functions.

Further, the preset condition may be a training end judgment condition, when the model obtained through training meets the preset condition, it is determined that the training is ended, and finally the model obtained through training is the trained image recognition model.

Optionally, in the foregoing embodiment of the present invention, inputting a plurality of training sets into the initial model in parallel, and training the initial model includes: inputting a plurality of training sets into the initial model in parallel to obtain function values of a plurality of loss functions; obtaining a gradient value of each parameter in the initial model according to the function values of the plurality of loss functions and the chain rule of differentiation; and updating each parameter from its gradient value according to a stochastic gradient descent algorithm to obtain a trained model.

Specifically, after the plurality of training sets are input into the initial model in parallel, the function value Loss of each loss function is obtained through branch training. The gradient value of each parameter in the image recognition model shown in fig. 5 is then obtained from the Loss values by the chain rule of differentiation, and the model parameters are finally updated according to the stochastic gradient descent algorithm to obtain the trained model. Once the trained model meets the training end judgment condition, it can be determined to be the final image recognition model.
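A minimal sketch of one such training step is given below, again with a single shared linear layer standing in for the deep network. The gradients of the per-branch softmax losses are accumulated through the chain rule, and the shared layer receives the sum of the gradients from all branches. The learning rate, the toy data, and the function names are assumptions for illustration only.

```python
import math

def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_step(backbone, heads, batch, lr):
    """One gradient-descent step on a mini-batch mixing all branches.

    batch: list of (branch_index, input_vector, label).
    Updates backbone and heads in place; returns the summed loss
    measured before the update."""
    grad_backbone = [[0.0] * len(row) for row in backbone]
    grad_heads = [[[0.0] * len(row) for row in head] for head in heads]
    total_loss = 0.0
    for i, x, label in batch:
        feat = matvec(backbone, x)
        probs = softmax(matvec(heads[i], feat))
        total_loss += -math.log(probs[label])
        # d(loss)/d(logits) of the softmax loss: probabilities minus one-hot.
        dlogits = [p - (1.0 if c == label else 0.0) for c, p in enumerate(probs)]
        # Chain rule into head i (only this branch's head gets a gradient).
        for c, dz in enumerate(dlogits):
            for k, f in enumerate(feat):
                grad_heads[i][c][k] += dz * f
        # Chain rule through head i into the shared features and backbone.
        dfeat = [sum(dlogits[c] * heads[i][c][k] for c in range(len(dlogits)))
                 for k in range(len(feat))]
        for k, df in enumerate(dfeat):
            for j, xj in enumerate(x):
                grad_backbone[k][j] += df * xj
    # Gradient-descent update of every parameter.
    for k, row in enumerate(backbone):
        for j in range(len(row)):
            row[j] -= lr * grad_backbone[k][j]
    for i, head in enumerate(heads):
        for c, row in enumerate(head):
            for k in range(len(row)):
                row[k] -= lr * grad_heads[i][c][k]
    return total_loss

backbone = [[0.1, -0.2], [0.3, 0.4]]
heads = [
    [[0.1, 0.0], [0.0, 0.1], [-0.1, 0.1]],  # branch 0: 3 classes
    [[0.2, -0.1], [-0.2, 0.1]],             # branch 1: 2 classes
]
batch = [(0, [1.0, 0.0], 0), (0, [0.0, 1.0], 1),
         (1, [1.0, 1.0], 1), (1, [1.0, -1.0], 0)]
losses = [train_step(backbone, heads, batch, lr=0.1) for _ in range(50)]
```

For brevity the sketch reuses one fixed mini-batch; in stochastic gradient descent a fresh mini-batch would be drawn from each training set at every step.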

Optionally, in the above embodiment of the present invention, the determining whether the trained model meets the preset condition includes: acquiring a verification set; verifying the trained model by using a verification set to obtain the precision of the trained model; judging whether the precision of the model obtained by training is the same as the historical precision, wherein the historical precision is the precision of the model obtained by training in the last verification process; and if the precision of the model obtained by training is the same as the historical precision, determining that the model obtained by training meets the preset condition.

It should be noted that, during training of the image recognition model, the currently trained model may be tested on the verification set after every fixed number of iterations. As training proceeds, the precision of the trained model on the verification set keeps improving at first; however, when the model tends to converge or overfitting occurs, the precision on the verification set no longer improves steadily, which indicates that model training may be stopped.

Optionally, the precision is used to characterize the ratio of the sum of the validation results of all validation samples in the validation set to the total number of all validation samples.

Specifically, in the field of face recognition, a verification set is composed of randomly extracted face picture verification pairs. According to the rules of the international standard face verification test set LFW, the number of face picture verification pairs in the verification set is 6000. For a verification set containing 6000 face picture verification pairs, the test accuracy may be defined as:

$$\text{accuracy} = \frac{1}{6000} \sum_{i=1}^{6000} x_i$$

wherein x_i represents the verification result of the i-th face picture verification pair. If the recognition result of the model is the same as the actual label of the face picture verification pair, the verification is determined to be correct, namely x_i = 1; if the recognition result of the model is different from the actual label of the face picture verification pair, the verification is determined to be wrong, namely x_i = 0.

Further, the historical accuracy may be the accuracy of the trained model obtained when the trained model is verified last time. If the precision of the trained model is the same as the historical precision in the verification process, namely the precision of the trained model is not stably improved any more, the training can be determined to be finished, and the trained model is used as a final image recognition model.

Optionally, in the foregoing embodiment of the present invention, if the accuracy of the trained model is different from the historical accuracy, it is determined that the accuracy of the trained model is the historical accuracy, and the training of the initial model is continued.

In an optional scheme, if the precision of the trained model is different from the historical precision, that is, the trained model does not meet the preset condition, it is determined that the training is not finished and needs to continue, and the current precision is used as the historical precision in the next model verification process. Whether the precision of the trained model is the same as the historical precision is then judged again, so as to determine whether the trained model meets the preset condition.
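The stopping rule can be sketched as a simple loop that remembers the precision from the previous verification and stops once the new value matches it. The callback names, the `max_rounds` safety limit, and the optional `tolerance` are assumptions added for the sketch; with `tolerance=0.0` the comparison is the exact equality described in the text.

```python
def train_until_converged(train_one_round, evaluate, max_rounds=1000, tolerance=0.0):
    """Alternate training and verification until precision stops changing.

    train_one_round: callable that trains for a fixed number of iterations.
    evaluate: callable returning the precision on the verification set.
    Returns (rounds_used, final_precision)."""
    historical = None  # precision obtained in the previous verification
    for round_index in range(1, max_rounds + 1):
        train_one_round()
        precision = evaluate()
        if historical is not None and abs(precision - historical) <= tolerance:
            return round_index, precision  # preset condition met
        historical = precision  # becomes the historical precision next time
    return max_rounds, historical

# Simulated verification results standing in for a real model.
simulated = iter([0.80, 0.90, 0.95, 0.95, 0.97])
rounds_used, final_precision = train_until_converged(
    train_one_round=lambda: None,
    evaluate=lambda: next(simulated),
)
```

In practice a small non-zero tolerance may be preferable, since exact equality of two measured precisions is a strict condition.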

Optionally, in the foregoing embodiment of the present invention, acquiring the verification set includes: acquiring other images except for the sample images in the plurality of data sets; and randomly extracting image verification pairs from other images to obtain a verification set.

Optionally, the image verification pair comprises: a positive sample pair containing two images with the same label and a negative sample pair containing two images with different labels.

Specifically, in the field of face recognition, if the face pictures of K persons are used to make the training sets, the face pictures of the remaining N-K persons may be used to make the verification set. The verification set is composed of randomly extracted face picture verification pairs, with equal numbers of positive and negative sample pairs; for a verification set containing 6000 face picture verification pairs, 3000 positive sample pairs and 3000 negative sample pairs are selected. A positive sample pair consists of picture a and picture b of the n-th person; a negative sample pair consists of picture c of the i-th person and picture d of the j-th person. If the image recognition model judges the two face pictures in a positive sample pair to be the same person, the verification result is correct; if it judges the two face pictures in a negative sample pair not to be the same person, the verification result is also correct; otherwise, the verification result is wrong.
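A sketch of how such a verification set might be drawn and scored follows. The directory-style image paths, the function names, and the stand-in predictors are hypothetical; pairs are drawn with replacement, so duplicates are possible in this simplified version.

```python
import random

def build_verification_pairs(images_by_person, pairs_per_kind, seed=0):
    """Draw equal numbers of positive and negative verification pairs.

    images_by_person: {label: [image paths]} for persons not used in training.
    Returns a list of (image_a, image_b, is_same_person)."""
    rng = random.Random(seed)
    people = list(images_by_person)
    with_two = [p for p in people if len(images_by_person[p]) >= 2]
    pairs = []
    for _ in range(pairs_per_kind):
        # Positive pair: two different pictures of the same person.
        person = rng.choice(with_two)
        img_a, img_b = rng.sample(images_by_person[person], 2)
        pairs.append((img_a, img_b, True))
    for _ in range(pairs_per_kind):
        # Negative pair: one picture each from two different persons.
        p_i, p_j = rng.sample(people, 2)
        pairs.append((rng.choice(images_by_person[p_i]),
                      rng.choice(images_by_person[p_j]), False))
    return pairs

def accuracy(pairs, predict_same):
    """Fraction of pairs whose predicted relation matches the true one."""
    results = [1 if predict_same(a, b) == is_same else 0 for a, b, is_same in pairs]
    return sum(results) / len(results)

images_by_person = {n: [f"person{n}/img_{k}.jpg" for k in range(4)] for n in range(10)}
pairs = build_verification_pairs(images_by_person, pairs_per_kind=3000)

# Stand-in predictors: one that reads the true identity from the path,
# and one that always answers "same person".
oracle = lambda a, b: a.split("/")[0] == b.split("/")[0]
always_same = lambda a, b: True
```

Because the set is balanced, a predictor that always gives the same answer scores exactly one half, which makes the accuracy figure easy to interpret.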

Fig. 6 is a flowchart of an alternative image recognition method according to an embodiment of the present invention, described by taking the field of face recognition as an example. As shown in fig. 6, the method includes: collecting face pictures in a plurality of scenes; carrying out face detection on the collected pictures, and extracting the face pictures and storing them on a hard disk of the computer; manually classifying and labeling the detected and extracted face pictures, putting the face pictures belonging to the same person together and labeling them; carrying out a key point alignment operation on the face pictures to remove the influence of the face angle on face recognition; randomly extracting, from the labeled and aligned pictures, face pictures that contain both face identity information and verification information for training, namely extracting a face identity-verification training set; establishing a face recognition deep neural network model in combination with a branch training algorithm, wherein the model includes a plurality of loss functions; training the face recognition deep neural network model on the multiple data sets to obtain a trained network model; judging whether the test precision of the trained network model on the verification set is still improving, namely judging whether the training end condition has been reached; if the precision is still improving, continuing to train the model; if the precision is no longer improving, obtaining the face recognition algorithm network model and its model parameters; and deploying the trained face recognition algorithm network model in an application scene, and performing the face recognition process by comparing face features, namely feat-IDs, using the Euclidean distance.

The scheme provided by the embodiment can be used for bank VIP identification projects, face pictures are collected in real application scenes, and meanwhile, some public face data sets are downloaded from the Internet; then, the face pictures in the data sets are detected and aligned, and a corresponding face identity-verification training set is made; the face recognition algorithm model is trained by using the method, so that the face recognition algorithm with high recognition rate and recognition effect in a bank VIP recognition scene is obtained. The branch training face deep neural network model combining a plurality of data sets has higher accuracy than the face recognition algorithm of a universal deep learning network based on single data set training (including successive fine tuning on a plurality of data sets).

Example 2

According to an embodiment of the present invention, there is provided an embodiment of an image recognition apparatus.

Fig. 7 is a schematic diagram of an image recognition apparatus according to an embodiment of the present invention, as shown in fig. 7, the apparatus including:

a first obtaining module 72, configured to obtain an image to be identified;

a second obtaining module 74, configured to obtain a pre-established image recognition model, where the image recognition model is obtained by training an initial model with a plurality of training sets, the initial model is a recognition model established based on a branch training algorithm, each training set is extracted from a single data set, and different training sets are extracted from different data sets; and

an identification module 76, configured to identify the image to be identified by using the image recognition model to obtain an identification result.

Optionally, in the above embodiment of the present invention, the apparatus further includes: a third obtaining module for obtaining a plurality of data sets; the classification module is used for classifying each image in the plurality of data sets to obtain a label of each image, wherein the label is used for representing the classification result of each image, and the labels of at least two images in the plurality of data sets are the same; and the first extraction module is used for extracting sample images from each classified data set to obtain a plurality of training sets.

Optionally, in the above embodiment of the present invention, the apparatus further includes: the second extraction module is used for extracting preset characteristics of each image in each classified data set; the alignment module is used for carrying out alignment operation on each image based on the preset characteristics of each image; and the third extraction module is used for extracting a sample image from each operated data set to obtain a plurality of training sets.
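
By way of illustration only, one possible alignment operation is sketched below. It assumes that the preset features include the two eye centres and that alignment means applying a similarity transform (rotation, uniform scaling and translation) that moves those eye centres to fixed target positions; the target coordinates and function names are hypothetical, and the embodiment does not prescribe this particular transform.

```python
import math


def eye_alignment_transform(left_eye, right_eye,
                            target_left=(30.0, 40.0),
                            target_right=(70.0, 40.0)):
    """Return (scale, angle, tx, ty) of the similarity transform
    p -> scale * R(angle) * p + (tx, ty) that maps the detected eye
    centres onto the (assumed) target positions."""
    dx, dy = right_eye[0] - left_eye[0], right_eye[1] - left_eye[1]
    tdx = target_right[0] - target_left[0]
    tdy = target_right[1] - target_left[1]
    eye_dist = math.hypot(dx, dy)
    if eye_dist == 0.0:
        raise ValueError("eye centres must be distinct")
    scale = math.hypot(tdx, tdy) / eye_dist
    angle = math.atan2(tdy, tdx) - math.atan2(dy, dx)
    c, s = scale * math.cos(angle), scale * math.sin(angle)
    # Choose the translation so that the left eye lands on its target.
    tx = target_left[0] - (c * left_eye[0] - s * left_eye[1])
    ty = target_left[1] - (s * left_eye[0] + c * left_eye[1])
    return scale, angle, tx, ty


def apply_transform(point, transform):
    """Apply a (scale, angle, tx, ty) similarity transform to one point."""
    scale, angle, tx, ty = transform
    c, s = scale * math.cos(angle), scale * math.sin(angle)
    return (c * point[0] - s * point[1] + tx,
            s * point[0] + c * point[1] + ty)


# A tilted face: the right eye sits lower in the image than the left eye.
transform = eye_alignment_transform((40.0, 60.0), (80.0, 90.0))
aligned_left = apply_transform((40.0, 60.0), transform)
aligned_right = apply_transform((80.0, 90.0), transform)
```

The same transform would then be applied to every pixel coordinate of the image, so that all faces enter the model with the eyes at the same positions regardless of the original face angle.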

Optionally, in the foregoing embodiment of the present invention, the third extracting module includes: an extraction unit for randomly extracting a sample image from each data set after the operation; and the first acquisition unit is used for acquiring the storage path and the label of the sample image to obtain a plurality of training sets.
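
By way of illustration only, the random extraction described above can be sketched as follows. The sketch assumes that each data set is held in memory as a list of (storage path, label) tuples; the data set names, file paths, sample size and seed are hypothetical.

```python
import random


def build_training_set(dataset, sample_size, seed=None):
    """Randomly draw `sample_size` distinct images from one data set.

    `dataset` is a list of (storage_path, label) tuples (assumed layout).
    The returned training set records the storage path and the label of
    each drawn sample image rather than the pixel data itself.
    """
    rng = random.Random(seed)
    return rng.sample(dataset, sample_size)


def build_training_sets(datasets, sample_size, seed=0):
    """Build one training set per data set, so that training set k only
    ever contains images drawn from data set k."""
    return {name: build_training_set(images, sample_size, seed)
            for name, images in datasets.items()}


# Two hypothetical data sets from different application scenes.
datasets = {
    "bank_camera": [(f"bank/{i}.jpg", f"person_{i % 3}") for i in range(10)],
    "public_set": [(f"public/{i}.jpg", f"person_{i % 4}") for i in range(12)],
}
training_sets = build_training_sets(datasets, sample_size=5)
```

Storing only paths and labels keeps the training set small; the images themselves would be loaded from disk when a batch is assembled.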

Optionally, in the foregoing embodiment of the present invention, the third obtaining module includes: a second acquisition unit for acquiring a preset data set and video images captured by an acquisition device; and a detection unit for performing detection on the video images and the preset data set to obtain the plurality of data sets.

Optionally, in the above embodiment of the present invention, the apparatus further includes: the establishing module is used for establishing an initial model based on a branch training algorithm, wherein the initial model at least comprises: a plurality of loss functions, the plurality of loss functions corresponding to the plurality of training sets one to one; the training module is used for inputting a plurality of training sets into the initial model in parallel and training the initial model; the judging module is used for judging whether the trained model meets a preset condition or not; and the determining module is used for determining the model obtained by training as the image recognition model if the model obtained by training meets the preset condition.

Optionally, the loss function is a squared loss function.

Optionally, in the above embodiment of the present invention, the training module includes: an input unit for inputting the plurality of training sets into the initial model in parallel to obtain function values of the plurality of loss functions; a processing unit for obtaining a gradient value of each parameter in the initial model from the function values of the plurality of loss functions by the chain rule of differentiation; and an updating unit for updating each parameter according to its gradient value using a stochastic gradient descent algorithm to obtain a trained model.
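
By way of illustration only, one such training step can be sketched on a toy model. The sketch assumes a shared linear layer followed by one scalar output branch per training set, each branch having its own squared loss; this architecture, the learning rate and all names are assumptions made for the example and are not the network of the embodiment.

```python
def train_step(shared_w, branch_v, batches, lr=0.01):
    """One parallel update over all branches.

    shared_w : weights of the shared layer (updated in place)
    branch_v : one scalar weight per branch (updated in place)
    batches  : batches[k] = (x, target), a sample from training set k
    Returns the squared-loss value of each branch before the update.
    """
    grad_w = [0.0] * len(shared_w)
    grad_v = [0.0] * len(branch_v)
    losses = []
    for k, (x, target) in enumerate(batches):
        h = sum(w * xi for w, xi in zip(shared_w, x))  # shared layer output
        y = branch_v[k] * h                            # output of branch k
        err = y - target
        losses.append(0.5 * err ** 2)                  # squared loss of branch k
        # Chain rule: dL/dv_k = err * h, dL/dw_i = err * v_k * x_i.
        grad_v[k] += err * h
        for i, xi in enumerate(x):
            grad_w[i] += err * branch_v[k] * xi        # branches sum into shared weights
    # Gradient descent update of every parameter.
    for i in range(len(shared_w)):
        shared_w[i] -= lr * grad_w[i]
    for k in range(len(branch_v)):
        branch_v[k] -= lr * grad_v[k]
    return losses


shared_w = [0.5, -0.5]
branch_v = [1.0, 1.0]
# One sample per training set; a real step would use a random mini-batch.
batches = [([1.0, 2.0], 1.0), ([2.0, 1.0], -1.0)]
first_losses = train_step(shared_w, branch_v, batches, lr=0.05)
for _ in range(199):
    last_losses = train_step(shared_w, branch_v, batches, lr=0.05)
```

Because the gradients of all branch losses are accumulated into the shared weights before the update, every data set influences the shared parameters in each step, while each branch weight is driven only by its own training set.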

Optionally, in the foregoing embodiment of the present invention, the determining module includes: a third obtaining unit configured to obtain a verification set; the verification unit is used for verifying the trained model by using a verification set to obtain the precision of the trained model; the judging unit is used for judging whether the precision of the model obtained by training is the same as the historical precision, wherein the historical precision is the precision of the model obtained by training in the last verification process; and the determining unit is used for determining that the trained model meets the preset condition if the precision of the trained model is the same as the historical precision.

Optionally, in the above embodiment of the present invention, the training module is further configured to, if the precision of the trained model is different from the historical precision, take the precision of the trained model as the historical precision and continue training the initial model.
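
By way of illustration only, the judging and continued-training logic can be sketched as the following loop. The callables `train_one_round` and `evaluate`, the safety cap `max_rounds` and the tolerance `tol` are hypothetical; with the default `tol=0.0` the comparison is the exact "same as the historical precision" test described above, and a small positive tolerance is an assumed relaxation for floating-point precision values.

```python
def train_until_converged(train_one_round, evaluate, max_rounds=100, tol=0.0):
    """Train until the verification precision equals the historical one.

    train_one_round : callable that performs one round of training
    evaluate        : callable returning the precision on the verification set
    Returns (precision, rounds_run).
    """
    historical = None
    for round_idx in range(1, max_rounds + 1):
        train_one_round()
        precision = evaluate()
        if historical is not None and abs(precision - historical) <= tol:
            # Precision unchanged since the last verification: stop.
            return precision, round_idx
        # Otherwise the new precision becomes the historical precision.
        historical = precision
    return historical, max_rounds


# Simulated verification precisions standing in for a real model.
scores = iter([0.80, 0.85, 0.90, 0.90, 0.95])
final_precision, rounds = train_until_converged(lambda: None, lambda: next(scores))
```

In the simulated run the loop stops at the fourth round, the first round whose precision repeats the previous value.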

Optionally, in the foregoing embodiment of the present invention, the third obtaining unit is configured to obtain the images in the plurality of data sets other than the sample images, and to randomly extract image verification pairs from these other images to obtain the verification set.
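
By way of illustration only, the construction of the verification set and the computation of the precision can be sketched as follows. The sketch assumes that the held-out images are available as (storage path, label) tuples and that the verification result of a pair is 1 when the model judges the pair correctly and 0 otherwise; the function names and the `predict_same` callable are hypothetical.

```python
import random


def build_verification_pairs(held_out, n_pairs, seed=0):
    """Randomly draw image pairs from images not used for training.

    held_out : list of (storage_path, label) tuples (assumed layout)
    Returns a list of (path_a, path_b, is_same), where is_same is 1 for
    a positive pair (same label) and 0 for a negative pair.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        (path_a, label_a), (path_b, label_b) = rng.sample(held_out, 2)
        pairs.append((path_a, path_b, 1 if label_a == label_b else 0))
    return pairs


def precision(pairs, predict_same):
    """Sum of the verification results divided by the number of pairs."""
    if not pairs:
        raise ValueError("verification set is empty")
    correct = sum(1 for a, b, same in pairs if predict_same(a, b) == same)
    return correct / len(pairs)


held_out = [(f"img_{i}.jpg", f"person_{i % 3}") for i in range(9)]
pairs = build_verification_pairs(held_out, n_pairs=20)

# A stand-in predictor that looks the labels up; a real one would compare
# the feature vectors produced by the trained model.
label_of = dict(held_out)
oracle = lambda a, b: 1 if label_of[a] == label_of[b] else 0
oracle_precision = precision(pairs, oracle)
```

Purely random pairing yields far more negative pairs than positive pairs when there are many identities, so a practical implementation might sample the two kinds of pair separately to balance them.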

Example 3

According to an embodiment of the present invention, there is provided an embodiment of a storage medium including a stored program, wherein an apparatus in which the storage medium is located is controlled to execute the image recognition method in the above-described embodiment 1 when the program is executed.

Example 4

According to an embodiment of the present invention, an embodiment of a processor for running a program is provided, where the program executes the image recognition method in embodiment 1 described above.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and various other media capable of storing program code.

The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims

1. An image recognition method, comprising:

acquiring an image to be identified;

obtaining a pre-established image recognition model, wherein the image recognition model is obtained by training an initial model with a plurality of training sets, the initial model is a recognition model established based on a branch training algorithm, each training set is extracted from a single data set, different training sets are extracted from different data sets, the different data sets are data obtained in advance in different application scenes, and the initial model at least comprises: a plurality of loss functions in one-to-one correspondence with the plurality of training sets;

identifying the image to be identified by using the image identification model to obtain an identification result;

wherein the method further comprises:

inputting the training sets into the initial model in parallel, and training the initial model;

wherein the inputting the plurality of training sets into the initial model in parallel, and the training the initial model comprises:

inputting the training sets into the initial model in parallel to obtain function values of the loss functions;

obtaining a gradient value of each parameter in the initial model from the function values of the loss functions by the chain rule of differentiation;

and updating each parameter according to its gradient value using a stochastic gradient descent algorithm to obtain the trained model.

2. The method of claim 1, further comprising:

acquiring a plurality of data sets;

classifying each image in the plurality of data sets to obtain a label of each image, wherein the label is used for representing a classification result of each image, and the labels of at least two images in the plurality of data sets are the same;

and extracting sample images from each classified data set to obtain a plurality of training sets.

3. The method of claim 2, wherein prior to extracting the sample images from each classified data set, resulting in the plurality of training sets, the method further comprises:

extracting preset features of each image in each classified data set;

performing alignment operation on each image based on the preset characteristics of each image;

and extracting the sample image from each operated data set to obtain a plurality of training sets.

4. The method according to claim 3, wherein in the case that each image is a human face image, the preset features comprise at least one of: eyes, eyebrows, nose tip, and corners of the mouth.

5. The method of claim 3, wherein extracting the sample image from each operated data set to obtain the plurality of training sets comprises:

randomly extracting the sample image from each data set after the operation;

and acquiring a storage path and a label of the sample image to obtain the plurality of training sets.

6. The method of claim 2, wherein acquiring a plurality of data sets comprises:

acquiring a preset data set and video images captured by an acquisition device;

and detecting the video image and the preset data set to obtain the plurality of data sets.

7. The method of claim 2, further comprising:

establishing the initial model based on the branch training algorithm;

judging whether the trained model meets a preset condition or not;

and if the model obtained by training meets the preset condition, determining the model obtained by training as the image recognition model.

8. The method of claim 7, wherein determining whether the trained model satisfies a predetermined condition comprises:

acquiring a verification set;

verifying the model obtained by training by using the verification set to obtain the precision of the model obtained by training;

judging whether the precision of the model obtained by training is the same as the historical precision, wherein the historical precision is the precision of the model obtained by training in the last verification process;

and if the precision of the model obtained by training is the same as the historical precision, determining that the model obtained by training meets the preset condition.

9. The method of claim 8, wherein if the precision of the model obtained by training is different from the historical precision, the precision of the model obtained by training is taken as the historical precision, and training of the initial model continues.

10. The method of claim 9, wherein the precision is used to characterize a ratio of a sum of validation results of all validation samples in the validation set to a total number of all validation samples.

11. The method of claim 8, wherein obtaining a validation set comprises:

acquiring other images except for the sample images in the plurality of data sets;

and randomly extracting image verification pairs from the other images to obtain the verification set.

12. The method of claim 11, wherein the image verification pairs comprise: a positive sample pair and a negative sample pair, wherein the positive sample pair comprises two images with the same label, and the negative sample pair comprises two images with different labels.

13. The method of claim 7, wherein the loss function is a squared loss function.

14. An image recognition apparatus, comprising:

the first acquisition module is used for acquiring an image to be identified;

a second obtaining module, configured to obtain a pre-established image recognition model, where the image recognition model is obtained by training an initial model with a plurality of training sets, the initial model is a recognition model established based on a branch training algorithm, each training set is extracted from a single data set, different training sets are extracted from different data sets, the different data sets are data obtained in advance in different application scenes, and the initial model at least includes: a plurality of loss functions in one-to-one correspondence with the plurality of training sets;

the identification module is used for identifying the image to be identified by utilizing the image identification model to obtain an identification result;

wherein the apparatus further comprises:

the training module is used for inputting a plurality of training sets into the initial model in parallel and training the initial model;

wherein the training module comprises:

an input unit for inputting the plurality of training sets into the initial model in parallel to obtain function values of the plurality of loss functions; a processing unit for obtaining a gradient value of each parameter in the initial model from the function values of the plurality of loss functions by the chain rule of differentiation; and an updating unit for updating each parameter according to its gradient value using a stochastic gradient descent algorithm to obtain a trained model.

15. A storage medium, characterized in that the storage medium comprises a stored program, wherein when the program runs, a device where the storage medium is located is controlled to execute the image recognition method according to any one of claims 1 to 13.

16. A processor, characterized in that the processor is configured to run a program, wherein the program is configured to execute the image recognition method according to any one of claims 1 to 13 when running.