CN109766872B - Image recognition method and device - Google Patents

Info

Publication number
CN109766872B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910101257.9A
Other languages
Chinese (zh)
Other versions
CN109766872A (en)
Inventor
张玉兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Original Assignee
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Shiyuan Electronics Thecnology Co Ltd filed Critical Guangzhou Shiyuan Electronics Thecnology Co Ltd
Priority to CN201910101257.9A priority Critical patent/CN109766872B/en
Publication of CN109766872A publication Critical patent/CN109766872A/en
Priority to PCT/CN2019/127817 priority patent/WO2020155939A1/en
Application granted granted Critical
Publication of CN109766872B publication Critical patent/CN109766872B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image recognition method and device. The method comprises the following steps: acquiring an image to be recognized; acquiring a pre-established image recognition model, wherein the image recognition model is obtained by training an initial model with a plurality of training sets, the initial model is a recognition model established based on a branch training algorithm, each training set is extracted from a single data set, and different training sets are extracted from different data sets; and recognizing the image to be recognized by using the image recognition model to obtain a recognition result. The invention solves the technical problem of low recognition accuracy of image recognition methods in the prior art.

Description

Image recognition method and device
Technical Field
The invention relates to the field of image recognition, in particular to an image recognition method and device.
Background
In the existing field of image recognition, and especially in mainstream face recognition, recognition is mainly performed by an image recognition model obtained by training a deep learning model, and the quality of that training strongly affects recognition accuracy. Throughout the training of a deep learning model, the data set used for training is the most important factor and decisively influences the final performance of the model.
Currently, deep learning models are generally trained on a single training data set. For example, in the field of face recognition, the training data set may be face data acquired in a certain scene or a public face database downloaded from the internet. Different data sets may cover the same person, and because naming rules are not uniform, it is difficult to merge the face pictures of the same person according to their file names. Face recognition classification training requires that all face pictures of the same person share the same label class number, so a plurality of face data sets whose persons may overlap cannot be used simultaneously. A deep learning model trained on only a single training data set has low image recognition accuracy and cannot meet the requirements of different application scenarios.
No effective solution has yet been proposed for the problem of low recognition accuracy of image recognition methods in the prior art.
Disclosure of Invention
The embodiment of the invention provides an image identification method and device, which at least solve the technical problem of low identification accuracy of the image identification method in the prior art.
According to an aspect of an embodiment of the present invention, there is provided an image recognition method including: acquiring an image to be identified; acquiring a pre-established image recognition model, wherein the image recognition model is obtained by training an initial model through a plurality of training sets, the initial model is a recognition model established based on a branch training algorithm, the same training set is extracted from the same data set, and different training sets are extracted from different data sets; and identifying the image to be identified by using the image identification model to obtain an identification result.
Further, the method further comprises: acquiring a plurality of data sets; classifying each image in the plurality of data sets to obtain a label of each image, wherein the label is used for representing the classification result of each image, and the labels of at least two images in the plurality of data sets are the same; and extracting sample images from each classified data set to obtain a plurality of training sets.
Further, before extracting the sample image from each classified data set to obtain a plurality of training sets, the method further includes: extracting preset features of each image in each classified data set; performing alignment operation on each image based on the preset characteristics of each image; and extracting a sample image from each operated data set to obtain a plurality of training sets.
Further, in the case that each image is a face image, the preset features include at least one of the following: eyes, eyebrows, nose tip, and corners of the mouth.
Further, extracting a sample image from each of the operated data sets to obtain a plurality of training sets, including: randomly extracting a sample image from each data set after the operation; and acquiring a storage path and a label of the sample image to obtain a plurality of training sets.
Further, acquiring a plurality of data sets includes: acquiring a video image and a preset data set acquired by acquisition equipment; and detecting the video image and a preset data set to obtain a plurality of data sets.
Further, the method further comprises: establishing an initial model based on a branch training algorithm, wherein the initial model at least comprises: a plurality of loss functions, the plurality of loss functions corresponding to the plurality of training sets one to one; inputting a plurality of training sets into an initial model in parallel, and training the initial model; judging whether the trained model meets a preset condition or not; and if the model obtained by training meets the preset condition, determining the model obtained by training as an image recognition model.
Further, inputting a plurality of training sets into the initial model in parallel, and training the initial model, including: inputting a plurality of training sets into the initial model in parallel to obtain function values of a plurality of loss functions; obtaining a gradient value of each parameter in the initial model according to the function values of the plurality of loss functions and the chain rule of differentiation; and updating each parameter according to its gradient value by a stochastic gradient descent algorithm to obtain a trained model.
Further, judging whether the trained model meets a preset condition or not includes: acquiring a verification set; verifying the trained model by using a verification set to obtain the precision of the trained model; judging whether the precision of the model obtained by training is the same as the historical precision, wherein the historical precision is the precision of the model obtained by training in the last verification process; and if the precision of the model obtained by training is the same as the historical precision, determining that the model obtained by training meets the preset condition.
Further, if the precision of the trained model is different from the historical precision, the precision of the trained model is taken as the new historical precision, and training of the initial model continues.
Further, the precision is used for characterizing the proportion of the sum of the verification results of all the verification samples in the verification set to the total number of all the verification samples.
Further, obtaining a validation set, comprising: acquiring other images except for the sample images in the plurality of data sets; and randomly extracting image verification pairs from other images to obtain a verification set.
Further, the image verification pairs include: positive sample pairs, each containing two images with the same label, and negative sample pairs, each containing two images with different labels.
Further, the loss function is a squared loss function.
According to another aspect of the embodiments of the present invention, there is also provided an image recognition apparatus including: the first acquisition module is used for acquiring an image to be identified; the second acquisition module is used for acquiring a pre-established image recognition model, wherein the image recognition model is obtained by training an initial model through a plurality of training sets, the initial model is a recognition model established based on a branch training algorithm, the same training set is extracted from the same data set, and different training sets are extracted from different data sets; and the identification module is used for identifying the image to be identified by using the image identification model to obtain an identification result.
According to another aspect of the embodiments of the present invention, there is also provided a storage medium including a stored program, wherein when the program runs, an apparatus where the storage medium is located is controlled to execute the above-mentioned image recognition method.
According to another aspect of the embodiments of the present invention, there is also provided a processor, configured to execute a program, where the program executes the image recognition method described above.
In the embodiment of the invention, an initial model can be established based on a branch training algorithm, the initial model is trained through a plurality of training sets generated by different data sets to obtain an image recognition model, and further an image to be recognized input by a user is recognized through the image recognition model to obtain a final recognition result. Compared with the prior art, the accuracy of the image recognition model combining the branch training of the multiple data sets is higher than that of the image recognition model trained based on a single data set, the technical effect of improving the recognition accuracy is achieved, and the technical problem that the recognition accuracy of the image recognition method in the prior art is low is solved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart of an image recognition method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an alternative face picture according to an embodiment of the invention;
FIG. 3 is a schematic diagram of an alternative aligned face picture according to an embodiment of the invention;
FIG. 4 is a schematic diagram of an alternative face recognition deep neural network model based on a single data set input, according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an alternative face recognition deep neural network model based on multiple data set inputs, in accordance with embodiments of the present invention;
FIG. 6 is a flow diagram of an alternative image recognition method according to an embodiment of the present invention; and
FIG. 7 is a schematic diagram of an image recognition apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
In accordance with an embodiment of the present invention, an embodiment of an image recognition method is provided. It should be noted that the steps illustrated in the flowcharts of the accompanying drawings may be performed in a computer system, such as one executing a set of computer-executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in a different order.
FIG. 1 is a flowchart of an image recognition method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
Step S102: acquiring an image to be recognized.
Specifically, the image to be recognized may be an image that needs to be recognized, and in the embodiment of the present invention, a human face image is taken as an example for detailed description.
Step S104: acquiring a pre-established image recognition model, wherein the image recognition model is obtained by training an initial model with a plurality of training sets, the initial model is a recognition model established based on a branch training algorithm, each training set is extracted from a single data set, and different training sets are extracted from different data sets.
Specifically, in order to improve the image recognition accuracy, a plurality of training sets may be constructed in advance through a plurality of different data sets, and the initial model may be trained through the training sets, so as to obtain a final image recognition model.
In the field of face recognition, different data sets cannot simply be merged into a single data set, because different data sets may contain face pictures of the same person, and a user cannot determine which persons appear in more than one data set. A deep neural network model can instead be established with a branch training method to obtain the initial model, and the different data sets are trained on separate branches, so that a trained image recognition model is obtained and can be deployed in an application scene.
Step S106: recognizing the image to be recognized by using the image recognition model to obtain a recognition result.
Specifically, in the field of face recognition, the face recognition process can be performed by comparing face features against identities (feat-ID) using the Euclidean distance.
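The feature comparison in step S106 can be illustrated with a minimal sketch. The function names, the `gallery` dictionary of enrolled features, and the distance threshold are all hypothetical; the patent states only that face features are compared using the Euclidean distance.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(query_feat, gallery, threshold):
    """Return the enrolled ID whose feature is nearest to query_feat,
    or None if even the nearest feature is farther than threshold.
    The threshold rule is an assumption, not taken from the patent."""
    best_id, best_dist = None, float("inf")
    for person_id, feat in gallery.items():
        d = euclidean_distance(query_feat, feat)
        if d < best_dist:
            best_id, best_dist = person_id, d
    return best_id if best_dist <= threshold else None

# Hypothetical enrolled features; in practice they would come from the trained model.
gallery = {"alice": [0.0, 1.0], "bob": [1.0, 0.0]}
match = identify([0.1, 0.9], gallery, threshold=0.5)
```

In this sketch the threshold decides whether the nearest enrolled identity is close enough to count as a match; a suitable value would have to be chosen on verification data.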
In the embodiment of the application, an initial model can be established based on a branch training algorithm, the initial model is trained through a plurality of training sets generated by different data sets to obtain an image recognition model, and further an image to be recognized input by a user is recognized through the image recognition model to obtain a final recognition result. Compared with the prior art, the accuracy of the image recognition model combining the branch training of the multiple data sets is higher than that of the image recognition model trained based on a single data set, the technical effect of improving the recognition accuracy is achieved, and the technical problem that the recognition accuracy of the image recognition method in the prior art is low is solved.
Optionally, in the above embodiment of the present invention, the method further includes: acquiring a plurality of data sets; classifying each image in the plurality of data sets to obtain a label of each image, wherein the label is used for representing the classification result of each image, and the labels of at least two images in the plurality of data sets are the same; and extracting sample images from each classified data set to obtain a plurality of training sets.
Specifically, in the field of face recognition, in order to construct a plurality of training sets, face pictures in different application scenes can be obtained in advance to obtain a plurality of data sets. Because the public face data sets downloaded from the internet are generally labeled, for the data sets which are not labeled, face pictures can be manually detected and extracted, classification and labeling are carried out, the face pictures belonging to the same person are put together and labeled, and the label of each picture is obtained. Suppose the total number of people is N, and each person has M face pictures. A certain number of face pictures can be randomly extracted from each labeled data set to obtain each training set.
Optionally, in the foregoing embodiment of the present invention, before extracting the sample image from each classified data set to obtain a plurality of training sets, the method further includes: extracting preset features of each image in each classified data set; performing alignment operation on each image based on the preset characteristics of each image; and extracting a sample image from each operated data set to obtain a plurality of training sets.
Optionally, in a case that each image is a face image, the preset features include at least one of the following: eyes, eyebrows, nose tip, and corners of the mouth.
Specifically, in the field of face recognition, the face angle and face position vary between face pictures. To extract stable features and obtain a good face recognition effect, the face pictures need to be aligned according to their key points, which removes the influence of the face angle on face recognition. The key points include the positions of the eyes, nose tip, mouth corners, etc., as shown in FIG. 2. The aligned faces are shown in FIG. 3.
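The patent does not specify how the alignment is computed. One common approach, assumed here purely for illustration, is to rotate the face so that the line through the two eyes becomes horizontal. The sketch below applies that rotation to the key point coordinates only (a real pipeline would apply the same transform to the image pixels); the function name and key point names are hypothetical.

```python
import math

def align_by_eyes(keypoints):
    """Rotate all key points around the midpoint of the eyes so that
    the eye line becomes horizontal. keypoints maps a name to (x, y)."""
    (lx, ly), (rx, ry) = keypoints["left_eye"], keypoints["right_eye"]
    angle = math.atan2(ry - ly, rx - lx)   # current tilt of the eye line
    cx, cy = (lx + rx) / 2, (ly + ry) / 2  # rotation center
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    aligned = {}
    for name, (x, y) in keypoints.items():
        dx, dy = x - cx, y - cy
        aligned[name] = (cx + dx * cos_a - dy * sin_a,
                         cy + dx * sin_a + dy * cos_a)
    return aligned

# Hypothetical key points of a face tilted by 45 degrees.
tilted = {"left_eye": (0.0, 0.0), "right_eye": (2.0, 2.0), "nose_tip": (1.0, 2.0)}
aligned = align_by_eyes(tilted)
```

After the rotation both eyes share the same vertical coordinate while the distance between them is unchanged.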
Optionally, in the foregoing embodiment of the present invention, extracting a sample image from each operated data set to obtain a plurality of training sets includes: randomly extracting a sample image from each data set after the operation; and acquiring a storage path and a label of the sample image to obtain a plurality of training sets.
Specifically, face pictures containing both face identity information and verification information can be randomly extracted from the labeled and aligned face pictures to obtain sample images, and each extracted training sample has the form: face picture img_1, identity information (class number) of img_1.
The face picture img_1 refers to the storage path of the 1st face picture, and the class number refers to the label assigned to the person in advance; class numbers generally start from 0. Within a data set, different labels are numerical codes for different persons. For example, if there are 100 people in the first data set, the class numbers are 1-0, 1-1, 1-2, ..., 1-99; if the second data set or scene covers 50 persons, the class numbers are 2-0, 2-1, 2-2, ..., 2-49. The two groups of class numbers do not overlap because they come from different data sets.
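As a minimal sketch of this labeling scheme, the following builds one training set per data set, where each sample is a (storage path, class number) pair and the class number is prefixed with the data set index. The function name, the file paths, and the fixed number of samples per person are hypothetical.

```python
import random

def build_training_set(dataset_index, images_by_person, samples_per_person, rng):
    """images_by_person: list whose k-th entry holds the image paths of person k.
    Returns (path, label) pairs with labels such as '1-0', '1-1', ..."""
    training_set = []
    for class_no, paths in enumerate(images_by_person):
        label = f"{dataset_index}-{class_no}"  # class numbers start from 0 per data set
        chosen = rng.sample(paths, min(samples_per_person, len(paths)))
        training_set.extend((path, label) for path in chosen)
    return training_set

rng = random.Random(0)  # seeded only to make the sketch reproducible
dataset_1 = [["d1/p0_a.jpg", "d1/p0_b.jpg"], ["d1/p1_a.jpg"]]
dataset_2 = [["d2/p0_a.jpg", "d2/p0_b.jpg", "d2/p0_c.jpg"]]
training_sets = [
    build_training_set(i, ds, 2, rng)
    for i, ds in enumerate([dataset_1, dataset_2], start=1)
]
```

Because the data set index is part of every label, the same person appearing in two data sets simply receives two unrelated class numbers, so no cross-data-set identity matching is needed.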
Optionally, in the above embodiment of the present invention, acquiring a plurality of data sets includes: acquiring a video image and a preset data set acquired by acquisition equipment; and detecting the video image and a preset data set to obtain a plurality of data sets.
Specifically, in the field of face recognition, the acquisition device may be a camera installed in each application scenario. The camera captures video pictures, which are transmitted over a network and a data line and stored in the computer system. The application scenarios may be usage scenarios of engineering projects, such as VTM (Video Teller Machine) verification in a bank or VIP identification in a jewelry store. The preset data set may be a public face data set downloaded from the internet.
The face data sets obtained in this way may cover the same person. For example, a customer captured by cameras in both a bank and a jewelry store may also have photos on the internet that were collected into a public face data set. Face pictures of the same person may also appear in both of two public face data sets A and B.
For the video pictures collected by the camera, face detection is performed, and the extracted face pictures are stored on a hard disk of the computer system.
Optionally, in the above embodiment of the present invention, the method further includes: establishing an initial model based on a branch training algorithm, wherein the initial model at least comprises: a plurality of loss functions, the plurality of loss functions corresponding to the plurality of training sets one to one; inputting a plurality of training sets into an initial model in parallel, and training the initial model; judging whether the trained model meets a preset condition or not; and if the model obtained by training meets the preset condition, determining the model obtained by training as an image recognition model.
It should be noted that a conventional image recognition model is trained with only one Softmax loss function as the objective. The image recognition model based on a single data set input shown in FIG. 4 contains only one classification loss function, that is, Loss = SoftmaxLoss 1.
Different data sets can be trained on separate branches while being input in parallel into the same image recognition model. The aligned face pictures in the i-th data set are propagated forward to obtain features, which are then connected to the corresponding loss function SoftmaxLoss i, and each such loss is optimized as an independent objective function. As shown in FIG. 5, when a face picture from the i-th face data set is input into the initial model for branch training, the corresponding loss function is Loss = SoftmaxLoss i.
It should be noted that the image recognition models shown in FIG. 4 and FIG. 5 are simplified schematic diagrams of a general residual network.
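The branch structure can be sketched as a shared feature extractor followed by one classification head and one softmax loss per data set. The sketch below is a toy illustration under stated assumptions: the class `BranchModel`, the single linear layer with ReLU standing in for the residual network, and summing the branch losses over a batch are all hypothetical choices, not details from the patent.

```python
import math

def softmax_cross_entropy(logits, target):
    """Softmax loss for one sample: log-sum-exp of the logits minus the target logit."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def linear(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

class BranchModel:
    def __init__(self, backbone_weights, head_weights_per_dataset):
        self.backbone = backbone_weights       # shared by all data sets
        self.heads = head_weights_per_dataset  # one head (and loss) per data set

    def features(self, image_vector):
        # Toy stand-in for the shared network: one linear layer with ReLU.
        return [max(0.0, v) for v in linear(self.backbone, image_vector)]

    def branch_loss(self, dataset_index, image_vector, class_no):
        # A sample from data set i is scored only by head i (SoftmaxLoss i).
        logits = linear(self.heads[dataset_index], self.features(image_vector))
        return softmax_cross_entropy(logits, class_no)

model = BranchModel(
    backbone_weights=[[1.0, 0.0], [0.0, 1.0]],
    head_weights_per_dataset=[
        [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],  # data set 0: 3 persons
        [[1.0, -1.0], [-1.0, 1.0]],            # data set 1: 2 persons
    ],
)
# Each sample: (data set index, input vector, class number within that data set).
batch = [(0, [1.0, 0.0], 0), (1, [0.0, 1.0], 1)]
total_loss = sum(model.branch_loss(i, x, y) for i, x, y in batch)
```

Since each head has its own class numbering, two data sets that happen to share a person never force that person's pictures into a single class; only the shared feature extractor sees all data.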
Optionally, the loss function is a squared loss function.
Specifically, in the field of face recognition, in order to perform a face recognition process using euclidean distance, the plurality of loss functions in the initial model may be square loss functions.
Further, the preset condition may be a training end judgment condition, when the model obtained through training meets the preset condition, it is determined that the training is ended, and finally the model obtained through training is the trained image recognition model.
Optionally, in the foregoing embodiment of the present invention, inputting a plurality of training sets into the initial model in parallel, and training the initial model includes: inputting a plurality of training sets into the initial model in parallel to obtain function values of a plurality of loss functions; obtaining a gradient value of each parameter in the initial model according to the function values of the plurality of loss functions and the chain rule of differentiation; and updating each parameter according to its gradient value by a stochastic gradient descent algorithm to obtain a trained model.
Specifically, after the plurality of training sets are input into the initial model in parallel, the function value Loss of each loss function is obtained through branch training; the gradient value of each parameter in the image recognition model shown in FIG. 5 is then obtained from Loss by the chain rule of differentiation; finally, the model parameters are updated by the stochastic gradient descent algorithm to obtain the trained model. Once the trained model meets the training end condition, it is determined to be the final image recognition model.
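The update described above can be illustrated on a deliberately tiny model: one shared scalar weight `w`, one scalar head weight per branch, and a squared loss per branch. All names and numbers are hypothetical, and the scalar model is a stand-in for the deep network; the point is only to show gradients from several branch losses flowing back through the chain rule into the shared parameter, followed by a gradient descent step.

```python
def sgd_step(w, heads, batch, lr):
    """One mini-batch gradient step for pred = heads[branch] * (w * x)
    with a squared loss per sample. Returns updated parameters and the loss."""
    grad_w = 0.0
    grad_heads = [0.0] * len(heads)
    total_loss = 0.0
    for branch, x, y in batch:
        feat = w * x                    # shared layer
        pred = heads[branch] * feat     # branch-specific layer
        total_loss += (pred - y) ** 2
        d_pred = 2.0 * (pred - y)       # dLoss/dpred
        # Chain rule: dLoss/dhead = dLoss/dpred * dpred/dhead
        grad_heads[branch] += d_pred * feat
        # Chain rule through the head into the shared weight;
        # contributions from all branches accumulate here.
        grad_w += d_pred * heads[branch] * x
    new_w = w - lr * grad_w
    new_heads = [v - lr * g for v, g in zip(heads, grad_heads)]
    return new_w, new_heads, total_loss

w, heads = 0.5, [1.0, 1.0]
batch = [(0, 1.0, 2.0), (1, 2.0, 1.0)]  # (branch, input, target), one sample per branch
losses = []
for _ in range(200):
    w, heads, loss = sgd_step(w, heads, batch, lr=0.05)
    losses.append(loss)
```

Each head receives gradient only from its own branch, while the shared weight receives the accumulated gradient of every branch, which is how all data sets contribute to the common features.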
Optionally, in the above embodiment of the present invention, the determining whether the trained model meets the preset condition includes: acquiring a verification set; verifying the trained model by using a verification set to obtain the precision of the trained model; judging whether the precision of the model obtained by training is the same as the historical precision, wherein the historical precision is the precision of the model obtained by training in the last verification process; and if the precision of the model obtained by training is the same as the historical precision, determining that the model obtained by training meets the preset condition.
It should be noted that, during training of the image recognition model, the current model may be tested on the verification set at a fixed interval of iterations. Early in training, the precision of the model on the verification set keeps improving; as training continues and the model converges or begins to overfit, the precision on the verification set no longer improves steadily, which indicates that training may be stopped.
Optionally, the precision is used to characterize the ratio of the sum of the validation results of all validation samples in the validation set to the total number of all validation samples.
Specifically, in the field of face recognition, a verification set is composed of randomly extracted face picture verification pairs. According to the rules of the international standard face verification test set LFW, the number of face picture verification pairs in the verification set is 6000. For a verification set containing 6000 face image verification pairs, the test accuracy may be defined as:
$$\text{accuracy} = \frac{1}{6000}\sum_{i=1}^{6000} x_i$$
where $x_i$ represents the verification result of the i-th face picture verification pair. If the recognition result of the model is the same as the actual label of the face picture verification pair, the verification is determined to be correct, that is, $x_i = 1$; if the recognition result of the model is different from the actual label of the face picture verification pair, the verification is determined to be wrong, that is, $x_i = 0$.
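The precision described above, the sum of the per-pair verification results divided by the number of pairs, can be computed as in the following minimal sketch. The function name and the boolean encoding of "same person" decisions are hypothetical.

```python
def verification_accuracy(predicted_same, actual_same):
    """predicted_same[i]: model's same-person decision for pair i.
    actual_same[i]: ground-truth label of pair i.
    Returns the fraction of pairs whose decision matches the label."""
    if len(predicted_same) != len(actual_same) or not actual_same:
        raise ValueError("need two non-empty lists of equal length")
    results = [1 if p == a else 0 for p, a in zip(predicted_same, actual_same)]
    return sum(results) / len(results)

# Four hypothetical pairs; the third decision is wrong.
acc = verification_accuracy([True, False, True, True], [True, False, False, True])
```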
Further, the historical accuracy may be the accuracy of the trained model obtained when the trained model is verified last time. If the precision of the trained model is the same as the historical precision in the verification process, namely the precision of the trained model is not stably improved any more, the training can be determined to be finished, and the trained model is used as a final image recognition model.
Optionally, in the foregoing embodiment of the present invention, if the precision of the trained model is different from the historical precision, the precision of the trained model is taken as the new historical precision, and training of the initial model continues.
In an optional scheme, if the precision of the trained model is different from the historical precision, that is, the trained model does not meet the preset condition, it is determined that training is not finished and needs to continue, and this precision is used as the historical precision in the next verification. In the next verification, whether the precision of the trained model is the same as the historical precision is judged again, thereby determining whether the trained model meets the preset condition.
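The stopping rule can be sketched as a loop that alternates training and verification and stops when the precision equals the value from the previous verification. The callables `train_some_iterations` and `evaluate`, the `max_rounds` safety limit, and the simulated precision values are hypothetical. The exact-equality test follows the text; a practical implementation might compare within a small tolerance instead, which is an assumption not stated in the patent.

```python
def train_until_precision_stalls(train_some_iterations, evaluate, max_rounds=100):
    """Train, verify, and stop once the precision equals the historical precision.
    Returns (number of verification rounds performed, final precision)."""
    historical = None
    for round_no in range(1, max_rounds + 1):
        train_some_iterations()
        precision = evaluate()
        if historical is not None and precision == historical:
            return round_no, precision  # preset condition met
        historical = precision          # becomes the historical precision
    return max_rounds, historical

# Simulated verification precisions for successive rounds.
scores = iter([0.5, 0.7, 0.8, 0.8, 0.9])
rounds, final_precision = train_until_precision_stalls(
    lambda: None, lambda: next(scores)
)
```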
Optionally, in the foregoing embodiment of the present invention, acquiring the verification set includes: acquiring other images except for the sample images in the plurality of data sets; and randomly extracting image verification pairs from other images to obtain a verification set.
Optionally, the image verification pairs include: positive sample pairs, each containing two images with the same label, and negative sample pairs, each containing two images with different labels.
Specifically, in the field of face recognition, if the face pictures of K persons are used to make the training set, the face pictures of the remaining N-K persons may be used to make the verification set. The verification set consists of randomly extracted face picture verification pairs, with equal numbers of positive and negative sample pairs; for a verification set containing 6000 face picture verification pairs, 3000 positive sample pairs and 3000 negative sample pairs are selected. A positive sample pair consists of picture a and picture b of the n-th person; a negative sample pair consists of picture c of the i-th person and picture d of the j-th person, where i ≠ j. If the image recognition model judges the two face pictures in a positive sample pair to be the same person, the verification result is correct; if it judges the two face pictures in a negative sample pair to be different persons, the verification result is also correct; otherwise, the verification result is wrong.
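A minimal sketch of drawing a balanced verification set from the held-out persons is shown below. The function name, the dictionary layout, and the image names are hypothetical, and pairs are drawn with replacement, so duplicates are possible in this simplified version.

```python
import random

def build_verification_pairs(images_by_person, pairs_per_kind, rng):
    """images_by_person: dict mapping a held-out person ID to image paths.
    Returns a shuffled list of (image_a, image_b, is_same_person) with
    pairs_per_kind positive and pairs_per_kind negative pairs."""
    persons = [p for p, imgs in images_by_person.items() if imgs]
    multi = [p for p in persons if len(images_by_person[p]) >= 2]
    if not multi or len(persons) < 2:
        raise ValueError("need a person with 2+ images and at least 2 persons")
    pairs = []
    for _ in range(pairs_per_kind):   # positive pairs: two pictures of one person
        p = rng.choice(multi)
        a, b = rng.sample(images_by_person[p], 2)
        pairs.append((a, b, True))
    for _ in range(pairs_per_kind):   # negative pairs: pictures of two different persons
        i, j = rng.sample(persons, 2)
        pairs.append((rng.choice(images_by_person[i]),
                      rng.choice(images_by_person[j]), False))
    rng.shuffle(pairs)
    return pairs

held_out = {"p0": ["p0_a", "p0_b"], "p1": ["p1_a", "p1_b", "p1_c"], "p2": ["p2_a"]}
pairs = build_verification_pairs(held_out, 3, random.Random(0))
```

With `pairs_per_kind=3000` this would produce the 6000-pair verification set described above.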
Fig. 6 is a flowchart of an alternative image recognition method according to an embodiment of the present invention, described by taking the field of face recognition as an example. As shown in Fig. 6, the method includes: collecting face pictures in a plurality of scenes; performing face detection on the collected face pictures, and extracting the detected faces and storing them on a hard disk of the computer; manually classifying and labeling the detected and extracted face pictures, grouping the face pictures belonging to the same person together and labeling them; performing a key point alignment operation on the face pictures to remove the influence of the face angle on face recognition; randomly extracting, from the labeled and aligned pictures, face picture pairs that contain both face identity information and verification information for training, that is, extracting a face identity-verification training set; establishing a face recognition deep neural network model in combination with a branch training algorithm, wherein the model comprises a plurality of loss functions; training the face recognition deep neural network model based on the multiple data sets to obtain a trained network model; judging whether the test precision of the trained network model on the verification set has stopped improving, that is, judging whether the training end condition is reached; if not, continuing to train the model; if so, obtaining the face recognition algorithm network model and model parameters; and deploying the trained face recognition algorithm network model into the application scene, and performing the face recognition process by comparing face features (feat-ID comparison using a Euclidean distance).
The scheme provided by this embodiment can be used for bank VIP identification projects. Face pictures are collected in the real application scene, and some public face data sets are downloaded from the Internet; the face pictures in these data sets are then detected and aligned, and a corresponding face identity-verification training set is made; the face recognition algorithm model is trained using the above method, so that a face recognition algorithm with a high recognition rate and good recognition performance in the bank VIP recognition scene is obtained. The branch-trained face deep neural network model combining a plurality of data sets has higher accuracy than a face recognition algorithm of a general deep learning network trained on a single data set (including successive fine-tuning on a plurality of data sets).
Example 2
According to an embodiment of the present invention, there is provided an embodiment of an image recognition apparatus.
Fig. 7 is a schematic diagram of an image recognition apparatus according to an embodiment of the present invention, as shown in fig. 7, the apparatus including:
A first obtaining module 72, configured to obtain an image to be identified.
Specifically, the image to be recognized may be an image that needs to be recognized, and in the embodiment of the present invention, a human face image is taken as an example for detailed description.
A second obtaining module 74, configured to obtain a pre-established image recognition model, where the image recognition model is obtained by training an initial model through multiple training sets, the initial model is a recognition model established based on a branch training algorithm, one training set is extracted from the same data set, and different training sets are extracted from different data sets.
Specifically, in order to improve the image recognition accuracy, a plurality of training sets may be constructed in advance through a plurality of different data sets, and the initial model may be trained through the training sets, so as to obtain a final image recognition model.
In the field of face recognition, different data sets cannot simply be merged directly into a single data set, because different data sets may contain face pictures of the same person, and a user cannot determine which persons appear in more than one data set. A deep neural network model can instead be established by a branch training method to obtain an initial model, and the different data sets are trained separately in their own branches, so that a trained image recognition model is obtained and then deployed in an application scene.
An identification module 76, configured to identify the image to be identified by using the image recognition model to obtain an identification result.
Specifically, in the field of face recognition, the face recognition process can be performed by comparing face features (feat-ID comparison using a Euclidean distance).
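Comparison of face features by Euclidean distance can be sketched as follows. This is a minimal sketch under assumptions: the feature vectors are taken to be plain sequences of floats output by the recognition model, and the `threshold` parameter, the `gallery` mapping of known IDs to features, and the function names are hypothetical, since the text does not state how the decision is made from the distance.

```python
import math
from typing import Dict, Optional, Sequence


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def identify(
    probe: Sequence[float],
    gallery: Dict[str, Sequence[float]],
    threshold: float,
) -> Optional[str]:
    """Return the ID whose stored feature is closest to the probe feature.

    Returns None when even the closest feature is farther than the
    (assumed) decision threshold.
    """
    best_id: Optional[str] = None
    best_dist = float("inf")
    for identity, feature in gallery.items():
        dist = euclidean_distance(probe, feature)
        if dist < best_dist:
            best_id, best_dist = identity, dist
    return best_id if best_dist <= threshold else None
```

In practice the threshold would be chosen on the verification set; the value used in any deployment is not given in the source.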
In the embodiment of the application, an initial model can be established based on a branch training algorithm, the initial model is trained through a plurality of training sets generated by different data sets to obtain an image recognition model, and further an image to be recognized input by a user is recognized through the image recognition model to obtain a final recognition result. Compared with the prior art, the accuracy of the image recognition model combining the branch training of the multiple data sets is higher than that of the image recognition model trained based on a single data set, the technical effect of improving the recognition accuracy is achieved, and the technical problem that the recognition accuracy of the image recognition method in the prior art is low is solved.
Optionally, in the above embodiment of the present invention, the apparatus further includes: a third obtaining module for obtaining a plurality of data sets; the classification module is used for classifying each image in the plurality of data sets to obtain a label of each image, wherein the label is used for representing the classification result of each image, and the labels of at least two images in the plurality of data sets are the same; and the first extraction module is used for extracting sample images from each classified data set to obtain a plurality of training sets.
Optionally, in the above embodiment of the present invention, the apparatus further includes: the second extraction module is used for extracting preset characteristics of each image in each classified data set; the alignment module is used for carrying out alignment operation on each image based on the preset characteristics of each image; and the third extraction module is used for extracting a sample image from each operated data set to obtain a plurality of training sets.
Optionally, in a case that each image is a face image, the preset features at least include one of the following: eyes, eyebrows, nose tip, and corners of the mouth.
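One possible form of the alignment operation is sketched below. The text does not specify the alignment transform, so this example assumes a simple in-plane rotation that makes the line between the two eye key points horizontal; the function names and the (x, y) tuple representation of key points are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def eye_alignment_angle(left_eye: Point, right_eye: Point) -> float:
    """Angle (radians) of the line from the left eye to the right eye."""
    return math.atan2(right_eye[1] - left_eye[1], right_eye[0] - left_eye[0])


def rotate_points(points: Sequence[Point], angle: float, center: Point) -> List[Point]:
    """Rotate points by -angle around center, so the eye line becomes horizontal."""
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    rotated = []
    for x, y in points:
        tx, ty = x - center[0], y - center[1]
        rotated.append((center[0] + cos_a * tx - sin_a * ty,
                        center[1] + sin_a * tx + cos_a * ty))
    return rotated
```

The same rotation would then be applied to the image pixels, for example with an image library of choice; that step is omitted here.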
Optionally, in the foregoing embodiment of the present invention, the third extracting module includes: an extraction unit for randomly extracting a sample image from each data set after the operation; and the first acquisition unit is used for acquiring the storage path and the label of the sample image to obtain a plurality of training sets.
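The extraction of one training set per data set, recorded as storage paths and labels, can be sketched as follows. This is an illustrative sketch: the nested dictionary layout (data set name, then label, then list of storage paths), the function name `build_training_sets` and the `samples_per_set` and `seed` parameters are assumptions made for the example.

```python
import random
from typing import Dict, List, Tuple

Entry = Tuple[str, str]  # (storage path, label)


def build_training_sets(
    datasets: Dict[str, Dict[str, List[str]]],
    samples_per_set: int,
    seed: int = 0,
) -> Dict[str, List[Entry]]:
    """Randomly extract sample images from each data set separately.

    Each data set yields its own training set, so labels from different
    data sets are never merged, even when they happen to look the same.
    """
    rng = random.Random(seed)
    training_sets: Dict[str, List[Entry]] = {}
    for name in sorted(datasets):
        entries = [
            (path, label)
            for label, paths in sorted(datasets[name].items())
            for path in paths
        ]
        count = min(samples_per_set, len(entries))
        training_sets[name] = rng.sample(entries, count)
    return training_sets
```

Keeping the training sets keyed by data set name makes it straightforward to feed each one to its own branch and loss function later.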
Optionally, in the foregoing embodiment of the present invention, the third obtaining module includes: the second acquisition unit is used for acquiring the video image and the preset data set acquired by the acquisition equipment; and the detection unit is used for detecting the video image and the preset data set to obtain a plurality of data sets.
Optionally, in the above embodiment of the present invention, the apparatus further includes: the establishing module is used for establishing an initial model based on a branch training algorithm, wherein the initial model at least comprises: a plurality of loss functions, the plurality of loss functions corresponding to the plurality of training sets one to one; the training module is used for inputting a plurality of training sets into the initial model in parallel and training the initial model; the judging module is used for judging whether the trained model meets a preset condition or not; and the determining module is used for determining the model obtained by training as the image recognition model if the model obtained by training meets the preset condition.
Optionally, the loss function is a squared loss function.
Optionally, in the above embodiment of the present invention, the training module includes: the input unit is used for inputting the training sets into the initial model in parallel to obtain function values of a plurality of loss functions; the processing unit is used for obtaining a gradient value of each parameter in the initial model according to the function values of the loss functions and the chain rule of differentiation; and the updating unit is used for updating each parameter based on its gradient value according to a stochastic gradient descent algorithm to obtain a trained model.
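The three units above can be illustrated with a deliberately tiny model. This is a sketch under strong assumptions, not the network of the embodiment: the shared part is a single weight vector, each training set has one scalar branch head, each loss is a squared loss, and the class and function names (`BranchModel`, `sgd_step`) are hypothetical. One sample from each training set is processed per step.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (input vector, target value)


class BranchModel:
    """Shared weights w plus one scalar head per training set (branch)."""

    def __init__(self, dim: int, num_branches: int) -> None:
        self.w = [0.1] * dim               # shared parameters
        self.heads = [1.0] * num_branches  # branch-specific parameters

    def forward(self, x: Sequence[float], branch: int) -> Tuple[float, float]:
        h = sum(wi * xi for wi, xi in zip(self.w, x))  # shared feature
        return h, self.heads[branch] * h               # branch prediction


def sgd_step(model: BranchModel, batch: Sequence[Sample], lr: float) -> List[float]:
    """One update; batch[k] is a sample drawn from training set k."""
    grad_w = [0.0] * len(model.w)
    grad_heads = [0.0] * len(model.heads)
    losses = []
    for k, (x, target) in enumerate(batch):
        h, pred = model.forward(x, k)
        err = pred - target
        losses.append(err * err)  # squared loss of branch k
        # Chain rule: dL_k/dhead_k = 2*err*h, dL_k/dw_i = 2*err*head_k*x_i.
        grad_heads[k] = 2.0 * err * h
        for i, xi in enumerate(x):
            grad_w[i] += 2.0 * err * model.heads[k] * xi  # shared gradients add up
    # Gradient descent update of every parameter.
    model.w = [wi - lr * g for wi, g in zip(model.w, grad_w)]
    model.heads = [a - lr * g for a, g in zip(model.heads, grad_heads)]
    return losses
```

The point of the sketch is that every branch contributes to the gradient of the shared parameters, while each head is updated only by the loss of its own training set.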
Optionally, in the foregoing embodiment of the present invention, the determining module includes: a third obtaining unit configured to obtain a verification set; the verification unit is used for verifying the trained model by using a verification set to obtain the precision of the trained model; the judging unit is used for judging whether the precision of the model obtained by training is the same as the historical precision, wherein the historical precision is the precision of the model obtained by training in the last verification process; and the determining unit is used for determining that the trained model meets the preset condition if the precision of the trained model is the same as the historical precision.
Optionally, the precision is used to characterize the ratio of the sum of the validation results of all validation samples in the validation set to the total number of all validation samples.
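The precision defined above can be computed as in the following sketch, assuming that each verification result is scored 1 when correct and 0 when wrong. The function name `verification_precision` and the `predict_same` callback, which stands in for the trained model's same-person decision, are hypothetical.

```python
from typing import Callable, Iterable, Tuple, TypeVar

Image = TypeVar("Image")


def verification_precision(
    pairs: Iterable[Tuple[Image, Image, int]],
    predict_same: Callable[[Image, Image], bool],
) -> float:
    """Sum of verification results divided by the number of verification pairs.

    Each pair is (image_a, image_b, label) with label 1 for a positive
    sample pair and 0 for a negative sample pair.
    """
    results = [
        1 if predict_same(a, b) == bool(label) else 0
        for a, b, label in pairs
    ]
    if not results:
        raise ValueError("verification set is empty")
    return sum(results) / len(results)
```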
Optionally, in the above embodiment of the present invention, the training module is further configured to, if the precision of the trained model is different from the historical precision, use the precision of the trained model as the historical precision and continue training the initial model.
Optionally, in the foregoing embodiment of the present invention, the third obtaining unit is configured to obtain other images than the sample image in the multiple data sets, and randomly extract the image verification pair from the other images to obtain the verification set.
Optionally, the image verification pair comprises: a positive sample pair containing two images with the same label, and a negative sample pair containing two images with different labels.
Example 3
According to an embodiment of the present invention, there is provided an embodiment of a storage medium including a stored program, wherein an apparatus in which the storage medium is located is controlled to execute the image recognition method in the above-described embodiment 1 when the program is executed.
Example 4
According to an embodiment of the present invention, an embodiment of a processor for running a program is provided, where the program executes the image recognition method in embodiment 1 described above.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (16)

1. An image recognition method, comprising:
acquiring an image to be identified;
obtaining a pre-established image recognition model, wherein the image recognition model is obtained by training an initial model through a plurality of training sets, the initial model is a recognition model established based on a branch training algorithm, one training set is extracted from the same data set, different training sets are extracted from different data sets, the different data sets are pre-obtained data under different application scenes, and the initial model at least comprises: a plurality of loss functions in one-to-one correspondence with the plurality of training sets;
identifying the image to be identified by using the image identification model to obtain an identification result;
wherein the method further comprises:
inputting the training sets into the initial model in parallel, and training the initial model;
wherein the inputting the plurality of training sets into the initial model in parallel, and the training the initial model comprises:
inputting the training sets into the initial model in parallel to obtain function values of the loss functions;
obtaining a gradient value of each parameter in the initial model according to the function values of the loss functions and the chain rule of differentiation;
and updating each parameter based on its gradient value according to a stochastic gradient descent algorithm to obtain the trained model.
2. The method of claim 1, further comprising:
acquiring a plurality of data sets;
classifying each image in the plurality of data sets to obtain a label of each image, wherein the label is used for representing a classification result of each image, and the labels of at least two images in the plurality of data sets are the same;
and extracting sample images from each classified data set to obtain a plurality of training sets.
3. The method of claim 2, wherein prior to extracting the sample images from each classified data set, resulting in the plurality of training sets, the method further comprises:
extracting preset features of each image in each classified data set;
performing alignment operation on each image based on the preset characteristics of each image;
and extracting the sample image from each operated data set to obtain a plurality of training sets.
4. The method according to claim 3, wherein in the case that each image is a human face image, the preset features comprise at least one of: eyes, eyebrows, nose tip, and corners of the mouth.
5. The method of claim 3, wherein extracting the sample image from each of the manipulated data sets, resulting in the plurality of training sets, comprises:
randomly extracting the sample image from each data set after the operation;
and acquiring a storage path and a label of the sample image to obtain the plurality of training sets.
6. The method of claim 2, wherein acquiring a plurality of data sets comprises:
acquiring a video image and a preset data set acquired by acquisition equipment;
and detecting the video image and the preset data set to obtain the plurality of data sets.
7. The method of claim 2, further comprising:
establishing the initial model based on the branch training algorithm;
inputting the training sets into the initial model in parallel, and training the initial model;
judging whether the trained model meets a preset condition or not;
and if the model obtained by training meets the preset condition, determining the model obtained by training as the image recognition model.
8. The method of claim 7, wherein determining whether the trained model satisfies a predetermined condition comprises:
acquiring a verification set;
verifying the model obtained by training by using the verification set to obtain the precision of the model obtained by training;
judging whether the precision of the model obtained by training is the same as the historical precision, wherein the historical precision is the precision of the model obtained by training in the last verification process;
and if the precision of the model obtained by training is the same as the historical precision, determining that the model obtained by training meets the preset condition.
9. The method of claim 8, wherein if the precision of the trained model is different from the historical precision, the precision of the trained model is used as the historical precision, and training of the initial model continues.
10. The method of claim 9, wherein the precision is used to characterize a ratio of a sum of validation results of all validation samples in the validation set to a total number of all validation samples.
11. The method of claim 8, wherein obtaining a validation set comprises:
acquiring other images except for the sample images in the plurality of data sets;
and randomly extracting image verification pairs from the other images to obtain the verification set.
12. The method of claim 11, wherein the image verification pair comprises: a positive sample pair and a negative sample pair, wherein the positive sample pair comprises two images with the same label, and the negative sample pair comprises two images with different labels.
13. The method of claim 7, wherein the loss function is a squared loss function.
14. An image recognition apparatus, comprising:
the first acquisition module is used for acquiring an image to be identified;
a second obtaining module, configured to obtain a pre-established image recognition model, where the image recognition model is obtained by training an initial model through multiple training sets, the initial model is a recognition model established based on a branch training algorithm, a same training set is extracted from a same data set, different training sets are extracted from different data sets, the different data sets are pre-obtained data in different application scenarios, and the initial model at least includes: a plurality of loss functions in one-to-one correspondence with the plurality of training sets;
the identification module is used for identifying the image to be identified by utilizing the image identification model to obtain an identification result;
wherein the apparatus further comprises:
the training module is used for inputting a plurality of training sets into the initial model in parallel and training the initial model;
wherein the training module comprises:
the input unit is used for inputting the training sets into the initial model in parallel to obtain function values of a plurality of loss functions; the processing unit is used for obtaining a gradient value of each parameter in the initial model according to the function values of the loss functions and the chain rule of differentiation; and the updating unit is used for updating each parameter based on its gradient value according to a stochastic gradient descent algorithm to obtain a trained model.
15. A storage medium, characterized in that the storage medium comprises a stored program, wherein when the program runs, a device where the storage medium is located is controlled to execute the image recognition method according to any one of claims 1 to 13.
16. A processor, characterized in that the processor is configured to run a program, wherein the program is configured to execute the image recognition method according to any one of claims 1 to 13 when running.
CN201910101257.9A 2019-01-31 2019-01-31 Image recognition method and device Active CN109766872B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910101257.9A CN109766872B (en) 2019-01-31 2019-01-31 Image recognition method and device
PCT/CN2019/127817 WO2020155939A1 (en) 2019-01-31 2019-12-24 Image recognition method and device, storage medium and processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910101257.9A CN109766872B (en) 2019-01-31 2019-01-31 Image recognition method and device

Publications (2)

Publication Number Publication Date
CN109766872A CN109766872A (en) 2019-05-17
CN109766872B true CN109766872B (en) 2021-07-09

Family

ID=66455816

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910101257.9A Active CN109766872B (en) 2019-01-31 2019-01-31 Image recognition method and device

Country Status (2)

Country Link
CN (1) CN109766872B (en)
WO (1) WO2020155939A1 (en)

Also Published As

Publication number Publication date
CN109766872A (en) 2019-05-17
WO2020155939A1 (en) 2020-08-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant