CN113887561B

CN113887561B - Face recognition method, device, medium and product based on data analysis

Info

Publication number: CN113887561B
Application number: CN202111032903.4A
Authority: CN
Inventors: 杨政华; 吴志伟; 杨海军; 黄振杰
Original assignee: Guangdong Lvan Industry And Commerce Co ltd
Current assignee: Guangdong Lvan Industry And Commerce Co ltd
Priority date: 2021-09-03
Filing date: 2021-09-03
Publication date: 2022-08-09
Anticipated expiration: 2041-09-03
Also published as: CN113887561A

Abstract

The invention provides a face recognition method based on data analysis, comprising the following steps: analyzing the distribution of the data set using relative entropy to obtain the long-tail distribution degree of the face training data set; calculating a smoothing coefficient from the inverse relationship between the long-tail distribution degree and the smoothing coefficient; calculating the algorithm learning rate under label smoothing from the smoothing coefficient and the face recognition learning rate used without label smoothing; calculating a cross entropy loss function with the smoothing coefficient and the learning rate as model hyper-parameters; and updating the weight parameters of the neural network through the cross entropy loss function, then recognizing face images with the trained neural network model. The method quantifies the long-tail distribution degree of the training data using relative entropy, rather than describing the long tail only qualitatively, which cannot distinguish one data set from another. The designed label smoothing parameter improves face recognition performance, and the designed learning rate fully exploits positive and negative sample information during training, so that the model learns quickly and thoroughly.

Description

Face recognition method, device, medium and product based on data analysis

Technical Field

The invention relates to the technical field of face recognition, in particular to a face recognition method, device, medium and product based on data analysis.

Background

The main objective of a face recognition algorithm is to map low-dimensional input face images into high-dimensional features such that the face features of a single ID are highly similar and the face features of different IDs are dissimilar, i.e., the features are compact within a class and dispersed between classes. To achieve this, mainstream face recognition algorithms focus on the design of the loss function, supported by training strategies such as label smoothing and learning rate decay. The hyper-parameters involved in training are generally set by experience, and an optimal value is obtained only after repeated tuning. Face recognition algorithms need large training data sets, and a uniformly distributed data set noticeably improves performance, but current algorithms do not fully exploit the relation between the data set and the hyper-parameters in order to make full use of the available data.

Disclosure of Invention

To overcome the defects of the prior art, the invention aims to provide a face recognition method based on data analysis. It addresses the problem that face recognition algorithms do not fully exploit the relation between the data set and the hyper-parameters, and therefore do not make full use of the available data.

The invention provides a face recognition method based on data analysis, which comprises the following steps:

calculating the long-tail distribution degree: analyzing the distribution of the data set using relative entropy to obtain the long-tail distribution degree of the face training data set;

calculating the label smoothing parameter: calculating a smoothing coefficient from the inverse relationship between the long-tail distribution degree of the data and the smoothing coefficient;

calculating the learning rate: calculating the algorithm learning rate under label smoothing from the smoothing coefficient and the face recognition learning rate used without label smoothing;

calculating the cross entropy loss function: calculating the cross entropy loss function with the smoothing coefficient and the learning rate as model hyper-parameters;

face recognition: updating the weight parameters of the neural network through the cross entropy loss function, and recognizing a face image with the trained neural network model.

Further, in the step of calculating the long-tail distribution degree, the long-tail distribution degree of the data is expressed as:

$$D_{KL}(\text{data}) = \sum_i p_i \log \frac{p_i}{q_i}$$

where $D_{KL}(\text{data})$ denotes the long-tail distribution degree of the face training data set, namely the relative entropy between the ideal uniform distribution of the data set and the actual probability distribution of the current data set; $S = \{s_0, s_1, s_2, \dots, s_n\}$ is the face recognition training set; $s_i$ is the number of face images of each ID; $n$ is the number of IDs in the data set; $q_i$ is the true probability of the $i$-th face ID with respect to the data set; and $p_i$ is the uniform distribution with respect to $n$.

Further, the inverse correlation relationship between the data long-tail distribution degree and the smoothing coefficient is

Where ε is a smoothing coefficient designed based on data analysis.

Further, the algorithm learning rate calculation formula under the label smoothing is as follows

where $lr_{norm}$ denotes the face recognition learning rate without label smoothing, and $lr_{label\_smooth}$ denotes the algorithm learning rate under label smoothing.

Further, the cross entropy loss function is calculated by the formula

where $x_{class}$ denotes the current positive sample and $y_{class}$ is the cross entropy loss value.

An electronic device, comprising: a processor;

a memory; and a program, wherein the program is stored in the memory and configured to be executed by the processor, the program comprising instructions for performing a method of face recognition based on data analysis.

A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, performs a method of face recognition based on data analysis.

A computer program product comprising a computer program/instructions which, when executed by a processor, implement a method of face recognition based on data analysis.

Compared with the prior art, the invention has the beneficial effects that:

The invention provides a face recognition method based on data analysis that guides hyper-parameter design by quantitatively analyzing the long-tail distribution of the training data. First, the long-tail distribution degree of the training data is quantified using relative entropy, rather than described only qualitatively, which cannot distinguish one data set from another. Second, the label smoothing parameter is designed quantitatively from the long-tail distribution degree, which improves face recognition performance. Third, the learning rate is designed from the label smoothing coefficient, so that positive and negative sample information is fully exploited during training and the model learns quickly and thoroughly. The method thereby addresses the problem that face recognition algorithms do not fully exploit the relation between the data set and the hyper-parameters in order to make full use of the available data.

The foregoing description is only an overview of the technical solutions of the present invention, and in order to make the technical solutions of the present invention more clearly understood and to implement them in accordance with the contents of the description, the following detailed description is given with reference to the preferred embodiments of the present invention and the accompanying drawings. The detailed description of the present invention is given in detail by the following examples and the accompanying drawings.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:

FIG. 1 is a flow chart of a face recognition method based on data analysis according to the present invention;

FIG. 2 is a flowchart of a hyperparameter solution process according to an embodiment of the invention;

FIG. 3 is a flowchart of a face recognition training process according to an embodiment of the present invention.

Detailed Description

The present invention will be further described with reference to the accompanying drawings and the detailed description, and it should be noted that any combination of the embodiments or technical features described below can be used to form a new embodiment without conflict.

A face recognition method based on data analysis, as shown in fig. 1-3, includes the following steps:

Calculating the long-tail distribution degree. Data is the foundation of deep learning face recognition algorithms, and the scale and distribution of the data determine the final recognition performance. In practice, images of a few IDs can be collected in large quantities from the Internet, while only limited data is available for most IDs, so current public data sets show an obvious long-tail distribution. Practice has shown that uniformly distributed data is better suited to face recognition and yields better algorithm performance, but the long-tail phenomenon has not so far been analyzed quantitatively to guide hyper-parameter design. In this embodiment, relative entropy is used to analyze the data set distribution: the long-tail distribution of the training data set is first quantified, and the resulting index guides the design of the label smoothing coefficient and the learning rate hyper-parameter during training. The specific method is as follows:

The relative entropy (KL divergence) is defined as

$$D_{KL}(p \,\|\, q) = \sum_i p(x_i) \log \frac{p(x_i)}{q(x_i)}$$

The KL divergence is introduced to express the long-tail distribution of the data quantitatively. Assume that the face recognition training set is $S = \{s_0, s_1, s_2, \dots, s_n\}$, where $s_i$ is the number of face images of each ID. The probability distribution $p_i$ in the long-tail KL divergence is defined as:

$$p_i = \frac{1}{n}$$

where $n$ is the number of IDs in the data set. $q_i$ is defined as the true probability distribution with respect to the data set:

$$q_i = \frac{s_i}{\sum_j s_j}$$

The long-tail distribution degree of the data is then expressed as:

$$D_{KL}(\text{data}) = \sum_i p_i \log \frac{p_i}{q_i}$$

where $D_{KL}(\text{data})$ denotes the long-tail distribution degree of the face training data set, namely the relative entropy between the ideal uniform distribution of the data set and the actual probability distribution of the current data set.
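As a minimal sketch of this step, the long-tail degree can be computed directly from the per-ID image counts. The function name `long_tail_degree`, the natural logarithm, and the direction of the divergence (uniform distribution first, empirical distribution second) are assumptions made for illustration; the example counts are invented.

```python
import math


def long_tail_degree(counts):
    """Relative entropy D_KL(p || q) between the uniform distribution p over IDs
    and the empirical distribution q given by the per-ID image counts.

    Assumptions (not confirmed by the source): natural logarithm, p as first argument.
    """
    if not counts or any(c <= 0 for c in counts):
        raise ValueError("every ID needs at least one image")
    n = len(counts)            # number of IDs
    total = sum(counts)        # total number of face images
    p = 1.0 / n                # uniform probability of each ID
    # q_i = s_i / total is the empirical probability of the i-th ID
    return sum(p * math.log(p / (c / total)) for c in counts)


# Invented example counts: a balanced set and a long-tailed set.
balanced = [100, 100, 100, 100]
long_tailed = [1000, 50, 20, 10]
print(long_tail_degree(balanced))
print(long_tail_degree(long_tailed))
```

A perfectly balanced data set gives a degree of zero, and the value grows as a few IDs come to dominate the image count.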

Calculating the label smoothing parameter. Label smoothing is an important technique for improving face recognition model performance. The label smoothing coefficient is usually chosen by experience, but its best value differs across data distributions. The more severe the long-tail distribution of the training data, the smaller the label smoothing coefficient should be, so as to promote learning from positive samples and reduce learning from negative samples; when the long tail is mild, the coefficient can be increased to promote negative-sample learning. The long-tail distribution degree of the data is inversely proportional to the smoothing coefficient ε, and the face recognition model performs best when the long-tail distribution degree is 0.52 and the smoothing coefficient is 0.1. The relation between the smoothing coefficient and the long-tail distribution degree is defined as follows:

Designing the smoothing coefficient ε quantitatively reduces the number of tuning runs required of algorithm engineers and saves training time and hardware resources for the face recognition algorithm.
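The following sketch illustrates one way to turn the stated inverse proportionality into a rule. It assumes, as a reading of the text rather than a confirmed formula, that the product of ε and the long-tail degree stays constant and passes through the reported optimum (degree 0.52, ε 0.1). The function name `smoothing_coefficient` and its parameter names are hypothetical.

```python
def smoothing_coefficient(d_kl, anchor_degree=0.52, anchor_eps=0.1):
    """Assumed inverse-proportional rule: eps * d_kl == anchor_eps * anchor_degree.

    The anchor values come from the text; the functional form is an assumption.
    """
    if d_kl <= 0:
        # A perfectly uniform data set (degree 0) has no finite value under this rule.
        raise ValueError("long-tail degree must be positive")
    return anchor_eps * anchor_degree / d_kl


print(smoothing_coefficient(0.52))  # reproduces the anchor coefficient (about 0.1)
print(smoothing_coefficient(1.04))  # a more severe long tail gives a smaller coefficient
```

Under this assumed rule, doubling the long-tail degree halves the smoothing coefficient, which matches the qualitative behavior described above.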

Calculating the learning rate. At the same learning rate, the weight parameters are updated by different amounts with and without label smoothing: the update for the positive sample under label smoothing is noticeably smaller than without it. The analysis is as follows:

where $y_{norm}$ denotes the parameter gradient without label smoothing and $y_{label\_smooth}$ denotes the parameter gradient with label smoothing. At every parameter update, the positive-sample update without label smoothing is larger than the update with label smoothing, and the difference is ε. Therefore, to ensure that positive samples are learned equally well, the algorithm learning rate under label smoothing is calculated from the smoothing coefficient and the face recognition learning rate used without label smoothing. The learning rate under label smoothing is designed in terms of the smoothing parameter as follows:

where $lr_{norm}$ denotes the face recognition learning rate without label smoothing and $lr_{label\_smooth}$ denotes the algorithm learning rate under label smoothing. Adjusting the learning rate parameter further exploits the advantage of label smoothing, and the larger learning rate accelerates the model's learning process.
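The gradient gap described in this step can be checked numerically. The sketch below assumes a softmax cross entropy loss in which label smoothing lowers the positive-class target from 1 to 1 − ε; the function names and the example logits are invented for illustration. It does not reproduce the learning rate formula itself, only the observation that motivates it.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def positive_logit_gradient(logits, target, eps=0.0):
    """Gradient of softmax cross entropy with respect to the positive-class logit.

    Assumes the positive class carries target probability (1 - eps), so the
    gradient is (softmax probability - target probability).
    """
    return softmax(logits)[target] - (1.0 - eps)


logits = [2.0, 1.0, 0.1]   # invented example logits, class 0 is the positive class
eps = 0.1
g_norm = positive_logit_gradient(logits, 0)                  # no label smoothing
g_label_smooth = positive_logit_gradient(logits, 0, eps)     # with label smoothing
print(g_norm, g_label_smooth, g_label_smooth - g_norm)
```

The two gradients differ by ε, and the smoothed gradient has the smaller magnitude whenever the positive-class probability is below 1 − ε, which is why a larger learning rate is needed to keep positive-sample updates comparable.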

Calculating the cross entropy loss function: the cross entropy loss function is calculated with the smoothing coefficient and the learning rate as model hyper-parameters. The loss function under label smoothing is:
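As a hedged illustration, a commonly used form of label-smoothed cross entropy can be sketched as follows. It assumes the positive class receives target probability 1 − ε and the remaining ε is spread evenly over the other classes; this is a standard formulation and not necessarily the exact expression used in this embodiment. The function name and example logits are invented.

```python
import math


def label_smoothed_cross_entropy(logits, target, eps):
    """Cross entropy against a smoothed target distribution.

    Assumed target: (1 - eps) on the positive class and eps / (K - 1) on each
    of the other K - 1 classes.
    """
    k = len(logits)
    if k < 2:
        raise ValueError("need at least two classes")
    # Log-softmax computed with the max-shift trick for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    off_target = eps / (k - 1)
    return -sum(
        ((1.0 - eps) if i == target else off_target) * lp
        for i, lp in enumerate(log_probs)
    )


logits = [2.0, 1.0, 0.1]   # invented example logits, class 0 is the positive class
print(label_smoothed_cross_entropy(logits, 0, 0.0))   # plain cross entropy
print(label_smoothed_cross_entropy(logits, 0, 0.1))   # with smoothing coefficient 0.1
```

With ε set to 0 the function reduces to ordinary cross entropy, so the smoothing coefficient obtained from the data analysis can be plugged in without changing the rest of the training loop.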

An electronic device, comprising: a processor;

A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, performs a method of face recognition based on data analysis.

The foregoing is merely a preferred embodiment of the invention and is not intended to limit the invention in any manner; those skilled in the art can readily practice the invention as shown and described in the drawings and detailed description herein; however, those skilled in the art should appreciate that they can readily use the disclosed conception and specific embodiments as a basis for designing or modifying other structures for carrying out the same purposes of the present invention without departing from the scope of the invention as defined by the appended claims; meanwhile, any changes, modifications, and evolutions of the equivalent changes of the above embodiments according to the actual techniques of the present invention are still within the protection scope of the technical solution of the present invention.

Claims

1. A face recognition method based on data analysis is characterized by comprising the following steps:

calculating a cross entropy loss function, and taking the smoothing coefficient and the algorithm learning rate under the label smoothing as a model hyper-parameter to calculate the cross entropy loss function;

2. A method of face recognition based on data analysis as claimed in claim 1, characterized in that: in the step of calculating the long-tail distribution degree, the long-tail distribution degree of the data is expressed as:

$$D_{KL}(\text{data}) = \sum_i p_i \log \frac{p_i}{q_i}$$

where $D_{KL}(\text{data})$ denotes the long-tail distribution degree of the face training data set, namely the relative entropy between the ideal uniform distribution of the data set and the actual probability distribution of the current data set; $S = \{s_0, s_1, s_2, \dots, s_n\}$ is the face recognition training set; $s_i$ is the number of face images of each ID; $n$ is the number of IDs in the data set; $q_i$ is the true probability of the $i$-th face ID with respect to the data set; and $p_i$ is the uniform distribution with respect to $n$.

3. A method of face recognition based on data analysis as claimed in claim 2, characterized in that: the inverse correlation relation between the data long tail distribution degree and the smooth coefficient is

Where ε is a smoothing coefficient designed based on data analysis.

4. A face recognition method based on data analysis as claimed in claim 3, characterized in that: the algorithm learning rate calculation formula under label smoothing is

where $lr_{norm}$ denotes the face recognition learning rate without label smoothing, and $lr_{label\_smooth}$ denotes the algorithm learning rate under label smoothing.

5. A face recognition method based on data analysis as claimed in claim 3, characterized in that: the cross entropy loss function calculation formula is

6. An electronic device, characterized by comprising: a processor;

a memory; and a program, wherein the program is stored in the memory and configured to be executed by the processor, the program comprising instructions for carrying out the method according to any one of claims 1-5.

7. A computer-readable storage medium having stored thereon a computer program, characterized in that: the computer program is executed by a processor for performing the method according to any of claims 1-5.