CN112364809A - High-accuracy face recognition improved algorithm - Google Patents

High-accuracy face recognition improved algorithm

Info

Publication number
CN112364809A
CN112364809A (application CN202011332325.1A)
Authority
CN
China
Prior art keywords: LBP, DCT, formula, face, face recognition
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011332325.1A
Other languages
Chinese (zh)
Inventor
宋强 (Song Qiang)
张颖 (Zhang Ying)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology Liaoning USTL
Original Assignee
University of Science and Technology Liaoning USTL
Application filed by University of Science and Technology Liaoning (USTL)
Priority: CN202011332325.1A
Publication: CN112364809A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; Face representation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/26: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172: Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an improved high-accuracy face recognition algorithm based on a convolutional neural network. The technical scheme comprises the following steps. Step one: select a network structure of two symmetric LeNet branches connected in parallel. Step two: extract the global and local features of the input image with the combined DCT-LBP processing method. Step three: add cosine correction to the Softmax regression classification of the output layer to enhance generalization ability. The improved algorithm, based on a convolutional neural network, effectively raises face recognition accuracy and recovers the full information of face images with high accuracy.

Description

High-accuracy face recognition improved algorithm
Technical Field
The invention relates to the technical field of face recognition algorithms, in particular to an improved high-accuracy face recognition algorithm.
Background
In the prior art, face recognition methods are mainly based on deep learning; their speed and accuracy are better than those of other face recognition techniques but still need further improvement.
The dual-symmetric LeNet network is based on the classical convolutional neural network LeNet: two fully symmetric models run in parallel and in synchrony, realizing a network structure of two LeNet branches connected in parallel. Each branch processes the image independently and obtains its own high-level feature vector; merging and classification at the output layer then yield the improvement in accuracy.
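The parallel idea can be sketched in a few lines of numpy. This is only an illustration: the helper names `branch` and `dual_symmetric_features` are hypothetical, and each LeNet branch is reduced to a fixed random projection rather than the patent's actual convolution and pooling stack. The point is that the two branches produce their high-level feature vectors independently, and the vectors meet only at the output layer.

```python
import numpy as np

def branch(img, seed):
    """Stand-in for one LeNet branch: a fixed random projection followed
    by a nonlinearity, producing a 64-dimensional high-level feature."""
    rng = np.random.default_rng(seed)
    P = rng.standard_normal((64, img.size))
    return np.tanh(P @ img.ravel())

def dual_symmetric_features(view_a, view_b):
    """Two symmetric branches process their inputs independently;
    the feature vectors are concatenated only at the output layer."""
    return np.concatenate([branch(view_a, 0), branch(view_b, 1)])
```

In the full system the two inputs would be the DCT-processed and LBP-processed views of the same face image.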
In the combined DCT-LBP processing, the DCT (a Fourier-related transform) extracts low-frequency information in the frequency domain and preserves the global characteristics of the image, while LBP feature extraction describes image texture by analyzing the relations between neighboring pixels and achieves good results for local features.
Softmax regression classification is a linear multi-class model that extends the Logistic regression model. For multi-class problems it converts the weighted scores into a probability distribution, thereby realizing the classification of the feature information.
Disclosure of Invention
To overcome the technical problem of low face recognition accuracy, the invention provides an improved high-accuracy face recognition algorithm that raises recognition accuracy without affecting training efficiency.
To achieve this purpose, the invention adopts the following technical scheme: on the basis of a classical convolutional neural network, a network structure of two symmetric LeNet branches connected in parallel is adopted, features are extracted with the combined DCT-LBP processing method, and cosine correction is applied to the regression classification at the output layer.
An improved high-accuracy face recognition algorithm comprises the following steps:
Step one: select a network structure of two symmetric LeNet branches connected in parallel. Based on the classical convolutional neural network model LeNet, the synchronous model uses two parallel networks that process the image separately, obtain high-level feature vectors independently, and merge them at the output layer.
Step two: extract the global and local features of the input image with the combined DCT-LBP processing method. LBP processing applies pixel-level local-binary-pattern coding to the face image and builds a statistical histogram of the codes, which serves as the spatially invariant local feature of the face. DCT processing applies the discrete cosine transform to the face image information; the resulting low-frequency coefficients serve as the global feature of the face.
Step three: when the image information reaches the output layer, compare and classify the face image information against the information in the database with Softmax regression classification to obtain the correct, complete identity information. Cosine correction is added to the regression classification to reduce redundancy, enhance generalization ability, and reduce overfitting.
Further, the second step is specifically as follows: the combined DCT-LBP processing uses the binarization of pixel-level LBP coding to obtain a statistical histogram of the transitions in the local binary pattern, which extracts the local features of the face information; the calculation formula is shown as formula (1):

$$f_{LBP}(m_c, n_c) = \sum_{i=0}^{7} S(a_i - a_c)\, 2^i \tag{1}$$

wherein

$$S(x) = \begin{cases} 1, & x \geq 0 \\ 0, & x < 0 \end{cases}$$

In the formula: $a_c$ is the gray value of the central pixel, $a_i$ is the gray value of the $i$-th surrounding neighborhood pixel, $S$ is the threshold function, and $f_{LBP}(m_c, n_c)$ is the LBP value of the central pixel;
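A minimal numpy sketch of the LBP computation of formula (1), assuming a 3×3 window with 8 neighbors read clockwise from the top-left corner (the patent does not fix the neighbor ordering, so the exact code values below are illustrative):

```python
import numpy as np

def lbp_value(patch):
    """LBP code of the center pixel of a 3x3 patch, per formula (1):
    sum of S(a_i - a_c) * 2^i over the 8 neighbors."""
    c = patch[1, 1]
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum((1 if a >= c else 0) << i for i, a in enumerate(neighbors))

def lbp_histogram(img):
    """Normalized 256-bin statistical histogram of LBP codes over the
    image interior: the spatially invariant local face feature."""
    h, w = img.shape
    codes = [lbp_value(img[r - 1:r + 2, c - 1:c + 2])
             for r in range(1, h - 1) for c in range(1, w - 1)]
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / hist.sum()
```

For a patch whose top row and left neighbor exceed the center, only those bits are set; a flat patch yields the all-ones code 255.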
The DCT yields low-frequency coefficients that serve as the global feature of the face information; the calculation formula is shown as formula (2):

$$F(u,v) = c(u)\, c(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, \cos\frac{(2x+1)u\pi}{2N}\, \cos\frac{(2y+1)v\pi}{2N} \tag{2}$$

wherein

$$c(u) = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}, & u \neq 0 \end{cases}$$

In the formula: $F(u,v)$ is the DCT coefficient at position $(u,v)$ of the transformed matrix, and $f(x,y)$ is the value at position $(x,y)$ of the image data matrix;
the LBP and DCT are subjected to weighted fusion joint processing, and the calculation formula is shown as formula (3):
S=a·DCT+b·LBP (3)
in the formula: a is the weighting coefficient of the DCT feature, b is the weighting coefficient of the LBP feature, a + b = 1, and S is the weighted image.
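Formula (3) reduces to a one-line weighted sum. The 0.875/0.125 default below is the 7:1 global-to-local ratio that the experiments later in the document report as best on LFW; it is used here only as an illustrative default:

```python
import numpy as np

def fuse(dct_feat, lbp_feat, a=0.875, b=0.125):
    """Weighted fusion S = a*DCT + b*LBP with a + b = 1 (formula (3))."""
    assert abs(a + b - 1.0) < 1e-12, "weights must sum to 1"
    return a * np.asarray(dct_feat) + b * np.asarray(lbp_feat)
```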
Further, in step three, Softmax regression is a linear multi-class model that extends the Logistic regression model; for multi-class problems it converts the weighted scores into a probability distribution, and the loss function is shown in formula (4):

$$L_S = -\frac{1}{M} \sum_{i=1}^{M} \log \frac{e^{W_{y_i}^{T} x_i + b_{y_i}}}{\sum_{j=1}^{N} e^{W_j^{T} x_i + b_j}} \tag{4}$$

In the formula: $M$ is the number of samples and $N$ the number of classes; $x_i$ is the feature vector of the $i$-th sample and $y_i$ its class label; $W$ and $b$ are the weight matrix and bias vector of the fully connected layer, $W_j$ is the $j$-th column of the weight matrix, and $b_j$ the corresponding bias term;
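A numerically stable numpy sketch of the Softmax cross-entropy of formula (4), with W stored as a (D, N) matrix whose columns are the class weight vectors:

```python
import numpy as np

def softmax_loss(X, y, W, b):
    """Softmax cross-entropy over M samples and N classes, formula (4)."""
    logits = X @ W + b                                   # (M, N) scores
    logits = logits - logits.max(axis=1, keepdims=True)  # stability shift
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(y)), y].mean()
```

With all-zero inputs and weights every class is equally likely, so the loss equals ln N.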
cosine correction is added to the Softmax regression classification to eliminate the large intra-class variation left by the loss function, making each class more compact and the features more discriminative; the intra-class cosine similarity loss function is shown in formula (5):

$$L_C = \frac{1}{M} \sum_{i=1}^{M} \left(1 - \cos \theta_{y_i, i}\right) \tag{5}$$

In the formula: $\theta_{y_i, i}$ is the angle between the feature vector of the $i$-th sample and the weight vector of its class;

to facilitate forward and backward propagation, the intra-class cosine similarity loss function is rewritten as shown in formula (6):

$$L_C = \frac{1}{M} \sum_{i=1}^{M} \left(1 - \frac{W_{y_i}^{T} x_i}{\lVert W_{y_i} \rVert \, \lVert x_i \rVert}\right) \tag{6}$$

In the formula: $\frac{W_{y_i}^{T} x_i}{\lVert W_{y_i} \rVert \lVert x_i \rVert} = \cos \theta_{y_i, i}$ is the actual input of the loss layer, and the function effectively describes intra-class variation;

to make the learned features discriminative, training proceeds under the joint supervision of the Softmax loss and the intra-class cosine similarity loss, giving the expression shown in formula (7):

$$L = L_S + \lambda L_C \tag{7}$$

In the formula: $\lambda$ is a scalar that balances the two loss functions;

adding intra-class cosine correction between the features and the weight vectors in the output-layer Softmax regression classification makes each class more compact and keeps different classes as far apart as possible, which enhances the generalization ability of the convolutional neural network model, reduces overfitting, and raises the recognition rate.
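A self-contained sketch of the co-supervised objective of formulas (6) and (7). The value λ = 0.1 is an arbitrary illustration, not a value fixed by the patent, and W again holds the class weight vectors as columns:

```python
import numpy as np

def cosine_loss(X, W, y):
    """Intra-class cosine similarity loss, formula (6): mean of
    1 - cos(theta) between each feature x_i and its class weight W[:, y_i]."""
    Wy = W[:, y].T                                        # (M, D)
    cos = (X * Wy).sum(axis=1) / (
        np.linalg.norm(X, axis=1) * np.linalg.norm(Wy, axis=1))
    return (1.0 - cos).mean()

def joint_loss(X, y, W, b, lam=0.1):
    """Co-supervised objective L = L_softmax + lambda * L_cos, formula (7)."""
    logits = X @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)
    ls = -(logits[np.arange(len(y)), y]
           - np.log(np.exp(logits).sum(axis=1))).mean()
    return ls + lam * cosine_loss(X, W, y)
```

When every feature is perfectly aligned with its class weight vector the cosine term vanishes and only the Softmax term remains, which is the compact intra-class regime the correction drives toward.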
During training, to keep the cosine similarity measure consistent with that of the testing stage, the Euclidean distance between samples is converted into a cosine distance: the weights and features are normalized to the same value $S$, the metric is then learned automatically, and the face samples of different classes separate well on a hypersphere; the final joint expression is shown in formula (8):

$$L = -\frac{1}{M} \sum_{i=1}^{M} \log \frac{e^{S \cos \theta_{y_i, i}}}{\sum_{j=1}^{N} e^{S \cos \theta_{j, i}}} + \lambda L_C \tag{8}$$

In the formula: $\lambda$ is the balance coefficient of the normalized joint expression;
in an LFW database with Face image information of nearly 5000 people in different environments, a Casia-Web Face database is used as a training set, and an LFW database is used as a test set; and verifying the accuracy of the improved face recognition algorithm by using the data comparison of the test results.
Based on the dual-symmetric LeNet network structure, the input face image undergoes combined DCT-LBP processing to extract global and local features respectively, cosine correction is added to the output-layer Softmax regression classification, the accuracy of the recognized information is confirmed, and the effectiveness of the algorithm is verified by the recognition results.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention adopts a network structure of two symmetric LeNet branches connected in parallel, extracts features from the face image with the combined DCT-LBP processing method, and classifies and matches the global and local features at the output layer with cosine-corrected Softmax regression, which greatly improves face recognition accuracy. The structural optimization lets each branch obtain its high-level feature vector independently, completing the feature information; combining frequency-domain and spatial-domain information gives full weight to both global and local feature extraction, and their complementarity reaches the optimal feature extraction state; in classification, the cosine-corrected regression enhances generalization ability, reduces overfitting, and optimizes accuracy.
2. The texture details described by LBP features are high-frequency in character, while the low-frequency coefficients of the DCT are complementary to them, so combining the two describes face features better and improves the performance of the face detection and recognition system. The combined processing removes image noise, reduces the data volume, recovers the true image information, and strengthens the training of the convolutional network. The invention uses the combined DCT-LBP processing method to extract global and local features separately and obtain high-level feature vectors independently, compensating both for the loss of high-frequency information in the DCT and for the limitations of LBP feature extraction; the combined DCT-LBP processing therefore performs well in feature extraction.
3. Adding intra-class cosine correction between the features and the weight vectors in the output-layer Softmax regression classification makes each class more compact and keeps different classes as far apart as possible, which enhances the generalization ability of the convolutional neural network model, reduces overfitting, and raises the recognition rate.
Drawings
FIG. 1 is a general overview of the network architecture of the improved algorithm of the present invention;
FIG. 2 is a network training process overview of the present invention.
Detailed Description
The following detailed description of the present invention will be made with reference to the accompanying drawings.
The invention jointly considers the optimization of the network structure, the completeness of image information extraction, and the improvement of classification output. It adopts a network structure of two symmetric LeNet branches in parallel, the combined DCT-LBP image feature extraction method, and cosine-corrected classification and recognition, which compensate for the weakness of single features and for classification overfitting, enhance generalization ability, and yield higher face recognition accuracy.
The invention provides a high-accuracy face recognition improved algorithm, which comprises the following steps:
the method comprises the following steps: selecting a network structure with double-symmetrical LeNet parallel connection: based on a classical convolutional neural network model LeNet, a double-symmetry LeNet parallel connection network structure is adopted, two parallel networks are adopted for a synchronous model, image processing is respectively carried out, high-level feature vectors can be independently obtained, and merging is carried out at an output layer.
Step two: respectively extracting global features and local features of the input image by adopting a DCT-LBP combined processing method: the LBP processing is a local binary pattern that is used to perform pixel-level LBP coding on a face image and then to obtain a statistical histogram of the coded LBP coding as a local feature of a face with spatial invariance. DCT processing means that the obtained low-frequency coefficient is used as the local feature of the human face after the human face image information is subjected to DCT discrete cosine transform. Because the texture detail information of the LBP feature description shows high-frequency characteristics, and the low-frequency coefficient after DCT discrete transformation has complementarity, the face feature description is better carried out by combining the characteristics, and the performance of the face detection and identification system is favorably improved. The combined processing is used for eliminating the impurities of the image, reducing the data volume, recovering the real image information and enhancing the training effect of the convolution network. The invention utilizes a DCT-LBP combined processing method to respectively extract image information of the global characteristic and the local characteristic and independently obtain a high-level characteristic vector. The problem of high-frequency information loss in the DCT process and the limitation of LBP feature extraction are made up. Therefore, the DCT-LBP combined processing has good effect in the characteristic extraction process.
The combined DCT-LBP processing specifically uses the binarization of pixel-level LBP coding to obtain a statistical histogram of the transitions in the local binary pattern, which extracts the local features of the face information. The calculation formula is shown as formula (1):

$$f_{LBP}(m_c, n_c) = \sum_{i=0}^{7} S(a_i - a_c)\, 2^i \tag{1}$$

wherein

$$S(x) = \begin{cases} 1, & x \geq 0 \\ 0, & x < 0 \end{cases}$$

In the formula: $a_c$ is the gray value of the central pixel, $a_i$ is the gray value of the $i$-th surrounding neighborhood pixel, $S$ is the threshold function, and $f_{LBP}(m_c, n_c)$ is the LBP value of the central pixel.
The DCT then yields low-frequency coefficients that serve as the global feature of the face information. The calculation formula is shown as formula (2):

$$F(u,v) = c(u)\, c(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, \cos\frac{(2x+1)u\pi}{2N}\, \cos\frac{(2y+1)v\pi}{2N} \tag{2}$$

wherein

$$c(u) = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}, & u \neq 0 \end{cases}$$

In the formula: $F(u,v)$ is the DCT coefficient at position $(u,v)$ of the transformed matrix, and $f(x,y)$ is the value at position $(x,y)$ of the image data matrix;
the LBP and DCT are subjected to weighted fusion joint processing, and the extraction calculation formula is shown as formula (3):
S=a·DCT+b·LBP (3)
in the formula: a is the weighting coefficient of the DCT feature, b is the weighting coefficient of the LBP feature, a + b = 1, and S is the weighted image.
Step three: after this series of processing the image information reaches the output layer, where Softmax regression classification compares and classifies the face image information against the information in the database to obtain the correct, complete identity information. Adding cosine correction to the regression classification reduces redundancy, enhances generalization ability, and reduces overfitting.
Softmax regression is a linear multi-class model that extends the Logistic regression model. For multi-class problems it converts the weighted scores into a probability distribution, and the loss function is shown in formula (4):

$$L_S = -\frac{1}{M} \sum_{i=1}^{M} \log \frac{e^{W_{y_i}^{T} x_i + b_{y_i}}}{\sum_{j=1}^{N} e^{W_j^{T} x_i + b_j}} \tag{4}$$

In the formula: $M$ is the number of samples and $N$ the number of classes; $x_i$ is the feature vector of the $i$-th sample and $y_i$ its class label; $W$ and $b$ are the weight matrix and bias vector of the fully connected layer, $W_j$ is the $j$-th column of the weight matrix, and $b_j$ the corresponding bias term.
Cosine correction is added to the Softmax regression classification to eliminate the large intra-class variation left by the loss function, making each class more compact and the features more discriminative; the intra-class cosine similarity loss function is shown in formula (5):

$$L_C = \frac{1}{M} \sum_{i=1}^{M} \left(1 - \cos \theta_{y_i, i}\right) \tag{5}$$

In the formula: $\theta_{y_i, i}$ is the angle between the feature vector of the $i$-th sample and the weight vector of its class.
To facilitate forward and backward propagation, the intra-class cosine similarity loss function is rewritten as shown in formula (6):

$$L_C = \frac{1}{M} \sum_{i=1}^{M} \left(1 - \frac{W_{y_i}^{T} x_i}{\lVert W_{y_i} \rVert \, \lVert x_i \rVert}\right) \tag{6}$$

In the formula: $\frac{W_{y_i}^{T} x_i}{\lVert W_{y_i} \rVert \lVert x_i \rVert} = \cos \theta_{y_i, i}$ is the actual input of the loss layer, and the function effectively describes intra-class variation.
To make the learned features discriminative, training proceeds under the joint supervision of the Softmax loss and the intra-class cosine similarity loss, giving the expression shown in formula (7):

$$L = L_S + \lambda L_C \tag{7}$$

In the formula: $\lambda$ is a scalar that balances the two loss functions.
Adding intra-class cosine correction between the features and the weight vectors in the output-layer Softmax regression classification makes each class more compact and keeps different classes as far apart as possible, which enhances the generalization ability of the convolutional neural network model, reduces overfitting, and raises the recognition rate. During training, to keep the cosine similarity measure consistent with that of the testing stage, the Euclidean distance between samples is converted into a cosine distance: the weights and features are normalized to the same value $S$, the metric is then learned automatically, and the face samples of different classes separate well on a hypersphere. The final joint expression is shown in formula (8):

$$L = -\frac{1}{M} \sum_{i=1}^{M} \log \frac{e^{S \cos \theta_{y_i, i}}}{\sum_{j=1}^{N} e^{S \cos \theta_{j, i}}} + \lambda L_C \tag{8}$$

In the formula: $\lambda$ is the balance coefficient of the normalized joint expression.
The CASIA-WebFace database is used as the training set, and the LFW database, which contains face images of nearly 5000 people captured in different environments, is used as the test set. The accuracy of the improved face recognition algorithm is verified by comparing the test results.
Comparative data analysis of the experiments on the LFW face database shows that when a = 0.875 and b = 0.125 in the DCT-LBP processing, i.e. a global-to-local feature extraction ratio of approximately 7:1, the recognition effect of the cosine-corrected network of two parallel symmetric LeNet branches is best. Recognition accuracy rose from 97.20% to 99.05%. The improved high-accuracy face recognition algorithm thus achieves a good effect.
The advantages of the improved high-accuracy face recognition algorithm are the optimization of the network structure, the combination of frequency-domain and spatial-domain information in feature extraction, and the cosine-correction optimization in the regression classification. The complementarity between global and local feature extraction and the generalization ability of the classifier reinforce each other, reaching the optimal recognition accuracy.
The invention is described in further detail below with reference to the figures and the detailed description.
First, a structure of two symmetric LeNet branches connected in parallel is adopted, and the combined DCT-LBP processing method extracts the local and global features of the face image information respectively. After network training steps such as convolution and down-sampling, the feature information is combined in the fully connected layer, and intra-class cosine correction is introduced in the regression classification of the output layer; the training effect is good.
The invention provides an improved high-accuracy face recognition algorithm; FIG. 1 gives a general overview of the network structure of the improved algorithm, and FIG. 2 an overview of the network training process. The method comprises the following steps:
Step 1: a structure of two symmetric LeNet branches connected in parallel is used; the synchronous model adopts two parallel networks that process the image separately, obtain high-level feature vectors independently, and merge them at the output layer.
Step 2: the combined DCT-LBP processing method extracts the feature information of the face image. LBP processing is in fact a local binary pattern: it applies pixel-level LBP coding to the face image information and builds a statistical histogram of the codes as the spatially invariant local feature of the face. DCT processing applies the discrete cosine transform to the face image information; the resulting low-frequency coefficients serve as the global feature of the face. The combined DCT-LBP processing merges and complements the advantages of both feature extractors and performs better at extracting image information for face recognition.
The calculation formula for obtaining the local feature information by LBP binarization is shown in formula (1):

$$f_{LBP}(m_c, n_c) = \sum_{i=0}^{7} S(a_i - a_c)\, 2^i \tag{1}$$

wherein

$$S(x) = \begin{cases} 1, & x \geq 0 \\ 0, & x < 0 \end{cases}$$
The DCT then yields low-frequency coefficients that serve as the global feature of the face information. The calculation formula is shown as formula (2):

$$F(u,v) = c(u)\, c(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, \cos\frac{(2x+1)u\pi}{2N}\, \cos\frac{(2y+1)v\pi}{2N} \tag{2}$$

wherein

$$c(u) = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}, & u \neq 0 \end{cases}$$
The LBP and DCT are subjected to weighted fusion joint processing, and the extraction calculation formula is shown as formula (3):
S=a·DCT+b·LBP (3)
and 3, after a series of processing, the image information reaches an output layer, and the face image information and the information in the database are compared and classified by adopting Softmax regression classification to obtain correct and complete figure information. The Softmax regression is a linear multi-classification model, an extension of the Logistic regression model. On the multi-classification problem, the true weight vector can be converted into a probability distribution, and the loss function expression is shown in formula (4):
Figure BDA0002796180760000083
in the regression classification, intraclass cosine correction is added, large intraclass change brought by a loss function is eliminated, so that the intraclass cosine correction is more compact, the characteristics are more discriminative, and the intraclass cosine similarity loss function is shown as a formula (5):
Figure BDA0002796180760000084
in the training process, in order to make the cosine similarity measurement consistent with the cosine similarity measurement in the testing stage, the Euclidean distance of the sample similarity is converted into the cosine distance, the weight and the feature are normalized to the same value S, the Euclidean distance is automatically learned, and the features of the human face samples in different classes can be well separated on the hypersphere. The network training has the capability of reducing redundancy, enhancing generalization capability and reducing overfitting phenomenon. The final joint expression is shown in formula (6):
Figure BDA0002796180760000085
and step 4, combining the feature extraction of the face image processed by the combination of the double symmetric network LeNet and the DCT-LBP and the output of the Softmax regression classification after cosine correction, so that the face recognition has very high accuracy in the improved algorithm.
The invention relates to a high-accuracy face recognition improved algorithm, and provides a new and more perfect high-accuracy algorithm for face image information recognition. And the superiority and effectiveness of the improved algorithm are proved through specific experimental data, the problem of low accuracy of the deep learning-based face recognition network training is solved to a great extent, the face recognition operation result of the improved algorithm is more accurate, and the method has great significance in the image processing fields of face recognition, artificial intelligence and the like.
The above embodiments are implemented on the premise of the technical solution of the present invention, and detailed embodiments and specific operation procedures are given, but the scope of the present invention is not limited to the above embodiments. The methods used in the above examples are conventional methods unless otherwise specified.

Claims (4)

1. A high-accuracy face recognition improved algorithm is characterized in that a network structure with double-symmetry LeNet parallel connection is adopted, feature extraction is carried out by using a DCT-LBP combined processing method, and regression classification cosine correction is carried out on an output layer, and the method comprises the following steps:
the method comprises the following steps: selecting a network structure with double-symmetrical LeNet parallel connection: based on a classical convolutional neural network model LeNet, a double-symmetry LeNet parallel connection network structure is adopted, a synchronous model adopts two parallel networks to respectively perform image processing, independently obtain high-level feature vectors and combine the high-level feature vectors in an output layer;
step two: extracting the global and local features of the input image respectively by the combined DCT-LBP processing method: LBP (local binary pattern) processing performs pixel-level LBP coding on the face image and obtains a statistical histogram from the codes, used as the spatially invariant local feature of the face; DCT processing applies the discrete cosine transform to the face image information and retains the low-frequency coefficients, used as the global feature of the face;
step three: when the image information reaches the output layer, Softmax regression classification compares the face image information against the information in the database to obtain correct and complete identity information; cosine correction is added to the regression classification, reducing redundancy, enhancing generalization capability, and mitigating overfitting.
2. The improved high-accuracy face recognition algorithm according to claim 1, wherein step two is specifically as follows: the combined DCT-LBP processing uses the binarization of pixel-level LBP coding to obtain a statistical histogram of the transition counts in the local binary pattern, extracting the local features of the face information; the calculation formula is shown as formula (1):
$$f_{LBP}(m_c, n_c) = \sum_{i=0}^{7} S(a_i - a_c) \cdot 2^i \quad (1)$$
wherein
$$S(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$
In the formula: $a_c$ is the gray value of the central pixel, $a_i$ is the gray value of the $i$-th pixel in the surrounding neighborhood, $S$ is the threshold function, and $f_{LBP}(m_c, n_c)$ is the LBP value of the central pixel;
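The pixel-level LBP coding of formula (1) can be sketched in NumPy as follows; the clockwise neighbour ordering and the 3×3 test image are illustrative assumptions, since the claim does not fix them:

```python
import numpy as np

def lbp_value(img, r, c):
    """LBP code of the pixel at (r, c) per formula (1): compare the 8
    neighbours a_i against the centre a_c with the threshold function S,
    and weight the resulting bits by 2**i."""
    a_c = img[r, c]
    # clockwise 8-neighbourhood starting at the top-left (illustrative ordering)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for i, (dr, dc) in enumerate(offsets):
        code += (1 if img[r + dr, c + dc] >= a_c else 0) << i  # S(a_i - a_c) * 2**i
    return code

def lbp_histogram(img):
    """256-bin statistical histogram of LBP codes over the interior
    pixels, used as the spatially invariant local face feature."""
    h, w = img.shape
    codes = [lbp_value(img, r, c)
             for r in range(1, h - 1) for c in range(1, w - 1)]
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist

img = np.array([[10, 20, 30],
                [40, 25, 10],
                [ 5, 60, 70]], dtype=np.uint8)
print(lbp_value(img, 1, 1))  # 180 for this neighbour ordering
```

The histogram of these codes, rather than the codes themselves, is what serves as the feature, which is why the descriptor is insensitive to where in the region each pattern occurs.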
DCT discrete transformation obtains a low-frequency coefficient as the global feature of the face information; the calculation formula is shown as formula (2):
$$F(u,v) = c(u)\,c(v) \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, \cos\frac{(2x+1)u\pi}{2M}\, \cos\frac{(2y+1)v\pi}{2N} \quad (2)$$
wherein
$$c(u) = \begin{cases} \sqrt{1/M}, & u = 0 \\ \sqrt{2/M}, & u \ne 0 \end{cases}$$
(and $c(v)$ likewise with $N$);
In the formula: $F(u,v)$ is the DCT coefficient, $(u,v)$ is the coordinate position of a value in the matrix after the DCT transform, and $f(x,y)$ are the values of the $M \times N$ image data matrix;
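Formula (2) can be implemented directly in NumPy; the orthonormal scaling factors c(u), c(v) follow the usual DCT-II convention, which is an assumption where the claim's figure is not reproduced:

```python
import numpy as np

def dct2(f):
    """2-D DCT of an image block f per formula (2), with
    c(u) = sqrt(1/M) for u = 0 and sqrt(2/M) otherwise (likewise c(v))."""
    M, N = f.shape
    x = np.arange(M)
    y = np.arange(N)
    # basis matrices: Cu[u, x] = cos((2x+1)u*pi / 2M), Cv[v, y] likewise
    Cu = np.cos((2 * x[None, :] + 1) * np.arange(M)[:, None] * np.pi / (2 * M))
    Cv = np.cos((2 * y[None, :] + 1) * np.arange(N)[:, None] * np.pi / (2 * N))
    cu = np.full(M, np.sqrt(2.0 / M)); cu[0] = np.sqrt(1.0 / M)
    cv = np.full(N, np.sqrt(2.0 / N)); cv[0] = np.sqrt(1.0 / N)
    return (cu[:, None] * cv[None, :]) * (Cu @ f @ Cv.T)

f = np.ones((4, 4))
F = dct2(f)
print(np.round(F, 6))
```

For a constant block, all energy lands in the DC coefficient F(0, 0); this concentration of image energy in the low-frequency coefficients is why they serve as a compact global feature of the face.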
the LBP and DCT are subjected to weighted fusion joint processing, and the calculation formula is shown as formula (3):
S = a·DCT + b·LBP (3)
in the formula: a is the weighting coefficient of the DCT feature, b is the weighting coefficient of the LBP feature, with a + b = 1; and S is the weighted fused image.
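A minimal sketch of the weighted fusion in formula (3); the feature vectors and the weight a = 0.75 are illustrative values, as the claim leaves a and b tunable:

```python
import numpy as np

def fuse(dct_feat, lbp_feat, a=0.5):
    """Weighted fusion per formula (3): S = a*DCT + b*LBP with b = 1 - a,
    so the constraint a + b = 1 holds by construction."""
    b = 1.0 - a
    return a * np.asarray(dct_feat, dtype=float) + b * np.asarray(lbp_feat, dtype=float)

print(fuse([2.0, 4.0], [0.0, 2.0], a=0.75))
```

Deriving b from a inside the function guarantees the convex combination, so the fused feature stays on the same scale as its two inputs.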
3. The improved high-accuracy face recognition algorithm according to claim 1, wherein in step three the Softmax regression is a linear multi-classification model, an extension of the Logistic regression model; on multi-classification problems it converts the output scores into a probability distribution, and the loss function expression is shown in formula (4):
$$L_S = -\frac{1}{M} \sum_{i=1}^{M} \log \frac{e^{W_{y_i}^{T} x_i + b_{y_i}}}{\sum_{j=1}^{N} e^{W_{j}^{T} x_i + b_j}} \quad (4)$$
In the formula: $M$ is the number of samples and $N$ the number of classes; $x_i$ is the feature vector of the $i$-th sample and $y_i$ its class label; $W$ and $b$ are respectively the weight matrix and bias vector of the fully connected layer, $W_j$ is the $j$-th column of the weight matrix, and $b_j$ is the corresponding bias term;
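Formula (4) is the standard softmax cross-entropy; a minimal NumPy sketch follows (the max-subtraction for numerical stability is an implementation detail not stated in the claim, and the random data is illustrative):

```python
import numpy as np

def softmax_loss(X, y, W, b):
    """Softmax cross-entropy per formula (4): mean negative
    log-probability of each sample's true class."""
    logits = X @ W + b                               # (M, N): W_j^T x_i + b_j
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    M = X.shape[0]
    return -log_prob[np.arange(M), y].mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))   # M = 4 samples, 3-D features
W = rng.normal(size=(3, 5))   # N = 5 classes
b = np.zeros(5)
y = np.array([0, 1, 2, 3])
print(softmax_loss(X, y, W, b))
```

As a sanity check, with all-zero weights and biases every class gets probability 1/N, so the loss reduces to log N.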
cosine correction is added to the Softmax regression classification to eliminate the large intra-class variation left by the loss function, making each class more compact and the features more discriminative; the intra-class cosine similarity loss function is shown as formula (5):
$$L_C = \frac{1}{M} \sum_{i=1}^{M} \left(1 - \cos\theta_{y_i, i}\right) \quad (5)$$
in the formula: $\theta_{y_i, i}$ is the included angle between the feature vector of the $i$-th sample and the weight vector of its corresponding class;
intra-class cosine correction between the features and the weight vectors in the output-layer Softmax regression classification makes each class more compact and pushes different classes as far apart as possible, enhancing the generalization capability of the convolutional neural network model, reducing the overfitting phenomenon, and improving the recognition rate;
in order to make the training objective consistent with the cosine similarity measure used in the testing stage, the Euclidean distance of sample similarity is converted into a cosine distance: the weights and features are normalized to the same value S, so that the angular separation is learned directly and the face features of different classes separate well on a hypersphere; the final joint expression is shown in formula (6):
$$L = -\frac{1}{M} \sum_{i=1}^{M} \log \frac{e^{S \cos\theta_{y_i, i}}}{\sum_{j=1}^{N} e^{S \cos\theta_{j, i}}} + \lambda L_C \quad (6)$$
in the formula: $\lambda$ is the balance coefficient of the normalized joint expression.
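The patent's formula images are not reproduced here, so the following NumPy sketch implements one standard normalized-softmax-plus-intra-class-cosine formulation consistent with the surrounding text: features and class weights are L2-normalized, rescaled by S (here `s`), and the balance coefficient λ (here `lam`) weights the intra-class term; the values s = 16 and lam = 0.5 are illustrative assumptions:

```python
import numpy as np

def joint_loss(X, y, W, s=16.0, lam=0.5):
    """Joint loss sketch: normalized softmax over s*cos(theta) plus
    lam times the intra-class cosine term (1 - cos(theta_yi))."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)   # normalize features
    Wn = W / np.linalg.norm(W, axis=0, keepdims=True)   # normalize class weights
    cos = Xn @ Wn                                       # (M, N) cosine similarities
    logits = s * cos
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    M = X.shape[0]
    l_softmax = -log_prob[np.arange(M), y].mean()
    l_cos = (1.0 - cos[np.arange(M), y]).mean()         # intra-class compactness
    return l_softmax + lam * l_cos

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 5))
y = np.array([0, 1, 2, 3])
print(joint_loss(X, y, W))
```

Because cos θ ≤ 1, the intra-class term is non-negative, so increasing λ can only increase the loss until the features align with their class weight vectors on the hypersphere.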
4. The improved high-accuracy face recognition algorithm according to claim 1, wherein the CASIA-WebFace database is used as the training set and the LFW database, which contains face images of nearly 5000 people in different environments, is used as the test set; the accuracy of the improved face recognition algorithm is verified by comparing the test results.
CN202011332325.1A 2020-11-24 2020-11-24 High-accuracy face recognition improved algorithm Pending CN112364809A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011332325.1A CN112364809A (en) 2020-11-24 2020-11-24 High-accuracy face recognition improved algorithm


Publications (1)

Publication Number Publication Date
CN112364809A true CN112364809A (en) 2021-02-12

Family

ID=74532855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011332325.1A Pending CN112364809A (en) 2020-11-24 2020-11-24 High-accuracy face recognition improved algorithm

Country Status (1)

Country Link
CN (1) CN112364809A (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165566A (en) * 2018-08-01 2019-01-08 中国计量大学 A kind of recognition of face convolutional neural networks training method based on novel loss function
CN110598584A (en) * 2019-08-26 2019-12-20 天津大学 Convolutional neural network face recognition algorithm based on wavelet transform and DCT


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Song Qiang et al., "Face Recognition Algorithm Based on Convolutional Neural Network", Journal of University of Science and Technology Liaoning *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113449848A (en) * 2021-06-28 2021-09-28 中国工商银行股份有限公司 Convolutional neural network training method, face recognition method and face recognition device
CN113837976A (en) * 2021-09-17 2021-12-24 重庆邮电大学 Multi-focus image fusion method based on combined multi-domain
CN113837976B (en) * 2021-09-17 2024-03-19 重庆邮电大学 Multi-focus image fusion method based on joint multi-domain
CN113569818A (en) * 2021-09-23 2021-10-29 湖南星汉数智科技有限公司 Face feature coding method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
Gholamalinezhad et al. Pooling methods in deep neural networks, a review
CN113378632B (en) Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method
CN107122809B (en) Neural network feature learning method based on image self-coding
CN105138973B (en) The method and apparatus of face authentication
Feng et al. Convolutional neural network based on bandwise-independent convolution and hard thresholding for hyperspectral band selection
CN111414862B (en) Expression recognition method based on neural network fusion key point angle change
Stuhlsatz et al. Feature extraction with deep neural networks by a generalized discriminant analysis
CN112364809A (en) High-accuracy face recognition improved algorithm
CN111695456B (en) Low-resolution face recognition method based on active discriminant cross-domain alignment
CN111709311A (en) Pedestrian re-identification method based on multi-scale convolution feature fusion
CN109815357B (en) Remote sensing image retrieval method based on nonlinear dimension reduction and sparse representation
Sheng et al. Learning-based road crack detection using gradient boost decision tree
CN112836671B (en) Data dimension reduction method based on maximized ratio and linear discriminant analysis
CN112800876A (en) Method and system for embedding hypersphere features for re-identification
CN110188827A (en) A kind of scene recognition method based on convolutional neural networks and recurrence autocoder model
CN111079514A (en) Face recognition method based on CLBP and convolutional neural network
CN111695455B (en) Low-resolution face recognition method based on coupling discrimination manifold alignment
Zhang et al. SAR target recognition using only simulated data for training by hierarchically combining CNN and image similarity
Zhang et al. Fusion of multifeature low-rank representation for synthetic aperture radar target configuration recognition
Liu et al. Mmran: A novel model for finger vein recognition based on a residual attention mechanism: Mmran: A novel finger vein recognition model
CN111259938B (en) Manifold learning and gradient lifting model-based image multi-label classification method
Tseng et al. An interpretable compression and classification system: Theory and applications
CN115965864A (en) Lightweight attention mechanism network for crop disease identification
CN111401434A (en) Image classification method based on unsupervised feature learning
CN114972904A (en) Zero sample knowledge distillation method and system based on triple loss resistance

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210212
