CN110298225A

CN110298225A - A method for facial feature localization under occlusion

Info

Publication number: CN110298225A
Application number: CN201910242013.2A
Authority: CN
Inventors: 舒畅; 李阳; 周宁; 傅志中; 李晓峰
Original assignee: University of Electronic Science and Technology of China
Current assignee: University of Electronic Science and Technology of China
Priority date: 2019-03-28
Filing date: 2019-03-28
Publication date: 2019-10-01

Abstract

The invention discloses a facial feature localization algorithm for occluded environments and belongs to the field of image processing. Its overall steps are: S1: build a data set and divide it into a training set and a test set; S2: normalize the data set and augment it; S3: build a facial feature point occlusion detection model; S4: train a coarse facial feature localization model and predict the rough positions of the facial feature points; S5: fuse the results of the coarse localization model and the occlusion detection model to create the target positions for the fine localization model; S6: train the fine facial feature localization model and predict the positions of the fine localization stage. The proposed algorithm can be applied to facial feature localization under severe occlusion and addresses the inaccuracy of existing facial feature localization algorithms in occluded environments.

Description

A method for facial feature localization under occlusion

Technical field

The present invention relates to the technical field of image processing, and in particular to a facial feature localization algorithm for occluded environments.

Background art

Facial feature localization, or face alignment (locating points such as the eyes, nose, mouth and chin), is essential for tasks such as face recognition, face tracking, facial animation and 3D face modeling. With the explosive growth of personal and online photos, a fully automatic, efficient and robust face alignment method is needed. For current methods in unconstrained environments these requirements remain challenging, because variations in facial appearance, illumination and partial occlusion make many facial points difficult to locate.

The goal of facial feature localization is to locate a set of facial points accurately. Accurate localization is a crucial step for many vision tasks, such as face verification, facial expression analysis and motion analysis, all of which require facial feature localization as a preprocessing step. Viewed as a whole, facial feature localization can be formulated as searching a face image for predefined facial points (also called the face shape), usually starting from a rough initial shape and refining the estimate step by step until convergence. The search typically draws on two sources of information: facial appearance and shape information. The latter explicitly models the spatial relationships between the positions of the facial points, so that the estimated points form a valid face shape. Some methods do not use shape information explicitly, but the two sources are usually combined. Although many methods achieve good results on public data sets, their performance is poor in harsh conditions such as severe occlusion, for example when the subject wears dark sunglasses or the face is blocked by other objects, mainly because of the local minimum problem. The occlusion state at each facial point position must therefore be determined, and this supplementary information then used for key point localization.

The approach currently popular in academia is to build progressive regression models. The local binary features model uses random forests to extract binary features around the facial points in the image and then completes the regression task of each stage with a support vector regression model; the conditional regression tree model uses random forests to extract patch information around the facial feature points and feeds it into a support vector regression model to obtain predictions; other work holds that auxiliary face attributes can help a model predict facial point positions, and proposes a dynamic early stopping strategy to address overfitting in auxiliary attribute learning, thereby improving model accuracy. None of these methods provides an explicit solution for occlusion, so their facial feature localization performance is poor under large-area occlusion.

Summary of the invention

The present invention addresses the problem that, in natural environments, partial occlusion currently makes facial key point localization inaccurate. It provides a facial feature localization algorithm that introduces an occlusion detection module and mainly comprises the following steps:

Step 1: data preprocessing: read the face images in the training set and the exact positions of the facial feature points of each image, convert the face images to grayscale, and whiten them by subtracting the mean and dividing by the standard deviation to obtain the normalized face images, as shown in Formula one.

IR_i = (I_i - μ) / σ    Formula one

where I_i denotes the i-th input face image, μ and σ denote the mean and standard deviation of the training set respectively, and IR_i denotes the i-th image after normalization.
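
As an illustration of Formula one, the whitening step can be sketched in Python with NumPy. The sketch assumes that μ and σ are computed per pixel over the training set (suggested by the mean and standard deviation pictures of Fig. 3) and adds a small constant `eps` to the denominator; the function name `whiten` and the `eps` guard are illustrative and not part of the original text.

```python
import numpy as np

def whiten(images, eps=1e-8):
    """Normalize a stack of grayscale face images of shape (N, H, W).

    Assumption: mu and sigma are per-pixel statistics of the training set.
    eps is a hypothetical guard against division by zero.
    """
    images = np.asarray(images, dtype=np.float64)
    mu = images.mean(axis=0)      # mean image over the training set
    sigma = images.std(axis=0)    # standard deviation image
    normalized = (images - mu) / (sigma + eps)
    return normalized, mu, sigma

# Toy training set: 8 random 4x4 "grayscale images".
rng = np.random.default_rng(0)
train = rng.integers(0, 256, size=(8, 4, 4))
normalized, mu, sigma = whiten(train)
```

The returned `mu` and `sigma` would be stored and reused to normalize test images, so that training and test data share the same statistics.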

Step 2: data set augmentation: augment the training set taken from the public data set. Flipping, translation and rotation expand the face images in the training set to 20 times the original number. The augmentation step also enlarges and shrinks the faces, to account for differences in results caused by faces occupying different proportions of the picture in practice.
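
Two of the augmentation operations named in Step 2 can be sketched as follows. This is a minimal example, not the procedure used to reach the 20-fold expansion; the helper names `flip_horizontal` and `translate`, the zero-fill policy and the (x, y) pixel convention for landmarks are assumptions made for illustration. The key point it shows is that the landmark coordinates must be transformed together with the image.

```python
import numpy as np

def flip_horizontal(image, points):
    """Mirror an (H, W) image and its (J, 2) array of (x, y) landmark pixels."""
    width = image.shape[1]
    mirrored = np.asarray(points, dtype=np.float64).copy()
    mirrored[:, 0] = (width - 1) - mirrored[:, 0]
    # Left/right landmark indices (e.g. the two eyes) would also need to be
    # swapped; that depends on the annotation scheme and is omitted here.
    return image[:, ::-1].copy(), mirrored

def translate(image, points, dx, dy):
    """Shift image content by integer (dx, dy), zero-filling uncovered pixels."""
    h, w = image.shape
    shifted = np.zeros_like(image)
    # Source window that remains inside the frame after the shift.
    x0, x1 = max(0, -dx), min(w, w - dx)
    y0, y1 = max(0, -dy), min(h, h - dy)
    shifted[y0 + dy:y1 + dy, x0 + dx:x1 + dx] = image[y0:y1, x0:x1]
    moved = np.asarray(points, dtype=np.float64) + np.array([dx, dy], dtype=np.float64)
    return shifted, moved

image = np.arange(12).reshape(3, 4)
points = np.array([[0, 0], [2, 2]])
flipped, mirrored = flip_horizontal(image, points)
shifted, moved = translate(image, points, dx=1, dy=0)
```

Rotation and scaling follow the same pattern: apply one geometric transform to the pixels and the same transform to the landmark coordinates.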

Step 3: compute the mean shape: average the facial point positions of the augmented training set obtained in Step 2 to get the initial shape S₀ used for initial facial point localization, and convert S₀ to S_0,i, that is, into the coordinate system of each face image in the training set.
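
The text does not state how S₀ is mapped into the coordinate system of each image. One common choice, assumed here purely for illustration, is to express every shape relative to its face bounding box, average in that normalized space, and map the mean back through each image's box. The box format (x, y, w, h) and the function names are hypothetical.

```python
import numpy as np

def mean_shape(shapes, boxes):
    """Average shape in box-normalized coordinates.

    shapes: (N, J, 2) landmark pixels; boxes: (N, 4) as (x, y, w, h).
    The box-based normalization is an assumption, not taken from the source.
    """
    shapes = np.asarray(shapes, dtype=np.float64)
    boxes = np.asarray(boxes, dtype=np.float64)
    origin = boxes[:, None, :2]   # (N, 1, 2) top-left corners
    size = boxes[:, None, 2:]     # (N, 1, 2) widths and heights
    return ((shapes - origin) / size).mean(axis=0)

def to_image_coords(s0, box):
    """Map the normalized mean shape S0 into one image, giving S_0,i."""
    box = np.asarray(box, dtype=np.float64)
    return s0 * box[2:] + box[:2]

shapes = [[[10, 10], [30, 30]], [[120, 120], [160, 160]]]
boxes = [[0, 0, 40, 40], [100, 100, 80, 80]]
s0 = mean_shape(shapes, boxes)
s0_i = to_image_coords(s0, boxes[1])
```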

Step 4: build the occlusion detection model: create an occlusion detection multitask regression model based on the mobile network model (MobileNet). It takes the training set built in Step 2 as input and outputs an occlusion state vector, each dimension of which is the occlusion state at one facial point position in the face image, as shown in Formula two.

where j = 0, 1, ..., J-1, and J denotes the number of facial feature key points.

Step 5: build the coarse localization model: build a facial feature localization regression model based on a convolutional neural network model (Convolutional Neural Networks). For the i-th face image, the prediction target of the model is the difference ΔS_r,i between the ground-truth position S_gt,i and S_0,i, as shown in Formula three.

ΔS_r,i = S_gt,i - S_0,i    Formula three
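
Formula three can be illustrated with a few lines of NumPy: the regression target is the offset from the initial shape to the ground truth, and a coarse position is recovered by adding a predicted offset back to the initial shape. The function names are illustrative, and the "prediction" below is simply the exact target rather than a network output.

```python
import numpy as np

def regression_target(s_gt, s_0):
    """Formula three: offset from the initial shape S_0,i to the ground truth S_gt,i."""
    return np.asarray(s_gt, dtype=np.float64) - np.asarray(s_0, dtype=np.float64)

def coarse_position(s_0, delta_pred):
    """Add a predicted offset back onto the initial shape."""
    return np.asarray(s_0, dtype=np.float64) + np.asarray(delta_pred, dtype=np.float64)

s_gt = [[12, 8], [30, 31]]    # ground-truth landmarks of image i
s_0 = [[10, 10], [30, 30]]    # mean shape in the coordinates of image i
delta = regression_target(s_gt, s_0)
recovered = coarse_position(s_0, delta)
```

Regressing the offset rather than the absolute position keeps the targets small and centered, which is the usual motivation for starting from a mean shape.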

Step 6: generate the target positions for the fine localization model: combine the facial feature point occlusion state vector obtained in Step 4 with the rough facial feature point positions obtained in Step 5 to obtain the corrected target positions for the fine localization stage, as shown in Formula four.

Here weight denotes the scaling amplitude applied at an occluded position, the formula also uses the predicted occlusion state of each point, ΔS_r,i,j denotes the difference between the exact position and the coarse position, and S_p,i,j denotes the target position of the fine localization stage.

Step 7: build the fine localization model: build the fine localization stage regression model based on a convolutional neural network model (Convolutional Neural Networks). The training targets are the target positions computed in Step 6. Because the tilt of the face affects the localization result, the model additionally learns the coefficients of a face correction matrix. The input face image is the corrected image, which further reduces the fitting difficulty of the model; after the model has been fitted, the output of the fine localization stage is obtained by the inverse correction operation.
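
The correction and inverse correction in Step 7 can be illustrated with a 2x3 affine matrix. The text does not specify how the correction matrix is parameterized or learned; the sketch below assumes an in-plane rotation about a chosen center, and all helper names are hypothetical. It only shows that predictions made in the corrected frame can be mapped back to the original image by inverting the matrix.

```python
import numpy as np

def rotation_about(center, degrees):
    """2x3 affine matrix rotating by `degrees` about `center` (assumed form)."""
    theta = np.deg2rad(degrees)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    center = np.asarray(center, dtype=np.float64)
    return np.hstack([rot, (center - rot @ center)[:, None]])

def apply_affine(matrix, points):
    """Apply a 2x3 affine matrix to (J, 2) points given as (x, y)."""
    points = np.asarray(points, dtype=np.float64)
    return points @ matrix[:, :2].T + matrix[:, 2]

def invert_affine(matrix):
    """Return the 2x3 matrix that undoes the given 2x3 affine transform."""
    linear_inv = np.linalg.inv(matrix[:, :2])
    return np.hstack([linear_inv, (-linear_inv @ matrix[:, 2])[:, None]])

correction = rotation_about((50, 50), 15)       # stand-in for the learned matrix
landmarks = np.array([[40.0, 45.0], [60.0, 45.0], [50.0, 70.0]])
corrected = apply_affine(correction, landmarks)  # points in the corrected frame
restored = apply_affine(invert_affine(correction), corrected)
```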

Step 8: apply inverse scaling to the fine localization prediction obtained in Step 7 to obtain the final result of the invention, as shown in Formula five.

Here the formula takes the predicted fine localization result as input, eps denotes a very small positive number that prevents division by zero, and S_o,i,j denotes the final prediction result of the algorithm.

The beneficial effect of the invention is that deep learning models applied in stages predict the specific positions of the facial points in a face image under occlusion. To overcome the inability of conventional facial feature localization models to predict facial point positions at occluded places accurately, the algorithm uses occlusion detection to predict whether each facial point is occluded. This supplies additional information for facial key point position prediction and further improves model accuracy.

Brief description of the drawings

Fig. 1 is a flow chart of the method for facial feature localization under occlusion provided by the invention.

Fig. 2 is a grayscale face image input to the algorithm.

Fig. 3 shows the mean picture (a) and the standard deviation picture (b) of the training set face images.

Fig. 4 shows the coarse localization result.

Fig. 5 shows the coarse localization result with occlusion information added.

Fig. 6 shows the final result of the algorithm.

Specific embodiment

The invention is further described below with reference to the accompanying drawings and an embodiment.

The purpose of this embodiment is to predict the position of each facial key point of a face image under occlusion. It comprises the following steps, and the overall process is shown in Fig. 1:

Step 1: data preprocessing: read the face images in the training set and the exact positions of the facial feature points of each image, convert the face images to grayscale, as shown in Fig. 2, and whiten them by subtracting the mean and dividing by the standard deviation to obtain the normalized face images. The mean image and the standard deviation image are shown in Fig. 3.

Step 2: data set augmentation: augment the training set taken from the public data set.

Step 4: build the occlusion detection model: create an occlusion detection multitask regression model based on the mobile network model (MobileNet). It takes the training set built in Step 2 as input and outputs an occlusion state vector, each dimension of which is the occlusion state at one facial point position in the face image, as shown in Formula one.

Step 5: build the coarse localization model: build a facial feature localization regression model based on a convolutional neural network model (Convolutional Neural Networks). For the i-th face image, the prediction target of the model is the difference ΔS_r,i between the ground-truth position S_gt,i and S_0,i, as shown in Formula two. The facial feature positions predicted by the coarse localization model are shown in Fig. 4. Fig. 5 adds the occlusion state information from Step 4: dots mark points predicted as unoccluded, and crosses mark points predicted as occluded.

ΔS_r,i = S_gt,i - S_0,i    Formula two

Step 6: generate the target positions for the fine localization model: combine the facial feature point occlusion state vector obtained in Step 4 with the rough facial feature point positions obtained in Step 5 to obtain the corrected target positions for the fine localization stage, as shown in Formula three.

Step 8: apply inverse scaling to the fine localization prediction obtained in Step 6 to obtain the final result of the invention, as shown in Formula four.

Here the formula takes the predicted fine localization result as input, eps denotes a very small positive number that prevents division by zero, and S_o,i,j denotes the final prediction result of the algorithm, as shown in Fig. 6.

The above embodiment does not limit the present invention, and the invention is not restricted to the example given. Variations, modifications, additions or substitutions made by those skilled in the art within the scope of the technical solution of the invention also fall within its scope of protection.

Claims

1. A method for facial feature localization under occlusion, characterized in that the overall steps of the method are:

Step 1: data preprocessing: read the face images in the training set and the exact positions of the facial feature points of each image, convert the face images to grayscale, and whiten the face images to obtain the normalized face images;

Step 2: data set augmentation: augment the training set taken from the public data set;

Step 3: compute the mean shape: average the facial point positions of the augmented training set obtained in Step 2 to get the initial shape S₀ used for initial facial point localization, and convert S₀ to S_0,i, that is, into the coordinate system of each face image in the training set;

Step 4: build the occlusion detection model: create an occlusion detection multitask regression model based on the mobile network model (MobileNet), which takes the training set built in Step 2 as input and outputs an occlusion state vector, each dimension of which is the occlusion state at one facial point position in the face image, as shown in Formula one;

where j = 0, 1, ..., J-1, and J denotes the number of facial feature key points;

Step 5: build the coarse localization model: build a facial feature localization regression model based on a convolutional neural network model (Convolutional Neural Networks); for the i-th face image, the prediction target of the model is the difference ΔS_r,i between the ground-truth position S_gt,i and S_0,i, as shown in Formula two;

ΔS_r,i = S_gt,i - S_0,i    Formula two

Step 6: generate the target positions for the fine localization model: combine the facial feature point occlusion state vector obtained in Step 4 with the rough facial feature point positions obtained in Step 5 to obtain the corrected target positions for the fine localization stage, as shown in Formula three;

where weight denotes the scaling amplitude applied at an occluded position, the formula also uses the predicted occlusion state of each point, ΔS_r,i,j denotes the difference between the exact position and the coarse position, and S_p,i,j denotes the target position of the fine localization stage;

Step 7: build the fine localization model: build the fine localization stage regression model based on a convolutional neural network model (Convolutional Neural Networks), the training targets being the target positions computed in Step 6; because the tilt of the face affects the localization result, the model additionally learns the coefficients of a face correction matrix; the input face image is the corrected image, which further reduces the fitting difficulty of the model, and after the model has been fitted, the output of the fine localization stage is obtained by the inverse correction operation;

Step 8: apply inverse scaling to the fine localization prediction obtained in Step 6 to obtain the final result of the invention, as shown in Formula four;

where the formula takes the predicted fine localization result as input, eps denotes a very small positive number that prevents division by zero, and S_o,i,j denotes the final prediction result of the algorithm.

2. The method according to claim 1, characterized in that: the occlusion detection model in Step 4 is built on a general-purpose compact convolutional neural network model in order to speed up model computation; the model is a multitask regression model, which makes full use of the model's ability to extract features.

3. The method according to claim 1, characterized in that: the scaling and inverse scaling method in Step 5 and Step 7 avoids the local minimum problem caused by occlusion that is encountered in facial feature localization algorithms, thereby further improving the prediction accuracy of the model at occluded positions.