CN105224935A - A real-time face keypoint localization method based on the Android platform - Google Patents

A real-time face keypoint localization method based on the Android platform

Info

Publication number
CN105224935A
CN105224935A CN201510713055.1A CN201510713055A CN105224935A
Authority
CN
China
Prior art keywords
key point
face
training
sample
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510713055.1A
Other languages
Chinese (zh)
Other versions
CN105224935B (en)
Inventor
刘青山
王东
杨静
邓健康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology filed Critical Nanjing University of Information Science and Technology
Priority to CN201510713055.1A priority Critical patent/CN105224935B/en
Publication of CN105224935A publication Critical patent/CN105224935A/en
Application granted granted Critical
Publication of CN105224935B publication Critical patent/CN105224935B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a real-time face keypoint localization method based on the Android platform, belonging to the technical field of computer vision. The method comprises the following steps: collect a face training image set and annotate the keypoints; randomly select N samples from the training set as the initial shapes of each training sample; compute the normalized target for each training sample; extract shape-indexed features at each keypoint; select a suitable subset of features by correlation analysis; adopt a two-layer boosted regression structure (an outer layer and an inner layer); learn the regressor of each stage; at test time, estimate the face window with a face detection method and predict the keypoint positions with the trained regression model. Existing methods have high computational complexity and run slowly on mobile platforms; they are also sensitive to noise and localize with low precision. The present invention constrains the shape with linear combinations of training samples and applies a regression-based method to improve both the precision and the efficiency of face keypoint localization.

Description

A real-time face keypoint localization method based on the Android platform
Technical field
The present invention relates to the technical field of computer vision, and in particular to a face keypoint localization method.
Background technology
Face detection and keypoint localization, as core technologies of computer vision research, are now widely used in intelligent surveillance, identity recognition, expression analysis, and related applications. Face keypoint localization means accurately locating specific facial organs such as the eyes, mouth, and nose in a face image and obtaining their geometric parameters, thereby providing accurate information for research such as expression analysis or face recognition.
At present, face keypoint localization is well realized on desktop computers, with high real-time performance and accuracy. Representative work includes the Active Shape Model (ASM), the Bayesian Tangent Shape Model (BTSM), the Active Appearance Model (AAM), and the Constrained Local Model (CLM). However, existing algorithms have high computational complexity and are rarely applied on mobile platforms, where computing power and storage capacity are limited.
Nowadays there are hundreds of millions of smart-terminal (mainly mobile phone and tablet) users in China, and smart terminals are gradually becoming people's information carriers. Mobile-platform performance has also improved significantly, which makes real-time facial landmark localization and tracking on mobile platforms possible. However, most existing facial landmark localization and tracking algorithms have high computational complexity, large memory consumption, and slow processing speed, and are therefore difficult to port directly to mobile platforms.
Summary of the invention
In order to solve these problems, the invention discloses a highly efficient, highly accurate "explicit shape regression" face keypoint localization method, which predicts the keypoints of the whole face by directly learning a vectorial regression function. The intrinsic shape constraint is naturally encoded into our cascaded learning framework and is applied from coarse to fine during testing. The present invention strives to improve the overall performance of the algorithm in several respects, in particular feature selection and model training: operating efficiency is significantly improved while localization accuracy is preserved, so that real-time detection and localization of facial landmarks on a mobile platform becomes feasible.
The present invention is divided into a training stage and a testing stage. The training stage mainly learns the regression model. Unlike most regression methods, this method does not use a shape model with preset parameters. Instead, the present invention predicts the keypoints of the whole face by directly learning a vectorial regression function and explicitly minimizing the localization error on the training set.
In order to realize efficient regression, we use simple pixel-difference features, i.e., the intensity difference of two pixels in the image. The computational cost of such a feature is very small: given a keypoint position and an offset, read the pixel value at that location, then compute the difference of two such pixels, which yields a shape-indexed feature. The method uses local coordinates rather than a global coordinate system, which greatly enhances the robustness of the features.
In order to keep the pixel-difference features geometrically invariant, pixels are indexed by the currently estimated shape. A pixel is indexed by its local coordinates Δ^l = (Δx^l, Δy^l), where l is a normalized keypoint on the face. Such an indexing scheme stays invariant to changes of scale, rotation, and the like, making the algorithm more robust. It also helps us gather more useful features around relatively static points (for example, a point at the eye center is darker than the nose tip, while the centers of the two eyes look similar to each other). In the actual implementation, we transform the local coordinates back into the global coordinate system of the original image to obtain the shape-indexed pixels, and then compute the pixel-difference features; this makes the test phase run faster. If the estimated shape of a sample is S, the position of the l-th keypoint is obtained as π_l ∘ S, where the operator π_l extracts the x, y coordinates of the l-th keypoint from the shape vector. With Δ^l as the local coordinates, the corresponding global coordinates on the original image are π_l ∘ S + Δ^l. The local offsets Δ^l are identical for all samples, but the global coordinates used to read the underlying pixels adjust accordingly, which guarantees geometric invariance.
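The shape-indexed pixel-difference features described above can be sketched in NumPy as follows. This is an illustrative sketch, not the patented implementation: the function and variable names are ours, and the similarity transform that a full implementation would apply to the local offsets (so they follow the face's scale and rotation) is omitted for brevity.

```python
import numpy as np

def shape_indexed_pixels(image, shape, offsets, landmark_ids):
    """Read the P shape-indexed pixels: each offset is a local coordinate
    attached to one landmark of the current shape estimate.

    image        -- H x W grayscale array
    shape        -- (L, 2) current landmark estimate, (x, y) per landmark
    offsets      -- (P, 2) local offsets (dx, dy), identical for all samples
    landmark_ids -- (P,) index of the landmark each offset is attached to
    """
    h, w = image.shape
    # Local coordinates mapped back to the global frame of the original image.
    pts = shape[landmark_ids] + offsets
    xs = np.clip(np.round(pts[:, 0]).astype(int), 0, w - 1)
    ys = np.clip(np.round(pts[:, 1]).astype(int), 0, h - 1)
    return image[ys, xs].astype(float)

def pixel_diff_features(values):
    """All P x P pairwise intensity differences, the shape-indexed features."""
    return values[:, None] - values[None, :]
```

Because the offsets live in the landmarks' local frames, the same offsets index comparable pixels across faces of different position and size.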
The present invention uses a two-level boosted regressor: the first level has 10 stages and the second level has 500 stages. In this structure, each node of the first level is a cascade of 500 weak regressors, i.e., one second-level regressor. Within a second-level regressor the features remain unchanged, whereas across first-level stages they change. At the first level, the output of each node is the input of the next node, and the features are always extracted at the keypoints estimated by the previous stage.
The present invention adopts a two-layer boosted regression model, i.e., a basic face keypoint localization framework (outer layer) cascaded with stage regressors R_t (inner layer). The inner-layer stage regressors are also called primitive regressors. The key difference between the outer-layer and inner-layer regressors is that the shape-indexed features are fixed within the inner-layer regression: the features correspond only to the previously estimated shape S_{t-1} and remain unchanged until the current primitive regressor has been learned. Keeping the shape-indexed features fixed inside the inner layer saves training time and makes the learned inner-layer regressors more stable. The regressor structure of the present invention can be described as T = 10, K = 500, i.e., 10 first-layer (outer) stages and 500 second-layer (inner) stages. Each node of the first layer is a cascade of 500 weak regressors, i.e., one second-layer regressor, and the input of each first-layer node is the output of the previous node. The second-layer regressors adopt the so-called fern structure: each fern is a combination of F = 5 features and thresholds, which divides the feature space into 2^F bins.
Primitive regressor: the present invention uses a fern to represent the primitive regressor. A fern is a combination of F features and thresholds that divides the feature space, and hence all training samples, into 2^F bins. Each bin b is associated with a regression output y_b. Let ŷ_i denote the regression target of training sample i; the output of bin b minimizes the mean squared distance to the regression targets of all training samples falling into that bin:

y_b = argmin_y Σ_{i ∈ Ω_b} || ŷ_i − y ||²,

where Ω_b is the set of samples in bin b. The optimal solution is the average of the regression targets in the bin:

y_b = (1 / |Ω_b|) Σ_{i ∈ Ω_b} ŷ_i.

In order to prevent over-fitting, a shrinkage factor β is introduced during training:

y_b = (1 / (1 + β / |Ω_b|)) · (1 / |Ω_b|) Σ_{i ∈ Ω_b} ŷ_i.

When the number of samples in a bin is large enough, β has little effect; otherwise β reduces the magnitude of the estimate.
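The fern primitive regressor with its shrunken bin means can be sketched as follows. Class and parameter names are ours, and the default β is only a placeholder, since the exact value depends on the feature scale.

```python
import numpy as np

class Fern:
    """A depth-F fern: F (feature, threshold) pairs split the samples into
    2^F bins; each bin stores a shrunken mean of the targets that fall in it."""

    def __init__(self, feat_ids, thresholds, beta=1000.0):
        self.feat_ids = np.asarray(feat_ids)
        self.thresholds = np.asarray(thresholds, dtype=float)
        self.beta = beta          # shrinkage factor against over-fitting
        self.outputs = None

    def _bins(self, X):
        # F threshold tests -> F bits -> integer bin index in [0, 2^F)
        bits = (X[:, self.feat_ids] > self.thresholds).astype(int)
        return bits @ (1 << np.arange(len(self.feat_ids)))

    def fit(self, X, Y):
        F = len(self.feat_ids)
        self.outputs = np.zeros((2 ** F, Y.shape[1]))
        bins = self._bins(X)
        for b in range(2 ** F):
            members = Y[bins == b]
            if len(members):
                # y_b = mean of bin targets, shrunk by 1 / (1 + beta/|bin|)
                self.outputs[b] = members.mean(axis=0) / (1.0 + self.beta / len(members))
        return self

    def predict(self, X):
        return self.outputs[self._bins(X)]
```

With β = 0 the fern outputs plain bin means; a positive β damps bins that contain only a few samples, exactly as in the shrinkage formula above.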
The complete operating procedure of the patented method is as follows:
Step 1, collect a set of training pictures, and annotate keypoints on the training face pictures;
Step 2, compute the normalized target of each training sample, where M ∘ S denotes the normalized sample shape;
Step 3, build the training sample set, in which each sample contains a training image, a ground-truth shape, and an initial shape; randomly select N samples from the training set as the initial shapes of each training sample;
Step 4, extract shape-indexed features at the position of each keypoint;
Step 5, select a suitable subset of features by correlation analysis as the final features;
Step 6, set up the regression structure: adopt a two-level boosted regression structure as the basic face alignment framework (outer layer), and place internal stage regressors R_t (inner layer) within each outer-layer stage;
Step 7, learn the regressor R_t of each stage;
Step 8, estimate the face window with a face detection method, and predict the face keypoint positions with the trained regression model.
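The test phase of Step 8 can be sketched as follows. This is a toy illustration with hypothetical names (ToyStage, place_mean_shape): in practice the face window would come from a face detector on the device, and each stage would extract shape-indexed features as described above instead of returning a fixed increment.

```python
import numpy as np

class ToyStage:
    """Stand-in for one trained stage regressor R_t (structure illustrative)."""
    def __init__(self, delta):
        self.delta = delta
    def extract_features(self, image, shape):
        return None  # a real stage computes shape-indexed pixel differences here
    def predict(self, feats):
        return self.delta

def place_mean_shape(mean_shape, box):
    """Scale and translate a mean shape given in [0, 1]^2 into the face window."""
    x, y, w, h = box
    return mean_shape * np.array([w, h]) + np.array([x, y])

def predict_keypoints(image, face_box, mean_shape, stages):
    """Test phase: seed with the mean shape placed in the detected window,
    then let each stage add its predicted shape increment."""
    shape = place_mean_shape(mean_shape, face_box)
    for stage in stages:
        feats = stage.extract_features(image, shape)
        shape = shape + stage.predict(feats)
    return shape
```

The coarse-to-fine behaviour comes from the loop: early stages move the shape a lot, later stages only refine it.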
Accompanying drawing explanation
Fig. 1 is a schematic flowchart of the real-time face keypoint localization method based on the Android platform according to the present invention.
Embodiment
The technical scheme of the present invention is described in detail below with reference to the accompanying drawing. The invention discloses a real-time face keypoint localization method based on the Android platform.
As shown in Fig. 1, the present invention can be divided into a training (learning) stage and a testing (localization) stage; specifically, it comprises the following steps:
Step 1, annotation and normalization of the training-set feature points: the facial calibration points used in the present invention are chosen according to the MPEG-4 standard. Because the 84 feature points defined by the Facial Definition Parameters (FDP) are too numerous, we select 68 of them as required, and keep only eight points on the eye contours (four points per eye: top, bottom, left, and right) to improve running efficiency. The training database is taken from MUCT plus self-annotated face images, about 4,000 face images in total; mirroring these images yields a final training set of 8,000 face images.
Step 2, normalize the sample shapes: the pose of a face shape in an image is described by scale, rotation angle, and position parameters, so all shapes must be brought into agreement as far as possible by suitably transforming these pose parameters (size, angle, and position). For example, a scale transform fixes the distance between calibration points, a rotation transform fixes the direction of the line between calibration points, and a translation fixes the centroid position. We first compute the mean face shape S̄ of all training samples, then minimize the L2 distance between each input sample S and S̄; M ∘ S is the normalized shape, where M is the resulting similarity transform.
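The normalization of Step 2 can be sketched as an iterative Procrustes-style alignment. The patent does not spell out the alignment algorithm, so the SVD-based similarity fit below is an assumption (a standard choice); the names are ours.

```python
import numpy as np

def normalize_shapes(shapes, iters=3):
    """Align every shape to an evolving mean shape with a similarity transform
    (translation + rotation + scale), so all samples share one reference frame.

    shapes -- (N, L, 2) array of landmark sets.
    Returns the aligned shapes and the final mean shape.
    """
    aligned = np.asarray(shapes, dtype=float).copy()
    mean = aligned.mean(axis=0)
    for _ in range(iters):
        mean = mean - mean.mean(axis=0)       # pin the mean shape at the origin
        mean = mean / np.linalg.norm(mean)    # fix its scale
        for i in range(len(aligned)):
            s0 = aligned[i] - aligned[i].mean(axis=0)   # remove translation
            # Orthogonal Procrustes: rotation is the polar factor of s0^T mean
            u, sv, vt = np.linalg.svd(s0.T @ mean)
            r = u @ vt
            scale = sv.sum() / (s0 ** 2).sum()          # optimal scale
            aligned[i] = scale * (s0 @ r)
        mean = aligned.mean(axis=0)           # re-estimate the mean face
    return aligned, mean
```

After alignment, two shapes that differ only by position, rotation, and scale become (numerically) identical, which is exactly what the L2 minimization against the mean face requires.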
Step 3, build the training sample set: each sample contains a training image, a ground-truth shape, and an initial shape. Ten samples are randomly selected from the training set as the initial shapes of each training sample. Producing multiple initial shapes for one picture implicitly multiplies the number of training samples and also improves the generalization of training.
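The initial-shape augmentation of Step 3 can be sketched as follows; the function name and triplet layout are ours.

```python
import numpy as np

def augment_with_initial_shapes(n_samples, true_shapes, n_init=10, seed=0):
    """Pair each training sample with n_init ground-truth shapes drawn from
    OTHER samples as initial shapes, multiplying the effective training set.

    Returns (image index, ground-truth shape, initial shape) triplets.
    """
    rng = np.random.default_rng(seed)
    triplets = []
    for i in range(n_samples):
        others = [j for j in range(n_samples) if j != i]
        picks = rng.choice(others, size=min(n_init, len(others)), replace=False)
        for j in picks:
            triplets.append((i, true_shapes[i], true_shapes[j].copy()))
    return triplets
```

Starting the regression from several plausible shapes teaches the model to converge from varied starting points, which is what improves generalization at test time.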
Step 4, establish a local coordinate system Δ^l = (Δx^l, Δy^l) at specified points on the normalized face. The local offsets Δ^l are identical for all training samples, and pixels are indexed by the local coordinates of the keypoints; the subscript l identifies the landmark to which a pixel is attached. The local coordinates are then transformed back into the global coordinate system of the original image to extract pixel-difference features, where π_l extracts the x, y coordinates of the l-th landmark from the shape vector. Each weak regressor R_t computes its features from the image I and the previous shape estimate S_{t-1}. In each stage regressor R_t, P local coordinates, i.e., P shape-indexed pixels, are generated at random: each local coordinate is generated by first randomly selecting a landmark (the l_α-th landmark) and then drawing X- and Y-offsets from a uniform distribution. Together, the P pixels produce P^2 pixel-difference features.
Step 5, select a suitable subset of features by correlation analysis as the final features.
Generating P^2 features raises a new challenge: quickly selecting effective features in this huge feature space. "Suitable subset" means selecting far fewer than P^2 features, which keeps the computational complexity low and improves the efficiency of the algorithm. The present invention chooses F of the P^2 features to build a good fern regressor. We use the correlation between features and regression targets to select features, following two principles: each feature in the fern should be highly correlated with the regression target, and the features should have low correlation with one another, i.e., be complementary. Let Y denote the regression targets, a matrix with N rows (the number of samples) and N_fp columns (the number of landmark coordinates), and let X denote the pixel-difference features, an N × P^2 matrix. Our goal is to select, among the P^2 columns of X, the F columns most correlated with Y. Since Y is a matrix, we project it onto a column vector Y_proj using a vector υ drawn from a unit Gaussian: Y_proj = Y υ. The feature most correlated (by Pearson correlation) with the projected target can then be selected; repeating this F times with different projections yields the F suitable features. The correlation between the regression target and a pixel-difference feature ρ_m − ρ_n is computed as

corr(Y_proj, ρ_m − ρ_n) = (cov(Y_proj, ρ_m) − cov(Y_proj, ρ_n)) / sqrt(σ(Y_proj) σ(ρ_m − ρ_n)),

where σ(ρ_m − ρ_n) = cov(ρ_m, ρ_m) + cov(ρ_n, ρ_n) − 2 cov(ρ_m, ρ_n). The correlation thus consists of two parts: target-pixel covariances and pixel-pixel covariances. Because the shape-indexed features are fixed within the inner-layer cascade, the pixel-pixel covariances can be computed in advance and reused in every inner-layer stage. For each primitive regressor we then only need to compute the target-pixel covariances, whose number is linear in the number of pixel features. The complexity of the correlation computation therefore drops from O(NP^2) to O(NP).
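The correlation-based selection with precomputed pixel-pixel covariances can be sketched as follows. This is a sketch with our own names; for clarity, the scan over candidate pixel pairs is kept as an explicit double loop (it is this scan, not the covariance computation, that remains quadratic in P).

```python
import numpy as np

def select_features(Y, pixel_vals, F=5, seed=0):
    """Pick F pixel-difference features by correlation with random projections
    of the regression targets.

    Y          -- (N, D) regression targets
    pixel_vals -- (N, P) shape-indexed pixel intensities
    Pixel-pixel covariances are computed once and reused, so each projection
    only adds O(N*P) work for the target-pixel covariances.
    """
    rng = np.random.default_rng(seed)
    n, p = pixel_vals.shape
    pix_cov = np.cov(pixel_vals, rowvar=False)          # P x P, precomputed
    chosen = []
    for _ in range(F):
        y_proj = Y @ rng.standard_normal(Y.shape[1])    # project targets to 1-D
        # target-pixel covariances: O(N*P)
        tp_cov = ((pixel_vals - pixel_vals.mean(0))
                  * (y_proj - y_proj.mean())[:, None]).mean(0)
        best, best_corr = None, -np.inf
        for m in range(p):
            for k in range(p):
                if m == k:
                    continue
                var_diff = pix_cov[m, m] + pix_cov[k, k] - 2 * pix_cov[m, k]
                if var_diff <= 1e-12:
                    continue
                c = (tp_cov[m] - tp_cov[k]) / np.sqrt(var_diff * y_proj.var())
                if c > best_corr:
                    best_corr, best = c, (m, k)
        chosen.append(best)
    return chosen
```

Using a fresh random projection per selected feature is what keeps the F chosen features from all capturing the same direction of the target.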
Step 6, set up the regression structure: adopt a two-level boosted regression structure as the basic face alignment framework (outer layer), and place internal stage regressors R_t (inner layer) within each outer-layer stage. The weak learners of the inner layer are called primitive regressors. The inner and outer regressors are similar in form, but they differ in that the shape-indexed features remain unchanged across all regression stages of the inner layer.
Step 7, learn the inner cascade: the inner cascade comprises K primitive regressors {r_1, ..., r_K}, i.e., ferns. These primitive regressors greedily fit the regression targets in sequence: each primitive regressor handles the residual left over by the previous regressors, and in each iteration this residual serves as the target for learning the new regressor. In this way the regressor R_t of each stage is computed.
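The greedy residual fitting of the inner cascade can be sketched as follows. A trivial one-feature stump stands in for the fern primitive so the example stays self-contained; all names are ours.

```python
import numpy as np

class Stump:
    """One-feature threshold regressor standing in for a fern primitive."""
    def __init__(self, feat, thr):
        self.feat, self.thr = feat, thr
    def fit(self, X, Y):
        mask = X[:, self.feat] > self.thr
        self.hi = Y[mask].mean(axis=0) if mask.any() else np.zeros(Y.shape[1])
        self.lo = Y[~mask].mean(axis=0) if (~mask).any() else np.zeros(Y.shape[1])
        return self
    def predict(self, X):
        mask = X[:, self.feat] > self.thr
        return np.where(mask[:, None], self.hi, self.lo)

def fit_inner_cascade(X, targets, make_regressor, K=50):
    """Greedily learn K primitive regressors; regressor k is trained on the
    residual left over by regressors 1..k-1 (the residual IS its target)."""
    residual = targets.astype(float).copy()
    cascade = []
    for _ in range(K):
        r = make_regressor().fit(X, residual)
        residual = residual - r.predict(X)   # pass the remainder on
        cascade.append(r)
    return cascade

def apply_inner_cascade(cascade, X):
    out = None
    for r in cascade:
        p = r.predict(X)
        out = p if out is None else out + p
    return out
```

Because every primitive minimizes squared error on the current residual, the training error is non-increasing along the cascade.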
Step 8, estimate the face window with a face detection method, and predict the face keypoint positions with the trained regression model.

Claims (1)

1. A real-time face keypoint localization method based on the Android platform, characterized by constructing a non-parametric shape regression and predicting the keypoints of the whole face by directly learning a vectorial regression function, comprising the following steps:
Step 1, collect a set of training pictures, and annotate keypoints on the training face pictures;
Step 2, compute the normalized target of each training sample, i.e., the normalized sample shape;
Step 3, build the training sample set, in which each sample contains a training image, a ground-truth shape, and an initial shape; randomly select N samples from the training set as the initial shapes of each training sample;
Step 4, extract shape-indexed features at the position of each keypoint;
Step 5, select a suitable subset of features by correlation analysis as the final features;
Step 6, set up the regression structure: adopt a two-level boosted regression structure as the basic face alignment framework (outer layer), and place internal stage regressors R_t (inner layer) within each outer-layer stage;
Step 7, learn the regressor R_t of each stage;
Step 8, estimate the face window with a face detection method, and predict the face keypoint positions with the trained regression model.
CN201510713055.1A 2015-10-28 2015-10-28 A real-time face keypoint localization method based on the Android platform Active CN105224935B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510713055.1A CN105224935B (en) 2015-10-28 2015-10-28 A real-time face keypoint localization method based on the Android platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510713055.1A CN105224935B (en) 2015-10-28 2015-10-28 A real-time face keypoint localization method based on the Android platform

Publications (2)

Publication Number Publication Date
CN105224935A true CN105224935A (en) 2016-01-06
CN105224935B CN105224935B (en) 2018-08-24

Family

ID=54993895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510713055.1A Active CN105224935B (en) 2015-10-28 2015-10-28 A real-time face keypoint localization method based on the Android platform

Country Status (1)

Country Link
CN (1) CN105224935B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105469081A (en) * 2016-01-15 2016-04-06 成都品果科技有限公司 Face key point positioning method and system used for beautifying
CN106096560A (en) * 2016-06-15 2016-11-09 广州尚云在线科技有限公司 A kind of face alignment method
CN106960203A (en) * 2017-04-28 2017-07-18 北京搜狐新媒体信息技术有限公司 A kind of facial feature tracking method and system
CN107463865A (en) * 2016-06-02 2017-12-12 北京陌上花科技有限公司 Face datection model training method, method for detecting human face and device
CN109635752A (en) * 2018-12-12 2019-04-16 腾讯科技(深圳)有限公司 Localization method, face image processing process and the relevant apparatus of face key point
CN109800635A (en) * 2018-12-11 2019-05-24 天津大学 A kind of limited local facial critical point detection and tracking based on optical flow method
CN110781765A (en) * 2019-09-30 2020-02-11 腾讯科技(深圳)有限公司 Human body posture recognition method, device, equipment and storage medium
CN111667403A (en) * 2020-07-02 2020-09-15 北京爱笔科技有限公司 Method and device for generating face image with shielding
WO2020248789A1 (en) * 2019-06-11 2020-12-17 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method and system for facial landmark detection using facial component-specific local refinement
CN112115845A (en) * 2020-09-15 2020-12-22 中山大学 Active shape model parameterization method for face key point detection

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8837839B1 (en) * 2010-11-03 2014-09-16 Hrl Laboratories, Llc Method for recognition and pose estimation of multiple occurrences of multiple objects in visual images
CN103824049A (en) * 2014-02-17 2014-05-28 北京旷视科技有限公司 Cascaded neural network-based face key point detection method
CN104063700B (en) * 2014-07-04 2017-08-18 武汉工程大学 The method of eye center point location in natural lighting front face image
CN104899575A (en) * 2015-06-19 2015-09-09 南京大学 Human body assembly dividing method based on face detection and key point positioning

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105469081B (en) * 2016-01-15 2019-03-22 成都品果科技有限公司 A kind of face key independent positioning method and system for U.S. face
CN105469081A (en) * 2016-01-15 2016-04-06 成都品果科技有限公司 Face key point positioning method and system used for beautifying
CN107463865B (en) * 2016-06-02 2020-11-13 北京陌上花科技有限公司 Face detection model training method, face detection method and device
CN107463865A (en) * 2016-06-02 2017-12-12 北京陌上花科技有限公司 Face datection model training method, method for detecting human face and device
CN106096560A (en) * 2016-06-15 2016-11-09 广州尚云在线科技有限公司 A kind of face alignment method
CN106960203A (en) * 2017-04-28 2017-07-18 北京搜狐新媒体信息技术有限公司 A kind of facial feature tracking method and system
CN109800635A (en) * 2018-12-11 2019-05-24 天津大学 A kind of limited local facial critical point detection and tracking based on optical flow method
CN109635752A (en) * 2018-12-12 2019-04-16 腾讯科技(深圳)有限公司 Localization method, face image processing process and the relevant apparatus of face key point
WO2020248789A1 (en) * 2019-06-11 2020-12-17 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method and system for facial landmark detection using facial component-specific local refinement
US20220092294A1 (en) * 2019-06-11 2022-03-24 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method and system for facial landmark detection using facial component-specific local refinement
CN110781765A (en) * 2019-09-30 2020-02-11 腾讯科技(深圳)有限公司 Human body posture recognition method, device, equipment and storage medium
CN110781765B (en) * 2019-09-30 2024-02-09 腾讯科技(深圳)有限公司 Human body posture recognition method, device, equipment and storage medium
CN111667403A (en) * 2020-07-02 2020-09-15 北京爱笔科技有限公司 Method and device for generating face image with shielding
CN112115845A (en) * 2020-09-15 2020-12-22 中山大学 Active shape model parameterization method for face key point detection
CN112115845B (en) * 2020-09-15 2023-12-29 中山大学 Active shape model parameterization method for face key point detection

Also Published As

Publication number Publication date
CN105224935B (en) 2018-08-24

Similar Documents

Publication Publication Date Title
CN105224935A A real-time face keypoint localization method based on the Android platform
CN111795704B (en) Method and device for constructing visual point cloud map
CN111780763B (en) Visual positioning method and device based on visual map
Yousefhussien et al. A multi-scale fully convolutional network for semantic labeling of 3D point clouds
CN112862874B (en) Point cloud data matching method and device, electronic equipment and computer storage medium
US20150006117A1 (en) Learning Synthetic Models for Roof Style Classification Using Point Clouds
Cui et al. 3D semantic map construction using improved ORB-SLAM2 for mobile robot in edge computing environment
CN112949647B (en) Three-dimensional scene description method and device, electronic equipment and storage medium
Ding et al. Vehicle pose and shape estimation through multiple monocular vision
Alidoost et al. Knowledge based 3D building model recognition using convolutional neural networks from LiDAR and aerial imageries
CN109063549A (en) High-resolution based on deep neural network is taken photo by plane video moving object detection method
CN104463962B (en) Three-dimensional scene reconstruction method based on GPS information video
JP2019185787A (en) Remote determination of containers in geographical region
CN115457492A (en) Target detection method and device, computer equipment and storage medium
Li et al. An aerial image segmentation approach based on enhanced multi-scale convolutional neural network
Zhuang et al. Instance segmentation based 6D pose estimation of industrial objects using point clouds for robotic bin-picking
Gao et al. Road extraction using a dual attention dilated-linknet based on satellite images and floating vehicle trajectory data
Liu et al. A new multi-channel deep convolutional neural network for semantic segmentation of remote sensing image
Li et al. GeoImageNet: a multi-source natural feature benchmark dataset for GeoAI and supervised machine learning
CN111738074A (en) Pedestrian attribute identification method, system and device based on weak supervised learning
Firouznia et al. Chaotic particle filter for visual object tracking
CN117132737B (en) Three-dimensional building model construction method, system and equipment
CN117132649A (en) Ship video positioning method and device for artificial intelligent Beidou satellite navigation fusion
Yudin et al. Hpointloc: point-based indoor place recognition using synthetic RGB-D images
Sayed et al. Point clouds reduction model based on 3D feature extraction

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: 210000 No. 219 Ning six road, Jiangbei new district, Nanjing, Jiangsu

Patentee after: NANJING University OF INFORMATION SCIENCE & TECHNOLOGY

Address before: 210000 No. 69 Olympic Sports street, Jianye District, Jiangsu, Nanjing

Patentee before: NANJING University OF INFORMATION SCIENCE & TECHNOLOGY