WO2008151471A1 - A robust precise eye positioning method in complicated background image - Google Patents

A robust precise eye positioning method in complicated background image

Info

Publication number
WO2008151471A1
WO2008151471A1 (PCT/CN2007/001894)
Authority
WO
WIPO (PCT)
Prior art keywords
eye
image
sample
training
classifier
Prior art date
Application number
PCT/CN2007/001894
Other languages
French (fr)
Chinese (zh)
Inventor
Xiaoqing Ding
Yong Ma
Chi Fang
Changsong Liu
Liangrui Peng
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to PCT/CN2007/001894 priority Critical patent/WO2008151471A1/en
Publication of WO2008151471A1 publication Critical patent/WO2008151471A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/165Detection; Localisation; Normalisation using facial parts and geometric relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/19Sensors therefor

Definitions

  • The present invention relates to the field of face recognition technology, and in particular to a robust method for precisely locating eyes in complex background images. Background art
  • Face detection determines the position and size of faces in an image or image sequence. It is widely used in systems such as face recognition, video surveillance, and intelligent human-machine interfaces. Face detection, especially against complex backgrounds, remains a difficult problem: intrinsic factors such as appearance, skin color, expression, and motion in three-dimensional space, together with external factors such as beards, hair, glasses, and illumination, cause large within-class variation of the face pattern, and because background objects are very complex, they are hard to separate from faces.
  • The mainstream approach to face detection is detection based on statistical learning from samples.
  • Such methods generally introduce a "non-face" category.
  • By statistically learning from collected samples, the parameters that distinguish the "face" category from the "non-face" category are estimated, rather than surface rules based on visual impressions.
  • This is statistically more reliable: it avoids errors caused by incomplete or inaccurate observation, and the detector's coverage and robustness can be improved by enlarging the training set.
  • Using a simple-to-complex multi-layer classifier structure, most background windows are first excluded by simple classifiers, and the remaining windows are further judged by complex classifiers, achieving fast detection.
  • However, such methods ignore the fact that the risks of misclassification between face and non-face are highly unbalanced in real images: the prior probability of a face appearing in an image is far lower than that of a non-face, and since the main purpose of face detection is to find face positions, the risk of misclassifying a face as a non-face is much greater than that of misclassifying a non-face as a face; training each layer only with the minimum-classification-error criterion cannot address this imbalance.
  • To address these defects, the present invention proposes a face detection method based on the cost-sensitive AdaBoost algorithm, which trains each layer under the principle of minimizing classification risk.
  • CS-AdaBoost: Cost Sensitive AdaBoost
  • Each trained layer classifier guarantees a very low rejection rate for the face pattern while reducing the false acceptance rate for the non-face class as much as possible, thereby achieving higher-performance face detection in complex background images with fewer classifier layers and simpler classifier structures; this approach has not been used in the prior literature.
  • The object of the present invention is to realize a detector capable of robustly locating faces in complex background images; the detection method comprises two stages, training and detection.
  • The robust precise eye positioning method in complex background images likewise comprises two stages, training and detection.
  • In the training stage, a large number of samples are first collected: training samples are gathered from face images by manual calibration, and the samples are then normalized.
  • Feature extraction is performed on the collected training samples to obtain their feature library.
  • Based on the feature library, the classifier parameters are determined experimentally, and the eye positioning classifiers are trained.
  • In the detection stage, for an input face image region I(x, y), 0 ≤ x < W_face, 0 ≤ y < H_face, the regions where the left and right eyes may exist are first estimated. All small windows within the two regions are then examined exhaustively (a small window is defined as a rectangular sub-image of the input image): features are extracted from each small window and judged with the monocular detector, yielding all eye candidate positions in the region. The left and right eye candidates are then combined, global properties are used to select the best combination, and the final eye positions are located, giving excellent eye positioning accuracy.
  • The training stage comprises the following steps in order: sample collection and normalization, estimation of the left- and right-eye regions using projection functions, training of the monocular detector, and training of the eye-pair detector.
  • By manual calibration, single-eye images are cut from face images and non-eye samples are randomly cut from the non-eye parts of the face images; the single-eye and non-eye images serve as positive and negative samples, respectively, for training the monocular detector.
  • Eye-pair samples are additionally cropped from face images at a set ratio, and non-eye-pair samples are randomly cut from the face images.
  • The eye-pair and non-eye-pair images serve as positive and negative samples for training the eye-pair detector; samples collected this way include not only the two eyes but also the eyebrows, nose, and other parts, embodying the constraint relationship between the eyes and the surrounding organs.
  • The eye-pair samples are cropped from the face image in the following proportions: the line connecting the two eyeball centers is taken as the X axis, and the perpendicular to it through the midpoint between the eyes is taken as the Y axis, with the foot of the perpendicular on that center line.
  • With the distance between the two eyeball centers denoted dist, the horizontal distances from the eyeball centers to the left and right outer borders, and the distances from the top and bottom borders to the foot of the perpendicular, are set as fixed fractions of dist.
  • ⁇ ( ⁇ , ⁇ ) ⁇ (0( ⁇ , ⁇ )- ⁇ ) + ⁇ ( thus adjusting the mean and variance of the grayscale of the image to the given value and ⁇ ., completes the grayscale normalization of the sample;
  • The monocular detector is trained on the microstructure feature libraries of the normalized single-eye and non-eye samples, using the AdaBoost algorithm; the specific training process is as follows:
  • Class (a): the black region and white region are left-right symmetric with equal area; w denotes the width of each region and h its height.
  • Class (d): the two black regions lie in the first and third quadrants and the two white regions in the second and fourth quadrants.
  • Each black region and each white region has equal area, and w and h are defined as in (a).
  • Each microstructure feature is obtained by computing the difference between the grayscale sums of the pixels in the black and white regions of the image covered by the template; both the template's position relative to the image and its size can vary. Since each feature extraction involves only sums of pixels over rectangular regions, the integral image of the whole image makes it possible to obtain a microstructure feature of any scale and position quickly;
  • (b) g(x,y,w,h) = 2·II(x+w−1, y+h−1) + II(x−1, y−1) − II(x+w−1, y−1) − 2·II(x−1, y+h−1) − II(x+w−1, y+2h−1) + II(x−1, y+2h−1)
  • (d) g(x,y,w,h) = 2·II(x+w−1, y−1) + 2·II(x−1, y+h−1) + 2·II(x+2w−1, y+h−1) + 2·II(x+w−1, y+2h−1) − II(x−1, y−1) − II(x+2w−1, y−1) − II(x−1, y+2h−1) − II(x+2w−1, y+2h−1) − 4·II(x+w−1, y+h−1)
  • (e) g(x,y,w,h) = II(x+w−1, y+h−1) + II(x−1, y−1) − II(x+w−1, y−1) − II(x−1, y+h−1) − 2·[II(x+w−3, y+h−3) + II(x+1, y+1) − II(x+w−3, y+1) − II(x+1, y+h−3)]
  • Varying the values of x, y, w, and h extracts microstructure features at different positions of the sample image; for an eye/non-eye sample image normalized to 24×12 pixels, 42727 features can be obtained, composing the sample image's feature vector FV(j), 1 ≤ j ≤ 42727.
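  • A sketch of how such rectangle features follow from an integral image (NumPy; the class (a) template above is used as the example, and the row/column orientation is an illustrative choice):

```python
import numpy as np

def integral_image(img: np.ndarray) -> np.ndarray:
    """II[r, c] = sum of img over all pixels with row <= r and col <= c."""
    return img.astype(np.int64).cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii: np.ndarray, r: int, c: int, h: int, w: int) -> int:
    """Pixel sum of the h-by-w rectangle with top-left corner (r, c), via 3 add/subs."""
    total = ii[r + h - 1, c + w - 1]
    if r > 0:
        total -= ii[r - 1, c + w - 1]
    if c > 0:
        total -= ii[r + h - 1, c - 1]
    if r > 0 and c > 0:
        total += ii[r - 1, c - 1]
    return int(total)

def feature_a(ii: np.ndarray, r: int, c: int, h: int, w: int) -> int:
    """Class (a): two side-by-side h-by-w rectangles; black sum minus white sum."""
    return rect_sum(ii, r, c, h, w) - rect_sum(ii, r, c + w, h, w)
```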
  • The AdaBoost algorithm selects the best-performing weak classifier based on a single feature in each round of iteration, achieving feature selection; on the other hand, these weak classifiers are integrated into a strong classifier, and a complete eye detector is obtained by cascading multiple strong classifiers. Specifically, it includes the following components:
  • The simplest tree classifier is constructed for each feature dimension as a weak classifier:
  • h_j(sub) = 1 if g_j(sub) < θ_j (or g_j(sub) > θ_j), 0 otherwise,
  • where sub is a 24×12-pixel sample, g_j(sub) denotes the j-th feature extracted from the sample, and θ_j is the decision threshold corresponding to the j-th feature, obtained by collecting statistics of the j-th feature over all gathered eye and non-eye samples such that the FRR on eye samples meets the specified requirement; h_j(sub) denotes the decision output of the tree classifier constructed with the j-th feature.
  • Each weak classifier therefore completes its decision with a single threshold comparison; 42727 weak classifiers are obtained in total. (ii) Eye/non-eye strong classifier design based on the AdaBoost algorithm
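  • A sketch of the single-threshold weak classifier just described, over precomputed features (the quantile rule used to hit a target FRR on eye samples is one plausible reading of the text, not the patent's exact procedure):

```python
import numpy as np

class Stump:
    """Single-feature tree classifier: decide with one threshold comparison."""

    def __init__(self, j: int, theta: float, polarity: int = 1):
        self.j = j                 # index of the feature this stump looks at
        self.theta = theta         # decision threshold
        self.polarity = polarity   # +1: classify as eye when feature >= theta

    def predict(self, features: np.ndarray) -> np.ndarray:
        """features: (n_samples, n_features) matrix; returns 0/1 decisions."""
        return (self.polarity * (features[:, self.j] - self.theta) >= 0).astype(int)

def fit_stump_for_frr(eye_feats: np.ndarray, j: int, max_frr: float = 0.001) -> Stump:
    """Place the threshold so that at most max_frr of eye samples fall below it."""
    vals = np.sort(eye_feats[:, j])
    k = int(np.floor(max_frr * len(vals)))   # eye samples we are allowed to reject
    return Stump(j=j, theta=float(vals[k]))
```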
  • Step 3 Training of the eye-pair classifier
  • The eye-pair classifier is trained on the normalized eye-pair and non-eye-pair samples: the feature libraries of the two sample classes are extracted, and the AdaBoost algorithm is used to train the eye-pair classifier.
  • The eye-pair classifier uses exactly the same microstructure features and training process as the monocular detector above.
  • The AdaBoost algorithm is used to select weak classifiers based on single features from the large pool of microstructure features to compose strong classifiers, and the multi-layer strong classifiers are cascaded; the specific training process of the eye-pair classifier likewise includes feature extraction, feature selection, strong classifier training, and cascading of multi-layer strong classifiers:
  • The normalized eye-pair and non-eye-pair samples are used to extract high-dimensional microstructure features by the feature extraction method described in step 2.1 above.
  • For samples normalized to 25×15 pixels, 71210 features are obtained, composing the sample's feature vector FV(j), 1 ≤ j ≤ 71210;
  • with the sample occupying the coordinate region (x0 ≤ x' ≤ x0+24, y0 ≤ y' ≤ y0+14) in the whole image, the window mean μ and variance σ are computed from the integral images, and each feature dimension is normalized with them;
  • after normalization, a 25×15-pixel sample image yields the 71210-dimensional microstructure feature vector FV(j), 1 ≤ j ≤ 71210.
  • The eye-pair detector also adopts a layered structure.
  • Background windows in the image are first excluded by strong classifiers of simple structure, and the remaining windows are judged by strong classifiers of complex structure. Specifically, it includes the following components:
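  • A minimal sketch of evaluating such a cascade on one window (the stage layout and the use of the final-stage margin as a confidence are assumptions for illustration):

```python
from typing import Callable, List, Sequence, Tuple
import numpy as np

# A stage bundles its weak classifiers, their AdaBoost weights, and a threshold.
Stage = Tuple[Sequence[Callable[[np.ndarray], int]], Sequence[float], float]

def run_cascade(stages: List[Stage], feats: np.ndarray) -> Tuple[bool, float]:
    """Return (accepted, confidence) for one window's feature vector."""
    confidence = 0.0
    for weaks, alphas, threshold in stages:
        score = sum(a * h(feats) for a, h in zip(alphas, weaks))
        if score < threshold:            # rejected early by a cheap stage
            return False, 0.0
        confidence = score - threshold   # margin by which the stage was passed
    return True, confidence
```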
  • The detection stage determines the eye center positions in an input face region and comprises the following steps:
  • For an input face region, the eye centers are precisely located with the following steps. Step 1: estimate the regions Ω_lefteye and Ω_righteye where the left and right eyes are located;
  • The peak of the ratio of the mean function to the variance function of the vertical grayscale projection gives the vertical boundary between the regions of the left and right eyes; this position is defined as x_peak.
  • The upper and lower boundaries of Ω_lefteye and Ω_righteye are obtained from the statistical distribution of eye positions along the vertical direction of the face in the samples;
  • Ω_righteye = {(x, y) | x_peak ≤ x < W_face, 0.05·H_face ≤ y ≤ 0.45·H_face}, where H_face and W_face are the face height and width derived from sample statistics;
  • Step 2 Using a single-eye detector to detect eye candidates
  • The left- and right-eye candidates are paired, further features are extracted from the regions surrounding the candidates, each candidate pair is then verified with the eye-pair classifier, and finally, based on the posterior probabilities,
  • the optimal binocular position is estimated from all candidate pairs; specifically, each eye candidate pair undergoes the following processing steps -
  • the image is cropped around the candidate pair in the way eye-pair samples are cropped in step 1.1 of the training stage, then size-normalized and illumination-normalized to obtain a 25×15-pixel eye-pair image.
  • the verification steps for each eye candidate pair image are as follows:
  • Step 3(3)(iii): if the pair passes the current layer's judgment, increment i by 1 and return to step 3(3)(ii); otherwise, discard the eye candidate pair. If a pair passes the judgment of all layers of strong classifiers, it is considered a valid candidate pair, and its position and confidence are output;
  • the candidate pairs that pass the judgment are sorted by confidence in descending order, the average position of the three candidate pairs with the highest confidence is taken as the eye center positions, and the eye positions are output.
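  • A sketch of this final selection step (the candidate tuple layout is an illustrative assumption):

```python
from typing import List, Tuple
import numpy as np

# (left_x, left_y, right_x, right_y, confidence) for one verified candidate pair
Candidate = Tuple[float, float, float, float, float]

def final_eye_positions(pairs: List[Candidate]) -> Tuple[np.ndarray, np.ndarray]:
    """Average the three most confident verified candidate pairs."""
    if not pairs:
        raise ValueError("no verified candidate pairs")
    best = sorted(pairs, key=lambda p: p[4], reverse=True)[:3]
    coords = np.asarray([p[:4] for p in best], dtype=np.float64).mean(axis=0)
    return coords[:2], coords[2:]     # left-eye (x, y), right-eye (x, y)
```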
  • Step 2(2): the i-th layer eye/non-eye strong classifier is trained on the training sample set with the AdaBoost algorithm described in step 2(3)(ii);
  • Step 2: if the performance targets have not reached the predetermined values, increment i by 1 and return to step 2 to continue training; otherwise, stop training;
  • Step 3(3)(iii): the whole eye-pair verifier adopts a hierarchical structure, and training the cascade of multi-layer strong classifiers includes the following steps:
  • Initialize i = 1;
  • The training targets for each layer's strong classifier are FRR ≤ 0.1% on the eye-pair training set and FAR ≤ 50% on the non-eye-pair training set; the targets for the whole eye-pair detector are FRR ≤ 1% on the eye-pair training set and FAR ≤ 1×10⁻³ on the non-eye-pair training set;
  • Step (b): if FRR and FAR have not reached the predetermined values, increment i by 1 and return to step (b) to continue training; otherwise, stop training. Training yields 9 strong classifiers of simple-to-complex structure; cascading these strong classifiers forms the complete eye-pair detector. To verify the effectiveness of the invention, we performed the following experiments:
  • The test set used for the eye positioning algorithm includes the following three parts:
  • Test Set 1 consists of the Yale B, AeroInfo, and Ministry of Public Security face databases, totaling 4353 images of 209 people.
  • The Yale B database contains 165 images of 15 people and is characterized by complex illumination changes.
  • The AeroInfo database, provided by China Aerospace Information Co., Ltd., includes 3740 images of 165 people; it is characterized by complex changes in external illumination and face pose, complex backgrounds, and poor face image quality.
  • The Ministry of Public Security face database contains 448 images of 30 people and is characterized by complex lighting changes, with some subjects wearing glasses with strong reflections.
  • Test Set 2 consists of the English part of the BANCA face database, which includes 6540 images of 82 people in total. It is characterized by large variation in image background and quality, including images acquired under controlled, degraded, and adverse scenarios; illumination and face pose changes are also complex, and many subjects wear black-rimmed glasses;
  • Test Set 3 is the JAFFE database, which includes 213 face images and is characterized by rich facial expression changes. Tests performed on collections of such varied sources and conditions should truly reflect the performance of a positioning algorithm. Table 1 compares the performance of this and other positioning algorithms under different allowable errors.
  • The algorithm performs stably across the different test sets, exceeding the positioning accuracy of FaceIt, which in our experiments was sensitive to the opening and closing of the eyes and the size of the face. Compared with the method of Zhou et al. [Zhou ZH, Geng X. Projection functions for eye detection. Pattern Recognition, 2004], our method reaches 98.6% accuracy on the JAFFE database within an error of 0.10, while their method's positioning accuracy is only 97.2% even within an error of 0.25.
  • Figure 1 shows the hardware composition of a typical eye positioning system
  • Figure 2 The collection process of training samples
  • Figure 3 Example of a single eye sample and an eye pair sample
  • Figure 4 is a block diagram of the eye positioning system
  • Figure 5 Five microstructure feature templates used
  • Figure 8 cascading structure of multi-level strong classifier
  • Figure 9 The training process of a strong classifier based on the AdaBoost algorithm
  • Figure 10 is a schematic diagram of the eye to template ratio
  • Figure 11 A face recognition check-in system based on the algorithm

Detailed description
  • The hardware structure of the entire eye positioning system is shown in Fig. 1, in which:
  • 101 is a scanner
  • 102 is a camera
  • 103 is a computer.
  • The training process and recognition process of the system are shown in Figure 4. The following sections describe each part of the system in detail:
  • the input to the system is a single face area image.
  • the face detection portion is not included in the present invention and will not be described in detail.
  • a single eye image is cut out from the face image by a manual calibration method, and a non-eye sample is randomly cut from the non-eye portion of the face image.
  • the single eye image and the non-eye image are used as positive and negative samples, respectively, for training the monocular detector.
  • Some training samples are shown in Figure 3(a).
  • The eye-pair samples are obtained by cropping from the face image according to the scale shown in Fig. 7, and non-eye-pair samples are randomly cut from the face image.
  • The eye-pair and non-eye-pair images are used as positive and negative training samples, respectively, for training the eye-pair detector.
  • Some of the samples collected are shown in Figure 3(b).
  • the samples collected in this way include not only the two eyes but also the eyebrows, the nose and other parts, which embodies the constraint relationship between the eyes and the surrounding organs.
  • the collected sample images of each size are normalized to a specified size.
  • Let the original sample image be [F(x, y)]_{M×N}, with image width M and height N, and let the value of the pixel at row x and column y be F(x, y), 0 ≤ x < M, 0 ≤ y < N.
  • The present invention transforms the original sample image into a standard-size sample image [G(x, y)]_{W×H} using back projection and linear interpolation; the correspondence between the input image and the normalized image is x' = x·M/W, y' = y·N/H.
  • For a given (x', y'), the linear interpolation method sets x0 = ⌊x'⌋, y0 = ⌊y'⌋, Δx = x' − x0, Δy = y' − y0, and computes G(x, y) = F(x0, y0)(1−Δx)(1−Δy) + F(x0+1, y0)Δx(1−Δy) + F(x0, y0+1)(1−Δx)Δy + F(x0+1, y0+1)ΔxΔy.
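  • A sketch of this size normalization with bilinear interpolation (edge clamping is an added detail not spelled out in the text):

```python
import numpy as np

def resize_bilinear(src: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Back-project each output pixel into the source and interpolate bilinearly."""
    m, n = src.shape                      # source height (rows) and width (cols)
    out = np.empty((out_h, out_w), dtype=np.float64)
    for r in range(out_h):
        for c in range(out_w):
            y = r * m / out_h             # back-projected source coordinates
            x = c * n / out_w
            y0, x0 = int(y), int(x)
            y1, x1 = min(y0 + 1, m - 1), min(x0 + 1, n - 1)  # clamp at the border
            dy, dx = y - y0, x - x0
            out[r, c] = (src[y0, x0] * (1 - dx) * (1 - dy)
                         + src[y0, x1] * dx * (1 - dy)
                         + src[y1, x0] * (1 - dx) * dy
                         + src[y1, x1] * dx * dy)
    return out
```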
  • Training of a single eye detector uses a normalized single eye sample and a non-eye sample microstructure feature library, and a single eye detector is trained using the AdaBoost algorithm; the specific training process is as follows:
  • Because a microstructure feature of any scale and position in the image can be obtained quickly from the integral image of the whole image, real-time eye detection becomes possible.
  • The invention uses the five types of microstructure templates in FIG. 6 to extract high-dimensional microstructure features of the eye pattern; each feature is obtained by computing the difference between the grayscale sums of the pixels in the corresponding black and white regions of the image, expressing the characteristics of the eye pattern.
  • A rectangle's pixel sum can be computed quickly with three additions/subtractions on the integral image.
  • Any microstructure feature can thus be computed with a few additions and subtractions on the corresponding integral image.
  • the present invention uses the AdaBoost algorithm to select features and train classifiers.
  • The AdaBoost algorithm selects the best-performing weak classifier based on a single feature in each round of iteration, achieving feature selection; on the other hand, these weak classifiers are integrated into a strong classifier, and an excellent eye detector is obtained by cascading multiple strong classifiers.
  • it includes the following components:
  • Weak classifiers must classify very quickly so that the whole strong classifier can achieve a sufficiently high classification speed.
  • the present invention constructs the simplest tree classifier for each dimension feature as a weak classifier:
  • where sub is a 24×12-pixel sample
  • and g_j(sub) represents the j-th feature extracted from the sample
  • the present invention combines the AdaBoost algorithm with the weak classifier construction method described above for training eye/non-eye strong classifiers.
  • T is the number of weak classifiers to be used;
  • T should increase gradually with the number of strong classifier layers; see Table 1 for the specific values. The maximum Fmax(j) and minimum Fmin(j) of each feature's distribution over the sample set are computed (where j is the feature index, 1 ≤ j ≤ 42727);
  • the present invention takes the strong classifier's output as the confidence that the obtained pattern belongs to the eye class;
  • the training targets for each layer's strong classifier are FRR ≤ 0.1% on the eye training set and FAR ≤ 60% on the non-eye training set; the targets for the whole eye detector are FRR ≤ 1% on the eye training set and FAR ≤ 5×10⁻⁴ on the non-eye training set;
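  • A compact sketch of the discrete AdaBoost loop used to pick weak classifiers and their weights (this is the generic algorithm; the patent's cost-sensitive CS-AdaBoost variant would additionally reweight face/non-face errors asymmetrically):

```python
import numpy as np

def adaboost_train(decisions: np.ndarray, labels: np.ndarray, rounds: int):
    """Discrete AdaBoost over precomputed weak-classifier outputs.

    decisions: (n_weak, n_samples) 0/1 matrix, one row per candidate stump.
    labels:    (n_samples,) 0/1 ground truth (1 = eye).
    Returns indices of the chosen stumps and their alpha weights.
    """
    n_weak, n = decisions.shape
    w = np.full(n, 1.0 / n)                          # sample weights
    chosen, alphas = [], []
    for _ in range(rounds):
        errs = (decisions != labels).astype(float) @ w   # weighted error per stump
        j = int(np.argmin(errs))
        eps = min(max(errs[j], 1e-10), 1 - 1e-10)        # clip for numeric safety
        alpha = 0.5 * np.log((1 - eps) / eps)
        miss = decisions[j] != labels
        w *= np.exp(np.where(miss, alpha, -alpha))       # boost misclassified samples
        w /= w.sum()
        chosen.append(j)
        alphas.append(alpha)
    return chosen, np.asarray(alphas)
```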
  • The present invention pairs the left- and right-eye candidates, extracts further features from the regions surrounding the candidates, then verifies each candidate pair with the eye-pair classifier, and finally, according to the posterior probabilities,
  • estimates the optimal binocular position from all candidate pairs (as shown in Figures 4 and 5).
  • Training of the eye-pair classifier includes the following steps:
  • The microstructure templates in Figure 6 are used to extract high-dimensional microstructure features of both eye-pair and non-eye-pair samples. A microstructure feature of any scale and position can be obtained quickly using the integral image of the whole image, II(x, y) = Σ_{x'≤x} Σ_{y'≤y} I(x', y'). A squared integral image SqrII(x, y) = Σ_{x'≤x} Σ_{y'≤y} I(x', y')² is also defined, used to compute the variance of rectangular regions.
  • Any of the above microstructure features can be computed quickly with a few additions and subtractions on the integral images.
  • A total of 71210 features is obtained, composing the sample's feature vector FV(j), 1 ≤ j ≤ 71210.
  • The grayscale mean and variance must be normalized for each 25×15-pixel sample image: the mean μ and variance σ of the small window are computed first, and each feature dimension is then normalized, where the window occupies the 25×15-pixel region (x0 ≤ x' ≤ x0+24, y0 ≤ y' ≤ y0+14).
  • To achieve a fast enough verification speed, the eye-pair detector must adopt a layered structure (as shown in Figure 8).
  • Background windows in the image are first excluded by strong classifiers of simple structure, and strong classifiers of complex structure then judge the remaining windows.
  • This part still uses the AdaBoost algorithm to select features and train the classifiers, as shown in Figure 9. Specifically, it includes the following components: (1) weak classifier construction.
  • the weak classifier still uses a tree classifier constructed with one-dimensional features:
  • The CS-AdaBoost algorithm is combined with the weak classifier construction method described above for training eye-pair strong classifiers.
  • T is the number of weak classifiers to be used;
  • T should increase gradually with the number of strong classifier layers; see Table 2 for the specific values.
  • The present invention uses P(l | f(sub)) to obtain the posterior probability that the pattern belongs to the eye-pair class, where f(sub) is the strong classifier's output for the sample sub.
  • The whole eye-pair verifier has a hierarchical structure, as shown in Figure 8.
  • (a) Initialize i = 1;
  • (b) the training targets for each layer's strong classifier are FRR ≤ 0.1% on the eye-pair training set and FAR ≤ 50% on the non-eye-pair training set; the targets for the whole eye-pair detector are FRR ≤ 1% on the eye-pair training set and FAR ≤ 1×10⁻³ on the non-eye-pair training set; (c) the i-th layer strong classifier is trained on the training set;
  • (d) if FRR and FAR have not reached the predetermined values, set i = i + 1 and return to step (b) to continue training; otherwise, stop training.
  • The eye detection stage includes the following steps. 1. Estimate the regions Ω_lefteye and Ω_righteye where the left and right eyes are located: the mean and variance functions of the vertical grayscale projection of the face image determine the horizontal boundary between Ω_lefteye and Ω_righteye, and the upper and lower boundaries are then determined from the distribution of eye positions along the vertical direction of the face region, as statistically derived from the training samples, yielding Ω_lefteye and Ω_righteye.
  • The peak of the ratio of the mean function to the variance function of the vertical grayscale projection is taken as the vertical boundary between the left and right eyes, as shown in Fig. 5(b); the location of this peak is defined as x_peak.
  • The upper and lower boundaries of Ω_lefteye and Ω_righteye are obtained from the distribution of eye positions along the vertical direction in the face samples.
  • Monocular detectors are used to detect left- and right-eye candidate positions in the Ω_lefteye and Ω_righteye regions, respectively, and the confidence of each candidate position is estimated.
  • The specific detection process for eye candidates is as follows:
  • Step 3: if the window passes the judgment, set i = i + 1 and return to step 3; otherwise discard the small window. If it passes the judgment of all layers of strong classifiers, the small window is considered to contain an eye candidate, and its position and confidence are output; otherwise, the small window is discarded and no further processing is performed;
  • The present invention outputs at most the top 20 candidate positions, ranked by candidate confidence.
  • The present invention pairs the left- and right-eye candidates, extracts further features from the regions surrounding the candidates, then verifies each candidate pair with the eye-pair classifier, and finally, according to the posterior probabilities,
  • estimates the best binocular position from all candidate pairs.
  • The processing steps include the following:
  • The image is first cropped around the left- and right-eye candidate positions according to the template (Fig. 10), then size-normalized and illumination-normalized to obtain a 25×15-pixel eye candidate pair image.
  • Step 3: if the pair passes the judgment, set i = i + 1 and return to step 1; otherwise discard the eye candidate pair. If it passes the judgment of all layers of strong classifiers, the candidate pair is considered a valid candidate pair, and its position and confidence are output. Finally, the candidate pairs that pass the judgment are sorted by confidence in descending order, the average position of the three candidate pairs with the highest confidence is taken as the eye center position, and the eye positions are output.
  • The present invention employs a positioning error metric that is independent of face size. Since the distance between the eye centers of a frontal face generally does not change with expression and the like, and is relatively stable, the manually calibrated inter-eye center distance is used as the reference.
  • Let the manually calibrated left-eye, right-eye, and mouth positions be P_le, P_re, and P_m, respectively, and the automatically located left-eye, right-eye, and mouth positions be P_le', P_re', and P_m'; d_le is the Euclidean distance between P_le and P_le', d_re the Euclidean distance between P_re and P_re', and d_m the Euclidean distance between P_m and P_m'.
  • The eye positioning error is then defined as the larger of d_le and d_re, normalized by the manually calibrated inter-eye distance ||P_le − P_re||.
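  • A sketch of this normalized error (the max-of-both-eyes form is the standard reading of such relative error measures; treat it as an assumption):

```python
import numpy as np

def eye_positioning_error(p_le, p_re, p_le_auto, p_re_auto) -> float:
    """Worst per-eye localization error, relative to the true inter-eye distance."""
    p_le, p_re = np.asarray(p_le, float), np.asarray(p_re, float)
    d_le = np.linalg.norm(np.asarray(p_le_auto, float) - p_le)
    d_re = np.linalg.norm(np.asarray(p_re_auto, float) - p_re)
    return max(d_le, d_re) / np.linalg.norm(p_le - p_re)
```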
  • Embodiment 1 Face-based identification check-in system (as shown in Figure 11)
  • Face authentication is one of the most user-friendly methods in biometric authentication and has recently received wide attention. It is designed to use face images for automatic personal identification by computer, replacing traditional passwords, certificates, seals, and other authentication means; it is hard to forge, cannot be lost, and is convenient.
  • the system uses face information to automatically verify the identity of the person.
  • The face detection module used therein is a research result of the present work.
  • The system also participated in the FAT2004 competition organized at ICPR2004.
  • The competition included 13 face recognition algorithms from 11 academic and commercial institutions, including Carnegie Mellon University in the United States, the Neuroinformatik Institute in Germany, and Surrey University in the United Kingdom.
  • The system submitted by our laboratory won first place on all three evaluation indicators, with error rates about 50% lower than the second-place entry.
  • The results of this work are applied in the eye positioning module of the submitted system, helping keep the system's overall performance at an internationally advanced level.
  • The present invention can rapidly and accurately locate eyes in images with complex backgrounds, obtains excellent positioning results in experiments, and has very broad application prospects.
  • The above-described embodiments are only preferred embodiments of the present invention; common changes and substitutions made by those skilled in the art within the scope of the technical solutions of the present invention should be included in the protection scope of the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Ophthalmology & Optometry (AREA)
  • Geometry (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

A robust method for precise eye positioning in complex background images. It adopts efficient, highly redundant microstructure features to express the grayscale distribution characteristics of both local and global areas of the eye pattern, and adopts the AdaBoost algorithm to select the most discriminative microstructure features to form strong classifiers. It also jointly considers local characteristics and the global characteristics that express the constraint relationship between the eyes and surrounding organs, obtaining a more robust positioning result.

Description

Robust precise eye positioning method in complex background images

Technical field

The present invention relates to the field of face recognition technology, and in particular to a robust method for precisely locating eyes in complex background images.

Background art
Face detection determines the position, size, and related information of faces in an image or image sequence. It is widely used in systems such as face recognition, video surveillance, and intelligent human-machine interfaces. Face detection, especially against complex backgrounds, remains a difficult problem: intrinsic factors such as appearance, skin color, expression, and motion in three-dimensional space, together with external factors such as beards, hair, glasses, and illumination, cause large within-class variation of the face pattern, and because background objects are very complex, they are hard to separate from faces.
The mainstream approach to face detection is detection based on statistical learning from samples. Such methods introduce a "non-face" category and, by statistically learning from collected samples, estimate the model parameters that separate the "face" category from the "non-face" category, rather than relying on surface rules based on visual impressions. This is statistically more reliable: it avoids errors caused by incomplete or inaccurate observation, and detection coverage and robustness can be improved by enlarging the training set. Using a simple-to-complex multi-layer classifier structure, most background windows are first excluded by simple classifiers and the remaining windows are further judged by complex classifiers, achieving fast detection. However, such methods ignore the fact that the risks of misclassifying face and non-face are highly unbalanced in real images: the prior probability of a face appearing is far lower than that of a non-face, and since the main purpose of face detection is to find face positions, the risk of misclassifying a face as a non-face far exceeds that of misclassifying a non-face as a face. Training each layer only with the minimum-classification-error criterion and adjusting the classifier threshold to reach a low false rejection rate (FRR) for faces cannot simultaneously achieve a low false acceptance rate (FAR) for non-face patterns, leading to too many classifier layers, overly complex structure, slow detection, and degraded overall performance. To address these defects, the present invention proposes a face detection method based on the cost-sensitive AdaBoost algorithm (Cost Sensitive AdaBoost, CS-AdaBoost). By minimizing classification risk, each trained layer classifier guarantees an extremely low rejection rate for the face pattern while reducing the false acceptance rate for the non-face class as much as possible, achieving higher-performance face detection in complex background images with fewer classifier layers and simpler classifier structure; this approach has not been used in the prior literature. The object of the present invention is to realize a face detector capable of robustly locating faces in complex background images; the detection method comprises two stages, training and detection.
The robust precise eye positioning method in complex background images comprises two stages, training and detection.
In the training stage, a large number of samples are first collected: training samples are gathered from face images by manual calibration and then normalized. Feature extraction is performed on the collected training samples to obtain their feature library; on this basis, the classifier parameters are determined experimentally and the eye positioning classifiers are trained.
In the detection stage, for an input face image region I(x, y), 0 ≤ x < W_face, 0 ≤ y < H_face, the regions where the left and right eyes may exist are first estimated; all small windows within the two regions are then examined exhaustively (a small window is defined as a rectangular sub-image of the input image). Features are extracted from each small window and judged with the monocular detector, yielding all eye candidate positions in the region; the left and right eye candidates are then combined, global properties are used to select the best combination, and the final eye positions are located, giving excellent eye positioning accuracy.
The training stage comprises the following steps in order: sample collection and normalization, estimation of the left- and right-eye regions using projection functions, training of the monocular detector, and training of the eye-pair detector.
Step 1 Sample collection and normalization

(1) Sample collection
To train the monocular detector, single-eye images are cut from face images by manual calibration, and non-eye samples are randomly cut from the non-eye parts of the face images; the single-eye and non-eye images serve as positive and negative samples, respectively, for training the monocular detector.
In addition, to train the eye-pair detector, eye-pair samples are cropped from face images at a set ratio based on the manually calibrated eye positions, and non-eye-pair samples are randomly cut from the face images; the eye-pair and non-eye-pair images serve as positive and negative samples, respectively, for training the eye-pair detector. Samples collected this way include not only the two eyes but also the eyebrows, nose, and other parts, embodying the constraint relationship between the eyes and the surrounding organs.
The eye-pair samples are cropped from the face image in the following proportions: the line connecting the two eyeball centers is taken as the X axis, and the perpendicular to it through the midpoint between the eyes is taken as the Y axis, with the foot of the perpendicular on that center line. With the distance between the two eyeball centers denoted dist, the horizontal distances from the eyeball centers to the left and right outer borders, and the distances from the top and bottom borders to the foot of the perpendicular, are set as fixed fractions of dist.
(2) Size normalization

The collected sample images of all sizes, including single-eye and non-eye, eye-pair and non-eye-pair images, are normalized to a specified size. Let the original sample image be [F(x, y)]_{M×N}, with image width M and height N, and let the pixel at row x, column y have the value F(x, y), 0 ≤ x < M, 0 ≤ y < N. Let the normalized image be [G(x, y)]_{W×H}, with width W and height H. The correspondence between the input image and the normalized image is

x' = x·M/W,  y' = y·N/H.

According to the linear interpolation method, for a given (x', y') let x0 = ⌊x'⌋, y0 = ⌊y'⌋, Δx = x' − x0, Δy = y' − y0. The transformation can then be expressed as

G(x, y) = F(x0 + Δx, y0 + Δy)
        = F(x0, y0)(1 − Δx)(1 − Δy) + F(x0 + 1, y0)Δx(1 − Δy)
        + F(x0, y0 + 1)(1 − Δx)Δy + F(x0 + 1, y0 + 1)ΔxΔy.
(3) Grayscale normalization

External illumination, imaging devices, and other factors can make image brightness or contrast abnormal and produce strong shadows or reflections, so the geometrically normalized samples are also normalized in grayscale mean and variance, adjusting the sample image's grayscale mean μ and variance σ to given values μ0 and σ0.

First, the mean and variance of the sample image G(x, y), 0 ≤ x < W, 0 ≤ y < H, are computed:

μ = (1/(W·H)) Σ_{x=0}^{W−1} Σ_{y=0}^{H−1} G(x, y),
σ = [ (1/(W·H)) Σ_{x=0}^{W−1} Σ_{y=0}^{H−1} (G(x, y) − μ)² ]^(1/2).

Then the gray value of each pixel is transformed as

I(x, y) = (σ0/σ)(G(x, y) − μ) + μ0,

which adjusts the image's grayscale mean and variance to the given values μ0 and σ0, completing the grayscale normalization of the sample.
Step 2 Training of the monocular detector

The monocular detector is trained on the microstructure feature libraries of the normalized single-eye and non-eye samples, using the AdaBoost algorithm. The specific training process is as follows:

(1) Feature extraction

(i) The following five types of microstructure templates are set. They are used to extract five kinds of microstructure features from the samples; each feature is obtained by computing the difference between the grayscale sums of the pixels in the image regions corresponding to the template's black and white areas. The five types of microstructure features g(x, y, w, h) are defined as follows:
Class (a): the black and white regions are left-right symmetric with equal area; w denotes the width of each region and h its height.
Class (b): the black and white regions are vertically symmetric with equal area; w and h are defined as in (a).
Class (c): in the horizontal direction the black region lies between two white regions, and the black region and each white region have equal area; w and h are defined as in (a).
Class (d): the two black regions lie in the first and third quadrants and the two white regions in the second and fourth quadrants; each black and white region has equal area, and w and h are defined as in (a).
Class (e): the black region lies at the center of the white region, with its top, bottom, left, and right sides each 2 pixels inside the corresponding sides of the white region; w and h denote the width and height of the white region's outer frame.

(ii) Fast computation of the integral image:
For the image I(x, y), its corresponding integral image II(x, y) is defined as the sum of all pixels from (0, 0) to (x, y), that is,

II(x, y) = Σ_{0≤x'≤x} Σ_{0≤y'≤y} I(x', y').

(iii) Fast extraction of the high-dimensional microstructure features of single-eye and non-eye samples:

Each microstructure feature is obtained by computing the difference between the grayscale sums of the pixels in the black and white regions of the image covered by the template; both the template's position relative to the image and its size can vary. Since each feature extraction involves only sums of pixels over rectangular regions, the integral image of the whole image makes it possible to obtain a microstructure feature of any scale and position quickly:
(a) g(x,y,w,h) = 2·II(x+w−1, y−1) + II(x+2w−1, y+h−1) + II(x−1, y+h−1) − 2·II(x+w−1, y+h−1) − II(x+2w−1, y−1) − II(x−1, y−1)

(b) g(x,y,w,h) = 2·II(x+w−1, y+h−1) + II(x−1, y−1) − II(x+w−1, y−1) − 2·II(x−1, y+h−1) − II(x+w−1, y+2h−1) + II(x−1, y+2h−1)

(c) g(x,y,w,h) = 2·II(x+2w−1, y+h−1) + 2·II(x+w−1, y−1) − 2·II(x+2w−1, y−1) − 2·II(x+w−1, y+h−1) − II(x+3w−1, y+h−1) − II(x−1, y−1) + II(x−1, y+h−1) + II(x+3w−1, y−1)

(d) g(x,y,w,h) = 2·II(x+w−1, y−1) + 2·II(x−1, y+h−1) + 2·II(x+2w−1, y+h−1) + 2·II(x+w−1, y+2h−1) − II(x−1, y−1) − II(x+2w−1, y−1) − II(x−1, y+2h−1) − II(x+2w−1, y+2h−1) − 4·II(x+w−1, y+h−1)

(e) g(x,y,w,h) = II(x+w−1, y+h−1) + II(x−1, y−1) − II(x+w−1, y−1) − II(x−1, y+h−1) − 2·[II(x+w−3, y+h−3) + II(x+1, y+1) − II(x+w−3, y+1) − II(x+1, y+h−3)]

Varying the values of x, y, w, and h extracts microstructure features at different positions of the sample image; for an eye/non-eye sample image normalized to 24×12 pixels, 42727 features can be obtained, composing the sample image's feature vector FV(j), 1 ≤ j ≤ 42727.
(2) Normalization of the sample image features

First, the grayscale mean μ and variance σ of the pixels within the 24×12-pixel sample image region (x0 ≤ x' ≤ x0+23, y0 ≤ y' ≤ y0+11) are computed:

μ = [II(x0+23, y0+11) + II(x0−1, y0−1) − II(x0−1, y0+11) − II(x0+23, y0−1)] / 288,
σ = { [SqrII(x0+23, y0+11) + SqrII(x0−1, y0−1) − SqrII(x0−1, y0+11) − SqrII(x0+23, y0−1)] / 288 − μ² }^(1/2).

Next, each microstructure feature is normalized with μ and σ to give FV(j). For a 24×12-pixel sample image, 42727-dimensional microstructure features FV(j), 1 ≤ j ≤ 42727, are obtained.
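As an illustration, a short sketch of these constant-time window statistics using zero-padded integral and squared-integral images (names and the padding convention are illustrative choices):

```python
import numpy as np

def padded_integrals(img: np.ndarray):
    """Build zero-padded integral and squared-integral images (extra top row/left col)."""
    f = img.astype(np.float64)
    ii = np.pad(f.cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    sq = np.pad((f * f).cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    return ii, sq

def window_mean_std(ii: np.ndarray, sq_ii: np.ndarray,
                    x0: int, y0: int, h: int, w: int):
    """Mean and std of an h-by-w window at (x0, y0), each in O(1) time."""
    n = h * w
    s = ii[x0 + h, y0 + w] - ii[x0, y0 + w] - ii[x0 + h, y0] + ii[x0, y0]
    sq = sq_ii[x0 + h, y0 + w] - sq_ii[x0, y0 + w] - sq_ii[x0 + h, y0] + sq_ii[x0, y0]
    mu = s / n
    var = max(sq / n - mu * mu, 0.0)     # clamp tiny negatives from rounding
    return mu, float(np.sqrt(var))
```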
(3) Feature selection and classifier design

The AdaBoost algorithm is used to select features and train classifiers: in each round of iteration, AdaBoost selects the best-performing weak classifier based on a single feature, achieving feature selection; these weak classifiers are then integrated into a strong classifier, and a complete eye detector is obtained by cascading multiple strong classifiers. Specifically, it includes the following components:

(i) Construction of weak classifiers

The simplest tree classifier is constructed for each feature dimension as a weak classifier:

h_j(sub) = 1 if g_j(sub) < θ_j (or g_j(sub) > θ_j), 0 otherwise,

where sub is a 24×12-pixel sample, g_j(sub) is the j-th feature extracted from the sample, and θ_j is the decision threshold corresponding to the j-th feature, obtained by collecting statistics of the j-th feature over all gathered eye and non-eye samples such that the FRR on eye samples meets the specified requirement; h_j(sub) is the decision output of the tree classifier constructed with the j-th feature. Each weak classifier thus completes its decision with a single threshold comparison, and 42727 weak classifiers are obtained in total.

(ii) Eye/non-eye strong classifier design based on the AdaBoost algorithm

(iii) Cascading of multi-layer strong classifiers
Step 3 Training of the eye-pair classifier

The eye-pair classifier is trained on the normalized eye-pair and non-eye-pair samples: the feature libraries of the two sample classes are extracted, and the AdaBoost algorithm is used to train the eye-pair classifier. The microstructure features and training process are exactly the same as for the monocular detector above: the AdaBoost algorithm selects weak classifiers based on single features from the large pool of microstructure features to compose strong classifiers, and the multi-layer strong classifiers are cascaded. The specific training process of the eye-pair classifier likewise includes feature extraction, feature selection, strong classifier training, and cascading of multi-layer strong classifiers:
(1) Feature extraction

The normalized eye-pair and non-eye-pair samples are used to extract high-dimensional microstructure features by the feature extraction method described in step 2.1 above. For samples normalized to 25×15 pixels, 71210 features are obtained, composing the sample's feature vector FV(j), 1 ≤ j ≤ 71210.

(2) To reduce the influence of illumination, the grayscale mean and variance of every 25×15-pixel sample are normalized by the method of step 2.2:

First, the grayscale mean μ and variance σ of the 25×15-pixel sample are computed quickly. With the sample occupying the coordinate region (x0 ≤ x' ≤ x0+24, y0 ≤ y' ≤ y0+14) in the whole image, μ and σ are:

μ = [II(x0+24, y0+14) + II(x0−1, y0−1) − II(x0−1, y0+14) − II(x0+24, y0−1)] / 375,
σ = { [SqrII(x0+24, y0+14) + SqrII(x0−1, y0−1) − SqrII(x0−1, y0+14) − SqrII(x0+24, y0−1)] / 375 − μ² }^(1/2).

Next, each feature dimension is normalized with μ and σ. For a 25×15-pixel sample image, 71210-dimensional microstructure features FV(j), 1 ≤ j ≤ 71210, are obtained.
Secondly, the normalization of each dimension micro-structure feature is as follows:
Figure imgf000008_0001
For a sample image of 25x15 pixels, a total of 71210-dimensional microstructure features i^/), l ≤ _ / ≤ 71110 are obtained.
(3)特征选择及分类器设计  (3) Feature selection and classifier design
眼睛对检测器也采用分层结构,先由结构简单的强分类器排除掉图像中的背景窗口,然后 由结构复杂的强分类器对余下窗口进行判断。 具体来说包括以下几个组成部分:  The eye also uses a layered structure for the detector. The background window in the image is first excluded by a strong classifier with a simple structure, and then the remaining window is judged by a strong classifier with a complicated structure. Specifically, it includes the following components:
(i)弱分类器的构造 弱分类器仍使用一维特征构造的树分类器; , , Κ ί if g sub)< or gj(snb)>0(i) Constructing a weak classifier The weak classifier still uses a tree classifier constructed with one-dimensional features; , , Κ ί if g sub)< or gj (snb)>0
2,(SUb) = ^  2, (SUb) = ^
[ 0, otherwise 共可得到 71210个弱分类器。  [0, otherwise A total of 71210 weak classifiers are available.
(ii) 基于 AdaBoost算法的眼睛对 /非眼睛对强分类器设计  (ii) Eye-to-non-eye-to-strong classifier design based on AdaBoost algorithm
(iii)多层强分类器的级联 (iii) Cascade of multi-layer strong classifiers
所述检测阶段是指是判断一张输入的人脸区域眼睛中心位置, 包含以下步骤:  The detection phase refers to determining the center position of the eye area of an input face, and includes the following steps:
In the eye detection stage, for an input face region, the eye center positions are located precisely by the following steps:

Step 1: Estimate the regions Ω_lefteye and Ω_righteye in which the left and right eyes lie.
The dividing line between Ω_lefteye and Ω_righteye in the horizontal direction is determined from the mean and variance functions of the vertical projection of the face image; the upper and lower boundaries of Ω_lefteye and Ω_righteye are then fixed according to the vertical distribution of eye positions within the face region, measured on the training samples, which yields the estimates of Ω_lefteye and Ω_righteye;
(1) Determining the left-right dividing line of the eye regions with projection functions

Take the upper half of the detected face region. The peak of the ratio of the mean function MPF_v(x) to the variance function VPF_v(x) of its vertical gray-level projection serves as the vertical dividing line between the regions of the left and right eyes; this position is defined as x_peak:

x_peak = argmax_{0 ≤ x < W_face} MPF_v(x) / VPF_v(x)
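The dividing line can be illustrated as follows, assuming the usual column-wise definitions of the mean and variance projection functions; the helper name and the small constant guarding against division by zero are assumptions.

```python
import numpy as np

def eye_dividing_line(face_gray):
    """x_peak = argmax over columns of MPF_v(x) / VPF_v(x), computed on the
    upper half of the face region (the nose bridge between the eyes tends to
    have a high mean and a low variance, so the ratio peaks there)."""
    upper = face_gray[: face_gray.shape[0] // 2].astype(np.float64)
    mpf = upper.mean(axis=0)            # MPF_v(x): column-wise mean
    vpf = upper.var(axis=0) + 1e-6      # VPF_v(x): column-wise variance
    return int(np.argmax(mpf / vpf))    # x_peak
```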
(2) Obtaining the upper and lower boundaries of the eye regions from sample statistics

The upper and lower boundaries of Ω_lefteye and Ω_righteye are obtained from the statistical distribution of the eyes along the vertical direction of the face in the samples:

Ω_lefteye = {(x, y) | 0 ≤ x < x_peak, 0.05·H_face ≤ y ≤ 0.45·H_face}
Ω_righteye = {(x, y) | x_peak ≤ x < W_face, 0.05·H_face ≤ y ≤ 0.45·H_face}

where H_face and W_face are the face height and width obtained from the sample statistics;
Step 2: Detect eye candidates with the single-eye detector

The single-eye detector is applied separately within Ω_lefteye and Ω_righteye, yielding up to 20 candidate positions for the left and right eyes, and the confidence of each candidate position is estimated. The detection of eye candidates proceeds as follows:
(1) Computation of the integral images of the input face image

Compute the integral image II(x, y) and the squared integral image SqrII(x, y) of the input face image I(x, y);
(2) Judging every small window within the left and right eye regions

Every small window of 24×12 pixels inside Ω_lefteye and Ω_righteye is judged; an arbitrary window [x0, y0; x0+23, y0+11] is processed as follows:

(i) Compute the mean μ and standard deviation σ of the window from the integral image II(x, y) and the squared integral image SqrII(x, y) of the whole image:

μ = [II(x0+23, y0+11) + II(x0−1, y0−1) − II(x0−1, y0+11) − II(x0+23, y0−1)] / 288

σ = {[SqrII(x0+23, y0+11) + SqrII(x0−1, y0−1) − SqrII(x0−1, y0+11) − SqrII(x0+23, y0−1)] / 288 − μ²}^(1/2)
(ii) Extract the microstructure features of the window by the feature extraction method of training step 2 (1), and normalize the features;
(iii) Judge the window with the trained multi-layer eye/non-eye strong classifiers. If the window passes the judgment of every layer, it is considered to contain an eye candidate and its position and confidence are output; otherwise the window is discarded without further processing. Finally, at most the top 20 candidate positions, ranked by confidence, are output;
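The candidate search of steps (1)-(iii) can be sketched as below. The helpers `extract_features` and `confidence`, and the representation of the cascade as a list of callables, are assumptions standing in for the trained detector described above.

```python
def detect_eye_candidates(region, layers, extract_features, confidence,
                          win_w=24, win_h=12, top_k=20):
    """Scan every 24x12 window of one eye region with the layered
    single-eye detector; keep at most top_k candidates by confidence."""
    H, W = region.shape
    candidates = []
    for y0 in range(H - win_h + 1):
        for x0 in range(W - win_w + 1):
            fv = extract_features(region, x0, y0, win_w, win_h)
            # a window is kept only if it passes every cascade layer
            if all(layer(fv) == 1 for layer in layers):
                candidates.append((confidence(fv), x0, y0))
    candidates.sort(reverse=True)       # highest confidence first
    return candidates[:top_k]
```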
Step 3: Verification of the eye candidate pairs

To eliminate false detections and imprecise localization results among the eye candidates, the left and right eye candidates are paired, richer features are extracted from the regions surrounding the candidates, each candidate pair is verified with the eye-pair classifier, and the best positions of the two eyes are finally estimated from all candidate pairs according to their posterior probabilities. Specifically, each pair of eye candidates undergoes the following processing steps:
(1) Cropping and size normalization according to the left and right candidate positions

For each pair of eye candidates, an image patch is first cropped according to the left and right candidate positions, in the same way as the eye-pair samples were cropped in training step 1.1; size normalization and illumination normalization then yield a 25×15-pixel eye-pair candidate image PI(x, y);
(2) Computation of the integral image of the input patch

Compute the integral image PII(x, y) of the image PI(x, y):

PII(x, y) = Σ_{0≤x′≤x} Σ_{0≤y′≤y} PI(x′, y′)
(3) Judgment of the eye-pair candidate image PI(x, y)

Each eye-pair candidate image is verified as follows:

(i) Extract the microstructure features using the integral image of the whole patch;
(ii) Judge the patch with the trained i-th layer strong classifier;
(iii) If the patch passes the judgment, increase i by 1 and return to step 3 (3) (ii); otherwise discard the candidate pair. If the patch passes the judgment of every layer, the pair is considered a valid candidate pair and its position and confidence are output;
Finally, all candidate pairs that pass the judgment are sorted by confidence in descending order, the average position of the 3 candidate pairs with the highest confidence is taken as the eye center positions, and the eye positions are output.
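A sketch of the pairing, verification and top-3 averaging just described; `verify_pair`, assumed to crop, normalize and run the eye-pair cascade for one candidate pair and to return a pass flag with a confidence, stands in for steps (1)-(3) above.

```python
import numpy as np

def locate_eyes(left_cands, right_cands, verify_pair):
    """Pair every left candidate with every right candidate, verify each
    pair with the eye-pair classifier, and average the 3 most confident
    surviving pairs to obtain the final eye centers."""
    accepted = []
    for lx, ly in left_cands:
        for rx, ry in right_cands:
            ok, conf = verify_pair((lx, ly), (rx, ry))
            if ok:
                accepted.append((conf, (lx, ly), (rx, ry)))
    accepted.sort(key=lambda t: t[0], reverse=True)
    top = accepted[:3]
    if not top:
        return None                              # no valid pair survived
    left = np.mean([p[1] for p in top], axis=0)  # averaged left-eye center
    right = np.mean([p[2] for p in top], axis=0) # averaged right-eye center
    return left, right
```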
Training the i-th layer strong classifier in step 3 (2) (ii) comprises the following steps:

The AdaBoost algorithm, combined with the weak classifier construction above, is used to train the eye/non-eye strong classifiers. The training algorithm is as follows. Denote the given training set E = {(sub_i, l_i)}, i = 1, …, n, where l_i = 0, 1 is the class label of the sample image sub_i, corresponding to the non-eye and eye classes respectively; the set contains both eye samples and non-eye samples.
1) Initialization of the parameters

Initialize the training sample weights: the initial weight of each sample is D_1(i) = 1/n.

Choose the number of iterations T, i.e. the number of weak classifiers to be used; T should grow gradually as the number of strong classifier layers increases.

Compute, over the sample set, the maximum Fmax(j) and minimum Fmin(j) of the distribution of each feature, where j is the feature index, 1 ≤ j ≤ 42727;
2) Repeat the following process T times, for t = 1, …, T:

Using the j-th feature, 1 ≤ j ≤ 42727, construct a weak classifier h_j, then exhaustively search for the threshold θ_j between Fmin(j) and Fmax(j) that minimizes the error rate ε_j of h_j, defined as

ε_j = Σ_{i=1}^{n} D_t(i) · |h_j(sub_i) − l_i|

Let ε_t = min_{1≤j≤42727} ε_j and take the corresponding weak classifier as h_t;

Compute the parameter α_t = (1/2) ln((1 − ε_t)/ε_t);

Update the sample weights as D_{t+1}(i) = D_t(i) exp(−α_t y_i h_t(sub_i)) / Z_t, where y_i ∈ {−1, +1} is the class label l_i mapped to ±1 and Z_t is a normalization factor chosen so that the updated weights sum to 1.
Output the final strong classifier

H(sub) = 1, if Σ_{t=1}^{T} α_t h_t(sub) ≥ (1/2) Σ_{t=1}^{T} α_t; 0, otherwise.

For a pattern x that passes the judgment of the strong classifier, the posterior probability that the pattern belongs to the eye class is taken as

P(l = 1 | f(x)) = e^{f(x)} / (e^{f(x)} + e^{−f(x)}), where f(x) = Σ_{t=1}^{T} α_t h_t(x).
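The training loop and the confidence mapping can be sketched as follows. This sketch maps the 0/1 labels to ±1 internally, thins the exhaustive threshold search to a subsample of observed feature values, and clamps the error rate away from 0 and 1; these are simplifying assumptions, not details from the text.

```python
import numpy as np

def adaboost(X, labels, T):
    """Train a strong classifier from single-feature stumps.
    X: (n, d) matrix of normalized features; labels: 0/1 array."""
    n, d = X.shape
    y = 2 * labels - 1                       # map {0,1} -> {-1,+1}
    D = np.full(n, 1.0 / n)                  # initial sample weights
    stumps, alphas = [], []
    for _ in range(T):
        best = None
        for j in range(d):                   # search all features ...
            for theta in np.unique(X[:: max(1, n // 32), j]):  # ... thinned
                for parity in (1, -1):
                    h = np.where(parity * X[:, j] < parity * theta, 1, -1)
                    err = float(D[h != y].sum())
                    if best is None or err < best[0]:
                        best = (err, j, theta, parity)
        err, j, theta, parity = best
        err = min(max(err, 1e-12), 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)          # alpha_t
        h = np.where(parity * X[:, j] < parity * theta, 1, -1)
        D *= np.exp(-alpha * y * h)                    # weight update
        D /= D.sum()                                   # Z_t normalization
        stumps.append((j, theta, parity))
        alphas.append(alpha)
    return stumps, np.array(alphas)

def eye_confidence(stumps, alphas, x):
    """Posterior that the pattern is an eye: e^f / (e^f + e^-f)."""
    f = sum(a * (1.0 if p * x[j] < p * t else -1.0)
            for (j, t, p), a in zip(stumps, alphas))
    return np.exp(f) / (np.exp(f) + np.exp(-f))
```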
In step 2 (3) (iii) the single-eye detector adopts a layered structure; training the cascade of multi-layer strong classifiers comprises the following steps:

1) Initialize i = 1. Define the training target of each layer of strong classifier as FRR ≤ 0.1% on the eye training set and FAR ≤ 60% on the non-eye training set; define the target of the whole eye detector as FRR ≤ 1% on the eye training set and FAR ≤ 5×10⁻⁴ on the non-eye training set;
2) Train the i-th layer eye/non-eye strong classifier on the training sample set with the AdaBoost algorithm described in step 2 (3) (ii);

3) Run the first i trained layers on the sample set;

4) If FRR and FAR have not reached the predetermined values, increase i by 1 and return to step 2) to continue training; otherwise stop training;
In total, 7 layers of strong classifiers are trained, ordered from simple to complex structure; cascading these strong classifiers forms a complete eye detector;
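A sketch of this layer-growing loop; `train_layer` and `evaluate` are assumed helpers (the former wraps the AdaBoost training of one strong classifier, the latter measures FRR and FAR of the first i layers on the training sets), and the safety bound on the number of layers is not part of the original procedure.

```python
def train_cascade(pos, neg, layer_targets, overall_targets,
                  train_layer, evaluate, max_layers=50):
    """Grow the cascade layer by layer (steps 1-4 above) until the overall
    FRR/FAR targets are met."""
    max_frr, max_far = overall_targets   # e.g. (0.01, 5e-4) for the eye detector
    layers = []
    for _ in range(max_layers):          # safety bound only
        layers.append(train_layer(pos, neg, layer_targets))
        frr, far = evaluate(layers, pos, neg)
        if frr <= max_frr and far <= max_far:
            break                        # predetermined values reached
    return layers
```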
In step 3 (3) (iii) the whole eye-pair verifier adopts a layered structure; training the cascade of multi-layer strong classifiers comprises the following steps:

1) Initialize i = 1. Define the training target of each layer of strong classifier as FRR ≤ 0.1% on the eye-pair training set and FAR ≤ 50% on the non-eye-pair training set; define the target of the whole eye-pair detector as FRR ≤ 1% on the eye-pair training set and FAR ≤ 1×10⁻³ on the non-eye-pair training set;

2) Train the i-th layer eye-pair/non-eye-pair strong classifier on the training sample set with the AdaBoost algorithm described in step 3.3.2;

3) Run the first i trained layers on the sample set;

4) If FRR and FAR have not reached the predetermined values, increase i by 1 and return to step 2) to continue training; otherwise stop training.

In total, 9 layers of strong classifiers are trained, ordered from simple to complex structure; cascading these strong classifiers forms a complete eye-pair detector.

To verify the effectiveness of the invention, the following experiments were performed:
The test sets used for the eye positioning algorithm comprise the following three parts:
Test set 1: composed of Yale B, AeroInfo and a face database of the Ministry of Public Security, containing 4353 images of 209 individuals in total. The Yale B database contains 165 images of 15 individuals and is characterized by complex illumination changes. The AeroInfo database, provided by China Aerospace Information Co., Ltd., contains 3740 images of 165 individuals; it is characterized by complex external illumination and face pose changes, complex backgrounds, and poor face image quality. The Ministry of Public Security database contains 448 images of 30 individuals, with complex illumination changes and strong reflections on the glasses worn by some subjects;
Test set 2: composed of the English part of the BANCA face database, containing 6540 images of 82 individuals in total. Image background and image quality vary greatly, covering images acquired under the controlled, degraded and adverse scenarios; illumination and face pose changes are also complex, and many subjects wear black-rimmed glasses;
Test set 3: the JAFFE database, containing 213 face images, characterized by rich variations of facial expression.

Tests performed on a collection so rich in sources and variations should truly reflect the performance of a positioning algorithm:

Table 1: Performance comparison with other positioning algorithms under different allowed errors
Compared with the commercial product FaceIt, the present algorithm performs stably on the different test sets and is superior to FaceIt's positioning accuracy throughout, whereas FaceIt proved sensitive in the experiments to factors such as the opening and closing of the eyes and the size of the face. Compared with the method of Zhou [Zhou Z H, Geng X. Projection functions for eye detection. Pattern Recognition, 2004], the present method achieves a localization accuracy of 98.6% within an error of 0.10 on the JAFFE database, while that method achieves only 97.2% within an error of 0.25.

BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1: hardware configuration of a typical eye positioning system;
Fig. 2: collection procedure of the training samples;
Fig. 3: examples of single-eye samples and eye-pair samples;
Fig. 4: structural block diagram of the eye positioning system;
Fig. 5: example of the eye positioning process;
Fig. 6: the five microstructure feature templates used;
Fig. 7: example of integral image computation and microstructure feature extraction;
Fig. 8: cascade structure of the multi-level strong classifiers;
Fig. 9: training procedure of a strong classifier based on the AdaBoost algorithm;
Fig. 10: proportions of the eye-pair template;
Fig. 11: a face recognition sign-in system based on the present algorithm.

DETAILED DESCRIPTION
The hardware structure of the whole eye positioning system is shown in Fig. 1, in which 101 is a scanner, 102 is a camera and 103 is a computer. The training process and the recognition process of the system are shown in Fig. 4; the parts of the system are described in detail below:
The input to the system is a single face region image. Face detection is not part of the present invention and is not described in detail.

A) Implementation of the training system (shown in Figs. 2 and 3)
1. Sample collection and normalization

(1) Collection of samples

To train the single-eye detector, single-eye images are cut out of face pictures by manual labeling, and non-eye samples are cut at random from the non-eye parts of the face images. The single-eye images and non-eye images are used as positive and negative samples, respectively, for training the single-eye detector. Some training samples are shown in Fig. 3(a).
In addition, to train the eye-pair detector, eye-pair samples are cropped from the face images according to the manually labeled eye positions, in the proportions of the eye-pair template (Fig. 10), and non-eye-pair samples are cut at random from the face images. The eye-pair and non-eye-pair images are used as positive and negative samples, respectively, for training the eye-pair detector. Some collected samples are shown in Fig. 3(b). Samples collected in this way include not only the two eyes but also the eyebrows, the nose and other parts, reflecting the constraints between the eyes and the surrounding organs.
(2) Size normalization

The collected sample images of all sizes (single-eye and non-eye, eye-pair and non-eye-pair images) are normalized to specified sizes. Let the original sample image be [F(x, y)]_{M×N}, with width M, height N and pixel value F(x, y) at row x, column y (0 ≤ x < M, 0 ≤ y < N); let the size-normalized image be [G(x, y)]_{W×H}, with width W and height H. In the experiments W = 24, H = 12 for the single-eye samples and W = 25, H = 15 for the eye-pair samples. The invention transforms the original sample image into the standard-size sample image by back projection and linear interpolation; the correspondence between the input image [F(x, y)]_{M×N} and the normalized image [G(x, y)]_{W×H} is

x′ = x · r_x, y′ = y · r_y

where r_x = M/W and r_y = N/H are the scale factors in the x and y directions. Following the linear interpolation method, for a given (x, y) let

x0 = ⌊x·r_x⌋, y0 = ⌊y·r_y⌋, Δ_x = x0 + 1 − x·r_x, Δ_y = y0 + 1 − y·r_y

then

G(x, y) = F(x0 + Δ_x, y0 + Δ_y)
 = F(x0, y0) Δ_x Δ_y + F(x0+1, y0)(1 − Δ_x) Δ_y + F(x0, y0+1) Δ_x (1 − Δ_y) + F(x0+1, y0+1)(1 − Δ_x)(1 − Δ_y)
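The back-projection resize can be sketched as follows, with Δ_x and Δ_y defined as above; array indexing is [row, column], and the clamping at the right and bottom borders is an assumed detail for the (atypical here) upscaling case.

```python
import numpy as np

def resize_bilinear(F, W, H):
    """Back-project each target pixel into the source image and interpolate
    linearly (e.g. W=24, H=12 for single-eye samples, W=25, H=15 for
    eye-pair samples)."""
    N, M = F.shape                       # source height, width
    rx, ry = M / W, N / H                # scale factors
    G = np.empty((H, W), dtype=np.float64)
    for y in range(H):
        for x in range(W):
            xs, ys = x * rx, y * ry
            x0 = min(int(xs), M - 2)
            y0 = min(int(ys), N - 2)
            dx = x0 + 1 - xs             # Delta_x as defined above
            dy = y0 + 1 - ys             # Delta_y as defined above
            G[y, x] = (F[y0, x0] * dx * dy
                       + F[y0, x0 + 1] * (1 - dx) * dy
                       + F[y0 + 1, x0] * dx * (1 - dy)
                       + F[y0 + 1, x0 + 1] * (1 - dx) * (1 - dy))
    return G
```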
(3) Grayscale normalization

The invention applies gray-level mean and variance normalization to equalize the samples, adjusting the gray-level mean μ and standard deviation σ of a sample picture to given values μ0 and σ0. First, the mean and variance of the sample image (0 ≤ x < W, 0 ≤ y < H) are computed as

μ = (1/(W·H)) Σ_{y=0}^{H−1} Σ_{x=0}^{W−1} G(x, y)

σ² = (1/(W·H)) Σ_{y=0}^{H−1} Σ_{x=0}^{W−1} (G(x, y) − μ)²

Then the gray value of each pixel is transformed as

G′(x, y) = (σ0/σ)(G(x, y) − μ) + μ0

which adjusts the mean and variance of the image gray levels to the given values μ0 and σ0 and completes the grayscale normalization of the sample.
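A minimal sketch of this transformation; the target values μ0 = 128 and σ0 = 40 are placeholders, since the text leaves them as given constants.

```python
import numpy as np

def normalize_gray(G, mu0=128.0, sigma0=40.0):
    """Shift and scale the gray levels so that the sample has mean mu0 and
    standard deviation sigma0."""
    mu = G.mean()
    sigma = G.std() + 1e-12     # guard against constant patches
    return (G - mu) * (sigma0 / sigma) + mu0
```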
2. Training of the single-eye detector

The single-eye detector is trained on the microstructure feature libraries of the normalized single-eye and non-eye samples, using the AdaBoost algorithm. The specific training process is as follows:
(1) Feature extraction

Because a microstructure feature of any scale and any position in the image can be obtained quickly from the integral image of the whole image, microstructure features make real-time eye detection possible. The invention uses the five types of microstructure templates of Fig. 6 to extract high-dimensional microstructure features of the eye pattern; each feature is obtained by computing the difference between the gray-level sums of the pixels in the black and white regions of the template, expressing the characteristics of the eye pattern.
① Fast computation of the integral image

For an image I(x, y) (x ≥ 0, y ≥ 0), define its integral image II(x, y) as the sum of all pixels from (0, 0) to (x, y), i.e.

II(x, y) = Σ_{0≤x′≤x} Σ_{0≤y′≤y} I(x′, y′)

The pixel sum of any rectangular region of the original image I(x, y) can then be computed quickly from the integral image with 3 additions/subtractions.
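A sketch of this rectangle-sum lookup, using a zero-padded integral image so the four-corner formula needs no border cases; the padding convention is an assumption.

```python
import numpy as np

def rect_sum(II, x, y, w, h):
    """Sum of I over the rectangle with top-left (x, y), width w, height h,
    via three additions/subtractions on the zero-padded integral image."""
    return II[y + h, x + w] - II[y, x + w] - II[y + h, x] + II[y, x]

# Quick self-check on a small image:
img = np.arange(12.0).reshape(3, 4)
II = np.pad(img.cumsum(0).cumsum(1), ((1, 0), (1, 0)))
assert rect_sum(II, 1, 1, 2, 2) == img[1:3, 1:3].sum()
```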
② Fast extraction of the microstructure features

Any of the above microstructure features can be computed from the integral image with a few additions and subtractions. Taking the extraction of the type (a) feature of Fig. 6 as an example (see Fig. 7): once the integral image of the whole image has been computed, the type (a) microstructure feature whose top-left pixel is at (x, y), with each region w pixels wide and h pixels high, is

g(x, y, w, h) = 2·II(x+w−1, y−1) + II(x+2w−1, y+h−1) + II(x−1, y+h−1) − 2·II(x+w−1, y+h−1) − II(x+2w−1, y−1) − II(x−1, y−1)

subject to the constraints

x0 ≤ x, y0 ≤ y, x + 2w ≤ x0 + 24, y + h ≤ y0 + 12

Varying the parameters x, y, w, h extracts features at different positions and scales; the other feature types are extracted analogously. For an eye pattern whose size is normalized to 24×12 pixels, 42727 features are obtained in total;
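The type (a) feature can be read off the integral image exactly as in the formula, for example as below; here II is assumed unpadded, i.e. II[y, x] is the cumulative sum through row y and column x, so the x−1 and y−1 indices of the formula require x, y ≥ 1.

```python
def feature_a(II, x, y, w, h):
    """Type (a) two-rectangle feature: sum over the right (white) rectangle
    minus sum over the left (black) rectangle, both w x h, top-left (x, y)."""
    return (2 * II[y - 1, x + w - 1]
            + II[y + h - 1, x + 2 * w - 1]
            + II[y + h - 1, x - 1]
            - 2 * II[y + h - 1, x + w - 1]
            - II[y - 1, x + 2 * w - 1]
            - II[y - 1, x - 1])
```

Expanding the six terms shows they are exactly the difference of the two four-corner rectangle sums, which is why a single feature costs only a handful of array lookups.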
(2) Feature selection and classifier design

The invention uses the AdaBoost algorithm both to select features and to train the classifiers. In each iteration AdaBoost selects the best-performing weak classifier based on a single feature, achieving feature selection; these weak classifiers are then combined into a strong classifier, and cascading several strong classifiers yields a high-performance eye detector. Specifically, this comprises the following components:
① Construction of the weak classifiers

The weak classifiers must have a very high classification speed for the whole strong classifier to reach a sufficiently high speed. The invention constructs, for each feature dimension, the simplest tree classifier as the weak classifier:

h_j(sub) = 1, if g_j(sub) < θ_j (or g_j(sub) > θ_j, depending on the parity); 0, otherwise

where sub is a 24×12-pixel sample, g_j(sub) denotes the j-th feature extracted from that sample, θ_j is the decision threshold of the j-th feature, and h_j(sub) is the decision output of the tree classifier built on the j-th feature. Each weak classifier thus completes its decision with a single threshold comparison; 42727 weak classifiers are obtained in total.
② Design of the eye/non-eye strong classifiers based on the AdaBoost algorithm

The invention combines the AdaBoost algorithm with the weak classifier construction above to train the eye/non-eye strong classifiers. The steps of the training algorithm are as follows (denote the given training set E = {(sub_i, l_i)}, i = 1, …, n, where l_i = 0, 1 is the class label of sample image sub_i, corresponding to the non-eye and eye classes respectively):
(i) Initialization of the parameters

Initialize the training sample weights: the initial weight of each sample is D_1(i) = 1/n.

Choose the number of iterations T (T is the number of weak classifiers to be used); T should grow gradually as the number of strong classifier layers increases, with the specific values listed in Table 1.

Compute, over the sample set, the maximum Fmax(j) and minimum Fmin(j) of the distribution of each feature (where j is the feature index, 1 ≤ j ≤ 42727);
(ii) Repeat the following process T times (t = 1, …, T):

a) Using the j-th feature (1 ≤ j ≤ 42727), construct a weak classifier h_j, then exhaustively search for the threshold θ_j between Fmin(j) and Fmax(j) that minimizes the error rate

ε_j = Σ_{i=1}^{n} D_t(i) · |h_j(sub_i) − l_i|

b) Let ε_t = min_{1≤j≤42727} ε_j and take the corresponding weak classifier as h_t;

c) Compute the parameter α_t = (1/2) ln((1 − ε_t)/ε_t);

d) Update the sample weights as D_{t+1}(i) = D_t(i) exp(−α_t y_i h_t(sub_i)) / Z_t, where y_i ∈ {−1, +1} is the label l_i mapped to ±1 and Z_t is a normalization factor chosen so that Σ_i D_{t+1}(i) = 1. Output the final strong classifier

H(sub) = 1, if Σ_{t=1}^{T} α_t h_t(sub) ≥ (1/2) Σ_{t=1}^{T} α_t; 0, otherwise.
For a pattern x that passes the judgment of the strong classifier, the invention uses

P(l = 1 | f(x)) = e^{f(x)} / (e^{f(x)} + e^{−f(x)}), with f(x) = Σ_{t=1}^{T} α_t h_t(x)

as the confidence that the pattern belongs to the eye class.
③ Cascading of the multi-layer strong classifiers

Since a single strong classifier can hardly achieve a high classification speed, an extremely low FRR and an extremely low FAR at the same time, the whole eye detector must adopt a layered structure in which multi-layer strong classifiers are cascaded from simple to complex, as shown in Fig. 8. During detection, any image window that fails to pass some layer is rejected immediately, which saves a large amount of computation.
The specific training steps of the multi-layer strong classifier cascade are as follows:

e) Initialize i = 1. Define the training target of each layer of strong classifier as FRR ≤ 0.1% on the eye training set and FAR ≤ 60% on the non-eye training set; define the target of the whole eye detector as FRR ≤ 1% on the eye training set and FAR ≤ 5×10⁻⁴ on the non-eye training set;

f) Train the i-th layer strong classifier on the training sample set with the method of section 2 (2) ②;

g) Run the first i trained layers on the sample set;

h) If FRR and FAR have not reached the predetermined values, set i = i + 1 and return to step f) to continue training; otherwise stop training.

In total, 7 layers of strong classifiers are trained, ordered from simple to complex structure; cascading these strong classifiers forms a complete eye detector.
3. Verification of the eye candidate pairs with the eye-pair classifier

To eliminate false alarms and imprecise localization results among the candidates, the invention pairs the left and right eye candidates, extracts richer features from the regions surrounding the candidates, verifies each candidate pair with the eye-pair classifier, and finally estimates the best positions of the two eyes from all candidate pairs according to their posterior probabilities (as shown in Figs. 4 and 5). Training the eye-pair classifier comprises the following steps:
(1) Feature extraction

The five types of microstructure templates of Fig. 6 are used to extract the high-dimensional microstructure features of the eye-pair and non-eye-pair samples. Again, a microstructure feature of any scale and position can be obtained quickly from the integral image II(x, y) = Σ_{0≤x′≤x} Σ_{0≤y′≤y} I(x′, y′) of the whole image. The squared integral image is likewise defined as SqrII(x, y) = Σ_{0≤x′≤x} Σ_{0≤y′≤y} I(x′, y′)·I(x′, y′) and is used to compute the variance of each rectangular region.

Since each feature extraction involves only pixel sums over rectangular regions, any of the above microstructure features can be computed quickly with a few additions and subtractions on the integral images. For an eye-pair pattern whose size is normalized to 25×15 pixels, 71210 features are obtained in total, forming the feature vector FV(j), 1 ≤ j ≤ 71210, of the sample. To reduce the influence of illumination, the gray-level mean and variance of every 25×15-pixel sample image must be normalized, so the mean μ and standard deviation σ of the small window are first computed quickly; for the 25×15-pixel window region (x0 ≤ x′ ≤ x0+24, y0 ≤ y′ ≤ y0+14), μ and σ are

μ = [II(x0+24, y0+14) + II(x0−1, y0−1) − II(x0−1, y0+14) − II(x0+24, y0−1)] / 375

σ = {[SqrII(x0+24, y0+14) + SqrII(x0−1, y0−1) − SqrII(x0−1, y0+14) − SqrII(x0+24, y0−1)] / 375 − μ²}^(1/2)

Each microstructure feature is then normalized as

FV′(j) = FV(j) / σ

For a 25×15-pixel sample image, 71210 normalized microstructure features FV′(j), 1 ≤ j ≤ 71210, are obtained.
(2) Feature selection and classifier design

To reach a sufficiently fast verification speed, the eye-pair detector must adopt a layered structure (as shown in Fig. 8): strong classifiers of simple structure first reject the background windows in the image, and strong classifiers of complex structure then judge the remaining windows. This part again uses the AdaBoost algorithm to select features and train the classifiers, as shown in Fig. 9. Specifically, it comprises the following components:

① Construction of the weak classifiers

The weak classifiers are again tree classifiers built on one-dimensional features:

h_j(sub) = 1, if g_j(sub) < θ_j (or g_j(sub) > θ_j, depending on the parity); 0, otherwise

71210 weak classifiers are obtained in total.
② Design of the eye-pair/non-eye-pair strong classifiers based on the CS-AdaBoost algorithm

The CS-AdaBoost (cost-sensitive AdaBoost) algorithm is combined with the weak classifier construction above to train the eye-pair strong classifiers. The training steps are as follows (denote the training sample set E = {(sub_i, l_i)}, i = 1, …, n, where l_i = 0, 1 is the class label of sample image sub_i, corresponding to the non-eye-pair and eye-pair classes respectively; the set contains both eye-pair samples and non-eye-pair samples):

(i) Initialization of the parameters

Initialize the misclassification risks of the training samples: the misclassification risk of each eye-pair sample is c(i) = 2c/(c + 1), and that of each non-eye-pair sample is c(i) = 2/(c + 1), where c is the ratio of the misclassification risk of the eye-pair class to that of the non-eye-pair class; c should be greater than 1 and should gradually decrease toward 1 as the number of strong classifier layers increases, with the specific values listed in Table 2.

Initialize the training sample weights: the initial weight of each sample is proportional to its misclassification risk, D_1(i) = c(i) / Σ_{k=1}^{n} c(k).
Choose the number of iterations T (T is the number of weak classifiers to be used); T should grow gradually as the number of strong classifier layers increases, with the specific values listed in Table 2.

Compute, over the sample set, the maximum Fmax(j) and minimum Fmin(j) of the distribution of each feature (where j is the feature index, 1 ≤ j ≤ 71210): Fmax(j) = max_i FV_i(j), Fmin(j) = min_i FV_i(j);
(ii) Repeat the following process T times (t = 1, …, T):

a) Using the j-th feature (1 ≤ j ≤ 71210), construct a weak classifier h_j, then exhaustively search for the threshold θ_j between Fmin(j) and Fmax(j) that minimizes the error rate

ε_j = Σ_{i=1}^{n} D_t(i) · |h_j(sub_i) − l_i|

b) Let ε_t = min_{1≤j≤71210} ε_j and take the corresponding weak classifier as h_t;

c) Compute the parameter α_t = (1/2) ln((1 − ε_t)/ε_t);

d) Update the sample weights as D_{t+1}(i) = D_t(i) exp(−α_t y_i h_t(sub_i)) / Z_t, where y_i ∈ {−1, +1} is the label l_i mapped to ±1 and Z_t = Σ_i D_t(i) exp(−α_t y_i h_t(sub_i)) normalizes the weights. Output the final strong classifier

H(sub) = 1, if Σ_{t=1}^{T} α_t h_t(sub) ≥ (1/2) Σ_{t=1}^{T} α_t; 0, otherwise.
For a pattern sub that passes the judgment of the strong classifier, the invention uses

P(l = 1 | f(sub)) = e^{f(sub)} / (e^{f(sub)} + e^{−f(sub)}), with f(sub) = Σ_{t=1}^{T} α_t h_t(sub)

as the posterior probability that the pattern belongs to the eye-pair class.
③ Cascading of the multi-layer strong classifiers

The whole eye-pair verifier adopts a layered structure, as shown in Fig. 8. The specific training steps of the multi-layer strong classifier cascade are as follows:
i) Initialize i = 1. Define the training target of each layer of strong classifier as FRR ≤ 0.1% on the eye-pair training set and FAR ≤ 50% on the non-eye-pair training set; define the target of the whole eye-pair detector as FRR ≤ 1% on the eye-pair training set and FAR ≤ 1×10⁻³ on the non-eye-pair training set;

j) Train the i-th layer strong classifier on the training sample set;

k) Run the first i trained layers on the sample set;

l) If FRR and FAR have not reached the predetermined values, set i = i + 1 and return to step j) to continue training; otherwise stop training.

In total, 9 layers of strong classifiers are trained, ordered from simple to complex structure and using 1347 features; cascading these strong classifiers forms a complete eye-pair detector.
B) Implementation of the test system
The eye detection stage comprises the following steps:

1. Estimate the regions Ω_lefteye and Ω_righteye in which the left and right eyes lie

The dividing line between Ω_lefteye and Ω_righteye in the horizontal direction is determined from the mean and variance functions of the vertical projection of the face gray-level image; the upper and lower boundaries of Ω_lefteye and Ω_righteye are then fixed according to the vertical distribution of the eyes within the face region, measured on the training samples, which yields the estimates of Ω_lefteye and Ω_righteye.
(1) Determining the left-right dividing line of the eye regions with the projection functions

Take the upper half of the detected face region; the peak of the ratio of the mean function MPF_v(x) to the variance function VPF_v(x) of the vertical gray-level projection serves as the vertical dividing line between the regions of the left and right eyes, as shown in Fig. 5(b). The position of this peak is defined as x_peak:

x_peak = argmax_{0 ≤ x < W_face} MPF_v(x) / VPF_v(x)
(2) Obtaining the upper and lower boundaries of the eye regions from sample statistics

The upper and lower boundaries of Ω_lefteye and Ω_righteye are obtained from the vertical distribution range of the eye positions in the face samples:

Ω_lefteye = {(x, y) | 0 ≤ x < x_peak, 0.05·H_face ≤ y ≤ 0.45·H_face}
Ω_righteye = {(x, y) | x_peak ≤ x < W_face, 0.05·H_face ≤ y ≤ 0.45·H_face}
2. Detecting the eye candidates with local features

The eye detector is applied separately within Ω_lefteye and Ω_righteye to detect the left and right eye candidate positions, and the confidence of each candidate position is estimated. The detection of the eye candidates proceeds as follows:

(1) Computation of the integral images of the input face image

Compute the integral image II(x, y) and the squared integral image SqrII(x, y) of the input face image I(x, y);
(2)判别左右眼区域中的每一个的小窗口  (2) Discriminating the small window of each of the left and right eye regions
判别 Ω ^、 Ω^,, ^两个区域中的每一个的 24 x 12像素尺寸的小窗口, 对任一个小窗口 Determining a small window of 24 x 12 pixel size for each of the two regions Ω ^, Ω^,, ^, for any small window
[x0, y0;xQ + 23,^0 + 11]的处理步骤如下: The processing steps of [x 0 , y 0 ; x Q + 23, ^ 0 + 11] are as follows:
①利用整幅图像的积分图和平方积分图计算小窗口的均值和方差;  1 Calculate the mean and variance of the small window by using the integral map and the square integral map of the entire image;
②利用积分图提取小窗口的微结构特征, 并进行归一化处理;  2 using the integral map to extract the micro-structure features of the small window, and normalization processing;
③釆用训练好的第 i层强分类器对小窗口进行判断;  3釆 Use the trained i-th layer strong classifier to judge the small window;
④如果通过判断, 则 + 1, 返回步骤③; 否则抛弃掉该小窗口; 如果通过所有层强 分类器的判断, 则认为该小窗口包含一个眼睛候选, 输出其位置及其置信度。 否则抛 弃掉该小窗口, 不进行后续处理;  4 If judged, then +1 returns to step 3; otherwise, the small window is discarded; if judged by all layer strength classifiers, the small window is considered to contain an eye candidate, and its position and its confidence are output. Otherwise, the small window is discarded and no subsequent processing is performed;
由于真实的眼睛会在相^位置处被检测到多次, 并且眉毛和境框也常被误认为是眼睛候 选。 所以本发明根据候选的置信度大小输出最多前 20个候选位置。  Since the real eye is detected multiple times at the position, the eyebrows and the frame are often mistaken for eye candidates. Therefore, the present invention outputs up to the top 20 candidate positions according to the candidate confidence level.
3. Verification of the eye candidate pairs

To eliminate false alarms and imprecise localization results among the candidates, the invention pairs the left and right eye candidates, extracts richer features from the regions surrounding the candidates, verifies each candidate pair with the eye-pair classifier, and finally estimates the best positions of the two eyes from all candidate pairs according to their posterior probabilities. Each pair of eye candidates undergoes the following processing steps:

(1) Cropping and size normalization according to the left and right candidate positions

For each pair of eye candidates, an image patch is first cropped at the positions given by the template (Fig. 10) according to the left and right candidate positions; size normalization and illumination normalization then yield a 25×15-pixel eye-pair candidate image PI(x, y).

(2) Computation of the integral image of the input patch

Compute the integral image PII(x, y) of the image PI(x, y);
(3) Judgment of the eye-pair candidate image PI(x, y)

Each image is verified as follows:

① Extract the microstructure features using the integral image PII(x, y) of the whole patch;

② Judge the patch with the trained i-th layer strong classifier;

③ If the patch passes the judgment, set i = i + 1 and return to step ①; otherwise discard the candidate pair. If the patch passes the judgment of every layer, the pair is considered a valid candidate pair and its position and confidence are output.

Finally, all candidate pairs that pass the judgment are sorted by confidence in descending order, and the average position of the 3 candidate pairs with the highest confidence is taken as the eye center positions. The eye positions are output.
C) Test standard for the eye positioning error

To compare the precision of different positioning algorithms, the invention adopts a positioning error measure that is independent of the face size. Since the distance between the two eye centers of a frontal face generally does not change with expression and is relatively stable, the manually labeled inter-ocular distance is used as the reference.

For a face, let the manually labeled positions of the left eye, right eye and mouth be P_le, P_re and P_m, and the automatically located positions be P_le′, P_re′ and P_m′. Let d_lr be the Euclidean distance between P_le and P_re, d_le the Euclidean distance between P_le′ and P_le, d_re the Euclidean distance between P_re′ and P_re, and d_m the Euclidean distance between P_m′ and P_m. The eye positioning error is then defined as

err = max(d_le, d_re) / d_lr

Considering that the difference between the manual labels of different people can itself reach 0.10, 0.15 is taken here as the boundary between accurate and inaccurate positioning: when the eye positioning error err < 0.15, the localization of the two eyes is considered accurate.
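The error measure is a direct transcription of the definition above:

```python
import numpy as np

def eye_location_error(le, re, le_gt, re_gt):
    """err = max(d_le, d_re) / d_lr, with d_lr the manually labeled
    inter-ocular distance; a result is accepted when err < 0.15."""
    d_lr = np.linalg.norm(np.subtract(le_gt, re_gt))
    d_le = np.linalg.norm(np.subtract(le, le_gt))
    d_re = np.linalg.norm(np.subtract(re, re_gt))
    return max(d_le, d_re) / d_lr
```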
Embodiment 1: A face recognition sign-in system (shown in Fig. 11)

Face authentication, which aims to verify personal identity automatically from face images in place of traditional passwords, certificates, seals and similar credentials, is the most user-friendly of the biometric authentication technologies that have recently attracted wide attention; it is hard to forge, cannot be lost and is convenient. This system uses face information to verify a person's identity automatically, and its detection module is a result of the present work. The system also took part in the FAT2004 contest organized at ICPR 2004, in which 13 face recognition algorithms from 11 academic and commercial institutions participated, including Carnegie Mellon University (USA), the Institut für Neuroinformatik (Germany) and the University of Surrey (UK). The system submitted by our laboratory ranked first on all three evaluation indices, with error rates roughly 50% lower than those of the second place. The research results described here are applied in the eye positioning module of the submitted system, ensuring that its overall performance is at an internationally advanced level.
In summary, the invention can locate eyes robustly and precisely in images with complex backgrounds and achieved excellent localization results in the experiments; it therefore has very broad application prospects.

The embodiments described above are only preferred embodiments of the invention; the usual changes and substitutions made by those skilled in the art within the scope of the technical solution of the invention should all be included in the protection scope of the invention.

Claims

1. A robust method for precisely locating eyes in images with complex backgrounds, characterized in that the method comprises a training stage and a detection stage.

In the training stage, a large number of samples are first collected: training samples are gathered from face images by manual labeling and are then normalized. Features are extracted from the collected training samples to obtain the feature library of the training samples; on the basis of the feature library, the parameters of the classifiers are determined by experiment and the eye positioning classifiers are trained.

In the detection stage, for an input face image region I(x, y), 0 ≤ x < W_face, 0 ≤ y < H_face, the regions in which the left and right eyes may lie are first estimated; all small windows within the two regions are then judged exhaustively (a small window being defined as a rectangular sub-image of the input image): features are extracted from each small window and judged with the single-eye detector, yielding all eye candidate positions within the regions. The left and right eye candidates are then combined, the best combination is selected using global properties, and the final eye positions are located, whereby excellent eye positioning accuracy is obtained.
2. The robust method for precisely locating eyes in images with complex backgrounds according to claim 1, characterized in that the training stage comprises, in order, the following steps: sample collection and normalization; estimation of the regions of the left and right eyes with projection functions; training of the single-eye detector; and training of the eye-pair detector.
Step 1: Sample collection and normalization

(1) Collection of the samples
To train the single-eye detector, single-eye images are cut out of face pictures by manual labeling, and non-eye samples are cut at random from the non-eye parts of the face images; the single-eye images and non-eye images are used as positive and negative samples, respectively, for training the single-eye detector;
In addition, to train the eye-pair detector, eye-pair samples are cropped from the face images in set proportions according to the manually labeled eye positions, and non-eye-pair samples are cut at random from the face images; the eye-pair and non-eye-pair images are used as positive and negative samples, respectively, for training the eye-pair detector. Samples collected in this way include not only the two eyes but also the eyebrows, the nose and other parts, reflecting the constraints between the eyes and the surrounding organs;
The eye-pair samples are cropped from the face image in the following proportions: the line joining the centers of the two eyeballs is taken as the X axis, and the line perpendicular to this center line is taken as the Y axis, its foot lying on the center line at a set fraction of the distance between the inner corners of the eyes. When the distance between the two eyeball centers is set to dist, the horizontal distance from each eyeball center to the left and right outer borders is dist/3, and the top and bottom borders of the cropped region are each placed at a set distance from the foot of the perpendicular;
(2) Size normalization

The collected sample images of all sizes, including single-eye and non-eye images and eye-pair and non-eye-pair images, are normalized to specified sizes. Let the original sample image be [F(x, y)]_{M×N}, with width M, height N and pixel value F(x, y) at row x, column y (0 ≤ x < M, 0 ≤ y < N); let the size-normalized image be [G(x, y)]_{W×H}, with width W and height H. The correspondence between the input image [F(x, y)]_{M×N} and the normalized image [G(x, y)]_{W×H} is x′ = x·r_x, y′ = y·r_y, where r_x = M/W and r_y = N/H are the scale factors in the x and y directions.
Following the linear interpolation method, for a given (x, y) let

x0 = ⌊x·r_x⌋, y0 = ⌊y·r_y⌋, Δ_x = x0 + 1 − x·r_x, Δ_y = y0 + 1 − y·r_y

so that the interpolation process can be expressed as

G(x, y) = F(x0 + Δ_x, y0 + Δ_y)
 = F(x0, y0) Δ_x Δ_y + F(x0+1, y0)(1 − Δ_x) Δ_y + F(x0, y0+1) Δ_x (1 − Δ_y) + F(x0+1, y0+1)(1 − Δ_x)(1 − Δ_y)

(3) Grayscale normalization

Since factors such as external illumination and the imaging equipment may make the image brightness or contrast abnormal and produce strong shadows or reflections, the geometrically normalized samples must additionally undergo gray-level mean and variance normalization, adjusting the gray-level mean μ and standard deviation σ of the sample picture to given values μ0 and σ0.

First, the mean and variance of the sample image (0 ≤ x < W, 0 ≤ y < H) are computed as

μ = (1/(W·H)) Σ_{y=0}^{H−1} Σ_{x=0}^{W−1} G(x, y)

σ² = (1/(W·H)) Σ_{y=0}^{H−1} Σ_{x=0}^{W−1} (G(x, y) − μ)²

Then the gray value of each pixel is transformed as

G′(x, y) = (σ0/σ)(G(x, y) − μ) + μ0

which adjusts the mean and variance of the image gray levels to the given values μ0 and σ0 and completes the grayscale normalization of the samples;

Step 2: Training of the single-eye detector

The single-eye detector is trained on the microstructure feature libraries of the normalized single-eye and non-eye samples, using the AdaBoost algorithm to obtain the single-eye detector; the specific training process is as follows:

(1) Feature extraction
(i) Define the following five types of microstructure templates. The five template types below are used to extract five kinds of microstructure features from the face samples; each microstructure feature is obtained by computing the difference between the gray-level sums of the image pixels corresponding to the black region and to the white region of the template. The five types of microstructure features are defined as follows:

Type (a): the black region and the white region are symmetric left-to-right and equal in area; w denotes the width of each region and h its height;

Type (b): the black region and the white region are symmetric top-to-bottom and equal in area; w and h are defined as in type (a);

Type (c): in the horizontal direction, the black region lies between two white regions, and the black region and each white region are equal in area; w and h are defined as in type (a);

Type (d): two black regions lie in the first and third quadrants and two white regions in the second and fourth quadrants; each black region and each white region are equal in area; w and h are defined as in type (a);

Type (e): the black region lies at the center of the white region, with its top, bottom, left and right sides each 2 pixels away from the corresponding sides of the white region; w and h denote the width and height of the surrounding white frame.

(ii) Fast computation of the integral image: for the image I(x, y), define its integral image II(x, y) as the sum of all pixels from (0, 0) to (x, y), i.e. II(x, y) = Σ_{0≤x′≤x} Σ_{0≤y′≤y} I(x′, y′).

(iii) Fast extraction of the high-dimensional microstructure features of the single-eye and non-eye samples:
Each microstructure feature is obtained by computing the difference between the pixel gray-level sums of the black area and the white area of the image region covered by the template; both the position of the template relative to the image and the size of the template may vary. Since each feature only involves sums of pixels over rectangular regions, a microstructure feature of any scale and at any position can be computed rapidly from the integral image of the whole image:
(a) g(x, y, w, h) = 2·II(x+w−1, y−1) + II(x+2w−1, y+h−1) + II(x−1, y+h−1) − 2·II(x+w−1, y+h−1) − II(x+2w−1, y−1) − II(x−1, y−1)

(b) g(x, y, w, h) = 2·II(x+w−1, y+h−1) + II(x−1, y−1) − II(x+w−1, y−1) − 2·II(x−1, y+h−1) − II(x+w−1, y+2h−1) + II(x−1, y+2h−1)

(c) g(x, y, w, h) = 2·II(x+2w−1, y+h−1) + 2·II(x+w−1, y−1) − 2·II(x+2w−1, y−1) − 2·II(x+w−1, y+h−1) − II(x+3w−1, y+h−1) − II(x−1, y−1) + II(x−1, y+h−1) + II(x+3w−1, y−1)

(d) g(x, y, w, h) = −II(x−1, y−1) − II(x+2w−1, y−1) − II(x−1, y+2h−1) − 4·II(x+w−1, y+h−1) + 2·II(x+w−1, y−1) + 2·II(x−1, y+h−1) − II(x+2w−1, y+2h−1) + 2·II(x+2w−1, y+h−1) + 2·II(x+w−1, y+2h−1)

(e) g(x, y, w, h) = II(x+w−1, y+h−1) + II(x−1, y−1) − II(x+w−1, y−1) − II(x−1, y+h−1) − II(x+w−3, y+h−3) − II(x+1, y+1) + II(x+1, y+h−3) + II(x+w−3, y+1)
Varying the values of the parameters x, y, w, h extracts microstructure features at different positions of the sample image; for an eye / non-eye sample image normalized to 24×12 pixels, 42727 features are obtained in this way, forming the feature vector FV(j), 1 ≤ j ≤ 42727, of the sample image;
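As an illustrative sketch, the integral image reduces every rectangle sum in the formulas above to four array lookups; the helper names and the sign convention of feature_a are assumptions made here for clarity:

    import numpy as np

    def integral_image(img):
        # Pad with a leading zero row and column so that lookups at index 0
        # stand in for the II(..., -1) terms in the formulas above.
        return np.pad(img.astype(np.int64).cumsum(axis=0).cumsum(axis=1),
                      ((1, 0), (1, 0)))

    def rect_sum(ii, x, y, w, h):
        # Gray-level sum of the w x h rectangle with top-left pixel (x, y).
        return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

    def feature_a(ii, x, y, w, h):
        # Type (a): sum over the right w x h block minus the left one;
        # expanding the lookups reproduces the six-term expression in (a).
        return rect_sum(ii, x + w, y, w, h) - rect_sum(ii, x, y, w, h)

In the padded layout, ii[r, c] equals II(c−1, r−1) in the notation above, which is why the four-lookup rect_sum matches the formulas term by term.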
(2) Normalization of the sample image features

First, the mean μ and the variance σ of the pixel gray levels within the 24×12-pixel sample image region (x₀ ≤ x′ ≤ x₀+23, y₀ ≤ y′ ≤ y₀+11) are computed, where SqrII(x, y) denotes the integral image of the squared gray levels, defined analogously to II(x, y):

  μ = [ II(x₀+23, y₀+11) + II(x₀−1, y₀−1) − II(x₀−1, y₀+11) − II(x₀+23, y₀−1) ] / 288

  σ = { [ SqrII(x₀+23, y₀+11) + SqrII(x₀−1, y₀−1) − SqrII(x₀−1, y₀+11) − SqrII(x₀+23, y₀−1) ] / 288 − μ² }^{1/2}

Next, every microstructure feature is normalized as

  FV(j) ← (σ₀ / σ) · FV(j)

For a 24×12-pixel sample image, 42727-dimensional microstructure features FV(j), 1 ≤ j ≤ 42727, are obtained in total;
(3) Feature selection and classifier design

The AdaBoost algorithm is used both to select features and to train the classifiers: on the one hand, in every round of iteration AdaBoost selects the best-performing weak classifier based on a single feature, achieving feature selection; on the other hand, these weak classifiers are combined into a strong classifier, and a complete eye detector is obtained by cascading several strong classifiers. Specifically, this comprises the following components:
(i) Construction of the weak classifiers

For each feature dimension, the simplest possible tree classifier is constructed as a weak classifier:

  h_j(sub) = 1, if g_j(sub) ≤ θ_j (or, with inverted polarity, if g_j(sub) > θ_j); 0, otherwise

where sub is a 24×12-pixel sample, g_j(sub) denotes the j-th feature extracted from that sample, and θ_j is the decision threshold of the j-th feature, obtained from the statistics of the j-th feature over all collected eye and non-eye samples such that the FRR on the eye samples meets the specified requirement; h_j(sub) is the decision output of the tree classifier built on the j-th feature. In this way each weak classifier completes its decision with a single threshold comparison; 42727 weak classifiers are obtained in total (a sketch of such a stump is given below);
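A sketch of such a single-feature decision stump, with a simple grid search for its threshold; the search procedure shown here is an assumption, since the text only requires the threshold to satisfy an FRR constraint on the eye samples:

    import numpy as np

    def stump_decision(g_value, theta, polarity=1):
        # Fires (returns 1) when polarity * g_value <= polarity * theta.
        return 1 if polarity * g_value <= polarity * theta else 0

    def choose_threshold(eye_feats, noneye_feats, max_frr=0.001):
        # Keep the threshold/polarity that rejects the most non-eye samples
        # while the false rejection rate on eye samples stays below max_frr.
        best = None
        candidates = np.unique(np.concatenate([eye_feats, noneye_feats]))
        for theta in candidates:
            for pol in (1, -1):
                frr = np.mean([stump_decision(v, theta, pol) == 0
                               for v in eye_feats])
                far = np.mean([stump_decision(v, theta, pol) == 1
                               for v in noneye_feats])
                if frr <= max_frr and (best is None or far < best[0]):
                    best = (far, theta, pol)
        return best  # (achieved FAR, threshold, polarity)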
(ii) Design of the eye / non-eye strong classifiers based on the AdaBoost algorithm

(iii) Cascading of the multi-layer strong classifiers

Step 3: Training of the eye-pair classifier
The eye-pair classifier is trained on the normalized eye-pair samples and non-eye-pair samples: the feature libraries of the two sample classes are extracted separately, and the eye-pair classifier is obtained with the AdaBoost algorithm. The microstructure features and the training procedure of the eye-pair classifier are exactly the same as those of the single-eye detector above: the AdaBoost algorithm selects weak classifiers based on single features from a large pool of microstructure features to form strong classifiers, and multiple layers of strong classifiers are cascaded. The specific training of the eye-pair classifier likewise comprises feature extraction, feature selection, training of the strong classifiers, and cascading of the multi-layer strong classifiers:
(1) Feature extraction

Using the normalized eye-pair and non-eye-pair samples, the high-dimensional microstructure features of both sample classes are extracted with the feature extraction method of step 2(1) above. For a sample normalized to 25×15 pixels, 71210 features are obtained in total, forming the feature vector FV(j), 1 ≤ j ≤ 71210, of the sample;
(2) To reduce the influence of illumination, gray-level mean and variance normalization is applied to every 25×15-pixel sample, following the method described in step 2(2) above:

First, the gray-level mean μ and variance σ of the 25×15-pixel sample are computed rapidly. With the sample occupying the coordinate region (x₀ ≤ x′ ≤ x₀+24, y₀ ≤ y′ ≤ y₀+14) of the whole image, μ and σ are

  μ = [ II(x₀+24, y₀+14) + II(x₀−1, y₀−1) − II(x₀−1, y₀+14) − II(x₀+24, y₀−1) ] / 375

  σ = { [ SqrII(x₀+24, y₀+14) + SqrII(x₀−1, y₀−1) − SqrII(x₀−1, y₀+14) − SqrII(x₀+24, y₀−1) ] / 375 − μ² }^{1/2}
Next, every microstructure feature is normalized as

  FV(j) ← (σ₀ / σ) · FV(j)

For a 25×15-pixel sample image, 71210-dimensional microstructure features FV(j), 1 ≤ j ≤ 71210, are obtained in total.
(3) Feature selection and classifier design

The eye-pair detector also adopts a layered structure: strong classifiers of simple structure first reject the background windows in the image, and strong classifiers of complex structure then judge the remaining windows. Specifically, this comprises the following components:
(i) Construction of the weak classifiers

The weak classifiers are again tree classifiers built on one-dimensional features, here with a two-sided decision interval:

  h_j(sub) = 1, if g_j(sub) < θ_j^l or g_j(sub) > θ_j^h; 0, otherwise

71210 weak classifiers are obtained in total.
(ii) Design of the eye-pair / non-eye-pair strong classifiers based on the AdaBoost algorithm

(iii) Cascading of the multi-layer strong classifiers
3. The robust precise eye positioning method in a complex background image according to claim 1, wherein the detection stage determines the eye center positions in an input face region and comprises the following steps:

In the eye detection stage, for an input face region, the eye center positions are precisely located with the following steps:

Step 1: Estimate the regions Ω_lefteye and Ω_righteye containing the left and right eyes

The mean and variance functions of the vertical gray-level projection of the face image are used to determine the dividing line between Ω_lefteye and Ω_righteye in the horizontal direction; the upper and lower boundaries of Ω_lefteye and Ω_righteye are then determined from the vertical distribution of the eyes within the face region, measured statistically on the training samples, so that Ω_lefteye and Ω_righteye are estimated;
(1) Determining the left-right dividing line of the eye regions with the projection functions

The upper half of the detected face region is taken, and the peak of the ratio of the mean function MPF(x) to the variance function VPF_v(x) of its vertical gray-level projection is used as the vertical dividing line between the regions of the left and right eyes; this position is defined as

  x_peak = arg max_{0 < x < W_face} MPF(x) / VPF_v(x)
(2) Obtaining the upper and lower boundaries of the eye regions from sample statistics

The upper and lower boundaries of Ω_lefteye and Ω_righteye are obtained from the statistics of the vertical positions of the eyes within the faces of the samples:

  Ω_lefteye = { (x, y) | 0 < x < x_peak, 0.05·H_face < y < 0.45·H_face }

  Ω_righteye = { (x, y) | x_peak < x < W_face, 0.05·H_face < y < 0.45·H_face }

where H_face and W_face are the face height and width obtained from the sample statistics;
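As an illustration, this region estimation can be sketched as follows; the exact projection-function conventions, such as operating on the upper half of the face, are assumptions consistent with the description:

    import numpy as np

    def estimate_eye_regions(face):
        # face: 2-D gray-level array of the detected face region.
        H, W = face.shape
        upper = face[: H // 2].astype(np.float64)
        mpf = upper.mean(axis=0)            # vertical mean projection MPF(x)
        vpf = upper.var(axis=0) + 1e-6      # vertical variance projection VPF_v(x)
        x_peak = int(np.argmax(mpf / vpf))  # dividing column between the eyes
        y0, y1 = int(0.05 * H), int(0.45 * H)
        left = (0, x_peak, y0, y1)          # (x_min, x_max, y_min, y_max)
        right = (x_peak, W, y0, y1)
        return left, right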
Step 2: Detect eye candidates with the single-eye detector

The single-eye detector is applied separately in the two regions Ω_lefteye and Ω_righteye to produce up to 20 candidate positions for each of the left and right eyes, and the confidence of every candidate position is estimated. The detection of the eye candidates proceeds as follows:
(1) Computation of the integral images of the input face image

The integral image II(x, y) and the squared integral image SqrII(x, y) corresponding to the input face image I(x, y) are computed;

(2) Judgment of every small window in the left and right eye regions

Every small window of 24×12 pixels within each of the two regions Ω_lefteye and Ω_righteye is judged; for any small window [x₀, y₀; x₀+23, y₀+11], the processing steps are:
(i) The mean μ and the variance σ of the small window are computed from the integral image II(x, y) and the squared integral image SqrII(x, y) of the whole image:

  μ = [ II(x₀+23, y₀+11) + II(x₀−1, y₀−1) − II(x₀−1, y₀+11) − II(x₀+23, y₀−1) ] / 288

  σ = { [ SqrII(x₀+23, y₀+11) + SqrII(x₀−1, y₀−1) − SqrII(x₀−1, y₀+11) − SqrII(x₀+23, y₀−1) ] / 288 − μ² }^{1/2}
(ii) The microstructure features of the small window are extracted with the feature extraction method of step 2(1) of the training stage and are normalized;

(iii) The small window is judged by the trained multi-layer eye / non-eye strong classifiers. If it passes the judgment of all layers of strong classifiers, the small window is considered to contain an eye candidate, and its position and confidence are output; otherwise the small window is discarded without further processing. Finally, at most the top 20 candidate positions are output according to the confidence of the candidates (this scan is sketched below);
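A sketch of this candidate scan over one eye region; the cascade is represented by an assumed callable that judges the 24×12 window at a given top-left corner and returns (passed, confidence):

    def detect_eye_candidates(region, cascade, top_k=20):
        # region: (x_min, x_max, y_min, y_max), e.g. from estimate_eye_regions.
        x_min, x_max, y_min, y_max = region
        candidates = []
        for y in range(y_min, y_max - 11):      # window height 12
            for x in range(x_min, x_max - 23):  # window width 24
                passed, conf = cascade(x, y)
                if passed:
                    candidates.append((conf, x, y))
        candidates.sort(reverse=True)           # most confident first
        return candidates[:top_k]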
Step 3: Verification of the eye candidate pairs

To eliminate false detections and imprecise localization results among the eye candidates, the left and right eye candidates are paired, more features are extracted from the region surrounding each candidate pair, every pair is verified with the eye-pair classifier, and finally the best positions of the two eyes are estimated from all candidate pairs according to the posterior probability. Specifically, every pair of eye candidates undergoes the following processing steps:
(1) Cropping and size normalization of the image according to the left and right eye candidate positions

For every pair of eye candidates, an image patch is first cropped according to the left and right candidate positions in the same way as the eye-pair samples are cropped in step 1.1 of the training stage; size normalization and illumination normalization are then applied, yielding a 25×15-pixel eye-candidate-pair image PI(x, y);

(2) Computation of the integral image of the input image

The integral image corresponding to the image PI(x, y) is computed as

  PII(x, y) = Σ_{0≤x′≤x} Σ_{0≤y′≤y} PI(x′, y′);
(3) Judgment of the eye-candidate-pair image PI(x, y)

Every eye-candidate-pair image PI(x, y) is verified as follows:

(i) the microstructure features are extracted from the integral image of the whole patch;

(ii) the image is judged by the trained i-th layer strong classifier;

(iii) if it passes the judgment, i is increased by 1 and the procedure returns to step 3(3)(ii); otherwise the eye candidate pair is discarded. If the pair passes the judgment of all layers of strong classifiers, it is considered a valid candidate pair, and its position and confidence are output;
Finally, all candidate pairs that passed the judgment are sorted by confidence in descending order, the average position of the top 3 candidate pairs with the highest confidence is taken as the eye center positions, and the eye positions are output (a sketch of this aggregation follows).
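A minimal sketch of this final aggregation; the tuple layout of the verified pairs is an assumption:

    import numpy as np

    def fuse_candidate_pairs(valid_pairs):
        # valid_pairs: list of (confidence, (lx, ly), (rx, ry)) tuples that
        # passed every layer of the eye-pair verifier.
        top3 = sorted(valid_pairs, key=lambda p: p[0], reverse=True)[:3]
        left_eye = np.mean([p[1] for p in top3], axis=0)
        right_eye = np.mean([p[2] for p in top3], axis=0)
        return left_eye, right_eye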
4. The robust face detection method in a complex background image according to claim 2, wherein said step 3(2)(ii), training the i-th layer strong classifier, comprises the following steps:
The AdaBoost algorithm, combined with the weak-classifier construction described above, is used to train the eye / non-eye strong classifiers. The training algorithm proceeds as follows (a code sketch is given after these steps). Let the given training set be Z = {(sub_i, l_i)}, i = 1, …, n, where l_i = 0, 1 is the class label of the sample image sub_i, corresponding to the non-eye and eye classes respectively, with n_eye eye samples and n_noneye non-eye samples;

1) Initialization of the parameters

Initialization of the training sample weights: the initial weight of every sample is D₁(i) = 1/n;

Choose the number of iterations T, the number of weak classifiers to be used; T should increase gradually with the number of strong-classifier layers;

Compute the maximum Fmax(j) and the minimum Fmin(j) of the distribution of every feature over the sample set, where j is the feature index, 1 ≤ j ≤ 42727;
2) Repeat the following process T times, for t = 1, …, T:

Using the j-th feature, 1 ≤ j ≤ 42727, construct the weak classifier h_j, then search exhaustively between Fmin(j) and Fmax(j) for the threshold parameter that minimizes the error rate ε_j of h_j, defined as

  ε_j = Σ_{i=1..n} D_t(i) · | h_j(sub_i) − l_i |

Select j_t = arg min_{1≤j≤42727} ε_j and take the corresponding weak classifier as h_t, with ε_t = ε_{j_t};
Compute the parameter α_t = (1/2) · ln[ (1 − ε_t) / ε_t ];

Update the sample weights, for i = 1, …, n:

  D_{t+1}(i) = D_t(i) · exp( −α_t · y_i · h_t(sub_i) ) / Z_t, with y_i = 2·l_i − 1,

where the normalization factor is

  Z_t = Σ_{i=1..n} D_t(i) · exp( −α_t · y_i · h_t(sub_i) )

3) Output the final strong classifier:

  H(sub) = 1, if Σ_{t=1..T} α_t · h_t(sub) ≥ (1/2) · Σ_{t=1..T} α_t; 0, otherwise
For a pattern accepted by the strong classifier, the posterior probability that the pattern belongs to the eye class is obtained as

  P( l = 1 | f(x) ) = e^{f(x)} / ( e^{f(x)} + e^{−f(x)} ),

where f(x) = Σ_{t=1..T} α_t · ( h_t(x) − 1/2 );
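By way of illustration, the training loop of this claim can be sketched as follows; the coarse threshold grid stands in for the exhaustive search between Fmin(j) and Fmax(j), and the feature matrix is assumed to be precomputed, both being simplifications of the claimed procedure:

    import numpy as np

    def train_strong_classifier(F, labels, T, n_thresholds=32):
        # F: (n_samples, n_features) matrix of microstructure features;
        # labels: 0/1 array (non-eye / eye); T: number of weak classifiers.
        n, m = F.shape
        D = np.full(n, 1.0 / n)                  # D_1(i) = 1/n
        y = 2 * labels - 1                       # map {0, 1} to {-1, +1}
        picked = []                              # (j, theta, polarity, alpha)
        for _ in range(T):
            best = (np.inf, 0, 0.0, 1)
            for j in range(m):
                col = F[:, j]
                for theta in np.linspace(col.min(), col.max(), n_thresholds):
                    for pol in (1, -1):
                        h = (pol * col <= pol * theta).astype(int)
                        eps = D[h != labels].sum()   # weighted error
                        if eps < best[0]:
                            best = (eps, j, theta, pol)
            eps, j, theta, pol = best
            eps = np.clip(eps, 1e-10, 1 - 1e-10)
            alpha = 0.5 * np.log((1 - eps) / eps)    # alpha_t
            h = (pol * F[:, j] <= pol * theta).astype(int)
            D *= np.exp(-alpha * y * (2 * h - 1))    # up-weight mistakes
            D /= D.sum()                             # Z_t normalization
            picked.append((j, theta, pol, alpha))
        return picked

The posterior above then follows by evaluating f(x) = Σ_t α_t · (h_t(x) − 1/2) over the returned stumps.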
5. The robust precise eye positioning method in a complex background image according to claim 2, wherein in step 2(3)(iii) the single-eye detector adopts a layered structure, and training the cascade of multi-layer strong classifiers comprises the following steps:

1) Initialize i = 1; define the training target of every layer of strong classifier as FRR ≤ 0.1% on the eye training set and FAR ≤ 60% on the non-eye training set; define the target of the complete eye detector as FRR ≤ 1% on the eye training set and a FAR of ≤ 5 × 10⁻ on the non-eye training set;
2) Using the training sample set, train the i-th layer eye / non-eye strong classifier with the AdaBoost-based method of step 2(3)(ii);

3) Test the sample set with the first i layers of trained classifiers;

4) If FRR and FAR have not reached the predetermined values, increase i by 1 and return to step 2) to continue training; otherwise stop training;

In total, 7 layers of strong classifiers, from structurally simple to complex, are trained; cascading these strong classifiers constitutes the complete single-eye detector (the cascade-growing loop is sketched below);
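The cascade-growing loop shared by this claim and claim 6 below can be sketched as follows; train_layer and evaluate are assumed callables standing in for the AdaBoost routine and for running the partial cascade over the training sets:

    def grow_cascade(train_layer, evaluate, frr_goal, far_goal, max_layers=16):
        # train_layer(i, cascade): trains the i-th strong classifier given
        # the current cascade; evaluate(cascade): returns (FRR, FAR) of the
        # whole cascade on the training sets.
        cascade = []
        i = 1
        while i <= max_layers:
            cascade.append(train_layer(i, cascade))
            frr, far = evaluate(cascade)
            if frr <= frr_goal and far <= far_goal:
                break                     # overall targets reached
            i += 1
        return cascade

Per the claims, this loop terminates after 7 layers for the single-eye detector and after 9 layers for the eye-pair verifier.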
6. The robust precise eye positioning method in a complex background image according to claim 2, wherein in step 3(3)(iii) the whole eye-pair verifier adopts a layered structure, and training the cascade of multi-layer strong classifiers comprises the following steps:

1) Initialize i = 1; define the training target of every layer of strong classifier as FRR ≤ 0.1% on the eye-pair training set and FAR ≤ 50% on the non-eye-pair training set; define the target of the complete eye-pair detector as FRR ≤ 1% on the eye-pair training set and FAR ≤ 1 × 10⁻³ on the non-eye-pair training set;

2) Using the training sample set, train the i-th layer eye-pair / non-eye-pair strong classifier with the AdaBoost-based method of step 3(3)(ii);

3) Test the sample set with the first i layers of trained classifiers;

4) If FRR and FAR have not reached the predetermined values, increase i by 1 and return to step 2) to continue training; otherwise stop training;

In total, 9 layers of strong classifiers, from structurally simple to complex, are trained; cascading these strong classifiers constitutes the complete eye-pair detector;
PCT/CN2007/001894 2007-06-15 2007-06-15 A robust precise eye positioning method in complicated background image WO2008151471A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2007/001894 WO2008151471A1 (en) 2007-06-15 2007-06-15 A robust precise eye positioning method in complicated background image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2007/001894 WO2008151471A1 (en) 2007-06-15 2007-06-15 A robust precise eye positioning method in complicated background image

Publications (1)

Publication Number Publication Date
WO2008151471A1 true WO2008151471A1 (en) 2008-12-18

Family

ID=40129206

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2007/001894 WO2008151471A1 (en) 2007-06-15 2007-06-15 A robust precise eye positioning method in complicated background image

Country Status (1)

Country Link
WO (1) WO2008151471A1 (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005008210A2 (en) * 2003-05-14 2005-01-27 Polcha Michael P System and method for performing security access control based on modified biometric data
CN1731418A (en) * 2005-08-19 2006-02-08 清华大学 Method of robust accurate eye positioning in complicated background image

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102339377A (en) * 2010-07-21 2012-02-01 比亚迪股份有限公司 Quick human-eye positioning method and device
CN107330188A (en) * 2017-06-30 2017-11-07 武汉大学深圳研究院 Towards the multi-color halftone disassembled asset formula modeling method and system for replicating object
CN108182380A (en) * 2017-11-30 2018-06-19 天津大学 A kind of flake pupil intelligent measurement method based on machine learning
CN108182380B (en) * 2017-11-30 2023-06-06 天津大学 Intelligent fisheye pupil measurement method based on machine learning
CN109636794B (en) * 2018-12-14 2023-02-28 辽宁奇辉电子***工程有限公司 Machine learning-based subway height adjusting valve fastening nut positioning method
CN109636794A (en) * 2018-12-14 2019-04-16 辽宁奇辉电子***工程有限公司 A kind of subway height adjusting valve fastening nut localization method based on machine learning
CN111582008A (en) * 2019-02-19 2020-08-25 富士通株式会社 Device and method for training classification model and device for classification by using classification model
CN111582008B (en) * 2019-02-19 2023-09-08 富士通株式会社 Device and method for training classification model and device for classifying by using classification model
CN110706235A (en) * 2019-08-30 2020-01-17 华南农业大学 Far infrared pedestrian detection method based on two-stage cascade segmentation
CN111523407A (en) * 2020-04-08 2020-08-11 上海涛润医疗科技有限公司 Face recognition system and method and medical care recording system based on face recognition
CN112966705A (en) * 2020-11-24 2021-06-15 大禹节水集团股份有限公司 Adaboost-based agricultural irrigation drip irrigation head quality online identification method
CN112907510B (en) * 2021-01-15 2023-07-07 中国人民解放军国防科技大学 Surface defect detection method
CN112907510A (en) * 2021-01-15 2021-06-04 中国人民解放军国防科技大学 Surface defect detection method
CN113780155A (en) * 2021-09-07 2021-12-10 合肥工业大学 Pig face detection method based on newly-added Haar-like features
CN116363736A (en) * 2023-05-31 2023-06-30 山东农业工程学院 Big data user information acquisition method based on digitalization
CN116363736B (en) * 2023-05-31 2023-08-18 山东农业工程学院 Big data user information acquisition method based on digitalization
CN117934598A (en) * 2024-03-21 2024-04-26 浙江大学 Desktop-level rigid body positioning equipment and method based on optical positioning technology
CN117934598B (en) * 2024-03-21 2024-06-11 浙江大学 Desktop-level rigid body positioning equipment and method based on optical positioning technology

Similar Documents

Publication Publication Date Title
WO2008151471A1 (en) A robust precise eye positioning method in complicated background image
US20210287026A1 (en) Method and apparatus with liveness verification
WO2020000908A1 (en) Method and device for face liveness detection
US8320643B2 (en) Face authentication device
US9064145B2 (en) Identity recognition based on multiple feature fusion for an eye image
WO2008151470A1 (en) A robust human face detecting method in complicated background image
CN105956572A (en) In vivo face detection method based on convolutional neural network
CN111680588A (en) Human face gate living body detection method based on visible light and infrared light
CN108563999A (en) 2018-09-21 A kind of piece identity's recognition methods and device towards low quality video image
US7869632B2 (en) Automatic trimming method, apparatus and program
CN104504362A (en) Face detection method based on convolutional neural network
CN103632132A (en) Face detection and recognition method based on skin color segmentation and template matching
KR101172898B1 (en) Method for detecting liveness for face recognition system
CN101216887A (en) An automatic computer authentication method for photographic faces and living faces
CN110414299B (en) Monkey face affinity analysis method based on computer vision
TW200910223A (en) Image processing apparatus and image processing method
CN105512630B (en) Human eye detection and localization method
WO2021088640A1 (en) Facial recognition technology based on heuristic gaussian cloud transformation
WO2022268183A1 (en) Video-based random gesture authentication method and system
RU2316051C2 (en) Method and system for automatically checking presence of a living human face in biometric safety systems
WO2015037973A1 (en) A face identification method
Pathak et al. Multimodal eye biometric system based on contour based E-CNN and multi algorithmic feature extraction using SVBF matching
CN111191549A (en) Two-stage face anti-counterfeiting detection method
CN110378414A (en) The personal identification method of multi-modal biological characteristic fusion based on evolution strategy
Xu et al. A novel multi-view face detection method based on improved real adaboost algorithm

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07721467

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07721467

Country of ref document: EP

Kind code of ref document: A1