Background art
At present, traffic accidents cause hundreds of thousands of vehicle collisions and heavy casualties every year. According to incomplete statistics, road traffic accidents kill more than 600,000 people worldwide each year, of which at least 100,000 result from accidents caused by fatigued driving, and the direct economic loss reaches 12.5 billion US dollars. Fatigued driving, like drunk driving, has become a major hidden cause of traffic accidents. With the development of computer technology, researchers in many countries have begun to study fatigue-driving detection methods from a range of fields. In 1998, tests by the US Federal Highway Administration confirmed that PERCLOS (the percentage of time per unit interval during which the eyes are closed) correlates strongly with driver fatigue, which opened a new line of thinking for fatigue-driving detection. For details, see D. F. Dinges and R. Grace, "PERCLOS: A valid psychophysiological measure of alertness as assessed by psychomotor vigilance," US Department of Transportation, Federal Highway Administration, Publication Number FHWA-MCRT-98-006.
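As an illustration of the PERCLOS measure defined above, the following minimal sketch computes the closed-eye fraction over a window of per-frame eye states. The function name `perclos`, the window length, and the frame labels are assumptions made for this example; the 1 = open, 0 = closed coding follows the judgment coding used later in the method.

```python
def perclos(eye_states):
    """Return the fraction of frames in which the eyes are closed.

    eye_states: sequence of per-frame labels, 1 = open, 0 = closed.
    """
    if not eye_states:
        raise ValueError("need at least one frame")
    closed = sum(1 for s in eye_states if s == 0)
    return closed / len(eye_states)


# Ten frames from a hypothetical observation window, three with closed eyes.
window = [1, 1, 0, 0, 1, 1, 1, 0, 1, 1]
print(perclos(window))  # 0.3
```

A fatigue detector would compare this value against a threshold chosen for the application; no threshold is given in the text, so none is assumed here.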
Fatigue-driving detection based on the PERCLOS feature usually captures and processes video of the driver's face, especially the eye region. The overall method comprises three stages: face localization, eye localization, and eye state identification. Each stage reduces to a two-class problem in pattern recognition: face versus non-face, eye versus non-eye, and open eye versus closed eye. Several classical methods are commonly used to solve such classification problems:
(1) SVM, the support vector machine. SVM is a learning machine grounded in statistical learning theory and the principle of structural risk minimization, and it is widely used across the branches of pattern recognition. It was first proposed by Vapnik et al.; it is particularly suited to high-dimensional, small-sample problems and generalizes well.
(2) FLD, the Fisher linear discriminant. FLD seeks a projection direction that best separates the two classes of samples. Once the optimal projection direction $w^*$ has been found, every sample $X$ is projected onto it to obtain $y = w^{*T} X$, and a threshold $y_0$ is chosen to divide the two classes.
(3) The AdaBoost algorithm based on Haar-like rectangular features. AdaBoost is a learning algorithm that has been widely used in recent years. It was first proposed by Schapire et al.; its main idea is to select a subset of weak classifiers from a large space of weak classifiers and combine them into a strong classifier.
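The FLD procedure in item (2) can be sketched for two-dimensional samples as follows. This is a minimal illustration, not part of the described method: it assumes the usual closed form in which the projection direction is the inverse within-class scatter matrix applied to the difference of class means, and it assumes the threshold is the midpoint of the two projected means. The function name `fld_2d` and the sample points are hypothetical.

```python
def fld_2d(class_a, class_b):
    """Fisher linear discriminant for two classes of 2-D points.

    Returns (w, y0): a projection direction w and a threshold y0 such that
    a point p with w[0]*p[0] + w[1]*p[1] > y0 is assigned to class_a.
    """
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        return sxx, sxy, syy

    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sxx, sxy, syy = sa[0] + sb[0], sa[1] + sb[1], sa[2] + sb[2]
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # w = Sw^{-1} (ma - mb), with the 2x2 inverse written out explicitly.
    w = ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)
    # Assumed threshold: midpoint of the two projected class means.
    y0 = (w[0] * (ma[0] + mb[0]) + w[1] * (ma[1] + mb[1])) / 2
    return w, y0


def project(w, p):
    """Scalar projection y = w^T p."""
    return w[0] * p[0] + w[1] * p[1]


class_a = [(2.0, 2.0), (3.0, 2.5), (2.5, 3.5)]
class_b = [(0.0, 0.5), (1.0, 0.0), (0.5, 1.0)]
w, y0 = fld_2d(class_a, class_b)
print(all(project(w, p) > y0 for p in class_a))
print(all(project(w, p) < y0 for p in class_b))
```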
Experiments show that the AdaBoost algorithm based on Haar-like rectangular features is robust, accurate, and fast, and has clear practical value. In practice, Haar-like feature vectors are extracted from positive and negative samples, a classifier model is built with the cascaded AdaBoost method, and the classifier's parameters are then trained. For details, see Paul Viola and Michael J. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features," IEEE CVPR, 2001, and R. Lienhart, A. Kuranov, and V. Pisarevsky, "Empirical analysis of detection cascades of boosted classifiers for rapid object detection," in DAGM 25th Pattern Recognition Symposium, 2003.
In practical applications of the AdaBoost algorithm based on Haar-like rectangular features, classifier parameters trained on a general face sample library work well for face localization and eye localization. For eye state identification, however, such a classifier reaches adequate accuracy only for most people; for the rest, the misclassification rate is relatively high, and the classifier may even fail completely. The reason is that the appearance of open and closed eyes differs greatly from person to person, and habits such as wearing glasses add further variation, so a single general classifier struggles to separate the two states.
Summary of the invention
The invention provides an eye state identification method based on a customized classifier. The method generates a different eye state classifier for each user, which improves both the accuracy and the range of applicability of eye state identification.
To make the invention easier to describe, some terms are defined first.
Definition 1: eye state. For fatigue-driving detection, eye state is divided into two types: open and closed.
Definition 2: face sample library. In this invention, the face sample library is an image library containing frontal face images of different people. The images should be collected under different lighting conditions, and the library is divided into a with-glasses database and a without-glasses database according to whether the subject wears glasses.
Definition 3: eye center point. For an open-eye image, the eye center point is defined as the center of the pupil; for a closed-eye image, it is defined as the midpoint of the eye slit.
Definition 4: three courts and five eyes. "Three courts and five eyes" describes the standard proportions of the length and width of the face. In this invention, the width of the eye region is taken to be three tenths of the face width, and the distance between the two eyes is taken to be exactly the width of one eye.
Definition 5: Haar-like feature vector. Haar-like features were first used to characterize faces by Papageorgiou et al. In their work on frontal face and human body detection, Papageorgiou et al. used Haar wavelet basis functions. They found that the standard orthogonal Haar wavelet basis was limited in application, so to obtain better spatial resolution they used features of 3 forms. Viola et al. extended this work, using 2 types of features in 4 forms in total. Lienhart later added several slanted-edge rectangular features, bringing the feature set to 3 types in 14 forms (as shown in Figure 2).
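To make the notion of a Haar-like rectangular feature concrete, the sketch below evaluates one upright two-rectangle (edge) feature using an integral image, which is the technique Viola and Jones use for fast evaluation. It covers a single feature form only, not the full set of 14 forms; the function names and the 4 × 4 test patch are assumptions made for this example.

```python
def integral_image(img):
    """Summed-area table with a zero row and column prepended."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_vertical(ii, top, left, height, width):
    """Edge feature: upper half minus lower half (height must be even)."""
    half = height // 2
    upper = rect_sum(ii, top, left, half, width)
    lower = rect_sum(ii, top + half, left, half, width)
    return upper - lower


# 4x4 patch with a bright upper half and a dark lower half.
patch = [[9, 9, 9, 9],
         [9, 9, 9, 9],
         [1, 1, 1, 1],
         [1, 1, 1, 1]]
ii = integral_image(patch)
print(two_rect_vertical(ii, 0, 0, 4, 4))  # 64
```

A full feature vector would concatenate the responses of many such rectangles at different positions and scales within the 24 × 24 eye image.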
Definition 6: AdaBoost. AdaBoost, short for Adaptive Boosting, is an iterative algorithm. Its core idea is to train different classifiers (weak classifiers) on the same training set and then combine them into a strong classifier. The algorithm works by changing the data distribution: it sets the weight of each training sample according to whether that sample was classified correctly in the current round and according to the overall accuracy of the previous round. The reweighted training set is passed to the next classifier for training, and the classifiers obtained in all rounds are finally combined into the decision classifier (the strong classifier). An AdaBoost classifier can discard unnecessary training features and concentrate the classification on the most informative ones. Common variants are Discrete AdaBoost, Real AdaBoost, and Gentle AdaBoost. Discrete AdaBoost restricts the output of each weak classifier to {-1, +1} and produces the strong classifier through weight adjustment. Real AdaBoost allows the weak classifier output to range over the real numbers R and likewise produces the strong classifier through weight adjustment. Gentle AdaBoost is a variant devised to address a weakness of the two preceding algorithms, in which the weights of atypical ("unlike") positive samples are adjusted very strongly and the efficiency of the classifier drops as a result.
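The reweighting loop described in Definition 6 can be sketched as Discrete AdaBoost with decision stumps as weak classifiers. This is an illustrative sketch only: the choice of stumps, the exhaustive threshold search, and the toy one-dimensional data are assumptions for this example, and the text does not specify which weak learner the method uses.

```python
import math


def stump_predict(x, f, thr, pol):
    """Weak classifier: outputs pol if feature f is at least thr, else -pol."""
    return pol if x[f] >= thr else -pol


def train_stump(samples, labels, weights):
    """Pick the (feature, threshold, polarity) with the least weighted error."""
    best = None
    for f in range(len(samples[0])):
        for thr in sorted({x[f] for x in samples}):
            for pol in (1, -1):
                err = sum(w for w, x, y in zip(weights, samples, labels)
                          if stump_predict(x, f, thr, pol) != y)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best


def adaboost(samples, labels, rounds):
    """Discrete AdaBoost: labels and weak outputs are in {-1, +1}."""
    n = len(samples)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, f, thr, pol = train_stump(samples, labels, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, f, thr, pol))
        # Raise the weights of misclassified samples and lower the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, f, thr, pol))
                   for w, x, y in zip(weights, samples, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Strong classifier: sign of the weighted vote of the weak classifiers."""
    score = sum(a * stump_predict(x, f, thr, pol) for a, f, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single stump can separate: the negatives sit in the middle.
samples = [[1], [2], [3], [4], [5], [6]]
labels = [1, 1, -1, -1, 1, 1]
strong = adaboost(samples, labels, rounds=3)
print([predict(strong, x) for x in samples])
```

On this toy set three rounds are enough for the combined vote to reproduce every label, even though each individual stump misclassifies two of the six samples.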
The technical solution of the invention is as follows:
An eye state identification method based on a customized classifier, as shown in Figure 1, comprises the following steps:
Step 1: build face image database A. Database A comprises two sub-libraries, A1 and A2. Sub-library A1 consists of frontal grayscale face images of different individuals other than the user, not wearing glasses; sub-library A2 consists of frontal grayscale face images of different individuals other than the user, wearing glasses. In every grayscale face image in database A, the distance between the two eye center points is at least 48 pixels, and the numbers of open-eye and closed-eye images are roughly equal.
Step 2: build user face image database B. Database B comprises two sub-libraries, B1 and B2. Sub-library B1 consists of frontal grayscale face images of the user not wearing glasses; sub-library B2 consists of frontal grayscale face images of the user wearing glasses. In every grayscale face image in database B, the distance between the two eye center points is at least 48 pixels, and the numbers of open-eye and closed-eye images are roughly equal.
Step 3: compute the eye images for every face image in face image database A and user face image database B. This yields eye image database A', whose sub-libraries A1' and A2' correspond to sub-libraries A1 and A2 of database A, and eye image database B', whose sub-libraries B1' and B2' correspond to sub-libraries B1 and B2 of database B. The eye images are computed as follows. First, compute the pixel distance d between the two eyes in the grayscale face image. Then, following the facial proportions of Definition 4, crop a square region of side d/2 pixels centered on each eye center point. Finally, scale every cropped region to 24 × 24 pixels and rotate it clockwise by a random angle between -10° and 10° to obtain the eye image.
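The cropping geometry of Step 3 can be sketched as follows. The function name `eye_boxes`, the (x, y) coordinate convention, and the example eye centers are assumptions for illustration; the 48-pixel minimum distance is the requirement stated in Steps 1 and 2. Scaling to 24 × 24 pixels and the random rotation are left to an image library and are not shown.

```python
import math


def eye_boxes(left_eye, right_eye):
    """Return square crop boxes (x0, y0, side) around both eye centers.

    Each eye center is an (x, y) pixel coordinate. The side of each box is
    d / 2, where d is the distance between the two eye centers.
    """
    d = math.dist(left_eye, right_eye)
    if d < 48:
        raise ValueError("eye centers must be at least 48 pixels apart")
    side = d / 2
    return [(cx - side / 2, cy - side / 2, side)
            for cx, cy in (left_eye, right_eye)]


# Hypothetical eye centers 64 pixels apart on a level face.
print(eye_boxes((100, 120), (164, 120)))
# [(84.0, 104.0, 32.0), (148.0, 104.0, 32.0)]
```

With d = 64 the crop side is 32 pixels, so the scale factor to the 24 × 24 target would be 0.75.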
Step 4: build mixed eye image database C. Database C comprises 2N sub-libraries, denoted here $C1_i$ and $C2_i$ (1 ≤ i ≤ N, where N is a natural number). Each sub-library $C1_i$ is formed by randomly mixing eye images from sub-library A1' and sub-library B1' of Step 3 in a different proportion; each sub-library $C2_i$ is formed by randomly mixing eye images from sub-library A2' and sub-library B2' of Step 3 in a different proportion. Every sub-library $C1_i$ and $C2_i$ contains at least 2000 eye images.
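The mixing of Step 4 can be sketched as random sampling without replacement from the general and user eye image sub-libraries. The text says only that the proportions differ between sub-libraries, so the specific user fractions below (0.1, 0.3, 0.5), the function name `mix_sublibraries`, the fixed seed, and the placeholder string "images" are all assumptions for this example.

```python
import random


def mix_sublibraries(general, user, user_fractions, size, seed=0):
    """Build one mixed sub-library per user fraction.

    general, user: lists of eye images (any objects).
    user_fractions: share of each mixed sub-library drawn from `user`.
    size: number of images in each mixed sub-library.
    """
    rng = random.Random(seed)
    mixed = []
    for frac in user_fractions:
        n_user = round(size * frac)
        n_general = size - n_user
        library = rng.sample(user, n_user) + rng.sample(general, n_general)
        rng.shuffle(library)  # random mixing, so order carries no information
        mixed.append(library)
    return mixed


# Placeholder "images": strings standing in for 24 x 24 eye images.
general = [f"g{k}" for k in range(3000)]
user = [f"u{k}" for k in range(3000)]
libs = mix_sublibraries(general, user, [0.1, 0.3, 0.5], size=2000)
print([sum(1 for img in lib if img.startswith("u")) for lib in libs])
# [200, 600, 1000]
```

The same routine would be run once on the without-glasses pair and once on the with-glasses pair, giving the 2N sub-libraries of database C.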
Step 5: compute the Haar-like feature vector x of every eye image in the mixed eye image sub-libraries of database C, denoted here $C1_i$ (without glasses) and $C2_i$ (with glasses), 1 ≤ i ≤ N. The Haar-like feature vector x covers 3 types of features in 14 forms. The feature vectors of each sub-library are collected into one training sequence, giving 2N training sequences $S1_i$ and $S2_i$ (1 ≤ i ≤ N). Each training sequence can be written as $\{(x_1, y_1), (x_2, y_2), \ldots, (x_j, y_j), \ldots, (x_M, y_M)\}$, where $x_j$ is the j-th Haar-like feature vector in $C1_i$ or $C2_i$; $y_j \in \{-1, 1\}$ indicates whether the eye image corresponding to $x_j$ is open or closed; and M is the number of eye images in $C1_i$ or $C2_i$.
Step 6: apply the AdaBoost method to each of the 2N training sequences obtained in Step 5, denoted here $S1_i$ (without glasses) and $S2_i$ (with glasses), to build 2N corresponding strong classifiers $H1_i$ and $H2_i$ (1 ≤ i ≤ N).
Step 7: randomly select 1000 or more eye images from the user eye image sub-library B1' built in Step 3, compute their Haar-like feature vectors x, and classify them with each of the without-glasses strong classifiers built in Step 6, denoted here $H1_i$, to obtain a judgment: 1 for open, 0 for closed. Likewise, randomly select 1000 or more eye images from the user eye image sub-library B2' built in Step 3, compute their Haar-like feature vectors x, and classify them with each of the with-glasses strong classifiers built in Step 6, denoted here $H2_i$, to obtain a judgment: 1 for open, 0 for closed.
Step 8: compare the judgments from Step 7 with the actual open or closed state of the selected eye images, and compute the recognition accuracy of every strong classifier in the two groups, denoted here $H1_i$ (without glasses) and $H2_i$ (with glasses). The strong classifier with the highest recognition accuracy among $H1_i$ is chosen as the classifier for eye state identification while the user drives without glasses; the strong classifier with the highest recognition accuracy among $H2_i$ is chosen as the classifier for eye state identification while the user drives with glasses.
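The selection in Steps 7 and 8 amounts to scoring each candidate classifier on held-out user images and keeping the most accurate one. The sketch below shows this for one group; the toy threshold "classifiers", the sample values, and the function names are assumptions for illustration, with judgments coded 1 for open and 0 for closed as in Step 7.

```python
def accuracy(classifier, samples, labels):
    """Fraction of samples whose predicted state matches the true state."""
    hits = sum(1 for x, y in zip(samples, labels) if classifier(x) == y)
    return hits / len(samples)


def select_best(classifiers, samples, labels):
    """Return (index, accuracy) of the most accurate candidate classifier."""
    scores = [accuracy(c, samples, labels) for c in classifiers]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy stand-ins: each "classifier" thresholds a single number.
candidates = [lambda x, t=t: 1 if x >= t else 0 for t in (0.2, 0.5, 0.8)]
samples = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
labels = [0, 0, 0, 1, 1, 1]
print(select_best(candidates, samples, labels))  # (1, 1.0)
```

In the method, this selection would be run twice, once over the without-glasses group using images from B1' and once over the with-glasses group using images from B2'.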
Step 9: while the user is driving, capture frontal face images of the user in real time, compute the 24 × 24 pixel eye image and its Haar-like feature vector x in real time, and then, according to whether the user is wearing glasses, select the corresponding strong classifier from Step 8 to identify the eye state.
Through the above steps, a customized eye state classifier can be used for each user, which improves the accuracy of eye state identification for the individual.
Notes:
1. When building face database A and user face database B in Steps 1 and 2, the face images are preferably collected under a variety of lighting conditions. A capture environment can be built first, preferably a darkroom fitted with an adjustable light source so that the illumination can be varied from dark to bright; thousands of face images of one individual can then be collected within a few minutes.
2. The AdaBoost method used in Step 6 is not particularly restricted. Any AdaBoost variant can be used; only the final accuracy differs slightly.
Following the idea of customization and keeping the feature extraction method unchanged, the invention first builds a face image database and a user face image database. It then computes the eye images of every image in both databases and mixes the eye images of the face image database with those of the user face image database in different proportions to obtain a mixed eye image database. Next, it computes the Haar-like feature vector of every image in the mixed eye image database and builds strong classifiers with the AdaBoost method. It then randomly selects a number of eye images from the user face image database, computes their Haar-like feature vectors, classifies them with the strong classifiers, and measures the recognition accuracy of each strong classifier; the strong classifier with the highest recognition accuracy is chosen as the eye state classifier used while the user drives. Finally, this classifier is used for eye state identification while the user is driving.
The innovations of the invention are:
1. The idea of customization is applied to eye state identification: a different classifier is used for each user, which improves the accuracy of eye state identification for the individual.
2. The classifier is trained on a mixture of user data and face database data, so that it improves accuracy for the individual without losing generality, which reduces the risk of misidentification.
3. Recognition accuracy for users who wear glasses is improved, and the user can choose between two classifiers, one for wearing glasses and one for not wearing glasses, which gives flexibility.
Embodiment
The embodiment carries out the eye state identification method based on a customized classifier shown in Figure 1, following Steps 1 to 9 exactly as set out in the technical solution above.
Compared with a method trained only on general face database images, the method of the invention improves accuracy by about 2% for individuals in general and by 3% to 5% for individuals wearing glasses, with a computation time of less than 0.1 s.
In summary, the method of the invention applies the idea of customization: it combines user data with face database data and, keeping the feature extraction method unchanged, trains the eye state classifier, thereby achieving fast and accurate eye state identification.