CN110348350A - Driver state detection method based on facial expressions - Google Patents

Driver state detection method based on facial expressions

Info

Publication number
CN110348350A
CN110348350A (application CN201910584900.8A)
Authority
CN
China
Prior art keywords
facial expression
convolution
driver
layer
feature map
Prior art date
Legal status
Granted
Application number
CN201910584900.8A
Other languages
Chinese (zh)
Other versions
CN110348350B (en)
Inventor
胡江平
甘路涛
张馨滢
李咏章
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201910584900.8A
Publication of CN110348350A
Application granted
Publication of CN110348350B
Legal status: Active


Classifications

    • G06F18/2135 Feature extraction based on approximation criteria, e.g. principal component analysis
    • G06F18/2193 Validation; performance evaluation based on specific statistical tests
    • G06V20/597 Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • G06V40/161 Human faces: detection; localisation; normalisation
    • G06V40/174 Facial expression recognition


Abstract

The invention discloses a driver state detection method based on facial expressions. Grayscale conversion, Gamma correction, and PCA dimensionality reduction shrink the face image while enhancing its features. On this basis, the invention constructs a facial expression recognition convolutional neural network composed of four sequentially connected stages, an average pooling layer, a dropout layer, and a softmax classifier, with a small parameter count. At the same time, its Inception blocks use a new design that splits the conventional 3 × 3 convolution into a 1 × 3 convolution and a 3 × 1 convolution. This saves a large number of parameters, speeds up computation, and mitigates overfitting, while adding an extra layer of nonlinearity that expands the model's expressive power, allowing it to handle more and richer spatial features and increasing feature diversity. This Inception design makes the facial expression recognition convolutional neural network more lightweight while giving better detection performance, improving the accuracy of driver state detection.

Description

Driver state detection method based on facial expressions
Technical field
The invention belongs to the technical field of driver state detection and, more specifically, relates to a driver state detection method based on facial expressions, i.e., a method that detects the driver's facial expression in real time and thereby judges the driver's current driving state.
Background art
The driver's driving state plays a crucial role in safe driving; detecting it in real time helps ensure that the driver drives safely.
Current methods for analyzing and judging a driver's driving state fall into two broad classes: contact and non-contact. Contact methods mainly use wearable devices to measure physiological signals such as the driver's EEG and EMG signals; their main drawbacks are that the detection process interferes with safe driving and the cost is high. Non-contact methods fall into three groups. The first judges the driver's state from the vehicle's driving trajectory, but it is affected by the road environment and has low accuracy. The second judges the state from signals such as the steering-wheel rotation angle and the pressure applied to the brake and clutch, but it is strongly affected by the individual driver's habits. The third applies computer vision to face images captured by a camera, judging the driver's current expression and thereby detecting the driving state in real time; this approach offers good real-time performance and high accuracy, so computer-vision-based driver state detection is the current mainstream direction.
Facial expressions play an important role in interpersonal communication; compared with media such as text and speech, they convey human emotion more intuitively and accurately. This mode of affective interaction is already used in scenes such as virtual reality, digital entertainment, communication and video conferencing, and human-computer interaction. Driver state detection based on facial expressions is therefore more informative and more human-centered than simple fatigue detection. Facial expression recognition generally comprises three parts: face image preprocessing, facial expression feature learning, and facial expression classification; the driver's driving state is finally detected from the classified expression.
However, existing facial-expression-based driver state detection methods have large parameter counts and low computation speed, which hurts the real-time performance of driver state detection; their accuracy also leaves room for improvement.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art and provide a driver state detection method based on facial expressions that enhances the real-time performance of driver state detection while improving its accuracy.
To achieve the above object, the driver state detection method based on facial expressions of the present invention comprises the following steps:
(1) Acquire the driver's face image
A camera mounted in front of the driver captures a video stream; a Haar-feature + AdaBoost face detection algorithm locates the driver's face region, i.e., the face image, in each video frame;
(2) Preprocess the acquired face image
First convert the face image to grayscale:
Gray = 0.3R + 0.59G + 0.11B
where Gray is the gray pixel value and R, G, B are the red, green, and blue pixel values;
Then apply Gamma correction:
I = Gray^γ
where I is the corrected gray pixel value and γ = 0.5;
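The two steps above can be sketched in NumPy. This is an illustrative sketch, not the patent's implementation; in particular, normalizing gray values to [0, 1] before exponentiation is an assumption (the text only states I = Gray^γ), chosen so that γ = 0.5 brightens dark regions as Gamma correction normally does.

```python
import numpy as np

def preprocess_gray_gamma(rgb, gamma=0.5):
    """Grayscale conversion Gray = 0.3R + 0.59G + 0.11B followed by
    Gamma correction I = Gray ** gamma.

    rgb: H x W x 3 array with channel values in [0, 255].
    Returns an H x W array of corrected gray values in [0, 1]
    (the [0, 1] normalization is an assumption, see above).
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    gray = 0.3 * rgb[..., 0] + 0.59 * rgb[..., 1] + 0.11 * rgb[..., 2]
    return (gray / 255.0) ** gamma
```

With gamma = 0.5 a mid-dark pixel such as pure red (gray 0.3 after normalization) is lifted to about 0.55, which is the feature-enhancing effect the method relies on.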
Finally, process the grayscale, Gamma-corrected image with PCA (principal component analysis):
Represent the face image as an n-row, m-column matrix X. First zero-center each row of X, then compute the covariance matrix of X and find its eigenvalues and the corresponding eigenvectors. Arrange the eigenvectors as rows of a matrix O in descending order of eigenvalue, and take the first K rows of O to form the matrix P (a K-row, n-column matrix). The face image Y = PX is the face image reduced to K dimensions;
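The PCA step can be sketched with NumPy's symmetric eigendecomposition. This follows the formulation above (zero-center the rows of X, eigendecompose the row covariance, keep the eigenvectors with the K largest eigenvalues as rows of P) and also returns P for inspection; it is a sketch of the described procedure, not the patent's code.

```python
import numpy as np

def pca_reduce(X, K):
    """Reduce the n x m face image X to K dimensions: Y = P @ Xc.

    Xc is X with each row zero-centered; P (K x n) holds the
    eigenvectors of the row covariance matrix belonging to the K
    largest eigenvalues, one eigenvector per row.
    """
    X = np.asarray(X, dtype=np.float64)
    Xc = X - X.mean(axis=1, keepdims=True)   # zero-center each row
    C = np.cov(Xc)                           # n x n covariance of the rows
    vals, vecs = np.linalg.eigh(C)           # eigenvalues in ascending order
    P = vecs[:, ::-1][:, :K].T               # top-K eigenvectors as rows
    return P @ Xc, P
```

Because `eigh` returns orthonormal eigenvectors, keeping all n components makes the projection lossless, which is a convenient sanity check on the implementation.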
(3) Driver facial expression recognition
3.1) Build the facial expression recognition convolutional neural network
The facial expression recognition convolutional neural network consists of four sequentially connected stages, an average pooling layer, a dropout layer, and a softmax classifier;
Each stage contains a first convolutional layer with a 3 × 3 kernel and stride 2, second and third convolutional layers with 3 × 3 kernels and stride 1, a pooling layer, and an Inception block. The pooling layers of the first three stages use a 3 × 3 kernel with stride 2; the pooling layer of the fourth stage uses a 3 × 3 kernel with stride 1;
The Inception block comprises four feature-map processing branches in parallel and a filter concatenation layer. The first branch pools the input feature map with a 3 × 3 kernel, then applies a 1 × 1 convolution, and feeds the result into the filter concatenation layer. The second branch applies a 1 × 1 convolution to the input feature map and feeds the result into the filter concatenation layer. The third branch applies a 1 × 1 convolution to the input feature map, then applies a 3 × 1 convolution and a 1 × 3 convolution separately to the resulting feature map, and feeds both outputs into the filter concatenation layer. The fourth branch applies a 1 × 1 convolution to the input feature map, then a 3 × 3 convolution, then applies a 3 × 1 convolution and a 1 × 3 convolution separately to the result, and feeds both outputs into the filter concatenation layer. The filter concatenation layer concatenates the feature maps from the four branches to produce the output feature map;
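A quick parameter count shows what the 1 × 3 + 3 × 1 factorization of the 3 × 3 convolution buys. Assuming c input and c output channels and ignoring biases (the patent does not state channel widths, so c is a free parameter here), the factorized pair uses two thirds of the weights of a regular 3 × 3 kernel:

```python
def conv_params(kh, kw, c_in, c_out):
    """Weight count of one convolution layer, biases ignored."""
    return kh * kw * c_in * c_out

def factorized_vs_regular(c):
    """Compare a regular 3x3 convolution with the 1x3 + 3x1 pair used
    in the Inception branches, assuming c channels throughout."""
    regular = conv_params(3, 3, c, c)
    factorized = conv_params(1, 3, c, c) + conv_params(3, 1, c, c)
    return regular, factorized
```

For c = 64 this is 36,864 versus 24,576 weights, a one-third saving, and each of the two small convolutions can be followed by its own nonlinearity, which is the extra expressive layer the text mentions.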
The first convolutional layer of the first stage takes the K-dimensional face image produced by dimensionality reduction. The input passes through the first, second, and third convolutional layers, then through the pooling layer; the pooled feature map is processed by the Inception block, yielding the concatenated feature map. The output of the first stage is fed into the second stage for identical processing, the output of the second stage into the third stage, and the output of the third stage into the fourth stage. The concatenated feature map from the fourth stage is fed into the average pooling layer; after average pooling, the feature map passes through the dropout layer, which discards a certain proportion of activations, and is finally fed into the softmax classifier, which outputs the facial expression;
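The downsampling through the four stages can be traced with simple arithmetic. The sketch below assumes "same" padding, so a stride-s layer maps a side of length n to ceil(n/s), and assumes the stride-1 convolutions and the Inception block preserve spatial size; the patent specifies neither the padding nor the input resolution, so both are assumptions.

```python
import math

def out_size(n, stride):
    """Side length after a stride-s layer with 'same' padding (assumed)."""
    return math.ceil(n / stride)

def trace_stages(n):
    """Spatial side length after each of the four stages described above."""
    sizes = []
    for stage in range(4):
        n = out_size(n, 2)  # first conv of the stage, stride 2
        # second/third convs and the Inception block: stride 1, size unchanged
        n = out_size(n, 2 if stage < 3 else 1)  # pool; stage 4 uses stride 1
        sizes.append(n)
    return sizes
```

For a hypothetical 96 × 96 input the per-stage sides come out as 24, 6, 2, 1, so the average pooling layer sees a very small map, which keeps the parameter count of the classifier head low.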
3.2) Train the facial expression recognition convolutional neural network
Feed K-dimensional face images labeled with facial expressions into the facial expression recognition convolutional neural network built in step 3.1) and train it, obtaining the trained network;
During training, the activation function is ReLU, the optimization algorithm is SGD (Stochastic Gradient Descent), the initialization method is Xavier, and the learning rate is:
base_lr × (1 − iter/max_iter)^0.5
where base_lr = 0.01 is the initial learning rate, iter is the current iteration number, and max_iter is the maximum number of iterations;
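This matches the "poly" learning-rate policy familiar from Caffe, base_lr × (1 − iter/max_iter)^power; reading the 0.5 as the exponent (power = 0.5) is an assumption about the garbled typesetting of the formula. A sketch:

```python
def poly_lr(iteration, max_iter, base_lr=0.01, power=0.5):
    """'Poly' schedule: base_lr * (1 - iter/max_iter) ** power.

    Decays from base_lr at iteration 0 to 0 at max_iter; with
    power = 0.5 the decay is slow early and fast near the end.
    """
    return base_lr * (1.0 - iteration / max_iter) ** power
```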
3.3) After processing by steps (1) and (2), the acquired driver face image is fed into the trained facial expression recognition convolutional neural network to obtain the driver's facial expression;
(4) Output the result
Once the driver's facial expression is recognized, the driver's driving state is obtained and displayed on screen in real time, and the driver can be prompted promptly: when an expression unsuited to driving, such as anger, appears, a reminder of the unsuitable driving state is issued, so that the driver is effectively prompted in time or a series of measures is applied to relieve the driver's current unsuitable driving state.
The object of the invention is achieved as follows:
In the driver state detection method based on facial expressions of the present invention, grayscale conversion, Gamma correction, and PCA dimensionality reduction shrink the face image while enhancing its features. On this basis, the invention constructs a facial expression recognition convolutional neural network composed of four sequentially connected stages, an average pooling layer, a dropout layer, and a softmax classifier, with a small parameter count. At the same time, its Inception blocks use a new design that splits the conventional 3 × 3 convolution into a 1 × 3 convolution and a 3 × 1 convolution. This saves a large number of parameters, speeds up computation, and mitigates overfitting, while adding an extra layer of nonlinearity that expands the model's expressive power, allowing it to handle more and richer spatial features and increasing feature diversity. This Inception design makes the facial expression recognition convolutional neural network more lightweight while giving better detection performance, improving the accuracy of driver state detection.
Brief description of the drawings
Fig. 1 is the flow chart of the driver state detection method based on facial expressions of the present invention;
Fig. 2 is a frame diagram of a specific embodiment of the facial expression recognition convolutional neural network;
Fig. 3 is a frame diagram of the Inception block in the facial expression recognition convolutional neural network shown in Fig. 2.
Specific embodiments
Specific embodiments of the invention are described below with reference to the drawings so that those skilled in the art can better understand the invention. It should be noted that, in the following description, detailed descriptions of well-known functions and designs are omitted where they would dilute the main content of the invention.
Embodiment
In the present invention, the driver's face region, i.e., the face image, is first detected with a Haar-feature + AdaBoost face detection algorithm; the detected face image is preprocessed and fed into the constructed facial expression recognition convolutional neural network, which detects the driver's current facial expression in real time and thereby yields the driver's driving state.
Fig. 1 is the flow chart of the driver state detection method based on facial expressions of the present invention.
In this embodiment, as shown in Fig. 1, the driver state detection method based on facial expressions of the present invention comprises the following steps:
Step S1: acquire the driver's face image
A camera mounted in front of the driver captures a video stream, and a Haar-feature + AdaBoost face detection algorithm locates the driver's face region, i.e., the face image, in each video frame: Haar-like features are extracted from the image and fed into an AdaBoost classifier, which detects the region where the driver's face is located; the face bounding box is then cropped out as the face image for subsequent processing.
Step S2: preprocess the acquired face image
First convert the face image to grayscale:
Gray = 0.3R + 0.59G + 0.11B
where Gray is the gray pixel value and R, G, B are the red, green, and blue pixel values.
Then apply Gamma correction:
I = Gray^γ
where I is the corrected gray pixel value and γ = 0.5.
Finally, process the grayscale, Gamma-corrected image with PCA: represent the face image as an n-row, m-column matrix X, zero-center each row of X, compute the covariance matrix of X, and find its eigenvalues and corresponding eigenvectors. Arrange the eigenvectors as rows of a matrix O in descending order of eigenvalue and take the first K rows of O to form the matrix P (a K-row, n-column matrix); the face image Y = PX is the face image reduced to K dimensions. This completes the preprocessing of the driver's face image and facilitates the subsequent expression recognition by the neural network: grayscale conversion and PCA dimensionality reduction shrink the face image and enhance real-time processing, while Gamma correction enhances the image features and improves recognition accuracy.
Step S3: driver facial expression recognition
Step S3.1: build the facial expression recognition convolutional neural network
In this embodiment, as shown in Fig. 2, the facial expression recognition convolutional neural network consists of four sequentially connected stages, an average pooling layer, a dropout layer, and a softmax classifier.
Each of the four stages contains a first convolutional layer with a 3 × 3 kernel and stride 2, second and third convolutional layers with 3 × 3 kernels and stride 1, a pooling layer, and an Inception block. The pooling layers of the first three stages use a 3 × 3 kernel with stride 2; the pooling layer of the fourth stage uses a 3 × 3 kernel with stride 1.
In this embodiment, as shown in Fig. 3, the Inception block comprises four feature-map processing branches in parallel and a filter concatenation layer. The first branch pools the input feature map with a 3 × 3 kernel, then applies a 1 × 1 convolution, and feeds the result into the filter concatenation layer. The second branch applies a 1 × 1 convolution to the input feature map and feeds the result into the filter concatenation layer. The third branch applies a 1 × 1 convolution to the input feature map, then applies a 3 × 1 convolution and a 1 × 3 convolution separately to the resulting feature map, and feeds both outputs into the filter concatenation layer. The fourth branch applies a 1 × 1 convolution to the input feature map, then a 3 × 3 convolution, then applies a 3 × 1 convolution and a 1 × 3 convolution separately to the result, and feeds both outputs into the filter concatenation layer. The filter concatenation layer concatenates the feature maps from the four branches to produce the output feature map.
In this embodiment, the Inception block uses a new design: for example, the conventional 3 × 3 convolution is split into a 1 × 3 convolution and a 3 × 1 convolution. This saves a large number of parameters, speeds up computation, and mitigates overfitting, while adding an extra layer of nonlinearity that expands the model's expressive power, allowing it to handle more and richer spatial features and increasing feature diversity. This special structure makes the facial expression recognition convolutional neural network more lightweight while giving better detection performance.
The first convolutional layer of the first stage takes the K-dimensional face image produced by dimensionality reduction. The input passes through the first, second, and third convolutional layers, then through the pooling layer; the pooled feature map is processed by the Inception block, yielding the concatenated feature map. The output of the first stage is fed into the second stage for identical processing, the output of the second stage into the third stage, and the output of the third stage into the fourth stage. The concatenated feature map from the fourth stage is fed into the average pooling layer; after average pooling, the feature map passes through the dropout layer, which discards a certain proportion of activations, and is finally fed into the softmax classifier, which outputs the facial expression.
Step S3.2: train the facial expression recognition convolutional neural network
Feed K-dimensional face images labeled with facial expressions into the facial expression recognition convolutional neural network built in step S3.1 and train it, obtaining the trained network.
During training, the activation function is ReLU, the optimization algorithm is SGD, the initialization method is Xavier, and the learning rate is:
base_lr × (1 − iter/max_iter)^0.5
where base_lr = 0.01 is the initial learning rate, iter is the current iteration number, and max_iter is the maximum number of iterations.
Step S3.3: after processing by steps S1 and S2, the acquired driver face image is fed into the trained facial expression recognition convolutional neural network to obtain the driver's facial expression.
In this embodiment, seven basic driver facial expressions are output. The training data are a subset of a facial expression database; after preprocessing, they are fed into the facial expression recognition convolutional neural network for training, and the remaining part of the database is used for testing.
Step S4: obtain the driver's driving state
From the recognized driver facial expression, the driver's driving state is obtained and displayed on screen in real time, and the driver can be prompted promptly: when an expression unsuited to driving, such as anger, appears, a reminder of the unsuitable driving state is issued, so that the driver is effectively prompted in time or a series of measures is applied to relieve the driver's current unsuitable driving state.
In this embodiment, training on the dataset and performing expression recognition on drivers' faces demonstrates the correctness and effectiveness of the improved Inception block proposed by the present invention.
Facial expression recognition was performed with a VGG network model, an Inception V2 network model, a ResNet network model, and the driver state detection method based on facial expressions of the present invention; the recognition rates of the algorithms are compared in Table 1.
Table 1
As can be seen from Table 1, accuracy can be improved by dynamically adding or removing improved Inception blocks, so the network can adapt to different conditions; when the number of improved Inception blocks grows to a certain point, accuracy can exceed that of the best VGG network model while using fewer parameters. The present invention therefore has clear advantages in both accuracy and real-time performance for driver facial expression recognition.
Although illustrative specific embodiments of the present invention have been described above so that those skilled in the art can understand the invention, the invention is not limited to the scope of those embodiments. To those of ordinary skill in the art, various changes are permissible as long as they remain within the spirit and scope of the invention as defined and determined by the appended claims; all innovations that make use of the inventive concept fall within the scope of protection.

Claims (1)

1. a kind of driver status detection method based on facial expression, which comprises the following steps:
(1), the face-image of driver is obtained
The video flowing that driver is obtained using the camera installed in front of driver, utilizes haar feature+adaboost face Detection algorithm detects human face region, that is, face-image of user in video streaming image;
(2), the face-image of acquisition is pre-processed
Image gray processing processing is carried out to face-image first
Gray=0.3R+0.59G+0.11B
Wherein, Gray is grey scale pixel value, and R is red pixel value, G is green pixel values, B is blue pixel value;
Then Gamm correction is carried out:
I=Grayγ
Wherein, I is grey scale pixel value after correction;
Finally, being handled using PCA method the image after gray processing processing and Gamma correction again:
It determines that face-image is the matrix X of n row m column, each row of matrix X is first subjected to zero averaging, then finds out matrix X's Covariance matrix, then find out the characteristic value and corresponding feature vector of covariance matrix;By feature vector according to corresponding eigenvalue Size lines up matrix O by row from small to large, and the preceding K row of matrix O is taken to form matrix P, and matrix P is the matrix of K row n column, obtains face Portion image Y, Y=PX are the face-image after dimensionality reduction to K dimension;
(3), driver's human facial expression recognition
3.1), building human facial expression recognition volume is according to neural network
Human facial expression recognition volume according to neural network include sequentially connected four-layer structure, average pond layer, dropout layers with And softmax classifier;
Each layer of structure include convolution kernel size be 3 × 3, the first convolutional layer that step-length is 2, convolution kernel size be 3 × 3, step-length For 1 second and third convolutional layer, pond layer and inception structure;Wherein, the convolution kernel of the pond layer in three first layers structure Size is 3 × 3, step-length 2, and four-layer structure, pond layer convolution kernel size is 3 × 3, step-length 1;
The inception structure is divided into four characteristic pattern treatment channels and a filter including parallel processing Concatenation layers, first characteristic pattern treatment channel uses size to operate for 3 × 3 convolution kernels to input feature vector figure pondization, Then it uses size to carry out convolution operation for 1 × 1 convolution kernel, is finally sent into filter concatenation layers;Second Characteristic pattern treatment channel uses size to carry out convolution operation to input feature vector figure for 1 × 1 convolution kernel, is then fed into filter In concatenation layers;Third characteristic pattern treatment channel uses size to roll up for 1 × 1 convolution kernel to input feature vector figure Product operation, obtained characteristic pattern use size to carry out convolution operation, obtained spy for 3 × 1 convolution kernels, 1 × 3 convolution kernel again respectively Sign figure is all sent into filter concatenation layers;4th characteristic pattern treatment channel uses size for 1 × 1 convolution kernel pair Input feature vector figure carries out convolution operation, and the characteristic pattern that convolution operation obtains uses size to carry out convolution behaviour for 3 × 3 convolution kernels again Make, then respectively use again size for 3 × 1 convolution kernels, 1 × 3 convolution kernel to the characteristic pattern after 3 × 3 convolution kernel convolution operations into Row convolution operation, obtained characteristic pattern are all sent into concatenation layers of filter;Concatenation layers of filter The middle characteristic pattern for obtaining four characteristic pattern treatment channels is attached, the characteristic pattern after being connected;
The first convolutional layer of the first layer structure receives the K-dimensional face image after dimensionality reduction. The image passes through the first, second, and third convolutional layers in turn, then enters the pooling layer for pooling; the pooled feature map is processed by the Inception structure to obtain the concatenated feature map. The concatenated feature map produced by the first layer structure is fed into the second layer structure for the same processing; the output of the second layer structure is fed into the third layer structure, and the output of the third layer structure into the fourth layer structure, each performing the same processing as the first layer structure. The concatenated feature map produced by the fourth layer structure is sent to an average pooling layer for average pooling; after average pooling, a dropout layer discards a certain proportion of activations, and the result is fed into a softmax classifier to obtain the facial expression class.
3.2), training the facial expression recognition convolutional neural network
The K-dimensional face images labeled with facial expressions are fed into the facial expression recognition convolutional neural network built in step 3.1), and the network is trained to obtain a trained facial expression recognition convolutional neural network;
Wherein, the activation function chosen in the training process of the facial expression recognition convolutional neural network is the ReLU function, the optimization algorithm is SGD (Stochastic Gradient Descent), the initialization method is Xavier, and the learning rate is:

base_lr × (1 − iter/max_iter)^0.5

where base_lr = 0.01 is the initial learning rate, iter is the current iteration number, and max_iter is the maximum number of iterations;
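Read as a polynomial ("poly") decay with power 0.5 — one plausible reading of the schedule above — the rate starts at base_lr and falls to zero at max_iter. A direct sketch, with the exponent placement being an interpretation rather than something the patent states explicitly:

```python
def poly_lr(iteration: int, max_iter: int,
            base_lr: float = 0.01, power: float = 0.5) -> float:
    """Polynomial learning-rate decay: base_lr * (1 - iteration/max_iter) ** power."""
    return base_lr * (1.0 - iteration / max_iter) ** power

print(poly_lr(0, 10000))      # 0.01 at the first iteration
print(poly_lr(10000, 10000))  # 0.0 at the final iteration
```

Because the power is below 1, the rate stays close to base_lr for most of training and drops steeply only near max_iter.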
3.3), the driver face image obtained after the processing of steps (1) and (2) is fed into the trained facial expression recognition convolutional neural network to obtain the driver's facial expression;
(4), outputting the result
After the driver's facial expression is recognized, the driver's driving state is obtained and displayed on a screen in real time, and the driver can be prompted promptly. When the driver shows a facial expression unsuited to driving, such as anger, a reminder that the current state is unsuitable for driving is issued, effectively prompting the driver in time or using a series of measures to relieve the driver's current unsuitable driving state.
CN201910584900.8A 2019-07-01 2019-07-01 Driver state detection method based on facial expressions Active CN110348350B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910584900.8A CN110348350B (en) 2019-07-01 2019-07-01 Driver state detection method based on facial expressions


Publications (2)

Publication Number Publication Date
CN110348350A true CN110348350A (en) 2019-10-18
CN110348350B CN110348350B (en) 2022-03-25

Family

ID=68177588

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910584900.8A Active CN110348350B (en) 2019-07-01 2019-07-01 Driver state detection method based on facial expressions

Country Status (1)

Country Link
CN (1) CN110348350B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108108677A (en) * 2017-12-12 2018-06-01 Chongqing University of Posts and Telecommunications Facial expression recognition method based on an improved CNN
WO2018133034A1 (en) * 2017-01-20 2018-07-26 Intel Corporation Dynamic emotion recognition in unconstrained scenarios
EP3355247A1 (en) * 2017-01-27 2018-08-01 STMicroelectronics Srl A method of operating neural networks, corresponding network, apparatus and computer program product
CN108491858A (en) * 2018-02-11 2018-09-04 Nanjing University of Posts and Telecommunications Fatigue driving detection method and system based on convolutional neural networks
CN109034090A (en) * 2018-08-07 2018-12-18 Nantong University Emotion recognition system and method based on body movements
CN109376692A (en) * 2018-11-22 2019-02-22 Hohai University Changzhou Campus Transfer convolutional neural network method for facial expression recognition


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YAO A. et al.: "HoloNet: towards robust emotion recognition in the wild", Proceedings of the 18th ACM International Conference on Multimodal Interaction *
DANG Hongshe et al.: "Multi-person facial expression recognition algorithm based on convolutional neural networks", Modern Computer *
GAN Lutao: "Research on driver state analysis methods based on facial expressions", China Master's Theses Full-text Database (Engineering Science and Technology II) *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325190A (en) * 2020-04-01 2020-06-23 BOE Technology Group Co., Ltd. Expression recognition method and device, computer equipment and readable storage medium
WO2021196928A1 (en) * 2020-04-01 2021-10-07 BOE Technology Group Co., Ltd. Expression recognition method and apparatus, computer device, and readable storage medium
US20220343683A1 (en) * 2020-04-01 2022-10-27 Boe Technology Group Co., Ltd. Expression Recognition Method and Apparatus, Computer Device, and Readable Storage Medium
CN111325190B (en) * 2020-04-01 2023-06-30 BOE Technology Group Co., Ltd. Expression recognition method and device, computer equipment and readable storage medium
US12002289B2 (en) * 2020-04-01 2024-06-04 Boe Technology Group Co., Ltd. Expression recognition method and apparatus, computer device, and readable storage medium
CN111563468A (en) * 2020-05-13 2020-08-21 University of Electronic Science and Technology of China Driver abnormal behavior detection method based on neural network attention
CN111563468B (en) * 2020-05-13 2023-04-07 University of Electronic Science and Technology of China Driver abnormal behavior detection method based on neural network attention
CN111402143A (en) * 2020-06-03 2020-07-10 Tencent Technology (Shenzhen) Co., Ltd. Image processing method, device, equipment and computer readable storage medium
CN111832416A (en) * 2020-06-16 2020-10-27 Hangzhou Dianzi University Motor imagery electroencephalogram signal identification method based on an enhanced convolutional neural network
CN113642467A (en) * 2021-08-16 2021-11-12 Jiangsu Normal University Facial expression recognition method based on an improved VGG network model
CN113642467B (en) * 2021-08-16 2023-12-01 Jiangsu Normal University Facial expression recognition method based on an improved VGG network model


Similar Documents

Publication Publication Date Title
CN110348350A Driver state detection method based on facial expressions
Garg et al. Yoga pose classification: a CNN and MediaPipe inspired deep learning approach for real-world application
CN109753950B Dynamic facial expression recognition method
CN105005765B Facial expression recognition method based on Gabor wavelets and gray-level co-occurrence matrices
CN107403142A Micro-expression detection method
CN107463920A Face recognition method that eliminates the influence of partial occlusions
KR20190025564A System and method for facial expression recognition and annotation processing
CN107688784A Character recognition method and storage medium based on fusion of deep and shallow features
CN107945153A Pavement crack detection method based on deep learning
Zhang et al. An novel end-to-end network for automatic student engagement recognition
CN109993093A Road rage monitoring method, system, device and medium based on facial and respiratory characteristics
Zhang et al. Detecting negative emotional stress based on facial expression in real time
CN101833654B Sparse-representation face recognition method based on constrained sampling
CN102831389B Facial expression recognition algorithm based on discriminative component analysis
CN103500340B Human behavior recognition method based on thematic knowledge transfer
CN110532925B Driver fatigue detection method based on a spatio-temporal graph convolutional network
CN108389180A Fabric defect detection method based on deep learning
CN105844221A Facial expression recognition method based on Vadaboost-screened feature blocks
CN104268514A Gesture detection method based on multi-feature fusion
CN104008375A Integrated face recognition method based on feature fusion
CN109344744A Facial micro-expression action unit detection method based on deep convolutional neural networks
CN112472048A Neural network structure for pulse condition recognition in cardiovascular disease patients
CN106991409A Motor imagery EEG feature extraction and classification system and method
CN109543656A Face feature extraction method based on DCS-LDP
CN106446849A Fatigue driving detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant