CN111291590B - Driver fatigue detection method, driver fatigue detection device, computer equipment and storage medium - Google Patents

Driver fatigue detection method, driver fatigue detection device, computer equipment and storage medium

Info

Publication number
CN111291590B
CN111291590B (application CN201811485916.5A)
Authority
CN
China
Prior art keywords
value
image
opening
frame
eye
Prior art date
Legal status
Active
Application number
CN201811485916.5A
Other languages
Chinese (zh)
Other versions
CN111291590A (en)
Inventor
彭斐
毛茜
何俏君
尹超凡
李彦琳
谷俊
Current Assignee
Guangzhou Automobile Group Co Ltd
Original Assignee
Guangzhou Automobile Group Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Automobile Group Co Ltd filed Critical Guangzhou Automobile Group Co Ltd
Priority to CN201811485916.5A priority Critical patent/CN111291590B/en
Publication of CN111291590A publication Critical patent/CN111291590A/en
Application granted granted Critical
Publication of CN111291590B publication Critical patent/CN111291590B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/59 Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V 20/597 Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation
    • G06V 40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/18 Eye characteristics, e.g. of the iris

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Ophthalmology & Optometry (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method, a device, computer equipment and a storage medium for detecting fatigue of a driver, wherein the method comprises the following steps: acquiring a face video of a target driver, and respectively detecting the eye opening degree in each frame of face image in the face video to obtain the eye opening value in each frame of face image; determining a first opening threshold and a second opening threshold according to the eye opening values, wherein the first opening threshold is larger than the second opening threshold; according to the eye opening values, the first opening threshold and the second opening threshold, counting a first image frame value for which the eye opening value is smaller than or equal to the first opening threshold and a second image frame value for which the eye opening value is smaller than or equal to the second opening threshold; and if the ratio of the first image frame value to the second image frame value is greater than a preset fatigue judgment threshold, judging that the target driver is in a fatigue state. By adopting the scheme of the invention, the accuracy of the fatigue detection result can be improved.

Description

Driver fatigue detection method, driver fatigue detection device, computer equipment and storage medium
Technical Field
The invention relates to the technical field of image processing, in particular to a method and a device for detecting fatigue of a driver, computer equipment and a storage medium.
Background
Traffic accidents have always been one of the most serious threats to the safety of lives and property, and most of them are caused by human factors on the part of drivers. During vehicle driving, driver fatigue is one of the important causes of serious traffic accidents and severely endangers traffic safety.
With the development of image recognition processing technology, the fatigue state of a driver is judged and an alarm is given out by recognizing and processing the facial image information of the driver in the driving process, so that a new solution is provided for preventing traffic accidents.
The traditional driver fatigue detection method based on facial image information identifies the open and closed states of the eyes and performs fatigue detection from the detection results of consecutive frames; the accuracy of its detection results is low.
Disclosure of Invention
In view of the above, it is necessary to provide a driver fatigue detection method, apparatus, computer device, and storage medium capable of improving the detection accuracy.
A driver fatigue detection method, the method comprising:
acquiring a face video of a target driver, and respectively detecting the opening degree of human eyes of each frame of face image in the face video to obtain the opening degree value of human eyes in each frame of face image;
determining a first opening threshold value and a second opening threshold value according to each human eye opening value, wherein the first opening threshold value is larger than the second opening threshold value;
according to each human eye opening value, the first opening threshold and the second opening threshold, counting a first image frame value of which the human eye opening value is smaller than or equal to the first opening threshold and a second image frame value of which the human eye opening value is smaller than or equal to the second opening threshold;
if the ratio of the first image frame value to the second image frame value is larger than a preset fatigue judgment threshold value, judging that the target driver is in a fatigue state.
A driver fatigue detection apparatus, the apparatus comprising:
the detection module is used for acquiring a face video of a target driver, and detecting the opening degree of human eyes of each frame of face image in the face video to obtain the opening degree value of human eyes of each frame of face image;
the processing module is used for determining a first opening threshold value and a second opening threshold value according to each human eye opening value, wherein the first opening threshold value is larger than the second opening threshold value;
a counting module, configured to count, according to each of the eye opening values, the first opening threshold and the second opening threshold, a first image frame value of which the eye opening value is smaller than or equal to the first opening threshold, and a second image frame value of which the eye opening value is smaller than or equal to the second opening threshold;
and the judging module is used for judging that the target driver is in a fatigue state if the ratio of the first image frame value to the second image frame value is greater than a preset fatigue judging threshold value.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
acquiring a face video of a target driver, and respectively detecting the opening degree of human eyes of each frame of face image in the face video to obtain the opening degree value of human eyes in each frame of face image;
determining a first opening threshold value and a second opening threshold value according to each human eye opening value, wherein the first opening threshold value is larger than the second opening threshold value;
according to each human eye opening value, the first opening threshold and the second opening threshold, counting a first image frame value of which the human eye opening value is smaller than or equal to the first opening threshold and a second image frame value of which the human eye opening value is smaller than or equal to the second opening threshold;
and if the ratio of the first image frame value to the second image frame value is greater than a preset fatigue judgment threshold value, judging that the target driver is in a fatigue state.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring a face video of a target driver, and respectively detecting the opening degree of human eyes of each frame of face image in the face video to obtain the opening degree value of human eyes in each frame of face image;
determining a first opening threshold value and a second opening threshold value according to each human eye opening value, wherein the first opening threshold value is larger than the second opening threshold value;
according to each human eye opening value, the first opening threshold and the second opening threshold, counting a first image frame value of which the human eye opening value is smaller than or equal to the first opening threshold and a second image frame value of which the human eye opening value is smaller than or equal to the second opening threshold;
and if the ratio of the first image frame value to the second image frame value is greater than a preset fatigue judgment threshold value, judging that the target driver is in a fatigue state.
According to the driver fatigue detection method, the driver fatigue detection device, the computer equipment and the storage medium, a first opening threshold and a second opening threshold are determined according to the eye opening values in the frames of face images of the face video, eye states are distinguished based on the first opening threshold and the second opening threshold, and the target driver is judged to be in a fatigue state based on a preset fatigue judgment threshold and the ratio of a first image frame value, for which the eye opening value is smaller than or equal to the first opening threshold, to a second image frame value, for which the eye opening value is smaller than or equal to the second opening threshold, so that the accuracy of the detection result can be improved.
Drawings
FIG. 1 is a diagram of an exemplary driver fatigue detection method;
FIG. 2 is a flow diagram of a driver fatigue detection method in one embodiment;
FIG. 3 is a schematic diagram illustrating a process of obtaining an opening value of a human eye according to an embodiment;
FIG. 4 is a schematic diagram illustrating a process for obtaining a first opening degree threshold and a second opening degree threshold according to an embodiment;
FIG. 5 is a schematic diagram illustrating an exemplary process for obtaining a facial feature image;
FIG. 6 is a schematic diagram of a training process for an eye feature point localization model in one embodiment;
FIG. 7 is a DPM feature extraction schematic in one embodiment;
FIG. 8 is a diagram of a root filter (left), a component filter (middle), and the 2-fold spatial model after Gaussian filtering (right) in one embodiment;
FIG. 9 is a diagram comparing the effects of a conventional HOG + SVM and the applied DPM + Latent-SVM (a) and the corresponding formulas (b) in one embodiment;
FIG. 10 is a diagram illustrating the effects of cascading iterations in one embodiment;
FIG. 11 is a diagram that illustrates a hybrid tree model that encodes topology changes due to viewpoints, in one embodiment;
FIG. 12 is a diagram illustrating the positioning results of the human eye feature points in one embodiment;
FIG. 13 is a block diagram showing the construction of a driver fatigue detection apparatus in another embodiment;
fig. 14 is an internal structural view of a computer device in another embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the invention.
It should be noted that the terms "first" and "second" and the like in the description, the claims, and the drawings of the present application are used for distinguishing similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein.
The driver fatigue detection method provided by the application can be applied to the application environment shown in fig. 1. An infrared camera collects video information of the driver, and the collected video information can be input into the terminal to carry out driver fatigue detection. The infrared camera is preferably mounted on the steering column below the steering wheel of the car, and can communicate with the terminal in a wired or wireless manner. The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, vehicle-mounted terminals, and portable wearable devices.
In one embodiment, as shown in fig. 2, a method for detecting fatigue of a driver is provided, which is described by taking the method as an example of being applied to a terminal, and comprises the following steps:
step S201: acquiring a face video of a target driver, and respectively detecting the opening degree of human eyes of each frame of face image in the face video to obtain the opening degree value of human eyes in each frame of face image;
here, the face video is obtained by photographing the face of the target driver.
Specifically, a face video of a target driver in one detection period may be acquired, and eye opening detection may be performed on each frame of face image in the face video, so as to obtain an eye opening value in each frame of the face image. The size of the detection period can be set according to actual needs.
Step S202: determining a first opening threshold value and a second opening threshold value according to each human eye opening value, wherein the first opening threshold value is larger than the second opening threshold value;
generally, the first opening degree threshold value is also smaller than the maximum opening degree value of the eyes of the target driver, and the second opening degree threshold value is also larger than 0, the first opening degree threshold value being a threshold value for determining whether the eyes are in a fully open state, and the second opening degree threshold value being a threshold value for determining whether the eyes are in a closed state.
Step S203: according to each human eye opening value, the first opening threshold and the second opening threshold, counting a first image frame value of which the human eye opening value is smaller than or equal to the first opening threshold and a second image frame value of which the human eye opening value is smaller than or equal to the second opening threshold;
the eye opening value is larger than the first opening threshold value and indicates that the eyes are in a fully opened state, and the eye opening value is smaller than the second opening threshold value and indicates that the eyes are in a closed state. The first image frame value is the number of image frames in the face video whose eye opening value is less than or equal to the first opening threshold, and the second image frame value is the number of image frames in the face video whose eye opening value is less than or equal to the second opening threshold.
Specifically, a first image frame value and a second image frame value in one detection period may be counted according to each of the human eye opening value, the first opening threshold value, and the second opening threshold value.
Step S204: if the ratio of the first image frame value to the second image frame value is larger than a preset fatigue judgment threshold value, judging that the target driver is in a fatigue state;
the fatigue determination threshold value may be set according to actual conditions.
In the driver fatigue detection method, a face video of the target driver is obtained, and eye opening detection is performed on each frame of face image in the face video to obtain the eye opening value in each frame of face image. A first opening threshold and a second opening threshold are determined according to the eye opening values, the first opening threshold being larger than the second opening threshold. According to the eye opening values, the first opening threshold and the second opening threshold, a first image frame value for which the eye opening value is smaller than or equal to the first opening threshold and a second image frame value for which the eye opening value is smaller than or equal to the second opening threshold are counted, and if the ratio of the first image frame value to the second image frame value is larger than a preset fatigue judgment threshold, the target driver is judged to be in a fatigue state. In the scheme of this embodiment, the eye states are distinguished based on the first opening threshold and the second opening threshold, and the fatigue judgment is made from the ratio of the two frame counts and the preset fatigue judgment threshold, so the accuracy of the detection result can be improved.
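For illustration, a minimal sketch of this decision logic in Python follows (the function and parameter names, the 0.8/0.2 scale factors and the default threshold are assumptions; the ratio is computed as in the detailed embodiment later in this description, i.e. closed-eye frames over not-fully-open frames, while the claims phrase it as a ratio of the two frame counts):

```python
import numpy as np

def is_fatigued(eye_openness, k1=0.8, k2=0.2, fatigue_threshold=0.5):
    """Sketch of the fatigue decision for one detection period.

    eye_openness: per-frame eye opening values of the face video.
    k1, k2: assumed scale factors used to derive the first/second opening
            thresholds from the maximum opening value.
    fatigue_threshold: preset fatigue determination threshold T (assumed value).
    """
    openness = np.asarray(eye_openness, dtype=float)
    max_open = openness.max()            # maximum eye opening in the period
    t1 = k1 * max_open                   # first opening threshold (larger)
    t2 = k2 * max_open                   # second opening threshold (smaller)
    n1 = int(np.sum(openness <= t1))     # first image frame value
    n2 = int(np.sum(openness <= t2))     # second image frame value
    if n1 == 0:                          # eyes fully open in every frame
        return False
    # Ratio of the two frame counts; following the detailed embodiment,
    # it approaches 1 the longer the eyes stay closed.
    f = n2 / n1
    return f > fatigue_threshold
```

Called once per detection period with the list of per-frame opening values, this returns a boolean fatigue flag that could trigger a warning.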
In one embodiment, in order to eliminate the influence of the change of the distance between the human eye and the camera on the calculation of the human eye opening degree, a processing mode of normalizing the human eye opening degree value is provided. As shown in fig. 3, specifically, the above-mentioned detecting the eye opening degree of each frame of face image in the face video to obtain the eye opening degree value of each frame of face image may include:
step S301: respectively carrying out eye feature point positioning on each frame of the face image to obtain eye feature points in each frame of the face image;
step S302: respectively determining a human eye interpupillary distance value and a human eye opening original value in the face image of each frame according to the eye feature points in the face image of each frame;
specifically, the calibration point of the upper eyelid and the calibration point of the lower eyelid directly facing the pupil in each frame of the face image may be determined respectively according to the eye feature points in each frame of the face image, and the original value of the eye opening may be calculated according to the calibration point of the upper eyelid and the calibration point of the lower eyelid, where the original value of the eye opening may be equal to the distance between the calibration point of the upper eyelid and the calibration point of the lower eyelid.
Step S303: and respectively determining the human eye opening value in the face image of each frame according to the human eye interpupillary distance value and the human eye opening original value in the face image of each frame.
In particular, the eye opening value in each frame of the face image may be determined according to

H_i = (h_i / l_i) · C

where H_i represents the eye opening value in the face image of the i-th frame, h_i represents the original eye opening value in the face image of the i-th frame, l_i represents the inter-pupil distance value in the face image of the i-th frame, and C represents a correction parameter whose size can be set or adjusted according to the situation.
By adopting the scheme in the embodiment, the influence of the change of the distance between the human eyes and the camera on the calculation of the human eye opening degree can be effectively eliminated.
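As an illustration, the normalization could be implemented roughly as follows (a sketch; the function name and the example value of the correction parameter C are assumptions, and the form H_i = (h_i / l_i) · C is inferred from the linear relationship described above):

```python
def normalized_eye_openness(h_raw, pupil_distance, c=60.0):
    """Normalize the raw eye opening h by the inter-pupil distance l so the
    value is insensitive to the eye-camera distance: H = (h / l) * C.
    c is a chosen correction parameter (illustrative value)."""
    if pupil_distance <= 0:
        raise ValueError("pupil distance must be positive")
    return h_raw / pupil_distance * c
```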
In one embodiment, as shown in fig. 4, the determining the first opening threshold and the second opening threshold according to each of the eye opening values includes:
step S401: determining the maximum eye opening degree value according to the eye opening degree values;
specifically, the sizes of the eye opening values may be compared to obtain a maximum value of the eye opening values, and the maximum value may be used as the maximum eye opening value. Or sequencing the eye opening degree values in the descending order, taking the average value of the first N eye opening degree values in the sequencing, and taking the average value as the maximum eye opening degree value.
Step S402: multiplying the maximum opening degree value by a preset first proportional coefficient and a preset second proportional coefficient respectively to obtain a first opening degree threshold value and a second opening degree threshold value;
the first scaling factor is greater than the second scaling factor, and generally, the first scaling factor and the second scaling factor are both values greater than 0 and less than 1, and the magnitudes of the first scaling factor and the second scaling factor can be selected according to actual needs, preferably, the first scaling factor is 0.8, and the second scaling factor is 0.2.
In this embodiment, the first opening threshold and the second opening threshold are obtained by multiplying the maximum opening value by a preset first proportional coefficient and a preset second proportional coefficient respectively, so the algorithm is easy to implement; and because the first and second opening thresholds are determined from the maximum eye opening value, which is itself determined from the eye opening values, the detection accuracy can be further improved.
In one embodiment, as shown in fig. 5, the above-mentioned performing the facial feature point positioning on the facial image of each frame to obtain each facial feature image may include:
step S501: extracting a first DPM feature map, wherein the first DPM feature map is a DPM feature map of a current face image, and the current face image is any one frame of face image;
step S502: sampling the first DPM feature map, and extracting a second DPM feature map, wherein the second DPM feature map is a DPM feature map of an image obtained by sampling the first DPM feature map;
step S503: performing convolution operation on the first DPM characteristic diagram by using a pre-trained root filter to obtain a response diagram of the root filter;
step S504: performing convolution operation on the N times of the second DPM characteristic diagram by using a pre-trained component filter to obtain a response diagram of the component filter, wherein the resolution of the component filter is N times of that of the root filter, and N is a positive integer;
step S505: obtaining a target response diagram according to the response diagram of the root filter and the response diagram of the component filter;
step S506: and acquiring a current face feature image according to the target response image.
The facial images of the frames may be respectively used as the current facial image in this embodiment, and the steps S501 to S506 are respectively adopted to perform the facial feature point positioning, so as to obtain the facial feature images corresponding to the facial images of the frames.
In the embodiment, the face detection is performed by adopting the DPM target detection algorithm, so that the detection accuracy of the algorithm is improved, and the false detection rate and the missing detection rate can be reduced at the same time.
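For illustration only, a rough sketch of the response-map combination in steps S501 to S506 follows, assuming the DPM feature maps and trained filters are already available; part anchors, deformation costs and the fine Gaussian downsampling are simplified away, and all names are assumptions:

```python
import numpy as np
from scipy.signal import correlate2d

def combined_response(feat, feat_2x, root_filter, part_filters, part_weight=1.0):
    """Sketch: score map from a root filter on the base DPM feature map and
    component (part) filters on the 2x-resolution feature map.

    feat:      (H, W, C) DPM feature map of the image.
    feat_2x:   (2H, 2W, C) DPM feature map of the upsampled image.
    filters:   (h, w, C) arrays; part anchors/deformation are ignored here.
    """
    # Root response at base resolution (summed over feature channels).
    root_resp = sum(correlate2d(feat[..., c], root_filter[..., c], mode="same")
                    for c in range(feat.shape[-1]))
    # Part responses at 2x resolution, crudely subsampled back to base
    # resolution (the embodiment uses a fine Gaussian downsampling instead).
    part_resp = np.zeros_like(root_resp)
    for pf in part_filters:
        r = sum(correlate2d(feat_2x[..., c], pf[..., c], mode="same")
                for c in range(feat_2x.shape[-1]))
        part_resp += r[::2, ::2][:root_resp.shape[0], :root_resp.shape[1]]
    # Weighted combination; high values indicate likely face locations.
    return root_resp + part_weight * part_resp
```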
In one embodiment, the performing eye feature point positioning on each frame of the face image to obtain eye feature points in each frame of the face image may include: respectively carrying out face feature point positioning on the face image of each frame to obtain each face feature image; and respectively inputting each human face feature image into a preset eye feature point positioning model to obtain eye feature points in each frame of the human face image.
In one embodiment, as shown in fig. 6, the training process of the eye feature point location model may include:
step S601: acquiring a pixel value of each pixel point of a target image and a feature vector of each pixel point;
in this embodiment, the model used may be based on a hybrid tree and a shared component V pool. Each facial landmark is modeled as a part and global blending is used to capture topological changes due to the viewpoint.
Step S602: configuring a tree-structured part model according to the pixel values and the feature vectors, and determining a score function for the part configuration L;
wherein the score function is S(I, L, m) = App_m(I, L) + Shape_m(L) + α_m, with
App_m(I, L) = Σ_{i∈V_m} w_i^m · φ(I, l_i)
Shape_m(L) = Σ_{ij∈E_m} a_{ij}^m dx² + b_{ij}^m dx + c_{ij}^m dy² + d_{ij}^m dy
I denotes the target image, l_i = (x_i, y_i) represents the pixel location of the i-th part in the target image, w represents a part model obtained by modeling each facial feature point in the target image as a part, m indicates the mixture (the tree structure is a mixture), a, b, c and d represent elastic parameters, and α represents a mixture bias scalar;
In the present embodiment, App_m(I, L) sums the scores of placing the template w_i^m of part i at its location l_i, and φ(I, l_i) represents the feature vector of the target image I at pixel l_i.
In the present embodiment, Shape_m(L) expresses the arrangement score of the mixture-specific spatial arrangement L, where dx = x_i - x_j and dy = y_i - y_j indicate the displacement of the i-th part relative to the j-th part. Each parameter in the formula (a, b, c, d) can be interpreted as a spatial constraint between the different parts.
Step S603: obtaining optimal configuration parameters of each part of each hybrid type by calculating values of L and m which enable the score function to obtain a maximum value;
in particular, all hybrids may be enumerated, finding the best configuration parameters for each of the components for each hybrid.
Step S604: establishing a training sample set, wherein the training sample set comprises a positive sample and a negative sample which are set with labels, the positive sample is an image containing a human face, and the negative sample is an image not containing the human face;
in particular, assume a fully supervised scene, in which there are positive examples and mixed labels that contain faces and negative examples that do not contain faces. The shape parameters and appearance parameters may be learned differentially with a structural prediction framework.
Step S605: constructing a target vector according to the partial model, the elasticity parameters and the mixed bias scalar, and modifying the score function according to the target vector;
specifically, the partial model w, the elastic parameters (a, b, c, d), and the mixed bias scalar α are all placed into a vector β, and the scoring function described above is modified to the form: s (I, z) ═ β · Φ (I, z). The vector Φ (I, z) is sparse, having non-zero terms in a single interval corresponding to the hybrid m.
Step S606: learning to obtain the eye characteristic point positioning model according to the training sample set, the optimal configuration parameters, the modified score function and a predefined target prediction function;
wherein the eye feature point location model obtained by learning is:

arg min_{β, ξ_n ≥ 0}  (1/2)·||β||² + C·Σ_n ξ_n
s.t.  ∀n ∈ pos:  β · Φ(I_n, z_n) ≥ 1 - ξ_n
      ∀n ∈ neg, ∀z:  β · Φ(I_n, z) ≤ -1 + ξ_n

wherein β represents the target vector, z_n = {L_n, m_n}, C represents the penalty coefficient of the objective function, ξ_n represents the penalty term of the n-th sample, pos and neg respectively denote the positive and negative samples, K represents the number of target vectors, and k indexes the corresponding target vector.
In this embodiment, the face feature points and the eye feature points are each located by a machine learning algorithm, so the positioning accuracy is very high and the generalization to illumination and posture is very strong, which can improve the accuracy of calculating the degree of eye opening and closing.
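As a toy illustration of the score function used above, the following sketch evaluates S(I, L, m) = App_m(I, L) + Shape_m(L) + α_m for one candidate part configuration; the data layout (dictionaries keyed by part index) is an assumption:

```python
import numpy as np

def configuration_score(feature_maps, part_templates, edges, springs, alpha_m, locations):
    """Score one placement L of the parts of mixture m.

    feature_maps[i][(x, y)] -> feature vector phi(I, l_i) at pixel (x, y) (precomputed)
    part_templates[i]       -> template w_i^m for part i
    edges                   -> list of (i, j) pairs in the tree E_m
    springs[(i, j)]         -> (a, b, c, d) elastic parameters for edge (i, j)
    locations[i]            -> chosen pixel (x_i, y_i) for part i
    """
    # Appearance term: sum of template responses at the chosen locations.
    app = sum(float(np.dot(part_templates[i], feature_maps[i][locations[i]]))
              for i in part_templates)
    # Shape term: spring costs on the relative displacement of connected parts.
    shape = 0.0
    for (i, j) in edges:
        a, b, c, d = springs[(i, j)]
        dx = locations[i][0] - locations[j][0]
        dy = locations[i][1] - locations[j][1]
        shape += a * dx**2 + b * dx + c * dy**2 + d * dy
    return app + shape + alpha_m
```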
In order to facilitate an understanding of the present invention, a preferred embodiment of the present invention will be described in detail.
The driver fatigue detection method in this embodiment includes the steps of: the first step is as follows: inputting video information; the second step is that: detecting a human face; the third step: positioning face feature points; the fourth step: positioning human eye characteristic points; the fifth step: blink detection-calculating the eye opening, fatigue detection analysis.
The first step: collecting video information. A monocular infrared camera (mounted on the steering column below the steering wheel) inputs facial state information (images) of the driver in real time during driving. The frequency of the video input is 30 Hz and the image size of each frame is 1280 × 1080 pixels. The infrared camera can adapt to the different lighting conditions in the car and accurately capture the driver's head posture information and facial information.
The second step: face detection. For each frame of image of the input video, this embodiment performs face detection by using the DPM (Deformable Part Model) target detection algorithm. The DPM algorithm reuses part of the principles of the HOG algorithm: first, the picture is converted to grayscale; then, as in equation (1), the input image is normalized in color space by using a Gamma correction method:
I(x, y) = I(x, y)^gamma   (1)
The value of gamma is chosen according to the specific conditions (for example, 1/2 can be taken), which effectively reduces local shadow and illumination changes of the image. Next, gradient calculation is performed. The gradient reflects the change between adjacent pixels: where the change between adjacent pixels is flat the gradient is small, and where the change is sharp the gradient is large. The gradient of the continuous image f(x, y) at any pixel (x, y) is a vector:

∇f(x, y) = [G_x, G_y]^T = [∂f/∂x, ∂f/∂y]^T

where G_x is the gradient along the x-direction and G_y is the gradient along the y-direction. The magnitude and direction angle of the gradient can be expressed by the following formulas:

G(x, y) = sqrt(G_x² + G_y²),  θ(x, y) = arctan(G_y / G_x)

Pixel points in the digital image are calculated using differences:

G_x(x, y) = f(x+1, y) - f(x, y),  G_y(x, y) = f(x, y+1) - f(x, y)

Since the detection effect obtained by the gradient operation using the simple one-dimensional discrete differential template [-1, 0, 1] is the best, the calculation formulas used are:

G_x(x, y) = H(x+1, y) - H(x-1, y),  G_y(x, y) = H(x, y+1) - H(x, y-1)

where H(x, y) denotes the pixel value at (x, y). The magnitude and direction of the gradient at that pixel are then calculated as:

G(x, y) = sqrt(G_x(x, y)² + G_y(x, y)²),  α(x, y) = arctan(G_y(x, y) / G_x(x, y))
then, the whole target picture is divided into cell units (cells) which are not overlapped with each other and have the same size, and then the gradient size and direction of each cell unit are calculated. DPM retained the cell units of the HOG map and then normalized a certain cell unit on the map (8 × 8 cell unit in fig. 7) to its four cells in the diagonal neighborhood. Extracting signed HOG gradients, 0-360 degrees will yield 18 gradient vectors, extracting unsigned HOG gradients, 0-180 degrees will yield 9 gradient vectors. DPM extracts only unsigned features, generates 4 × 9-36 dimensional features, adds rows and columns to form 13 feature vectors (9 columns and 4 rows as shown in fig. 7), adds extracted 18-dimensional signed gradient features (18 columns and 18 rows as shown in fig. 18) to further improve accuracy, and finally obtains 13+ 18-31 dimensional gradient features.
As shown in fig. 8, the DPM model employs an 8 × 8 resolution Root filter (left) and a 4 × 4 resolution component filter (middle). Wherein the resolution of the middle graph is 2 times that of the left graph, and the size of the component filter is 2 times that of the root filter, so the gradient can be seen more finely. The right image is a 2-fold spatial model after gaussian filtering.
Firstly, a DPM feature map (DPM feature map of an original image) is extracted from an input image, gaussian pyramid upsampling (scaling image) is performed, and then the DPM feature map of the gaussian pyramid upsampled image is extracted. And carrying out convolution operation on the DPM characteristic diagram of the original image and the trained root filter to obtain a response diagram of the root filter. Meanwhile, performing convolution operation on the DPM characteristic diagram (sampled on a Gaussian pyramid) of the extracted 2-time image by using a trained component filter to obtain a response diagram of the component filter. The resulting response maps of the component filters are subjected to a fine gaussian downsampling operation so that the response maps of the root filter and the component filters have the same resolution. And finally, carrying out weighted average on the two images to obtain a final response image, wherein the response effect is better when the brightness is higher, and the human face is detected. Wherein the response value is expressed in the following formula:
score(x_0, y_0, l_0) = R_{0,l_0}(x_0, y_0) + Σ_{i=1}^{n} D_{i,l}(2·(x_0, y_0) + v_i) + b   (8)

where x_0, y_0 and l_0 respectively represent the abscissa, ordinate and scale of the feature point; R_{0,l_0}(x_0, y_0) is the response score of the root model; D_{i,l}(2·(x_0, y_0) + v_i) is the response score of the i-th component model; 2·(x_0, y_0) indicates that the pixels of the component model are at 2 times the original resolution, hence the pixel coordinates are multiplied by 2; b is the offset coefficient between different model components, used for alignment with the model; and v_i is the deviation coefficient between the pixel point and the ideal detection point. The detailed response score formula of the component model is:

D_{i,l}(x, y) = max_{dx,dy} [ R_{i,l}(x + dx, y + dy) - d_i · φ_d(dx, dy) ]

Similar to equation (8), the larger the value of the objective function D_{i,l}(x, y), the better; the variables are dx and dy. In the above formula, (x, y) is the ideal position of the trained model; dx, dy is the offset from the ideal model position, ranging from the ideal position to the picture edge; R_{i,l}(x + dx, y + dy) is the matching score of the component model; d_i · φ_d(dx, dy) is the offset loss score of the component, where d_i is the offset loss coefficient and φ_d(dx, dy) is the displacement between a pixel point of the component model and the detection point of the component model. The formula shows that the higher the response of the component model and the smaller the distance between each component and the corresponding pixel point, the higher the response score, and the more likely it is that this is the object to be detected.
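A brute-force sketch of the component term D_{i,l}(x, y); real DPM implementations compute this with a generalized distance transform, so the exhaustive search and all names here are illustrative assumptions:

```python
import numpy as np

def part_displacement_score(part_response, anchor, d_coeffs, max_disp=4):
    """Best response of one part around its anchor, penalized by displacement.

    part_response: 2-D response map R_{i,l} of the part filter.
    anchor:        (x, y) ideal (trained) position of the part.
    d_coeffs:      (d1, d2, d3, d4) deformation coefficients for
                   phi_d(dx, dy) = (dx, dy, dx^2, dy^2).
    max_disp:      search radius in pixels (brute force instead of the
                   generalized distance transform used by DPM).
    """
    x0, y0 = anchor
    d1, d2, d3, d4 = d_coeffs
    h, w = part_response.shape
    best = -np.inf
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            x, y = x0 + dx, y0 + dy
            if 0 <= x < w and 0 <= y < h:
                penalty = d1 * dx + d2 * dy + d3 * dx**2 + d4 * dy**2
                best = max(best, part_response[y, x] - penalty)
    return best
```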
When training the model, the DPM features obtained above are trained. DPM uses Latent-SVM classification here, in which a latent variable is added to the Linear-SVM classification and can be used to determine which candidate within a positive sample is the true positive sample. There are many latent variables in LSVM because, after a bounding box is marked in a picture of a positive sample, a sample of maximal score needs to be proposed at a certain position and scale as the sample of a certain part. FIG. 9(a) compares the effects of an ordinary HOG + SVM and the applied DPM + Latent-SVM. The general formulas of HOG + SVM and DPM + Latent-SVM are shown in FIG. 9(b).
The third step: positioning the facial feature points. This embodiment uses the LBF algorithm, in which a cascade of regressors locates the facial feature points and the eye feature points within milliseconds. Let S = (x_1^T, x_2^T, ..., x_p^T)^T denote the shape vector, where x_i represents the (x, y) coordinates of the i-th facial feature point in image I. Each regressor r_t(·,·) uses the current picture I and the current shape estimate Ŝ^(t) to predict the updated shape vector; the specific formula is as follows:

Ŝ^(t+1) = Ŝ^(t) + r_t(I, Ŝ^(t))

where Ŝ^(t) indicates the current estimate of the shape vector S. The most important step in the cascade is that the regressor r_t(·,·) makes its predictions based on features, such as pixel grayscale values, that are computed from the image I and indexed relative to the current shape estimate Ŝ^(t). Geometric invariance is introduced in this process, and as the cascade progresses one can be more certain that the precise semantic location on the face is being indexed.
If the initial estimate Ŝ^(0) belongs to this space, the output range spanned by the ensemble is then ensured to lie in the linear subspace of the training data. This removes the need for additional constraints on the prediction, which greatly simplifies the method. Furthermore, the initial shape is simply chosen as the average shape of the training data, centered and scaled according to the bounding box output of the generic face detector.
Next, each regressor in the cascade is learned with a training data set ((I_1, S_1), ..., (I_n, S_n)), where I_i represents a face picture and S_i its shape vector. To learn the first regression function r_0, triplets of a training image, an initialized shape estimate and a target update step, (I_{π_i}, Ŝ_i^(0), ΔS_i^(0)), are created as follows:

π_i ∈ {1, ..., n}   (13)
Ŝ_i^(0) ∈ {S_1, ..., S_n} \ {S_{π_i}}   (14)
ΔS_i^(0) = S_{π_i} - Ŝ_i^(0)   (15)

where i = 1, ..., N. The total number of these triplets is set to N = nR, where R is the number of initializations used per image. Each initialized shape estimate for an image is sampled uniformly from (S_1, ..., S_n) without replacement.
Using a sum of squared error loss and gradient tree boosting, the regression function r_0 can be learned from this data. Given the training data {(I_{π_i}, Ŝ_i^(0), ΔS_i^(0))}, i = 1, ..., N, and a learning rate 0 < ν < 1, the specific process is as follows:

a. Initialization:

f_0(I, Ŝ^(0)) = arg min_γ Σ_{i=1}^{N} ||ΔS_i^(0) - γ||²

b. For k from 1 to K:

① For i = 1, ..., N, compute the residuals:

r_ik = ΔS_i^(0) - f_{k-1}(I_{π_i}, Ŝ_i^(0))

② Fit a regression tree to the residuals r_ik, giving a weak regression function g_k(I, Ŝ^(0)).

③ Update:

f_k(I, Ŝ^(0)) = f_{k-1}(I, Ŝ^(0)) + ν · g_k(I, Ŝ^(0))

c. Output:

r_0 = f_K

The triplet training data are in turn updated as:

Ŝ_i^(1) = Ŝ_i^(0) + r_0(I_{π_i}, Ŝ_i^(0))
ΔS_i^(1) = S_{π_i} - Ŝ_i^(1)

and the next regressor r_1 in the cascade is set up in the same way (with t = 1). This process is iterated until a cascade of T regressors r_0, r_1, ..., r_{T-1} gives a sufficient level of accuracy when combined.
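A condensed sketch of learning one cascade level by gradient tree boosting with shrinkage ν; scikit-learn's DecisionTreeRegressor stands in for the piecewise-constant weak regressor, and the feature extraction from (I, Ŝ^(t)) is assumed to be precomputed into a matrix:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_cascade_level(features, delta_shapes, n_trees=10, shrinkage=0.1, depth=4):
    """Learn r_t as a sum of regression trees fit to residual shape updates.

    features:     (N, F) shape-indexed features for the N training triplets.
    delta_shapes: (N, 2p) target updates Delta S_i^(t).
    Returns a predict(features) -> (N, 2p) function.
    """
    f0 = delta_shapes.mean(axis=0)                    # initialization (mean target)
    preds = np.tile(f0, (delta_shapes.shape[0], 1))
    trees = []
    for _ in range(n_trees):
        residuals = delta_shapes - preds              # r_ik
        tree = DecisionTreeRegressor(max_depth=depth)
        tree.fit(features, residuals)                 # weak regressor g_k
        trees.append(tree)
        preds += shrinkage * tree.predict(features)   # f_k = f_{k-1} + v * g_k
    def predict(x):
        out = np.tile(f0, (x.shape[0], 1))
        for t in trees:
            out += shrinkage * t.predict(x)
        return out
    return predict
```

Repeating this for each level, with the triplets' shape estimates updated between levels, yields the cascade described above.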
Each regression function r_t is a tree-based regression function fitted to the residual targets in the gradient boosting algorithm. On each split node of the regression tree, a decision is made based on thresholding the intensity difference between two pixels. The pixel coordinates used in the test, (u, v), are defined in a coordinate system based on the average shape. For a face image of arbitrary shape, we want to index points that have the same position relative to its shape as u and v have relative to the average shape. To achieve this, the image could be warped to the average shape based on the current shape estimate before extracting the features; but because a very sparse representation of the image's pixels is used, it is much more efficient to warp the positions of these points rather than warp the entire image.
Suppose k_u is the index of the facial landmark in the average shape that is closest to u, and define its offset from u as:

Δx_u = u - x̄_{k_u}

Then, for an image I_i with the shape S_i defined in it, the point in I_i that is qualitatively similar to u in the average-shape image is:

u′ = x_{i,k_u} + (1 / s_i) · R_i^T · Δx_u

where s_i and R_i are the scale and rotation matrix of the similarity transform that minimizes the sum of squared differences between the average-shape facial landmark points x̄_j and the warped points:

Σ_{j=1}^{p} || x̄_j - (s_i R_i x_{i,j} + t_i) ||²

v′ is defined similarly. Formally, each split is a decision involving three parameters θ = (τ, u, v) and is applied to each training and test sample as a threshold test on the intensity difference: the split fires if I_{π_i}(u′) - I_{π_i}(v′) > τ, and does not fire otherwise.
Here u′ and v′ are defined via the scale and rotation matrices. Computing the similarity transform, the most computationally expensive part of the process at test time, is only done once at each level of the cascade.
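A sketch of estimating the scale s_i and rotation R_i that best align a shape to the mean shape in the least-squares sense; this is a standard Procrustes-style fit and an illustrative implementation, not necessarily the exact procedure of the embodiment:

```python
import numpy as np

def similarity_to_mean_shape(shape, mean_shape):
    """Return scale s and rotation R minimizing sum_j ||mean_j - (s*R*shape_j + t)||^2.

    shape, mean_shape: (p, 2) arrays of landmark coordinates.
    """
    x = shape - shape.mean(axis=0)                   # center both point sets
    y = mean_shape - mean_shape.mean(axis=0)
    cov = y.T @ x                                    # 2x2 cross-covariance
    u, sing, vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(u @ vt))               # guard against reflections
    rot = u @ np.diag([1.0, d]) @ vt                 # rotation R
    scale = (sing * [1.0, d]).sum() / (x**2).sum()   # scale s
    return scale, rot
```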
For each regression tree, the underlying function is approximated with a piecewise constant function in which a constant vector is fitted to each leaf node. To train the regression tree, a set of candidate splits, i.e. candidate θ, is randomly generated at each tree node. The θ that minimizes the sum of squared errors is then greedily chosen from these candidates. If Q is the index set of the training examples at the node, this corresponds to minimizing:

E(Q, θ) = Σ_{s∈{l,r}} Σ_{i∈Q_{θ,s}} || r_i - μ_{θ,s} ||²

where Q_{θ,s} is the set of indices of the samples sent to the left (s = l) or right (s = r) child by the split θ, r_i is the vector of all residuals computed for image i in the gradient boosting algorithm, and μ_{θ,s} is defined as:

μ_{θ,s} = (1 / |Q_{θ,s}|) Σ_{i∈Q_{θ,s}} r_i

The optimal split point can be found easily because, if the formula is rearranged and the factors that do not depend on θ are ignored, the following relationship is obtained:

arg max_θ Σ_{s∈{l,r}} |Q_{θ,s}| · μ_{θ,s}^T μ_{θ,s}

When evaluating different θ, only μ_{θ,l} needs to be computed, since μ_{θ,r} can be obtained from the overall mean μ and μ_{θ,l} as follows:

μ_{θ,r} = ( |Q| · μ - |Q_{θ,l}| · μ_{θ,l} ) / |Q_{θ,r}|
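A sketch of the split selection using the rearranged objective and the μ_{θ,r} shortcut above; the candidate pixel-difference features are assumed to be precomputed as columns of a matrix, and all names are illustrative:

```python
import numpy as np

def best_split(features, residuals, candidate_cols, thresholds):
    """Pick the candidate (feature column, threshold) maximizing
    sum over children of |Q_s| * mu_s^T mu_s at this node.

    features:       (N, F) pixel-difference features for the node's samples.
    residuals:      (N, 2p) residual targets r_i.
    candidate_cols: iterable of feature column indices to try.
    thresholds:     matching iterable of thresholds tau.
    """
    n = residuals.shape[0]
    total = residuals.sum(axis=0)                     # n * mu
    best_score, best_theta = -np.inf, None
    for col, tau in zip(candidate_cols, thresholds):
        left = features[:, col] > tau
        n_l = int(left.sum())
        n_r = n - n_l
        if n_l == 0 or n_r == 0:
            continue
        mu_l = residuals[left].mean(axis=0)
        mu_r = (total - n_l * mu_l) / n_r             # reuse the overall sum and mu_l
        score = n_l * (mu_l @ mu_l) + n_r * (mu_r @ mu_r)
        if score > best_score:
            best_score, best_theta = score, (col, tau)
    return best_theta
```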
the decision at each node is based on thresholding the difference in intensity values at a pair of pixels. This is a fairly simple test, but it is more powerful than a single threshold because it is relatively insensitive to global illumination variations. Unfortunately, a drawback of using pixel differences is that the number of possible segmentation (feature) candidates is quadratic in the number of pixels in the average image. This makes it difficult to find a good theta without searching for many theta. However, by considering the structure of the image data, such a limiting factor can be alleviated to some extent. We first introduce an index
p(u,v)αe-λ||u-v|| (29)
The pixel segmentation points within the distance range are easy to select, so that the number of prediction errors of the data set can be effectively reduced.
To handle missing labels, a variable w_{i,j} ranging between 0 and 1 is introduced (representing the weight of the j-th landmark point of the i-th image), and a new sum-of-squared-differences formula is derived:

E(Q, θ) = Σ_{s∈{l,r}} Σ_{i∈Q_{θ,s}} (r_i - μ_{θ,s})^T W_i (r_i - μ_{θ,s})

where W_i is a diagonal matrix built from the vector (w_{i,1}, ..., w_{i,p})^T. In addition, μ_{θ,s} becomes:

μ_{θ,s} = ( Σ_{i∈Q_{θ,s}} W_i )^{-1} Σ_{i∈Q_{θ,s}} W_i r_i

The gradient boosting algorithm must also be modified to take these weighting factors into account. This can be done simply by initializing the overall model with the weighted average of the targets and fitting the regression trees to the weighted residuals:

r_ik = W_i · ( ΔS_i^(t) - f_{k-1}(I_{π_i}, Ŝ_i^(t)) )
wherein the cascade iteration effect is shown in fig. 10.
The fourth step: and positioning the characteristic points of the human eyes. In this embodiment, the model used is based on a hybrid tree and a shared pool of components V. In this approach we model each facial landmark as a part and use global mixing to capture the topological changes due to the viewpoint. As shown in fig. 11, the hybrid tree model employed in the present embodiment encodes topology changes due to viewpoints.
Tree-structured part model: each tree-structured, linearly-parameterized model is written as T_m = (V_m, E_m), where m indicates the mixture and V_m ⊆ V. A picture is denoted I, and l_i = (x_i, y_i) represents the pixel location of part i. The score of a configuration of parts L is:

S(I, L, m) = App_m(I, L) + Shape_m(L) + α_m   (33)

App_m(I, L) = Σ_{i∈V_m} w_i^m · φ(I, l_i)   (34)

Equation (34) sums the scores of placing the template w_i^m of part i, tuned for mixture m, at its location l_i, where φ(I, l_i) is the feature vector extracted at pixel l_i of picture I.

Shape_m(L) = Σ_{ij∈E_m} a_{ij}^m dx² + b_{ij}^m dx + c_{ij}^m dy² + d_{ij}^m dy   (35)

Expression (35) represents the arrangement score of the mixture-specific spatial arrangement L, where dx = x_i - x_j and dy = y_i - y_j indicate the displacement of the i-th part relative to the j-th part. Each parameter in the formula (a, b, c, d) can be interpreted as a spatial constraint between the different parts. α_m represents a scalar bias for mixture m.
Since the method for positioning the eye feature points is mainly applied in the scheme of the embodiment, the factors of whole and part sharing are not considered.
The values of the parameters L and m that give the maximum value of the formula S(I, L, m) are found:

S*(I) = max_m max_L S(I, L, m)

Simply enumerate all mixtures and find the best configuration of parts for each mixture.

Because each mixture T_m = (V_m, E_m) is a tree structure, the inner maximization can be completed efficiently through dynamic programming; for brevity the message-passing equations are omitted. In this embodiment the total number of distinct part templates in the vocabulary is M′·|V|; assuming that the dimension of each part is D and there are N candidate locations, the total cost of evaluating all parts at all locations is:

O(D · N · M′ · |V|)

Distance transforms are then used, and the message-passing cost becomes O(N·M·|V|). This makes the overall model of this embodiment's solution linear in the number of parts and the image size.
Training the eye feature point location model. The scheme of this embodiment assumes a fully supervised scenario, in which there are positive samples with landmark and mixture labels and negative samples that do not contain faces. The shape and appearance parameters are learned discriminatively using a structured prediction framework. First, the edge structure E_m of each mixture needs to be estimated. Although deriving human body models with tree structures is a natural process, the tree structure of the human eye features is not obvious.
This embodiment uses the Chow-Liu algorithm to find the maximum-likelihood tree structure that best explains the positions of the feature points under a Gaussian distribution. Given labeled positive samples {I_n, L_n, m_n} and negative samples {I_n}, this embodiment defines a structured target prediction function and sets z_n = {L_n, m_n}. The formula S(I, L, m) is linear in the part models w, the elastic parameters (a, b, c, d) and the mixture bias α. Putting all these parameters into a vector β, the scoring function can then be written as follows:
S(I,z)=β·Φ(I,z) (38)
where the vector Φ (I, z) is sparse with non-zero terms in a single interval corresponding to mix m.
Next, a model of the following form is learned:

arg min_{β, ξ_n ≥ 0}  (1/2)·||β||² + C·Σ_n ξ_n   (39)
s.t.  ∀n ∈ pos:  β · Φ(I_n, z_n) ≥ 1 - ξ_n
      ∀n ∈ neg, ∀z:  β · Φ(I_n, z) ≤ -1 + ξ_n

In equation (39), C represents the penalty coefficient of the objective function (a hyper-parameter for which the most suitable value must be found by tuning), ξ_n represents the penalty term of the n-th sample (n indexes the samples), pos and neg respectively denote the positive and negative samples, K represents the number of target vectors β, and k indexes the corresponding target vector β.
Fig. 12 is a schematic diagram showing the result of locating the feature points of the human eye.
The fifth step: blink detection-calculating the eye opening and performing fatigue analysis. The longer eye closure time when blinking is one of the important indicators of driver fatigue. On the basis of the previous steps, the human eye area has been located and the human eye feature points have been found. In order to calculate the eye opening, in the present embodiment, first, a change in a distance between the eye and the camera is excluded to prevent the change from affecting the calculation of the eye opening. On the basis, the fatigue degree of the driver is judged by the proportion of the eye closing time in unit time. The details are as follows.
First, the open/closed state of the human eye is determined from the eye opening value. In the scheme of this embodiment, the eye feature points located in the previous step are selected, and the calibration points on the upper and lower eyelids directly facing the pupil are found to calculate the eye opening. However, many experiments and practical experience show that the eye opening appears smaller when the eye is farther from the camera and larger when the eye is closer to the camera. This is not beneficial to the later detection of driver fatigue, so the abnormal change of the eye opening caused by the change of the relative position between the eye and the camera is normalized. In the scheme of this embodiment, the inter-pupil distance l of the person can be measured using the eye feature points located in the previous step; a linear relationship exists between the pupil distance and the change of the eye opening. Assuming that the actually measured eye opening (equivalent to the original eye opening value) is h and the normalized eye opening is H, the eye opening value is corrected by the following formula:

H = (h / l) · C
where C represents a selected correction parameter.
Next is the division of the human eye state. In this embodiment, the maximum eye opening is obtained from the eye opening values obtained above and is recorded as MaxW. Assuming that the measured eye opening is W: state I means W > 80% MaxW, the eye is in a fully open state; state II means 20% MaxW < W ≤ 80% MaxW, the eye is in a half-open state; state III means W ≤ 20% MaxW, the eye is in the closed state.
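A small illustrative sketch of this three-state division (names and the exact return convention are assumptions):

```python
def eye_state(opening, max_opening, k1=0.8, k2=0.2):
    """Classify a frame as fully open (I), half open (II) or closed (III)."""
    if opening > k1 * max_opening:
        return "I"    # fully open
    if opening > k2 * max_opening:
        return "II"   # half open
    return "III"      # closed
```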
In this embodiment, the number of frames in the period (equivalent to the detection period) whose eye opening is smaller than or equal to 80% of the maximum eye opening is counted and recorded as n, and the number of frames in the period whose eye opening is smaller than or equal to 20% of the maximum eye opening is recorded as m; a ratio f can then be obtained, and the calculation formula is:

f = m / n
the closer f is to 1, the closer the driver is to fatigue. An experimental threshold value T (equivalent to the fatigue judgment threshold value) can be obtained through a large number of experiments, if f is larger than T, the driver is in a fatigue state, and early warning is performed in a voice mode to remind the driver of paying attention to fatigue driving.
In the scheme of this embodiment, the DPM algorithm is used for face detection, which greatly improves the detection accuracy of the algorithm, reduces the false detection rate and the missed detection rate, and improves robustness to illumination and face posture; the face feature points and the eye feature points are located with machine learning algorithms, giving very high positioning accuracy and very strong generalization to illumination and posture, so that finally the degree of eye opening and closing can be estimated accurately; and for fatigue detection, not only the open/closed eye state is used as the main criterion, but also the eye-closure time, the number of blinks per unit time, the eye opening degree and the like are used as fatigue criteria.
In one embodiment, as shown in fig. 13, there is provided a driver fatigue detection apparatus, including: a detection module 1301, a processing module 1302, a statistics module 1303 and a determination module 1304, wherein:
the detection module 1301 is configured to acquire a face video of a target driver, and perform eye opening detection on each frame of face image in the face video to obtain an eye opening value in each frame of face image;
a processing module 1302, configured to determine a first opening threshold and a second opening threshold according to each human eye opening value, where the first opening threshold is greater than the second opening threshold;
a statistic module 1303, configured to count, according to each eye opening value, the first opening threshold, and the second opening threshold, a first image frame value of which the eye opening value is smaller than or equal to the first opening threshold, and a second image frame value of which the eye opening value is smaller than or equal to the second opening threshold;
a determining module 1304, configured to determine that the target driver is in a fatigue state if a ratio of the first image frame value to the second image frame value is greater than a preset fatigue determination threshold.
In one embodiment, the detection module 1301 may perform eye feature point positioning on each frame of the face image to obtain an eye feature point in the face image of each frame, determine a pupil distance value and an eye opening original value in the face image of each frame according to the eye feature point in the face image of each frame, and determine an eye opening value in the face image of each frame according to the pupil distance value and the eye opening original value in the face image of each frame.
In an embodiment, the processing module 1302 may determine a maximum eye opening degree value according to each of the eye opening degree values, and multiply the maximum eye opening degree value by a preset first scaling coefficient and a preset second scaling coefficient, respectively, to obtain the first opening degree threshold and the second opening degree threshold.
In one embodiment, the detection module 1301 may perform face feature point positioning on each frame of the face image to obtain each face feature image; respectively inputting each face feature image into a preset eye feature point positioning model to obtain eye feature points in each frame of the face image;
wherein the training process of the eye feature point positioning model comprises the following steps: acquiring a pixel value of each pixel point of a target image and a feature vector of each pixel point; configuring a tree-structured part model according to the pixel values and the feature vectors, and determining a score function for the part configuration L, wherein the score function is S(I, L, m) = App_m(I, L) + Shape_m(L) + α_m; obtaining the optimal configuration parameters of each part of each mixture type by calculating the values of L and m that maximize the score function; establishing a training sample set, wherein the training sample set comprises positive samples and negative samples provided with labels, the positive samples being images containing a human face and the negative samples being images not containing a human face; constructing a target vector according to the part model, the elastic parameters and the mixed offset scalar, and modifying the score function according to the target vector; and learning the eye feature point positioning model according to the training sample set, the optimal configuration parameters, the modified score function and a predefined target prediction function;
wherein
App_m(I, L) = Σ_i w_i^m · φ(I, l_i), summed over the parts of mixture m, and
Shape_m(L) = Σ_(i,j) ( a·dx² + b·dx + c·dy² + d·dy ), summed over the edges (i, j) of the tree, with dx = x_i − x_j and dy = y_i − y_j;
I denotes the target image, l_i = (x_i, y_i) represents the position of the ith pixel point of the target image, φ(I, l_i) represents the feature vector at pixel l_i of the target image I, w represents the part model (the part model is obtained by modeling each facial feature in the target image as a part), m indicates the mixture type of the tree structure, a, b, c and d represent elastic parameters, and α represents a mixed offset scalar.
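For concreteness, the sketch below evaluates a score of this form for a single mixture on a toy two-part tree; the feature vectors, filter weights, tree layout and elastic parameters are all made-up values for illustration and are not taken from the patent.

import numpy as np

def score(features, weights, positions, edges, elastic, alpha):
    """S(I, L, m) = App_m(I, L) + Shape_m(L) + alpha_m for one mixture m.
    features[i]: feature vector phi(I, l_i) at the position of part i
    weights[i]:  filter w_i^m for part i
    positions[i]: pixel position l_i = (x_i, y_i) of part i
    edges: (i, j) pairs connected in the tree
    elastic[(i, j)]: elastic parameters (a, b, c, d) of that edge
    """
    app = sum(float(np.dot(weights[i], features[i])) for i in range(len(features)))
    shape = 0.0
    for i, j in edges:
        dx = positions[i][0] - positions[j][0]
        dy = positions[i][1] - positions[j][1]
        a, b, c, d = elastic[(i, j)]
        shape += a * dx ** 2 + b * dx + c * dy ** 2 + d * dy
    return app + shape + alpha

# Toy example: two parts connected by one edge.
feats = [np.array([0.2, 0.1]), np.array([0.3, 0.4])]
w = [np.array([1.0, 0.5]), np.array([0.8, 0.2])]
pos = [(10, 10), (12, 14)]
print(score(feats, w, pos, edges=[(0, 1)], elastic={(0, 1): (-0.1, 0.0, -0.1, 0.0)}, alpha=0.5))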
In one embodiment, the detection module 1301 may extract a first DPM feature map, where the first DPM feature map is the DPM feature map of the current face image and the current face image is any one frame of face image. The module performs sampling processing on the first DPM feature map and extracts a second DPM feature map, where the second DPM feature map is the DPM feature map of the image obtained by the sampling processing. It then performs a convolution operation on the first DPM feature map with a pre-trained root filter to obtain a response map of the root filter, and performs a convolution operation on the second DPM feature map, which is at N times the resolution, with a pre-trained component filter to obtain a response map of the component filter, where the resolution of the component filter is N times that of the root filter and N is a positive integer. Finally, it obtains a target response map from the response map of the root filter and the response map of the component filter, and acquires the current face feature image from the target response map.
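The sketch below illustrates how root and component filter responses can be combined in a DPM-style detector using simple 2D cross-correlation; the random filters, the single-channel feature maps, the choice of N = 2, and the use of scipy for correlation and resampling are illustrative assumptions rather than the patent's exact procedure.

import numpy as np
from scipy.signal import correlate2d
from scipy.ndimage import zoom

rng = np.random.default_rng(0)
feat_root = rng.random((32, 32))          # first DPM feature map at root resolution (single channel for simplicity)
feat_part = zoom(feat_root, 2, order=1)   # second DPM feature map at N = 2 times the resolution

root_filter = rng.random((6, 6))          # stands in for a pre-trained root filter
part_filter = rng.random((6, 6))          # stands in for one pre-trained component filter

root_response = correlate2d(feat_root, root_filter, mode='same')   # response map of the root filter
part_response = correlate2d(feat_part, part_filter, mode='same')   # response map of the component filter

# Bring the component response back to root resolution and sum to form a target response map.
part_at_root = zoom(part_response, 0.5, order=1)
target_response = root_response + part_at_root
print(target_response.shape)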
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 14. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a driver fatigue detection method. The display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer device may be a touch layer covering the display screen, a key, a track ball or a touch pad arranged on the housing of the computer device, or an external keyboard, touch pad or mouse, among others.
It will be appreciated by those skilled in the art that the architecture shown in FIG. 14 is only a block diagram of part of the structure associated with the inventive arrangements and does not limit the computer devices to which the inventive arrangements may be applied; a particular computer device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor when executing the computer program implementing the driver fatigue detection method in any of the above embodiments.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, carries out the method of driver fatigue detection in any one of the above embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the above method embodiments. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction between these combinations, they should be considered to be within the scope of this specification.
The above-mentioned embodiments express only several embodiments of the present invention and are described in relatively specific detail, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A driver fatigue detection method, characterized in that the method comprises:
acquiring a face video of a target driver, and respectively detecting the opening degree of human eyes of each frame of face image in the face video to obtain the opening degree value of human eyes in each frame of face image;
determining a first opening threshold value and a second opening threshold value according to each human eye opening value, wherein the first opening threshold value is larger than the second opening threshold value;
according to each human eye opening value, the first opening threshold and the second opening threshold, counting a first image frame value of which the human eye opening value is smaller than or equal to the first opening threshold and a second image frame value of which the human eye opening value is smaller than or equal to the second opening threshold;
and if the ratio of the first image frame value to the second image frame value is greater than a preset fatigue judgment threshold value, judging that the target driver is in a fatigue state.
2. The method of claim 1, wherein the detecting the eye opening degree of each frame of face image in the face video to obtain the eye opening degree value of each frame of face image comprises:
respectively carrying out eye feature point positioning on each frame of the face image to obtain eye feature points in each frame of the face image;
respectively determining a human eye interpupillary distance value and a human eye opening original value in the face image of each frame according to the eye feature points in the face image of each frame;
and respectively determining the human eye opening value in the face image of each frame according to the human eye interpupillary distance value and the human eye opening original value in the face image of each frame.
3. The driver fatigue detection method according to claim 1 or 2, wherein the determining a first opening degree threshold value and a second opening degree threshold value from each of the human eye opening degree values includes:
determining the maximum eye opening degree value according to the eye opening degree values;
and multiplying the maximum opening degree value by a preset first proportional coefficient and a preset second proportional coefficient respectively to obtain the first opening degree threshold value and the second opening degree threshold value.
4. The method for detecting driver fatigue according to claim 2, wherein the performing eye feature point positioning on the face image of each frame to obtain eye feature points in the face image of each frame includes:
respectively carrying out face feature point positioning on the face image of each frame to obtain each face feature image;
respectively inputting each face feature image into a preset eye feature point positioning model to obtain eye feature points in each frame of the face image;
wherein, the training process of the eye feature point positioning model comprises the following steps:
acquiring a pixel value of each pixel point of a target image and a feature vector of each pixel point;
configuring a tree-structured part model according to the pixel values and the feature vectors, and determining a score function for the part configuration L, wherein the score function is S(I, L, m) = App_m(I, L) + Shape_m(L) + α_m;
wherein
App_m(I, L) = Σ_i w_i^m · φ(I, l_i), summed over the parts of mixture m, and
Shape_m(L) = Σ_(i,j) ( a·dx² + b·dx + c·dy² + d·dy ), summed over the edges (i, j) of the tree, with dx = x_i − x_j and dy = y_i − y_j;
I denotes the target image, l_i = (x_i, y_i) represents the position of the ith pixel point of the target image, w represents the part model, m indicates the mixture type of the tree structure, the part model being obtained by modeling each facial feature in the target image as a part, a, b, c and d represent elastic parameters, and α represents a mixed offset scalar; φ(I, l_i) represents the feature vector at pixel l_i on the target image I;
obtaining optimal configuration parameters of each part of each hybrid type by calculating values of L and m which enable the score function to obtain a maximum value;
establishing a training sample set, wherein the training sample set comprises a positive sample and a negative sample which are set with labels, the positive sample is an image containing a human face, and the negative sample is an image not containing the human face;
constructing a target vector according to the partial model, the elasticity parameters and the mixed bias scalar, and modifying the score function according to the target vector;
and learning to obtain the eye characteristic point positioning model according to the training sample set, the optimal configuration parameters, the modified score function and a predefined target prediction function.
5. The method of claim 4, wherein the performing facial feature point location on the facial image of each frame to obtain each facial feature image comprises:
extracting a first DPM feature map, wherein the first DPM feature map is a DPM feature map of a current face image, and the current face image is any one frame of face image; DPM refers to the Deformable Part Model object detection algorithm;
sampling the first DPM feature map, and extracting a second DPM feature map, wherein the second DPM feature map is a DPM feature map of an image obtained by sampling the first DPM feature map;
performing convolution operation on the first DPM characteristic diagram by using a pre-trained root filter to obtain a response diagram of the root filter;
performing a convolution operation on the second DPM feature map, which is at N times the resolution, with a pre-trained component filter to obtain a response map of the component filter, wherein the resolution of the component filter is N times that of the root filter, and N is a positive integer;
obtaining a target response diagram according to the response diagram of the root filter and the response diagram of the component filter;
and acquiring a current face feature image according to the target response image.
6. The method of detecting driver fatigue as set forth in claim 4, wherein the eye feature point location model is:
β = (w, a, b, c, d, α)
min over β and ξ_n ≥ 0 of (1/2)·‖β‖² + C·Σ_n ξ_n
subject to, for all n ∈ pos: β·Φ(I_n, z_n) ≥ 1 − ξ_n
and, for all n ∈ neg and all z: β·Φ(I_n, z) ≤ −1 + ξ_n
wherein β represents the target vector, z_n = {L_n, m_n}, C represents the penalty factor of the objective function, ξ_n represents the penalty term of the nth sample, pos and neg respectively represent a positive sample and a negative sample, and K represents the number of the target vectors.
7. A driver fatigue detecting device, characterized in that the device comprises:
the detection module is used for acquiring a face video of a target driver, and detecting the opening degree of human eyes of each frame of face image in the face video to obtain the opening degree value of human eyes of each frame of face image;
the processing module is used for determining a first opening threshold value and a second opening threshold value according to each human eye opening value, wherein the first opening threshold value is larger than the second opening threshold value;
a counting module, configured to count, according to each of the eye opening values, the first opening threshold and the second opening threshold, a first image frame value of which the eye opening value is smaller than or equal to the first opening threshold, and a second image frame value of which the eye opening value is smaller than or equal to the second opening threshold;
and the judging module is used for judging that the target driver is in a fatigue state if the ratio of the first image frame value to the second image frame value is greater than a preset fatigue judging threshold value.
8. The driver fatigue detection device according to claim 7, characterized in that:
the detection module positions eye feature points of each frame of the face image to obtain eye feature points of each frame of the face image, determines a human eye pupil distance value and a human eye opening original value of each frame of the face image according to the eye feature points of each frame of the face image, and determines a human eye opening value of each frame of the face image according to the human eye pupil distance value and the human eye opening original value of each frame of the face image.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
CN201811485916.5A 2018-12-06 2018-12-06 Driver fatigue detection method, driver fatigue detection device, computer equipment and storage medium Active CN111291590B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811485916.5A CN111291590B (en) 2018-12-06 2018-12-06 Driver fatigue detection method, driver fatigue detection device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811485916.5A CN111291590B (en) 2018-12-06 2018-12-06 Driver fatigue detection method, driver fatigue detection device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111291590A CN111291590A (en) 2020-06-16
CN111291590B true CN111291590B (en) 2021-03-19

Family

ID=71024334

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811485916.5A Active CN111291590B (en) 2018-12-06 2018-12-06 Driver fatigue detection method, driver fatigue detection device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111291590B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112183220B (en) * 2020-09-04 2024-05-24 广州汽车集团股份有限公司 Driver fatigue detection method and system and computer storage medium thereof
CN112528792B (en) * 2020-12-03 2024-05-31 深圳地平线机器人科技有限公司 Fatigue state detection method, device, medium and electronic equipment

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7742621B2 (en) * 2006-06-13 2010-06-22 Delphi Technologies, Inc. Dynamic eye tracking system
CN100462047C (en) * 2007-03-21 2009-02-18 汤一平 Safe driving auxiliary device based on omnidirectional computer vision
CN101732055B (en) * 2009-02-11 2012-04-18 北京智安邦科技有限公司 Method and system for testing fatigue of driver
CN102013013B (en) * 2010-12-23 2013-07-03 华南理工大学广州汽车学院 Fatigue driving monitoring method
CN104013414B (en) * 2014-04-30 2015-12-30 深圳佑驾创新科技有限公司 A kind of Study in Driver Fatigue State Surveillance System based on intelligent movable mobile phone
CN104881955B (en) * 2015-06-16 2017-07-18 华中科技大学 A kind of driver tired driving detection method and system
CN105354985B (en) * 2015-11-04 2018-01-12 中国科学院上海高等研究院 Fatigue driving monitoring apparatus and method
CN105574487A (en) * 2015-11-26 2016-05-11 中国第一汽车股份有限公司 Facial feature based driver attention state detection method
US10262219B2 (en) * 2016-04-21 2019-04-16 Hyundai Motor Company Apparatus and method to determine drowsiness of a driver
CN106530623B (en) * 2016-12-30 2019-06-07 南京理工大学 A kind of fatigue driving detection device and detection method
CN107169437A (en) * 2017-05-11 2017-09-15 南宁市正祥科技有限公司 The method for detecting fatigue driving of view-based access control model

Also Published As

Publication number Publication date
CN111291590A (en) 2020-06-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant