CN111898454A - Weight binarization neural network and transfer learning human eye state detection method and device - Google Patents

Weight binarization neural network and transfer learning human eye state detection method and device

Info

Publication number
CN111898454A
CN111898454A
Authority
CN
China
Prior art keywords: neural network, human eye, network model, predicted, level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010624577.5A
Other languages
Chinese (zh)
Inventor
刘振焘
吴敏
曹卫华
蒋承汕
李锶涵
郝曼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences
Original Assignee
China University of Geosciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences filed Critical China University of Geosciences
Priority to CN202010624577.5A
Publication of CN111898454A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18: Eye characteristics, e.g. of the iris
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217: Validation; Performance evaluation; Active pattern learning techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Ophthalmology & Optometry (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a weight binarization neural network and a transfer learning method and device for human eye state detection. The method comprises the following steps: collecting human eye images and preprocessing them; constructing a weight binarization convolutional neural network model for human eye positioning and predicting the binocular coordinates; constructing a bounding box centered on the binocular coordinates; constructing a weight binarization convolutional neural network model for human eye detection, and training it through transfer learning on a face database and a human eye database; and taking the bounding box as the input of the trained weight binarization human eye detection model to complete human eye state detection. The beneficial effects of the invention are as follows: the method reduces or even overcomes the influence on human eye recognition of head-pose uncertainty, ambient illumination, interference under complex background conditions and occlusion, and improves the robustness of human eye recognition.

Description

Weight binarization neural network and transfer learning human eye state detection method and device
Technical Field
The invention relates to the field of image processing, and in particular to a weight binarization neural network and a transfer learning method and device for human eye state detection.
Background
Current methods for detecting the state of the human eye fall roughly into two categories: feature analysis and pattern classification. Feature-analysis methods rely mainly on geometric features of the eye, such as the iris, pupil, eyelid shape or the aspect ratio of the eye, to distinguish open from closed states, or they judge the proportion of white pixels in the eye image. Such methods depend on accurate eye positioning and are susceptible to environmental interference, which leads to misjudgment. Pattern-classification methods first extract shape or texture features of the eye region, such as local binary pattern features, histogram-of-oriented-gradients features, Haar features and Gabor wavelet features, and then train a classifier, for example a support vector machine, an AdaBoost classifier or a neural network, to learn a classification rule automatically and judge the open or closed state of the eye.
Each class of methods has its advantages, but in practical applications both are susceptible to interference from factors such as illumination, facial pose and image sharpness.
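As a concrete illustration of the feature-analysis approach described above, the widely used eye-aspect-ratio criterion can be computed from a handful of eye landmarks. The sketch below is not taken from the patent; the six-landmark layout and the coordinate values are assumptions chosen purely for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def eye_aspect_ratio(landmarks):
    """Eye aspect ratio from six (x, y) eye landmarks ordered: left corner,
    upper-left, upper-right, right corner, lower-right, lower-left
    (a common convention; the exact layout is an assumption)."""
    p = list(landmarks)
    vertical = dist(p[1], p[5]) + dist(p[2], p[4])   # two lid-to-lid distances
    horizontal = dist(p[0], p[3])                    # corner-to-corner width
    return vertical / (2.0 * horizontal)


# An open eye yields a larger ratio than a nearly closed one.
open_eye = [(0, 0), (1, 2), (3, 2), (4, 0), (3, -2), (1, -2)]
closed_eye = [(0, 0), (1, 0.2), (3, 0.2), (4, 0), (3, -0.2), (1, -0.2)]
print(eye_aspect_ratio(open_eye))                                 # 1.0
print(eye_aspect_ratio(closed_eye) < eye_aspect_ratio(open_eye))  # True
```

Because the ratio collapses when the landmarks themselves are mislocated, this criterion inherits the sensitivity to positioning errors noted above.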
Disclosure of Invention
In view of the above, the invention provides a weight binarization neural network and transfer learning method for human eye state detection, which greatly reduces or even overcomes head-pose uncertainty, the influence of ambient illumination, interference under complex background conditions, the influence of occlusion and similar problems. The method provided by the invention is highly robust and adapts well to environmental changes; it comprises the following steps:
s101: collecting RGB images of the human face by using a camera;
s102: preprocessing the face RGB image to obtain a preprocessed face image, constructing a weight binarization convolutional neural network model for human eye positioning, and training the weight binarization convolutional neural network model for human eye positioning by using a face database; the weight binarization convolutional neural network model for human eye positioning comprises four levels; the first level comprises three convolutional neural networks, F1, LE1 and RE1; the second level comprises five convolutional neural networks, F2, LE21, LE22, RE21 and RE22; the third level comprises three convolutional neural networks, F3, LE3 and RE3; the fourth level comprises two convolutional neural networks, LE4 and RE4;
s103: the preprocessed human face image is used as the input of the weight binarization convolutional neural network model for human eye positioning, the output of the weight binarization convolutional neural network model for human eye positioning is the final prediction coordinate of human eyes, and human eye positioning is completed;
s104: with the final predicted coordinates of the human eyes as the center, constructing a cutting frame to cut the human eye area to obtain a finally extracted human eye image;
s105: constructing a weight binarization cascaded convolutional neural network model for human eye state detection, wherein the weight binarization cascaded convolutional neural network model for human eye state detection comprises six convolutional layers, two pooling layers and two full-connection layers;
s106: sequentially training a weight binarization cascade convolution neural network model for human eye state detection by using a human face database and a human eye state database to obtain a trained weight binarization cascade convolution neural network model for human eye state detection;
s107: and inputting the finally extracted human eye image in the step S104 into the trained cascaded convolutional neural network model with the weight binarization for human eye state detection to obtain the final state of the human eye.
Further, in step S102, the preprocessing of the face RGB image to obtain the preprocessed face image specifically comprises: performing gray-level transformation on the face RGB image to obtain a face gray-level image; and cropping the face gray-level image to obtain the left-face image and the right-face image respectively.
Further, step S103 specifically includes:
s201: inputting the face gray level image to F1 to obtain F1 predicted binocular coordinates; inputting the left face image into LE1, and obtaining left eye coordinates predicted by LE 1; inputting the right face image into RE1 to obtain the right eye coordinate predicted by RE 1;
s202: correspondingly adding the binocular coordinate predicted by F1, the left eye coordinate predicted by LE1 and the right eye coordinate predicted by RE1, and dividing by 2 to obtain the binocular coordinate finally predicted by the first level of the weight binarization convolution neural network model for human eye positioning;
s203: presetting a bounding box by taking the finally predicted binocular coordinate of the first level of the weight binarization convolutional neural network model for human eye positioning as the center, and taking the bounding box as the input of F2 of the first level of the weight binarization convolutional neural network model for human eye positioning to obtain the predicted binocular coordinate of F2; presetting a bounding box by taking the left-eye coordinate predicted by LE1 as a center, and taking the bounding box as the input of LE21 and LE22 of a second level of the weight binarization convolutional neural network model for human eye positioning to obtain the left-eye coordinate predicted by LE21 and LE 22; presetting a boundary box by taking the right-eye coordinate predicted by the RE1 as a center, and taking the boundary box as the input of RE21 and RE22 of a second level of the weight binarization convolutional neural network model for human eye positioning to obtain the right-eye coordinate predicted by the RE21 and RE 22;
s204: correspondingly adding the binocular coordinates predicted by the F2, the left eye coordinates predicted by the LE21 and the LE22 and the right eye coordinates predicted by the RE21 and the RE22, and dividing by 3 to obtain the binocular coordinates finally predicted by the second level of the weight binarization convolutional neural network model for human eye positioning;
s205: constructing a bounding box centered on the binocular coordinates finally predicted by the second level, and taking the bounding box as the input of the third-level F3 to obtain the binocular coordinates predicted by F3; averaging the left-eye coordinates predicted by LE21 and LE22 (their sum divided by 2), constructing a bounding box centered on the average as the input of the third-level LE3, and obtaining the left-eye coordinates predicted by LE3; averaging the right-eye coordinates predicted by RE21 and RE22 (their sum divided by 2), constructing a bounding box centered on the average as the input of the third-level RE3, and obtaining the right-eye coordinates predicted by RE3;
s206: correspondingly adding the binocular coordinates predicted by the third-level F3, the left-eye coordinates predicted by the third-level LE3 and the right-eye coordinates predicted by the third-level RE3, and dividing by 2 to obtain the binocular coordinates finally predicted by the third level;
s207: constructing a bounding box by taking the left eye coordinate in the binocular coordinate finally predicted by the third level as the center, wherein the bounding box is used as the input of the fourth level LE4 to obtain the left eye coordinate predicted by the fourth level LE 4; taking the right-eye coordinate in the binocular coordinate finally predicted by the third level as the center, constructing a bounding box as the input of the RE4 of the fourth level, and obtaining the right-eye coordinate predicted by the RE4 of the fourth level; the left eye coordinate predicted by the fourth level LE4 and the right eye coordinate predicted by the fourth level RE4 jointly form the final predicted coordinate of the human eye output by the weight binarization convolution neural network model of the human eye positioning.
Further, in step S105, a cascaded convolutional neural network model for weight binarization for human eye state detection is constructed, specifically: the weight binarization cascade convolution neural network model for human eye state detection comprises two cascade convolution neural network models which are respectively a main weight binarization cascade convolution neural network model and a secondary weight binarization cascade convolution neural network model; the structure of the cascade convolution neural network model for the primary weight binarization is the same as that of the cascade convolution neural network model for the secondary weight binarization.
Further, step S106 specifically includes:
s301: pre-training the secondary weight binarization cascade convolution neural network model by using a face image database with a large number of samples to obtain secondary weight binarization cascade convolution neural network model initial parameters;
s302: transmitting the initial parameters of the secondary weight binarization cascaded convolutional neural network model to the primary weight binarization cascaded convolutional neural network model through transfer learning to obtain a primary weight binarization cascaded convolutional neural network model with the initial parameters;
s303: retraining the primary weight binarization cascaded convolutional neural network model with the initial parameters by using an image database annotated with human eye states to obtain the trained primary weight binarization cascaded convolutional neural network model; the trained primary weight binarization cascaded convolutional neural network model is the trained weight binarization cascaded convolutional neural network model for human eye state detection.
A storage device stores instructions and data for implementing the weight binarization neural network and transfer learning human eye state detection method.
A human eye state detection device based on a weight binarization convolutional neural network and transfer learning comprises a processor and a storage device; the processor loads and executes the instructions and data in the storage device to implement the weight binarization neural network and transfer learning human eye state detection method.
The beneficial effects provided by the invention are as follows: the method reduces or even overcomes the influence on human eye recognition caused by uncertainty of head posture, external environment illumination, interference under complex background conditions and shielding, and improves the robustness of human eye recognition.
Drawings
FIG. 1 is a schematic flow chart of the weight binarization neural network and the transfer learning human eye state detection method of the present invention;
FIG. 2 is a schematic diagram of a weight binarization convolution neural network model structure for human eye positioning according to the present invention;
FIG. 3 is a schematic diagram of a weight binarization convolutional neural network structure for human eye state detection according to the present invention;
FIG. 4 is a schematic diagram of the weight binarization convolutional neural network training process of the present invention;
FIG. 5 is a hardware device operational diagram of an embodiment of the present invention;
FIG. 6 is a schematic diagram of a histogram comparison of accuracy of a conventional human eye detection method and a human eye detection method of the present invention;
FIG. 7 is a table comparing the accuracy of conventional human eye detection methods with that of the human eye detection method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be further described with reference to the accompanying drawings.
Referring to fig. 1, an embodiment of the present invention provides a method for detecting a weight binarization neural network and a transfer learning human eye state, including the following steps:
s101: collecting RGB images of the human face by using a camera;
in this embodiment, a conventional camera acquires the face image at a frame rate of about 30 frames per second, and the image output format is RGB;
s102: preprocessing the face RGB image to obtain a preprocessed face image, constructing a weight binarization convolutional neural network model for human eye positioning, and training the weight binarization convolutional neural network model for human eye positioning by using a face database; the weight binarization convolutional neural network model for human eye positioning comprises four levels; the first level comprises three convolutional neural networks, F1, LE1 and RE1; the second level comprises five convolutional neural networks, F2, LE21, LE22, RE21 and RE22; the third level comprises three convolutional neural networks, F3, LE3 and RE3; the fourth level comprises two convolutional neural networks, LE4 and RE4;
referring to fig. 2, fig. 2 is a schematic structural diagram of a weight binarization convolutional neural network model for human eye positioning according to the present invention;
in this embodiment, the database used to train the weight binarization convolutional neural network model for human eye positioning is the Labeled Faces in the Wild (LFW) database;
in this embodiment, the weights are binarized; specifically, each weight is limited to 1 or -1;
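A minimal sketch of the constraint just described: each real-valued weight is replaced by its sign, so only the values 1 and -1 remain. Pure-Python lists stand in for weight tensors here; sending exact zeros to +1 is an implementation choice, not specified by the patent.

```python
def binarize_weights(weights):
    """Map every real-valued weight in a 2-D weight matrix to +1 or -1.
    Zeros are sent to +1 here (an implementation choice)."""
    return [[1.0 if w >= 0 else -1.0 for w in row] for row in weights]


w = [[0.37, -1.20],
     [0.00, -0.05]]
print(binarize_weights(w))  # [[1.0, -1.0], [1.0, -1.0]]
```

In common binarized-network training schemes (e.g. BinaryConnect), full-precision weights are retained for the gradient update and binarized only in the forward and backward passes; the patent does not detail its training rule, so that part is omitted here.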
s103: the preprocessed human face image is used as the input of the weight binarization convolutional neural network model for human eye positioning, the output of the weight binarization convolutional neural network model for human eye positioning is the final prediction coordinate of human eyes, and human eye positioning is completed;
s104: with the final predicted coordinates of the human eyes as the center, constructing a cutting frame to cut the human eye area to obtain a finally extracted human eye image;
s105: constructing a weight binarization cascaded convolutional neural network model for human eye state detection; for the specific structure refer to fig. 3, a schematic structural diagram of the weight binarization convolutional neural network for human eye state detection;
s106: sequentially training a weight binarization cascade convolution neural network model for human eye state detection by using a human face database and a human eye state database to obtain a trained weight binarization cascade convolution neural network model for human eye state detection;
in the embodiment, a face database adopted by a weight binarization cascade convolution neural network model for training eye state detection is from a fer2013 face expression database; the data samples of the human eye state database are data samples combined by a CEW database and a ZJU database;
s107: and inputting the finally extracted human eye image in the step S104 into the trained cascaded convolutional neural network model with the weight binarization for human eye state detection to obtain the final state of the human eye.
In step S102, the preprocessing of the face RGB image to obtain the preprocessed face image specifically comprises: performing gray-level transformation on the face RGB image to obtain a face gray-level image; and cropping the face gray-level image to obtain the left-face image and the right-face image respectively.
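The preprocessing step above can be sketched as follows. The BT.601 luma weights and the equal left/right split are assumptions for illustration; the patent only specifies a gray-level transformation followed by cropping into left-face and right-face images.

```python
def preprocess_face(rgb_image):
    """Convert an RGB pixel grid (rows of (r, g, b) tuples) to gray levels,
    then split the gray image into left and right halves."""
    # Gray-level transform: ITU-R BT.601 luma weights (a common choice,
    # assumed here; the patent does not specify the transform).
    gray = [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]
    mid = len(gray[0]) // 2
    left = [row[:mid] for row in gray]    # left-face image
    right = [row[mid:] for row in gray]   # right-face image
    return gray, left, right


img = [[(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)],
       [(255, 255, 255)] * 4]
gray, left, right = preprocess_face(img)
print(len(left[0]), len(right[0]))  # 2 2
```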
Step S103 specifically includes:
s201: inputting the face gray level image to F1 to obtain F1 predicted binocular coordinates; inputting the left face image into LE1, and obtaining left eye coordinates predicted by LE 1; inputting the right face image into RE1 to obtain the right eye coordinate predicted by RE 1;
s202: correspondingly adding the binocular coordinate predicted by F1, the left eye coordinate predicted by LE1 and the right eye coordinate predicted by RE1, and dividing by 2 to obtain the binocular coordinate finally predicted by the first level of the weight binarization convolution neural network model for human eye positioning;
s203: presetting a bounding box by taking the finally predicted binocular coordinate of the first level of the weight binarization convolutional neural network model for human eye positioning as the center, and taking the bounding box as the input of F2 of the first level of the weight binarization convolutional neural network model for human eye positioning to obtain the predicted binocular coordinate of F2; presetting a bounding box by taking the left-eye coordinate predicted by LE1 as a center, and taking the bounding box as the input of LE21 and LE22 of a second level of the weight binarization convolutional neural network model for human eye positioning to obtain the left-eye coordinate predicted by LE21 and LE 22; presetting a boundary box by taking the right-eye coordinate predicted by the RE1 as a center, and taking the boundary box as the input of RE21 and RE22 of a second level of the weight binarization convolutional neural network model for human eye positioning to obtain the right-eye coordinate predicted by the RE21 and RE 22;
s204: correspondingly adding the binocular coordinates predicted by the F2, the left eye coordinates predicted by the LE21 and the LE22 and the right eye coordinates predicted by the RE21 and the RE22, and dividing by 3 to obtain the binocular coordinates finally predicted by the second level of the weight binarization convolutional neural network model for human eye positioning;
s205: constructing a bounding box centered on the binocular coordinates finally predicted by the second level, and taking the bounding box as the input of the third-level F3 to obtain the binocular coordinates predicted by F3; averaging the left-eye coordinates predicted by LE21 and LE22 (their sum divided by 2), constructing a bounding box centered on the average as the input of the third-level LE3, and obtaining the left-eye coordinates predicted by LE3; averaging the right-eye coordinates predicted by RE21 and RE22 (their sum divided by 2), constructing a bounding box centered on the average as the input of the third-level RE3, and obtaining the right-eye coordinates predicted by RE3;
s206: correspondingly adding the binocular coordinates predicted by the third-level F3, the left-eye coordinates predicted by the third-level LE3 and the right-eye coordinates predicted by the third-level RE3, and dividing by 2 to obtain the binocular coordinates finally predicted by the third level;
s207: constructing a bounding box by taking the left eye coordinate in the binocular coordinate finally predicted by the third level as the center, wherein the bounding box is used as the input of the fourth level LE4 to obtain the left eye coordinate predicted by the fourth level LE 4; taking the right-eye coordinate in the binocular coordinate finally predicted by the third level as the center, constructing a bounding box as the input of the RE4 of the fourth level, and obtaining the right-eye coordinate predicted by the RE4 of the fourth level; the left eye coordinate predicted by the fourth level LE4 and the right eye coordinate predicted by the fourth level RE4 jointly form the final predicted coordinate of the human eye output by the weight binarization convolution neural network model of the human eye positioning.
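The fusion rule used at each level in steps S202, S204 and S206 is an element-wise average of coordinate predictions: divide by 2 when two networks vote on an eye, by 3 when three do. A minimal sketch, with coordinate values chosen purely for illustration:

```python
def average_predictions(preds):
    """Element-wise mean of several (x, y) coordinate predictions."""
    n = len(preds)
    return (sum(p[0] for p in preds) / n, sum(p[1] for p in preds) / n)


# Level 1, left eye (S202): fuse the whole-face network F1 with the
# left-face network LE1 -- two predictions, so divide by 2.
f1_left, le1 = (30.0, 42.0), (32.0, 40.0)
print(average_predictions([f1_left, le1]))  # (31.0, 41.0)

# Level 2, left eye (S204): fuse F2, LE21 and LE22 -- divide by 3.
f2_left, le21, le22 = (31.0, 41.0), (30.5, 41.5), (31.5, 40.5)
print(average_predictions([f2_left, le21, le22]))  # (31.0, 41.0)
```

Each fused coordinate then becomes the center of the bounding box cropped for the next, finer level of the cascade.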
In step S105, a cascaded convolutional neural network model for weight binarization for human eye state detection is constructed, specifically: the weight binarization cascade convolution neural network model for human eye state detection comprises two cascade convolution neural network models which are respectively a main weight binarization cascade convolution neural network model and a secondary weight binarization cascade convolution neural network model; the structure of the cascade convolutional neural network model for the primary weight binarization and the structure of the cascade convolutional neural network model for the secondary weight binarization are the same, namely the structure shown in fig. 3.
Referring to fig. 4, fig. 4 is a schematic diagram illustrating a training process of a weight binarization convolutional neural network according to the present invention;
step S106 specifically includes:
s301: pre-training the secondary weight binarization cascade convolution neural network model by using a face image database with a large number of samples to obtain secondary weight binarization cascade convolution neural network model initial parameters;
s302: transmitting the initial parameters of the secondary weight binarization cascaded convolutional neural network model to the primary weight binarization cascaded convolutional neural network model through transfer learning to obtain a primary weight binarization cascaded convolutional neural network model with the initial parameters;
s303: retraining the primary weight binarization cascaded convolutional neural network model with the initial parameters by using an image database annotated with human eye states to obtain the trained primary weight binarization cascaded convolutional neural network model; the trained primary weight binarization cascaded convolutional neural network model is the trained weight binarization cascaded convolutional neural network model for human eye state detection.
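Steps S301 to S303 amount to copying pretrained parameters into a structurally identical model and then fine-tuning it. A dictionary-based sketch (the layer names and parameter values are illustrative assumptions, not from the patent):

```python
def transfer_parameters(source, target, layer_names):
    """Copy pretrained parameters for the named layers from a source model
    into a target model with the same structure (S302). Lists are copied
    so later fine-tuning of the target does not mutate the source."""
    for name in layer_names:
        target[name] = list(source[name])
    return target


# S301: parameters obtained by pretraining the secondary model on faces.
secondary = {"conv1": [0.5, -0.3], "fc1": [0.1, 0.9]}
# S302: transfer them to the primary model as its initial parameters.
primary = transfer_parameters(secondary, {"conv1": None, "fc1": None},
                              ["conv1", "fc1"])
print(primary["conv1"])  # [0.5, -0.3]
# S303: the primary model would now be retrained on eye-state labels.
```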
Referring to fig. 5, fig. 5 is a schematic diagram of a hardware device according to an embodiment of the present invention, where the hardware device specifically includes: a human eye state detection device 401, a processor 402 and a storage device 403 based on weight binary convolution neural network and transfer learning.
The human eye state detection device 401 based on the weight binarization convolutional neural network and transfer learning implements the weight binarization neural network and transfer learning human eye state detection method.
The processor 402: the processor 402 loads and executes the instructions and data in the storage device 403 to implement the weight binarization neural network and the transfer learning human eye state detection method.
The storage device 403: the storage device 403 stores instructions and data; the storage device 403 is used to implement the weight binarization neural network and the transfer learning human eye state detection method.
Please refer to fig. 6 and 7. FIG. 6 compares the human eye state detection accuracy of feature-extraction methods on the CEW database in an embodiment of the present invention; FIG. 7 compares the accuracy of human eye state detection methods on the ZJU database.
In fig. 6, Open denotes the recognition accuracy for the open-eye state, Closed the accuracy for the closed-eye state, and Average the mean accuracy over both states. Gabor, LBP, HOG and MultiHPOG are traditional feature-extraction methods, while Our Method is the human eye detection method based on the weight binarization convolutional neural network and transfer learning provided by the invention. As the figure shows, LBP and MultiHPOG detect the eye state relatively well, Gabor performs worst, and the accuracy of the proposed method is clearly higher than that of the traditional methods.
The Method column in fig. 7 lists several methods previously proposed for human eye state detection, where Our Method is the weight binarization neural network and transfer learning human eye state detection method proposed by the present invention. The Accuracy column gives each method's human eye state detection accuracy on the ZJU database.
The invention comprehensively considers that, in addition to the influence of a cluttered image background on human eye positioning and state classification, facial features such as eyebrows and lips can also hinder eye positioning and open/closed-state classification. Conventional methods such as cascade classifiers readily make wrong judgments in these cases. Through fine tuning of the parameters of the six convolutional layers, two pooling layers, two fully connected layers and all constituent layers, the invention overcomes the misjudgments that conventional methods are prone to under such conditions.
The invention comprehensively considers training efficiency and sample size in model training. Compared with conventional convolutional neural network methods, it overcomes problems such as overly long training times and insufficient training samples through transfer learning and weight binarization, so that time cost is reduced and recognition efficiency is improved while high-accuracy human eye state recognition is achieved.
The invention comprehensively considers the difficulty of popularization, does not need to wear any physical measuring equipment, does not influence the normal behavior of the detected person, has good universality and can be popularized to the aspects of production operation fatigue detection, automobile driving fatigue detection, aircraft driving attention detection and the like.
The invention thus provides a human eye state detection method based on a weight binarization convolutional neural network and transfer learning. The binarized convolutional neural network in the method can effectively extract human eye state features, and binarization not only reduces the storage footprint of the model but also accelerates computation. Transfer learning applies knowledge learned in a source domain to a target domain; that is, trained model parameters are transferred to a new model to assist its training, thereby improving the training efficiency of the new model.
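The storage saving just claimed follows from replacing each full-precision weight with a single bit. Assuming 32-bit floating-point weights for the full-precision model (the patent does not state the precision, so this is an assumption), the reduction factor works out as:

```python
def binarized_storage_ratio(num_weights, float_bits=32):
    """Ratio of full-precision weight storage (float_bits per weight)
    to binarized storage (1 bit per weight)."""
    return (num_weights * float_bits) / (num_weights * 1)


# With 32-bit floats this is the commonly cited ~32x reduction.
print(binarized_storage_ratio(1_000_000))  # 32.0
```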
The beneficial effects of implementing the invention are as follows: the method reduces or even eliminates the influence on human eye recognition of uncertain head posture, external ambient illumination, interference under complex background conditions, and occlusion, thereby improving the robustness of human eye recognition.
The features of the above-described embodiments of the invention may be combined with each other provided no conflict arises.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (7)

1. A human eye state detection method based on a weight binarization neural network and transfer learning, characterized in that the method specifically comprises the following steps:
s101: collecting RGB images of the human face by using a camera;
S102: preprocessing the face RGB image to obtain a preprocessed face image, constructing a weight binarization convolutional neural network model for human eye positioning, and training it with a face database; the weight binarization convolutional neural network model for human eye positioning comprises four levels: the first level comprises three convolutional neural networks, F1, LE1, and RE1; the second level comprises five convolutional neural networks, F2, LE21, LE22, RE21, and RE22; the third level comprises three convolutional neural networks, F3, LE3, and RE3; the fourth level comprises two convolutional neural networks, LE4 and RE4;
s103: the preprocessed human face image is used as the input of the weight binarization convolutional neural network model for human eye positioning, the output of the weight binarization convolutional neural network model for human eye positioning is the final prediction coordinate of human eyes, and human eye positioning is completed;
S104: constructing a cropping box centered on the final predicted coordinates of the human eyes, and cropping the eye region to obtain the finally extracted human eye image;
s105: constructing a weight binarization cascaded convolutional neural network model for human eye state detection, wherein the weight binarization cascaded convolutional neural network model for human eye state detection comprises six convolutional layers, two pooling layers and two full-connection layers;
s106: sequentially training a weight binarization cascade convolution neural network model for human eye state detection by using a human face database and a human eye state database to obtain a trained weight binarization cascade convolution neural network model for human eye state detection;
s107: and inputting the finally extracted human eye image in the step S104 into the trained cascaded convolutional neural network model with the weight binarization for human eye state detection to obtain the final state of the human eye.
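Step S104 above (cropping an eye region around a predicted coordinate) can be sketched as follows. The helper name `crop_around` is hypothetical, and the clamping behavior at image borders is an assumption, since the claim does not specify how out-of-bounds boxes are handled.

```python
import numpy as np

def crop_around(image, center, size):
    """Crop a square patch of side `size` centered on a predicted eye
    coordinate (x, y), clamping the box so it stays inside the image."""
    h, w = image.shape[:2]
    x, y = center
    half = size // 2
    x0 = max(0, min(x - half, w - size))   # left edge, clamped to image
    y0 = max(0, min(y - half, h - size))   # top edge, clamped to image
    return image[y0:y0 + size, x0:x0 + size]
```

The clamped patch always has the full `size x size` shape expected by the downstream state-detection network, even when the predicted eye lies near an image border.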
2. The weight binarization neural network and transfer learning human eye state detection method as claimed in claim 1, characterized in that: in step S102, preprocessing the face RGB image to obtain the preprocessed face image specifically comprises: performing gray-level transformation on the face RGB image to obtain a face gray-level image; and cropping the face gray-level image to obtain a left face image and a right face image of the face respectively.
3. The method for detecting the state of the human eye through weight binarization neural network and transfer learning as claimed in claim 2, wherein: step S103 specifically includes:
S201: inputting the face gray-level image into F1 to obtain the binocular coordinates predicted by F1; inputting the left face image into LE1 to obtain the left-eye coordinates predicted by LE1; inputting the right face image into RE1 to obtain the right-eye coordinates predicted by RE1;
S202: correspondingly adding the binocular coordinates predicted by F1, the left-eye coordinates predicted by LE1, and the right-eye coordinates predicted by RE1, and dividing by 2 (each eye receives exactly two predictions: one from F1 and one from its dedicated network), obtaining the binocular coordinates finally predicted by the first level of the weight binarization convolutional neural network model for human eye positioning;
S203: presetting a bounding box centered on the binocular coordinates finally predicted by the first level of the weight binarization convolutional neural network model for human eye positioning, and taking it as the input of the second-level F2 to obtain the binocular coordinates predicted by F2; presetting a bounding box centered on the left-eye coordinates predicted by LE1, and taking it as the input of the second-level LE21 and LE22 to obtain the left-eye coordinates predicted by LE21 and LE22; presetting a bounding box centered on the right-eye coordinates predicted by RE1, and taking it as the input of the second-level RE21 and RE22 to obtain the right-eye coordinates predicted by RE21 and RE22;
s204: correspondingly adding the binocular coordinates predicted by the F2, the left eye coordinates predicted by the LE21 and the LE22 and the right eye coordinates predicted by the RE21 and the RE22, and dividing by 3 to obtain the binocular coordinates finally predicted by the second level of the weight binarization convolutional neural network model for human eye positioning;
S205: constructing a bounding box centered on the binocular coordinates finally predicted by the second level, and taking it as the input of the third-level F3 to obtain the binocular coordinates predicted by F3; summing the left-eye coordinates predicted by LE21 and LE22 and dividing by 2, constructing a bounding box centered on the result as the input of the third-level LE3 to obtain the left-eye coordinates predicted by LE3; summing the right-eye coordinates predicted by RE21 and RE22 and dividing by 2, constructing a bounding box centered on the result as the input of the third-level RE3 to obtain the right-eye coordinates predicted by RE3;
S206: correspondingly adding the binocular coordinates predicted by the third-level F3, the left-eye coordinates predicted by the third-level LE3, and the right-eye coordinates predicted by the third-level RE3, and dividing by 2 to obtain the binocular coordinates finally predicted by the third level;
s207: constructing a bounding box by taking the left eye coordinate in the binocular coordinate finally predicted by the third level as the center, wherein the bounding box is used as the input of the fourth level LE4 to obtain the left eye coordinate predicted by the fourth level LE 4; taking the right-eye coordinate in the binocular coordinate finally predicted by the third level as the center, constructing a bounding box as the input of the RE4 of the fourth level, and obtaining the right-eye coordinate predicted by the RE4 of the fourth level; the left eye coordinate predicted by the fourth level LE4 and the right eye coordinate predicted by the fourth level RE4 jointly form the final predicted coordinate of the human eye output by the weight binarization convolution neural network model of the human eye positioning.
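The per-level fusion in steps S202, S204, and S206 is an element-wise average of the coordinate predictions produced for an eye at that level: the corresponding coordinates are summed and divided by the number of predictors. A minimal sketch, with a hypothetical `fuse_predictions` helper:

```python
import numpy as np

def fuse_predictions(preds):
    """Element-wise average of several (x, y) coordinate predictions
    made for the same eye by networks at one cascade level."""
    return np.mean(np.asarray(preds, dtype=float), axis=0)
```

At the first level each eye has two predictors (F1 plus LE1 or RE1), so the divisor is 2; at the second level there are three (F2 plus LE21/LE22 or RE21/RE22), so it is 3, matching the divisors stated in the claim.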
4. The method for detecting the state of the human eye through weight binarization neural network and transfer learning as claimed in claim 1, wherein: in step S105, a cascaded convolutional neural network model for weight binarization for human eye state detection is constructed, specifically: the weight binarization cascade convolution neural network model for human eye state detection comprises two cascade convolution neural network models which are respectively a main weight binarization cascade convolution neural network model and a secondary weight binarization cascade convolution neural network model; the structure of the cascade convolution neural network model for the primary weight binarization is the same as that of the cascade convolution neural network model for the secondary weight binarization.
5. The method for detecting the state of the human eye through the weight binarization neural network and the transfer learning as claimed in claim 4, wherein: step S106 specifically includes:
s301: pre-training the secondary weight binarization cascade convolution neural network model by using a face image database with a large number of samples to obtain secondary weight binarization cascade convolution neural network model initial parameters;
s302: transmitting the initial parameters of the secondary weight binarization cascaded convolutional neural network model to the primary weight binarization cascaded convolutional neural network model through transfer learning to obtain a primary weight binarization cascaded convolutional neural network model with the initial parameters;
s303: retraining the cascade convolution neural network model with the primary weight binaryzation of the initial parameters by using an image database marked with the human eye state to obtain a trained cascade convolution neural network model with the primary weight binaryzation; the trained cascaded convolutional neural network model with the main weight binaryzation is the trained cascaded convolutional neural network model with the weight binaryzation for detecting the human eye state.
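The parameter transfer in step S302 amounts to copying the pre-trained secondary model's weights into the identically structured primary model before fine-tuning. A minimal sketch, representing each model as a dict of named weight arrays; this representation and the name/shape matching rule are assumptions for illustration, not the patent's implementation:

```python
import numpy as np

def transfer_parameters(pretrained, target):
    """Copy weights from a pre-trained model into a same-architecture
    target model; only layers whose name and shape both match are
    copied. Returns the list of transferred layer names."""
    transferred = []
    for name, w in pretrained.items():
        if name in target and target[name].shape == w.shape:
            target[name] = w.copy()
            transferred.append(name)
    return transferred
```

After the copy, the primary model starts from the secondary model's parameters instead of a random initialization, which is what shortens its training on the smaller eye-state database.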
6. A storage device, characterized by: the storage device stores instructions and data for realizing the weight binarization neural network and the transfer learning human eye state detection method as claimed in any one of claims 1-5.
7. A human eye state detection device based on a weight binarization convolutional neural network and transfer learning, characterized by comprising: a processor and a storage device; the processor loads and executes the instructions and data in the storage device to realize the weight binarization neural network and transfer learning human eye state detection method as claimed in any one of claims 1 to 5.
CN202010624577.5A 2020-07-02 2020-07-02 Weight binarization neural network and transfer learning human eye state detection method and device Pending CN111898454A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010624577.5A CN111898454A (en) 2020-07-02 2020-07-02 Weight binarization neural network and transfer learning human eye state detection method and device


Publications (1)

Publication Number Publication Date
CN111898454A true CN111898454A (en) 2020-11-06

Family

ID=73191782

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010624577.5A Pending CN111898454A (en) 2020-07-02 2020-07-02 Weight binarization neural network and transfer learning human eye state detection method and device

Country Status (1)

Country Link
CN (1) CN111898454A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112329752A (en) * 2021-01-06 2021-02-05 Tencent Technology (Shenzhen) Co., Ltd. Training method of human eye image processing model, image processing method and device
CN112818938A (en) * 2021-03-03 2021-05-18 Changchun University of Science and Technology Face recognition algorithm and face recognition device adaptive to illumination interference environment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107748858A (en) * 2017-06-15 2018-03-02 South China University of Technology A multi-pose eye localization method based on cascaded convolutional neural networks
CN108614999A (en) * 2018-04-16 2018-10-02 Guizhou University Eye open/closed state detection method based on deep learning
CN110738071A (en) * 2018-07-18 2020-01-31 Zhejiang Zhongzheng Intelligent Technology Co., Ltd. Face algorithm model training method based on deep learning and transfer learning


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHEN-TAO LIU et al.: "Eye localization based on weight binarization cascade convolution neural network", Neurocomputing *


Similar Documents

Publication Publication Date Title
CN106599883B (en) CNN-based multilayer image semantic face recognition method
WO2020108362A1 (en) Body posture detection method, apparatus and device, and storage medium
Lajevardi et al. Higher order orthogonal moments for invariant facial expression recognition
CN108614999B (en) Eye opening and closing state detection method based on deep learning
CN111652317B (en) Super-parameter image segmentation method based on Bayes deep learning
CN109767422A (en) Pipe detection recognition methods, storage medium and robot based on deep learning
CN112837344B (en) Target tracking method for generating twin network based on condition countermeasure
CN110781829A (en) Light-weight deep learning intelligent business hall face recognition method
CN108830237B (en) Facial expression recognition method
Sajanraj et al. Indian sign language numeral recognition using region of interest convolutional neural network
KR102132407B1 (en) Method and apparatus for estimating human emotion based on adaptive image recognition using incremental deep learning
CN110046544A (en) Digital gesture identification method based on convolutional neural networks
CN114092793B (en) End-to-end biological target detection method suitable for complex underwater environment
CN110728185A (en) Detection method for judging existence of handheld mobile phone conversation behavior of driver
CN110889397A (en) Visual relation segmentation method taking human as main body
CN111898454A (en) Weight binarization neural network and transfer learning human eye state detection method and device
CN109815887B (en) Multi-agent cooperation-based face image classification method under complex illumination
CN116071575A (en) Multi-mode data fusion-based student classroom abnormal behavior detection method and detection system
García et al. Pollen grains contour analysis on verification approach
CN111898473B (en) Driver state real-time monitoring method based on deep learning
CN111553202B (en) Training method, detection method and device for neural network for living body detection
CN114038035A (en) Artificial intelligence recognition device based on big data
Karim et al. Bangla Sign Language Recognition using YOLOv5
CN111353353A (en) Cross-posture face recognition method and device
Venkatesan et al. Advanced classification using genetic algorithm and image segmentation for Improved FD

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination