CN113705460B

CN113705460B - Method, device, equipment and storage medium for detecting open and closed eyes of face in image

Info

Publication number: CN113705460B
Application number: CN202111003044.6A
Authority: CN
Inventors: 洪叁亮
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2021-08-30
Filing date: 2021-08-30
Publication date: 2024-03-15
Anticipated expiration: 2041-08-30
Also published as: CN113705460A

Abstract

The invention relates to artificial intelligence and digital medical technology, and discloses a method for detecting open and closed eyes of a face in an image, which comprises the following steps: performing face detection on the image to be detected by using a face detection model to obtain the face size, the face center offset and the heat of each pixel point; screening out a target image according to the heat, and calculating a face block diagram in the target image according to the face size and the face center offset in the target image; detecting the face key point coordinates in the face block diagram by using a key point detection model; extracting left and right eye corner point coordinates from the face key point coordinates to obtain left and right eye images; and classifying the left and right eye images by using an eye state classification model to obtain eye state categories. In addition, the invention also relates to blockchain technology; for example, the face detection model can be stored in a node of a blockchain. The invention also provides a device, equipment and medium for detecting open and closed eyes of a face in an image. The invention can improve the accuracy of eye state detection in the image.

Description

Method, device, equipment and storage medium for detecting open and closed eyes of face in image

Technical Field

The present invention relates to the field of artificial intelligence and digital medical technology, and in particular, to a method and apparatus for detecting open and closed eyes of a face in an image, an electronic device, and a computer readable storage medium.

Background

During face image acquisition, a captured face image may be unusable because of the state of the user, for example a lapse in concentration that leaves the eyes closed, which makes subsequent face recognition difficult. It is therefore important to judge the open or closed state of the eyes in the captured face image reliably under complex conditions.

Methods commonly used in the industry at present, such as computing the open or closed state of the eyes from the distances between face key points, or classifying that state with deep learning, are easily affected by inaccurate positioning of the face frame and of the key points, which introduces errors into the subsequent eye state classification.

Disclosure of Invention

The invention provides a method and a device for detecting whether eyes are open or closed in an image and a computer readable storage medium, and mainly aims to solve the problem of inaccurate detection of eye states in the image.

In order to achieve the above object, the present invention provides a method for detecting open and closed eyes of a face in an image, comprising:

acquiring an image set to be detected, and calculating image characteristics in each image to be detected in the image set to be detected by using a pre-constructed face detection model;

screening out a target image from the image set to be detected according to the image characteristics, and calculating a face block diagram in the target image;

detecting face key point coordinates in the face block diagram by using a pre-constructed key point detection model;

extracting left and right eye corner point coordinates from the face key point coordinates, and obtaining left and right eye frames by performing outward expansion on the left and right eye corner point coordinates;

and cutting out left and right eye images from the human face block diagram according to the left and right eye frames, and carrying out classification detection on the left and right eye images by utilizing a pre-constructed eye state classification model to obtain eye state types.

Optionally, the calculating, by using a pre-constructed face detection model, image features in each image to be detected in the image set to be detected, to obtain a face size, a face center offset, and a heat of each pixel in each image to be detected includes:

carrying out data enhancement processing on each image to be detected in the image set to be detected to obtain a standard image set to be detected;

counting pixel values of all pixel points of each standard to-be-detected image in the standard to-be-detected image set one by one to obtain a pixel matrix of each standard to-be-detected image;

carrying out convolution, pooling and activation treatment on the pixel matrix by using the face detection model to obtain the heat of each pixel point in the standard image to be detected;

selecting a pixel point with the heat degree larger than a preset threshold value of each pixel point in the standard image to be detected as a human face pixel point, and calculating according to the human face pixel points to obtain the human face size of the standard image to be detected;

calculating the face center offset in the standard image to be detected according to the face pixel points;

and summarizing the heat degree, the face size and the face center offset of each pixel point to obtain the image characteristics in each image to be detected.

Optionally, the performing data enhancement processing on each image to be detected in the image set to be detected to obtain a standard image set to be detected includes:

randomly cutting the images to be detected of the image set to be detected one by one;

and carrying out random color dithering on the cut image to be detected, and summarizing the cut and dithered image to obtain a standard image set to be detected.

Optionally, the screening the target image from the image set to be detected according to the image features includes:

averaging the heat of all pixel points of each image to be detected in the image set to be detected to obtain a heat average value;

and screening the image with the heat average value larger than a preset threshold value as a target image.

Optionally, the detecting the coordinates of the key points of the face in the face block diagram by using the pre-constructed key point detection model includes:

extracting features of the face block diagram by utilizing each convolution layer of the key point detection model to obtain key point feature information;

and activating the key point characteristic information by using a regressor to obtain the coordinates of the key points of the human face.

Optionally, the extracting the coordinates of the left and right eye corner points from the coordinates of the face key point includes:

extracting the position information of each key point from the key point characteristic information corresponding to the key point coordinates of the face, and generating a key point data table according to the position information;

constructing an index of the key point data table by using a preset index function;

and searching in the index by using a preset left eye corner point coordinate label and a preset right eye corner point coordinate label, and taking the searched position information as left eye point coordinates and right eye point coordinates.
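The lookup described above can be sketched as a label-keyed index over the key point data table. This is a minimal illustration only: the row layout `(label, x, y)` and the four label strings are hypothetical, since the text does not name the preset labels or the table schema.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# Hypothetical labels: the preset eye-corner labels are not named in the text.
EYE_CORNER_LABELS = (
    "left_eye_left_corner",
    "left_eye_right_corner",
    "right_eye_left_corner",
    "right_eye_right_corner",
)


def build_index(table: List[Tuple[str, float, float]]) -> Dict[str, Point]:
    """Index a key point table of (label, x, y) rows by label."""
    return {label: (x, y) for label, x, y in table}


def eye_corners(index: Dict[str, Point]) -> Dict[str, Point]:
    """Look up the four eye-corner coordinates by their preset labels."""
    return {label: index[label] for label in EYE_CORNER_LABELS}


table = [
    ("nose_tip", 50.0, 60.0),
    ("left_eye_left_corner", 30.0, 40.0),
    ("left_eye_right_corner", 42.0, 40.0),
    ("right_eye_left_corner", 58.0, 40.0),
    ("right_eye_right_corner", 70.0, 40.0),
]
corners = eye_corners(build_index(table))
```

A dictionary stands in here for whatever index structure the preset index function would build; it gives the same label-to-position lookup.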

Optionally, the classifying detecting the left and right eye images by using the pre-constructed eye state classification model to obtain an eye state class includes:

counting pixel values of all pixel points in the left and right eye images to respectively obtain a pixel matrix of the left eye image and a pixel matrix of the right eye image;

respectively converting the pixel matrix of the left eye image and the pixel matrix of the right eye image into one-dimensional matrixes by using the eye state classification model to obtain the one-dimensional matrixes of the left eye image and the right eye image;

the one-dimensional matrix of the left eye image and the one-dimensional matrix of the right eye image are respectively subjected to convolution and full connection operation for preset times by utilizing a plurality of neurons of a hidden layer in the eye state classification model, so that left eye state information and right eye state information are obtained;

activating the left and right eye state information by using a two-classifier to obtain the open eye state probability and the closed eye state probability of the left and right eyes;

if the eye-open state probability is larger than the eye-closed state probability, determining that the eye state categories of the left eye and the right eye are eye-open;

and if the eye opening state probability is smaller than or equal to the eye closing state probability, determining that the eye state categories of the left eye and the right eye are eye closing.
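The final decision rule in the steps above can be sketched as follows. The sketch assumes the two-class classifier ends in a softmax over two scores ordered as (open, closed); the text only says "two-classifier", so the softmax and the score ordering are assumptions.

```python
import math
from typing import Sequence, Tuple


def softmax2(logits: Sequence[float]) -> Tuple[float, float]:
    """Numerically stable softmax over two scores ordered as (open, closed)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return exps[0] / total, exps[1] / total


def eye_state(logits: Sequence[float]) -> str:
    """Return 'open' only if P(open) is strictly greater than P(closed)."""
    p_open, p_closed = softmax2(logits)
    return "open" if p_open > p_closed else "closed"


left_state = eye_state([2.0, 0.5])
right_state = eye_state([0.1, 1.3])
```

Note that a tie falls on the closed side, matching the "smaller than or equal to" branch of the rule.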

In order to solve the above problem, the present invention further provides an apparatus for detecting open and closed eyes of a face in an image, the apparatus comprising:

the image detection module is used for calculating image characteristics in each image to be detected in the image set to be detected by utilizing a pre-constructed face detection model;

the face block diagram acquisition module is used for screening out a target image from the image set to be detected according to the image characteristics and calculating a face block diagram in the target image;

the face key point acquisition module is used for detecting face key point coordinates in the face block diagram by utilizing a pre-constructed key point detection model;

the left eye socket and right eye socket acquisition module is used for extracting left and right eye corner point coordinates from the face key point coordinates, and obtaining left and right eye frames by carrying out outward expansion on the left and right eye corner point coordinates;

and the eye state acquisition module is used for cutting out left and right eye images from the face block diagram according to the left and right eye frames, and classifying and detecting the left and right eye images by utilizing a pre-constructed eye state classification model to obtain eye state types.

In order to solve the above-mentioned problems, the present invention also provides an electronic apparatus including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the above method for detecting open and closed eyes of a face in an image.

In order to solve the above-mentioned problems, the present invention also provides a computer-readable storage medium having stored therein at least one computer program that is executed by a processor in an electronic device to implement the above method for detecting open and closed eyes of a face in an image.

According to the embodiment of the invention, the image to be detected is processed by the face detection model to obtain the face block diagram, and the key point detection model then performs face positioning and face key point extraction on the face block diagram, so that the eye region of the face is obtained more accurately and with a lower false detection rate; the eye state classification model classifies the state of the eye images taken from the expanded eye regions, so that detection is more accurate and more robust. Therefore, the method, the device, the electronic equipment and the computer readable storage medium for detecting open and closed eyes of a face in an image can solve the problem of inaccurate detection of the eye state in the image.

Drawings

FIG. 1 is a flow chart of a method for detecting open and closed eyes of a face in an image according to an embodiment of the present invention;

FIG. 2 is a flowchart illustrating a method for calculating image features in an image to be detected according to an embodiment of the present invention;

FIG. 3 is a flow chart for determining eye state categories according to an embodiment of the present invention;

FIG. 4 is a functional block diagram of a device for detecting open and closed eyes of a face in an image according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of an electronic device for implementing the method for detecting open and closed eyes of a face in an image according to an embodiment of the present invention.

The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.

Detailed Description

It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

The embodiment of the application provides a method for detecting open and closed eyes of a face in an image. The execution body of the method includes, but is not limited to, at least one of a server, a terminal and other electronic devices that can be configured to execute the method provided by the embodiment of the application. In other words, the method can be executed by software or hardware installed in a terminal device or a server device, and the software can be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like. The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), big data and artificial intelligence platforms.

Referring to FIG. 1, a flow chart of a method for detecting open and closed eyes of a face in an image according to an embodiment of the present invention is shown. In this embodiment, the method includes:

s1, acquiring an image set to be detected, and calculating image characteristics in each image to be detected in the image set to be detected by using a pre-constructed face detection model;

In the embodiment of the invention, the image features comprise a face size, a face center offset and the heat of each pixel point. The face size comprises the width and the height of the face area, that is, the size of the face frame in the image to be detected. The heat of each pixel point indicates how strongly that pixel point of the image to be detected responds as a face pixel point. The face center offset is the offset between the face center pixel point located from the heat and the real face center pixel point.

In the embodiment of the invention, the pre-constructed face detection model can be composed of a MobileNet V2 neural network and a UNet neural network with improved structures.

In detail, the improved MobileNetV2 neural network is a lightweight convolutional neural network in which the three-layer network structure of the full MobileNetV2 neural network is removed, while its linear bottleneck module (Linear Bottleneck Block) and inverted residual (Inverted Residual) module are retained, which can improve the feature expression capability of the network and in turn the accuracy of the face detection model.

Optionally, the UNet neural network is a fully convolutional neural network. Its left-hand convolution network is a feature extraction network that uses convolution (conv) and pooling; its right-hand convolution network is a feature fusion network that performs skip connection (concatenate) operations between the feature maps generated by up-sampling and the feature maps obtained by convolution in the left-hand network. This network helps to increase the image processing speed and to better retain the image features.

In the embodiment of the present invention, before calculating the image features in each image to be detected in the image set to be detected by using the pre-constructed face detection model, the method further includes:

acquiring a preset training image set, a real face center offset corresponding to the training image set, a real face size and real heat of each pixel point;

generating a predicted face center offset, a predicted face size and a predicted heat of each pixel point of each training image in the training image set by using a pre-constructed face detection model;

calculating a loss value between the predicted heat of each pixel point and the real heat of each pixel point as a first loss value, calculating a loss value between the real face center offset and the predicted face center offset as a second loss value, and calculating a loss value between the predicted face size and the real face size as a third loss value;

and optimizing the face detection model by using the first loss value, the second loss value and the third loss value to obtain a pre-constructed face detection model.

In the embodiment of the invention, the training image set is an image collected in advance before training a model. The real face center offset, the real face size and the real heat of each pixel point corresponding to the training image set are calibrated manually by service personnel for training a model.

In the embodiment of the present invention, the optimizing the face detection model by using the first loss value, the second loss value and the third loss value to obtain a pre-constructed face detection model includes:

calculating a series loss value of the first, second and third loss values using a series loss function;

when the series loss value is larger than a preset loss threshold value, optimizing the face detection model by using a preset optimization algorithm to obtain an optimized face detection model;

calculating a predicted face center offset, a predicted face size and a predicted heat of each pixel point of each training image in the training image set by using the optimized face detection model, calculating a loss value between the predicted heat of each pixel point and a true heat of each pixel point as a first loss value, calculating a loss value between the true face center offset and the predicted face center offset as a second loss value, calculating a loss value between the predicted face size and the true face size as a third loss value, and returning to the step of calculating a series loss value of the first loss value, the second loss value and the third loss value by using a series loss function;

and when the series loss value is smaller than or equal to the preset loss threshold value, obtaining a standard face detection model.
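The control flow of this optimization loop can be sketched as below. The series loss function is not defined in the text, so a weighted sum of the three loss values is assumed here; `compute_losses` and `optimize_step` are hypothetical callables standing in for the model's forward pass and for one step of the preset optimization algorithm.

```python
from typing import Callable, Tuple

Losses = Tuple[float, float, float]  # (heat loss, center-offset loss, size loss)


def series_loss(losses: Losses, weights: Losses = (1.0, 1.0, 1.0)) -> float:
    """Assumed combination: a weighted sum of the three loss values."""
    return sum(w * l for w, l in zip(weights, losses))


def train_until_threshold(
    compute_losses: Callable[[], Losses],
    optimize_step: Callable[[], None],
    loss_threshold: float,
    max_steps: int = 1000,
) -> int:
    """Optimize while the series loss exceeds the threshold; return steps taken."""
    steps = 0
    while series_loss(compute_losses()) > loss_threshold and steps < max_steps:
        optimize_step()
        steps += 1
    return steps


# Toy stand-in for a model: every optimization step halves all three losses.
state = {"scale": 1.0}


def toy_losses() -> Losses:
    return (0.4 * state["scale"], 0.3 * state["scale"], 0.3 * state["scale"])


def toy_step() -> None:
    state["scale"] *= 0.5


steps_taken = train_until_threshold(toy_losses, toy_step, loss_threshold=0.2)
```

The `max_steps` guard is an addition of this sketch so that the loop terminates even if the threshold is never reached.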

In an alternative embodiment of the present invention, model training may be performed using a minimum-loss allocation strategy: for each real face frame in an image, among all output predictions (predicted face center offset, predicted face size and predicted heat of each pixel point), only the one prediction with the minimum series loss is selected as the positive sample, and the others are negative samples. Training is iterated 80 times using the positive and negative samples until the learning rate is reduced to a preset learning rate (e.g., 5e-5), and is then iterated a further 80 times until the parameters of the face detection model converge, so as to obtain the standard face detection model.
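The positive/negative assignment in this strategy can be sketched as below, assuming the series loss of every candidate prediction against one real face frame has already been computed. The function name is hypothetical.

```python
from typing import List


def assign_min_loss(losses: List[float]) -> List[bool]:
    """Mark only the candidate with the smallest series loss as positive."""
    if not losses:
        return []
    best = min(range(len(losses)), key=losses.__getitem__)
    return [i == best for i in range(len(losses))]


# Series losses of three candidate predictions for one real face frame.
labels = assign_min_loss([0.9, 0.2, 0.5])
```

Here `True` marks the single positive sample and `False` the negative samples; on ties the first minimum is kept, which is a choice of this sketch.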

In another alternative embodiment of the present invention, when the series loss value is greater than a preset loss threshold, an Adam optimization algorithm is used to optimize the parameters of the face detection model. The Adam optimization algorithm can adaptively adjust the learning rate during training of the face detection model, which makes the face detection model more accurate and improves its performance. For example, when the learning rate is reduced to a preset learning rate of 5e-5, training of the face detection model is finished and a trained face detection model is obtained.

In the embodiment of the present invention, referring to fig. 2, the calculating, by using a pre-constructed face detection model, image features in each image to be detected in the image set to be detected includes:

s11, carrying out data enhancement processing on each image to be detected in the image set to be detected to obtain a standard image set to be detected;

s12, counting pixel values of all pixel points of each standard to-be-detected image in the standard to-be-detected image set one by one to obtain a pixel matrix of each standard to-be-detected image;

s13, carrying out convolution, pooling and activation treatment on the pixel matrix by utilizing the face detection model to obtain the heat of each pixel point in the standard image to be detected;

s14, selecting, among the pixel points in the standard image to be detected, the pixel points whose heat is larger than a preset threshold value as human face pixel points, and calculating the human face size of the standard image to be detected according to the human face pixel points;

s15, calculating the face center offset in the standard image to be detected according to the face pixel points;

and S16, summarizing the heat degree, the face size and the face center offset of each pixel point to obtain the image characteristics in each image to be detected.

Further, the performing data enhancement processing on each image to be detected in the image set to be detected to obtain a standard image set to be detected includes:

In the embodiment of the invention, random cropping cuts a plurality of images at random out of one image, for example by using Python. Random color dithering shifts the colors of an image so that adjacent points differ, producing a color crossing effect; it includes random brightness dithering, random saturation dithering, random contrast dithering and the like. Random brightness dithering produces a crossing effect of lighter and darker areas in the image; random saturation dithering produces a crossing effect of saturation differences in the image; random contrast dithering produces a crossing effect of contrast differences in the image.

In an optional embodiment of the invention, because a neural network has numerous parameters, it tends to overfit when the training data is not rich enough, which seriously harms the generalization capability of the model. Random cropping and random color dithering increase the diversity of the images and augment the image data, so that the neural network detects images more accurately and the generalization capability of the model is improved.
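A minimal sketch of the two augmentations follows, using a grayscale image stored as nested lists with values in [0, 1] so that it needs no image library. Only brightness dithering is shown; the crop size and the jitter range are illustrative assumptions, not values from the text.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


def random_crop(img: Image, out_h: int, out_w: int, rng: random.Random) -> Image:
    """Cut a random out_h x out_w window out of img."""
    h, w = len(img), len(img[0])
    if out_h > h or out_w > w:
        raise ValueError("crop size exceeds image size")
    top = rng.randint(0, h - out_h)
    left = rng.randint(0, w - out_w)
    return [row[left:left + out_w] for row in img[top:top + out_h]]


def brightness_jitter(img: Image, max_delta: float, rng: random.Random) -> Image:
    """Scale all pixels by one random factor in [1 - max_delta, 1 + max_delta]."""
    factor = rng.uniform(1.0 - max_delta, 1.0 + max_delta)
    return [[min(1.0, max(0.0, p * factor)) for p in row] for row in img]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
image = [[(r * 8 + c) / 63.0 for c in range(8)] for r in range(8)]
augmented = brightness_jitter(random_crop(image, 4, 4, rng), 0.2, rng)
```

Saturation and contrast dithering would follow the same pattern on a color image: draw one random factor, apply it, and clamp to the valid range.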

S2, screening out a target image from the image set to be detected according to the image characteristics, and calculating a face block diagram in the target image according to the face center offset in the target image, the heat of each pixel point and the face size;

in the embodiment of the invention, the target image is an image containing a human face; the face block diagram is an image of a frame-selected face obtained by removing the complex background.

In the embodiment of the present invention, the screening the target image from the image set to be detected according to the image feature includes:

averaging the heat of all pixel points of each image to be detected in the image set to be detected;

and screening the image with the average value larger than a preset threshold value as a target image.
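The screening step can be sketched as below, with each heat map given as a nested list of per-pixel heat values. The file names and the threshold of 0.3 are hypothetical; the text does not state the preset threshold for the average.

```python
from typing import Dict, List

Heatmap = List[List[float]]


def mean_heat(heatmap: Heatmap) -> float:
    """Average the heat over all pixel points of one image."""
    values = [v for row in heatmap for v in row]
    return sum(values) / len(values)


def screen_targets(heatmaps: Dict[str, Heatmap], threshold: float) -> List[str]:
    """Keep the images whose mean heat is strictly above the threshold."""
    return [name for name, hm in heatmaps.items() if mean_heat(hm) > threshold]


heatmaps = {
    "a.jpg": [[0.9, 0.8], [0.7, 0.6]],  # strong face response
    "b.jpg": [[0.0, 0.1], [0.0, 0.1]],  # almost no face response
}
targets = screen_targets(heatmaps, threshold=0.3)
```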

In the embodiment of the present invention, the calculating a face block diagram in the target image according to the face center offset in the target image, the heat of each pixel point, and the face size includes:

selecting the face pixel points with the heat degree larger than a preset threshold value, and screening the face pixel points to obtain a central pixel point;

obtaining a face center point according to the center pixel point and the face center offset;

calculating a face frame from the face center point and the width and height given by the face size;

and cutting the target image according to the face frame to obtain a face block diagram.
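The last three steps can be sketched as follows, assuming the center pixel point, the face center offset and the face size are all expressed in pixel units of the target image (any scaling between the heat map and the image is left out). The function names are hypothetical.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def face_frame(center_pixel: Tuple[float, float],
               offset: Tuple[float, float],
               size: Tuple[float, float]) -> Box:
    """Shift the center pixel by the offset, then span width and height around it."""
    cx = center_pixel[0] + offset[0]
    cy = center_pixel[1] + offset[1]
    w, h = size
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def crop(image: List[List[float]], box: Box) -> List[List[float]]:
    """Cut the box out of a row-major image, clamped to the image bounds."""
    height, width = len(image), len(image[0])
    x1, y1, x2, y2 = box
    left, top = max(0, math.floor(x1)), max(0, math.floor(y1))
    right, bottom = min(width, math.ceil(x2)), min(height, math.ceil(y2))
    return [row[left:right] for row in image[top:bottom]]


frame = face_frame((50, 40), (0.5, -0.5), (20, 30))
face = crop([[0.0] * 100 for _ in range(100)], frame)
```

Rounding outward (floor on the top-left, ceil on the bottom-right) is a choice of this sketch so that the crop never cuts into the computed frame.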

Further, in an optional embodiment of the present invention, the filtering the face pixel point to obtain a center pixel point includes:

screening extreme value pixel points of an abscissa and extreme value pixel points of an ordinate from the face pixel points to obtain a first pixel point with the largest abscissa, a second pixel point with the largest ordinate, a third pixel point with the smallest abscissa and a fourth pixel point with the smallest ordinate;

connecting the first pixel point with the third pixel point to obtain a first straight line, and connecting the second pixel point with the fourth pixel point to obtain a second straight line;

and determining the center pixel point of the face pixel point according to the intersection point of the first straight line and the second straight line.
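The construction above amounts to intersecting two lines through the four extreme face pixel points. A minimal sketch, with points given as (x, y) tuples; returning `None` for a degenerate case (parallel lines or coinciding points) is an assumption, since the text does not cover it.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def center_pixel(points: List[Point]) -> Optional[Point]:
    """Intersect the line through the x-extremes with the line through the y-extremes."""
    p1 = max(points, key=lambda p: p[0])  # first point: largest abscissa
    p2 = max(points, key=lambda p: p[1])  # second point: largest ordinate
    p3 = min(points, key=lambda p: p[0])  # third point: smallest abscissa
    p4 = min(points, key=lambda p: p[1])  # fourth point: smallest ordinate
    d1x, d1y = p3[0] - p1[0], p3[1] - p1[1]  # direction of the first line
    d2x, d2y = p4[0] - p2[0], p4[1] - p2[1]  # direction of the second line
    denom = d1x * d2y - d1y * d2x
    if denom == 0:
        return None  # lines are parallel or a line collapses to a point
    t = ((p2[0] - p1[0]) * d2y - (p2[1] - p1[1]) * d2x) / denom
    return (p1[0] + t * d1x, p1[1] + t * d1y)


# A diamond-shaped set of face pixel points centered on (5, 5).
center = center_pixel([(10, 5), (5, 10), (0, 5), (5, 0), (5, 5)])
```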

In the embodiment of the invention, the point with the pixel point heat degree larger than the preset threshold value (for example, 0.35) can be regarded as the human face pixel point.
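Selecting face pixel points with the example threshold of 0.35 can be sketched as below. The heat map layout (a row-major nested list) and the function name are assumptions of the sketch.

```python
from typing import List, Tuple


def face_pixels(heatmap: List[List[float]],
                threshold: float = 0.35) -> List[Tuple[int, int]]:
    """Return (x, y) of every pixel point whose heat is strictly above the threshold."""
    return [
        (x, y)
        for y, row in enumerate(heatmap)
        for x, heat in enumerate(row)
        if heat > threshold
    ]


pixels = face_pixels([[0.10, 0.40],
                      [0.35, 0.90]])
```

A pixel point whose heat equals the threshold exactly is not selected, following the "larger than" wording.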

S3, detecting face key point coordinates in the face block diagram by using a pre-built key point detection model;

In the embodiment of the invention, the key point detection model is similar in structure to the face detection model and can be composed of a MobileNetV2 neural network and a UNet neural network with improved structures. The key point detection model performs regression on the input photo to obtain a plurality of face key points; for example, 486 key points may be output.

In the embodiment of the present invention, before the detection of the coordinates of the key points of the face in the face block diagram by using the pre-constructed key point detection model, the method further includes:

acquiring a preset training image set and preset key point coordinates corresponding to the training image set;

generating predicted key point coordinates of the training image by using a pre-constructed key point detection model;

calculating a loss value between the predicted key point coordinate and the preset key point coordinate as a key point coordinate loss value;

and optimizing the key point detection model by using the key point coordinate loss value to obtain a pre-constructed key point detection model.

Further, the calculation formula of the key point coordinate loss value is as follows:

wherein $L_{off}$ is the key point loss value, $o_k$ is the preset abscissa/ordinate, $\hat{o}_k$ is the predicted abscissa/ordinate, $N$ is the number of training image sets, and $x$ is the difference between the preset key point coordinate and the predicted key point coordinate.

In the embodiment of the present invention, the step of optimizing the pre-constructed key point detection model is similar to the step and embodiment of the pre-constructed face detection model, which are not repeated herein.

In the embodiment of the present invention, the detecting the coordinates of the key points of the face in the face block diagram by using the pre-constructed key point detection model includes:

In the embodiment of the invention, before feature extraction is performed on the face block diagram, the face block diagram may be expanded. Taking a face frame with upper-left point coordinates $[x_1, y_1]$ and lower-right point coordinates $[x_2, y_2]$ as an example, the abscissas of the upper-left and lower-right points are expanded by one fourth of the width $w$ of the detection frame, namely $x_1' = x_1 - w/4$ and $x_2' = x_2 + w/4$, and the ordinates of the upper-left and lower-right points are expanded by one fourth of the height $h$ of the face frame, namely $y_1' = y_1 - h/4$ and $y_2' = y_2 + h/4$. Expanding the face detection range solves the problem of insufficient key point extraction that arises when the detection result of the face detection model is too small, and improves the accuracy of key point extraction.
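The quarter-width and quarter-height expansion can be sketched as follows, assuming the frame grows outward on every side. The function name and the `ratio` parameter are hypothetical; 0.25 corresponds to the one-fourth stated in the text.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def expand_face_frame(x1: float, y1: float, x2: float, y2: float,
                      ratio: float = 0.25) -> Box:
    """Grow the frame outward by ratio * width horizontally and ratio * height vertically."""
    w = x2 - x1
    h = y2 - y1
    return (x1 - ratio * w, y1 - ratio * h, x2 + ratio * w, y2 + ratio * h)


expanded = expand_face_frame(40, 20, 80, 100)
```

In practice the expanded frame would also be clamped to the image bounds before cropping; that step is omitted here.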

Further, the key point detection model converts the pixel matrix into a one-dimensional multiplied multidimensional matrix. The neurons of the first convolution layer convolve this matrix to obtain the input of the second convolution layer, the neurons of the second convolution layer convolve that input, and so on through the N-th convolution layer (N being a preset number of convolution layers, for example 4), finally yielding the key point information of the face, wherein the convolution kernel dimension of each convolution layer differs according to the features to be extracted. The key point information is then activated by the regressor of the key point detection model to obtain a plurality of key point coordinates, wherein the regressor includes but is not limited to a Mean-Square-Error (MSE) regressor.

In the embodiment of the invention, when the deep neural network detects the key points, each layer of neural network can be constructed according to the key point extraction requirement, and a plurality of neurons are arranged to detect different positions of the human face, so that the accuracy of the detection result is improved.

S4, extracting left and right eye corner point coordinates from the face key point coordinates, and performing outward expansion on the left and right eye corner point coordinates to obtain left and right eye frames;

In the embodiment of the present invention, the left and right eye corner point coordinates may include left and right eye corner point coordinates of a left eye, and left and right eye corner point coordinates of a right eye. The left and right eye frames obtained by the expansion are image frames containing eyes.

Further, the extracting the coordinates of the left and right eye corner points from the coordinates of the face key point includes:

In the embodiment of the invention, the CREATE INDEX function may be selected as the index function to construct the index of the key point data table.

In the embodiment of the invention, taking the left eye as an example, the expansion weight $l = p_1 - p_2$ is calculated from the left eye corner point coordinates $[p_1, q_1]$ and the right eye corner point coordinates $[p_2, q_2]$ of the left eye, and the eye region is expanded according to the following formula:

wherein the formula gives the abscissas and the ordinates of the left eye frame after expansion, which together form the coordinates of the lower-right, lower-left, upper-right and upper-left corners of the left eye frame; the left eye frame can be obtained from these coordinates. The right eye frame is obtained in the same manner as in the above example, which is not described in detail herein.
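As an illustration of the idea only, an eye frame can be built by widening the span between the two corner points in proportion to the expansion weight. The coefficients `kx` and `ky` below are hypothetical parameters, since the exact expansion coefficients are not given in this text, and the sketch uses the magnitude of the weight $l$ rather than its sign.

```python
from typing import Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def eye_frame(left_corner: Point, right_corner: Point,
              kx: float = 0.25, ky: float = 0.5) -> Box:
    """Expand the corner-to-corner span by kx*|l| horizontally and ky*|l| vertically."""
    (p1, q1), (p2, q2) = left_corner, right_corner
    l = abs(p1 - p2)  # magnitude of the expansion weight
    return (
        min(p1, p2) - kx * l,
        min(q1, q2) - ky * l,
        max(p1, p2) + kx * l,
        max(q1, q2) + ky * l,
    )


left_eye_box = eye_frame((30.0, 40.0), (50.0, 42.0))
```

The four corners of the eye frame follow directly from the returned minimum and maximum coordinates.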

S5, cutting out left and right eye images from the face block diagram according to the left and right eye frames, and classifying and detecting the left and right eye images by using a pre-built eye state classification model to obtain eye state types.

In the embodiment of the invention, the pre-constructed eye state classification model is similar to the human face detection model and the key point detection model in composition and training process, and can be composed of a MobileNet V2 neural network and a UNet neural network with improved structures.

In an embodiment of the present invention, before the classifying and detecting the left and right eye images by using the pre-constructed eye state classification model, the method further includes:

acquiring a preset training image set and a real eye state corresponding to the training image set;

generating a predicted eye state of the training image by using a pre-constructed eye state classification model;

calculating a loss value between the predicted eye state and the real eye state as an eye state loss value;

and optimizing the eye state classification model by using the eye state loss value to obtain the pre-constructed eye state classification model.

Further, the calculation formula of the eye state loss value is as follows:

wherein e⁽ⁱ⁾ is the real eye state of the i-th training image, N is the number of training images, and the remaining two symbols denote the eye state loss value and the predicted eye state.

In the embodiment of the present invention, the step of optimizing the pre-constructed eye state classification model is similar to the steps and embodiments of the pre-constructed face detection model and the pre-constructed key point detection model described above, and will not be repeated herein.

In the embodiment of the invention, computer statements with a data fetching function, such as Java statements or Python statements, can be used to fetch the pre-stored face detection model, key point detection model and eye state classification model from a pre-built storage area, which includes but is not limited to a database, a blockchain node or a network cache.

In the embodiment of the invention, the left eye image and the right eye image can be obtained by cutting the regions of the left and right eye frames out of the image to be detected.
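The cutting step might look like the following sketch. It assumes the image is held as a list of pixel rows and the eye frame as an `(x_min, y_min, x_max, y_max)` tuple; both layouts and the function name `crop_eye_image` are hypothetical. The frame is clamped to the image bounds, since an expanded frame can extend past the edge.

```python
import math

def crop_eye_image(image, frame):
    """Cut the region of an eye frame out of an image.

    image: list of rows, each row a list of pixel values (hypothetical layout).
    frame: (x_min, y_min, x_max, y_max) in pixel coordinates.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    x_min, y_min, x_max, y_max = frame
    # Clamp the frame to the image so that an expanded frame stays valid.
    x0 = max(0, math.floor(x_min))
    y0 = max(0, math.floor(y_min))
    x1 = min(width, math.ceil(x_max))
    y1 = min(height, math.ceil(y_max))
    return [row[x0:x1] for row in image[y0:y1]]

# A 6 x 8 test image whose pixel value encodes its row and column.
image = [[r * 10 + c for c in range(8)] for r in range(6)]
eye = crop_eye_image(image, (2.0, 1.0, 5.0, 3.0))
```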

In the embodiment of the present invention, referring to fig. 3, the classifying and detecting the left and right eye images by using the pre-constructed eye state classification model to obtain an eye state class includes:

S51, counting pixel values of all pixel points in the left and right eye images to respectively obtain a pixel matrix of the left eye image and a pixel matrix of the right eye image;

S52, respectively converting the pixel matrix of the left eye image and the pixel matrix of the right eye image into one-dimensional matrices by using the eye state classification model, to obtain the one-dimensional matrices of the left eye image and the right eye image;

S53, carrying out a preset number of convolution and full connection operations on the one-dimensional matrix of the left eye image and the one-dimensional matrix of the right eye image by using a plurality of neurons of a hidden layer in the eye state classification model, to obtain left and right eye state information;

S54, activating the left and right eye state information by using a binary classifier to obtain the open-eye state probability and the closed-eye state probability of the left and right eyes;

S55, judging whether the open-eye state probability is greater than the closed-eye state probability;

if the open-eye state probability is greater than the closed-eye state probability, executing S56, and determining that the eye state category of the left and right eyes is open eyes;

and if the open-eye state probability is smaller than or equal to the closed-eye state probability, executing S57, and determining that the eye state category of the left and right eyes is closed eyes.

In the embodiment of the invention, the eye state is detected by the deep network, which finally yields a binary classification result of open eyes or closed eyes. The activation function of the neural network may be a binary classifier function applied to the output of the fully connected layer, producing the open-eye and closed-eye probability values from which the eye state is obtained. For example, if the first neuron of the fully connected layer outputs an open-eye probability of 0.8 through the binary classifier function and the second neuron outputs a closed-eye probability of 0.2, the open-eye probability is greater than the closed-eye probability, so the eye state is open eyes.
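The decision in S54 to S57 can be illustrated with a short sketch. It assumes the binary classifier function is a two-way softmax over the two fully connected outputs; the source does not name the function, so this choice, like the function name `classify_eye_state`, is an assumption.

```python
import math

def classify_eye_state(scores):
    """Turn two fully connected outputs into an eye state category.

    scores: (open_score, closed_score). A two-way softmax is assumed here.
    """
    peak = max(scores)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    p_open, p_closed = (e / total for e in exps)
    # S55 to S57: open only if strictly greater; ties count as closed.
    state = "open" if p_open > p_closed else "closed"
    return state, p_open, p_closed

# Scores chosen so that the probabilities come out as 0.8 and 0.2.
state, p_open, p_closed = classify_eye_state((math.log(0.8), math.log(0.2)))
```

With an open-eye probability of 0.8 and a closed-eye probability of 0.2, the comparison in S55 selects the open-eye category; equal probabilities fall through to S57 and give closed eyes.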

According to the embodiment of the invention, the image to be detected is processed by the face detection model to obtain the face block diagram, and face positioning and face key point extraction are then performed on the face block diagram by the key point detection model, so that the eye region of the face is acquired more accurately and with a lower false detection rate. The eye state classification model then classifies the state of the eye images taken from the expanded eye region, so that detection is more accurate and more robust. Therefore, the method for detecting open and closed eyes of a face in an image can solve the problem of inaccurate detection of eye states in images.

Fig. 4 is a functional block diagram of a device for detecting open and closed eyes of a face in an image according to an embodiment of the present invention.

The device 100 for detecting open and closed eyes of a face in an image can be installed in an electronic device. Depending on the functions implemented, the device 100 may include an image detection module 101, a face block diagram acquisition module 102, a face key point acquisition module 103, a left and right eye frame acquisition module 104, and an eye state acquisition module 105. A module of the invention, which may also be referred to as a unit, refers to a series of computer program segments that are stored in the memory of the electronic device, can be executed by the processor of the electronic device, and perform a fixed function.

In the present embodiment, the functions concerning the respective modules/units are as follows:

the image detection module 101 is configured to obtain a set of images to be detected, and calculate image features in each image to be detected in the set of images to be detected by using a pre-constructed face detection model;

the face block diagram obtaining module 102 is configured to screen out a target image from the image set to be detected according to the image features, and calculate a face block diagram in the target image;

a face key point obtaining module 103, configured to detect face key point coordinates in the face block diagram by using a pre-constructed key point detection model;

the left and right eye frame obtaining module 104 is configured to extract left and right eye corner point coordinates from the face key point coordinates, and obtain left and right eye frames by performing outward expansion on the left and right eye corner point coordinates;

the eye state obtaining module 105 is configured to cut out left and right eye images from the face block according to the left and right eye frames, and perform classification detection on the left and right eye images by using a pre-constructed eye state classification model to obtain an eye state class.

In detail, each module in the device 100 for detecting open and closed eyes of a face in an image in the embodiment of the present invention adopts the same technical means as the method for detecting open and closed eyes of a face in an image described in fig. 1 to 3, and can produce the same technical effects, which are not repeated here.

Fig. 5 is a schematic structural diagram of an electronic device for implementing a method for detecting open and closed eyes of a face in an image according to an embodiment of the present invention.

The electronic device 1 may comprise a processor 10, a memory 11, a communication bus 12 and a communication interface 13, and may further comprise a computer program stored in the memory 11 and executable on the processor 10, such as a face open eye and closed eye detection program in an image.

The processor 10 may be formed by an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be formed by a plurality of integrated circuits packaged with the same function or different functions, including one or more central processing units (Central Processing Unit, CPU), a microprocessor, a digital processing chip, a graphics processor, a combination of various control chips, and so on. The processor 10 is the control unit (Control Unit) of the electronic device. It connects the various parts of the entire electronic device using various interfaces and lines, runs or executes the programs or modules stored in the memory 11 (for example, the program for detecting open and closed eyes of a face in an image), and invokes data stored in the memory 11 to perform the various functions of the electronic device and process data.

The memory 11 includes at least one type of readable storage medium including flash memory, a removable hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device, such as a mobile hard disk of the electronic device. The memory 11 may in other embodiments also be an external storage device of the electronic device, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used not only for storing application software installed in an electronic device and various types of data, such as codes of a face-open-eye-close detection program in an image, but also for temporarily storing data that has been output or is to be output.

The communication bus 12 may be a peripheral component interconnect standard (peripheral component interconnect, PCI) bus, or an extended industry standard architecture (extended industry standard architecture, EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. The bus is arranged to enable a connection communication between the memory 11 and at least one processor 10 etc.

The communication interface 13 is used for communication between the electronic device and other devices, including a network interface and a user interface. Optionally, the network interface may include a wired interface and/or a wireless interface (e.g., WI-FI interface, bluetooth interface, etc.), typically used to establish a communication connection between the electronic device and other electronic devices. The user interface may be a Display (Display), an input unit such as a Keyboard (Keyboard), or alternatively a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch, or the like. The display may also be referred to as a display screen or display unit, as appropriate, for displaying information processed in the electronic device and for displaying a visual user interface.

Fig. 5 shows only an electronic device with components, it being understood by a person skilled in the art that the structure shown in fig. 5 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or may combine certain components, or may be arranged in different components.

For example, although not shown, the electronic device may further include a power source (such as a battery) for supplying power to the respective components. Preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management and power consumption management are implemented through the power management device. The power supply may also include one or more of a direct current or alternating current power supply, a recharging device, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like. The electronic device may further include various sensors, Bluetooth modules, Wi-Fi modules, etc., which are not described herein.

It should be understood that the embodiments described are for illustrative purposes only, and the scope of the patent application is not limited by this configuration.

The face open eye and closed eye detection program in the image stored in the memory 11 in the electronic device 1 is a combination of a plurality of computer programs, which when run in the processor 10 can realize:

Obtaining an image set to be detected, and calculating the face size, the face center offset and the heat of each pixel point in each image to be detected in the image set to be detected by using a pre-constructed face detection model;

screening out a target image from the image set to be detected according to the heat, and calculating a face block diagram in the target image according to the face center offset in the target image, the heat of each pixel point and the face size;

In particular, the specific implementation method of the processor 10 on the computer program may refer to the description of the relevant steps in the corresponding embodiment of the drawings, which is not repeated herein.
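To make the face block diagram step more concrete, the sketch below derives one face box from per-pixel heat, center offset and face size. It is only an assumed decoding: the peak-selection rule, the threshold value, the absence of any scaling between feature map and image, and the name `decode_face_box` are all hypothetical.

```python
def decode_face_box(heat, offsets, sizes, threshold=0.5):
    """Derive one face box from heat, center offset and face size maps.

    heat: 2D list of heat values per pixel point.
    offsets: 2D list of (dx, dy) face center offsets.
    sizes: 2D list of (width, height) face sizes.
    Returns (x_min, y_min, x_max, y_max), or None if no pixel passes.
    """
    best = None
    for y, row in enumerate(heat):
        for x, value in enumerate(row):
            # Keep the hottest pixel point above the preset threshold.
            if value > threshold and (best is None or value > best[0]):
                best = (value, x, y)
    if best is None:
        return None
    _, x, y = best
    dx, dy = offsets[y][x]
    w, h = sizes[y][x]
    # Refine the face center with the offset, then span the face size.
    cx, cy = x + dx, y + dy
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)

heat = [[0.1, 0.2, 0.1], [0.2, 0.9, 0.3], [0.1, 0.2, 0.1]]
offsets = [[(0.0, 0.0)] * 3, [(0.0, 0.0), (0.5, 0.25), (0.0, 0.0)], [(0.0, 0.0)] * 3]
sizes = [[(0.0, 0.0)] * 3, [(0.0, 0.0), (2.0, 1.0), (0.0, 0.0)], [(0.0, 0.0)] * 3]
box = decode_face_box(heat, offsets, sizes)
```

An image for which the function returns None has no pixel point whose heat exceeds the threshold, so it would not be selected as a target image under this assumed rule.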

Further, the modules/units integrated in the electronic device 1 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. The computer readable storage medium may be volatile or nonvolatile. For example, the computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM).

The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor of an electronic device, can implement:

In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.

The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional modules in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in the form of hardware, or in the form of hardware plus software functional modules.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.

The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.

The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, encryption algorithm and the like. The Blockchain (Blockchain), which is essentially a decentralised database, is a string of data blocks that are generated by cryptographic means in association, each data block containing a batch of information of network transactions for verifying the validity of the information (anti-counterfeiting) and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.

The embodiment of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use knowledge to obtain optimal results.

Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means recited in the system claims can also be implemented by means of software or hardware by means of one unit or means. The terms first, second, etc. are used to denote a name, but not any particular order.

Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims

1. A method for detecting open eyes and closed eyes of a face in an image, the method comprising:

acquiring an image set to be detected, and calculating image characteristics in each image to be detected in the image set to be detected by using a pre-constructed face detection model, wherein the image characteristics comprise: carrying out data enhancement processing on each to-be-detected image in the to-be-detected image set to obtain a standard to-be-detected image set, counting pixel values of all pixel points of each standard to-be-detected image in the standard to-be-detected image set one by one to obtain a pixel matrix of each standard to-be-detected image, carrying out convolution, pooling and activation processing on the pixel matrix by using the face detection model to obtain the heat degree of each pixel point in the standard to-be-detected image, selecting the pixel point with the heat degree of each pixel point being greater than a preset threshold value in the standard to-be-detected image as a face pixel point, calculating the face size of the standard to-be-detected image according to the face pixel point, calculating the face center offset in the standard to-be-detected image according to the face pixel point, and summarizing the heat degree, the face size and the face center offset of each pixel point to obtain the image characteristics in each to-be-detected image;

extracting features of the face block diagram by using each convolution layer of a pre-constructed key point detection model to obtain key point feature information, and activating the key point feature information by using a regressor to obtain face key point coordinates, wherein the key point detection model consists of a MobileNet V2 neural network and a UNet neural network with improved structures, the MobileNet V2 neural network with improved structure removes the three-layer network structure of the complete MobileNet V2 neural network, and a linear bottleneck module and an inverted residual module of the complete MobileNet V2 neural network are reserved;

extracting position information of each key point from key point characteristic information corresponding to the key point coordinates of the face, generating a key point data table according to the position information, constructing an index of the key point data table by using a preset index function, searching in the index by using a preset left eye corner point coordinate label and a preset right eye corner point coordinate label, taking the searched position information as left and right eye corner point coordinates, and expanding the left and right eye corner point coordinates to obtain left and right eye frames;

2. The method for detecting open eyes and closed eyes of a face in an image according to claim 1, wherein the step of performing data enhancement processing on each image to be detected in the image set to be detected to obtain a standard image set to be detected includes:

3. The method for detecting open eyes and closed eyes of a face in an image according to claim 1, wherein the screening out a target image from the image set to be detected according to the image features comprises:

4. The method for detecting open eyes and closed eyes of a face in an image according to any one of claims 1 to 3, wherein the classifying and detecting the left and right eye images by using a pre-constructed eye state classification model to obtain an eye state category comprises:

5. A device for detecting open eyes and closed eyes of a face in an image, the device comprising:

the image detection module is used for acquiring an image set to be detected, calculating image characteristics in each image to be detected in the image set to be detected by using a pre-constructed face detection model, and comprises the following steps: carrying out data enhancement processing on each to-be-detected image in the to-be-detected image set to obtain a standard to-be-detected image set, counting pixel values of all pixel points of each standard to-be-detected image in the standard to-be-detected image set one by one to obtain a pixel matrix of each standard to-be-detected image, carrying out convolution, pooling and activation processing on the pixel matrix by using the face detection model to obtain the heat degree of each pixel point in the standard to-be-detected image, selecting the pixel point with the heat degree of each pixel point being greater than a preset threshold value in the standard to-be-detected image as a face pixel point, calculating the face size of the standard to-be-detected image according to the face pixel point, calculating the face center offset in the standard to-be-detected image according to the face pixel point, and summarizing the heat degree, the face size and the face center offset of each pixel point to obtain the image characteristics in each to-be-detected image;

the face key point acquisition module is used for carrying out feature extraction on the face block diagram by utilizing each convolution layer of a pre-constructed key point detection model to obtain key point feature information, and activating the key point feature information by utilizing a regressor to obtain face key point coordinates, wherein the key point detection model consists of a MobileNet V2 neural network and a UNet neural network with improved structures, the MobileNet V2 neural network with improved structure removes the three-layer network structure of the complete MobileNet V2 neural network, and a linear bottleneck module and an inverted residual module of the complete MobileNet V2 neural network are reserved;

the left and right eye frame acquisition module is used for extracting the position information of each key point from the key point feature information corresponding to the face key point coordinates, generating a key point data table according to the position information, constructing an index of the key point data table by using a preset index function, searching in the index by using a preset left eye corner point coordinate label and a preset right eye corner point coordinate label, taking the searched position information as left and right eye corner point coordinates, and expanding the left and right eye corner point coordinates to obtain left and right eye frames;

6. An electronic device, the electronic device comprising:

at least one processor; and,

a memory communicatively coupled to the at least one processor; wherein,

the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the method for detecting open eyes and closed eyes of a face in an image as claimed in any one of claims 1 to 4.

7. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method for detecting open eyes and closed eyes of a face in an image according to any one of claims 1 to 4.