CN108764139B - Face detection method, mobile terminal and computer readable storage medium - Google Patents

Face detection method, mobile terminal and computer readable storage medium

Info

Publication number
CN108764139B
CN108764139B (application number CN201810530284.3A)
Authority
CN
China
Prior art keywords
preview image
face detection
pixel points
gray
average value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810530284.3A
Other languages
Chinese (zh)
Other versions
CN108764139A (en
Inventor
张弓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oppo Chongqing Intelligent Technology Co Ltd
Original Assignee
Oppo Chongqing Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo Chongqing Intelligent Technology Co Ltd filed Critical Oppo Chongqing Intelligent Technology Co Ltd
Priority to CN201810530284.3A priority Critical patent/CN108764139B/en
Publication of CN108764139A publication Critical patent/CN108764139A/en
Application granted granted Critical
Publication of CN108764139B publication Critical patent/CN108764139B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/162 Detection; Localisation; Normalisation using pixel segmentation or colour matching
    • G06V40/166 Detection; Localisation; Normalisation using acquisition arrangements
    • G06V40/168 Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Studio Devices (AREA)

Abstract

The application belongs to the technical field of face detection and provides a face detection method, a mobile terminal and a computer-readable storage medium. The method comprises: after a camera of the mobile terminal is started, determining whether a backlight condition currently exists; if a backlight condition exists, detecting whether a face exists in a preview image of the camera through a first face detection model; and if a face is detected in the preview image of the camera through the first face detection model, marking a face region in the preview image. The application can improve the accuracy of face detection under backlight conditions.

Description

Face detection method, mobile terminal and computer readable storage medium
Technical Field
The present application belongs to the technical field of face detection, and in particular, to a face detection method, a mobile terminal, and a computer-readable storage medium.
Background
With the development of intelligent mobile terminals, people use the photographing function of mobile terminals such as mobile phones more and more frequently. The photographing function of most existing mobile terminals supports face detection; after a face is detected, operations such as focusing and beautifying are performed on the detected face.
Currently, face detection is usually based on a traditional skin color detection model or a facial feature point detection model. However, photographing environments vary widely, and in some of them the traditional face detection methods perform poorly or even fail to detect a face at all.
Disclosure of Invention
In view of this, embodiments of the present application provide a face detection method, a mobile terminal, and a computer-readable storage medium, so as to solve the problem that traditional face detection methods perform poorly in some photographing environments.
A first aspect of an embodiment of the present application provides a face detection method, including:
after a camera of the mobile terminal is started, determining whether the current condition is a backlight condition;
if a backlight condition exists, detecting whether a face exists in a preview image of the camera through a first face detection model;
and if a face is detected in the preview image of the camera through the first face detection model, marking a face region in the preview image.
A second aspect of an embodiment of the present application provides a mobile terminal, including:
the determining module is configured to determine whether a backlight condition currently exists after the camera of the mobile terminal is started;
the first detection module is configured to detect, through a first face detection model, whether a face exists in a preview image of the camera if a backlight condition exists;
and the marking module is configured to mark a face region in the preview image if a face is detected in the preview image of the camera through the first face detection model.
A third aspect of an embodiment of the present application provides a mobile terminal, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the method provided in the first aspect of the embodiment of the present application when executing the computer program.
A fourth aspect of embodiments of the present application provides a computer-readable storage medium storing a computer program which, when executed by one or more processors, performs the steps of the method provided by the first aspect of embodiments of the present application.
A fifth aspect of embodiments of the present application provides a computer program product comprising a computer program that, when executed by one or more processors, performs the steps of the method provided by the first aspect of embodiments of the present application.
In the embodiments of the present application, after the camera of the mobile terminal is started, it is first determined whether a backlight condition currently exists. If a backlight condition exists, whether a face exists in the preview image of the camera is detected through a first face detection model set up for backlight conditions, and if a face is detected in the preview image through the first face detection model, the face region is marked in the preview image. Because the backlight check is performed first and a model dedicated to backlight conditions is then used, the problem that traditional face detection performs poorly under backlight conditions is solved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present application, and that those skilled in the art can obtain other drawings based on them without inventive effort.
Fig. 1 is a schematic flow chart illustrating an implementation of a face detection method according to an embodiment of the present application;
fig. 2 is a schematic flow chart illustrating an implementation of another face detection method according to an embodiment of the present application;
FIG. 3 is a schematic grayscale diagram of a photograph taken under backlighting conditions provided by an embodiment of the present application;
fig. 4 is a schematic block diagram of a mobile terminal according to an embodiment of the present application;
fig. 5 is a schematic block diagram of another mobile terminal provided in an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
In order to explain the technical solution described in the present application, the following description will be given by way of specific examples.
Fig. 1 is a schematic flow chart of a face detection method provided in an embodiment of the present application and applied to a mobile terminal. As shown in the figure, the method may include the following steps:
step S101, after the camera of the mobile terminal is started, determining whether the current condition is a backlight condition.
In the embodiments of the present application, after the camera of the mobile terminal is started, the display interface of the mobile terminal displays a preview image, that is, the picture currently captured by the camera, and whether a backlight condition currently exists can be determined from this picture. A backlight condition is a situation in which the subject lies directly between the light source and the camera, so that the background is far brighter than the subject. In particular, when the background occupies a larger area of the picture than the subject, exposure is set according to the background light, leaving the subject underexposed. When the subject is a human face, the face appears partially blurred and dark in the preview image.
In practical applications, whether a backlight condition currently exists can be determined either by analyzing the current preview image and judging from the detection result, or by means of a sensor provided on the mobile terminal.
As another embodiment of the present application, after determining whether the backlight condition is currently present, the method may further include:
if it is not a backlight condition, detecting whether a face exists in the preview image of the camera through a second face detection model;
and if a face is detected in the preview image of the camera through the second face detection model, marking a face region in the preview image.
In the embodiments of the present application, the second face detection model is a traditional face detection method, such as an HSV skin color model or a model that detects according to facial feature points. The second face detection model relies on the skin color of the face or on facial feature points. Under a backlight condition the face is blurred and dark, so these traditional methods perform poorly and may even fail to detect a face that is actually present in the preview image. Therefore, the second face detection model is only used to detect whether a face exists in the preview image of the camera when it has been determined that the condition is not a backlight condition.
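A minimal sketch of the kind of traditional skin-color detector the second face detection model refers to, written with OpenCV (version 4 assumed); the HSV thresholds and the minimum-area filter are illustrative assumptions, not values taken from the patent:

```python
import cv2
import numpy as np

def detect_face_by_skin_color(preview_bgr, min_area=2000):
    """Rough skin-color detection: threshold in HSV and return candidate boxes."""
    hsv = cv2.cvtColor(preview_bgr, cv2.COLOR_BGR2HSV)
    # Illustrative skin-tone range on OpenCV's H:[0,179], S/V:[0,255] scale.
    lower = np.array([0, 40, 60], dtype=np.uint8)
    upper = np.array([25, 180, 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower, upper)
    # Keep reasonably large connected skin regions as face candidates (OpenCV 4 API).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]
    return boxes  # list of (x, y, w, h); empty if no face-sized skin region is found
```

Under backlight, the face pixels fall outside such a skin-tone range, which is exactly why this kind of model fails in that case.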
Step S102, if a backlight condition exists, detecting whether a face exists in the preview image of the camera through the first face detection model.
In the embodiments of the present application, the first face detection model is a MobileNet-SSD convolutional neural network model. Because of the constraints of the mobile environment, a mobile terminal has less memory and weaker processing capability than terminal devices such as computers, and large convolutional neural network models cannot be deployed and run on mobile terminals such as mobile phones. MobileNet (also written MobileNets) is therefore chosen. MobileNet is a lightweight deep neural network proposed for embedded devices such as mobile phones; it effectively reduces the number of network parameters by factorizing the convolution kernels in the network. The factorization decomposes a standard convolution into a depthwise convolution and a pointwise convolution: the depthwise convolution applies one convolution kernel to each channel, and the pointwise convolution combines the outputs of the channel-wise convolutions. The SSD network model is used for object detection; combining MobileNet with the SSD network yields a detector that can run on embedded devices such as mobile phones. Taking VGG-SSD and MobileNet-SSD as examples and detecting the same 7 test pictures on the same device, the detection time of MobileNet-SSD is roughly 1/6 to 1/2 of that of VGG-SSD.
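The factorization described above can be illustrated with the depthwise-separable convolution building block; the following is a generic PyTorch sketch of that block, not the patent's actual network, channel widths or training setup:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """A standard 3x3 convolution factored into a depthwise conv plus a 1x1 pointwise conv."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: one 3x3 kernel per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1,
                                   groups=in_ch, bias=False)
        # Pointwise: 1x1 conv combines the per-channel outputs across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn1, self.bn2 = nn.BatchNorm2d(in_ch), nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.bn1(self.depthwise(x)))
        return self.relu(self.bn2(self.pointwise(x)))

# Compared with a full 3x3 convolution, the parameter and multiply count drops
# roughly by a factor of 1/out_ch + 1/9, which is what makes MobileNet lightweight.
block = DepthwiseSeparableConv(32, 64)
y = block(torch.randn(1, 32, 128, 128))  # -> shape (1, 64, 128, 128)
```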
Step S103, if a human face is detected in a preview image of the camera through the first human face detection model, marking a human face area in the preview image.
In the embodiments of the present application, after a face is detected in the preview image of the camera by the first face detection model, a preview image with a detection frame is generated, that is, a detection frame is drawn around the face region in the current preview image.
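Marking the region amounts to drawing the detection frame on the preview image; a minimal OpenCV sketch, assuming the detector returns a bounding box as (x, y, w, h):

```python
import cv2

def mark_face_region(preview_bgr, box):
    """Draw the detection frame around the detected face region; box = (x, y, w, h)."""
    x, y, w, h = box
    cv2.rectangle(preview_bgr, (x, y), (x + w, y + h), (0, 255, 0), 2)
    return preview_bgr
```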
It should be noted that, in the embodiment shown in fig. 1, after the camera of the mobile terminal is started, it is first determined whether a backlight condition currently exists. Under a backlight condition, the MobileNet-SSD detection model is used to detect whether a face exists in the preview image of the camera; otherwise, a traditional face detection model, for example an HSV skin color model, is used to detect whether a face exists in the preview image.
In practical applications, a different order can be used: after the camera of the mobile terminal is started, a traditional face detection method, for example an HSV skin color model, is first used to detect whether a face exists in the preview image of the camera. If no face is detected in the preview image through the second face detection model, it is then determined whether a backlight condition exists, and only if it does is the MobileNet-SSD detection model used to detect whether a face exists in the preview image. The reason is as follows: although the MobileNet-SSD detection model is a lightweight neural network compared with other convolutional neural network models, it still occupies more memory than the traditional face detection method, and the backlight check itself also consumes memory; moreover, in practice the traditional method does not always fail under backlight. Therefore, the traditional face detection method, which occupies less memory, is tried first, and the MobileNet-SSD detection model is started only when the traditional method finds no face in the preview image and a backlight condition is present. If the condition is not a backlight condition and the traditional model finds no face, it can be assumed that no face exists in the preview image. In this way, a face present in the preview image can be detected accurately in various shooting environments while keeping the system memory footprint as small as possible.
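The alternative order in this paragraph can be summarized as a small control-flow sketch; it composes the earlier sketches, and `detect_face_mobilenet_ssd` and `is_backlit` are hypothetical placeholders for the corresponding steps (a sketch of `is_backlit` follows the backlight-detection steps later in this description):

```python
def detect_face(preview_image):
    """Memory-saving order: cheap skin-color model first, MobileNet-SSD only
    when that fails AND the scene is determined to be backlit."""
    boxes = detect_face_by_skin_color(preview_image)     # second (traditional) model
    if boxes:
        return mark_face_region(preview_image, boxes[0])
    if is_backlit(preview_image):                        # steps S201-S205
        box = detect_face_mobilenet_ssd(preview_image)   # first (CNN) model, placeholder
        if box:
            return mark_face_region(preview_image, box)
    # Not backlit and the traditional model found nothing: assume no face is present.
    return preview_image
```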
In the embodiments of the present application, besides the models listed above, the first face detection model and the second face detection model may also be other face detection models, provided the detection accuracy of the first face detection model is higher than that of the second. Under a backlight condition the face region is relatively dark and detecting a face in the image is comparatively difficult, so the first face detection model, with higher detection accuracy, is chosen; under a non-backlight condition the face is comparatively clear, so the second face detection model, with lower detection accuracy, suffices. A model with higher detection accuracy is more complex and occupies more memory, while a model with lower accuracy is simpler and occupies less memory. In order both to detect the face in the image and to keep memory occupancy during photographing low, the second face detection model (lower accuracy, lower memory occupancy) is selected under non-backlight conditions and the first face detection model (higher accuracy, higher memory occupancy) is selected under backlight conditions.
According to the method and the device, after the camera of the mobile terminal is started, it is determined whether a backlight condition currently exists; if so, whether a face exists in the preview image of the camera is detected through the first face detection model, which is set up for backlight conditions. This solves the problem that the detection effect of traditional face detection methods is poor under backlight conditions.
Fig. 2 is a schematic flow chart of another face detection method provided in an embodiment of the present application. As shown in the figure, the method describes, on the basis of the embodiment shown in fig. 1, how to determine whether a backlight condition currently exists, and may specifically include the following steps:
First, consider the preview image under a backlight condition, i.e. when the subject lies directly between the light source and the camera so that the subject is underexposed. Fig. 3 shows the grayscale image of a photograph taken under a backlight condition. As can be seen from fig. 3, at least one light source region is usually present in the preview image under a backlight condition; the light source region may be the light source itself or intense light emitted by it, and the gray values of the pixel points in the light source region are higher than those in other regions. The light source region is determined first: for example, the pixel points of the current preview image whose gray values are within a first preset range are obtained, the light source region in the preview image is determined according to the coordinates of these pixel points, and the determined light source region is recorded as the first region. Steps S201 to S203 describe the process of determining the light source region.
Step S201, obtaining the pixel points of the current preview image whose gray values are within a first preset range, and generating a pixel point distribution map according to the coordinates of those pixel points.
In the embodiments of the present application, the preview image is first converted to a grayscale image. Because the light source and the photographing angle differ between backlight scenes, the gray value of the light source region does not necessarily approach 255. For example, with white light as the light source, the light source region in the grayscale image of a photograph taken under a backlight condition may approach 255; at night, with warm white light as the light source, it does not approach 255. In general, however, whatever the photographing environment, the gray values of the light source region in the grayscale image of a photograph taken under a backlight condition lie toward the 255 end of the scale. Therefore, a first preset range, for example 200 to 255, can be set, and the light source region is assumed to lie where the pixel points with gray values within this first preset range are concentrated. Of course, in practical applications, other range values may be set as the first preset range.
After the first preset range is set, the pixel points of the current preview image whose gray values are within the first preset range are obtained. As can be seen from fig. 3, it is not only the pixels near the light source whose gray values fall within the first preset range: because of reflections or the colors of other objects, such pixel points may be scattered over various parts of the preview image, but their number near the light source is certainly the largest. A pixel point distribution map can therefore be generated from the coordinates of the pixel points whose gray values are within the first preset range, and the region where the pixel points are most concentrated is obtained from this distribution map.
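A sketch of the first half of step S201, under the assumption that the first preset range is 200 to 255: convert the preview image to grayscale and collect the coordinates of the bright pixel points that the later sliding-window step works on.

```python
import cv2
import numpy as np

def bright_pixel_coords(preview_bgr, low=200, high=255):
    """Return the grayscale image and an (N, 2) array of (row, col) coordinates
    of the pixel points whose gray value falls in the first preset range [low, high]."""
    gray = cv2.cvtColor(preview_bgr, cv2.COLOR_BGR2GRAY)
    rows, cols = np.where((gray >= low) & (gray <= high))
    return gray, np.stack([rows, cols], axis=1)
```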
Step S202, sliding a sliding window with a preset width on the pixel distribution map to obtain the position of the sliding window when the sliding window comprises the most pixels, and recording the third gray average value and the coordinate average value of all pixels included in the sliding window when the sliding window is at the position.
In the embodiments of the present application, the pixel point distribution map is in fact a scatter plot, and the region where the pixel points are most concentrated could be obtained from it by a clustering method. The embodiments of the present application use a sliding-window method instead: a sliding window with a preset width is set and slid over the pixel point distribution map from left to right, right to left, top to bottom or bottom to top. The position at which the sliding window contains the most pixel points is recorded, the gray mean and the coordinate mean of all pixel points inside the window at that position are calculated, and the gray mean is recorded as the third gray mean.
In practical applications, the sliding window may have a preset width and an unlimited length, i.e. a strip of fixed width spanning the map, which slides from the left side of the pixel point distribution map to the right, or from right to left; rotated by 90 degrees, it can slide downward from the top of the distribution map or upward from the bottom. The sliding window may also have both a preset width and a preset length (such as the rectangular window shown in fig. 3), i.e. a rectangular window of finite length and width. The sliding process of such a rectangular window may be, for example: starting from the upper left corner of the pixel point distribution map, the window slides to the right by a preset step length; after reaching the rightmost side, it moves down by the preset step length and slides from the rightmost side back to the leftmost side; after reaching the leftmost side, it moves down again and slides from the leftmost side to the rightmost side; and so on until the sliding is finished. The position at which the window contains the most pixel points is found, and the gray mean and coordinate mean of the pixel points inside the window at that position are recorded. It should be noted that this sliding procedure is only an example; in practice the sliding may start and end at any position of the pixel point distribution map, as long as the window covers the whole map while sliding by the preset step length.
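A minimal version of the sliding-window search in step S202, using a rectangular window of fixed size and step; the window size and step length are illustrative assumptions, and the sketch assumes the coordinate array from the previous sketch is non-empty:

```python
import numpy as np

def densest_window(gray, coords, win=64, step=16):
    """Slide a win x win window over the image and return the window position that
    contains the most bright pixel points, plus the third gray mean and the
    coordinate mean of the bright pixel points inside that window."""
    h, w = gray.shape
    best = (0, 0, -1)                                    # (top, left, count)
    for top in range(0, max(h - win, 0) + 1, step):
        for left in range(0, max(w - win, 0) + 1, step):
            inside = ((coords[:, 0] >= top) & (coords[:, 0] < top + win) &
                      (coords[:, 1] >= left) & (coords[:, 1] < left + win))
            n = int(inside.sum())
            if n > best[2]:
                best, best_mask = (top, left, n), inside
    pts = coords[best_mask]                              # bright pixel points in the best window
    third_gray_mean = float(gray[pts[:, 0], pts[:, 1]].mean())
    coord_mean = pts.mean(axis=0)                        # (row, col) estimate of the centre point
    return best[:2], third_gray_mean, coord_mean
```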
As shown in fig. 3, under a backlight condition the light source region of the preview image is, because of the light source, the region where the pixel points with gray values in the first preset range are most concentrated. The position at which the sliding window contains the most pixel points is therefore the position of the light source region, and the point given by the coordinate mean at that position (the central point of the rectangular window in fig. 3) is the central point of the light source region.
Step S203, within a preset distance from the point at which the coordinate mean is located, taken as the center point, taking the region formed by the pixel points whose difference from the third gray mean is within a second preset range as the light source region, and recording the light source region as the first region.
In the embodiments of the present application, as described above, the central point of the light source region can be determined from the position at which the sliding window contains the most pixel points, and the gray values of the pixel points in the light source region all lie close to 255, within a certain range of the third gray mean. The light source region can therefore be determined as the region formed, within a preset distance from the point at which the coordinate mean is located (taken as the center point), by the pixel points whose difference from the third gray mean is within the second preset range.
Continuing with fig. 3 as an example, the light source region in fig. 3 may be a region centered at a point inside the rectangular window; the specific extent of the light source region, however, can be specified manually. Let the third gray mean, i.e. the mean of the pixel points in the light source region, be h, and let the second preset range be [0, a]. Then a pixel point whose difference from the third gray mean is within the second preset range has a gray value x satisfying |x - h| ≤ a, i.e. x ∈ [h - a, h + a].
In practical applications, the third gray mean is the mean of all pixel points inside the window position that contains the most pixel points; since the size of the sliding window is set manually, the window size affects the third gray mean. The third gray mean therefore need not be the center of the gray range chosen for the light source region. Suppose the third gray mean is 245: the gray values of the pixel points in the light source region may be set to [245 - a1, 245 + a2], where a1 and a2 may or may not be equal. Suppose a1 = a2 = 5; with the third gray mean h being the mean of the pixel points in the light source region and the second preset range being [0, 5], the gray values of the pixel points in the light source region lie in [240, 250]. In the grayscale image of the preview image, however, the set of all pixel points with gray values in [240, 250] is not yet the light source region; it must further be restricted to a circular area centered at the central point of the rectangular window, and the region formed by the pixel points with gray values in [240, 250] inside that circular area is the light source region. The size of the circular area may be limited by a preset value, for example a circular area within a preset distance of the point at which the coordinate mean is located. The preset distance may be specified manually or obtained by calculation: for example, set a step length b, take the point at which the coordinate mean is located as the center, increase the radius of the circular area by the step length b (r = nb, n a positive integer), compute the mean of the pixel points in the circular area for each radius, plot the curve of this mean against the radius, and record the point at which the slope of the tangent to the curve falls below a preset slope; the radius value at that point is the preset distance.
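A sketch of step S203: keep the pixel points that lie within the preset distance of the estimated center point and whose gray value differs from the third gray mean by at most a. The radius (preset distance) and a are illustrative values, not taken from the patent.

```python
import numpy as np

def light_source_mask(gray, coord_mean, third_gray_mean, radius=80, a=5):
    """Boolean mask of the first region (light source region)."""
    h, w = gray.shape
    rr, cc = np.mgrid[0:h, 0:w]
    within_radius = (rr - coord_mean[0]) ** 2 + (cc - coord_mean[1]) ** 2 <= radius ** 2
    close_to_mean = np.abs(gray.astype(np.int32) - third_gray_mean) <= a
    return within_radius & close_to_mean
```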
After the light source region in the preview image has been found, it is recorded as the first region, and the region of the preview image outside the first region is taken as the second region.
It should be noted that the light source region is not the whole circular area of radius equal to the preset distance centered at the point where the coordinate mean is located; it is the region composed of the pixel points inside that circular area whose gray values differ from the third gray mean by an amount within the second preset range. Furthermore, the light source region does not necessarily contain an actual light source: under a non-backlight condition, for example, the determined light source region may simply be the area of white clothing in the preview image. In that case the difference between the first gray mean of the pixel points in the first region and the second gray mean of the pixel points in the second region will be less than or equal to the preset value, and it will be determined that the current preview image was not captured under a backlight condition. Even if no face is detected then, it may simply be that no face is present in the current preview image.
After the first region and the second region are determined, whether a backlight condition is present is determined from them. Steps S204 and S205 describe how to determine whether the current photographing environment is a backlight condition according to the first gray mean of the pixel points in the first region and the second gray mean of the pixel points in the second region of the preview image.
Step S204, if the difference value between the first gray level mean value of the pixel points in the first area and the second gray level mean value of the pixel points in the second area is larger than a preset value, determining that the current photographing environment is a backlight condition.
Step S205, if a difference between the first gray-scale average of the pixels in the first region and the second gray-scale average of the pixels in the second region is less than or equal to the preset value, determining that the current photographing environment is not a backlight condition.
In the embodiments of the present application, comparing gray histograms under backlight and non-backlight conditions shows that under backlight more pixel points fall on the very bright and very dark gray levels and relatively few on the intermediate levels, whereas under non-backlight conditions fewer pixel points fall on the very bright and very dark levels and more on the intermediate levels. The light source region determined above is the concentrated area of the pixel points on the very bright gray levels. Under a backlight condition, once the first region (the light source region) is excluded, the gray mean of the remaining second region is relatively small (many pixel points lie in the very dark range). Under a non-backlight condition the first region is very small, and after it is excluded the gray mean of the remaining second region is relatively large (many pixel points lie in the intermediate range). It follows that the difference between the first gray mean of the pixel points in the first region and the second gray mean of the pixel points in the second region is larger under a backlight condition than under a non-backlight condition. A preset value can therefore be set: if the difference between the first gray mean and the second gray mean is greater than the preset value, the current photographing environment is determined to be a backlight condition; if the difference is less than or equal to the preset value, it is determined not to be a backlight condition.
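Putting steps S201 to S205 together, reusing the helper sketches above; the preset value (threshold on the difference of the two gray means) is an assumed number for illustration:

```python
def is_backlit(preview_bgr, preset_value=60):
    """Backlight test: find the light source region, then compare the first gray
    mean (first region) with the second gray mean (rest of the preview image)."""
    gray, coords = bright_pixel_coords(preview_bgr)
    if len(coords) == 0:
        return False                                     # no bright pixels at all
    _, third_mean, centre = densest_window(gray, coords)
    first = light_source_mask(gray, centre, third_mean)
    if not first.any() or first.all():
        return False                                     # degenerate first/second region
    first_mean = float(gray[first].mean())               # first gray mean
    second_mean = float(gray[~first].mean())             # second gray mean
    return (first_mean - second_mean) > preset_value
```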
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Fig. 4 is a schematic block diagram of a mobile terminal according to an embodiment of the present application, and only a portion related to the embodiment of the present application is shown for convenience of description.
The mobile terminal 4 may be a software unit, a hardware unit or a combination of the two built into a mobile terminal such as a mobile phone, tablet computer or notebook computer, or may be integrated into such a mobile terminal as an independent component.
The mobile terminal 4 includes:
a determining module 41, configured to determine whether a current backlight condition is present after a camera of the mobile terminal is started;
the first detection module 42 is configured to detect whether a human face exists in a preview image of the camera through the first human face detection model if the backlight condition is met;
and the labeling module 43 is configured to, if a human face is detected in a preview image of the camera by the first human face detection model, mark a human face region in the preview image.
Optionally, the mobile terminal 4 further includes:
a second detection module 44, configured to detect whether a human face exists in a preview image of the camera through a second human face detection model before determining whether the current condition is a backlight condition, where detection accuracy of the second human face detection model is lower than that of the first human face detection model;
the determining module 41 is further configured to determine whether a backlight condition exists currently if no face is detected in the preview image of the camera through the second face detection model.
Optionally, the determining module 41 includes:
a light source region determining unit 411, configured to obtain a pixel point in a current preview image, where a gray value of the pixel point is within a first preset range, determine a light source region in the preview image according to coordinates of the pixel point, and mark the determined light source region as a first region;
a backlight determining unit 412, configured to determine whether the current photographing environment is a backlight condition according to a first gray average of pixel points in the first region and a second gray average of pixel points in a second region in the preview image, where the second region is a region outside the first region in the preview image.
Optionally, the light source region determining unit 411 includes:
a distribution map obtaining subunit 4111, configured to generate a pixel distribution map according to the coordinates of the pixels of which the gray values are within a first preset range;
a mean value determining subunit 4112, configured to slide on the pixel point distribution map through a sliding window with a preset width to obtain a position where the sliding window includes the most pixel points, and record a third grayscale mean value and a coordinate mean value of all pixel points included in the sliding window at the position;
and a light source region determining subunit 4113, configured to take, as the light source region, the region formed, within a preset distance from the point at which the coordinate mean is located (taken as the center point), by the pixel points whose difference from the third gray mean is within the second preset range.
Optionally, the backlight determining unit 412 is further configured to:
if the difference value between the first gray average value of the pixel points in the first area and the second gray average value of the pixel points in the second area is larger than a preset value, determining that the current photographing environment is a backlight condition;
and if the difference value between the first gray average value of the pixel points in the first area and the second gray average value of the pixel points in the second area is smaller than or equal to the preset value, determining that the current photographing environment is not in a backlight condition.
Optionally, the first face detection model is a MobileNet-SSD convolutional neural network model, and the second face detection model is an HSV skin color model.
It will be apparent to those skilled in the art that, for convenience and simplicity of description, the foregoing functional units and modules are merely illustrated in terms of division, and in practical applications, the foregoing functional allocation may be performed by different functional units and modules as needed, that is, the internal structure of the mobile terminal is divided into different functional units or modules to perform all or part of the above described functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the above-mentioned apparatus may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Fig. 5 is a schematic block diagram of a mobile terminal according to another embodiment of the present application. As shown in fig. 5, the mobile terminal 5 of this embodiment includes: one or more processors 50, a memory 51 and a computer program 52 stored in the memory 51 and executable on the processors 50. The processor 50, when executing the computer program 52, implements the steps in the above-described embodiments of the face detection method, such as steps S101 to S103 shown in fig. 1. Alternatively, the processor 50, when executing the computer program 52, implements the functions of the modules/units in the above-described mobile terminal embodiments, such as the functions of the modules 41 to 43 shown in fig. 4.
Illustratively, the computer program 52 may be partitioned into one or more modules/units, which are stored in the memory 51 and executed by the processor 50 to accomplish the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 52 in the mobile terminal 5. For example, the computer program 52 may be divided into a determination module, a first detection module, an annotation module.
The determining module is used for determining whether the current condition is a backlight condition or not after the camera of the mobile terminal is started;
the first detection module is used for detecting whether a human face exists in a preview image of the camera through the first human face detection model if the backlight condition exists;
and the marking module is used for marking a face area in a preview image of the camera if the first face detection model detects a face in the preview image.
Other modules or units can refer to the description of the embodiment shown in fig. 4, and are not described again here.
The mobile terminal includes, but is not limited to, a processor 50, a memory 51. Those skilled in the art will appreciate that fig. 5 is only one example of a mobile terminal 5 and is not intended to limit the mobile terminal 5 and may include more or fewer components than shown, or some components may be combined, or different components, e.g., the mobile terminal may also include input devices, output devices, network access devices, buses, etc.
The processor 50 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 51 may be an internal storage unit of the mobile terminal 5, such as a hard disk or a memory of the mobile terminal 5. The memory 51 may also be an external storage device of the mobile terminal 5, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, provided on the mobile terminal 5. Further, the memory 51 may also include both an internal storage unit and an external storage device of the mobile terminal 5. The memory 51 is used for storing the computer program and other programs and data required by the mobile terminal. The memory 51 may also be used to temporarily store data that has been output or is to be output.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed mobile terminal and method may be implemented in other ways. For example, the above-described embodiments of the mobile terminal are merely illustrative, and for example, the division of the modules or units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow in the method of the embodiments described above can be realized by a computer program, which can be stored in a computer-readable storage medium and can realize the steps of the embodiments of the methods described above when the computer program is executed by a processor. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain other components which may be suitably increased or decreased as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media which may not include electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (8)

1. A face detection method is applied to a mobile terminal, and the method comprises the following steps:
after a camera of the mobile terminal is started, determining whether the current condition is a backlight condition; the determining whether the backlight condition is currently present comprises: acquiring a pixel point of which the gray value is within a first preset range in a current preview image, confirming a light source area in the preview image according to the coordinate of the pixel point, and recording the confirmed light source area as a first area, wherein the method comprises the following steps of: generating a pixel point distribution map according to the coordinates of the pixel points of the gray value in a first preset range; sliding a sliding window with a preset width on the pixel distribution map to obtain the position of the sliding window when the sliding window comprises the most pixels, and recording a third gray average value and a coordinate average value of all pixels included in the sliding window when the sliding window is at the position, wherein the third gray average value is an average value of all pixels in the position of the sliding window when the sliding window comprises the most pixels; within a preset distance taking the point where the coordinate mean value is located as a central point, a region formed by pixel points of which the difference value with the third gray scale mean value is within a second preset range is taken as the light source region; determining whether the current photographing environment is a backlight condition or not according to a first gray average value of pixel points of the first region and a second gray average value of pixel points of a second region in the preview image, wherein the second region is a region outside the first region in the preview image;
if the backlight condition is met, detecting whether a human face exists in a preview image of the camera through the first human face detection model;
and if the first face detection model detects a face in a preview image of the camera, marking a face area in the preview image.
2. The face detection method of claim 1, prior to determining whether a backlight condition is present, further comprising:
detecting whether a human face exists in a preview image of a camera through a second human face detection model, wherein the detection precision of the second human face detection model is lower than that of the first human face detection model;
and if the face is not detected in the preview image of the camera through the second face detection model, determining whether the current condition is a backlight condition.
3. The method of claim 1, wherein the determining whether the current photographing environment is a backlight condition according to the first gray average of the pixels in the first region and the second gray average of the pixels in the second region in the preview image comprises:
if the difference value between the first gray average value of the pixel points in the first area and the second gray average value of the pixel points in the second area is larger than a preset value, determining that the current photographing environment is a backlight condition;
and if the difference value between the first gray average value of the pixel points in the first area and the second gray average value of the pixel points in the second area is smaller than or equal to the preset value, determining that the current photographing environment is not in a backlight condition.
4. The face detection method of claim 1 or 2, wherein the first face detection model is a MobileNet-SSD convolutional neural network model;
the second face detection model is an HSV skin color model.
5. A mobile terminal, comprising:
the determining module is used for determining whether the current condition is a backlight condition or not after a camera of the mobile terminal is started; the determining module comprises: the light source area determining unit is used for acquiring pixel points of which the gray values in the current preview image are within a first preset range, determining the light source area in the preview image according to the coordinates of the pixel points, and recording the determined light source area as a first area; the backlight determining unit is used for determining whether the current photographing environment is a backlight condition or not according to a first gray average value of pixel points of the first region and a second gray average value of pixel points of a second region in the preview image, wherein the second region is a region outside the first region in the preview image; the light source region determining unit includes: the distribution graph obtaining subunit is used for generating a pixel distribution graph according to the coordinates of the pixels of which the gray values are within a first preset range; the average value determining subunit is configured to slide on the pixel point distribution map through a sliding window with a preset width to obtain a position where the sliding window includes the most pixel points, and record a third grayscale average value and a coordinate average value of all pixel points included in the sliding window at the position; a light source area determining subunit, configured to determine, as the light source area, an area formed by pixel points whose difference value with the third gray level average value is within a second preset range within a preset distance taking a point where the coordinate average value is located as a center point; the third gray average value is the average value of all pixel points in the position where the sliding window comprises the most pixel points;
a first detection module, configured to detect, with a first face detection model, whether a face is present in the preview image of the camera if a backlight condition is present;
and a marking module, configured to mark a face region in the preview image if the first face detection model detects a face in the preview image of the camera.
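A minimal sketch of the light source localization performed by the subunits of claim 5, assuming a grayscale NumPy array larger than the sliding window; the bright-pixel range, window width, gray tolerance and radius are illustrative stand-ins for the first preset range, preset width, second preset range and preset distance, and treating the window's pixels as the bright candidate pixels it contains is one possible reading of the claim:

```python
import numpy as np

def locate_light_source(gray: np.ndarray,
                        bright_range=(230, 255),  # first preset range (assumed)
                        window=41,                # sliding-window width (assumed)
                        gray_tolerance=20,        # second preset range (assumed)
                        radius=60):               # preset distance (assumed)
    """Return a boolean mask of the light source region (the first region).

    Steps, following claim 5: collect pixels in the first preset range,
    find the sliding-window position containing the most of them, record
    their mean gray value (third gray average) and mean coordinate, then
    keep nearby pixels whose gray value is close to that average.
    """
    gray = gray.astype(np.float32)
    h, w = gray.shape
    bright = (gray >= bright_range[0]) & (gray <= bright_range[1])
    if not bright.any():
        return np.zeros((h, w), dtype=bool)

    # Integral image of the pixel distribution map, so that the count of
    # bright pixels in every window-by-window box can be read off directly.
    ii = np.pad(bright.astype(np.int32), ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    counts = (ii[window:, window:] - ii[:-window, window:]
              - ii[window:, :-window] + ii[:-window, :-window])
    top, left = np.unravel_index(np.argmax(counts), counts.shape)

    # Bright pixels inside the winning window: third gray average and
    # coordinate average.
    win_bright = np.zeros_like(bright)
    win_bright[top:top + window, left:left + window] = \
        bright[top:top + window, left:left + window]
    wy, wx = np.nonzero(win_bright)
    third_gray_mean = gray[wy, wx].mean()
    cy, cx = wy.mean(), wx.mean()

    # Light source region: pixels near the center point whose gray value is
    # within the tolerance of the third gray average.
    yy, xx = np.mgrid[0:h, 0:w]
    near_center = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    similar_gray = np.abs(gray - third_gray_mean) <= gray_tolerance
    return near_center & similar_gray
```

The returned mask is the first region; its mean gray value and the mean of the remaining pixels feed the backlight test of claim 3.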
6. The mobile terminal of claim 5, further comprising:
a second detection module, configured to detect, with a second face detection model, whether a face is present in the preview image of the camera before it is determined whether a backlight condition is currently present;
wherein the determining module is further configured to determine whether a backlight condition is currently present if no face is detected in the preview image of the camera by the second face detection model.
7. A mobile terminal comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 4 when executing the computer program.
8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by one or more processors, implements the steps of the method according to any one of claims 1 to 4.
CN201810530284.3A 2018-05-29 2018-05-29 Face detection method, mobile terminal and computer readable storage medium Active CN108764139B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810530284.3A CN108764139B (en) 2018-05-29 2018-05-29 Face detection method, mobile terminal and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN108764139A (en) 2018-11-06
CN108764139B (en) 2021-01-29

Family

ID=64003304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810530284.3A Active CN108764139B (en) 2018-05-29 2018-05-29 Face detection method, mobile terminal and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN108764139B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109886079A (en) * 2018-12-29 2019-06-14 杭州电子科技大学 A kind of moving vehicles detection and tracking method
CN111144215B (en) * 2019-11-27 2023-11-24 北京迈格威科技有限公司 Image processing method, device, electronic equipment and storage medium
CN111223549B (en) * 2019-12-30 2023-05-12 华东师范大学 Mobile terminal system and method for disease prevention based on posture correction
CN112257503A (en) * 2020-09-16 2021-01-22 深圳微步信息股份有限公司 Sex age identification method, device and storage medium
CN112866581A (en) * 2021-01-18 2021-05-28 盛视科技股份有限公司 Camera automatic exposure compensation method and device and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103810695A (en) * 2012-11-15 2014-05-21 浙江大华技术股份有限公司 Light source positioning method and device
CN106161967A (en) * 2016-09-13 2016-11-23 维沃移动通信有限公司 A kind of backlight scene panorama shooting method and mobile terminal
CN106331510A (en) * 2016-10-31 2017-01-11 维沃移动通信有限公司 Backlight photographing method and mobile terminal
CN107085718A (en) * 2017-05-25 2017-08-22 广东欧珀移动通信有限公司 Method for detecting human face and device, computer equipment, computer-readable recording medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7693301B2 (en) * 2006-10-11 2010-04-06 Arcsoft, Inc. Known face guided imaging method

Also Published As

Publication number Publication date
CN108764139A (en) 2018-11-06

Similar Documents

Publication Publication Date Title
CN108764139B (en) Face detection method, mobile terminal and computer readable storage medium
CN108765278B (en) Image processing method, mobile terminal and computer readable storage medium
CN106250894B (en) Card information identification method and device
CN109005368B (en) High dynamic range image generation method, mobile terminal and storage medium
CN108428214B (en) Image processing method and device
CN110335216B (en) Image processing method, image processing apparatus, terminal device, and readable storage medium
CN108230333B (en) Image processing method, image processing apparatus, computer program, storage medium, and electronic device
CN110796600B (en) Image super-resolution reconstruction method, image super-resolution reconstruction device and electronic equipment
CN106557759B (en) Signpost information acquisition method and device
CN109040596B (en) Method for adjusting camera, mobile terminal and storage medium
CN109005367B (en) High dynamic range image generation method, mobile terminal and storage medium
CN109286758B (en) High dynamic range image generation method, mobile terminal and storage medium
CN110572636B (en) Camera contamination detection method and device, storage medium and electronic equipment
CN108776800B (en) Image processing method, mobile terminal and computer readable storage medium
CN108765380A (en) Image processing method, device, storage medium and mobile terminal
CN111368587A (en) Scene detection method and device, terminal equipment and computer readable storage medium
CN111199169A (en) Image processing method and device
CN108805838B (en) Image processing method, mobile terminal and computer readable storage medium
CN109981989B (en) Method and device for rendering image, electronic equipment and computer readable storage medium
CN116582653A (en) Intelligent video monitoring method and system based on multi-camera data fusion
CN114119695A (en) Image annotation method and device and electronic equipment
CN110047126B (en) Method, apparatus, electronic device, and computer-readable storage medium for rendering image
CN111340722B (en) Image processing method, processing device, terminal equipment and readable storage medium
CN110765875B (en) Method, equipment and device for detecting boundary of traffic target
CN108769521B (en) Photographing method, mobile terminal and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant