CN110059590B

CN110059590B - Face liveness verification method and device, mobile terminal and readable storage medium

Info

Publication number: CN110059590B
Application number: CN201910252907.XA
Authority: CN
Inventors: 徐爱辉
Original assignee: Nubia Technology Co Ltd
Current assignee: Nubia Technology Co Ltd
Priority date: 2019-03-29
Filing date: 2019-03-29
Publication date: 2023-06-30
Anticipated expiration: 2039-03-29
Also published as: CN110059590A

Abstract

The invention discloses a face liveness verification method and device, a mobile terminal and a computer-readable storage medium, applied in the field of mobile terminals. The method comprises the following steps: capturing face images with the left and right cameras of the mobile terminal; performing distortion correction and row alignment on the captured left and right face images respectively; performing face detection on the corrected left and right face images respectively; matching face key points between the left and right images on which face detection has been performed; and performing liveness judgment according to the distances between the face key points. With the embodiments of the invention, attacks using images, videos and the like can be effectively resisted during face recognition authentication; the method is simple, reliable, mature and fast, reduces potential safety hazards and improves the user experience.

Description

Face liveness verification method and device, mobile terminal and readable storage medium

Technical Field

The present invention relates to the field of mobile terminals, and in particular to a binocular-camera-based face liveness verification method and device, a mobile terminal, and a computer-readable storage medium.

Background

With the development of society, face recognition is being applied to more and more products. Although face recognition applications are widely promoted, the user experience is poor. For example, during face recognition authentication the user is required to blink repeatedly to confirm that a living body is in front of the lens, which has two disadvantages:

(1) Sometimes the blink is mistimed, or liveness cannot be clearly detected at all;

(2) The approach is vulnerable to image and video attacks.

Because of these defects in existing face recognition, security authentication based on face recognition carries potential safety hazards and gives a poor user experience.

Disclosure of Invention

Accordingly, the present invention aims to provide a binocular-camera-based face liveness verification method and device, a mobile terminal and a computer-readable storage medium that can effectively resist attacks using images, videos and the like during face recognition authentication; the method is simple, reliable, mature and fast, reduces potential safety hazards and improves the user experience.

The technical scheme adopted by the invention for solving the technical problems is as follows:

According to one aspect of the invention, a face liveness verification method is provided, which is applied to a mobile terminal and comprises the following steps:

capturing face images with the left and right cameras of the mobile terminal;

performing distortion correction and row alignment on the captured left and right face images respectively;

performing face detection on the corrected left and right face images respectively;

matching face key points between the left and right images on which face detection has been performed;

and performing liveness judgment according to the distances between the face key points.

In one possible design, performing distortion correction and row alignment on the captured left and right face images respectively includes:

performing distortion correction on the face images captured by the left and right cameras;

performing binocular correction on the distortion-corrected left and right face images.

In one possible design, performing binocular correction on the distortion-corrected left and right face images includes: rotating the distortion-corrected face images captured by the left and right cameras to perform binocular correction, so that the images acquired by the binocular cameras are mathematically aligned.

In one possible design, performing face detection on the corrected left and right face images respectively includes: detecting and acquiring the positions of the face frames and the face key point information in the corrected left and right face images using the deep-learning-based MTCNN algorithm.

In one possible design, matching face key points between the left and right face images on which face detection has been performed includes: matching the positions of the face key points in the left and right face images according to the obtained face frames and face key point information.

In one possible design, performing liveness judgment according to the distances between the face key points includes:

determining the distances from the left ear and the right ear to the camera lens;

determining the distances from the left eye and the right eye to the lens;

determining the depth distance Dist_eyeToRose from the left eye to the ear;

determining the depth distance Dist_eyeToEye from the left eye to the right eye;

determining the face rotation angle;

and performing liveness judgment according to the depth distance Dist_eyeToRose from the left eye to the ear and the face rotation angle, or according to the depth distance Dist_eyeToEye from the left eye to the right eye and the face rotation angle.

In one possible design, performing liveness judgment according to the depth distance Dist_eyeToRose from the left eye to the ear and the face rotation angle includes: when the face rotation angle is determined to be within the range [0, angle], setting a threshold T1; if Dist_eyeToRose < T1, a non-living attack is determined; or,

performing liveness judgment according to the depth distance Dist_eyeToEye from the left eye to the right eye and the face rotation angle includes: when the face rotation angle is determined to be within the range [angle, 90], judging the depth difference between the left eye and the right eye by setting a threshold T1; if Dist_eyeToEye < T1, a non-living attack is determined.

According to another aspect of the present invention, a face liveness verification device is provided, which is applied to a mobile terminal and includes a shooting module, a correction and row alignment module, a detection module, a matching module and a judging module, wherein:

the shooting module is used for capturing face images with the left and right cameras of the mobile terminal;

the correction and row alignment module is used for performing distortion correction and row alignment on the captured left and right face images respectively;

the detection module is used for performing face detection on the corrected left and right face images respectively;

the matching module is used for matching face key points between the left and right images on which face detection has been performed;

and the judging module is used for performing liveness judgment according to the distances between the face key points.

According to another aspect of the present invention, there is provided a terminal including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the face liveness verification method provided by the embodiments of the invention.

According to another aspect of the present invention, a computer-readable storage medium is provided, on which a face liveness verification program is stored; when the program is executed by a processor, the steps of the face liveness verification method provided by the embodiments of the invention are implemented.

Compared with the prior art, the invention provides a binocular-camera-based face liveness verification method and device, a mobile terminal and a computer-readable storage medium, applied in the field of mobile terminals. The method comprises the following steps: capturing face images with the left and right cameras of the mobile terminal; performing distortion correction and row alignment on the captured left and right face images respectively; performing face detection on the corrected left and right face images respectively; matching face key points between the left and right images on which face detection has been performed; and performing liveness judgment according to the distances between the face key points. With the embodiments of the invention, attacks using images, videos and the like can be effectively resisted during face recognition authentication; the method is simple, reliable, mature and fast, reduces potential safety hazards and improves the user experience.

Drawings

Fig. 1 is a schematic diagram of a hardware structure of a mobile terminal implementing various embodiments of the present invention;

Fig. 2 is a schematic diagram of a communication network system according to an embodiment of the present invention;

Fig. 3 is a schematic flow chart of a face liveness verification method according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a face liveness verification device according to an embodiment of the present invention;

Fig. 5 is a schematic flow chart of a face liveness verification method according to an embodiment of the present invention;

Fig. 6 is a schematic flow chart of a face liveness verification method according to an embodiment of the present invention;

Fig. 7 is a schematic flow chart of a binocular-camera-based face liveness verification method according to an embodiment of the present invention;

Fig. 8 is a schematic flow chart of a binocular-camera-based face liveness verification method according to an embodiment of the present invention;

Fig. 9 is a schematic flow chart of a binocular-camera-based face liveness verification method according to an embodiment of the present invention;

Fig. 10 is a schematic flow chart of a binocular-camera-based face liveness verification method according to an embodiment of the present invention;

Fig. 11 is a schematic structural diagram of a mobile terminal to which the method of the present invention is applied according to an embodiment of the present invention.

The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.

Detailed Description

In order to make the technical problems, technical schemes and beneficial effects to be solved more clear and obvious, the invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the particular embodiments described herein are illustrative only and are not limiting upon the invention.

In the following description, suffixes such as "module", "component", or "unit" used to denote elements serve only to facilitate the description of the present invention and have no specific meaning in themselves. Thus, "module", "component", and "unit" may be used interchangeably.

The terminal may be implemented in various forms. For example, the terminals described in the present invention may include mobile terminals such as cell phones, tablet computers, notebook computers, palm computers, personal digital assistants (Personal Digital Assistant, PDA), portable media players (Portable Media Player, PMP), navigation devices, wearable devices, smart bracelets, pedometers, and fixed terminals such as digital TVs, desktop computers, and the like.

The following description takes a mobile terminal as an example. Those skilled in the art will understand that, apart from elements used specifically for mobile purposes, the configuration according to the embodiments of the present invention can also be applied to a fixed terminal.

Referring to fig. 1, which is a schematic diagram of a hardware structure of a mobile terminal implementing various embodiments of the present invention, the mobile terminal 100 may include: an RF (Radio Frequency) unit 101, a WiFi module 102, an audio output unit 103, an A/V (audio/video) input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, a processor 110, and a power supply 111. Those skilled in the art will appreciate that the mobile terminal structure shown in fig. 1 does not limit the mobile terminal, and that the mobile terminal may include more or fewer components than shown, may combine certain components, or may arrange the components differently.

The following describes the components of the mobile terminal in detail with reference to fig. 1:

The radio frequency unit 101 may be used for receiving and transmitting signals during information transmission and reception or during a call. Specifically, it receives downlink information from the base station and passes it to the processor 110 for processing, and it transmits uplink data to the base station. Typically, the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 101 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA2000 (Code Division Multiple Access 2000), WCDMA (Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), FDD-LTE (Frequency Division Duplexing-Long Term Evolution), and TDD-LTE (Time Division Duplexing-Long Term Evolution).

WiFi is a short-range wireless transmission technology. Through the WiFi module 102, the mobile terminal can help the user send and receive e-mail, browse web pages, access streaming media and the like, providing the user with wireless broadband Internet access. Although fig. 1 shows the WiFi module 102, it is not an essential component of the mobile terminal and can be omitted as required without changing the essence of the invention.

The audio output unit 103 may convert audio data received by the radio frequency unit 101 or the WiFi module 102 or stored in the memory 109 into an audio signal and output as sound when the mobile terminal 100 is in a call signal reception mode, a talk mode, a recording mode, a voice recognition mode, a broadcast reception mode, or the like. Also, the audio output unit 103 may also provide audio output (e.g., a call signal reception sound, a message reception sound, etc.) related to a specific function performed by the mobile terminal 100. The audio output unit 103 may include a speaker, a buzzer, and the like.

The A/V input unit 104 is used to receive an audio or video signal. The A/V input unit 104 may include a graphics processor (Graphics Processing Unit, GPU) 1041 and a microphone 1042. The graphics processor 1041 processes image data of still pictures or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 106. The image frames processed by the graphics processor 1041 may be stored in the memory 109 (or other storage medium) or transmitted via the radio frequency unit 101 or the WiFi module 102. The microphone 1042 can receive sound (audio data) in a phone call mode, a recording mode, a voice recognition mode, and the like, and can process such sound into audio data. In the phone call mode, the processed audio (voice) data may be converted into a format that can be transmitted to the mobile communication base station via the radio frequency unit 101 and then output. The microphone 1042 may implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference generated in the course of receiving and transmitting the audio signal.

The mobile terminal 100 also includes at least one sensor 105, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor and a proximity sensor, wherein the ambient light sensor can adjust the brightness of the display panel 1061 according to the brightness of ambient light, and the proximity sensor can turn off the display panel 1061 and/or the backlight when the mobile terminal 100 is moved to the ear. As one kind of motion sensor, the accelerometer can detect the magnitude of acceleration in all directions (generally three axes), can detect the magnitude and direction of gravity when stationary, and can be used in applications that recognize the attitude of the mobile phone (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer and tap detection). Other sensors that may also be configured in the mobile phone, such as fingerprint sensors, pressure sensors, iris sensors, molecular sensors, gyroscopes, barometers, hygrometers, thermometers, and infrared sensors, are not described in detail here.

The display unit 106 is used to display information input by a user or information provided to the user. The display unit 106 may include a display panel 1061, and the display panel 1061 may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an Organic Light-Emitting Diode (OLED), or the like.

The user input unit 107 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the mobile terminal. In particular, the user input unit 107 may include a touch panel 1071 and other input devices 1072. The touch panel 1071, also referred to as a touch screen, may collect touch operations by a user on or near it (e.g., operations performed by the user on or near the touch panel 1071 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a predetermined program. The touch panel 1071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 110, and can receive and execute commands sent by the processor 110. Further, the touch panel 1071 may be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave. The user input unit 107 may include other input devices 1072 in addition to the touch panel 1071. In particular, the other input devices 1072 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like, which are not specifically limited here.

Further, the touch panel 1071 may overlay the display panel 1061. When the touch panel 1071 detects a touch operation on or near it, the operation is passed to the processor 110 to determine the type of touch event, and the processor 110 then provides a corresponding visual output on the display panel 1061 according to the type of touch event. Although in fig. 1 the touch panel 1071 and the display panel 1061 are two independent components implementing the input and output functions of the mobile terminal, in some embodiments the touch panel 1071 may be integrated with the display panel 1061 to implement the input and output functions of the mobile terminal, which is not limited here.

The interface unit 108 serves as an interface through which at least one external device can be connected with the mobile terminal 100. For example, the external devices may include a wired or wireless headset port, an external power (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 108 may be used to receive input (e.g., data information, power, etc.) from an external device and transmit the received input to one or more elements within the mobile terminal 100 or may be used to transmit data between the mobile terminal 100 and an external device.

Memory 109 may be used to store software programs as well as various data. The memory 109 may mainly include a storage program area that may store an operating system, application programs required for at least one function (such as a sound playing function, an image playing function, etc.), and a storage data area; the storage data area may store data (such as audio data, phonebook, etc.) created according to the use of the handset, etc. In addition, memory 109 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.

The processor 110 is a control center of the mobile terminal, connects various parts of the entire mobile terminal using various interfaces and lines, and performs various functions of the mobile terminal and processes data by running or executing software programs and/or modules stored in the memory 109 and calling data stored in the memory 109, thereby performing overall monitoring of the mobile terminal. Processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a modem processor that primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 110.

The mobile terminal 100 may further include a power source 111 (e.g., a battery) for supplying power to the respective components, and preferably, the power source 111 may be logically connected to the processor 110 through a power management system, so as to perform functions of managing charging, discharging, and power consumption management through the power management system.

Although not shown in fig. 1, the mobile terminal 100 may further include a bluetooth module or the like, which is not described herein.

In order to facilitate understanding of the embodiments of the present invention, a communication network system on which the mobile terminal of the present invention is based will be described below.

Referring to fig. 2, which is a schematic diagram of a communication network system according to an embodiment of the present invention, the communication network system is an LTE system of universal mobile telecommunications technology, and the LTE system includes a UE (User Equipment) 201, an E-UTRAN (Evolved UMTS Terrestrial Radio Access Network) 202, an EPC (Evolved Packet Core) 203, and an operator's IP services 204, which are communicatively connected in sequence.

Specifically, the UE201 may be the terminal 100 described above, and will not be described herein.

The E-UTRAN202 includes eNodeB2021 and other eNodeB2022, etc. The eNodeB2021 may be connected with other eNodeB2022 by a backhaul (e.g., an X2 interface), the eNodeB2021 is connected to the EPC203, and the eNodeB2021 may provide access from the UE201 to the EPC 203.

EPC203 may include MME (Mobility Management Entity) 2031, HSS (Home Subscriber Server) 2032, other MMEs 2033, SGW (Serving Gateway) 2034, PGW (PDN Gateway) 2035, PCRF (Policy and Charging Rules Function) 2036, and so on. The MME2031 is a control node that handles signaling between the UE201 and the EPC203, providing bearer and connection management. The HSS2032 provides registers to manage functions such as the home location register (not shown) and holds user-specific information about service characteristics, data rates, and the like. All user data may be sent through the SGW2034; the PGW2035 may provide IP address allocation and other functions for the UE201; and the PCRF2036 is the policy and charging control decision point for service data flows and IP bearer resources, which selects and provides available policy and charging control decisions for a policy and charging enforcement function (not shown).

IP services 204 may include the internet, intranets, IMS (IP Multimedia Subsystem ), or other IP services, etc.

Although the LTE system is described above as an example, it should be understood by those skilled in the art that the present invention is not limited to LTE systems, but may be applied to other wireless communication systems, such as GSM, CDMA2000, WCDMA, TD-SCDMA, and future new network systems.

Based on the above mobile terminal hardware structure and the communication network system, various embodiments of the method of the present invention are provided.

Please refer to fig. 3. An embodiment of the invention provides a binocular-camera-based face liveness verification method, which is applied to a mobile terminal and comprises the following steps:

S1, capturing face images with the left and right cameras of the mobile terminal;

S2, performing distortion correction and row alignment on the captured left and right face images respectively;

S3, performing face detection on the corrected left and right face images respectively;

S4, matching face key points between the left and right images on which face detection has been performed;

S5, performing liveness judgment according to the distances between the face key points.

Further, before step S1 of capturing face images with the left and right cameras of the mobile terminal, the method further includes: building a binocular camera environment with the mobile terminal.

The mobile terminal is provided with two cameras, which are used to build the binocular camera environment.

Further, in step S1, capturing face images with the left and right cameras of the mobile terminal includes: capturing face images of the user simultaneously with the left and right cameras of the mobile terminal.

Further, in step S2, performing distortion correction and row alignment on the captured left and right face images respectively includes:

S21, performing distortion correction on the face images captured by the left and right cameras;

S22, performing binocular correction on the distortion-corrected left and right face images, which includes: rotating the distortion-corrected face images captured by the left and right cameras to perform binocular correction, so that the images acquired by the binocular cameras are mathematically aligned.
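As an illustration of step S21 and of the row alignment that step S22 aims for, the following is a minimal Python sketch, not the patented implementation. It assumes a two-coefficient radial distortion model on normalized image coordinates (the source does not name a distortion model); the function names `distort`, `undistort` and `rows_aligned`, the coefficients `k1` and `k2`, and the tolerance `tol` are all hypothetical.

```python
from typing import Iterable, Tuple

Point2D = Tuple[float, float]


def distort(x: float, y: float, k1: float, k2: float) -> Point2D:
    """Apply an assumed two-coefficient radial distortion to a normalized point."""
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * factor, y * factor


def undistort(xd: float, yd: float, k1: float, k2: float,
              iterations: int = 20) -> Point2D:
    """Invert the radial model by fixed-point iteration (assumes mild distortion)."""
    x, y = xd, yd  # start from the distorted point
    for _ in range(iterations):
        r2 = x * x + y * y
        factor = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / factor, yd / factor
    return x, y


def rows_aligned(left_pts: Iterable[Point2D], right_pts: Iterable[Point2D],
                 tol: float = 1.0) -> bool:
    """Check that matched points share (nearly) the same row after binocular correction."""
    return all(abs(l[1] - r[1]) <= tol for l, r in zip(left_pts, right_pts))
```

In this sketch, `undistort` recovers the ideal point only when the distortion is small enough for the iteration to converge; `rows_aligned` is just a sanity check that corresponding points in the two corrected images fall on the same row.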

Further, in step S3, performing face detection on the corrected left and right face images respectively includes:

detecting and acquiring the positions of the face frames and the face key point information in the corrected left and right face images using the deep-learning-based MTCNN algorithm. Here, the face key points include the ears, nose, mouth, eyes, and the like.

Further, in step S4, matching face key points between the left and right face images on which face detection has been performed includes:

matching the positions of the face key points in the left and right face images according to the obtained face frames and face key point information, for example: the left ear in the left face image is matched with the left ear in the right face image, the nose in the left face image is matched with the nose in the right face image, and so on.
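The source does not state how matched key points are converted into distances from the lens. One common choice for row-aligned binocular images, assumed here, is triangulation from horizontal disparity, Z = f * B / (x_left - x_right). The sketch below pairs key points by name and triangulates each pair; `triangulate`, `match_and_triangulate`, the focal length `focal_px`, the baseline `baseline_mm` and the principal point `(cx, cy)` are hypothetical names and parameters, not taken from the patent.

```python
from typing import Dict, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def triangulate(left: Point2D, right: Point2D, focal_px: float,
                baseline_mm: float, cx: float, cy: float) -> Point3D:
    """Recover (x, y, z) in the left-camera frame from one matched point pair."""
    disparity = left[0] - right[0]  # horizontal shift between the two views
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_mm / disparity
    x = (left[0] - cx) * z / focal_px
    y = (left[1] - cy) * z / focal_px
    return x, y, z


def match_and_triangulate(left_kps: Dict[str, Point2D],
                          right_kps: Dict[str, Point2D],
                          focal_px: float, baseline_mm: float,
                          cx: float, cy: float) -> Dict[str, Point3D]:
    """Pair key points that carry the same name in both images, then triangulate."""
    return {
        name: triangulate(left_kps[name], right_kps[name],
                          focal_px, baseline_mm, cx, cy)
        for name in left_kps
        if name in right_kps
    }
```

For example, with an assumed focal length of 1000 pixels and a 20 mm baseline, a key point at x = 700 in the left image and x = 650 in the right image has a disparity of 50 pixels and therefore a depth of 400 mm.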

Further, in step S5, performing liveness judgment according to the distances between the face key points includes:

S51, determining the distances from the left ear and the right ear to the camera lens, wherein D_left_rose and D_right_rose respectively represent the distances from the left ear and the right ear to the camera lens, and (x1, y1, z1) represents the three-dimensional coordinates of the left ear;

S52, determining the distances from the left eye and the right eye to the lens, wherein D_eye_left and D_eye_right respectively represent the distances from the left eye and the right eye to the lens, (x2, y2, z2) represents the three-dimensional coordinates of the left eye, and (x4, y4, z4) represents the three-dimensional coordinates of the right eye;

S53, determining the depth distance from the left eye to the ear: Dist_eyeToRose = |z2 - z1|;

S54, determining the depth distance from the left eye to the right eye: Dist_eyeToEye = |z4 - z2|;

S55, determining the face rotation angle;

S56, performing liveness judgment according to the depth distance Dist_eyeToRose from the left eye to the ear and the face rotation angle, or according to the depth distance Dist_eyeToEye from the left eye to the right eye and the face rotation angle, including:

when the face rotation angle is determined to be within the range [0, angle], setting a threshold T1; if Dist_eyeToRose < T1, a non-living attack is determined; or,

when the face rotation angle is determined to be within the range [angle, 90], judging the depth difference between the left eye and the right eye by setting a threshold T1; if Dist_eyeToEye < T1, a non-living attack is determined.
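The decision rule of steps S53 to S56 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the patented implementation: the function name `is_live` and the parameter names are hypothetical, the boundary value `angle` is assigned to the first branch because the source lists it in both ranges, and no concrete values for `angle` or T1 are given in the source.

```python
from typing import Tuple

Point3D = Tuple[float, float, float]


def is_live(ear: Point3D, left_eye: Point3D, right_eye: Point3D,
            rotation_deg: float, angle_deg: float, t1: float) -> bool:
    """Return False when the depth difference suggests a flat (non-living) target."""
    if not 0.0 <= rotation_deg <= 90.0:
        raise ValueError("rotation angle must lie in [0, 90] degrees")
    z1, z2, z4 = ear[2], left_eye[2], right_eye[2]
    if rotation_deg <= angle_deg:
        # Rotation in [0, angle]: compare eye-to-ear depth (Dist_eyeToRose in the text).
        dist_eye_to_rose = abs(z2 - z1)
        return dist_eye_to_rose >= t1
    # Rotation in (angle, 90]: compare left-eye to right-eye depth (Dist_eyeToEye).
    dist_eye_to_eye = abs(z4 - z2)
    return dist_eye_to_eye >= t1
```

The intuition is that a printed photo or a screen is nearly planar, so the depth difference between key points stays below the threshold, whereas a real face shows a measurable depth difference between the eye and the ear when facing the camera, or between the two eyes when the head is turned.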

The embodiment of the invention provides a binocular-camera-based face liveness verification method, which is applied to a mobile terminal and comprises the following steps: capturing face images with the left and right cameras of the mobile terminal; performing distortion correction and row alignment on the captured left and right face images respectively; performing face detection on the corrected left and right face images respectively; matching face key points between the left and right images on which face detection has been performed; and performing liveness judgment according to the distances between the face key points. With the embodiments of the invention, attacks using images, videos and the like can be effectively resisted during face recognition authentication; the method is simple, reliable, mature and fast, reduces potential safety hazards and improves the user experience.

Please refer to fig. 4. The embodiment of the invention provides a face activity experience verification device based on a binocular camera, which is applied to a mobile terminal and comprises the following components: shooting module 10, correction and line alignment module 20, detection module 30, matching module 40, judging module 50, wherein:

the shooting module 10 is configured to shoot face images using the left and right cameras of the mobile terminal;

the correction and line alignment module 20 is configured to perform distortion correction and line alignment on the captured left and right face images, respectively;

the detection module 30 is configured to perform face detection on the corrected left and right face images, respectively;

the matching module 40 is configured to match face key points between the left and right images after face detection;

the judging module 50 is configured to perform living body judgment according to the distances between the face key points.
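The way the five modules hand data to one another can be sketched as a simple pipeline. This is a hypothetical Python sketch for illustration only: the class name `LivenessPipeline`, the field names and the callable signatures are assumptions, and each module is represented by a placeholder callable rather than a real implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class LivenessPipeline:
    shoot: Callable[[], Tuple[Any, Any]]                      # shooting module 10
    correct_and_align: Callable[[Any, Any], Tuple[Any, Any]]  # module 20
    detect: Callable[[Any], dict]                             # detection module 30
    match: Callable[[dict, dict], dict]                       # matching module 40
    judge: Callable[[dict], bool]                             # judging module 50

    def run(self) -> bool:
        """Run the modules in order; returns whatever the judge returns
        (assumed here: True for a living body)."""
        left, right = self.shoot()
        left, right = self.correct_and_align(left, right)
        matched = self.match(self.detect(left), self.detect(right))
        return self.judge(matched)
```

Keeping each stage as a separate callable mirrors the module split in the text and lets any single stage be replaced or tested in isolation.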

Further, the shooting module 10 is also configured to build a binocular camera environment using the mobile terminal, which is provided with two cameras.

Further, the shooting module 10 is configured to shoot face images of a user with the left and right cameras of the mobile terminal simultaneously.

Further, the correction and line alignment module 20 is specifically configured to:

perform binocular correction on the distortion-corrected left and right face images, including: rotating the distortion-corrected face images shot by the left and right cameras so that the images acquired by the binocular cameras are kept mathematically aligned.

The detection module 30 is specifically configured to:

The matching module 40 is specifically configured to:

The judging module 50 is specifically configured to:

determining the distances from the left and right ears to the camera lens, wherein D_left_ear and D_right_ear respectively represent the distances from the left ear and the right ear to the camera lens, and (x1, y1, z1) represents the three-dimensional coordinates of the left ear;

determining the distances from the left and right eyes to the lens, wherein D_eye_left and D_eye_right respectively represent the distances from the left eye and the right eye to the lens, (x2, y2, z2) represents the three-dimensional coordinates of the left eye and (x4, y4, z4) represents the three-dimensional coordinates of the right eye;

determining the depth distance from the left eye to the ear: Dist_eyeToEar = |z2 - z1|;

determining the depth distance from the left eye to the right eye: Dist_eyeToEye = |z4 - z2|;

determining a face rotation angle;

performing living body judgment according to the depth distance Dist_eyeToEar from the left eye to the ear and the face rotation angle, or according to the depth distance Dist_eyeToEye from the left eye to the right eye and the face rotation angle, including:

when the face rotation angle is determined to be within the range [0, angle], a threshold T1 is set, and Dist_eyeToEar < T1 indicates a non-living attack;

The embodiment of the invention provides a face liveness verification device based on a binocular camera, which is applied to a mobile terminal and comprises a shooting module, a correction and line alignment module, a detection module, a matching module and a judging module, wherein: the shooting module shoots face images using the left and right cameras of the mobile terminal; the correction and line alignment module performs distortion correction and line alignment on the captured left and right face images respectively; the detection module performs face detection on the corrected left and right face images respectively; the matching module matches face key points between the left and right images after face detection; and the judging module performs living body judgment according to the distances between the face key points. With this embodiment, face recognition authentication can effectively resist attacks that use photos, videos and the like; the device is simple, reliable, mature and fast, reduces potential security risks, and improves the user experience.

It should be noted that the above device embodiments and method embodiments belong to the same concept, the specific implementation process of the device embodiments is detailed in the method embodiments, and technical features in the method embodiments are applicable correspondingly in the device embodiments, which are not repeated herein.

The technical scheme of the present invention is described in further detail below with reference to examples.

Please refer to fig. 5.

The embodiment of the invention provides a face liveness verification method based on a binocular camera, which is applied to a mobile terminal and comprises the following steps:

Step S501, building a binocular camera environment using the mobile terminal.

Step S502, shooting face images using the left and right cameras of the mobile terminal, including: simultaneously shooting face images of the user using the left camera and the right camera of the mobile terminal.

Step S503, performing distortion correction on the face images captured by the left and right cameras.

This step aims to eliminate the face distortion caused by the imaging lens; the distortion is particularly large near the edge of the lens's field of view.

Distortion includes radial distortion and tangential distortion. Radial distortion occurs because light rays bend more far from the center of the lens than near the center. Tangential distortion occurs because manufacturing imperfections leave the lens not exactly parallel to the image plane.

After distortion correction, the distortion across the whole field of view of the face image is largely eliminated, which improves face recognition accuracy.
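The source does not name a specific distortion model. As an illustration only, the following Python sketch uses the widely used polynomial model with two radial coefficients (k1, k2) and two tangential coefficients (p1, p2), applied to normalized image coordinates. The function names `distort` and `undistort` are hypothetical, and the coefficient values would normally come from camera calibration, which is not described here.

```python
def distort(x, y, k1, k2, p1, p2):
    """Apply radial (k1, k2) and tangential (p1, p2) distortion to a
    point given in normalized image coordinates."""
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 * r2
    xd = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    yd = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return xd, yd


def undistort(xd, yd, k1, k2, p1, p2, iterations=20):
    """Invert distort() by fixed-point iteration; adequate when the
    distortion is mild, which is an assumption of this sketch."""
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        radial = 1.0 + k1 * r2 + k2 * r2 * r2
        dx = 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
        dy = p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
        x = (xd - dx) / radial
        y = (yd - dy) / radial
    return x, y
```

Because the radial term grows with the squared distance from the image center, points near the edge of the field of view are displaced the most, which matches the observation above that distortion is largest at the periphery.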

Step S504, performing binocular correction on the distortion-corrected left and right face images, including: rotating the distortion-corrected face images shot by the left and right cameras so that the images acquired by the binocular cameras are kept mathematically aligned.

Please refer to fig. 6. When the binocular cameras are installed on the mobile terminal, the left camera and the right camera cannot be made perfectly level with each other, and a certain rotation exists between them. Binocular correction is therefore performed so that the images acquired by the binocular cameras are kept mathematically aligned.

Please refer to fig. 7. After the left and right face images shot by the left and right cameras are rotated, the same content in the two images lies on the same horizontal line.

The final effect of rotating the left and right face images shot by the left and right cameras is shown in fig. 8. At this point, corresponding face key points in the two images lie on the same horizontal line, so the distance between a specific part of the face and the camera can be calculated accurately.

Step S505, performing face detection on the corrected left and right face images, respectively, including:

detecting and acquiring the positions of the face frames and the face key point information in the corrected left and right face images by using the deep-learning-based MTCNN algorithm. The face key points include the ears, nose, mouth, eyes and the like. As shown in fig. 8, when the MTCNN algorithm detects a face, it locates each key point of the left and right faces while locating the position of the face frame.

Step S506, matching face key points between the left and right face images after face detection, including:

In step S505, the MTCNN algorithm obtains the face frames and the face key point information, and the key points of the left face and the right face are in one-to-one correspondence. For example, as shown in fig. 8, the left-ear coordinates in the left face image correspond to the left-ear coordinates in the right face image, and the nose coordinates in the left and right face images correspond to each other.
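A minimal Python sketch of this one-to-one matching follows. It assumes, hypothetically, that the detector output for each image has already been converted into a dictionary mapping a key point name to its (x, y) pixel coordinates; the function name `match_keypoints`, the key point names and the row-offset check are illustrative assumptions and are not taken from the source.

```python
def match_keypoints(left_kps, right_kps, max_row_offset=2.0):
    """Pair same-named key points from the left and right images.

    left_kps / right_kps map a key point name to (x, y) pixel coordinates.
    Pairs whose rows differ by more than max_row_offset pixels are dropped,
    on the assumption that aligned images place them on nearly the same row.
    """
    matches = {}
    for name in left_kps.keys() & right_kps.keys():
        (xl, yl), (xr, yr) = left_kps[name], right_kps[name]
        if abs(yl - yr) <= max_row_offset:
            matches[name] = ((xl, yl), (xr, yr))
    return matches
```

Matching by name relies on the detector labeling the same facial part consistently in both images; the row check is an optional sanity filter that discards pairs the alignment step failed to bring onto the same row.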

Step S507, performing living body judgment according to the distance between the face key points, including:

As shown in fig. 9, there are two coordinate systems, OR and OT. The point P1 in the OR coordinate system and the point P2 in the OT coordinate system are the corresponding left-eye coordinates, so the distance from the person's left eye to the baseline b can be calculated from the disparity between P1 and P2.

According to fig. 9, in combination with the key points of the left and right faces shown in fig. 8, the distances between the relevant face key points and the cameras can be calculated.

Please refer to fig. 8 and 9. Fig. 9 is an ideal dual-camera model, in which the projections of a point P onto OR and OT are P1 and P2, respectively. P1 and P2 then lie on the same image row, and the depth distance Z from P to the camera can be obtained from the disparity between P1 and P2.

P1 and P2 also denote the horizontal positions of the point in the left and right images, so the disparity is d = P1 - P2. The depth distance Z is inversely proportional to the disparity d, and Z can be calculated using similar triangles:

Z = f · b / d

In the above formula, f represents the focal length and b represents the distance between the left and right lenses. This result holds for the ideal model. In practical applications, however, P1 and P2 are unlikely to lie on the same horizontal line, and the row error between the left and right images is large; if the distance from P to the lens is calculated directly from the disparity map, the result is highly inaccurate. Therefore, adjustments are made according to the method of the embodiments of the present invention.
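For the ideal model, the similar-triangle relation is commonly written Z = f · b / d, with d the disparity, f the focal length and b the lens separation. The Python sketch below illustrates that relation only; the function name `depth_from_disparity`, the parameter names and the example numbers are hypothetical, and the sketch does not include the adjustments the embodiment makes for imperfectly aligned images.

```python
def depth_from_disparity(x_left, x_right, focal_px, baseline):
    """Depth Z = f * b / d for an ideal, row-aligned stereo pair.

    x_left, x_right -- column of the same point in the left/right image (pixels)
    focal_px        -- focal length f expressed in pixels
    baseline        -- lens separation b; Z is returned in the same unit
    """
    d = x_left - x_right
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline / d
```

With an assumed focal length of 1000 pixels and an assumed lens separation of 12 mm, a disparity of 40 pixels gives Z = 1000 × 12 / 40 = 300 mm. Halving the disparity to 20 pixels doubles the depth, which is the inverse proportionality described above.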

In the embodiment of the present invention, the key points of the left and right faces have already been acquired. As shown in fig. 8 (1c), the corresponding left-ear key points in the left and right images (joined by a straight line in the figure) lie on the same epipolar plane (refer to fig. 6 (1a)). The left-ear key point in the right image is constrained to move along the epipolar line erpr; different positions on the epipolar line erpr give different disparities, and therefore different results Z according to the above formula.

Please refer to fig. 10.

determining the distance from the nose to the camera lens, wherein D_nose represents the distance from the nose to the camera lens, and (x3, y3, z3) represents the three-dimensional coordinates of the nose;

determining the distances from the left and right mouth corners to the lens, wherein D_mouth_left and D_mouth_right respectively represent the distances from the left and right mouth corners to the lens;

determining the depth distance from the nose to the ear: Dist_noseToEar = |z3 - z1|;

determining the depth distance from the nose to the left eye: Dist_noseToEye = |z3 - z2|;

determining a face rotation angle;
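The depth distances used in the judgment are absolute differences of z coordinates. The following Python sketch illustrates this, taking the source's indexing, in which (x1, y1, z1) is the left ear, (x2, y2, z2) the left eye, (x3, y3, z3) the nose and (x4, y4, z4) the right eye. The function name `depth_differences`, the dictionary keys and the sample coordinates are hypothetical.

```python
def depth_differences(left_ear, left_eye, nose, right_eye):
    """Each argument is an (x, y, z) tuple; z is the depth from the lens."""
    z1, z2, z3, z4 = left_ear[2], left_eye[2], nose[2], right_eye[2]
    return {
        "eye_to_ear": abs(z2 - z1),    # left eye to ear
        "eye_to_eye": abs(z4 - z2),    # left eye to right eye
        "nose_to_ear": abs(z3 - z1),   # nose to ear
        "nose_to_eye": abs(z3 - z2),   # nose to left eye
    }
```

For a flat target such as a printed photo or a screen held parallel to the lens, all key points share roughly the same z, so every difference is close to zero, whereas a real head produces clearly non-zero values. This is the property the threshold comparison relies on.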

In addition, an embodiment of the present invention further provides a mobile terminal. As shown in fig. 11, the mobile terminal 900 includes a memory 902, a processor 901, and one or more computer programs stored in the memory 902 and executable on the processor 901, wherein the memory 902 and the processor 901 are coupled together through a bus system 903, and the one or more computer programs, when executed by the processor 901, implement the following steps of the binocular-camera-based face liveness verification method provided by the embodiment of the invention:

S1, shooting face images using the left and right cameras of the mobile terminal;

S55, determining a face rotation angle;

The method disclosed in the above embodiment of the present invention may be applied to the processor 901 or implemented by the processor 901. The processor 901 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 901 or by instructions in the form of software. The processor 901 may be a general purpose processor, a DSP, or another programmable logic device, discrete gate or transistor logic device, discrete hardware component, or the like. The processor 901 may implement or perform the methods, steps and logic blocks disclosed in embodiments of the present invention. The general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiment of the invention may be embodied directly as being executed by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software modules may be located in a storage medium in the memory 902; the processor 901 reads information in the memory 902 and, in combination with its hardware, performs the steps of the method described above.

It will be appreciated that the memory 902 of embodiments of the invention can be either volatile memory or nonvolatile memory, and can include both volatile and nonvolatile memory. The nonvolatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), ferromagnetic random access memory (FRAM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. The volatile memory can be random access memory (RAM). By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM), synchronous static random access memory (SSRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), SyncLink dynamic random access memory (SLDRAM), and direct Rambus random access memory (DRRAM). The memory described by embodiments of the present invention is intended to comprise, without being limited to, these and any other suitable types of memory.

It should be noted that the foregoing mobile terminal embodiments and the method embodiments belong to the same concept, the specific implementation process of the foregoing mobile terminal embodiments is detailed in the method embodiments, and technical features in the method embodiments are correspondingly applicable to the mobile terminal embodiments, which are not repeated herein.

In addition, in an exemplary embodiment, the embodiment of the present invention further provides a computer storage medium, specifically a computer readable storage medium, for example the memory 902 storing a computer program. One or more programs of the binocular-camera-based face liveness verification method are stored on the computer storage medium and, when executed by the processor 901, implement the following steps of the binocular-camera-based face liveness verification method provided by the embodiment of the present invention:

S1, shooting face images using the left and right cameras of the mobile terminal;

S55, determining a face rotation angle;

It should be noted that the embodiment of the binocular-camera-based face liveness verification program on the computer readable storage medium and the method embodiments belong to the same concept; the specific implementation process is detailed in the method embodiments, and the technical features in the method embodiments apply correspondingly to the embodiment of the computer readable storage medium, so they are not repeated herein.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.

The foregoing embodiment numbers of the present invention are for description only and do not indicate the relative merits of the embodiments.

From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present invention.

The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive. Those having ordinary skill in the art may devise many other forms without departing from the spirit of the present invention and the scope of the claims, all of which fall within the protection of the present invention.

Claims

1. A face liveness verification method, applied to a mobile terminal, characterized by comprising the following steps:

shooting face images by using left and right cameras of the mobile terminal;

matching face key points between the left and right images after face detection;

performing living body judgment according to the distances between the face key points;

the step of performing living body judgment according to the distance between the face key points includes:

determining the distances from the left and right eyes to the lens;

determining a depth distance Dist_eyeToEar from the left eye to the ear;

determining a depth distance Dist_eyeToEye from the left eye to the right eye;

determining a face rotation angle;

performing living body judgment according to the depth distance Dist_eyeToEar from the left eye to the ear and the face rotation angle, including: when the face rotation angle is determined to be within the range [0, angle], a threshold T1 is set, and Dist_eyeToEar < T1 indicates a non-living attack;

performing living body judgment according to the depth distance Dist_eyeToEye from the left eye to the right eye and the face rotation angle, including: when the face rotation angle is determined to be within the range [angle, 90], the depth difference between the left eye and the right eye is judged: a threshold T1 is set, and Dist_eyeToEye < T1 indicates a non-living attack.

2. The method according to claim 1, wherein the performing distortion correction and line alignment on the photographed left and right face images, respectively, includes:

3. The method of claim 2, wherein performing binocular correction on the distortion-corrected left and right face images comprises: rotating the distortion-corrected face images shot by the left and right cameras so that the images acquired by the binocular cameras are kept mathematically aligned.

4. The method according to claim 2, wherein performing face detection on the corrected left and right face images, respectively, comprises: detecting and acquiring the positions of the face frames and the face key point information in the corrected left and right face images by using the deep-learning-based MTCNN algorithm.

5. The method of claim 4, wherein matching face key points between the left and right face images after face detection comprises: matching the key point positions of the left and right faces in the left and right face images according to the obtained face frames and face key point information.

6. A face liveness verification device applying the face liveness verification method according to any one of claims 1 to 5, characterized in that the device comprises: a shooting module, a correction and line alignment module, a detection module, a matching module and a judging module, wherein:

the judging module is configured to perform living body judgment according to the distances between the face key points;

the judging module is further configured to: determine the distances from the left and right ears to the camera lens; determine the distances from the left and right eyes to the lens; determine a depth distance Dist_eyeToEar from the left eye to the ear; determine a depth distance Dist_eyeToEye from the left eye to the right eye; determine a face rotation angle; perform living body judgment according to the depth distance Dist_eyeToEar from the left eye to the ear and the face rotation angle, including: when the face rotation angle is determined to be within the range [0, angle], a threshold T1 is set, and Dist_eyeToEar < T1 indicates a non-living attack; and perform living body judgment according to the depth distance Dist_eyeToEye from the left eye to the right eye and the face rotation angle, including: when the face rotation angle is determined to be within the range [angle, 90], the depth difference between the left eye and the right eye is judged: a threshold T1 is set, and Dist_eyeToEye < T1 indicates a non-living attack.

7. A terminal, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the face liveness verification method according to any one of claims 1 to 5.

8. A computer readable storage medium, wherein a face liveness verification program is stored on the computer readable storage medium, and the program, when executed by a processor, implements the steps of the face liveness verification method according to any one of claims 1 to 5.