CN113642368B - Face pose determining method, device, equipment and storage medium - Google Patents

Face pose determining method, device, equipment and storage medium

Info

Publication number
CN113642368B
CN113642368B
Authority
CN
China
Prior art keywords
face
determining
vector
center point
vector pointing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010393328.XA
Other languages
Chinese (zh)
Other versions
CN113642368A (en)
Inventor
华丛一
任志浩
韩灵杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202010393328.XA
Publication of CN113642368A
Application granted
Publication of CN113642368B
Legal status: Active

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a method, a device, equipment and a storage medium for determining a face pose, and belongs to the field of biometric recognition. The method comprises the following steps: for a target face in a face image to be processed, the face key points in the target face and the position information of the face key points are determined; face key vectors are then determined according to the position information of the face key points; and the face pose information of the target face is determined according to the position information of the face key points and the face key vectors, wherein the face pose information comprises up-down pitching information, left-right rotation information and in-plane rotation information. The application can determine the face pose directly from the face key points and the face key vectors, with no need to train a neural network in advance on a large number of samples; the algorithm is simple, and the speed of determining the face pose is improved.

Description

Face pose determining method, device, equipment and storage medium
Technical Field
The present application relates to the field of biometric identification, and in particular, to a method, apparatus, device, and storage medium for determining a face pose.
Background
With the development of biometric technology, face recognition has become increasingly common in daily life. To perform face recognition well, the face image acquired by an electronic device must be analyzed to determine the face pose information of the target face, and the change in the target face's pose is then determined from that information. The face pose information includes up-down pitching (pitch) information, left-right rotation (yaw) information, and in-plane rotation (roll) information of the face.
In the related art, a large amount of training data must be used to train a neural network before it can recognize the face pose. The training data comprises the face key points of sample face images together with face pose annotation information. During training, the face key points in the training data are input into the neural network to be trained, and the network is supervised according to the face pose information it outputs and the face pose annotation information in the training data, so that its output becomes more accurate; the trained neural network is obtained through this continuous supervised learning. When the face pose needs to be determined, the face key points in the face image to be processed are first determined, the face key points are then input into the trained neural network, and the face pose information is output by the network.
In the related art, therefore, a large amount of training data is required to train the neural network before the face pose can be determined from the face key points and the trained network. The process of determining the face pose is complex, the algorithm is involved, and it takes a long time.
Disclosure of Invention
The embodiment of the application provides a method, a device, equipment and a storage medium for determining a face pose, which can solve the problems in the related art that determining the face pose involves a complex process, a complex algorithm and a long processing time. The technical scheme is as follows:
In one aspect, a method for determining a face pose is provided, the method comprising:
determining a face key point in a target face and position information of the face key point, wherein the target face is a face in a face image to be processed;
determining a face key vector according to the position information of the face key points;
and determining the face posture information of the target face according to the position information of the face key points and the face key vectors, wherein the face posture information comprises up-down pitching information, left-right rotation information and rotation information in a plane.
Optionally, the determining the location information of the face key point in the target face includes:
establishing a three-dimensional rectangular coordinate system by taking a reference point of the face image to be processed as an origin, wherein an X-axis of the three-dimensional rectangular coordinate system is parallel to a first edge of the face image to be processed, a Y-axis is parallel to a second edge of the face image to be processed, the second edge is perpendicular to the first edge, and a Z-axis is perpendicular to the X-axis and the Y-axis;
and determining the coordinates of the face key points in the three-dimensional rectangular coordinate system, and taking the coordinates of the face key points in the three-dimensional rectangular coordinate system as the position information of the face key points.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner.
Optionally, the face key vector includes at least one of the following:
a first vector pointing from the left eye center point to the left mouth corner;
a second vector pointing from the left mouth corner to the left eye center point;
a third vector pointing from the right eye center point to the right mouth corner;
a fourth vector pointing from the right mouth corner to the right eye center point;
a fifth vector pointing from the left eye center point to the tip of the nose;
a sixth vector pointing from the right eye center point to the tip of the nose;
a seventh vector pointing from the left mouth corner to the tip of the nose;
an eighth vector pointing from the right mouth corner to the tip of the nose;
a ninth vector pointing from the left eye center point to the right eye center point;
a tenth vector pointing from the right eye center point to the left eye center point;
an eleventh vector pointing from the left mouth corner to the right mouth corner;
a twelfth vector pointing from the right mouth corner to the left mouth corner.
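Each of the twelve face key vectors above is simply the difference of two key-point coordinates. A minimal sketch (the key-point coordinates below are assumed for illustration, not taken from the application):

```python
import numpy as np

# Hypothetical 2D key-point coordinates (pixels), for illustration only.
left_eye = np.array([100.0, 100.0])
right_eye = np.array([160.0, 100.0])
nose_tip = np.array([130.0, 130.0])
left_mouth = np.array([105.0, 160.0])
right_mouth = np.array([155.0, 160.0])

# The twelve key vectors, each formed as end point minus start point.
v1 = left_mouth - left_eye      # left eye center -> left mouth corner
v2 = left_eye - left_mouth      # left mouth corner -> left eye center
v3 = right_mouth - right_eye    # right eye center -> right mouth corner
v4 = right_eye - right_mouth    # right mouth corner -> right eye center
v5 = nose_tip - left_eye        # left eye center -> nose tip
v6 = nose_tip - right_eye       # right eye center -> nose tip
v7 = nose_tip - left_mouth      # left mouth corner -> nose tip
v8 = nose_tip - right_mouth     # right mouth corner -> nose tip
v9 = right_eye - left_eye       # left eye center -> right eye center
v10 = left_eye - right_eye      # right eye center -> left eye center
v11 = right_mouth - left_mouth  # left mouth corner -> right mouth corner
v12 = left_mouth - right_mouth  # right mouth corner -> left mouth corner
```

Note that each odd/even pair (e.g. the first and second vectors) are simply negatives of each other; both directions are listed because different claims below pick different start points.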
Optionally, the determining the face pose information of the target face according to the position information of the face key point and the face key vector includes:
determining rotation information of the target face in a plane according to the position information of the key points of the face;
determining left and right rotation information of the target face according to the face key vector;
and determining the up-down pitching information of the target face according to the face key vector.
Optionally, the face key point includes a left eye and a right eye, the position information of the face key point includes coordinates of the left eye and the right eye in a three-dimensional rectangular coordinate system, and the three-dimensional rectangular coordinate system is a three-dimensional rectangular coordinate system established by taking a reference point of the face image to be processed as an origin;
the determining the rotation information of the target face in the plane according to the position information of the face key points comprises the following steps:
determining a ratio between a first difference value and a second difference value to obtain a first tangent value, wherein the first difference value is a difference value of Y-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system, and the second difference value is a difference value of X-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system;
and determining the first tangent value as rotation information of the target face in a plane.
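As a sketch of this step, the first tangent value can be computed directly from the eye coordinates; the function name and coordinates below are illustrative assumptions, not from the application:

```python
def in_plane_rotation(left_eye, right_eye):
    """First tangent value: the Y-coordinate difference of the two eyes
    (first difference) divided by their X-coordinate difference (second
    difference). The in-plane rotation angle would be atan of this value."""
    first_difference = left_eye[1] - right_eye[1]   # Y-axis difference
    second_difference = left_eye[0] - right_eye[0]  # X-axis difference
    return first_difference / second_difference

# Level eyes give a tangent of 0; a tilted head gives a non-zero value.
roll_level = in_plane_rotation((100.0, 100.0), (160.0, 100.0))
roll_tilted = in_plane_rotation((100.0, 110.0), (160.0, 100.0))
```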
Optionally, the face key point includes a left mouth corner and a right mouth corner, the position information of the face key point includes coordinates of the left mouth corner and the right mouth corner in a three-dimensional rectangular coordinate system, and the three-dimensional rectangular coordinate system is a three-dimensional rectangular coordinate system established by taking a reference point of the face image to be processed as an origin;
the determining the rotation information of the target face in the plane according to the position information of the face key points comprises the following steps:
determining a ratio between a third difference value and a fourth difference value to obtain a second tangent value, wherein the third difference value is a difference value of Y-axis coordinate values of the left mouth corner and the right mouth corner in the three-dimensional rectangular coordinate system, and the fourth difference value is a difference value of X-axis coordinate values of the left mouth corner and the right mouth corner in the three-dimensional rectangular coordinate system;
and determining the second tangent value as rotation information of the target face in a plane.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
the determining the left-right rotation information of the target face according to the face key vector includes:
determining a first sine value of an included angle between a first vector pointing to a left mouth corner from a left eye center point and a fifth vector pointing to a nose tip from the left eye center point according to an outer product of the first vector and the fifth vector; taking the first sine value as left rotation information of the target face;
or determining a second sine value of an included angle between a third vector pointing to a right mouth corner from a right eye center point and a sixth vector pointing to a nose tip from the right eye center point according to an outer product of the third vector and the sixth vector; and taking the second sine value as right rotation information of the target face.
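The sine of the angle between two key vectors follows from the outer-product identity |u × v| = |u||v|·sin θ. A minimal 2D sketch, with assumed coordinates for the first and fifth vectors:

```python
import math

def sine_between(u, v):
    """Sine of the angle between two 2D vectors, from the magnitude of
    their outer (cross) product divided by the product of their norms."""
    cross = u[0] * v[1] - u[1] * v[0]
    return abs(cross) / (math.hypot(u[0], u[1]) * math.hypot(v[0], v[1]))

# First vector (left eye center -> left mouth corner) and fifth vector
# (left eye center -> nose tip), with assumed illustrative values.
v1 = (5.0, 60.0)
v5 = (30.0, 30.0)
left_rotation = sine_between(v1, v5)  # left-rotation information
```

The same helper applies symmetrically to the third and sixth vectors for the right-rotation information.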
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
the determining the left-right rotation information of the target face according to the face key vector includes:
determining a third sine value of an included angle between a second vector pointing from a left mouth corner to a left eye center point and a seventh vector pointing from the left mouth corner to a nose tip according to an outer product of the second vector and the seventh vector; taking the third sine value as left rotation information of the target face;
or determining a fourth sine value of an included angle between a fourth vector pointing to a right eye center point from a right mouth angle and an eighth vector pointing to a nose tip from the right mouth angle according to an outer product of the fourth vector and the eighth vector; and taking the fourth sine value as right rotation information of the target face.
Optionally, the face image to be processed is a three-dimensional image, the face key points include a left eye and a right eye, the position information of the face key points includes coordinates of the left eye and the right eye in a three-dimensional rectangular coordinate system, and the three-dimensional rectangular coordinate system is a three-dimensional rectangular coordinate system established by taking a reference point of the face image to be processed as an origin;
the method further comprises the steps of:
determining a ratio of a fifth difference value to a second difference value to obtain a third tangent value, wherein the fifth difference value is a difference value of Z-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system, and the second difference value is a difference value of X-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system;
and determining the third tangent value as left and right rotation information of the target face.
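For a three-dimensional image, this third tangent value reduces to a ratio of coordinate differences; the function name and depth values below are assumptions for illustration:

```python
def yaw_tangent_3d(left_eye, right_eye):
    """Third tangent value: the eyes' Z-coordinate difference (fifth
    difference) over their X-coordinate difference (second difference)."""
    fifth_difference = left_eye[2] - right_eye[2]
    second_difference = left_eye[0] - right_eye[0]
    return fifth_difference / second_difference

# A frontal face has both eyes at the same depth, so the tangent is 0;
# turning the head moves one eye closer to the camera than the other.
frontal = yaw_tangent_3d((100.0, 100.0, 50.0), (160.0, 100.0, 50.0))
turned = yaw_tangent_3d((100.0, 100.0, 56.0), (160.0, 100.0, 44.0))
```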
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
the determining the up-down pitching information of the target face according to the face key vector comprises the following steps:
determining a first distance between the nose tip and a first connecting line according to an outer product of a seventh vector pointing from a left mouth corner to the nose tip and an eleventh vector pointing from the left mouth corner to a right mouth corner, or an outer product of an eighth vector pointing from the right mouth corner to the nose tip and a twelfth vector pointing from the right mouth corner to the left mouth corner, wherein the first connecting line is a connecting line between the left mouth corner and the right mouth corner;
and determining the first distance as the up-down pitching information of the target face.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
the determining the up-down pitching information of the target face according to the face key vector comprises the following steps:
determining a second distance between the nose tip and a second connecting line according to the outer product of a fifth vector pointing to the nose tip from the left eye center point and a ninth vector pointing to the right eye center point from the left eye center point, or the outer product of a sixth vector pointing to the nose tip from the right eye center point and a tenth vector pointing to the left eye center point from the right eye center point, wherein the second connecting line is a connecting line between the left eye center point and the right eye center point;
and determining the second distance as the up-down pitching information of the target face.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
the determining the up-down pitching information of the target face according to the face key vector comprises the following steps:
determining a first distance between the nose tip and a first connecting line according to the outer product of a seventh vector pointing from a left mouth corner to the nose tip and an eleventh vector pointing from the left mouth corner to a right mouth corner, or the outer product of an eighth vector pointing from the right mouth corner to the nose tip and a twelfth vector pointing from the right mouth corner to the left mouth corner, wherein the first connecting line is a connecting line between the left mouth corner and the right mouth corner;
determining a second distance between the nose tip and a second connecting line according to the outer product of a fifth vector pointing to the nose tip from the left eye center point and a ninth vector pointing to the right eye center point from the left eye center point, or the outer product of a sixth vector pointing to the nose tip from the right eye center point and a tenth vector pointing to the left eye center point from the right eye center point, wherein the second connecting line is a connecting line between the left eye center point and the right eye center point;
and determining the ratio of the first distance to the second distance as the up-down pitching information of the target face.
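Both point-to-line distances can be obtained from outer products, since the magnitude of the 2D outer product equals the area of the parallelogram spanned by the two vectors (base times height). A sketch with assumed key-point coordinates:

```python
import math

def distance_to_line(vec_to_point, vec_along_line):
    """Distance from the target point to the line when both vectors share
    the same start point: |outer product| / |line vector|."""
    cross = vec_to_point[0] * vec_along_line[1] - vec_to_point[1] * vec_along_line[0]
    return abs(cross) / math.hypot(vec_along_line[0], vec_along_line[1])

# Assumed illustrative key points (pixels):
left_eye, right_eye = (100.0, 100.0), (160.0, 100.0)
nose_tip = (130.0, 130.0)
left_mouth, right_mouth = (105.0, 160.0), (155.0, 160.0)

# First distance: nose tip to the mouth-corner line (seventh and eleventh vectors).
v7 = (nose_tip[0] - left_mouth[0], nose_tip[1] - left_mouth[1])
v11 = (right_mouth[0] - left_mouth[0], right_mouth[1] - left_mouth[1])
d1 = distance_to_line(v7, v11)

# Second distance: nose tip to the eye-center line (fifth and ninth vectors).
v5 = (nose_tip[0] - left_eye[0], nose_tip[1] - left_eye[1])
v9 = (right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])
d2 = distance_to_line(v5, v9)

pitch_info = d1 / d2  # ratio of the two distances as the pitch indicator
```

Using the ratio rather than either distance alone makes the indicator independent of the face's size in the image.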
Optionally, the face image to be processed is a three-dimensional image, the face key points include a left eye, a right eye, a left mouth corner and a right mouth corner, the position information of the face key points includes coordinates of the left eye, the right eye, the left mouth corner and the right mouth corner in a three-dimensional rectangular coordinate system, the three-dimensional rectangular coordinate system uses a reference point of the face image to be processed as a coordinate origin, uses a direction perpendicular to the face image to be processed as a Z axis, and uses two directions perpendicular to the Z axis and perpendicular to each other as an X axis and a Y axis;
the method further comprises the steps of:
according to the coordinates of the left eye and the right eye in the three-dimensional rectangular coordinate system, determining the coordinates of the two-eye center point;
according to the coordinates of the left mouth corner and the right mouth corner in the three-dimensional rectangular coordinate system, determining the coordinates of the mouth-corner center point;
determining a ratio of a sixth difference value to a seventh difference value to obtain a fourth tangent value, wherein the sixth difference value is a difference value of the Z-axis coordinate values of the two-eye center point and the mouth-corner center point, and the seventh difference value is a difference value of the Y-axis coordinate values of the two-eye center point and the mouth-corner center point;
and determining the fourth tangent value as the up-down pitching information of the target face.
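The fourth tangent value can be sketched as midpoint arithmetic over 3D coordinates; the values below are assumed for illustration:

```python
def pitch_tangent_3d(left_eye, right_eye, left_mouth, right_mouth):
    """Fourth tangent value: the Z difference (sixth difference) over the
    Y difference (seventh difference) between the two-eye center point
    and the mouth-corner center point."""
    eye_mid = [(a + b) / 2.0 for a, b in zip(left_eye, right_eye)]
    mouth_mid = [(a + b) / 2.0 for a, b in zip(left_mouth, right_mouth)]
    sixth_difference = eye_mid[2] - mouth_mid[2]
    seventh_difference = eye_mid[1] - mouth_mid[1]
    return sixth_difference / seventh_difference

# Upright face: eyes and mouth at the same depth, so the tangent is 0.
upright = pitch_tangent_3d((100, 100, 50), (160, 100, 50),
                           (105, 160, 50), (155, 160, 50))
# Nodding forward brings the eyes closer to the camera than the mouth.
nodding = pitch_tangent_3d((100, 100, 44), (160, 100, 44),
                           (105, 160, 50), (155, 160, 50))
```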
Optionally, the method further comprises:
and if at least one of the up-down pitching information, the left-right rotation information and the rotation information in the plane is not in the corresponding preset information range, determining that the face image to be processed is invalid.
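This validity check reduces to range tests on the three indicators; the preset ranges below are hypothetical, since the application does not specify numeric thresholds:

```python
def image_is_valid(pose_info, preset_ranges):
    """A face image is treated as valid only when every pose value lies
    within its corresponding preset range; otherwise it is discarded."""
    return all(low <= pose_info[key] <= high
               for key, (low, high) in preset_ranges.items())

# Hypothetical preset ranges for the three indicators:
preset_ranges = {"pitch": (0.5, 2.0), "yaw": (0.0, 0.3), "roll": (-0.2, 0.2)}
valid = image_is_valid({"pitch": 1.0, "yaw": 0.1, "roll": 0.0}, preset_ranges)
invalid = image_is_valid({"pitch": 3.0, "yaw": 0.1, "roll": 0.0}, preset_ranges)
```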
In another aspect, a device for determining a face pose is provided, where the device includes:
the first determining module is used for determining face key points in a target face and position information of the face key points, wherein the target face is a face in a face image to be processed;
the second determining module is used for determining a face key vector according to the position information of the face key points;
and the third determining module is used for determining the face posture information of the target face according to the position information of the face key points and the face key vectors, wherein the face posture information comprises up-down pitching information, left-right rotation information and rotation information in a plane.
Optionally, the first determining module is configured to:
establishing a three-dimensional rectangular coordinate system by taking a reference point of the face image to be processed as an origin, wherein an X-axis of the three-dimensional rectangular coordinate system is parallel to a first edge of the face image to be processed, a Y-axis is parallel to a second edge of the face image to be processed, the second edge is perpendicular to the first edge, and a Z-axis is perpendicular to the X-axis and the Y-axis;
and determining the coordinates of the face key points in the three-dimensional rectangular coordinate system, and taking the coordinates of the face key points in the three-dimensional rectangular coordinate system as the position information of the face key points.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner.
Optionally, the face key vector includes at least one of the following:
a first vector pointing from the left eye center point to the left mouth corner;
a second vector pointing from the left mouth corner to the left eye center point;
A third vector pointing from the right eye center point to the right mouth corner;
a fourth vector pointing from the right mouth corner to the right eye center point;
a fifth vector pointing from the left eye center point to the tip of the nose;
a sixth vector pointing from the right eye center point to the tip of the nose;
a seventh vector pointing from the left mouth corner to the tip of the nose;
an eighth vector pointing from the right mouth corner to the tip of the nose;
a ninth vector pointing from the left eye center point to the right eye center point;
a tenth vector pointing from the right eye center point to the left eye center point;
an eleventh vector pointing from the left mouth corner to the right mouth corner;
a twelfth vector pointing from the right mouth corner to the left mouth corner.
Optionally, the third determining module includes:
the first determining submodule is used for determining rotation information of the target face in a plane according to the position information of the key points of the face;
the second determining submodule is used for determining left and right rotation information of the target face according to the face key vector;
and the third determination submodule is used for determining the up-down pitching information of the target face according to the face key vector.
Optionally, the face key point includes a left eye and a right eye, the position information of the face key point includes coordinates of the left eye and the right eye in a three-dimensional rectangular coordinate system, and the three-dimensional rectangular coordinate system is a three-dimensional rectangular coordinate system established by taking a reference point of the face image to be processed as an origin;
the first determination submodule is used for:
determining a ratio between a first difference value and a second difference value to obtain a first tangent value, wherein the first difference value is a difference value of Y-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system, and the second difference value is a difference value of X-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system;
and determining the first tangent value as rotation information of the target face in a plane.
Optionally, the face key point includes a left mouth corner and a right mouth corner, the position information of the face key point includes coordinates of the left mouth corner and the right mouth corner in a three-dimensional rectangular coordinate system, and the three-dimensional rectangular coordinate system is a three-dimensional rectangular coordinate system established by taking a reference point of the face image to be processed as an origin;
the first determination submodule is used for:
determining a ratio between a third difference value and a fourth difference value to obtain a second tangent value, wherein the third difference value is a difference value of Y-axis coordinate values of the left mouth corner and the right mouth corner in the three-dimensional rectangular coordinate system, and the fourth difference value is a difference value of X-axis coordinate values of the left mouth corner and the right mouth corner in the three-dimensional rectangular coordinate system;
and determining the second tangent value as rotation information of the target face in a plane.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
the second determination submodule is used for:
determining a first sine value of an included angle between a first vector pointing to a left mouth corner from a left eye center point and a fifth vector pointing to a nose tip from the left eye center point according to an outer product of the first vector and the fifth vector; taking the first sine value as left rotation information of the target face;
or determining a second sine value of an included angle between a third vector pointing to a right mouth corner from a right eye center point and a sixth vector pointing to a nose tip from the right eye center point according to an outer product of the third vector and the sixth vector; and taking the second sine value as right rotation information of the target face.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
the second determination submodule is used for:
determining a third sine value of an included angle between a second vector pointing from a left mouth corner to a left eye center point and a seventh vector pointing from the left mouth corner to a nose tip according to an outer product of the second vector and the seventh vector; taking the third sine value as left rotation information of the target face;
or determining a fourth sine value of an included angle between a fourth vector pointing to a right eye center point from a right mouth corner and an eighth vector pointing to a nose tip from the right mouth corner according to an outer product of the fourth vector and the eighth vector; and taking the fourth sine value as right rotation information of the target face.
Optionally, the face image to be processed is a three-dimensional image, the face key points include a left eye and a right eye, the position information of the face key points includes coordinates of the left eye and the right eye in a three-dimensional rectangular coordinate system, and the three-dimensional rectangular coordinate system is a three-dimensional rectangular coordinate system established by taking a reference point of the face image to be processed as an origin;
the second determination submodule is further configured to:
determining a ratio of a fifth difference value to a second difference value to obtain a third tangent value, wherein the fifth difference value is a difference value of Z-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system, and the second difference value is a difference value of X-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system;
and determining the third tangent value as left and right rotation information of the target face.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
the third determination submodule is configured to:
determining a first distance between the nose tip and a first connecting line according to an outer product of a seventh vector pointing from a left mouth corner to the nose tip and an eleventh vector pointing from the left mouth corner to a right mouth corner, or an outer product of an eighth vector pointing from the right mouth corner to the nose tip and a twelfth vector pointing from the right mouth corner to the left mouth corner, wherein the first connecting line is a connecting line between the left mouth corner and the right mouth corner;
and determining the first distance as the up-down pitching information of the target face.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
the third determination submodule is configured to:
determining a second distance between the nose tip and a second connecting line according to the outer product of a fifth vector pointing to the nose tip from the left eye center point and a ninth vector pointing to the right eye center point from the left eye center point, or the outer product of a sixth vector pointing to the nose tip from the right eye center point and a tenth vector pointing to the left eye center point from the right eye center point, wherein the second connecting line is a connecting line between the left eye center point and the right eye center point;
and determining the second distance as the up-down pitching information of the target face.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
the third determination submodule is configured to:
determining a first distance between the nose tip and a first connecting line according to the outer product of a seventh vector pointing from a left mouth corner to the nose tip and an eleventh vector pointing from the left mouth corner to a right mouth corner, or the outer product of an eighth vector pointing from the right mouth corner to the nose tip and a twelfth vector pointing from the right mouth corner to the left mouth corner, wherein the first connecting line is a connecting line between the left mouth corner and the right mouth corner;
determining a second distance between the nose tip and a second connecting line according to the outer product of a fifth vector pointing to the nose tip from the left eye center point and a ninth vector pointing to the right eye center point from the left eye center point, or the outer product of a sixth vector pointing to the nose tip from the right eye center point and a tenth vector pointing to the left eye center point from the right eye center point, wherein the second connecting line is a connecting line between the left eye center point and the right eye center point;
and determining the ratio of the first distance to the second distance as the up-down pitching information of the target face.
Optionally, the face image to be processed is a three-dimensional image, the face key points include a left eye, a right eye, a left mouth corner and a right mouth corner, the position information of the face key points includes coordinates of the left eye, the right eye, the left mouth corner and the right mouth corner in a three-dimensional rectangular coordinate system, the three-dimensional rectangular coordinate system uses a reference point of the face image to be processed as a coordinate origin, uses a direction perpendicular to the face image to be processed as a Z axis, and uses two directions perpendicular to the Z axis and perpendicular to each other as an X axis and a Y axis;
The third determination submodule is further configured to:
according to the coordinates of the left eye and the right eye in the three-dimensional rectangular coordinate system, determining the coordinates of the two-eye center point;
according to the coordinates of the left mouth corner and the right mouth corner in the three-dimensional rectangular coordinate system, determining the coordinates of the mouth-corner center point;
determining a ratio of a sixth difference value to a seventh difference value to obtain a fourth tangent value, wherein the sixth difference value is a difference value of the Z-axis coordinate values of the two-eye center point and the mouth-corner center point, and the seventh difference value is a difference value of the Y-axis coordinate values of the two-eye center point and the mouth-corner center point;
and determining the fourth tangent value as the up-down pitching information of the target face.
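As a minimal sketch of the fourth-tangent computation described above (the function name and coordinate values are hypothetical; the patent provides no code):

```python
# Sketch of the 3D up-down pitch estimate: the fourth tangent value is the
# ratio of the Z-axis difference to the Y-axis difference between the
# binocular center point and the mouth-corner center point.

def midpoint(p, q):
    """Center point of two 3D keypoints."""
    return tuple((a + b) / 2 for a, b in zip(p, q))

def pitch_tangent_3d(left_eye, right_eye, left_mouth, right_mouth):
    eye_c = midpoint(left_eye, right_eye)        # binocular center point
    mouth_c = midpoint(left_mouth, right_mouth)  # mouth-corner center point
    dz = eye_c[2] - mouth_c[2]   # sixth difference: Z-axis coordinate values
    dy = eye_c[1] - mouth_c[1]   # seventh difference: Y-axis coordinate values
    return dz / dy               # fourth tangent value

# Eyes 10 units nearer the camera than the mouth corners.
print(pitch_tangent_3d((40, 50, 20), (80, 50, 20), (50, 90, 10), (70, 90, 10)))  # -0.25
```

A frontal face, with eyes and mouth corners at the same depth, yields a tangent of zero.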
Optionally, the apparatus further comprises:
and a fourth determining module, configured to determine that the face image to be processed is invalid if at least one of the up-down pitching information, the left-right rotation information and the in-plane rotation information falls outside its corresponding preset information range.
In another aspect, a device for determining a face pose is provided, where the device includes:
the image acquisition module is used for acquiring face images;
the processor is used for determining the face key points in the target face included in the face image and the position information of the face key points; determining a face key vector according to the position information of the face key points; and determining the face posture information of the target face according to the position information of the face key points and the face key vectors, wherein the face posture information comprises up-down pitching information, left-right rotation information and rotation information in a plane.
In another aspect, there is provided an electronic device comprising:
a processor;
a memory for storing processor-executable instructions;
the processor is configured to implement the method for determining a face pose according to the above aspect.
In another aspect, a computer readable storage medium is provided, where instructions are stored, where the instructions, when executed by a processor, implement the method for determining a face pose according to the above aspect.
In another aspect, there is provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the method of determining a face pose as described in the above aspect.
The technical scheme provided by the application has at least the following beneficial effects:
In the embodiments of the present application, for the target face in the face image to be processed, the face key points in the target face and their position information can be determined; the face key vectors are then determined according to the position information of the face key points, and the face pose information of the target face is determined according to the position information of the face key points and the face key vectors, where the face pose information includes up-down pitching information, left-right rotation information and in-plane rotation information. The application can determine the face pose directly from the face key points and the face key vectors, without pre-training a neural network on a large number of samples; the algorithm is simple and the speed of determining the face pose is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a face pose determining system according to an embodiment of the present application;
FIG. 2 is a flowchart of a method for determining a face pose according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a face key point according to an embodiment of the present application;
fig. 4 is a schematic diagram of a face key vector according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a device for determining a face pose according to an embodiment of the present application;
fig. 6 is a block diagram of a terminal according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
Before the method for determining the face pose is explained in detail, the implementation environment related to the embodiments of the present application is described.
The method for determining the face pose provided by the embodiments of the present application is applied to an electronic device, which may be a terminal, a server, or the like. For example, the terminal may be a PC (Personal Computer), a mobile phone, a smart phone, a PDA (Personal Digital Assistant), a wearable device, a PPC (Pocket PC), a tablet computer, an in-vehicle smart terminal, a smart television, a mobile camera, a shooting robot, or another electronic device including a camera device. The server may be a background server for certain applications.
As one example, the electronic device may include an image acquisition module, which may be a camera assembly or the like, and a processor.
The image acquisition module is used for acquiring face images.
The processor is used for determining the face key points of the target face included in the face image and the position information of the face key points; determining a face key vector according to the position information of the face key points; and determining the face posture information of the target face according to the position information of the face key points and the face key vectors, wherein the face posture information comprises up-down pitching information, left-right rotation information and rotation information in a plane.
Referring to fig. 1, fig. 1 is a schematic diagram of a face pose determining system according to an embodiment of the present application. As shown in fig. 1, the face pose determining system 100 includes a plurality of terminals 101 and a server 102, where any one of the terminals 101 communicates with the server 102 over a wired or wireless connection.
For any terminal 101 among the plurality of terminals 101, the terminal 101 carries a camera and can report captured face images to the server 102. After receiving a face image, the server 102 performs face detection on it to determine the face key points, determines the face key vectors from those key points, and then determines the face pose information from the face key points and the face key vectors according to the method provided by the embodiment of the present application.
The face pose determining method provided by the embodiment of the application is executed by the server 102. The terminal 101 may be a mobile phone, a desktop computer, a notebook computer, a vehicle-mounted terminal, a mobile camera, a shooting robot, or other electronic devices including a camera device, and fig. 1 illustrates only 2 mobile phones as terminals, which is not a limitation of the embodiments of the present application.
In another possible implementation manner, the method for determining the face pose according to the embodiment of the present application may also be performed by the terminal 101, where the terminal 101 also has data processing capability.
After describing the implementation environment related to the embodiment of the present application, the method for determining the face pose provided by the embodiment of the present application is explained in detail with reference to the accompanying drawings.
Fig. 2 is a flowchart of a method for determining a face pose according to an embodiment of the present application, where the method may be applied to an electronic device, and the electronic device may be a terminal or a server. Referring to fig. 2, the method includes the following steps:
step 201: and determining the key points of the human face in the target human face and the position information of the key points of the human face, wherein the target human face is the human face in the human face image to be processed.
The face key points are used to identify facial features, and for the face image to be processed they can be determined by face detection. A coordinate system can then be established to determine the position information of the face key points accurately; changes in this position information reflect changes in the positions of the key points and therefore in the pose of the face.
It should be noted that the face image to be processed may be captured by the electronic device's own camera, read from a storage space, or sent by another device, which is not limited by the present application.
Referring to fig. 3, fig. 3 is a schematic diagram of face key points provided in an embodiment of the present application; typically a face may have 106, 81, 68, or a similar number of key points. As shown in fig. 3 (a), when the pose of the face changes, the positions of the corresponding face key points change as well. As one example, when a person opens the mouth to laugh, the coordinates of the face key points that mark the mouth change accordingly.
In order to determine the face pose quickly, the present application screens the face key points and selects a few core key points that necessarily move when the face pose changes, and uses these core points as the face key points when subsequently determining the face pose information. As shown in fig. 3 (b), the face key points used to determine the face pose information in the present application are: the left eye, the right eye, the nose tip, the left mouth corner and the right mouth corner.
It should be noted that, since determining the face pose means analyzing the face image to be processed to obtain the face pose information, after the face key points are detected, their position information further needs to be determined.
As one example, the process of determining the position information of the face key points in the target face is: establish a three-dimensional rectangular coordinate system with a reference point of the face image to be processed as the coordinate origin, determine the coordinates of the face key points in this coordinate system, and take those coordinates as the position information of the face key points.
The three-dimensional rectangular coordinate system can take a reference point of the face image to be processed as a coordinate origin, take a direction perpendicular to the face image to be processed as a Z axis, and take two directions perpendicular to the Z axis and mutually perpendicular as an X axis and a Y axis respectively. For example, the X-axis of the three-dimensional rectangular coordinate system is parallel to the first side of the face image to be processed, the Y-axis is parallel to the second side of the face image to be processed, and the second side is perpendicular to the first side, and the Z-axis is perpendicular to the X-axis and the Y-axis.
When the three-dimensional rectangular coordinate system is established, any point in the face image to be processed may serve as the reference point, i.e., the origin of coordinates, and the way the reference point is selected can be preset according to actual needs. For example, the reference point may be any corner or the center point of the face image to be processed, such as its upper left corner, lower left corner, upper right corner, or lower right corner.
As an example, as shown in fig. 3 (b), if the reference point is the upper left corner of the picture, a three-dimensional rectangular coordinate system is established with the upper left corner as the origin of coordinates, the AB side as the X axis, the AC side as the Y axis, and the axis perpendicular to the X axis and the Y axis as the Z axis, and then the position information of the face key point is determined in the three-dimensional rectangular coordinate system.
As an example, the reference point of the face image to be processed may be point A, point B, point C, point D or point E in fig. 3 (a); when the reference point is selected at one of the four corners of the face image to be processed, it is easier to determine the complete pose information of the face. The reference point may be preset as long as it lies in the face image to be processed, and the present application does not limit its position.
Step 202: and determining the face key vector according to the position information of the face key points.
The face key vector is a vector formed by any two face key points in the face key points.
By way of example, the face key vector may include at least one of: a first vector pointing from the left eye center point to the left mouth corner; a second vector pointing from the left mouth corner to the left eye center point; a third vector pointing from the right eye center point to the right mouth corner; a fourth vector pointing from the right mouth corner to the right eye center point; a fifth vector pointing from the left eye center point to the tip of the nose; a sixth vector pointing from the right eye center point to the tip of the nose; a seventh vector pointing from the left mouth corner to the tip of the nose; an eighth vector pointing from the right mouth corner to the tip of the nose; a ninth vector pointing from the left eye center point to the right eye center point; a tenth vector pointing from the right eye center point to the left eye center point; an eleventh vector pointing from the left mouth corner to the right mouth corner; a twelfth vector pointing from the right mouth corner to the left mouth corner.
Referring to fig. 4, fig. 4 is a schematic diagram of a face key vector according to an embodiment of the present application. As shown in fig. 4, "1" in fig. 4 is used to identify a first vector pointing from the left-eye center point to the left-mouth corner, "2" is used to identify a second vector pointing from the left-mouth corner to the left-eye center point, and so on.
It should be noted that, since a vector has directionality, for convenience of subsequent description, the face key vectors determined by the left eye coordinates and the left mouth corner coordinates include: a first vector pointing from the left eye center point to the left mouth corner, and a second vector pointing from the left mouth corner to the left eye center point. The other face key vectors are determined in the same way and are not described in detail here.
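As a minimal sketch (with hypothetical 2D point coordinates; the patent defines the vectors only verbally), the face key vectors can be built directly from the keypoint positions:

```python
# vec(p, q) builds the vector pointing from keypoint p to keypoint q.

def vec(p, q):
    """Vector pointing from keypoint p to keypoint q."""
    return (q[0] - p[0], q[1] - p[1])

el, er = (40, 50), (80, 50)   # left / right eye center points
n = (60, 70)                  # nose tip
ml, mr = (50, 90), (70, 90)   # left / right mouth corners

el2ml = vec(el, ml)   # first vector: left eye center point -> left mouth corner
ml2el = vec(ml, el)   # second vector: the reverse direction
el2n = vec(el, n)     # fifth vector: left eye center point -> nose tip
ml2mr = vec(ml, mr)   # eleventh vector: left mouth corner -> right mouth corner
```

Each pair of keypoints yields two vectors of opposite direction, matching the twelve-vector list above.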
In addition, the capture conditions of the face image to be processed should be considered before the face pose is determined. If the distance between the target face and the camera lens is too short at capture time, the fill light will be too strong, causing severe reflections; the interpupillary distance of the face in the image is also enlarged, so the collected face image may contain incomplete face information, making the face pose hard to judge. Similarly, if the distance between the target face and the camera lens is too long, the interpupillary distance of the face is reduced, and the face information contained in the collected image may be unclear, which is also unfavorable for judging the face pose. Therefore, in order to ensure the validity of the face pose information, the face image to be processed may be screened in advance. As one example, the face image to be processed may be screened based on the interpupillary distance of the target face, which may be represented by the difference of the X-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system.
The applicable range of the interpupillary distance can be preset for screening. If the interpupillary distance of the target face is within the applicable range, the face image to be processed is determined to be valid, and step 203 continues to be executed. If it is not within the applicable range, the face image to be processed is determined to be invalid, and the subsequent step 203 is not executed.
As one example, the interpupillary distance is denoted IPD, the X-axis coordinate value of the left eye is denoted le_x, and the X-axis coordinate value of the right eye is denoted re_x; then IPD = re_x - le_x, or IPD = le_x - re_x.
For example, if the minimum value of the IPD is determined to be 100 in advance and the maximum value of the IPD is determined to be 500, when the inter-pupillary distance IPD of the target face satisfies 100< IPD <500, the face image to be processed is determined to be valid.
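The screening step above can be sketched as follows (the bounds 100 and 500 come from the example in the text; the function name is an assumption):

```python
# Screen the face image to be processed by the target face's
# interpupillary distance (IPD), measured along the X axis.

IPD_MIN, IPD_MAX = 100, 500  # example applicable range from the text

def face_image_valid(le_x, re_x):
    """Return True when the target face's interpupillary distance is usable."""
    ipd = abs(re_x - le_x)  # covers both IPD = re_x - le_x and le_x - re_x
    return IPD_MIN < ipd < IPD_MAX

print(face_image_valid(le_x=300, re_x=520))  # IPD = 220, within range
print(face_image_valid(le_x=300, re_x=340))  # IPD = 40, face too far away
```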
Step 203: determine the face pose information of the target face according to the position information of the face key points and the face key vectors, where the face pose information includes up-down pitching information, left-right rotation information and in-plane rotation information.
As an example, according to the position information of the face key points and the face key vectors, the implementation process of determining the face pose may be: according to the position information of the key points of the face, determining the rotation information of the target face in the plane; according to the face key vector, determining left-right rotation information of the target face; and determining the up-down pitching information of the target face according to the face key vector.
Before the above three aspects of face pose information are further explained, assume that, in the face image to be processed, the face key points of the target face are the left eye el, the right eye er, the nose tip n, the left mouth corner ml and the right mouth corner mr. The coordinates of these five key points in the three-dimensional rectangular coordinate system are: left eye el (el_x, el_y, el_z), right eye er (er_x, er_y, er_z), nose tip n (n_x, n_y, n_z), left mouth corner ml (ml_x, ml_y, ml_z), and right mouth corner mr (mr_x, mr_y, mr_z).
It should be noted that, the position information of the face key point may be three-dimensional data or two-dimensional data, and as an example, if the coordinate value of the Z-axis of the face key point in the three-dimensional rectangular coordinate system is 0, the coordinate of the face key point may be (x, y, 0) or (x, y).
Also, the face key vector may be expressed as: the first vector pointing from the left eye center point to the left mouth corner is denoted as el2ml and the second vector pointing from the left mouth corner to the left eye center point is denoted as ml2el. And so on.
It should be noted that the following examples are explained using the above-mentioned representation. The implementation of the above three aspects will be described in detail.
(1) And determining the rotation information of the target face in the plane according to the position information of the key points of the face.
The rotation information of the target face in the plane is determined with respect to the plane spanned by the X axis and the Y axis of the established three-dimensional rectangular coordinate system, that is, the rotation of the target face within that plane, and it can be represented by the tangent of the angle between the line connecting two face key points and the X axis.
When the face pose of the target face in the plane changes, the relative position between the left eye and the right eye of the target face will also change. That is, rotation information of the target face in the plane may be determined from the position information of the left and right eyes.
Thus, in one possible implementation, the face key point includes a left eye and a right eye, and the position information of the face key point includes coordinates of the left eye and the right eye in a three-dimensional rectangular coordinate system, where the three-dimensional rectangular coordinate system uses a reference point of the face image to be processed as an origin of coordinates, uses a direction perpendicular to the face image to be processed as a Z axis, and uses two directions perpendicular to the Z axis and perpendicular to each other as an X axis and a Y axis, respectively.
According to the position information of the face key points, the implementation process of determining the rotation information of the target face in the plane is: determine the ratio between the first difference value and the second difference value to obtain a first tangent value, and determine the first tangent value as the rotation information of the target face in the plane. The first difference value refers to the difference of the Y-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system, and the second difference value refers to the difference of their X-axis coordinate values.
Further, in order to avoid the occurrence of a negative value, the absolute value of the first tangent value may also be determined as rotation information of the target face in the plane.
As an example, in the established three-dimensional rectangular coordinate system, take the face key point data as two-dimensional data, for example, the coordinates of the left eye el are (el_x, el_y) and the coordinates of the right eye er are (er_x, er_y); the first tangent value is denoted roll_eye, and abs denotes taking the absolute value of the ratio between the first difference and the second difference to avoid negative values. The rotation information of the target face in the plane can then be determined by the following formula (1):

roll_eye = abs((er_y - el_y) / (er_x - el_x))  (1)
Alternatively, the rotation information of the target face in the plane is determined by the following formula (2):

roll_eye = abs((el_y - er_y) / (el_x - er_x))  (2)
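A minimal sketch of this roll computation, following the ratio described for formula (1) (the function name is hypothetical):

```python
# In-plane rotation (roll) as the absolute ratio of the eyes'
# Y-coordinate difference to their X-coordinate difference.

def roll_eye(el, er):
    """First tangent value from 2D eye coordinates (el_x, el_y), (er_x, er_y)."""
    return abs((er[1] - el[1]) / (er[0] - el[0]))

print(roll_eye((40, 50), (80, 50)))  # level eyes: no in-plane rotation
print(roll_eye((40, 50), (80, 90)))  # 40 px drop over 40 px: tangent of 45 degrees
```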
In addition, when the face pose of the target face in the plane changes, the relative position between the left and right mouth corners of the target face will also change. That is, the rotation information of the target face in the plane may be determined from the position information of the left and right mouth corners.
Thus, in another possible implementation manner, the face key point includes a left mouth corner and a right mouth corner, and the position information of the face key point includes coordinates of the left mouth corner and the right mouth corner in a three-dimensional rectangular coordinate system, where the three-dimensional rectangular coordinate system is a three-dimensional rectangular coordinate system established by taking a reference point of the face image to be processed as an origin.
According to the position information of the face key points, the implementation process of determining the rotation information of the target face in the plane is: determine the ratio of the third difference value to the fourth difference value to obtain a second tangent value, and determine the second tangent value as the rotation information of the target face in the plane. The third difference value refers to the difference of the Y-axis coordinate values of the left mouth corner and the right mouth corner in the three-dimensional rectangular coordinate system, and the fourth difference value refers to the difference of their X-axis coordinate values.
Further, in order to avoid the occurrence of a negative value, the absolute value of the second tangent value may also be determined as rotation information of the target face in the plane.
As an example, in the established three-dimensional rectangular coordinate system, take the face key point data as two-dimensional data, for example, the coordinates of the left mouth corner ml are (ml_x, ml_y) and the coordinates of the right mouth corner mr are (mr_x, mr_y); the second tangent value is denoted roll_mouth, and abs denotes taking the absolute value of the ratio between the third difference and the fourth difference to avoid negative values. The rotation information of the target face in the plane can then be determined by the following formula (3):

roll_mouth = abs((mr_y - ml_y) / (mr_x - ml_x))  (3)
Alternatively, the rotation information of the target face in the plane is determined by the following formula (4):

roll_mouth = abs((ml_y - mr_y) / (ml_x - mr_x))  (4)
(2) According to the face key vector, determining left-right rotation information of the target face;
the target face may rotate left or right, and thus the left-right rotation information of the target face includes left rotation information or right rotation information.
For the obtained face image to be processed, if it is a two-dimensional image, the face key points include the left eye, the right eye, the nose tip, the left mouth corner and the right mouth corner, and determining the left-right rotation information of the target face according to the face key vectors includes the following two possible implementations.
In one possible implementation, a first sine value of the included angle between the first vector and the fifth vector is determined according to the cross product of the first vector pointing from the left eye center point to the left mouth corner and the fifth vector pointing from the left eye center point to the nose tip, and the first sine value is taken as the left rotation information of the target face; or, a second sine value of the included angle between the third vector and the sixth vector is determined according to the cross product of the third vector pointing from the right eye center point to the right mouth corner and the sixth vector pointing from the right eye center point to the nose tip, and the second sine value is taken as the right rotation information of the target face.
As an example, in the established three-dimensional rectangular coordinate system, the first vector pointing from the left eye center point to the left mouth corner is denoted el2ml, the fifth vector pointing from the left eye center point to the nose tip is denoted el2n, and the first sine value is denoted yaw_eln. The left rotation information of the target face can be determined by the following formula (5):

yaw_eln = (el2ml × el2n) / (|el2ml| · |el2n|)  (5)
In the established three-dimensional rectangular coordinate system, the third vector pointing from the right eye center point to the right mouth corner is denoted er2mr, the sixth vector pointing from the right eye center point to the nose tip is denoted er2n, and the second sine value is denoted yaw_ern. The right rotation information of the target face can be determined by the following formula (6):

yaw_ern = (er2mr × er2n) / (|er2mr| · |er2n|)  (6)
It should be noted that, when the nose tip of the target face lies inside the lines connecting the eyes and the mouth corners, the first sine value and the second sine value are positive; otherwise, they are negative.
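The sine-based yaw estimate can be sketched as follows (helper names are assumptions; treating the image-plane vectors as 2D, the cross product reduces to a scalar, and its sign convention depends on the image coordinate orientation):

```python
import math

def cross2(u, v):
    """Scalar 2D cross product of vectors u and v."""
    return u[0] * v[1] - u[1] * v[0]

def yaw_sine(eye, mouth, nose):
    """Sine of the angle between eye->mouth and eye->nose vectors."""
    e2m = (mouth[0] - eye[0], mouth[1] - eye[1])  # e.g. first vector el2ml
    e2n = (nose[0] - eye[0], nose[1] - eye[1])    # e.g. fifth vector el2n
    return cross2(e2m, e2n) / (math.hypot(*e2m) * math.hypot(*e2n))

# Nose tip exactly on the eye-mouth line: the sine is zero (frontal side view).
print(yaw_sine((0, 0), (0, 40), (0, 20)))  # 0.0
```

The magnitude grows as the nose tip moves away from the eye-mouth line, which is what the left/right rotation information measures.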
In another possible implementation, a third sine value of the included angle between the second vector and the seventh vector is determined according to the cross product of the second vector pointing from the left mouth corner to the left eye center point and the seventh vector pointing from the left mouth corner to the nose tip, and the third sine value is taken as the left rotation information of the target face; or, a fourth sine value of the included angle between the fourth vector and the eighth vector is determined according to the cross product of the fourth vector pointing from the right mouth corner to the right eye center point and the eighth vector pointing from the right mouth corner to the nose tip, and the fourth sine value is taken as the right rotation information of the target face.
As an example, in the established three-dimensional rectangular coordinate system, the second vector pointing from the left mouth corner to the left eye center point is denoted ml2el, the seventh vector pointing from the left mouth corner to the nose tip is denoted ml2n, and the third sine value is denoted yaw_mln. The left rotation information of the target face can be determined by the following formula (7):

yaw_mln = (ml2el × ml2n) / (|ml2el| · |ml2n|)  (7)
In the established three-dimensional rectangular coordinate system, the fourth vector pointing from the right mouth corner to the right eye center point is denoted mr2er, the eighth vector pointing from the right mouth corner to the nose tip is denoted mr2n, and the fourth sine value is denoted yaw_mrn. The right rotation information of the target face can be determined by the following formula (8):

yaw_mrn = (mr2er × mr2n) / (|mr2er| · |mr2n|)  (8)
It should be noted that, when the nose tip of the target face lies inside the lines connecting the eyes and the mouth corners, the third sine value and the fourth sine value are positive; otherwise, they are negative.
For the obtained face image to be processed, if it is a three-dimensional image, the face key points include the left eye and the right eye, and the position information of the face key points includes the coordinates of the left eye and the right eye in a three-dimensional rectangular coordinate system established by taking a reference point of the face image to be processed as the origin. In this case, the implementation process of determining the left-right rotation information of the target face according to the face key points can be: determine the ratio of a fifth difference value to the second difference value to obtain a third tangent value, and determine the third tangent value as the left-right rotation information of the target face. The fifth difference value refers to the difference of the Z-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system, and the second difference value refers to the difference of their X-axis coordinate values.
Further, in order to avoid the occurrence of a negative value, the absolute value of the third tangent value may be determined as the left-right rotation information of the target face.
As an example, in the established three-dimensional rectangular coordinate system, take the face key point data as three-dimensional data, for example, the coordinates of the left eye el are (el_x, el_y, el_z) and the coordinates of the right eye er are (er_x, er_y, er_z); the third tangent value is denoted yaw_eye, and abs denotes taking the absolute value of the ratio between the fifth difference and the second difference to avoid negative values. The left-right rotation information of the target face can then be determined by the following formula (9):

yaw_eye = abs((er_z - el_z) / (er_x - el_x))  (9)
Alternatively, the left-right rotation information of the target face is determined by the following formula (10):

yaw_eye = abs((el_z - er_z) / (el_x - er_x))  (10)
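A minimal sketch of this 3D yaw tangent (the function name is hypothetical):

```python
# Left-right rotation (yaw) for a 3D image: the absolute ratio of the eyes'
# Z-coordinate difference to their X-coordinate difference.

def yaw_eye_3d(el, er):
    """Third tangent value from 3D eye coordinates (x, y, z)."""
    return abs((er[2] - el[2]) / (er[0] - el[0]))

print(yaw_eye_3d((40, 50, 10), (80, 50, 10)))  # same depth: no left-right turn
```

When the face turns, one eye moves closer to the camera than the other, so the Z difference and hence the tangent grow.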
(3) And determining the up-down pitching information of the target face according to the face key vector.
For the face image to be processed, as the target face tilts up or down, the distance from the nose tip to the line connecting the eyes changes; similarly, the distance from the nose tip to the line between the left mouth corner and the right mouth corner also changes, so the up-down pitching information of the target face can be determined from these two distances.
As one example, if the face key points include left eye, right eye, nose tip, left mouth corner, and right mouth corner, determining the up-down pitch information of the target face according to the face key vector may include the following three possible implementations.
In a first possible implementation, a first distance between the tip of the nose and the first line is determined according to an outer product of a seventh vector pointing from the left mouth corner to the tip of the nose and an eleventh vector pointing from the left mouth corner to the right mouth corner, or an outer product of an eighth vector pointing from the right mouth corner to the tip of the nose and a twelfth vector pointing from the right mouth corner to the left mouth corner, and the first distance is determined as the up-down pitch information of the target face. Wherein the first connecting line refers to a connecting line between the left mouth corner and the right mouth corner.
As an example, in the established three-dimensional rectangular coordinate system, denote the seventh vector pointing from the left mouth corner to the nose tip as ml2n, the eleventh vector pointing from the left mouth corner to the right mouth corner as ml2mr, and the first distance as dn; the up-down pitch information of the target face can then be determined by the following formula (11): dn = |ml2n × ml2mr| / |ml2mr|
In the established three-dimensional rectangular coordinate system, denote the eighth vector pointing from the right mouth corner to the nose tip as mr2n, the twelfth vector pointing from the right mouth corner to the left mouth corner as mr2ml, and the first distance as dn; the up-down pitch information of the target face can then be determined by the following formula (12): dn = |mr2n × mr2ml| / |mr2ml|
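The first-distance computation above can be sketched as follows, with a pure-Python cross product; the point names are hypothetical:

```python
import math

def cross(a, b):
    """Outer (cross) product of two 3-D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def first_distance(n, ml, mr):
    """First distance dn, as in formulas (11)/(12): distance from the nose
    tip n to the line through the left (ml) and right (mr) mouth corners,
    using the identity |a x b| = |a| * |b| * sin(theta)."""
    ml2n = tuple(q - p for p, q in zip(ml, n))    # seventh vector
    ml2mr = tuple(q - p for p, q in zip(ml, mr))  # eleventh vector
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    # |ml2n x ml2mr| is the area of the parallelogram spanned by the two
    # vectors; dividing by the base |ml2mr| leaves the height, i.e. dn.
    return norm(cross(ml2n, ml2mr)) / norm(ml2mr)
```

For example, with the mouth corners at (0, 0, 0) and (10, 0, 0) and the nose tip at (5, 4, 0), `first_distance` returns 4.0.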
In a second possible implementation, a second distance between the nose tip and a second connecting line is determined according to the outer product of a fifth vector pointing from the left eye center point to the nose tip and a ninth vector pointing from the left eye center point to the right eye center point, or the outer product of a sixth vector pointing from the right eye center point to the nose tip and a tenth vector pointing from the right eye center point to the left eye center point, and the second distance is determined as the up-down pitch information of the target face. The second connecting line refers to the line between the left eye center point and the right eye center point.
As an example, in the established three-dimensional rectangular coordinate system, denote the fifth vector pointing from the left eye center point to the nose tip as el2n, the ninth vector pointing from the left eye center point to the right eye center point as el2er, and the second distance as dne; the up-down pitch information of the target face can then be determined by the following formula (13): dne = |el2n × el2er| / |el2er|
In the established three-dimensional rectangular coordinate system, denote the sixth vector pointing from the right eye center point to the nose tip as er2n, the tenth vector pointing from the right eye center point to the left eye center point as er2el, and the second distance as dne; the up-down pitch information of the target face can then be determined by the following formula (14): dne = |er2n × er2el| / |er2el|
in a third possible implementation manner, the first distance between the tip of the nose and the first connecting line is determined according to the outer product of a seventh vector pointing from the left mouth corner to the tip of the nose and an eleventh vector pointing from the left mouth corner to the right mouth corner, or the outer product of an eighth vector pointing from the right mouth corner to the tip of the nose and a twelfth vector pointing from the right mouth corner to the left mouth corner; determining a second distance between the tip of the nose and the second line based on an outer product of a fifth vector directed from the left eye center point to the tip of the nose and a ninth vector directed from the left eye center point to the right eye center point, or an outer product of a sixth vector directed from the right eye center point to the tip of the nose and a tenth vector directed from the right eye center point to the left eye center point; and determining the ratio of the first distance to the second distance as the up-down pitching information of the target face.
In another embodiment, the ratio between the first distance and the second distance, together with at least one of the first distance and the second distance, may also be determined as the up-down pitch information of the target face.
As one example, the up-down pitch information of the target face may be determined using the following formula (15): ds = dn / dne
where ds is the up-down pitch information of the target face, dn is the first distance, and dne is the second distance.
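The two distances and the ratio of formula (15) can be sketched together; the point coordinates and names are hypothetical:

```python
import math

def point_to_line_distance(p, a, b):
    """Distance from point p to the line through a and b, via the outer
    (cross) product: the parallelogram area divided by the base length."""
    ap = tuple(x - y for x, y in zip(p, a))
    ab = tuple(x - y for x, y in zip(b, a))
    cx = (ap[1] * ab[2] - ap[2] * ab[1],
          ap[2] * ab[0] - ap[0] * ab[2],
          ap[0] * ab[1] - ap[1] * ab[0])
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    return norm(cx) / norm(ab)

def pitch_ratio(n, el, er, ml, mr):
    """ds, as in formula (15): ratio of the nose-to-mouth-line distance dn
    to the nose-to-eye-line distance dne."""
    dn = point_to_line_distance(n, ml, mr)   # first distance
    dne = point_to_line_distance(n, el, er)  # second distance
    return dn / dne
```

Using a ratio rather than a raw distance makes the measure less sensitive to face scale: for a roughly level face the nose tip sits about as far from the mouth line as from the eye line, so ds stays near a constant, while raising or lowering the head moves it away.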
In another implementation, if the face image to be processed is a three-dimensional image, the face key points include the left eye, right eye, left mouth corner and right mouth corner, and the position information of the face key points includes their coordinates in a three-dimensional rectangular coordinate system established with a reference point of the face image to be processed as the origin. The up-down pitch information of the target face may also be determined as follows: determine the coordinates of the binocular center point from the coordinates of the left eye and the right eye; determine the coordinates of the mouth-corner center point from the coordinates of the left mouth corner and the right mouth corner; then determine the ratio of the sixth difference value to the seventh difference value to obtain a fourth tangent value, and determine the fourth tangent value as the up-down pitch information of the target face. The sixth difference value refers to the difference between the Z-axis coordinate values of the binocular center point and the mouth-corner center point, and the seventh difference value refers to the difference between their Y-axis coordinate values.
Further, in order to avoid the occurrence of a negative value, the absolute value of the fourth tangent value may also be determined as the up-down pitch information of the target face.
As an example, in the established three-dimensional rectangular coordinate system, the face key point data are three-dimensional: for example, the coordinates of the left eye el are (el_x, el_y, el_z), the coordinates of the right eye er are (er_x, er_y, er_z), the coordinates of the left mouth corner ml are (ml_x, ml_y, ml_z), and the coordinates of the right mouth corner mr are (mr_x, mr_y, mr_z).
The coordinates of the binocular center point ec, determined from the position information of the left eye and the right eye, are (ec_x, ec_y, ec_z), where ec_x = (el_x + er_x) / 2, and ec_y and ec_z are determined in a similar manner.
The coordinates of the mouth-corner center point mc, determined from the position information of the left mouth corner and the right mouth corner, are (mc_x, mc_y, mc_z), where mc_x = (ml_x + mr_x) / 2, and mc_y and mc_z are determined in a similar manner.
The fourth tangent value is denoted by pitch; to avoid negative values, the absolute value of the ratio between the sixth difference value and the seventh difference value is taken through abs, so the up-down pitch information of the target face can be determined by the following formula (16): pitch = abs((ec_z - mc_z) / (ec_y - mc_y))
alternatively, the up-down pitch information of the target face is determined by the following formula (17):
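A sketch of the center-point pitch computation, assuming hypothetical (x, y, z) tuples for the four keypoints:

```python
def pitch_from_centers(el, er, ml, mr):
    """Fourth tangent value, as in formula (16): absolute ratio of the Z
    difference to the Y difference between the binocular center point ec
    and the mouth-corner center point mc."""
    ec = tuple((a + b) / 2 for a, b in zip(el, er))  # binocular center point
    mc = tuple((a + b) / 2 for a, b in zip(ml, mr))  # mouth-corner center point
    sixth_diff = ec[2] - mc[2]    # Z-axis difference
    seventh_diff = ec[1] - mc[1]  # Y-axis difference
    return abs(sixth_diff / seventh_diff)

# Eyes at depth 10, mouth corners at depth 2, 40 units apart vertically:
print(pitch_from_centers((30, 50, 10), (70, 50, 10), (35, 90, 2), (65, 90, 2)))  # 0.2
```

For an upright frontal face the eye and mouth centers lie at a similar depth, giving a value near 0; raising or lowering the head increases the Z difference between the two center points and hence the value.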
Based on the face pose determining method shown in the embodiment of fig. 2, after the pose information of the target face is determined, it may be compared with preset threshold information so that valid face images are retained and invalid face images are deleted before subsequent operations such as image processing, thereby reducing the amount of data to be processed.
In one possible implementation, the face image to be processed may be screened as follows: if at least one of the up-down pitch information, the left-right rotation information, and the in-plane rotation information is not within its corresponding preset information range, the face image to be processed is determined to be invalid.
The preset information range corresponding to the up-down pitch information can be represented by a maximum pitch value: when the determined up-down pitch information of the target face is greater than the maximum pitch value, the face image to be processed is determined to be invalid.
The preset information range corresponding to the left-right rotation information can be represented by a leftward-rotation maximum and a rightward-rotation maximum: when the leftward rotation information of the target face is greater than the leftward-rotation maximum, the face image to be processed is determined to be invalid; when the rightward rotation information of the target face is greater than the rightward-rotation maximum, the face image to be processed is determined to be invalid; or, when the left-right rotation information of the target face is not within the (leftward-rotation maximum, rightward-rotation maximum) interval, the face image to be processed is determined to be invalid.
The preset information range corresponding to the in-plane rotation information can be represented by a maximum rotation value: when the determined in-plane rotation information of the target face is greater than the maximum rotation value, the face image to be processed is determined to be invalid.
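The screening described above can be sketched as follows; the threshold values and the single symmetric yaw limit are illustrative assumptions, not values taken from this application:

```python
def is_valid_pose(pitch, yaw, roll,
                  max_pitch=0.3, max_yaw=0.35, max_roll=0.3):
    """Keep a face image only when every pose value falls within its preset
    range. The default thresholds are illustrative only, and a symmetric
    yaw limit stands in for the separate leftward and rightward maxima
    described above."""
    return pitch <= max_pitch and abs(yaw) <= max_yaw and roll <= max_roll
```

In a capture pipeline this check would run right after pose determination, so that strongly tilted or turned faces are discarded before the more expensive recognition steps.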
In summary, in the embodiment of the present application, for a target face in a face image to be processed, the face key points and their position information are determined first, the face key vectors are then determined from that position information, and the face pose information of the target face is determined from the position information of the face key points and the face key vectors, where the face pose information includes up-down pitch information, left-right rotation information, and in-plane rotation information. The application can determine the face pose directly from the face key points and the face key vectors, without pre-training a neural network on a large number of samples; the algorithm is simple, and the speed of determining the face pose is improved.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a face pose determining apparatus according to an embodiment of the present application, and the apparatus 500 may be implemented by software, hardware, or a combination of both. The apparatus 500 may include:
A first determining module 501, configured to determine a face key point in a target face and position information of the face key point, where the target face is a face in a to-be-processed face image;
a second determining module 502, configured to determine a face key vector according to location information of the face key point;
and a third determining module 503, configured to determine face pose information of the target face according to the position information of the face key points and the face key vectors, where the face pose information includes up-down pitch information, left-right rotation information, and rotation information in a plane.
Optionally, the first determining module 501 is configured to:
establishing a three-dimensional rectangular coordinate system by taking a reference point of a face image to be processed as an origin, wherein an X-axis of the three-dimensional rectangular coordinate system is parallel to a first edge of the face image to be processed, a Y-axis is parallel to a second edge of the face image to be processed, the second edge is perpendicular to the first edge, and a Z-axis is perpendicular to the X-axis and the Y-axis;
and determining the coordinates of the face key points in the three-dimensional rectangular coordinate system, and taking the coordinates of the face key points in the three-dimensional rectangular coordinate system as the position information of the face key points.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner.
Optionally, the face key vector includes at least one of:
a first vector pointing from the left eye center point to the left mouth corner;
a second vector pointing from the left mouth corner to the left eye center point;
a third vector pointing from the right eye center point to the right mouth corner;
a fourth vector pointing from the right mouth corner to the right eye center point;
a fifth vector pointing from the left eye center point to the tip of the nose;
a sixth vector pointing from the right eye center point to the tip of the nose;
a seventh vector pointing from the left mouth corner to the tip of the nose;
an eighth vector pointing from the right mouth corner to the tip of the nose;
a ninth vector pointing from the left eye center point to the right eye center point;
a tenth vector pointing from the right eye center point to the left eye center point;
an eleventh vector pointing from the left mouth corner to the right mouth corner;
a twelfth vector pointing from the right mouth corner to the left mouth corner.
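The twelve candidate vectors above can be sketched as simple differences of keypoint coordinates; the dictionary keys and point names ('el', 'er', 'n', 'ml', 'mr' for left eye, right eye, nose tip, left and right mouth corner) are hypothetical:

```python
def face_key_vectors(pts):
    """Builds the twelve candidate face key vectors from a dict of keypoint
    coordinates. Each vector a->b is simply pts[b] - pts[a], matching the
    first through twelfth vectors listed above."""
    pairs = [('el', 'ml'), ('ml', 'el'), ('er', 'mr'), ('mr', 'er'),
             ('el', 'n'), ('er', 'n'), ('ml', 'n'), ('mr', 'n'),
             ('el', 'er'), ('er', 'el'), ('ml', 'mr'), ('mr', 'ml')]
    return {f'{a}2{b}': tuple(q - p for p, q in zip(pts[a], pts[b]))
            for a, b in pairs}
```

Computing all twelve up front lets the later submodules pick whichever vector pair their formula needs without recomputing differences.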
Optionally, the third determining module 503 includes:
the first determining submodule is used for determining rotation information of the target face in the plane according to the position information of the key points of the face;
the second determining submodule is used for determining left and right rotation information of the target face according to the face key vector;
and the third determination submodule is used for determining the up-down pitching information of the target face according to the face key vector.
Optionally, the face key points include left eyes and right eyes, the position information of the face key points includes coordinates of the left eyes and the right eyes in a three-dimensional rectangular coordinate system respectively, and the three-dimensional rectangular coordinate system is a three-dimensional rectangular coordinate system established by taking a reference point of the face image to be processed as an origin;
a first determination submodule for:
determining a ratio between a first difference value and a second difference value to obtain a first tangent value, wherein the first difference value refers to a difference value of Y-axis coordinate values of a left eye and a right eye in a three-dimensional rectangular coordinate system, and the second difference value refers to a difference value of X-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system;
and determining the first tangent value as rotation information of the target face in the plane.
Optionally, the face key points include a left mouth corner and a right mouth corner, the position information of the face key points includes the coordinates of the left mouth corner and the right mouth corner in a three-dimensional rectangular coordinate system, and the three-dimensional rectangular coordinate system is established with a reference point of the face image to be processed as the origin;
a first determination submodule for:
determining a ratio between a third difference value and a fourth difference value to obtain a second tangent value, wherein the third difference value refers to the difference of the Y-axis coordinate values of the left mouth corner and the right mouth corner in the three-dimensional rectangular coordinate system, and the fourth difference value refers to the difference of their X-axis coordinate values;
And determining the second tangent value as rotation information of the target face in the plane.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
a second determination submodule for:
determining a first sine value of an included angle between the first vector and a fifth vector according to an outer product of the first vector pointing to the left mouth corner from the left eye center point and the fifth vector pointing to the nose tip from the left eye center point; taking the first sine value as left rotation information of the target face;
or determining a second sine value of the included angle between the third vector and the sixth vector according to the outer product of the third vector pointing to the right mouth angle from the right eye center point and the sixth vector pointing to the nose tip from the right eye center point; and taking the second sine value as right rotation information of the target face.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
a second determination submodule for:
determining a third sine value of the included angle between the second vector and the seventh vector according to the outer product of the second vector pointing from the left mouth corner to the left eye center point and the seventh vector pointing from the left mouth corner to the tip of the nose; taking the third sine value as left rotation information of the target face;
Or determining a fourth sine value of an included angle between the fourth vector and the eighth vector according to an outer product of the fourth vector pointing to the right eye center point from the right mouth angle and the eighth vector pointing to the nose tip from the right mouth angle; and taking the fourth sine value as right rotation information of the target face.
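The sine values used by this submodule can be sketched via the outer-product identity; the function and argument names are hypothetical:

```python
import math

def rotation_sine(origin, toward_mouth, toward_nose):
    """Sine of the angle between the vector origin->toward_mouth and the
    vector origin->toward_nose, from |a x b| = |a| * |b| * sin(theta).
    With origin = left eye center, toward_mouth = left mouth corner and
    toward_nose = nose tip, this gives the first sine value; the other
    sine values use the analogous point triples."""
    a = tuple(q - p for p, q in zip(origin, toward_mouth))
    b = tuple(q - p for p, q in zip(origin, toward_nose))
    cx = (a[1] * b[2] - a[2] * b[1],
          a[2] * b[0] - a[0] * b[2],
          a[0] * b[1] - a[1] * b[0])
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    return norm(cx) / (norm(a) * norm(b))
```

When the two vectors are perpendicular the value is 1, and when they are parallel it is 0; turning the face left or right changes the angle between the eye-to-mouth and eye-to-nose vectors and therefore the sine value.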
Optionally, the face image to be processed is a three-dimensional image, the face key points comprise a left eye and a right eye, the position information of the face key points comprises coordinates of the left eye and the right eye in a three-dimensional rectangular coordinate system respectively, and the three-dimensional rectangular coordinate system is a three-dimensional rectangular coordinate system established by taking a reference point of the face image to be processed as an origin;
the second determination submodule is further configured to:
determining a ratio of a fifth difference value to a second difference value to obtain a third tangent value, wherein the fifth difference value is a difference value of Z-axis coordinate values of a left eye and a right eye in a three-dimensional rectangular coordinate system, and the second difference value is a difference value of X-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system;
and determining the third tangent value as left and right rotation information of the target face.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
a third determination sub-module for:
determining a first distance between the tip of the nose and a first connecting line according to an outer product of a seventh vector pointing from the left mouth corner to the tip of the nose and an eleventh vector pointing from the left mouth corner to the right mouth corner, or an outer product of an eighth vector pointing from the right mouth corner to the tip of the nose and a twelfth vector pointing from the right mouth corner to the left mouth corner, wherein the first connecting line refers to a connecting line between the left mouth corner and the right mouth corner;
And determining the first distance as the up-down pitching information of the target face.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
a third determination sub-module for:
determining a second distance between the tip of the nose and a second connecting line according to the outer product of a fifth vector pointing to the tip of the nose from the center point of the left eye and a ninth vector pointing to the center point of the right eye from the center point of the left eye, or the outer product of a sixth vector pointing to the tip of the nose from the center point of the right eye and a tenth vector pointing to the center point of the left eye from the center point of the right eye, wherein the second connecting line is a connecting line between the center points of the left eye and the center point of the right eye;
and determining the second distance as the up-down pitching information of the target face.
Optionally, the face key points include left eye, right eye, nose tip, left mouth corner and right mouth corner;
a third determination sub-module for:
determining a first distance between the nose tip and a first connecting line according to the outer product of a seventh vector pointing from the left mouth corner to the nose tip and an eleventh vector pointing from the left mouth corner to the right mouth corner, or the outer product of an eighth vector pointing from the right mouth corner to the nose tip and a twelfth vector pointing from the right mouth corner to the left mouth corner, wherein the first connecting line refers to a connecting line between the left mouth corner and the right mouth corner;
Determining a second distance between the tip of the nose and a second connecting line according to the outer product of a fifth vector pointing to the tip of the nose from the center point of the left eye and a ninth vector pointing to the center point of the right eye from the center point of the left eye, or the outer product of a sixth vector pointing to the tip of the nose from the center point of the right eye and a tenth vector pointing to the center point of the left eye from the center point of the right eye, wherein the second connecting line refers to a connecting line between the center points of the left eye and the center point of the right eye;
and determining the ratio of the first distance to the second distance as the up-down pitching information of the target face.
Optionally, the face image to be processed is a three-dimensional image, the face key points comprise a left eye, a right eye, a left mouth corner and a right mouth corner, the position information of the face key points comprises coordinates of the left eye, the right eye, the left mouth corner and the right mouth corner in a three-dimensional rectangular coordinate system respectively, and the three-dimensional rectangular coordinate system is a three-dimensional rectangular coordinate system established by taking a reference point of the face image to be processed as an origin;
the third determination submodule is further configured to:
according to the coordinates of the left eye and the right eye in the three-dimensional rectangular coordinate system, determining the coordinates of the binocular center point;
according to the coordinates of the left mouth corner and the right mouth corner in the three-dimensional rectangular coordinate system, determining the coordinates of the mouth-corner center point;
determining a ratio between a sixth difference value and a seventh difference value to obtain a fourth tangent value, wherein the sixth difference value refers to the difference of the Z-axis coordinate values of the binocular center point and the mouth-corner center point, and the seventh difference value refers to the difference of their Y-axis coordinate values;
And determining the fourth tangent value as the up-down pitching information of the target face.
Optionally, the apparatus 500 further includes:
a fourth determining module 504, configured to determine that the face image to be processed is invalid if at least one of the pitch information, the yaw information, and the rotation information in the plane is not within the corresponding preset information range.
In the embodiment of the application, for the target face in the face image to be processed, the face key points in the target face and the position information of the face key points can be determined, then the face key vectors are determined according to the position information of the face key points, and then the face posture information of the target face is determined according to the position information of the face key points and the face key vectors, wherein the face posture information comprises up-down pitching information, left-right rotation information and rotation information in a plane. The application can directly determine the face gesture according to the face key points and the face key vectors, does not need to train a neural network in advance by using a large number of samples, has simple algorithm and improves the rate of determining the face gesture.
It should be noted that: the face pose determining device provided in the above embodiment only illustrates the division of the above functional modules when determining the face pose, in practical application, the above functional allocation may be completed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above. In addition, the face pose determining device provided in the above embodiment and the face pose determining method embodiment belong to the same concept, and detailed implementation processes of the face pose determining device are shown in the method embodiment, and are not repeated here.
Referring to fig. 6, fig. 6 shows a block diagram of a terminal 600 according to an exemplary embodiment of the present application. The terminal 600 may be a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 600 may also be referred to by other names, such as user device, portable terminal, laptop terminal, or desktop terminal.
In general, the terminal 600 includes: a processor 601 and a memory 602.
The processor 601 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 601 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor 601 may also include a main processor and a coprocessor: the main processor, also called a CPU (Central Processing Unit), processes data in the awake state, while the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 601 may integrate a GPU (Graphics Processing Unit) for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 601 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
The memory 602 may include one or more computer-readable storage media, which may be non-transitory. The memory 602 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 602 is used to store at least one instruction for execution by processor 601 to implement the method of determining a face pose provided by an embodiment of the method of the present application.
In some embodiments, the terminal 600 may further optionally include: a peripheral interface 603, and at least one peripheral. The processor 601, memory 602, and peripheral interface 603 may be connected by a bus or signal line. The individual peripheral devices may be connected to the peripheral device interface 603 via buses, signal lines or a circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 604, a touch display 605, a camera 606, audio circuitry 607, a positioning component 608, and a power supply 609.
Peripheral interface 603 may be used to connect at least one Input/Output (I/O) related peripheral to processor 601 and memory 602. In some embodiments, the processor 601, memory 602, and peripheral interface 603 are integrated on the same chip or circuit board; in some other embodiments, either or both of the processor 601, memory 602, and peripheral interface 603 may be implemented on separate chips or circuit boards, which is not limited in this embodiment.
The Radio Frequency circuit 604 is configured to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 604 communicates with a communication network and other communication devices via electromagnetic signals. The radio frequency circuit 604 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 604 includes: antenna systems, RF transceivers, one or more amplifiers, tuners, oscillators, digital signal processors, codec chipsets, subscriber identity module cards, and so forth. The radio frequency circuit 604 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity ) networks. In some embodiments, the radio frequency circuit 604 may also include NFC (Near Field Communication ) related circuits, which the present application is not limited to.
The display screen 605 is used to display a UI (User Interface), which may include graphics, text, icons, video, and any combination thereof. When the display 605 is a touch display, it can also collect touch signals at or above its surface; a touch signal may be input to the processor 601 as a control signal for processing. In this case, the display 605 may also provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there is one display 605, providing the front panel of the terminal 600; in other embodiments, there are at least two displays 605, disposed on different surfaces of the terminal 600 or in a folded design; in still other embodiments, the display 605 may be a flexible display disposed on a curved or folded surface of the terminal 600. The display 605 may even be arranged in a non-rectangular irregular pattern, i.e., an irregularly shaped screen, and may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), or other materials.
The camera assembly 606 is used to capture images or video. Optionally, the camera assembly 606 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting, Virtual Reality (VR) shooting, or other fusion shooting functions. In some embodiments, the camera assembly 606 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.
Those skilled in the art will appreciate that the structure shown in fig. 6 is not limiting of the terminal 600 and may include more or fewer components than shown, or may combine certain components, or may employ a different arrangement of components.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a server 700 according to an embodiment of the present application. The server 700 may vary considerably in configuration or performance, and may include one or more processors (central processing units, CPU) 701 and one or more memories 702, where at least one instruction is stored in the memories 702, and the at least one instruction is loaded and executed by the processors 701 to implement the face pose determination method provided by the foregoing method embodiments. Of course, the server 700 may also have a wired or wireless network interface, a keyboard, an input/output interface, and other components for implementing the functions of the device, which are not described herein.
The application also provides a computer readable storage medium having instructions stored thereon, where the instructions, when executed by a processor, implement the above face pose determination method.
The application also provides a computer program product which, when executed, implements the above face pose determination method.
It should be understood that references herein to "a plurality" mean two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate: A exists alone, A and B exist together, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or by a program instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.
The foregoing description of the preferred embodiments of the application is not intended to limit the application to the precise form disclosed, and any modifications, equivalents, and alternatives falling within the spirit and scope of the application are intended to be included within the scope of the application.

Claims (15)

1. A method for determining a face pose, the method comprising:
determining face key points in a target face and position information of the face key points, wherein the target face is a face in a face image to be processed, and the face key points comprise a left eye, a right eye, a nose tip, a left mouth corner, and a right mouth corner;
Determining a face key vector according to the position information of the face key points;
determining rotation information of the target face in a plane according to the position information of the face key points;
determining left and right rotation information of the target face according to the face key vector;
according to the face key vector, determining the up-down pitching information of the target face;
the determining the left-right rotation information of the target face according to the face key vector includes:
determining a first sine value of an included angle between a first vector pointing to a left mouth corner from a left eye center point and a fifth vector pointing to a nose tip from the left eye center point according to an outer product of the first vector and the fifth vector; taking the first sine value as left rotation information of the target face;
or determining a second sine value of an included angle between a third vector pointing to a right mouth corner from a right eye center point and a sixth vector pointing to a nose tip from the right eye center point according to an outer product of the third vector and the sixth vector; and taking the second sine value as right rotation information of the target face.
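For illustration only, the cross-product computation in claim 1 can be sketched in a few lines of Python; the function name and point arguments below are hypothetical and not taken from the patent. The magnitude of the 2-D outer (cross) product of two vectors equals |v1|·|v5|·sin θ, so dividing by the two norms yields the first sine value:

```python
import math

def yaw_sine(eye_center, mouth_corner, nose_tip):
    """Sine of the angle between the eye->mouth-corner vector (v1) and the
    eye->nose-tip vector (v5), recovered from their 2-D cross product:
    |v1 x v5| = |v1| * |v5| * sin(theta). Points are (x, y) tuples."""
    v1 = (mouth_corner[0] - eye_center[0], mouth_corner[1] - eye_center[1])
    v5 = (nose_tip[0] - eye_center[0], nose_tip[1] - eye_center[1])
    cross = v1[0] * v5[1] - v1[1] * v5[0]  # scalar z-component of v1 x v5
    return abs(cross) / (math.hypot(*v1) * math.hypot(*v5))
```

With the left eye at the origin, the left mouth corner straight below it, and the nose tip at 45°, the function returns sin 45° ≈ 0.707, consistent with the geometric definition.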
2. The method of claim 1, wherein determining location information of face keypoints in the target face comprises:
Establishing a three-dimensional rectangular coordinate system by taking a reference point of the face image to be processed as an origin, wherein an X-axis of the three-dimensional rectangular coordinate system is parallel to a first edge of the face image to be processed, a Y-axis is parallel to a second edge of the face image to be processed, the second edge is perpendicular to the first edge, and a Z-axis is perpendicular to the X-axis and the Y-axis;
and determining the coordinates of the face key points in the three-dimensional rectangular coordinate system, and taking the coordinates of the face key points in the three-dimensional rectangular coordinate system as the position information of the face key points.
3. The method of claim 1, wherein the face key vector comprises at least one of:
a first vector pointing from the left eye center point to the left mouth corner;
a second vector pointing from the left mouth corner to the left eye center point;
a third vector pointing from the right eye center point to the right mouth corner;
a fourth vector pointing from the right mouth corner to the right eye center point;
a fifth vector pointing from the left eye center point to the tip of the nose;
a sixth vector pointing from the right eye center point to the tip of the nose;
a seventh vector pointing from the left mouth corner to the tip of the nose;
an eighth vector pointing from the right mouth corner to the tip of the nose;
a ninth vector pointing from the left eye center point to the right eye center point;
A tenth vector pointing from the right eye center point to the left eye center point;
an eleventh vector pointing from the left mouth corner to the right mouth corner;
a twelfth vector pointing from the right mouth corner to the left mouth corner.
4. The method of claim 1, wherein the position information of the face key points includes coordinates of left and right eyes in a three-dimensional rectangular coordinate system, respectively, the three-dimensional rectangular coordinate system being a three-dimensional rectangular coordinate system established with a reference point of the face image to be processed as an origin;
the determining the rotation information of the target face in the plane according to the position information of the face key points comprises the following steps:
determining a ratio between a first difference value and a second difference value to obtain a first tangent value, wherein the first difference value is a difference value of Y-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system, and the second difference value is a difference value of X-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system;
and determining the first tangent value as rotation information of the target face in a plane.
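The in-plane rotation tangent of claim 4 reduces to a ratio of coordinate differences. A minimal sketch, with a hypothetical function name and (x, y) eye coordinates assumed for illustration:

```python
def roll_tangent(left_eye, right_eye):
    """First tangent value: ratio of the eyes' Y-coordinate difference
    (first difference) to their X-coordinate difference (second
    difference), per claim 4. Points are (x, y) tuples."""
    first_diff = left_eye[1] - right_eye[1]   # Y-axis difference
    second_diff = left_eye[0] - right_eye[0]  # X-axis difference
    return first_diff / second_diff
```

When the eyes are level, the tangent is zero; a non-zero value indicates in-plane (roll) rotation of the target face.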
5. The method according to claim 1, wherein the position information of the face key point includes coordinates of the left mouth corner and the right mouth corner in a three-dimensional rectangular coordinate system, respectively, the three-dimensional rectangular coordinate system being a three-dimensional rectangular coordinate system established with a reference point of the face image to be processed as an origin;
The determining the rotation information of the target face in the plane according to the position information of the face key points comprises the following steps:
determining a ratio between a third difference value and a fourth difference value to obtain a second tangent value, wherein the third difference value is a difference value of the Y-axis coordinate values of the left mouth corner and the right mouth corner in the three-dimensional rectangular coordinate system, and the fourth difference value is a difference value of the X-axis coordinate values of the left mouth corner and the right mouth corner in the three-dimensional rectangular coordinate system;
and determining the second tangent value as rotation information of the target face in a plane.
6. The method of claim 1, wherein determining left-right rotation information of the target face based on the face key vector further comprises:
determining a third sine value of an included angle between a second vector pointing from a left mouth corner to a left eye center point and a seventh vector pointing from the left mouth corner to a nose tip according to an outer product of the second vector and the seventh vector; taking the third sine value as left rotation information of the target face;
or determining a fourth sine value of an included angle between a fourth vector pointing from a right mouth corner to a right eye center point and an eighth vector pointing from the right mouth corner to a nose tip according to an outer product of the fourth vector and the eighth vector; and taking the fourth sine value as right rotation information of the target face.
7. The method according to claim 1, wherein the face image to be processed is a three-dimensional image, and the position information of the face key points includes coordinates of the left eye and the right eye in a three-dimensional rectangular coordinate system, respectively, the three-dimensional rectangular coordinate system having a reference point of the face image to be processed as a coordinate origin, a direction perpendicular to the face image to be processed as a Z-axis, and two directions perpendicular to the Z-axis and perpendicular to each other as an X-axis and a Y-axis;
the method further comprises the steps of:
determining a ratio of a fifth difference value to a second difference value to obtain a third tangent value, wherein the fifth difference value is a difference value of Z-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system, and the second difference value is a difference value of X-axis coordinate values of the left eye and the right eye in the three-dimensional rectangular coordinate system;
and determining the third tangent value as left and right rotation information of the target face.
8. The method of claim 1, wherein determining pitch information for the target face based on the face key vector comprises:
determining a first distance between the nose tip and a first connecting line according to an outer product of a seventh vector pointing from a left mouth corner to the nose tip and an eleventh vector pointing from the left mouth corner to a right mouth corner, or an outer product of an eighth vector pointing from the right mouth corner to the nose tip and a twelfth vector pointing from the right mouth corner to the left mouth corner, wherein the first connecting line is a connecting line between the left mouth corner and the right mouth corner;
And determining the first distance as the up-down pitching information of the target face.
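The point-to-line distance in claim 8 also falls out of the cross product: |v7 × v11| is the area of the parallelogram spanned by the two vectors, and dividing by the base length |v11| gives the height, i.e. the distance from the nose tip to the mouth-corner line. A sketch with hypothetical names, not taken from the patent:

```python
import math

def nose_line_distance(left_mouth, right_mouth, nose_tip):
    """First distance of claim 8: distance from the nose tip to the line
    through the two mouth corners, via |v7 x v11| / |v11|.
    Points are (x, y) tuples."""
    v7 = (nose_tip[0] - left_mouth[0], nose_tip[1] - left_mouth[1])
    v11 = (right_mouth[0] - left_mouth[0], right_mouth[1] - left_mouth[1])
    cross = v7[0] * v11[1] - v7[1] * v11[0]  # parallelogram area (signed)
    return abs(cross) / math.hypot(*v11)     # area / base = height
```

As the face pitches up or down, the nose tip's projected distance to the mouth line changes, which is why this distance serves as pitch information.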
9. The method of claim 1, wherein determining pitch information for the target face based on the face key vector comprises:
determining a second distance between the nose tip and a second connecting line according to the outer product of a fifth vector pointing to the nose tip from the left eye center point and a ninth vector pointing to the right eye center point from the left eye center point, or the outer product of a sixth vector pointing to the nose tip from the right eye center point and a tenth vector pointing to the left eye center point from the right eye center point, wherein the second connecting line is a connecting line between the left eye center point and the right eye center point;
and determining the second distance as the up-down pitching information of the target face.
10. The method of claim 1, wherein determining pitch information for the target face based on the face key vector comprises:
determining a first distance between the nose tip and a first connecting line according to the outer product of a seventh vector pointing from a left mouth corner to the nose tip and an eleventh vector pointing from the left mouth corner to a right mouth corner, or the outer product of an eighth vector pointing from the right mouth corner to the nose tip and a twelfth vector pointing from the right mouth corner to the left mouth corner, wherein the first connecting line is a connecting line between the left mouth corner and the right mouth corner;
Determining a second distance between the nose tip and a second connecting line according to the outer product of a fifth vector pointing to the nose tip from the left eye center point and a ninth vector pointing to the right eye center point from the left eye center point, or the outer product of a sixth vector pointing to the nose tip from the right eye center point and a tenth vector pointing to the left eye center point from the right eye center point, wherein the second connecting line is a connecting line between the left eye center point and the right eye center point;
and determining the ratio of the first distance to the second distance as the up-down pitching information of the target face.
11. The method according to claim 1, wherein the face image to be processed is a three-dimensional image, and the position information of the face key points includes coordinates of the left eye, the right eye, the left mouth corner, and the right mouth corner in a three-dimensional rectangular coordinate system, the three-dimensional rectangular coordinate system having a reference point of the face image to be processed as a coordinate origin, a direction perpendicular to the face image to be processed as a Z-axis, and two directions perpendicular to the Z-axis and perpendicular to each other as an X-axis and a Y-axis;
the method further comprises the steps of:
determining the coordinates of a binocular center point according to the coordinates of the left eye and the right eye in the three-dimensional rectangular coordinate system;
determining the coordinates of a mouth corner center point according to the coordinates of the left mouth corner and the right mouth corner in the three-dimensional rectangular coordinate system;
determining a ratio of a sixth difference value to a seventh difference value to obtain a fourth tangent value, wherein the sixth difference value is a difference value of the Z-axis coordinate values of the binocular center point coordinates and the mouth corner center point coordinates, and the seventh difference value is a difference value of the Y-axis coordinate values of the binocular center point coordinates and the mouth corner center point coordinates;
and determining the fourth tangent value as the up-down pitching information of the target face.
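For the 3-D case of claim 11, the fourth tangent value is a ratio of coordinate differences between two midpoints. A minimal sketch under the assumption that each point is an (x, y, z) tuple; the function name is hypothetical:

```python
def pitch_tangent(left_eye, right_eye, left_mouth, right_mouth):
    """Fourth tangent value of claim 11: the Z-axis difference (sixth
    difference) over the Y-axis difference (seventh difference) between
    the binocular center point and the mouth corner center point.
    Points are (x, y, z) tuples."""
    eye_c = [(a + b) / 2 for a, b in zip(left_eye, right_eye)]
    mouth_c = [(a + b) / 2 for a, b in zip(left_mouth, right_mouth)]
    sixth_diff = eye_c[2] - mouth_c[2]    # Z-axis difference
    seventh_diff = eye_c[1] - mouth_c[1]  # Y-axis difference
    return sixth_diff / seventh_diff
```

Intuitively, as the head pitches, the depth (Z) offset between the eye line and the mouth line grows relative to their vertical (Y) separation, so this ratio tracks up-down pitch.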
12. The method of claim 1, wherein the method further comprises:
and if at least one of the up-down pitching information, the left-right rotation information and the rotation information in the plane is not in the corresponding preset information range, determining that the face image to be processed is invalid.
13. A face pose determination apparatus, the apparatus comprising:
the image acquisition module is used for acquiring face images;
the processor is used for determining a face key point of a target face in the face image and position information of the face key point, wherein the face key point comprises a left eye, a right eye, a nose tip, a left mouth corner, and a right mouth corner; determining a face key vector according to the position information of the face key points; determining rotation information of the target face in a plane according to the position information of the face key points; determining a first sine value of an included angle between a first vector pointing to a left mouth corner from a left eye center point and a fifth vector pointing to a nose tip from the left eye center point according to an outer product of the first vector and the fifth vector; taking the first sine value as left rotation information of the target face; or determining a second sine value of an included angle between a third vector pointing to a right mouth corner from a right eye center point and a sixth vector pointing to a nose tip from the right eye center point according to an outer product of the third vector and the sixth vector; taking the second sine value as right rotation information of the target face; and determining the up-down pitching information of the target face according to the face key vector.
14. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the steps of the method of any one of the preceding claims 1 to 12.
15. A computer readable storage medium having stored thereon instructions which, when executed, implement the steps of the method of any of the preceding claims 1 to 12.
CN202010393328.XA 2020-05-11 2020-05-11 Face pose determining method, device, equipment and storage medium Active CN113642368B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010393328.XA CN113642368B (en) 2020-05-11 2020-05-11 Face pose determining method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010393328.XA CN113642368B (en) 2020-05-11 2020-05-11 Face pose determining method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113642368A CN113642368A (en) 2021-11-12
CN113642368B true CN113642368B (en) 2023-08-18

Family

ID=78415428

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010393328.XA Active CN113642368B (en) 2020-05-11 2020-05-11 Face pose determining method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113642368B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116453222B (en) * 2023-04-19 2024-06-11 北京百度网讯科技有限公司 Target object posture determining method, training device and storage medium

Citations (6)

Publication number Priority date Publication date Assignee Title
JP2006059215A (en) * 2004-08-23 2006-03-02 Seiko Epson Corp Rotation angle detector for object, facial rotation angle detection program, and facial rotation angle detection method
CN107679446A (en) * 2017-08-17 2018-02-09 平安科技(深圳)有限公司 Human face posture detection method, device and storage medium
CN108629283A (en) * 2018-04-02 2018-10-09 北京小米移动软件有限公司 Face tracking method, device, equipment and storage medium
CN109086727A (en) * 2018-08-10 2018-12-25 北京奇艺世纪科技有限公司 A kind of method, apparatus and electronic equipment of the movement angle of determining human body head
CN109389018A (en) * 2017-08-14 2019-02-26 杭州海康威视数字技术股份有限公司 A kind of facial angle recognition methods, device and equipment
CN110363052A (en) * 2018-04-11 2019-10-22 杭州海康威视数字技术股份有限公司 Determine the method, apparatus and computer equipment of the human face posture in image

Patent Citations (7)

Publication number Priority date Publication date Assignee Title
JP2006059215A (en) * 2004-08-23 2006-03-02 Seiko Epson Corp Rotation angle detector for object, facial rotation angle detection program, and facial rotation angle detection method
CN109389018A (en) * 2017-08-14 2019-02-26 杭州海康威视数字技术股份有限公司 A kind of facial angle recognition methods, device and equipment
CN107679446A (en) * 2017-08-17 2018-02-09 平安科技(深圳)有限公司 Human face posture detection method, device and storage medium
WO2019033576A1 (en) * 2017-08-17 2019-02-21 平安科技(深圳)有限公司 Face posture detection method, device, and storage medium
CN108629283A (en) * 2018-04-02 2018-10-09 北京小米移动软件有限公司 Face tracking method, device, equipment and storage medium
CN110363052A (en) * 2018-04-11 2019-10-22 杭州海康威视数字技术股份有限公司 Determine the method, apparatus and computer equipment of the human face posture in image
CN109086727A (en) * 2018-08-10 2018-12-25 北京奇艺世纪科技有限公司 A kind of method, apparatus and electronic equipment of the movement angle of determining human body head

Non-Patent Citations (1)

Title
"Multi-camera based face pose recognition" (基于多相机的人脸姿态识别); Wang Lei et al.; Journal of Computer Applications (《计算机应用》), No. 12, pp. 3307-3310 *

Also Published As

Publication number Publication date
CN113642368A (en) 2021-11-12

Similar Documents

Publication Publication Date Title
CN110647865B (en) Face gesture recognition method, device, equipment and storage medium
US11205282B2 (en) Relocalization method and apparatus in camera pose tracking process and storage medium
CN109712224B (en) Virtual scene rendering method and device and intelligent device
CN110400304B (en) Object detection method, device, equipment and storage medium based on deep learning
CN110210573B (en) Method and device for generating confrontation image, terminal and storage medium
CN112749613B (en) Video data processing method, device, computer equipment and storage medium
CN111754386B (en) Image area shielding method, device, equipment and storage medium
CN110647881B (en) Method, device, equipment and storage medium for determining card type corresponding to image
CN112581358B (en) Training method of image processing model, image processing method and device
CN112230908B (en) Method and device for aligning components, electronic equipment and storage medium
CN111325220B (en) Image generation method, device, equipment and storage medium
CN114579016A (en) Method for sharing input equipment, electronic equipment and system
CN110705614A (en) Model training method and device, electronic equipment and storage medium
CN110738185B (en) Form object identification method, form object identification device and storage medium
KR20220165621A (en) Method and apparatus for displaying contents on display
US8971916B1 (en) Locating a data storage system
CN113642368B (en) Face pose determining method, device, equipment and storage medium
CN111127541B (en) Method and device for determining vehicle size and storage medium
CN112818979B (en) Text recognition method, device, equipment and storage medium
CN110135329B (en) Method, device, equipment and storage medium for extracting gestures from video
CN111860440A (en) Position adjusting method and device for human face characteristic point, terminal and storage medium
CN111860064B (en) Video-based target detection method, device, equipment and storage medium
CN110163192B (en) Character recognition method, device and readable medium
CN111639639B (en) Method, device, equipment and storage medium for detecting text area
CN111179628A (en) Positioning method and device for automatic driving vehicle, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant