CN113569754B - Face key point detection method, device, equipment and computer readable storage medium - Google Patents

Face key point detection method, device, equipment and computer readable storage medium Download PDF

Info

Publication number
CN113569754B
CN113569754B · Application CN202110866863.7A
Authority
CN
China
Prior art keywords
face
offset
key point
loss value
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110866863.7A
Other languages
Chinese (zh)
Other versions
CN113569754A (en)
Inventor
胡魁
戴磊
刘玉宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202110866863.7A priority Critical patent/CN113569754B/en
Publication of CN113569754A publication Critical patent/CN113569754A/en
Priority to PCT/CN2022/072184 priority patent/WO2023005164A1/en
Application granted granted Critical
Publication of CN113569754B publication Critical patent/CN113569754B/en
Legal status: Active

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to the field of face recognition in artificial intelligence, and provides a face key point detection method comprising the following steps: determining a first offset of the face key points according to a first Gaussian heat map and first face key points, and determining a first face attitude angle according to the first face key points; inputting a face image into a neural network model to obtain a second Gaussian heat map, a second offset of the face key points, and a second face attitude angle; determining a model loss value according to the first Gaussian heat map, the first offset, the first face attitude angle, the second Gaussian heat map, the second offset, and the second face attitude angle; determining whether the neural network model has converged according to the model loss value; and, if it has not converged, updating the model parameters and continuing to train the updated neural network model. The method improves the accuracy of face key point detection. The application also relates to blockchain technology: the neural network model and the sample data can be stored in a blockchain.

Description

Face key point detection method, device, equipment and computer readable storage medium
Technical Field
The present application relates to the field of face recognition, and in particular, to a method, apparatus, device, and computer readable storage medium for detecting key points of a face.
Background
Face key point detection is a key step in the fields of face recognition and face analysis, and is a precondition and breakthrough point for other face-related problems such as automatic face recognition, expression analysis, three-dimensional face reconstruction, and three-dimensional animation. Currently, there are two main face key point detection algorithms: one directly regresses the face key points with a fully connected layer; the other trains on heat maps generated by Gaussian kernels as labels, generates a heat map at prediction time, and determines the face key points from the index of the heat map peak. However, neither of these two algorithms can guarantee the accuracy of the detected face key points. Therefore, how to improve the accuracy of face key point detection is a problem to be solved.
Disclosure of Invention
The embodiment of the application provides a method, a device, equipment and a computer-readable storage medium for detecting key points of a human face, aiming at improving the detection accuracy of the key points of the human face.
In a first aspect, an embodiment of the present application provides a method for detecting a key point of a face, including:
acquiring sample data, wherein the sample data comprises a face image, a marked first face key point and a marked first Gaussian heat map;
Determining a first offset of a face key point according to the first Gaussian heat map and the first face key point, and determining a first face attitude angle according to the first face key point;
Inputting the face image into a preset neural network model to obtain a second Gaussian heat map, a second offset of key points of the face and a second face attitude angle;
Determining a model loss value according to the first Gaussian heat map, the first offset, the first face attitude angle, the second Gaussian heat map, the second offset and the second face attitude angle;
Determining whether the neural network model converges according to the model loss value;
If the neural network model is not converged, updating model parameters of the neural network model, and continuing to train the updated neural network model until the model is converged to obtain a face key point detection model;
Acquiring a target face image to be detected, and inputting the target face image into the face key point detection model to obtain a target Gaussian heat map and a target offset of the face key points;
And determining face key points in the target face image according to the target Gaussian heat map and the target offset.
In a second aspect, an embodiment of the present application further provides a face key point detection apparatus, where the face key point detection apparatus includes:
The acquisition module is used for acquiring sample data, wherein the sample data comprises a face image, a marked first face key point and a marked first Gaussian heat map;
The determining module is used for determining a first offset of the face key point according to the first Gaussian heat map and the first face key point, and determining a first face attitude angle according to the first face key point;
The training module is used for inputting the face image into a preset neural network model to obtain a second Gaussian heat map, a second offset of key points of the face and a second face attitude angle;
the training module is further configured to determine a model loss value according to the first gaussian heat map, the first offset, the first face pose angle, the second gaussian heat map, the second offset, and the second face pose angle;
the training module is further used for determining whether the neural network model converges or not according to the model loss value;
the training module is further configured to update model parameters of the neural network model if the neural network model has not converged, and to continue training the updated neural network model until convergence, so as to obtain a face key point detection model;
The key point detection module is used for acquiring a target face image to be detected, inputting the target face image into the face key point detection model, and obtaining a target Gaussian heat map and a target offset of a face key point;
the key point detection module is further used for determining face key points in the target face image according to the target Gaussian heat map and the target offset.
In a third aspect, an embodiment of the present application further provides a computer device, where the computer device includes a processor, a memory, and a computer program stored on the memory and executable by the processor, where the computer program when executed by the processor implements the steps of the face key point detection method as described above.
In a fourth aspect, an embodiment of the present application further provides a computer readable storage medium, where a computer program is stored on the computer readable storage medium, where the computer program, when executed by a processor, implements the steps of the face key point detection method as described above.
The embodiments of the application provide a face key point detection method, apparatus, device, and computer readable storage medium. A neural network model is iteratively trained by combining a Gaussian heat map, the offsets of the face key points, and the face attitude angle to obtain a face key point detection model, which improves the accuracy and precision of the face key point detection model. As a result, after a target face image to be detected is input into the face key point detection model, an accurate target Gaussian heat map and target offsets of the face key points are obtained, and the face key points in the target face image are finally determined by combining the target Gaussian heat map and the target offsets.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for describing the embodiments are briefly introduced below. The drawings in the following description are obviously only some embodiments of the present application, and a person skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of a face key point detection method provided by an embodiment of the application;
FIG. 2 is a schematic diagram of a hierarchical structure of a neural network model according to an embodiment of the present application;
FIG. 3 is a flow chart illustrating sub-steps of the face key point detection method of FIG. 1;
FIG. 4 is a schematic diagram of a hierarchical structure of a face key point detection model according to an embodiment of the present application;
Fig. 5 is a schematic block diagram of a face key point detection device according to an embodiment of the present application;
FIG. 6 is a schematic block diagram of a sub-module of the face keypoint detection apparatus of FIG. 5;
fig. 7 is a schematic block diagram of a computer device according to an embodiment of the present application.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The flow diagrams depicted in the figures are merely illustrative and not necessarily all of the elements and operations/steps are included or performed in the order described. For example, some operations/steps may be further divided, combined, or partially combined, so that the order of actual execution may be changed according to actual situations.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
The embodiments of the application provide a face key point detection method, apparatus, device, and computer-readable storage medium. The face key point detection method can be applied to a terminal device or a server. The terminal device may be a mobile phone, tablet computer, notebook computer, desktop computer, personal digital assistant, wearable device, or the like. The server may be an independent server, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), big data, and artificial intelligence platforms.
Some embodiments of the present application are described in detail below with reference to the accompanying drawings. The following embodiments and features of the embodiments may be combined with each other without conflict.
Referring to fig. 1, fig. 1 is a flowchart of a face key point detection method according to an embodiment of the present application.
As shown in fig. 1, the face key point detection method may include steps S101 to S108.
Step S101, acquiring sample data, wherein the sample data comprises a face image, a marked first face key point and a marked first Gaussian heat map.
The number of the first face key points may be set based on actual situations, which is not specifically limited in this embodiment. For example, the number of first face keypoints may be 68, 96, 98, 106, 186, etc.
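As an illustration, the labeled first Gaussian heat map is typically rendered by evaluating a Gaussian kernel centered on each labeled key point. The following Python sketch shows one common construction; the grid size and the kernel width sigma are illustrative assumptions, not values taken from this application:

```python
import numpy as np

def gaussian_heatmap(x: float, y: float, size: int = 64, sigma: float = 1.5) -> np.ndarray:
    """Render one labeled Gaussian heat map for a key point at (x, y) on a size x size grid."""
    xs, ys = np.meshgrid(np.arange(size), np.arange(size))  # xs varies along columns, ys along rows
    # Peak value 1.0 at the key point, decaying with squared distance from it
    return np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
```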
Step S102, determining a first offset of the face key points according to the first Gaussian heat map and the first face key points, and determining a first face attitude angle according to the first face key points.
Illustratively, first coordinate information of the face key point is obtained from the first Gaussian heat map, and second coordinate information of the first face key point is obtained; the offset between the first coordinate information and the second coordinate information is then determined to obtain the first offset of the face key point.
The first offset of the face key point comprises an x-axis offset and a y-axis offset. For example, if the first coordinate information of the face key point is $(x_i, y_i)$ and the second coordinate information of the first face key point is $(x_{gt}, y_{gt})$, the x-axis offset of the i-th face key point is $x_{gt} - x_i$ and the y-axis offset of the i-th face key point is $y_{gt} - y_i$.
For example, obtaining the first coordinate information of the face key point from the first Gaussian heat map may include: determining the pixel point with the maximum pixel value in the first Gaussian heat map as the face key point, and acquiring the pixel coordinates of that key point from the first Gaussian heat map to obtain the first coordinate information of the face key point.
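The offset computation of step S102 can then be sketched as follows; the function and variable names are illustrative, and in practice the peak coordinate may first be mapped back to image scale by the downsampling rate used in step S108:

```python
import numpy as np

def first_offset(gt_heatmap: np.ndarray, gt_keypoint: np.ndarray) -> np.ndarray:
    """Offset between a labeled key point and the peak of its labeled heat map."""
    # First coordinate information: index of the pixel with the maximum value
    peak_y, peak_x = np.unravel_index(np.argmax(gt_heatmap), gt_heatmap.shape)
    heatmap_xy = np.array([peak_x, peak_y], dtype=np.float32)
    # First offset: second coordinate information minus first coordinate information
    return gt_keypoint.astype(np.float32) - heatmap_xy
```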
For example, the first face attitude angle may be determined from the first face key points as follows: a plurality of first face key points are input into a preset face attitude angle detection model to obtain the first face attitude angle. The face attitude angle detection model is obtained by iteratively training a neural network model based on face key points and labeled face attitude angles, and the neural network model may include a convolutional neural network model, a recurrent convolutional neural network model, and the like.
And step S103, inputting the face image into a preset neural network model to obtain a second Gaussian heat map, a second offset of key points of the face and a second face attitude angle.
The preset neural network model includes an offset prediction sub-network, a face attitude angle prediction sub-network, and a Gaussian heat map generation sub-network. It is understood that the specific hierarchical structures of the offset prediction sub-network, the face attitude angle prediction sub-network, and the Gaussian heat map generation sub-network may be set based on the actual situation, which is not specifically limited in this embodiment. The neural network model can be stored in a blockchain to improve its security.
For example, as shown in fig. 2, the face attitude angle prediction sub-network 11 includes a first convolution layer, a second convolution layer, a third convolution layer, and a first fully connected layer connected in sequence. The offset prediction sub-network 12 includes the first convolution layer, the second convolution layer, a fourth convolution layer, and a second fully connected layer, where the second convolution layer is connected to the fourth convolution layer and the fourth convolution layer is connected to the second fully connected layer. The Gaussian heat map generation sub-network 13 includes the first convolution layer, the second convolution layer, and the fourth convolution layer. That is, the offset prediction sub-network, the face attitude angle prediction sub-network, and the Gaussian heat map generation sub-network may share one or more convolution layers.
Illustratively, the face image may be input into the preset neural network model to obtain the second Gaussian heat map, the second offset of the face key points, and the second face attitude angle as follows: the face image is input into the first convolution layer for convolution processing to obtain a first feature map; the first feature map is input into the second convolution layer for convolution processing to obtain a second feature map; the second feature map is input into the third convolution layer and the fourth convolution layer respectively for convolution processing, the third convolution layer outputting a third feature map and the fourth convolution layer outputting a fourth feature map and the second Gaussian heat map; the third feature map is input into the first fully connected layer to obtain the second face attitude angle; and the fourth feature map is input into the second fully connected layer to obtain the second offset of the face key points.
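The following PyTorch sketch shows one plausible reading of the fig. 2 topology, with the three heads sharing the first two convolution layers. All channel counts, strides, and the split of the fourth convolution layer's output into heat maps plus the fourth feature map are assumptions, not details fixed by this application:

```python
import torch
import torch.nn as nn

class KeypointNet(nn.Module):
    def __init__(self, num_keypoints: int = 68, img_size: int = 128):
        super().__init__()
        # Shared trunk: first and second convolution layers
        self.conv1 = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
        self.conv2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        # Attitude-angle branch: third convolution layer + first fully connected layer
        self.conv3 = nn.Sequential(nn.Conv2d(64, 64, 3, stride=2, padding=1), nn.ReLU())
        self.fc1 = nn.Linear(64 * (img_size // 8) ** 2, 3)  # pitch, yaw, roll
        # Heat-map/offset branch: the fourth convolution layer emits the heat maps
        # plus the fourth feature map (modeled here as extra output channels)
        self.num_keypoints = num_keypoints
        self.conv4 = nn.Conv2d(64, num_keypoints + 64, 3, padding=1)
        self.fc2 = nn.Linear(64 * (img_size // 4) ** 2, num_keypoints * 2)

    def forward(self, x):
        f1 = self.conv1(x)              # first feature map
        f2 = self.conv2(f1)             # second feature map
        f3 = self.conv3(f2)             # third feature map
        pose = self.fc1(f3.flatten(1))  # second face attitude angle
        out4 = self.conv4(f2)
        heatmaps = out4[:, :self.num_keypoints]  # second Gaussian heat maps
        f4 = out4[:, self.num_keypoints:]        # fourth feature map
        offsets = self.fc2(f4.flatten(1)).view(-1, self.num_keypoints, 2)  # second offsets
        return heatmaps, offsets, pose
```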
And step S104, determining a model loss value according to the first Gaussian heat map, the first offset, the first face attitude angle, the second Gaussian heat map, the second offset and the second face attitude angle.
The model loss value is determined by jointly considering the first Gaussian heat map, the first offset, the first face attitude angle, the second Gaussian heat map, the second offset, and the second face attitude angle, which improves the accuracy of the model while preserving its forward inference speed.
In one embodiment, as shown in fig. 3, step S104 includes: substep S1041 to substep S1044.
In sub-step S1041, a first loss value is determined according to the first face pose angle, the second face pose angle, the first offset, and the second offset.
Illustratively, determining a first candidate loss value according to the first face pose angle and the second face pose angle; determining a second candidate loss value according to the first offset and the second offset; and multiplying the first candidate loss value and the second candidate loss value to obtain a first loss value.
Illustratively, the manner of determining the first candidate loss value may be: determining a posture angle difference value between a first face posture angle and a second face posture angle, wherein the posture angle difference value comprises a pitch angle difference value, a yaw angle difference value and a roll angle difference value; determining a first cosine value of the pitch angle difference value, a second cosine value of the yaw angle difference value and a third cosine value of the roll angle difference value; and determining an average cosine value of the first cosine value, the second cosine value and the third cosine value, and subtracting the average cosine value from 1 to obtain a first candidate loss value.
Illustratively, the manner of determining the second candidate loss value may be: determining the absolute value of the difference between the first offset and the second offset to obtain an offset difference value; if the offset difference value is smaller than 1, taking one half of the square of the offset difference value as the loss value corresponding to the face key point; if the offset difference value is greater than or equal to 1, subtracting 0.5 from the offset difference value to obtain the loss value corresponding to the face key point; and accumulating the loss values of all face key points to obtain a total loss value, and dividing the total loss value by the number of face key points to obtain the second candidate loss value.
For example, the first loss value may be determined by the following first loss function:

$$L_1 = \left(1 - \frac{1}{3}\sum_{j=1}^{3}\cos\left(\theta_j - \hat{\theta}_j\right)\right) \cdot \frac{1}{m}\sum_{i=1}^{m}\mathrm{smooth}_{L1}\left(\delta_i - \hat{\delta}_i\right), \qquad \mathrm{smooth}_{L1}(d) = \begin{cases} \tfrac{1}{2}d^{2}, & |d| < 1 \\ |d| - 0.5, & |d| \geq 1 \end{cases}$$

wherein $\theta_j$ is the first face attitude angle ($\theta_1$ the pitch angle, $\theta_2$ the yaw angle, $\theta_3$ the roll angle), $\hat{\theta}_j$ is the second face attitude angle ($\hat{\theta}_1$ the pitch angle, $\hat{\theta}_2$ the yaw angle, $\hat{\theta}_3$ the roll angle), $m$ is the number of face key points, $\delta_i$ is the first offset of face key point $i$, and $\hat{\delta}_i$ is the second offset of face key point $i$.
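A minimal PyTorch rendering of this first loss function, assuming the attitude angles are given in radians as (B, 3) tensors and the offsets as (B, m, 2) tensors:

```python
import torch
import torch.nn.functional as F

def first_loss(pose_gt, pose_pred, off_gt, off_pred):
    # First candidate loss: 1 minus the average cosine of the three angle differences
    pose_term = 1.0 - torch.cos(pose_gt - pose_pred).mean(dim=1)
    # Second candidate loss: mean smooth-L1 over the key-point offsets;
    # beta=1.0 gives exactly the 0.5*d^2 / |d|-0.5 piecewise form defined above
    off_term = F.smooth_l1_loss(off_pred, off_gt, reduction="none", beta=1.0)
    off_term = off_term.flatten(1).mean(dim=1)
    # First loss: product of the two candidate losses, averaged over the batch
    return (pose_term * off_term).mean()
```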
Substep S1042, determining a second loss value from the first gaussian heat map and the second gaussian heat map.
Illustratively, the pixel value difference between each pixel point in the first Gaussian heat map and the corresponding pixel point in the second Gaussian heat map is determined, and the square of the pixel value difference is computed; the squared differences of all pixel points are accumulated to obtain a total pixel value, the number of pixel points in the first Gaussian heat map is counted to obtain a total number of pixel points, and the total pixel value is divided by the total number of pixel points to obtain the second loss value.
For example, the second loss value may be determined by the following second loss function:

$$L_2 = \frac{1}{N}\sum_{i=1}^{N}\left(v_i - \hat{v}_i\right)^{2}$$

wherein $N$ is the total number of pixel points, $v_i$ is the pixel value of the $i$-th pixel point in the first Gaussian heat map, and $\hat{v}_i$ is the pixel value of the $i$-th pixel point in the second Gaussian heat map.
And step S1043, determining a third loss value according to the first face attitude angle and the second face attitude angle.
For example, determining a pose angle difference between a first face pose angle and a second face pose angle, wherein the pose angle difference comprises a pitch angle difference, a yaw angle difference, and a roll angle difference; if the absolute value of the attitude angle difference value is smaller than 1, determining one half of the square value of the attitude angle difference value as an attitude angle loss value, wherein the attitude angle loss value comprises a pitch angle loss value, a yaw angle loss value and a roll angle loss value; if the absolute value of the attitude angle difference value is greater than or equal to 1, subtracting 0.5 from the absolute value of the attitude angle difference value to obtain an attitude angle loss value; an average loss value is determined from the pitch angle loss value, the yaw angle loss value, and the roll angle loss value, and the average loss value is determined as a third loss value.
For example, the third loss value may be determined by the following third loss function:

$$L_3 = \frac{1}{3}\sum_{j=1}^{3}\mathrm{smooth}_{L1}\left(\theta_j - \hat{\theta}_j\right)$$

wherein $\theta_j$ is the first face attitude angle ($\theta_1$ the pitch angle, $\theta_2$ the yaw angle, $\theta_3$ the roll angle), $\hat{\theta}_j$ is the second face attitude angle ($\hat{\theta}_1$ the pitch angle, $\hat{\theta}_2$ the yaw angle, $\hat{\theta}_3$ the roll angle), and $\mathrm{smooth}_{L1}$ is defined as above.
Substep S1044, determining a model loss value based on the first loss value, the second loss value, and the third loss value.
Illustratively, determining a product between the second loss value and a preset first weighting coefficient to obtain a first weighted loss value; determining the product between the third loss value and a preset second weighting coefficient to obtain a second weighting loss value; and summing the first loss value, the first weighted loss value and the second weighted loss value to obtain a model loss value.
Wherein the model loss value may be determined by a target loss function, and the target loss function is determined based on the first, second, and third loss functions. For example, the target loss function is $L = L_1 + \alpha \cdot L_2 + \beta \cdot L_3$, where $L_1$ is the first loss function, $L_2$ is the second loss function, $L_3$ is the third loss function, $\alpha$ is the first weighting coefficient, and $\beta$ is the second weighting coefficient.
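Putting the three parts together, a sketch of the target loss function, reusing first_loss from the sketch above; the default values of alpha and beta are assumptions:

```python
import torch.nn.functional as F

def model_loss(hm_gt, hm_pred, pose_gt, pose_pred, off_gt, off_pred,
               alpha: float = 1.0, beta: float = 1.0):
    l1 = first_loss(pose_gt, pose_pred, off_gt, off_pred)  # first loss value
    l2 = F.mse_loss(hm_pred, hm_gt)                        # second loss: per-pixel MSE
    l3 = F.smooth_l1_loss(pose_pred, pose_gt, beta=1.0)    # third loss: smooth L1 on angles
    return l1 + alpha * l2 + beta * l3                     # L = L1 + alpha*L2 + beta*L3
```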
Step S105, determining whether the neural network model is converged according to the model loss value.
For example, it is determined whether the model loss value is less than a preset loss value; if the model loss value is less than the preset loss value, it is determined that the neural network model has converged, and if the model loss value is greater than or equal to the preset loss value, it is determined that the neural network model has not converged. The preset loss value may be set based on the actual situation, which is not specifically limited in this embodiment.
And step S106, if the neural network model is not converged, updating model parameters of the neural network model, and continuing to train the updated neural network model until the face key point detection model is converged.
Continuing to train the updated neural network model comprises repeatedly performing steps S101 to S105, with different sample data acquired each time step S101 is performed. The model parameters of the neural network model may include the first weighting coefficient, the second weighting coefficient, and the parameters of each layer of the neural network model.
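A minimal training loop covering steps S101 to S106 might look as follows; the preset loss value, the epoch budget, and the data-loader layout are assumptions:

```python
import torch

def train(model, loader, optimizer, preset_loss: float = 1e-3, max_epochs: int = 100):
    for epoch in range(max_epochs):
        for img, hm_gt, pose_gt, off_gt in loader:  # fresh sample data each pass (step S101)
            hm_pred, off_pred, pose_pred = model(img)
            loss = model_loss(hm_gt, hm_pred, pose_gt, pose_pred, off_gt, off_pred)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                        # update model parameters (step S106)
        if loss.item() < preset_loss:               # convergence test (step S105)
            break
    return model                                    # face key point detection model
```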
And S107, acquiring a target face image to be detected, and inputting the target face image into a face key point detection model to obtain a target Gaussian heat map and a target offset of the face key points.
The face key point detection model includes an offset prediction sub-network and a gaussian heat map generation sub-network, and it is understood that specific hierarchical structures of the offset prediction sub-network and the gaussian heat map generation sub-network may be set based on actual situations, which is not specifically limited in this embodiment.
For example, as shown in fig. 4, the offset prediction sub-network 21 includes a fifth convolution layer, a sixth convolution layer, a seventh convolution layer, and a third full connection layer, and the fifth convolution layer is connected to the sixth convolution layer, the sixth convolution layer is connected to the seventh convolution layer, the seventh convolution layer is connected to the third full connection layer, and the gaussian heat map generation sub-network 22 includes the fifth convolution layer, the sixth convolution layer, and the seventh convolution layer.
For example, the target face image may be input into the face key point detection model to obtain the target Gaussian heat map and the target offset of the face key points as follows: the target face image is input into the fifth convolution layer for convolution processing to obtain a fifth feature map; the fifth feature map is input into the sixth convolution layer for convolution processing to obtain a sixth feature map; the sixth feature map is input into the seventh convolution layer for convolution processing to obtain the target Gaussian heat map and a seventh feature map; and the seventh feature map is input into the third fully connected layer for processing to obtain the target offset of the face key points.
And S108, determining face key points in the target face image according to the target Gaussian heat map and the target offset.
Illustratively, third coordinate information of the face key points is obtained from the target Gaussian heat map, and the downsampling rate of the face key point detection model is obtained; the third coordinate information is multiplied by the downsampling rate to obtain fourth coordinate information of the face key points; and the fourth coordinate information is added to the target offset to obtain target coordinate information of the face key points in the target face image. The target offset includes an x-axis offset and a y-axis offset.
For example, if the third coordinate information of face key point $i$ is $(x_i, y_i)$, the downsampling rate of the face key point detection model is $s$, and the x-axis and y-axis offsets in the target offset corresponding to face key point $i$ are $\delta_{xi}$ and $\delta_{yi}$ respectively, then the target coordinate information of face key point $i$ is $(x_{gt}, y_{gt})$, where $x_{gt} = s \cdot x_i + \delta_{xi}$ and $y_{gt} = s \cdot y_i + \delta_{yi}$.
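A sketch of this decoding step, assuming batched (B, m, H, W) heat maps and (B, m, 2) offsets:

```python
import torch

def decode_keypoints(heatmaps, offsets, s: int):
    B, m, H, W = heatmaps.shape
    flat_idx = heatmaps.flatten(2).argmax(dim=2)                 # peak index per key point
    xs = (flat_idx % W).float()                                  # third coordinate information (x)
    ys = torch.div(flat_idx, W, rounding_mode="floor").float()   # third coordinate information (y)
    coords = torch.stack([xs, ys], dim=2) * s                    # fourth coordinate information
    return coords + offsets                                      # target coordinate information
```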
According to the face key point detection method described above, the neural network model is iteratively trained by combining the Gaussian heat map, the offsets of the face key points, and the face attitude angle to obtain the face key point detection model, which improves the accuracy and precision of the model. As a result, after the target face image to be detected is input into the face key point detection model, an accurate target Gaussian heat map and target offsets of the face key points are obtained, and the face key points in the target face image are finally determined by combining the target Gaussian heat map and the target offsets.
Referring to fig. 5, fig. 5 is a schematic block diagram of a face key point detection apparatus according to an embodiment of the present application.
As shown in fig. 5, the face key point detection apparatus 200 includes:
An obtaining module 210, configured to obtain sample data, where the sample data includes a face image, a labeled first face key point, and a labeled first gaussian heat map;
A determining module 220, configured to determine a first offset of a face key point according to the first gaussian heat map and the first face key point, and determine a first face pose angle according to the first face key point;
The training module 230 is configured to input the face image into a preset neural network model, so as to obtain a second gaussian heat map, a second offset of a face key point, and a second face pose angle;
The training module 230 is further configured to determine a model loss value according to the first gaussian heat map, the first offset, the first face pose angle, the second gaussian heat map, the second offset, and the second face pose angle;
the training module 230 is further configured to determine whether the neural network model converges according to the model loss value;
The training module 230 is further configured to update model parameters of the neural network model if the neural network model has not converged, and to continue training the updated neural network model until convergence, so as to obtain a face key point detection model;
the key point detection module 240 is configured to obtain a target face image to be detected, and input the target face image into the face key point detection model to obtain a target gaussian heat map and a target offset of a face key point;
The keypoint detection module 240 is further configured to determine a face keypoint in the target face image according to the target gaussian heat map and the target offset.
In an embodiment, the determining module 220 is further configured to:
Acquiring first coordinate information of a face key point from the first Gaussian heat map, and acquiring second coordinate information of the first face key point;
And determining the offset between the first coordinate information and the second coordinate information to obtain a first offset of the key point of the face.
In one embodiment, as shown in fig. 6, the training module 230 includes:
A first determining sub-module 231, configured to determine a first loss value according to the first face pose angle, the second face pose angle, the first offset, and the second offset;
a second determination submodule 232 for determining a second loss value from the first gaussian heat map and the second gaussian heat map;
A third determining submodule 233, configured to determine a third loss value according to the first face pose angle and the second face pose angle;
a fourth determination submodule 234 is configured to determine a model loss value based on the first loss value, the second loss value, and the third loss value.
In an embodiment, the first determining sub-module 231 is further configured to:
Determining a first candidate loss value according to the first face attitude angle and the second face attitude angle;
determining a second candidate loss value according to the first offset and the second offset;
And multiplying the first candidate loss value and the second candidate loss value to obtain the first loss value.
In an embodiment, the first determining sub-module 231 is further configured to:
Determining a posture angle difference value between the first face posture angle and the second face posture angle, wherein the posture angle difference value comprises a pitch angle difference value, a yaw angle difference value and a roll angle difference value;
determining a first cosine value of the pitch angle difference value, a second cosine value of the yaw angle difference value and a third cosine value of the roll angle difference value;
and determining the average cosine values of the first cosine value, the second cosine value and the third cosine value, and subtracting the average cosine value from 1 to obtain a first candidate loss value.
In an embodiment, the fourth determination submodule 234 is further configured to:
determining the product between the second loss value and a preset first weighting coefficient to obtain a first weighting loss value;
determining a product between the third loss value and a preset second weighting coefficient to obtain a second weighting loss value, wherein the model parameter comprises the first weighting coefficient and the second weighting coefficient;
And summing the first loss value, the first weighted loss value and the second weighted loss value to obtain a model loss value.
In an embodiment, the keypoint detection module 240 is further configured to:
acquiring third coordinate information of the face key points from the target Gaussian heat map, and acquiring the downsampling rate of the face key point detection model;
multiplying the third coordinate information and the downsampling rate to obtain fourth coordinate information of the key points of the face;
And carrying out addition operation on the fourth coordinate information and the target offset to obtain target coordinate information of the face key points in the target face image.
It should be noted that, for convenience and brevity of description, specific working processes of the above-described apparatus and modules and units may refer to corresponding processes in the foregoing face key point detection method embodiment, and are not described herein again.
The apparatus provided by the above embodiments may be implemented in the form of a computer program which may be run on a computer device as shown in fig. 7.
Referring to fig. 7, fig. 7 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device may be a server or a terminal device.
As shown in fig. 7, the computer device includes a processor, a memory, and a network interface connected by a system bus, wherein the memory may include a storage medium and an internal memory.
The storage medium may store an operating system and a computer program. The computer program comprises program instructions which, when executed, cause the processor to perform any one of a number of face key point detection methods.
The processor is used to provide computing and control capabilities to support the operation of the entire computer device.
The network interface is used for network communication such as transmitting assigned tasks and the like. It will be appreciated by those skilled in the art that the structure shown in FIG. 7 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
It should be appreciated that the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or any conventional processor.
Wherein in an embodiment the processor is configured to run a computer program stored in the memory to implement the steps of:
acquiring sample data, wherein the sample data comprises a face image, a marked first face key point and a marked first Gaussian heat map;
Determining a first offset of a face key point according to the first Gaussian heat map and the first face key point, and determining a first face attitude angle according to the first face key point;
Inputting the face image into a preset neural network model to obtain a second Gaussian heat map, a second offset of key points of the face and a second face attitude angle;
Determining a model loss value according to the first Gaussian heat map, the first offset, the first face attitude angle, the second Gaussian heat map, the second offset and the second face attitude angle;
Determining whether the neural network model converges according to the model loss value;
If the neural network model is not converged, updating model parameters of the neural network model, and continuing to train the updated neural network model until the model is converged to obtain a face key point detection model;
Acquiring a target face image to be detected, and inputting the target face image into the face key point detection model to obtain a target Gaussian heat map and a target offset of the face key points;
And determining face key points in the target face image according to the target Gaussian heat map and the target offset.
In an embodiment, when implementing determining the first offset of the face key point according to the first gaussian heat diagram and the first face key point, the processor is configured to implement:
Acquiring first coordinate information of a face key point from the first Gaussian heat map, and acquiring second coordinate information of the first face key point;
And determining the offset between the first coordinate information and the second coordinate information to obtain a first offset of the key point of the face.
In an embodiment, when determining the model loss value according to the first gaussian heat map, the first offset, the first face pose angle, the second gaussian heat map, the second offset, and the second face pose angle, the processor is configured to implement:
determining a first loss value according to the first face pose angle, the second face pose angle, the first offset and the second offset;
determining a second loss value according to the first Gaussian heat map and the second Gaussian heat map;
Determining a third loss value according to the first face attitude angle and the second face attitude angle;
and determining a model loss value according to the first loss value, the second loss value and the third loss value.
In an embodiment, the processor, when implementing determining a model loss value based on the first loss value, the second loss value, and the third loss value, is configured to implement:
determining the product between the second loss value and a preset first weighting coefficient to obtain a first weighting loss value;
determining a product between the third loss value and a preset second weighting coefficient to obtain a second weighting loss value, wherein the model parameter comprises the first weighting coefficient and the second weighting coefficient;
And summing the first loss value, the first weighted loss value and the second weighted loss value to obtain a model loss value.
In an embodiment, the processor is configured to, when implementing determining the first loss value according to the first face pose angle, the second face pose angle, the first offset, and the second offset, implement:
Determining a first candidate loss value according to the first face attitude angle and the second face attitude angle;
determining a second candidate loss value according to the first offset and the second offset;
And multiplying the first candidate loss value and the second candidate loss value to obtain the first loss value.
In an embodiment, the processor is configured, when implementing determining the first candidate loss value according to the first face pose angle and the second face pose angle, to implement:
Determining a posture angle difference value between the first face posture angle and the second face posture angle, wherein the posture angle difference value comprises a pitch angle difference value, a yaw angle difference value and a roll angle difference value;
determining a first cosine value of the pitch angle difference value, a second cosine value of the yaw angle difference value and a third cosine value of the roll angle difference value;
and determining the average cosine values of the first cosine value, the second cosine value and the third cosine value, and subtracting the average cosine value from 1 to obtain a first candidate loss value.
In an embodiment, when determining a face key point in the target face image according to the target gaussian heat map and the target offset, the processor is configured to implement:
acquiring third coordinate information of the face key points from the target Gaussian heat map, and acquiring the downsampling rate of the face key point detection model;
multiplying the third coordinate information and the downsampling rate to obtain fourth coordinate information of the key points of the face;
And carrying out addition operation on the fourth coordinate information and the target offset to obtain target coordinate information of the face key points in the target face image.
It should be noted that, for convenience and brevity of description, specific working processes of the above-described computer device may refer to corresponding processes in the foregoing face key point detection method embodiment, and are not described herein again.
From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the embodiments or some parts of the embodiments of the present application.
The embodiment of the application also provides a computer readable storage medium, and a computer program is stored on the computer readable storage medium, and the computer program comprises program instructions, and the method implemented by the program instructions when being executed can refer to various embodiments of the face key point detection method.
Wherein the computer readable storage medium may be volatile or nonvolatile. The computer readable storage medium may be an internal storage unit of the computer device of the foregoing embodiment, for example, a hard disk or memory of the computer device. The computer readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the computer device.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created from the use of blockchain nodes, and the like.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a string of data blocks generated in association using cryptographic methods, each data block containing a batch of network transaction information used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
It is to be understood that the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The foregoing embodiment numbers of the present application are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments. While the application has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and substitutions of equivalents may be made and equivalents will be apparent to those skilled in the art without departing from the scope of the application. Therefore, the protection scope of the application is subject to the protection scope of the claims.

Claims (10)

1. A face key point detection method, characterized by comprising the following steps:
acquiring sample data, wherein the sample data comprises a face image, a marked first face key point and a marked first Gaussian heat map;
Determining a first offset of a face key point according to the first Gaussian heat map and the first face key point, and determining a first face attitude angle according to the first face key point;
Inputting the face image into a preset neural network model, and respectively outputting a second Gaussian heat map, a second offset of a face key point and a second face attitude angle by a Gaussian heat map generation sub-network, an offset prediction sub-network and a face attitude angle prediction sub-network in the neural network model, wherein the Gaussian heat map generation sub-network, the offset prediction sub-network and the face attitude angle prediction sub-network share one or more convolution layers;
Determining a model loss value according to the first Gaussian heat map, the first offset, the first face attitude angle, the second Gaussian heat map, the second offset and the second face attitude angle;
Determining whether the neural network model converges according to the model loss value;
If the neural network model is not converged, updating model parameters of the neural network model, and continuing to train the updated neural network model until the model is converged to obtain a face key point detection model;
Acquiring a target face image to be detected, and inputting the target face image into the face key point detection model to obtain a target Gaussian heat map and a target offset of the face key points;
And determining face key points in the target face image according to the target Gaussian heat map and the target offset.
2. The method of claim 1, wherein determining a first offset of a face key point according to the first gaussian heat map and the first face key point comprises:
Acquiring first coordinate information of a face key point from the first Gaussian heat map, and acquiring second coordinate information of the first face key point;
And determining the offset between the first coordinate information and the second coordinate information to obtain a first offset of the key point of the face.
3. The face key point detection method according to claim 1, wherein the determining a model loss value according to the first gaussian heat map, the first offset, the first face pose angle, the second gaussian heat map, the second offset, the second face pose angle includes:
determining a first loss value according to the first face pose angle, the second face pose angle, the first offset and the second offset;
determining a second loss value according to the first Gaussian heat map and the second Gaussian heat map;
Determining a third loss value according to the first face attitude angle and the second face attitude angle;
and determining a model loss value according to the first loss value, the second loss value and the third loss value.
4. A face key point detection method according to claim 3, wherein said determining a model loss value from said first loss value, said second loss value, and said third loss value comprises:
determining the product between the second loss value and a preset first weighting coefficient to obtain a first weighting loss value;
determining a product between the third loss value and a preset second weighting coefficient to obtain a second weighting loss value, wherein the model parameter comprises the first weighting coefficient and the second weighting coefficient;
And summing the first loss value, the first weighted loss value and the second weighted loss value to obtain a model loss value.
5. The face key point detection method according to claim 3, wherein the determining a first loss value according to the first face pose angle, the second face pose angle, the first offset, and the second offset includes:
Determining a first candidate loss value according to the first face attitude angle and the second face attitude angle;
determining a second candidate loss value according to the first offset and the second offset;
And multiplying the first candidate loss value and the second candidate loss value to obtain the first loss value.
6. The method of claim 5, wherein determining a first candidate loss value according to the first face pose angle and the second face pose angle comprises:
Determining a posture angle difference value between the first face posture angle and the second face posture angle, wherein the posture angle difference value comprises a pitch angle difference value, a yaw angle difference value and a roll angle difference value;
determining a first cosine value of the pitch angle difference value, a second cosine value of the yaw angle difference value and a third cosine value of the roll angle difference value;
and determining the average cosine values of the first cosine value, the second cosine value and the third cosine value, and subtracting the average cosine value from 1 to obtain a first candidate loss value.
7. The face keypoint detection method according to any one of claims 1 to 5, wherein said determining a face keypoint in the target face image from the target gaussian heat map and the target offset comprises:
acquiring third coordinate information of the face key points from the target Gaussian heat map, and acquiring the downsampling rate of the face key point detection model;
multiplying the third coordinate information and the downsampling rate to obtain fourth coordinate information of the key points of the face;
And carrying out addition operation on the fourth coordinate information and the target offset to obtain target coordinate information of the face key points in the target face image.
8. A face key point detection apparatus, characterized in that the face key point detection apparatus comprises:
The acquisition module is used for acquiring sample data, wherein the sample data comprises a face image, a marked first face key point and a marked first Gaussian heat map;
The determining module is used for determining a first offset of the face key point according to the first Gaussian heat map and the first face key point, and determining a first face attitude angle according to the first face key point;
the training module is used for inputting the face image into a preset neural network model, and outputting a second Gaussian heat map, a second offset of a face key point and a second face attitude angle by a Gaussian heat map generation sub-network, an offset prediction sub-network and a face attitude angle prediction sub-network in the neural network model respectively, wherein the Gaussian heat map generation sub-network, the offset prediction sub-network and the face attitude angle prediction sub-network share one or more convolution layers;
the training module is further configured to determine a model loss value according to the first gaussian heat map, the first offset, the first face pose angle, the second gaussian heat map, the second offset, and the second face pose angle;
the training module is further used for determining whether the neural network model converges or not according to the model loss value;
the training module is further configured to update model parameters of the neural network model if the neural network model has not converged, and to continue training the updated neural network model until convergence, so as to obtain a face key point detection model;
The key point detection module is used for acquiring a target face image to be detected, inputting the target face image into the face key point detection model, and obtaining a target Gaussian heat map and a target offset of a face key point;
the key point detection module is further used for determining face key points in the target face image according to the target Gaussian heat map and the target offset.
9. A computer device comprising a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program when executed by the processor implements the steps of the face key point detection method according to any one of claims 1 to 7.
10. A computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, wherein the computer program, when executed by a processor, implements the steps of the face key point detection method according to any one of claims 1 to 7.
CN202110866863.7A 2021-07-29 2021-07-29 Face key point detection method, device, equipment and computer readable storage medium Active CN113569754B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110866863.7A CN113569754B (en) 2021-07-29 2021-07-29 Face key point detection method, device, equipment and computer readable storage medium
PCT/CN2022/072184 WO2023005164A1 (en) 2021-07-29 2022-01-14 Face key point detection method and apparatus, device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110866863.7A CN113569754B (en) 2021-07-29 2021-07-29 Face key point detection method, device, equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN113569754A CN113569754A (en) 2021-10-29
CN113569754B (en) 2024-05-07

Family

ID=78169181

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110866863.7A Active CN113569754B (en) 2021-07-29 2021-07-29 Face key point detection method, device, equipment and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN113569754B (en)
WO (1) WO2023005164A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113569754B (en) * 2021-07-29 2024-05-07 平安科技(深圳)有限公司 Face key point detection method, device, equipment and computer readable storage medium
CN114399803A (en) * 2021-11-30 2022-04-26 际络科技(上海)有限公司 Face key point detection method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860300A (en) * 2020-07-17 2020-10-30 广州视源电子科技股份有限公司 Key point detection method and device, terminal equipment and storage medium
CN112801043A (en) * 2021-03-11 2021-05-14 河北工业大学 Real-time video face key point detection method based on deep learning

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229488B (en) * 2016-12-27 2021-01-01 北京市商汤科技开发有限公司 Method and device for detecting key points of object and electronic equipment
CN112580515B (en) * 2020-12-21 2022-05-10 浙江大学 Lightweight face key point detection method based on Gaussian heat map regression
CN113569754B (en) * 2021-07-29 2024-05-07 平安科技(深圳)有限公司 Face key point detection method, device, equipment and computer readable storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860300A (en) * 2020-07-17 2020-10-30 广州视源电子科技股份有限公司 Key point detection method and device, terminal equipment and storage medium
CN112801043A (en) * 2021-03-11 2021-05-14 河北工业大学 Real-time video face key point detection method based on deep learning

Also Published As

Publication number Publication date
WO2023005164A1 (en) 2023-02-02
CN113569754A (en) 2021-10-29

Similar Documents

Publication Publication Date Title
CN111079570A (en) Human body key point identification method and device and electronic equipment
CN113569754B (en) Face key point detection method, device, equipment and computer readable storage medium
CN113673439B (en) Pet dog identification method, device, equipment and storage medium based on artificial intelligence
CN113657318B (en) Pet classification method, device, equipment and storage medium based on artificial intelligence
CN113688912A (en) Confrontation sample generation method, device, equipment and medium based on artificial intelligence
CN112651490B (en) Training method and device for human face key point detection model and readable storage medium
CN111709268B (en) Human hand posture estimation method and device based on human hand structure guidance in depth image
KR20230169104A (en) Personalized biometric anti-spoofing protection using machine learning and enrollment data
US20200005078A1 (en) Content aware forensic detection of image manipulations
CN117372877A (en) Star map identification method and device based on neural network and related medium
CN114863201A (en) Training method and device of three-dimensional detection model, computer equipment and storage medium
CN111652245B (en) Vehicle contour detection method, device, computer equipment and storage medium
CN113657321B (en) Dog face key point detection method, device, equipment and medium based on artificial intelligence
CN113111687A (en) Data processing method and system and electronic equipment
CN113837998B (en) Method and device for automatically adjusting and aligning pictures based on deep learning
CN114241411A (en) Counting model processing method and device based on target detection and computer equipment
CN114627170A (en) Three-dimensional point cloud registration method and device, computer equipment and storage medium
CN116543425A (en) Palm detection method and device based on YOLOv4, computer equipment and storage medium
CN113643348B (en) Face attribute analysis method and device
Matsui et al. Automatic feature point selection through hybrid metaheauristics based on Tabu search and memetic algorithm for augmented reality
CN110852386B (en) Data classification method, apparatus, computer device and readable storage medium
CN118155010A (en) Recognition model training method, device, computer equipment and storage medium
CN117953582A (en) Attitude estimation method, equipment and storage medium
CN116311460A (en) Image generation method, device, equipment and storage medium
CN113239996A (en) Active learning method, device, equipment and storage medium based on target detection

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant