CN114463826B - Facial expression recognition method, electronic device and storage medium - Google Patents


Info

Publication number
CN114463826B
CN114463826B
Authority
CN
China
Prior art keywords
organ
expression
matched
face
action
Prior art date
Legal status
Active
Application number
CN202210376844.0A
Other languages
Chinese (zh)
Other versions
CN114463826A (en)
Inventor
李立业
陈智超
寇鸿斌
付贤强
Current Assignee
Anhui Lushenshi Technology Co ltd
Original Assignee
Hefei Dilusense Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hefei Dilusense Technology Co Ltd
Priority to CN202210376844.0A
Publication of CN114463826A
Application granted
Publication of CN114463826B

Landscapes

  • Image Analysis (AREA)

Abstract

Embodiments of this application relate to the field of face recognition and disclose a facial expression recognition method, an electronic device, and a storage medium. The facial expression recognition method includes: calculating the three-dimensional variation of each corresponding organ between a first depth map of a face to be detected and a standard depth map of the same face, where the standard depth map is a depth map in a natural expression state; and sequentially matching the three-dimensional variation of at least one organ against the organ action combinations in an expression library that contain that organ, taking the expression type of the organ action combination matched by the three-dimensional variation of the last of the at least one organ as the expression type of the face in the first depth map. For every organ after the first, both the organ itself and the candidate organ action combinations it is matched against are determined by the result of the previous matching round. The expression type of the face to be detected can therefore be determined with as little as a single matching pass, giving both high recognition speed and high recognition accuracy.

Description

Facial expression recognition method, electronic device and storage medium
Technical Field
Embodiments of this application relate to the field of face recognition, and in particular to a facial expression recognition method, an electronic device, and a storage medium.
Background
Facial expression recognition technology analyzes a facial image to be detected and assigns it an expression category, such as anger, disgust, happiness, or fear. It plays an important role in fields such as human behavior pattern analysis and human-computer interaction design. At present, deep-learning-based facial expression recognition is widely used: a facial image is fed into a trained expression recognition model (such as a convolutional neural network or a residual network) to extract facial features, and expression recognition is performed on those features. However, to reach high recognition accuracy, this approach requires training a complex neural network with a large amount of computation, which is time-consuming and costly.
Disclosure of Invention
An object of the embodiments of the present application is to provide a facial expression recognition method, an electronic device, and a storage medium that can determine the expression type of a face to be detected with as little as a single matching pass between a three-dimensional variation and a set of pre-stored organ action combinations, achieving both fast recognition and high recognition accuracy.
To solve the above technical problem, an embodiment of the present application provides a facial expression recognition method, including: calculating the three-dimensional variation of each corresponding organ between a first depth map of a face to be detected and a standard depth map of the face to be detected, where the standard depth map is a depth map in a natural expression state; and sequentially matching the three-dimensional variation of at least one organ against the organ action combinations in an expression library that contain that organ, taking the expression type of the organ action combination matched by the three-dimensional variation of the last of the at least one organ as the expression type of the face in the first depth map. Each organ action combination in the expression library corresponds to one expression type and records the three-dimensional variation of each organ it contains from the natural expression to that expression type. For every organ after the first, both the organ and the candidate organ action combinations it is matched against are determined by the matching result of the round performed immediately before it.
An embodiment of the present application also provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to perform the facial expression recognition method as mentioned in the above embodiments.
Embodiments of the present application also provide a computer-readable storage medium storing a computer program, which when executed by a processor implements the facial expression recognition method mentioned in the above embodiments.
The facial expression recognition method provided by the embodiments of the present application measures the trend and magnitude of each organ's motion by calculating the three-dimensional variation of each corresponding organ between the first depth map of the face to be detected and its standard depth map. It sequentially matches the three-dimensional variation of at least one organ against the organ action combinations in the expression library that contain that organ, and takes the expression type of the organ action combination matched by the last organ's three-dimensional variation as the expression type of the face in the first depth map. For every organ after the first, the candidate organ action combinations are determined by the result of the previous matching round; that is, each round's matching range is narrowed by the previous result. The method can therefore determine the expression type of the face to be detected quickly, with the fewest matching rounds and the fewest organs. At the same time, because whole organ action combinations serve as the matching basis, the misrecognition that arises when different expression types share the same single-organ action and each organ is matched in isolation is avoided, improving expression recognition accuracy.
In addition, in the facial expression recognition method provided by the embodiments of the present application, the three-dimensional variation of an organ includes the three-dimensional variations of the organ's key points. Calculating the three-dimensional variation of each corresponding organ between the first depth map of the face to be detected and the standard depth map includes: determining the key points of each organ in the first depth map and detecting the homonymous points of those key points in the standard depth map; and calculating, for each organ, the three-dimensional coordinate difference between each key point and its homonymous point as the organ's three-dimensional variation. The three-dimensional coordinate difference covers both the change in a key point's two-dimensional coordinates (x, y) and the change in its depth, so the trend and magnitude of each organ's motion under a facial expression can be quantified more precisely through the three-dimensional coordinates.
In addition, in the facial expression recognition method of the embodiments of the present application, determining an organ action combination matching the three-dimensional variation of the first organ from the expression library includes: traversing the organ action combinations to be matched in the expression library and taking those containing the first organ as first organ action combinations; calculating the difference between the three-dimensional variation of the first organ of the face to be detected and the three-dimensional variation of the first organ in each first organ action combination to obtain a plurality of first differences; and taking the organ action combinations whose first difference is smaller than a first threshold as the combinations matching the three-dimensional variation of the first organ. The smaller the first difference, the higher the degree of matching and the higher the probability that the expression of the face to be detected belongs to the expression type of that combination. Meanwhile, using whole organ action combinations as the matching basis avoids the misrecognition caused by different expression types sharing the same single-organ action.
In addition, in the facial expression recognition method of the embodiments of the present application, the three-dimensional variation of an organ includes the three-dimensional variations of its key points, and computing the plurality of first differences includes: for each first organ action combination, calculating, at each key point, a second difference between the three-dimensional variation of the first organ of the face to be detected and that of the first organ in the combination; then either taking the covariance or mean of the second differences as the first difference, or removing a preset number of second differences with the largest values and taking the covariance or mean of the remaining second differences as the first difference. The first difference thus reflects either the overall agreement of the first organ with the combination across all key points, or, after the largest values are discarded, a robust agreement that prevents a combination that should match from being rejected because individual second differences were corrupted by key-point detection errors.
In addition, in the facial expression recognition method of the embodiments of the present application, selecting an organ different from the first organ from the matched organ action combinations includes: selecting, from the matched organ action combinations, an organ that is different from the first organ and appears least frequently in the matched combinations; or selecting an organ that is different from the first organ and whose three-dimensional variation differs most across the matched combinations. The first method picks the organ associated with the fewest expression types; the second picks the organ whose motion differs most between candidates. Both effectively reduce the number of expression matching operations, so the facial expression of the face depth map to be detected can be determined quickly.
Drawings
One or more embodiments are illustrated by way of example in the accompanying drawings, in which like reference numerals denote similar elements, and the figures are not drawn to scale unless otherwise specified.
Fig. 1 is a first flowchart of a facial expression recognition method according to an embodiment of the present application;
Fig. 2 is a second flowchart of a facial expression recognition method according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the embodiments to aid understanding of the present application; the claimed technical solution can nevertheless be implemented without these details, and with various changes and modifications based on the following embodiments.
Implementation details of the facial expression recognition of the present embodiment are described below by way of example. These details are provided to facilitate understanding and are not required to practice the present solution.
An embodiment of the present application relates to a facial expression recognition method, as shown in fig. 1, including:
step 101, calculating three-dimensional variable quantities of corresponding organs in a first depth map of a face to be detected and a standard depth map of the face to be detected; the standard depth map is a depth map in a natural expression state.
In this embodiment, a first depth map of a face to be detected, for which expression recognition is required, is obtained, and the three-dimensional variation of each corresponding organ between the first depth map and a standard depth map of the same face is calculated. That is, the three-dimensional coordinates of each organ are obtained from the first depth map and from the standard depth map, and the standard-depth-map coordinates of each organ are subtracted from its first-depth-map coordinates to obtain that organ's three-dimensional variation in the first depth map.
The standard depth map is a depth map of the face to be detected under a natural expression, i.e., in a calm state. The three-dimensional variation quantifies the magnitude and trend of each organ's motion from the calm state to the expressive state and serves as the basis for expression recognition.
In one embodiment, the three-dimensional variation of an organ comprises the three-dimensional variations of the organ's key points. Calculating the three-dimensional variation of each corresponding organ between the first depth map of the face to be detected and the standard depth map comprises: determining the key points of each organ in the first depth map and detecting the homonymous points of those key points in the standard depth map; and calculating the three-dimensional coordinate difference between each key point and its homonymous point as the three-dimensional variation of the corresponding organ.
Specifically, when calculating the three-dimensional variation of each organ, the organ's key points are determined first. A key point is a pixel that describes the shape of the organ. For example, for the eyebrow, the key points may be the left and right eyebrow corner points, the left and right eyebrow peak points, and the center points of the upper and lower eyebrow edges; for the mouth, the key points may be the left and right mouth corner points and the center points of the upper and lower lips. The number of key points per organ can be chosen according to the required recognition accuracy and the affordable amount of computation.
After the key points of each organ are determined in the first depth map, the homonymous points of those key points are detected in the standard depth map, and the three-dimensional coordinate difference between each key point (x, y, d) and its homonymous point (x1, y1, d1), i.e. (x − x1, y − y1, d − d1), is calculated to obtain the three-dimensional variation of the organ. Here x and y give the position of the point in the plane of the face and d gives its depth: a larger d means the point is more recessed, a smaller d that it protrudes.
That is, the sign of the three-dimensional variation reflects the trend of each organ's motion, and its magnitude reflects the size or amplitude of that motion. For example, take the center of the chin as the origin, with the positive x axis to the right of the origin and the positive y axis above it. If the mouth curves upward, the x and y values of the right mouth corner point both increase, and the right mouth corner becomes recessed, i.e., its depth value increases. The difference between the three-dimensional coordinates of the right mouth corner point in the first depth map and in the standard depth map is therefore positive, and the magnitude of the three-dimensional variation measures how far the mouth curves upward, i.e., the emotional intensity of the face under that expression.
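As a concrete illustration of this step, the following is a minimal Python sketch of the per-key-point calculation, assuming the key points of both depth maps have already been extracted into dictionaries; the patent does not prescribe any data layout, so the structure and names here are illustrative assumptions only:

```python
import numpy as np

def organ_variations(first_kps, standard_kps):
    """Per-organ, per-key-point 3D variation between the first depth map and
    the standard (natural-expression) depth map of the same face.

    Both arguments map organ -> {keypoint_name: (x, y, d)}; the key points in
    standard_kps are the homonymous points of those in first_kps.
    Returns organ -> {keypoint_name: (dx, dy, dd)}.
    """
    return {
        organ: {
            name: np.asarray(p, dtype=float)
                  - np.asarray(standard_kps[organ][name], dtype=float)
            for name, p in points.items()
        }
        for organ, points in first_kps.items()
    }
```

Under the coordinate convention above, a positive dy and dd for the right mouth corner point would reflect an upward-curving, recessed mouth corner.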
Step 102: sequentially matching the three-dimensional variation of at least one organ against the organ action combinations in an expression library that contain that organ, and taking the expression type corresponding to the organ action combination matched by the three-dimensional variation of the last of the at least one organ as the expression type of the face in the first depth map. Each organ action combination in the expression library corresponds to one expression type and records the three-dimensional variation of each organ it contains from the natural expression to that expression type. For every organ after the first, both the organ and the candidate organ action combinations it is matched against are determined by the matching result of the round performed immediately before it.
In this embodiment, in each round the three-dimensional variation of one organ is selected and matched against the organ action combinations in the expression library that contain that organ; whether a further round is performed depends on the matching result, and each round's matching range of organ action combinations is determined by the previous round's result. The facial expression may thus be determined after a single round of matching (selecting only one organ of the face to be detected) or after several rounds (selecting several organs). In other words, expression recognition completes with at least one round of matching.
The facial expression recognition method provided by the embodiments of the present application measures the trend and magnitude of each organ's motion by calculating the three-dimensional variation of each corresponding organ between the first depth map of the face to be detected and its standard depth map. It sequentially matches the three-dimensional variation of at least one organ against the organ action combinations in the expression library that contain that organ, and takes the expression type of the organ action combination matched by the last organ's three-dimensional variation as the expression type of the face in the first depth map. For every organ after the first, the candidate organ action combinations are determined by the result of the previous matching round; that is, each round's matching range is narrowed by the previous result. The method can therefore determine the expression type of the face to be detected quickly, with the fewest matching rounds and the fewest organs. At the same time, because whole organ action combinations serve as the matching basis, the misrecognition that arises when different expression types share the same single-organ action and each organ is matched in isolation is avoided, improving expression recognition accuracy.
An embodiment of the present application relates to a facial expression recognition method, as shown in fig. 2, including:
step 201, calculating three-dimensional variable quantities of corresponding organs in a first depth map of a face to be detected and a standard depth map of the face to be detected; the standard depth map is a depth map in a natural expression state.
In this embodiment, the specific implementation details of step 201 are substantially the same as those of step 101, and are not described herein again.
Steps 202 to 205 below elaborate on step 102 above.
Step 202: determine the organ with the largest three-dimensional variation as the first organ, and perform the following expression matching operation (steps 203 to 205) on the first organ.
Step 203: determine, from the expression library, the organ action combinations matching the three-dimensional variation of the first organ. A plurality of organ action combinations are stored in the expression library in advance; each organ action combination corresponds to one expression type and records the three-dimensional variation of each organ it contains from the natural expression to the corresponding expression type.
Specifically, when expression recognition is performed on the first depth map of the face to be detected, the organ with the largest three-dimensional variation, i.e., the organ whose motion change is most obvious and easiest to detect, is used as the first organ for expression matching, so that the possible expression types can be narrowed down quickly.
In this embodiment, a preset expression library stores multiple organ action combinations for each of several expression types. For example, the library may store four expression types: a first (happy), a second (angry), a third (sad), and a fourth (surprised), each with several different organ action combinations, since each expression type can be expressed in several ways. For the happy type, for instance, a first combination might be eyebrows arched, cheeks raised, lips closed but mouth corners curved upward; a second might be eyebrows arched, crow's-feet wrinkles at the eye corners, mouth open; a third might be crow's-feet wrinkles, cheeks raised, mouth wide open with corners curved upward and tongue slightly extended; and so on. Other expression types likewise comprise multiple combinations. Each organ action combination corresponds to one expression type and one set of three-dimensional variations, namely the motion change of each organ in the combination from the natural expression to the expression the combination represents.
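For illustration, one possible in-memory layout of such an expression library is sketched below; the patent does not prescribe a storage format, so the structure, organ names, and numeric values here are assumptions:

```python
# A minimal sketch of one possible expression-library layout.
EXPRESSION_LIBRARY = [
    {
        "expression": "happy",
        # organ -> {keypoint_name: 3D variation (dx, dy, dd) from natural expression}
        "organs": {
            "mouth": {"right_corner": (1.2, 1.5, 0.8), "left_corner": (-1.2, 1.5, 0.8)},
            "cheek": {"center": (0.0, 0.9, -0.6)},
        },
    },
    {
        "expression": "angry",
        "organs": {
            "eyebrow": {"left_peak": (0.3, -1.1, 0.2), "right_peak": (-0.3, -1.1, 0.2)},
            "mouth": {"upper_lip_center": (0.0, 0.4, 0.1)},
        },
    },
    # ... further combinations; an organ absent from a combination is treated
    # as having a default three-dimensional variation close to 0.
]
```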
In one embodiment, determining the organ action combinations matching the three-dimensional variation of the first organ from the expression library includes: traversing the organ action combinations to be matched in the expression library and taking those containing the first organ as first organ action combinations; calculating the difference between the three-dimensional variation of the first organ of the face to be detected and that of the first organ in each first organ action combination to obtain a plurality of first differences; and taking the organ action combinations whose first difference is smaller than a first threshold as the combinations matching the three-dimensional variation of the first organ.
Specifically, when the organ action combinations matching the three-dimensional variation of the first organ are determined, the combinations containing the first organ are selected from the expression library as first organ action combinations. Note that the same expression type can be expressed in different ways: the same organ sometimes moves and sometimes does not, and the organs recorded in a stored combination are only those that move. For the happy type, for example, the eyebrows sometimes arch and sometimes stay still, so some happy combinations include the eyebrows and some do not. An organ absent from a combination is treated as having a default three-dimensional variation close to 0 in that combination.
Moreover, the same organ action can appear in different expression types. An open mouth, for example, may indicate happiness, anger, or sadness, so organ action combinations under the happy, angry, and sad types may all include a mouth action. Therefore, when the expression type of the face is determined from the three-dimensional variation of the first organ, whole organ action combinations serve as the matching basis, avoiding the misrecognition that would result from matching each organ in isolation when different expression types share the same organ action.
The plurality of first differences obtained in this embodiment are the differences between the three-dimensional variation of the first organ and the three-dimensional variation of the first organ in the different organ action combinations; the smaller a first difference, the better the first organ recorded in the corresponding combination matches the first organ of the face to be detected.
Further, the three-dimensional variation of an organ includes the three-dimensional variations of its key points, and calculating the difference between the three-dimensional variation of the first organ of the face to be detected and that of the first organ in each first organ action combination to obtain a plurality of first differences includes: for each first organ action combination, calculating, at each organ key point, a second difference between the three-dimensional variation of the first organ of the face to be detected and that of the first organ in the combination; and then taking the covariance or mean of the second differences as the first difference, or removing a preset number of second differences with the largest values and taking the covariance or mean of the remaining second differences as the first difference.
Specifically, the first difference of an organ is computed from the second differences of its key points in one of two ways. The first takes the covariance or mean of the key points' second differences as the first difference: the mean reflects the average level of change across the organ's key points, and the covariance reflects how uniformly the organ changes across them. The second removes a preset number of the largest second differences and takes the covariance or mean of the remainder as the first difference. Discarding the largest values prevents a combination that actually matches from being wrongly rejected because individual second differences were inflated by errors in key-point detection or homonymous-point determination.
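The two aggregation strategies can be sketched as follows. Here the second difference is taken as the per-key-point distance between the two variation vectors, and variance stands in as the single-variable analogue of the covariance mentioned in the text; all names and defaults are illustrative assumptions:

```python
import numpy as np

def first_difference(face_organ_var, combo_organ_var, trim=0, use_covariance=False):
    """First difference between a face organ's 3D variation and the same
    organ's variation in one organ action combination.

    face_organ_var / combo_organ_var: {keypoint_name: (dx, dy, dd)}.
    trim: number of largest second differences to discard (0 keeps all).
    """
    # Second differences: per-key-point distance between the two variations.
    seconds = np.array([
        np.linalg.norm(np.asarray(face_organ_var[k], dtype=float)
                       - np.asarray(combo_organ_var[k], dtype=float))
        for k in face_organ_var
    ])
    if trim > 0:
        # Drop the largest second differences (robust variant).
        seconds = np.sort(seconds)[:max(1, len(seconds) - trim)]
    # Aggregate the remaining second differences into a single first difference.
    return float(np.var(seconds)) if use_covariance else float(np.mean(seconds))
```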
Step 204: when the matched organ action combinations all correspond to the same expression type, determine that expression type as the expression type of the face in the first depth map.
Specifically, the three-dimensional variation of the first organ is matched one by one against the organ action combinations in the expression library to determine the matched combinations, and it is then checked whether those combinations all correspond to the same expression type. If so, no further organ needs to be examined: that expression type is directly determined as the expression type of the face in the first depth map. The expression can thus be determined from the three-dimensional variation of a single organ, making recognition extremely efficient.
Step 205: when the matched organ action combinations correspond to different expression types, select an organ different from the first organ from the matched combinations and perform the expression matching operation with that organ of the face to be detected as the next first organ, where the matching range of the organ action combinations for each non-first organ is the set of combinations matched by the first organ immediately before it.
Specifically, when the matched organ action combinations correspond to different expression types, one organ different from the first organ is selected from the matched combinations as the next first organ of the face to be detected on which the expression matching operation will be performed. The next first organ can be chosen at random, or the organ with the largest three-dimensional variation among those not yet used as a first organ can be chosen. For example: suppose 5 matched combinations are determined after a matching round and they correspond to 2 expression types, with 2 combinations under type A and 3 under type B. If the first organ was the mouth and the cheek is selected next, the expression matching operation is performed again between the cheek of the face to be detected and the 5 combinations obtained in the previous round.
It should be noted that after the expression matching operation is performed on a first organ, if all matched organ action combinations correspond to the same expression type, that type is determined as the expression type of the face in the first depth map and recognition ends. If they correspond to different expression types, the next first organ is selected and steps 203 to 205 are repeated for it, until all matched combinations correspond to the same expression type, at which point the recognition result is obtained and the matching ends.
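Putting steps 202 to 205 together, the overall matching loop might look like the sketch below. first_difference is the function from the earlier sketch, select_next_organ is sketched after the discussion of selection strategies below, and the threshold value is an arbitrary placeholder; none of these names or values come from the patent itself:

```python
import numpy as np

def magnitude(organ_var):
    """Overall motion magnitude of an organ: mean norm of its key-point variations."""
    return float(np.mean([np.linalg.norm(v) for v in organ_var.values()]))

def recognize_expression(face_vars, library, first_threshold=1.0):
    """Iterative matching loop of steps 202-205 (illustrative sketch).

    face_vars: organ -> {keypoint_name: (dx, dy, dd)} for the face to be
    detected, relative to its standard depth map.
    """
    candidates = list(library)   # combinations still in the matching range
    used = set()                 # organs already used as a "first organ"
    # Step 202: start from the organ with the largest three-dimensional variation.
    organ = max(face_vars, key=lambda o: magnitude(face_vars[o]))
    while organ is not None:
        used.add(organ)
        # Step 203: keep the candidate combinations containing this organ whose
        # first difference falls below the threshold.
        candidates = [
            c for c in candidates
            if organ in c["organs"]
            and first_difference(face_vars[organ], c["organs"][organ]) < first_threshold
        ]
        if not candidates:
            return None          # no combination matched
        types = {c["expression"] for c in candidates}
        # Step 204: all matched combinations agree on a single expression type.
        if len(types) == 1:
            return types.pop()
        # Step 205: otherwise select a different organ and match again, with the
        # matched combinations as the next round's matching range.
        organ = select_next_organ(candidates, face_vars, used)
    return None
```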
Further, selecting an organ different from the first organ from the matched organ action combinations includes: selecting, from the matched combinations, the organ that is different from the first organ and appears least frequently in the matched combinations; or selecting the organ that is different from the first organ and whose three-dimensional variation differs most across the matched combinations.
In this embodiment, there are two ways to select the non-first organ. The first selects the organ that appears least frequently, i.e., the organ involved in the fewest remaining organ action combinations and hence the fewest possible expression types. For example, if expression type A covers 3 of the matched combinations and type B covers 5, then among those 8 combinations the organ that appears least frequently is taken as the next first organ. The second selects the organ whose three-dimensional variation differs most across the combinations, i.e., the organ whose motion is most discriminative. For example, if 5 matched combinations belong to different expression types and the three-dimensional variations of the nose differ very little between them (no clear distinction) while those of the cheek differ greatly (a clear distinction), the cheek is taken as the first organ in the next round. In this way the facial expression of the face depth map can be determined with the fewest expression matching operations.
In other words, the second method uses the spread of an organ's three-dimensional variation across combinations as the selection criterion. For example, plot each organ's three-dimensional variation against the serial number of the organ action combination, with the combination index on the abscissa and the variation on the ordinate; the organ whose curve fluctuates most is taken as the next first organ. Concretely, the dispersion coefficient of the curve can be calculated as the ratio of the variance to the mean: the larger the dispersion coefficient, the greater the curve's fluctuation.
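Both selection heuristics can be sketched as follows, reusing magnitude from the loop sketch above; the switch between the two heuristics and all names here are illustrative assumptions:

```python
import numpy as np
from collections import Counter

def select_next_organ(candidates, face_vars, used, by_dispersion=True):
    """Select the next first organ from the matched organ action combinations."""
    organs = [o for c in candidates for o in c["organs"]
              if o not in used and o in face_vars]
    if not organs:
        return None
    if not by_dispersion:
        # Heuristic 1: the least frequent organ is involved in the fewest
        # remaining combinations, hence the fewest possible expression types.
        return Counter(organs).most_common()[-1][0]
    # Heuristic 2: the organ whose variation magnitude fluctuates most across
    # the combinations, measured by the dispersion coefficient described in
    # the text (ratio of variance to mean).
    def dispersion(organ):
        mags = [magnitude(c["organs"][organ])
                for c in candidates if organ in c["organs"]]
        mean = float(np.mean(mags))
        return float(np.var(mags)) / mean if mean else 0.0
    return max(set(organs), key=dispersion)
```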
In addition, after the expression type of the face in the first depth map is determined, the emotional intensity of the face under that expression can be measured from the values of the three-dimensional variations of the first organs (all organs on which the expression matching operation was performed) corresponding to the determined expression type: the larger the values, the stronger the emotion of the face under that expression type.
The facial expression recognition method provided by this embodiment calculates the three-dimensional variation of each corresponding organ between the first depth map of the face to be detected and its standard depth map, and thereby identifies the organ with the largest motion change in the first depth map. The motion change considered here covers both the shape change of an organ within the plane of the face and the change perpendicular to it. The organ with the largest three-dimensional change is taken as the first organ for expression matching, so the matched organ action combinations are determined quickly. When the matched combinations all correspond to the same expression type, that type is determined as the expression type of the face in the first depth map; when they correspond to different expression types, an organ different from the first organ is selected from the matched combinations and the expression matching operation is performed with that organ of the face to be detected as the next first organ. The method can thus determine the expression type of the face to be detected quickly, with the fewest matching rounds and the fewest organs, and by using whole organ action combinations as the matching basis it avoids the inaccuracy of judging a complex facial expression from a single organ, improving expression recognition accuracy.
The steps of the above methods are divided for clarity of description; in implementation they may be combined into one step, or a step may be split into several, as long as the same logical relationship is preserved, all of which fall within the protection scope of this patent. Adding insignificant modifications to, or introducing insignificant design changes into, an algorithm or process without changing its core design also falls within the protection scope of this patent.
Embodiments of the present application relate to an electronic device, as shown in fig. 3, including:
at least one processor 301; and a memory 302 communicatively coupled to the at least one processor 301, where the memory 302 stores instructions executable by the at least one processor 301, and the instructions are executed by the at least one processor 301 to enable the at least one processor 301 to perform the facial expression recognition method mentioned in the above embodiments.
The electronic device includes one or more processors 301 and a memory 302; one processor 301 is taken as an example in fig. 3. The processor 301 and the memory 302 may be connected by a bus or in another manner; connection by a bus is taken as an example in fig. 3. The memory 302 is a non-volatile computer-readable storage medium that can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The processor 301 executes the various functional applications and data processing of the device, i.e., implements the facial expression recognition method described above, by running the non-volatile software programs, instructions, and modules stored in the memory 302.
The memory 302 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application required by at least one function, and the data storage area may store a list of options and the like. The memory 302 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 302 may optionally include memory located remotely from the processor 301, and such remote memory may be connected to an external device via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
One or more modules are stored in the memory 302, and when executed by the one or more processors 301, perform the facial expression recognition method of any of the above embodiments.
The above product can execute the method provided by the embodiments of the present application and has the corresponding functional modules and beneficial effects. For technical details not described in this embodiment, refer to the method provided by the embodiments of the present application.
Embodiments of the present application relate to a computer-readable storage medium storing a computer program. The computer program realizes the above-described method embodiments when executed by a processor.
That is, those skilled in the art can understand that all or part of the steps of the methods of the above embodiments may be implemented by a program instructing related hardware. The program is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It will be understood by those of ordinary skill in the art that the above embodiments are specific examples of carrying out the present application, and that in practice various changes in form and detail may be made without departing from the spirit and scope of the present application.

Claims (10)

1. A facial expression recognition method, characterized by comprising:
calculating the three-dimensional variation of each corresponding organ between a first depth map of a face to be detected and a standard depth map of the face to be detected, the standard depth map being a depth map in a natural expression state;
sequentially matching the three-dimensional variation of at least one organ with the organ action combinations in an expression library that contain the organ, and taking the expression type corresponding to the organ action combination matched with the three-dimensional variation of the last organ of the at least one organ as the expression type of the face in the first depth map;
wherein each organ action combination in the expression library corresponds to one expression type and records the three-dimensional variation of each organ contained in the combination from the natural expression to the corresponding expression type; and each organ of the at least one organ that is not the first to be matched, together with the organ action combinations to be matched against it, is determined based on the matching result of the matching performed before it.
2. The method of claim 1, wherein the sequentially matching the three-dimensional variation of at least one organ with the organ action combinations in the expression library that contain the organ, and taking the expression type corresponding to the organ action combination matched with the three-dimensional variation of the last organ of the at least one organ as the expression type of the face in the first depth map, comprises:
determining the organ with the largest three-dimensional variation in the first depth map as a first organ, and performing the following expression matching operation on the first organ:
determining, from the expression library, the organ action combinations matching the three-dimensional variation of the first organ;
when the matched organ action combinations all correspond to the same expression type, determining the same expression type as the expression type of the face in the first depth map;
and when the matched organ action combinations correspond to different expression types, selecting an organ different from the first organ from the matched organ action combinations, and performing the expression matching operation with that organ of the face to be detected as the next first organ, wherein the matching range of the organ action combinations corresponding to each non-first organ is the set of organ action combinations matched by the first organ preceding it.
3. The method according to claim 1 or 2, wherein the three-dimensional variation of an organ comprises three-dimensional variations of key points of the organ;
and the calculating the three-dimensional variation of each corresponding organ between the first depth map of the face to be detected and the standard depth map of the face to be detected comprises:
determining key points of each organ in the first depth map, and detecting homonymous points of the key points in the standard depth map;
and calculating the three-dimensional coordinate difference between each key point of each organ and its homonymous point as the three-dimensional variation of the corresponding organ.
4. The method of claim 2, wherein the determining, from the expression library, the organ action combinations matching the three-dimensional variation of the first organ comprises:
traversing the organ action combinations to be matched in the expression library, and taking the organ action combinations containing the first organ as first organ action combinations;
calculating the difference between the three-dimensional variation of the first organ of the face to be detected and the three-dimensional variation of the first organ in each first organ action combination, to obtain a plurality of first differences;
and taking the organ action combinations corresponding to first differences smaller than a first threshold as the organ action combinations matching the three-dimensional variation of the first organ.
5. The method according to claim 4, wherein the three-dimensional variation of an organ comprises three-dimensional variations of key points of the organ;
and the calculating the difference between the three-dimensional variation of the first organ of the face to be detected and the three-dimensional variation of the first organ in each first organ action combination to obtain a plurality of first differences comprises:
for each first organ action combination, calculating, at each organ key point, a second difference between the three-dimensional variation of the first organ of the face to be detected and the three-dimensional variation of the first organ in the first organ action combination;
and taking the covariance or mean of the second differences as the first difference; or,
removing a preset number of second differences with the largest values from the second differences, and taking the covariance or mean of the remaining second differences as the first difference.
6. The method according to any one of claims 2, 4 and 5, wherein the selecting an organ different from the first organ from the matched organ action combinations comprises:
selecting, from the matched organ action combinations, an organ that is different from the first organ and appears least frequently in the matched organ action combinations.
7. The method according to any one of claims 2, 4 and 5, wherein the selecting an organ different from the first organ from the matched organ action combinations comprises:
selecting, from the matched organ action combinations, an organ that is different from the first organ and whose three-dimensional variation differs most across the matched organ action combinations.
8. The method of claim 7, wherein the organ whose three-dimensional variation differs most is determined by:
calculating the variance and mean of the three-dimensional variations of the same organ across different organ action combinations;
and obtaining a dispersion coefficient of the three-dimensional variation of each organ from the variance and the mean, and taking the organ with the largest dispersion coefficient as the organ whose three-dimensional variation differs most.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the facial expression recognition method according to any one of claims 1 to 8.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the facial expression recognition method of any one of claims 1 to 8.
CN202210376844.0A 2022-04-12 2022-04-12 Facial expression recognition method, electronic device and storage medium Active CN114463826B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210376844.0A CN114463826B (en) 2022-04-12 2022-04-12 Facial expression recognition method, electronic device and storage medium


Publications (2)

Publication Number Publication Date
CN114463826A (en) 2022-05-10
CN114463826B (en) 2022-08-12

Family

ID=81417070

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210376844.0A Active CN114463826B (en) 2022-04-12 2022-04-12 Facial expression recognition method, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN114463826B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005056388A (en) * 2003-07-18 2005-03-03 Canon Inc Image processing apparatus, image processing method and imaging device
CN105354527A (en) * 2014-08-20 2016-02-24 南京普爱射线影像设备有限公司 Negative expression recognizing and encouraging system
CN110598568A (en) * 2019-08-19 2019-12-20 重庆特斯联智慧科技股份有限公司 Scenic spot intelligent photographing system and method based on facial expression classification
CN111696196A (en) * 2020-05-25 2020-09-22 北京的卢深视科技有限公司 Three-dimensional face model reconstruction method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11393133B2 (en) * 2010-06-07 2022-07-19 Affectiva, Inc. Emoji manipulation using machine learning


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Facial expression recognition based on three-dimensional data; Wen Qin et al.; Computer Simulation; 2005-07-30 (Issue 07); full text *

Also Published As

Publication number Publication date
CN114463826A (en) 2022-05-10


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right
Effective date of registration: 20230818
Address after: Room 799-4, 7th Floor, Building A3A4, Zhong'an Chuanggu Science and Technology Park, No. 900 Wangjiang West Road, Gaoxin District, Hefei Free Trade Experimental Zone, Anhui Province, 230031
Patentee after: Anhui Lushenshi Technology Co.,Ltd.
Address before: 230091 Room 611-217, R & D Center Building, China (Hefei) International Intelligent Voice Industrial Park, 3333 Xiyou Road, High-tech Zone, Hefei, Anhui Province
Patentee before: Hefei Lushenshi Technology Co.,Ltd.