CN107886568B - Method and system for reconstructing facial expression by using 3D Avatar - Google Patents

Method and system for reconstructing facial expression by using 3D Avatar

Info

Publication number
CN107886568B
CN107886568B (application number CN201711314522.9A)
Authority
CN
China
Prior art keywords
face
model
training data
facial expression
database
Prior art date
Legal status
Active
Application number
CN201711314522.9A
Other languages
Chinese (zh)
Other versions
CN107886568A (en)
Inventor
田飞
李小波
Current Assignee
BEIJING HENGXIN CAIHONG INFORMATION TECHNOLOGY Co Ltd
BEIJING HENGXIN RAINBOW TECHNOLOGY Co Ltd
Oriental Dream Culture Industry Investment Co Ltd
Original Assignee
BEIJING HENGXIN CAIHONG INFORMATION TECHNOLOGY Co Ltd
BEIJING HENGXIN RAINBOW TECHNOLOGY Co Ltd
Oriental Dream Culture Industry Investment Co Ltd
Priority date
Filing date
Publication date
Application filed by BEIJING HENGXIN CAIHONG INFORMATION TECHNOLOGY Co Ltd, BEIJING HENGXIN RAINBOW TECHNOLOGY Co Ltd, Oriental Dream Culture Industry Investment Co Ltd filed Critical BEIJING HENGXIN CAIHONG INFORMATION TECHNOLOGY Co Ltd
Priority to CN201711314522.9A
Publication of CN107886568A
Application granted
Publication of CN107886568B
Active legal-status Current
Anticipated expiration legal-status

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/243 Classification techniques relating to the number of classes
    • G06F 18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person
    • G06T 2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Processing Or Creating Images (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method and a system for reconstructing a facial expression by using a 3D Avatar, relates to the technical field of image processing and recognition, and solves the technical problems of low precision and complex operation in prior-art methods for reconstructing facial expressions with a 3D Avatar. The method for reconstructing the facial expression comprises the following steps: step S1, establishing a database to obtain training data; step S2, learning the database, establishing a random fern regression forest, and acquiring an expressionless 3D model of a user according to a 2D frontal expressionless face photo; step S3, continuing to learn the database with a random fern regression algorithm and regressing the facial expression weight; step S4, aiming the camera at the face and calculating the facial expression weight; and step S5, applying the facial expression weight of the user to the 3D Avatar model to obtain the fitted 3D model and realize expression reconstruction. The method and the device are mainly applied to expression reconstruction.

Description

Method and system for reconstructing facial expression by using 3D Avatar
Technical Field
The application relates to the technical field of image processing and recognition, in particular to a method and a system for reconstructing a facial expression by using 3D Avatar.
Background
The existing facial expression reconstruction methods mainly comprise two modes: ① a monocular camera is used to recognize the facial expression category, and the corresponding expression is made in a 3D Avatar; ② a depth camera collects a series of face depth images of various postures and expressions, and a combination of expression 3D models of the specific user is pre-generated; when reconstructing an expression, the face depth image of the user is collected and compared with the pre-generated 3D expression models of the user, the model deformation parameters are fitted, and the corresponding expression is then reconstructed.
Disclosure of Invention
The application aims to provide a method and a system for reconstructing a facial expression by using a 3D Avatar, so as to solve the technical problems of low precision and complex operation of prior-art methods for reconstructing facial expressions with a 3D Avatar.
The method for reconstructing the facial expression by using the 3D Avatar comprises the following steps:
step S1, establishing a database to obtain training data;
step S2, learning the database, establishing a random fern regression forest, and acquiring an expressionless 3D model of the user according to a 2D frontal expressionless face photo;
step S3, using a random fern regression algorithm, continuously learning a database, and regressing the facial expression weight;
step S4, aiming the camera at the face, and calculating the facial expression weight;
and step S5, applying the facial expression weight of the user to the 3D Avatar model to obtain the fitted 3D model, and realizing expression reconstruction.
Preferably, the formula of the fitted 3D model is:
C = C_M * W_id * W_exp    Formula (1)
where C is the fitted 3D model of the face, C_M is the average face 3D model of all training data, W_id is the face individual weight, and W_exp is the facial expression weight.
Further, W_exp includes 16 individual facial expression weights, and the 16 expressions include: no expression, left eye closed, right eye closed, both eyes closed, eyebrows raised, eyebrows furrowed, smiling, sad, angry, mouth bent to the left, mouth bent to the right, mouth puckered, jaw moved to the left, jaw moved to the right, jaw moved forward, and both cheeks puffed.
Optionally, the method of step S1 includes:
step S11, a group of photos is collected for each individual; the photos cover three postures, namely the left side face, the right side face and the front face, the 16 preset expressions are performed for each posture, and the corresponding color images and depth images are recorded respectively;
step S12, calibrating a face posture matrix M, which comprises face orientation and scaling information;
step S13, calibrating the human face characteristic points of the 2D image;
and step S14, generating a corresponding human face 3D model by using the 2D image, the feature point information and the depth image to obtain training data.
Optionally, the face feature points of the 2D image are calibrated by the ESR algorithm or the SDM algorithm.
Preferably, each set of training data obtained comprises five data elements (I, M, 2D-shape, 3D-Mesh, W_exp), wherein I represents a 2D photo, M represents a face pose matrix, 2D-shape represents the feature point information on the 2D photo, 3D-Mesh represents the corresponding face 3D model, and W_exp represents the facial expression weight.
Optionally, the method of step S2 includes:
step S21, establishing a first binary tree random fern according to the training data in the database;
and step S22, providing a new expressionless photo containing 2D face feature points, detecting the face feature points, putting them into the first binary-tree random fern, and performing regression to obtain an expressionless 3D model of the user.
Preferably, the method of step S21 includes the steps of:
step S211, all frontal expressionless face data are detected from the database, and then the average face 3D model C_M of all training data and the average 2D face feature point distribution 2D-shape-m are calculated;
step S212, according to 2D-shape-m, the coordinates of a group of A 2D pixel points are randomly sampled in the face area, and each is expressed as the coordinates of its nearest feature point in 2D-shape-m plus the deviation from that point;
step S213, calculating the difference between the face 3D model of each group of training data and the average face 3D model C_M of all training data;
and step S214, establishing a father vertex and a branch vertex of each binary tree to form a first binary tree random fern.
Optionally, the method of step S3 includes:
step S31, establishing a second binary tree random fern according to the expressive 3D model and the expressive 3D model of the user of the training data of each individual in the database;
and step S32, a new expressive photo of a specific person containing 2D face feature points is given, the 2D face feature points are detected and put into the second binary-tree random fern, and the facial expression weight is regressed.
According to the method for reconstructing the facial expression by using the 3D Avatar, on one hand, the 3D information of the face is obtained, so the facial expression weight is calculated more accurately and the anti-interference performance is stronger; on the other hand, a machine learning method is adopted to read in the 2D face photos, learn the corresponding 3D model and further learn the facial expression weight, so only a monocular camera is needed for acquisition, which reduces the equipment requirement while improving the expression reconstruction precision.
The present application further provides a system for reconstructing a facial expression using 3D Avatar, including:
the database establishing module is used for establishing a database to obtain training data;
the system comprises a non-expression 3D model acquisition module, a random fern regression forest establishment module and a non-expression 3D model acquisition module, wherein the non-expression 3D model acquisition module is used for learning a database, and acquiring a non-expression 3D model of a user according to a 2D front non-expression face photo;
the facial expression weight regression module is used for continuously learning the database by using a random fern regression algorithm and regressing the facial expression weight;
the facial expression weight calculation module is used for aligning the camera to the face and solving the facial expression weight;
and the expression reconstruction module is used for applying the facial expression weight of the user to the 3D Avatar model to obtain the fitted 3D model and realize expression reconstruction.
The 3D Avatar facial expression reconstruction system has the same technical effects as the facial expression reconstruction method, and is not repeated herein.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings can be obtained by those skilled in the art according to the drawings.
FIG. 1 is a flow chart of a method of reconstructing a facial expression using 3D Avatar of the present application;
FIG. 2 is a flow chart of the present application for building a database to obtain training data;
FIG. 3 is a schematic diagram of extracting 2D image human face feature points according to the present application;
fig. 4 is a flowchart of the present application for obtaining an expressionless 3D model of a user.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
Fig. 1 is a flowchart of a method of reconstructing a facial expression using 3D Avatar according to the present application. As shown in fig. 1, the method for reconstructing a facial expression by 3D Avatar of the present application includes:
step S1, establishing a database to obtain training data;
step S2, learning the database, establishing a random fern regression forest, and acquiring an expressionless 3D model (C_M * W_id) of the user according to the 2D frontal expressionless face photo;
Step S3, using a random fern regression algorithm, continuously learning a database, and regressing the facial expression weight;
step S4, aiming the camera at the face, and calculating the facial expression weight;
and step S5, applying the facial expression weight of the user to the 3D Avatar model to obtain the fitted 3D model, and realizing expression reconstruction.
According to the method for reconstructing the facial expression by using the 3D Avatar, on one hand, the 3D information of the face is obtained, so the facial expression weight is calculated more accurately and the anti-interference performance is stronger; on the other hand, a machine learning method is adopted to read in the 2D face photos, learn the corresponding 3D model and further learn the facial expression weight, so only a monocular camera is needed for acquisition, which reduces the equipment requirement while improving the expression reconstruction precision.
The formula of the fitted 3D model is:
C = C_M * W_id * W_exp    Formula (1)
where C is the fitted 3D model, C_M is the average face 3D model of all training data, W_id is the face individual weight, and W_exp is the facial expression weight.
W_exp includes 16 individual facial expression weights, and the 16 expressions include: no expression, left eye closed, right eye closed, both eyes closed, eyebrows raised, eyebrows furrowed, smiling, sad, angry, mouth bent to the left, mouth bent to the right, mouth puckered, jaw moved to the left, jaw moved to the right, jaw moved forward, and both cheeks puffed.
W_id, the face individual weight, is a weighted sum of the individual features of the face models in the database. W_id describes the individual features of an expressionless face, such as nose bridge height, eye spacing, mouth width, and so on.
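For illustration only, the following Python sketch shows one possible reading of Formula (1), in which C_M is treated as a bilinear basis that is contracted first with the identity weights W_id and then with the expression weights W_exp; the tensor layout, array shapes and function name are assumptions, since the text does not fix a representation:

```python
import numpy as np

def fit_face_model(C_M, W_id, W_exp):
    """Evaluate Formula (1): C = C_M * W_id * W_exp.

    Assumed layout (not fixed by the text): C_M is a bilinear core of shape
    (n_id, n_exp, n_vertices * 3); W_id has shape (n_id,); W_exp has shape
    (n_exp,) and holds the 16 expression weights.
    """
    identity_model = np.tensordot(W_id, C_M, axes=([0], [0]))  # (n_exp, n_vertices*3)
    C = np.tensordot(W_exp, identity_model, axes=([0], [0]))   # (n_vertices*3,)
    return C.reshape(-1, 3)                                    # one (x, y, z) row per vertex
```

Under this reading, the 16 entries of W_exp act as blendshape-style coefficients, which matches their later use in step S5.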
The following describes a method for reconstructing a facial expression by using 3D Avatar according to the present application.
And step S1, establishing a database to obtain training data.
Fig. 2 is a flow chart of the present application for building a database and obtaining training data. As shown in fig. 2, the method for establishing a database and obtaining training data of the present application includes:
step S11, a group of photos is collected for each individual; the photos cover three postures, namely the left side face, the right side face and the front face, the 16 preset expressions are performed for each posture, and the corresponding color images and depth images are recorded respectively;
the method comprises the steps of collecting photos by using a kinect (3D somatosensory camera), and recording corresponding color images and depth images. The color image is the corresponding 2D image. Manually setting facial expression weight W according to the made expressionexp. The facial expression weight of each 2D photo can be obtained.
Step S12, calibrating a face posture matrix M, which comprises face orientation and scaling information;
step S13, calibrating the human face characteristic points of the 2D image;
optionally, the face feature points of the 2D image are calibrated by the ESR algorithm (Explicit Shape Regression) or the SDM algorithm (Supervised Descent Method). Fig. 3 is a schematic diagram of extracting 2D image face feature points according to the present application. As shown in fig. 3, 74 face feature points are finally obtained.
And step S14, generating a corresponding 3D model 3D-Mesh of the human face by using the 2D image, the feature point information 2D-shape and the depth image to obtain training data.
Each set of training data comprises five data elements (I, M, 2D-shape, 3D-Mesh, W_exp), wherein I represents a 2D photo, M represents the face pose matrix, 2D-shape represents the feature point information on the 2D photo, 3D-Mesh represents the corresponding face 3D model, and W_exp represents the facial expression weight. Each individual contributes multiple sets of training data.
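As a minimal sketch of how one such record could be held in memory (the field names, the 74-point count from Fig. 3 and the example pose-matrix shape are illustrative assumptions):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class TrainingSample:
    """One (I, M, 2D-shape, 3D-Mesh, W_exp) record from the database."""
    image: np.ndarray      # I: 2D color photo, shape (H, W, 3)
    pose: np.ndarray       # M: face pose matrix (orientation and scaling), e.g. 3x4
    shape_2d: np.ndarray   # 2D-shape: (74, 2) calibrated face feature points
    mesh_3d: np.ndarray    # 3D-Mesh: (n_vertices, 3) face 3D model
    w_exp: np.ndarray      # W_exp: (16,) facial expression weights
```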
And step S2, learning a database, establishing a random fern regression forest, and acquiring an expressionless 3D model of the user according to the 2D front-surface expressionless face picture.
Fig. 4 is a flowchart of the present application for obtaining an expressionless 3D model of a user. As shown in fig. 4, the method of step S2 includes:
step S21, establishing a first binary tree random fern according to the training data in the database;
specifically, the method of step S21 includes the steps of:
step S211, all frontal expressionless face data are detected from the database, and then the average face 3D model C_M of all training data and the average 2D face feature point distribution 2D-shape-m are calculated;
step S212, according to the 2D-shape-m, in the face area, coordinates of a group A of 2D pixel points are randomly obtained and expressed as the coordinates of the nearest characteristic point in the 2D-shape-m and the deviation of the coordinates;
where A is a natural number with an empirical value of 400. Each pixel point coordinate is characterized as a feature point coordinate plus its deviation, which can be expressed as:
(x, y) → ((Sx, Sy), (dx, dy)),
where (x, y) are the pixel coordinates, (Sx, Sy) are the coordinates of the 2D face feature point closest to the pixel, and (dx, dy) is the deviation between the two, i.e. the difference between the pixel coordinates and the feature point coordinates.
Note that the 2D face feature point coordinates may be represented by a feature point number, that is, (Sx, Sy) may be represented by the nearest feature point number.
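For illustration, a short sketch of this indexing (assuming NumPy arrays; representing the nearest point by its index follows the note above):

```python
import numpy as np

def encode_pixels(pixels, shape_2d_m):
    """Express each sampled pixel as (index of nearest mean feature point, offset).

    pixels: (A, 2) pixel coordinates randomly sampled in the face region.
    shape_2d_m: (K, 2) mean 2D feature point distribution 2D-shape-m.
    A pixel can later be re-located on a new face as shape_2d[index] + offset.
    """
    dists = np.linalg.norm(pixels[:, None, :] - shape_2d_m[None, :, :], axis=-1)
    indices = dists.argmin(axis=1)           # nearest feature point (Sx, Sy)
    offsets = pixels - shape_2d_m[indices]   # deviation (dx, dy)
    return indices, offsets
```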
Step S213, calculating the difference between the face 3D model of each group of training data and the average face 3D model C_M of all training data;
and step S214, establishing a father vertex and a branch vertex of each binary tree to form a first binary tree random fern.
Preferably, the method of forming the first binary-tree random fern comprises:
Step S2141, randomly taking out B pairs of coordinates from the A groups of coordinates, randomly setting a threshold value, and calculating the value of the first binary tree vertex;
It should be noted that B is a natural number; the empirical value of A is 400 and the empirical value of B is 20. When A is 400 and B is 20, the effect of the first binary-tree random fern is better. The threshold value ranges from 0 to 255.
A pair of coordinates (x1, y1), (x2, y2) is taken from the B pairs of coordinates. All training data in the database are traversed: according to ((Sx1, Sy1), (dx1, dy1)), ((Sx2, Sy2), (dx2, dy2)), the coordinates of the two pixels in each 2D image are calculated, the pixel difference value is compared with the threshold value, and all training data are divided into two parts (namely a left branch part and a right branch part); for each part, the sum of the differences between the face 3D model 3D-mesh of each group of training data and the average face model C_M of all training data is calculated.
The pixel pair ((Sx1, Sy1), (dx1, dy1)), ((Sx2, Sy2), (dx2, dy2)) that maximizes the difference between the two sample partitions is selected as the branching vertex of the first binary tree. This vertex is called the parent vertex.
The calculation formula of the difference is as follows:
dot(left_sum, left_sum)/left_count + dot(right_sum, right_sum)/right_count    Formula (2)
where dot denotes the dot product; left_sum is the sum of the differences between the face 3D model 3D-mesh of the training data of the left branch and the average face model C_M of all training data, and left_count is the number of left-branch samples; right_sum is the sum of the differences between the face 3D model 3D-mesh of the training data of the right branch and the average face model C_M of all training data, and right_count is the number of right-branch samples.
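As a small sketch of Formula (2) (array names and shapes are assumptions), the candidate split with the highest score is the one kept as the parent vertex:

```python
import numpy as np

def split_score(residuals, left_mask):
    """Score a candidate split with Formula (2).

    residuals: (N, D) flattened 3D-mesh differences of the N training samples.
    left_mask: boolean (N,), True where the pixel-difference test is below the
    threshold (left branch).
    """
    score = 0.0
    for part in (residuals[left_mask], residuals[~left_mask]):
        if len(part):
            s = part.sum(axis=0)               # left_sum / right_sum
            score += s.dot(s) / len(part)      # dot(sum, sum) / count
    return score
```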
Step S2142, the value of each of the left and right branches is the average of the sum of the differences between the face 3D model 3D-mesh of the training data falling into that branch and the average face model C_M of all training data;
optionally, the value of each of the left and right branches is the sum of the differences between the face 3D model 3D-mesh of the training data in that branch and the average face model C_M of all training data, divided by the number of samples in that branch.
Step S2143, updating the 3D-mesh difference value of each group of training data;
specifically, the updated 3D-mesh difference value of each group of training data is the difference between the original 3D-mesh value of that group of training data and the average face model C_M' of all the training data of the branch into which it falls.
Step S2144, returning to step S2141, B pairs of coordinates are randomly taken out of the A groups of coordinates again, and branches continue to be built downward with the training data of the left and right branches as the respective bases, until a complete first binary tree is built.
It should be noted that, here, the 3D-mesh difference used by each set of training data is the updated 3D-mesh difference in step S2143. The branch vertex at the bottommost part of the first binary tree is a leaf node.
Then, following steps S2141 to S2144, when the next first binary tree is established, the A groups of coordinates are randomly sampled from the training data again, B pairs of coordinates are then randomly taken out of the new A groups of coordinates, and the next first binary tree is built, until the first binary-tree random fern is complete.
It should be noted that the empirical value of the depth of each first binary tree in the first binary-tree random fern is 5, and when one first binary tree has been built, the next first binary tree is built starting from step S2141. There may be a plurality of first binary trees in the first binary-tree random fern; preferably, the empirical value of the number of first binary trees in the first binary-tree random fern is 10.
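To make steps S2141 to S2144 concrete, the following rough sketch grows one such tree, building on the TrainingSample, encode_pixels and split_score sketches above. The pixel_value helper, the dictionary node layout, and the folding of the per-branch residual update (steps S2142-S2143) into a single leaf mean are assumptions made for brevity; the net correction stored at a leaf equals the sum of the branch means along the path to it.

```python
import numpy as np

def pixel_value(sample, idx, off):
    """Grayscale value at (feature point idx + offset) in the sample's photo."""
    x, y = np.round(sample.shape_2d[idx] + off).astype(int)
    return float(sample.image[y, x].mean())          # average over color channels

def build_tree(samples, residuals, indices, offsets, depth=5, B=20):
    """Grow one tree of the first binary-tree random fern (steps S2141-S2144)."""
    if depth == 0 or len(samples) < 2:
        return {"leaf": residuals.mean(axis=0)}      # mean 3D-mesh difference of the leaf

    best = None
    for _ in range(B):                               # B candidate pixel pairs
        i, j = np.random.choice(len(indices), size=2, replace=False)
        thr = np.random.randint(0, 256)              # threshold in [0, 255]
        feats = np.array([pixel_value(s, indices[i], offsets[i]) -
                          pixel_value(s, indices[j], offsets[j]) for s in samples])
        left = feats < thr
        if left.all() or not left.any():
            continue                                 # the split must separate the data
        score = split_score(residuals, left)         # Formula (2)
        if best is None or score > best[0]:
            best = (score, i, j, thr, left)

    if best is None:
        return {"leaf": residuals.mean(axis=0)}
    _, i, j, thr, left = best
    pick = lambda mask: [s for s, m in zip(samples, mask) if m]
    return {"pair": (i, j), "thr": thr,
            "left":  build_tree(pick(left), residuals[left], indices, offsets, depth - 1, B),
            "right": build_tree(pick(~left), residuals[~left], indices, offsets, depth - 1, B)}
```

A fern of, say, 10 such trees (the empirical value above) is then obtained by calling build_tree repeatedly with freshly sampled pixel encodings.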
And step S22, a new expressionless photo containing 2D face feature points is provided, the face feature points are detected and put into the first binary-tree random fern, and regression is performed to obtain an expressionless 3D model of the user.
Specifically, a new expressionless face photo different from all the training data is given, the face feature points are detected to obtain the corresponding 2D-shape, and the 2D-shape is put into the first binary-tree random fern established in step S21 to perform regression.
First, according to the pixel pair ((Sx1, Sy1), (dx1, dy1)), ((Sx2, Sy2), (dx2, dy2)) of the parent node, the corresponding pair of pixel point coordinates is retrieved from the new expressionless photo containing the 2D face feature points, the pixel difference value is calculated and compared with the parent node threshold, and the branch into which the photo falls is determined. The average face 3D model C_M of all training data is used as the initial 3D-mesh value, and the user's 3D-mesh is updated by adding to it the average of the difference sums between the training data of the corresponding branch and the branch average face model C_M'. Traversing in this way down to the bottom of the tree gives the expressionless 3D-mesh of the expressionless photo for this first binary tree.
The expressionless 3D-mesh obtained from each first binary tree is averaged to obtain the expressionless 3D-mesh of the user, namely the user's expressionless 3D model (C_M * W_id).
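Continuing the same sketch, regression through the fern (step S22) walks each tree with the stored pixel tests and averages the per-tree results; the leaf value here plays the role of the accumulated branch updates described above:

```python
import numpy as np

def regress_neutral_mesh(photo, shape_2d, forest, C_M, indices, offsets):
    """Regress the user's expressionless 3D-mesh (C_M * W_id) from a new
    frontal expressionless photo. `forest` is a list of trees from build_tree."""
    sample = TrainingSample(photo, None, shape_2d, None, None)  # only image + 2D-shape needed
    meshes = []
    for tree in forest:
        node = tree
        while "leaf" not in node:                    # apply the parent-node pixel tests
            i, j = node["pair"]
            diff = (pixel_value(sample, indices[i], offsets[i]) -
                    pixel_value(sample, indices[j], offsets[j]))
            node = node["left"] if diff < node["thr"] else node["right"]
        meshes.append(C_M + node["leaf"])            # initial value C_M plus the leaf correction
    return np.mean(meshes, axis=0)                   # average over all first binary trees
```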
Step S3, after the expressionless 3D model of the user is obtained, the random fern regression algorithm is used for continuously learning the database and regressing the facial expression weight;
it should be explained that, the method for regressing the facial expression weight by using the random fern regression algorithm and continuously learning the database comprises the following steps:
and step S31, establishing a second binary tree random fern according to the expressive 3D model and the expressive 3D model of the user of the training data of each individual in the database.
The method for establishing the second binary tree random fern comprises the following steps:
step S311, for each group of training data of each individual, calculating the difference 3D-mesh-diff between the face 3D model of that group and the individual's expressionless 3D model.
And S312, E groups of 3D point coordinates are randomly generated according to the face 3D model in the database, and each 3D point coordinate is expressed into two 3D feature point coordinates and an interpolation coefficient thereof.
Illustratively, one 3D point coordinate is expressed as ((x3, y3, z3), (x4, y4, z4), delta),
where E is a natural number, (x3, y3, z3) and (x4, y4, z4) are the coordinates of two 3D feature points, and delta is the interpolation coefficient between the two points, taking a value from 0 to 1.
And step 313, establishing the vertex and the branch vertex of each second binary tree to form a second binary tree random fern.
The method for establishing the vertex and the branch vertex of each second binary tree to form the second binary tree random fern comprises the following steps:
step S3131, randomly taking out F pairs of 3D point coordinates from the E groups of 3D point coordinates, randomly setting a threshold value, and calculating a value of a second binary tree vertex;
a pair of 3D point coordinates may be expressed as: ((x)3,y3,z3),(x4,y4,z4),delta1),((x5,y5,z5),(x6,y6,z6),delta2);
It should be noted that F is a natural number, the empirical value of E is 400, the empirical value of F is 20, and when E is 400 and F is 20, the effect of the second binary tree random fern is better. The threshold value ranges from 0 to 255.
According to a 3D-Mesh projection formula, the corresponding relation between the face 2D-shape and the 3D-Mesh is known:
U = P * M * (C_M * W_id)    Formula (3)
where P is the 3D projection matrix, M represents the face pose matrix, C_M * W_id is the expressionless 3D model of the user, and U is the face 2D-shape.
From Formula (3), the 2D coordinate pairs (x3, y3), (x4, y4), (x5, y5), (x6, y6) corresponding to the 3D point coordinates ((x3, y3, z3), (x4, y4, z4), delta1), ((x5, y5, z5), (x6, y6, z6), delta2) can be calculated.
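For illustration, a sketch of this projection under the assumption of homogeneous coordinates with a 3x4 projection matrix P and a 4x4 pose matrix M (the text does not fix the matrix shapes, and a weak-perspective model without the final divide would work equally well), together with the interpolated 3D point of step S312:

```python
import numpy as np

def project_points(points_3d, P, M):
    """Project 3D mesh points to 2D image coordinates, following Formula (3)."""
    homo = np.hstack([points_3d, np.ones((len(points_3d), 1))])  # (n, 4) homogeneous points
    uvw = P @ (M @ homo.T)                                       # pose, then projection -> (3, n)
    return (uvw[:2] / uvw[2]).T                                  # perspective divide -> (n, 2)

def sample_point(mesh, idx_a, idx_b, delta):
    """3D point interpolated between two mesh feature points (step S312).
    The interpolation direction (delta weighting point b) is an assumption."""
    return (1.0 - delta) * mesh[idx_a] + delta * mesh[idx_b]
```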
And calculating pixel difference values of the 2D coordinates of all the training data in the database in the 2D picture, comparing the pixel difference values with a threshold value, dividing all the training data into two parts, and respectively calculating the sum of the 3D-mesh-diff of the two parts.
The pair of 3D point coordinates ((x7, y7, z7), (x8, y8, z8), delta3), ((x9, y9, z9), (x10, y10, z10), delta4) whose pixel pair maximizes the difference between the 3D-mesh-diff of the two sample partitions is selected as the vertex of the second binary tree. This vertex is the parent vertex of the second binary tree.
The difference calculation formula is as follows:
dot(Left_sum, Left_sum)/Left_count + dot(Right_sum, Right_sum)/Right_count    Formula (4)
where dot denotes the dot product; Left_sum is the sum of the differences between the face 3D model 3D-mesh of each group of training data of the left branch and the expressionless face model corresponding to that group of training data, and Left_count is the number of samples of the left branch; Right_sum is the sum of the differences between the face 3D model 3D-mesh of each group of training data of the right branch and the expressionless face model corresponding to that group of training data, and Right_count is the number of samples of the right branch.
Step S3132, the value of the left and right branch is the average of the sum of differences of the 3D-mesh of the face 3D model of the respective training data and the corresponding non-expressive face model of the set of training data.
Step S3133, updating the 3D-mesh difference value of each group of training data;
specifically, the updated 3D-mesh difference value of each set of training data is the difference value of the 3D-mesh value of each set of training data and the average value of the sum of the differences of the 3D-mesh of the face 3D model of the training data of each branch and the expressionless face model corresponding to the set of training data.
And step S3134, starting from step S3131, randomly extracting the F pairs of 3D point coordinates from the E sets of 3D point coordinates again, and continuing to establish branches downwards by using the training data of the left branch and the right branch as bases until a complete second binary tree is established.
It should be noted that, here, the 3D-mesh difference used by each set of training data is the updated 3D-mesh difference in step S3133. The vertex of the branch at the bottommost part of the second binary tree is a leaf node.
Then, following steps S3131 to S3133, when the next second binary tree is established, the E groups of 3D point coordinates are randomly sampled from the training data again, F pairs of 3D point coordinates are randomly taken out of them, and the next second binary tree is built, until the second binary-tree random fern is complete.
It should be noted that the depth empirical value of each tree of the second binary tree random fern is 5, and when the establishment of one second binary tree is completed, the next second binary tree is continuously established from step S3131. There may be a plurality of second binary trees in the second binary-tree random fern, and preferably, the empirical value of the number of second binary trees in the second binary-tree random fern is 10.
And step S32, a new expressive photo of a specific person containing 2D face feature points is given, the 2D face feature points are detected and put into the second binary-tree random fern, and the facial expression weight is regressed.
Wherein the expressive photos of the given 2D face feature points of the specific person are 2D photos that do not belong to all training data.
Specifically, the expressionless 3D-Mesh of the specific person needs to be regressed first: a frontal expressionless 2D photo of the specific person is given, and the person's expressionless 3D-Mesh C1 is regressed through the first binary-tree random fern established in step S21. Then an expressive 2D photo of the specific person containing the 2D face feature points is given, and the face pose matrix M is calculated. With C1 as the initial 3D-Mesh, it is put into the second binary-tree random fern to perform regression.
First, according to the parent node's 3D point coordinate pair ((x7, y7, z7), (x8, y8, z8), delta3), ((x9, y9, z9), (x10, y10, z10), delta4) and the pose matrix M calculated in the previous step, Formula (3) is used to find the corresponding pair of pixel coordinates in the new expressive 2D photo of the specific person from the currently regressed expressive 3D-mesh; the pixel difference value is calculated and compared with the parent node threshold to determine which branch the photo falls into, and the user's 3D-mesh is updated with the average of the difference sums between the face 3D models of the training data in that branch and their corresponding expressionless face models. Traversing down to the bottom of the tree in this way gives the expressive 3D-mesh of the expressive photo for this second binary tree, and the average of the facial expression weights of all training samples at the leaf node reached is the facial expression weight of the expressive 2D photo for this tree.
And averaging the facial expression weights obtained by each second binary tree to obtain the real facial expression weight of the 2D photo with the expression.
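Putting the pieces together, a rough sketch of step S32 on top of the earlier sketches; the node layout of the second fern (a "points" pair, a threshold, per-child "update" vectors carried by every child including leaves, and a leaf "w_exp") is an assumed structure mirroring build_tree, and gray_at is a hypothetical pixel lookup:

```python
import numpy as np

def gray_at(photo, xy):
    """Grayscale value of the photo at a 2D position (rounded to the nearest pixel)."""
    x, y = np.round(xy).astype(int)
    return float(photo[y, x].mean())

def regress_expression_weight(photo, second_forest, neutral_mesh, P, M):
    """Regress W_exp for an expressive photo of a specific person (step S32)."""
    weights = []
    for tree in second_forest:
        node, mesh = tree, neutral_mesh.copy()         # C1 is the initial 3D-Mesh
        while "w_exp" not in node:                     # internal node: apply the 3D-point test
            (a1, b1, d1), (a2, b2, d2) = node["points"]
            pts = np.array([sample_point(mesh.reshape(-1, 3), a1, b1, d1),
                            sample_point(mesh.reshape(-1, 3), a2, b2, d2)])
            uv = project_points(pts, P, M)             # Formula (3)
            diff = gray_at(photo, uv[0]) - gray_at(photo, uv[1])
            node = node["left"] if diff < node["thr"] else node["right"]
            mesh = mesh + node["update"]               # branch mean 3D-mesh-diff (step S3132)
        weights.append(node["w_exp"])                  # mean W_exp of the leaf's training samples
    return np.mean(weights, axis=0)                    # average over all second binary trees
```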
And step S4, aiming the camera at the face, and calculating the facial expression weight.
Optionally, step S4 includes the following sub-steps:
step S41, first, a frontal expressionless face photo is collected, and the expressionless 3D model of the face is regressed according to the first binary-tree random fern;
step S42, an expressive face photo is taken at will, and the facial expression weight W_exp is regressed according to the second binary-tree random fern.
Specifically, the cameras used for collecting the front non-expression face picture and the expression face in step S4 are both monocular cameras.
Step S5, applying the facial expression weight of the user (as blendshape weights) to the 3D Avatar model to obtain the fitted 3D model and realize expression reconstruction.
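As a final illustrative sketch, one common way to apply such weights is as blendshape coefficients over per-expression vertex offsets of the Avatar; the (16, n_vertices, 3) offset representation is an assumption, since the text only says the weights are used as blendshape values:

```python
import numpy as np

def apply_to_avatar(avatar_neutral, avatar_blendshapes, w_exp):
    """Deform the 3D Avatar with the regressed facial expression weights (step S5).

    avatar_neutral: (n_vertices, 3) neutral Avatar mesh.
    avatar_blendshapes: (16, n_vertices, 3) vertex offsets, one per expression.
    w_exp: (16,) facial expression weights regressed in step S4.
    """
    return avatar_neutral + np.tensordot(w_exp, avatar_blendshapes, axes=([0], [0]))
```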
According to the method for reconstructing the facial expression by the 3D Avatar, the expression reconstruction can be realized by using the monocular camera; in addition, the method for reconstructing the facial expression by the 3D Avatar has the advantages of strong anti-interference performance and high expression reconstruction precision.
Example two
The present application further provides a system for reconstructing a facial expression using 3D Avatar, including:
the database establishing module is used for establishing a database to obtain training data;
the expressionless 3D model acquisition module is used for learning the database, establishing a random fern regression forest, and acquiring an expressionless 3D model of the user according to a 2D frontal expressionless face photo;
the facial expression weight regression module is used for continuously learning the database by using a random fern regression algorithm and regressing the facial expression weight;
the facial expression weight calculation module is used for aligning the camera to the face and solving the facial expression weight;
and the expression reconstruction module is used for applying the facial expression weight of the user to the 3D Avatar model to obtain the fitted 3D model and realize expression reconstruction.
The system for reconstructing the facial expression by using the 3D Avatar has the same technical effects as the method for reconstructing the facial expression, and the detailed description is omitted here.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application. It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (6)

1. A method for reconstructing a facial expression by using 3D Avatar, comprising:
step S1, establishing a database to obtain training data;
step S2, learning the database, establishing a random fern regression forest, and acquiring an expressionless 3D model of a user according to a 2D frontal expressionless face photo;
step S3, using a random fern regression algorithm, continuously learning a database, and regressing the facial expression weight;
step S4, aiming the camera at the face, and calculating the facial expression weight;
step S5, applying the facial expression weight of the user to the 3D Avatar model to obtain a fitted 3D model and realize expression reconstruction;
wherein the method of step S1 includes:
step S11, each individual collects a group of photos, the photos are divided into three postures, namely a left side face, a right side face and a front face, each posture is respectively subjected to preset 16 expressions, and corresponding color images and depth images are respectively recorded;
step S12, calibrating a face posture matrix M, which comprises face orientation and scaling information;
step S13, calibrating the human face characteristic points of the 2D image;
step S14, generating a corresponding human face 3D model by using the 2D image, the feature point information and the depth image to obtain training data;
wherein the method of step S2 includes:
step S21, establishing a first binary tree random fern according to the training data in the database;
step S22, providing a new expressionless photo containing 2D face feature points, detecting the face feature points, putting them into the first binary-tree random fern, and performing regression to obtain an expressionless 3D model of the user;
wherein the method of step S21 includes the steps of:
step S211, all frontal expressionless face data are detected from the database, and then the average face 3D model C_M of all training data and the average 2D face feature point distribution 2D-shape-m are calculated;
step S212, according to the 2D-shape-m, in the face area, coordinates of a group A of 2D pixel points are randomly obtained and expressed as the coordinates of the nearest characteristic point in the 2D-shape-m and the deviation of the coordinates;
step S213, calculating the difference between the face 3D model of each group of training data and the average face 3D model C_M of all training data;
step S214, establishing a father vertex and a branch vertex of each binary tree to form a first binary tree random fern;
wherein the method of step S3 includes:
step S31, establishing a second binary tree random fern according to the expressive 3D model and the expressive 3D model of the user of the training data of each individual in the database;
and step S32, a new expressive photo of a specific person containing 2D face feature points is given, the 2D face feature points are detected and put into the second binary-tree random fern, and the facial expression weight is regressed.
2. The method of reconstructing a facial expression as claimed in claim 1, wherein the fitted 3D model has the formula:
C = C_M * W_id * W_exp    Formula (1)
where C is the fitted 3D model, C_M is the average face 3D model of all training data, W_id is the face individual weight, and W_exp is the facial expression weight.
3. The method of reconstructing facial expressions according to claim 2, wherein W_exp includes 16 individual facial expression weights, and the 16 expressions include: no expression, left eye closed, right eye closed, both eyes closed, eyebrows raised, eyebrows furrowed, smiling, sad, angry, mouth bent to the left, mouth bent to the right, mouth puckered, jaw moved to the left, jaw moved to the right, jaw moved forward, and both cheeks puffed.
4. The method of reconstructing facial expressions according to claim 1, wherein the face feature points of the 2D image are calibrated by the ESR algorithm or the SDM algorithm.
5. The method of reconstructing facial expressions according to claim 1, wherein each set of training data obtained comprises five data elements (I, M, 2D-shape, 3D-Mesh, W_exp), wherein I represents a 2D photo, M represents a face pose matrix, 2D-shape represents the feature point information on the 2D photo, 3D-Mesh represents the corresponding face 3D model, and W_exp represents the facial expression weight.
6. A system for reconstructing a facial expression using 3D Avatar, comprising:
the database establishing module is used for establishing a database to obtain training data; the establishing of the database to obtain the training data specifically comprises the following steps:
step S11, each individual collects a group of photos, the photos are divided into three postures, namely a left side face, a right side face and a front face, each posture is respectively subjected to preset 16 expressions, and corresponding color images and depth images are respectively recorded;
step S12, calibrating a face posture matrix M, which comprises face orientation and scaling information;
step S13, calibrating the human face characteristic points of the 2D image;
step S14, generating a corresponding human face 3D model by using the 2D image, the feature point information and the depth image to obtain training data;
the expressionless 3D model acquisition module is used for learning the database, establishing a random fern regression forest, and acquiring an expressionless 3D model of the user according to a 2D frontal expressionless face photo; the learning of the database, establishing the random fern regression forest, and acquiring the expressionless 3D model of the user according to the 2D frontal expressionless face photo specifically comprise:
step S21, establishing a first binary tree random fern according to the training data in the database;
step S22, providing a new expressionless photo containing 2D face feature points, detecting the face feature points, putting them into the first binary-tree random fern, and performing regression to obtain an expressionless 3D model of the user;
wherein the method of step S21 includes the steps of:
step S211, all frontal expressionless face data are detected from the database, and then the average face 3D model C_M of all training data and the average 2D face feature point distribution 2D-shape-m are calculated;
step S212, according to the 2D-shape-m, in the face area, coordinates of a group A of 2D pixel points are randomly obtained and expressed as the coordinates of the nearest characteristic point in the 2D-shape-m and the deviation of the coordinates;
step S213, calculating the difference between the face 3D model of each group of training data and the average face 3D model C_M of all training data;
step S214, establishing a father vertex and a branch vertex of each binary tree to form a first binary tree random fern;
the facial expression weight regression module is used for continuously learning the database by using a random fern regression algorithm and regressing the facial expression weight; the method for continuously learning the database and regressing the facial expression weight by using the random fern regression algorithm specifically comprises the following steps of:
step S31, establishing a second binary tree random fern according to the expressive 3D model and the expressive 3D model of the user of the training data of each individual in the database;
step S32, a new expressive photo of a specific person containing 2D face feature points is given, the 2D face feature points are detected and put into the second binary-tree random fern, and the facial expression weight is regressed;
the facial expression weight calculation module is used for aligning the camera to the face and solving the facial expression weight;
and the expression reconstruction module is used for applying the facial expression weight of the user to the 3D Avatar model to obtain the fitted 3D model and realize expression reconstruction.
CN201711314522.9A 2017-12-09 2017-12-09 Method and system for reconstructing facial expression by using 3D Avatar Active CN107886568B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711314522.9A CN107886568B (en) 2017-12-09 2017-12-09 Method and system for reconstructing facial expression by using 3D Avatar

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711314522.9A CN107886568B (en) 2017-12-09 2017-12-09 Method and system for reconstructing facial expression by using 3D Avatar

Publications (2)

Publication Number Publication Date
CN107886568A CN107886568A (en) 2018-04-06
CN107886568B true CN107886568B (en) 2020-03-03

Family

ID=61773701

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711314522.9A Active CN107886568B (en) 2017-12-09 2017-12-09 Method and system for reconstructing facial expression by using 3D Avatar

Country Status (1)

Country Link
CN (1) CN107886568B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673287B (en) * 2020-05-15 2023-09-12 深圳市光鉴科技有限公司 Depth reconstruction method, system, equipment and medium based on target time node
CN111797756A (en) * 2020-06-30 2020-10-20 平安国际智慧城市科技股份有限公司 Video analysis method, device and medium based on artificial intelligence
CN113221503B (en) * 2020-12-31 2024-05-31 芯和半导体科技(上海)股份有限公司 Passive device modeling simulation engine based on machine learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015192369A1 (en) * 2014-06-20 2015-12-23 Intel Corporation 3d face model reconstruction apparatus and method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101216889A (en) * 2008-01-14 2008-07-09 浙江大学 A face image super-resolution method with the amalgamation of global characteristics and local details information
CN101976453A (en) * 2010-09-26 2011-02-16 浙江大学 GPU-based three-dimensional face expression synthesis method
CN102779189A (en) * 2012-06-30 2012-11-14 北京神州泰岳软件股份有限公司 Method and system for analyzing expressions
CN105469042A (en) * 2015-11-20 2016-04-06 天津汉光祥云信息科技有限公司 Improved face image comparison method
CN106469465A (en) * 2016-08-31 2017-03-01 深圳市唯特视科技有限公司 A kind of three-dimensional facial reconstruction method based on gray scale and depth information
CN106600667A (en) * 2016-12-12 2017-04-26 南京大学 Method for driving face animation with video based on convolution neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Jian, "3D expression reconstruction from video streams fusing SFM and dynamic texture mapping", Journal of Computer-Aided Design & Computer Graphics, 30 June 2010, vol. 22, no. 6, pp. 949-958 *

Also Published As

Publication number Publication date
CN107886568A (en) 2018-04-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant