CN107886568B - Method and system for reconstructing facial expression by using 3D Avatar - Google Patents

Method and system for reconstructing facial expression by using 3D Avatar

Info

Publication number
CN107886568B
CN107886568B (application number CN201711314522.9A)
Authority
CN
China
Prior art keywords
face
model
training data
facial expression
database
Prior art date
Legal status
Active
Application number
CN201711314522.9A
Other languages
Chinese (zh)
Other versions
CN107886568A (en)
Inventor
田飞
李小波
Current Assignee
BEIJING HENGXIN CAIHONG INFORMATION TECHNOLOGY Co Ltd
BEIJING HENGXIN RAINBOW TECHNOLOGY Co Ltd
Oriental Dream Culture Industry Investment Co Ltd
Original Assignee
BEIJING HENGXIN CAIHONG INFORMATION TECHNOLOGY Co Ltd
BEIJING HENGXIN RAINBOW TECHNOLOGY Co Ltd
Oriental Dream Culture Industry Investment Co Ltd
Priority date
Filing date
Publication date
Application filed by BEIJING HENGXIN CAIHONG INFORMATION TECHNOLOGY Co Ltd, BEIJING HENGXIN RAINBOW TECHNOLOGY Co Ltd, Oriental Dream Culture Industry Investment Co Ltd filed Critical BEIJING HENGXIN CAIHONG INFORMATION TECHNOLOGY Co Ltd
Priority to CN201711314522.9A
Publication of CN107886568A
Application granted
Publication of CN107886568B
Active legal-status Current
Anticipated expiration legal-status

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/243 Classification techniques relating to the number of classes
    • G06F 18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person
    • G06T 2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Processing Or Creating Images (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method and a system for reconstructing a facial expression by using a 3D Avatar, relates to the technical field of image processing and recognition, and solves the technical problems of low precision and complex operation in prior-art methods for reconstructing facial expressions with a 3D Avatar. The method for reconstructing the facial expression comprises the following steps: step S1, establishing a database to obtain training data; step S2, learning the database, establishing a random fern regression forest, and acquiring an expressionless 3D model of a user according to a 2D frontal expressionless face photo; step S3, continuing to learn the database with a random fern regression algorithm and regressing the facial expression weight; step S4, aiming the camera at the face and calculating the facial expression weight; and step S5, applying the facial expression weight of the user to the 3D Avatar model to obtain the fitted 3D model and realize expression reconstruction. The method and the device are mainly applied to expression reconstruction.

Description

Method and system for reconstructing facial expression by using 3D Avatar
Technical Field
The application relates to the technical field of image processing and recognition, in particular to a method and a system for reconstructing a facial expression by using 3D Avatar.
Background
The existing facial expression reconstruction methods mainly comprise two modes: ① a monocular camera is used to recognize the facial expression category, and the corresponding expression is made in a 3D Avatar; ② a depth camera collects a series of face depth images of various postures and expressions, and a combination of expression 3D models of the specific user is pre-generated; when reconstructing an expression, the face depth image of the user is collected and compared with the pre-generated 3D expression models of the user, the model deformation parameters are fitted, and the corresponding expression is then reconstructed.
Disclosure of Invention
The application aims to provide a method and a system for reconstructing a facial expression by using a 3D Avatar, so as to solve the technical problems of low precision and complex operation of prior-art methods for reconstructing facial expressions with a 3D Avatar.
The method for reconstructing the facial expression by using the 3D Avatar comprises the following steps:
step S1, establishing a database to obtain training data;
step S2, learning the database, establishing a random fern regression forest, and acquiring an expressionless 3D model of the user according to a 2D frontal expressionless face photo;
step S3, using a random fern regression algorithm, continuously learning a database, and regressing the facial expression weight;
step S4, aiming the camera at the face, and calculating the facial expression weight;
and step S5, applying the facial expression weight of the user to the 3D Avatar model to obtain the fitted 3D model, and realizing expression reconstruction.
Preferably, the formula of the fitted 3D model is:
C = C_M * W_id * W_exp    Formula (1)
where C is the fitted 3D model of the face, C_M is the average face 3D model of all training data, W_id is the face individual weight, and W_exp is the facial expression weight.
Further, W_exp includes 16 individual facial expression weights, and the 16 expressions include: no expression, left eye closed, right eye closed, both eyes closed, eyebrows raised, eyebrows furrowed, smiling, sad, angry, mouth bent to the left, mouth bent to the right, mouth puckered, jaw moved to the left, jaw moved to the right, jaw moved forward, and both cheeks puffed.
Optionally, the method of step S1 includes:
step S11, a group of photos is collected for each individual; the photos cover three postures, namely the left side face, the right side face and the front face, the 16 preset expressions are performed for each posture, and the corresponding color images and depth images are recorded respectively;
step S12, calibrating a face posture matrix M, which comprises face orientation and scaling information;
step S13, calibrating the human face characteristic points of the 2D image;
and step S14, generating a corresponding human face 3D model by using the 2D image, the feature point information and the depth image to obtain training data.
Optionally, the face feature points of the 2D image are calibrated by the ESR algorithm or the SDM algorithm.
Preferably, each set of training data obtained comprises five data elements (I, M, 2D-shape, 3D-Mesh, W_exp), wherein I represents a 2D photo, M represents a face pose matrix, 2D-shape represents the feature point information on the 2D photo, 3D-Mesh represents the corresponding face 3D model, and W_exp represents the facial expression weight.
Optionally, the method of step S2 includes:
step S21, establishing a first binary tree random fern according to the training data in the database;
and step S22, providing a new expressionless photo containing 2D face feature points, detecting the face feature points, putting them into the first binary-tree random fern, and performing regression to obtain an expressionless 3D model of the user.
Preferably, the method of step S21 includes the steps of:
step S211, all frontal expressionless face data are detected from the database, and then the average face 3D model C_M of all training data and the average 2D face feature point distribution 2D-shape-m are calculated;
step S212, according to 2D-shape-m, the coordinates of a group of A 2D pixel points are randomly sampled in the face area, and each is expressed as the coordinates of its nearest feature point in 2D-shape-m plus the deviation from that point;
step S213, calculating the difference between the face 3D model of each group of training data and the average face 3D model C_M of all training data;
and step S214, establishing a father vertex and a branch vertex of each binary tree to form a first binary tree random fern.
Optionally, the method of step S3 includes:
step S31, establishing a second binary tree random fern according to the expressive 3D model and the expressive 3D model of the user of the training data of each individual in the database;
and step S32, a new expressive photo of a specific person containing 2D face feature points is given, the 2D face feature points are detected and put into the second binary-tree random fern, and the facial expression weight is regressed.
According to the method for reconstructing the facial expression by using the 3D Avatar, on one hand, the 3D information of the face is obtained, so the facial expression weight is calculated more accurately and the anti-interference performance is stronger; on the other hand, a machine learning method is adopted to read in the 2D face photos, learn the corresponding 3D model and further learn the facial expression weight, so only a monocular camera is needed for acquisition, which reduces the equipment requirement while improving the expression reconstruction precision.
The present application further provides a system for reconstructing a facial expression using 3D Avatar, including:
the database establishing module is used for establishing a database to obtain training data;
the system comprises a non-expression 3D model acquisition module, a random fern regression forest establishment module and a non-expression 3D model acquisition module, wherein the non-expression 3D model acquisition module is used for learning a database, and acquiring a non-expression 3D model of a user according to a 2D front non-expression face photo;
the facial expression weight regression module is used for continuously learning the database by using a random fern regression algorithm and regressing the facial expression weight;
the facial expression weight calculation module is used for aligning the camera to the face and solving the facial expression weight;
and the expression reconstruction module is used for applying the facial expression weight of the user to the 3D Avatar model to obtain the fitted 3D model and realize expression reconstruction.
The 3D Avatar facial expression reconstruction system has the same technical effects as the facial expression reconstruction method, and is not repeated herein.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings can be obtained by those skilled in the art according to the drawings.
FIG. 1 is a flow chart of a method of reconstructing a facial expression using 3D Avatar of the present application;
FIG. 2 is a flow chart of the present application for building a database to obtain training data;
FIG. 3 is a schematic diagram of extracting 2D image human face feature points according to the present application;
fig. 4 is a flowchart of the present application for obtaining an expressionless 3D model of a user.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
Fig. 1 is a flowchart of a method of reconstructing a facial expression using 3D Avatar according to the present application. As shown in fig. 1, the method for reconstructing a facial expression by 3D Avatar of the present application includes:
step S1, establishing a database to obtain training data;
step S2, learning the database, establishing a random fern regression forest, and acquiring an expressionless 3D model (C_M * W_id) of the user according to the 2D frontal expressionless face photo;
Step S3, using a random fern regression algorithm, continuously learning a database, and regressing the facial expression weight;
step S4, aiming the camera at the face, and calculating the facial expression weight;
and step S5, applying the facial expression weight of the user to the 3D Avatar model to obtain the fitted 3D model, and realizing expression reconstruction.
According to the method for reconstructing the facial expression by using the 3D Avatar, on one hand, the 3D information of the face is obtained, so the facial expression weight is calculated more accurately and the anti-interference performance is stronger; on the other hand, a machine learning method is adopted to read in the 2D face photos, learn the corresponding 3D model and further learn the facial expression weight, so only a monocular camera is needed for acquisition, which reduces the equipment requirement while improving the expression reconstruction precision.
The formula of the fitted 3D model is:
C = C_M * W_id * W_exp    Formula (1)
where C is the fitted 3D model, C_M is the average face 3D model of all training data, W_id is the face individual weight, and W_exp is the facial expression weight.
W_exp includes 16 individual facial expression weights, and the 16 expressions include: no expression, left eye closed, right eye closed, both eyes closed, eyebrows raised, eyebrows furrowed, smiling, sad, angry, mouth bent to the left, mouth bent to the right, mouth puckered, jaw moved to the left, jaw moved to the right, jaw moved forward, and both cheeks puffed.
W_id, the face individual weight, is a weighted sum of the individual features of the face models in the database. W_id describes the individual features of an expressionless face, such as nose bridge height, eye spacing, mouth width, and so on.
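For illustration only, the following Python sketch shows one possible reading of Formula (1), in which C_M is treated as a bilinear basis that is contracted first with the identity weights W_id and then with the expression weights W_exp; the tensor layout, array shapes and function name are assumptions, since the text does not fix a representation:

```python
import numpy as np

def fit_face_model(C_M, W_id, W_exp):
    """Evaluate Formula (1): C = C_M * W_id * W_exp.

    Assumed layout (not fixed by the text): C_M is a bilinear core of shape
    (n_id, n_exp, n_vertices * 3); W_id has shape (n_id,); W_exp has shape
    (n_exp,) and holds the 16 expression weights.
    """
    identity_model = np.tensordot(W_id, C_M, axes=([0], [0]))  # (n_exp, n_vertices*3)
    C = np.tensordot(W_exp, identity_model, axes=([0], [0]))   # (n_vertices*3,)
    return C.reshape(-1, 3)                                    # one (x, y, z) row per vertex
```

Under this reading, the 16 entries of W_exp act as blendshape-style coefficients, which matches their later use in step S5.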
The following describes a method for reconstructing a facial expression by using 3D Avatar according to the present application.
And step S1, establishing a database to obtain training data.
Fig. 2 is a flow chart of the present application for building a database and obtaining training data. As shown in fig. 2, the method for establishing a database and obtaining training data of the present application includes:
step S11, a group of photos is collected for each individual; the photos cover three postures, namely the left side face, the right side face and the front face, the 16 preset expressions are performed for each posture, and the corresponding color images and depth images are recorded respectively;
the method comprises the steps of collecting photos by using a kinect (3D somatosensory camera), and recording corresponding color images and depth images. The color image is the corresponding 2D image. Manually setting facial expression weight W according to the made expressionexp. The facial expression weight of each 2D photo can be obtained.
Step S12, calibrating a face posture matrix M, which comprises face orientation and scaling information;
step S13, calibrating the human face characteristic points of the 2D image;
optionally, the face feature points of the 2D image are calibrated by the ESR algorithm (Explicit Shape Regression) or the SDM algorithm (Supervised Descent Method). Fig. 3 is a schematic diagram of extracting 2D image face feature points according to the present application. As shown in fig. 3, 74 face feature points are finally obtained.
And step S14, generating a corresponding 3D model 3D-Mesh of the human face by using the 2D image, the feature point information 2D-shape and the depth image to obtain training data.
Each set of training data comprises five data elements (I, M, 2D-shape, 3D-Mesh, W_exp), wherein I represents a 2D photo, M represents the face pose matrix, 2D-shape represents the feature point information on the 2D photo, 3D-Mesh represents the corresponding face 3D model, and W_exp represents the facial expression weight. Each individual contributes multiple sets of training data.
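As a minimal sketch of how one such record could be held in memory (the field names, the 74-point count from Fig. 3 and the example pose-matrix shape are illustrative assumptions):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class TrainingSample:
    """One (I, M, 2D-shape, 3D-Mesh, W_exp) record from the database."""
    image: np.ndarray      # I: 2D color photo, shape (H, W, 3)
    pose: np.ndarray       # M: face pose matrix (orientation and scaling), e.g. 3x4
    shape_2d: np.ndarray   # 2D-shape: (74, 2) calibrated face feature points
    mesh_3d: np.ndarray    # 3D-Mesh: (n_vertices, 3) face 3D model
    w_exp: np.ndarray      # W_exp: (16,) facial expression weights
```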
And step S2, learning a database, establishing a random fern regression forest, and acquiring an expressionless 3D model of the user according to the 2D front-surface expressionless face picture.
Fig. 4 is a flowchart of the present application for obtaining an expressionless 3D model of a user. As shown in fig. 4, the method of step S2 includes:
step S21, establishing a first binary tree random fern according to the training data in the database;
specifically, the method of step S21 includes the steps of:
step S211, all frontal expressionless face data are detected from the database, and then the average face 3D model C_M of all training data and the average 2D face feature point distribution 2D-shape-m are calculated;
step S212, according to the 2D-shape-m, in the face area, coordinates of a group A of 2D pixel points are randomly obtained and expressed as the coordinates of the nearest characteristic point in the 2D-shape-m and the deviation of the coordinates;
where A is a natural number with an empirical value of 400. Each pixel point coordinate is characterized as a feature point coordinate plus its deviation, which can be expressed as:
(x, y) → ((Sx, Sy), (dx, dy)),
where (x, y) are the pixel coordinates, (Sx, Sy) are the coordinates of the 2D face feature point closest to the pixel, and (dx, dy) is the deviation between the two, i.e. the difference between the pixel coordinates and the feature point coordinates.
Note that the 2D face feature point coordinates may be represented by a feature point number, that is, (Sx, Sy) may be represented by the nearest feature point number.
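For illustration, a short sketch of this indexing (assuming NumPy arrays; representing the nearest point by its index follows the note above):

```python
import numpy as np

def encode_pixels(pixels, shape_2d_m):
    """Express each sampled pixel as (index of nearest mean feature point, offset).

    pixels: (A, 2) pixel coordinates randomly sampled in the face region.
    shape_2d_m: (K, 2) mean 2D feature point distribution 2D-shape-m.
    A pixel can later be re-located on a new face as shape_2d[index] + offset.
    """
    dists = np.linalg.norm(pixels[:, None, :] - shape_2d_m[None, :, :], axis=-1)
    indices = dists.argmin(axis=1)           # nearest feature point (Sx, Sy)
    offsets = pixels - shape_2d_m[indices]   # deviation (dx, dy)
    return indices, offsets
```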
Step S213, calculating the difference between the face 3D model of each group of training data and the average face 3D model C_M of all training data;
and step S214, establishing a father vertex and a branch vertex of each binary tree to form a first binary tree random fern.
Preferably, the method of forming the first binary-tree random fern comprises:
Step S2141, randomly taking out B pairs of coordinates from the A groups of coordinates, randomly setting a threshold value, and calculating the value of the first binary tree vertex;
It should be noted that B is a natural number; the empirical value of A is 400 and the empirical value of B is 20. When A is 400 and B is 20, the effect of the first binary-tree random fern is better. The threshold value ranges from 0 to 255.
A pair of coordinates (x1, y1), (x2, y2) is taken from the B pairs of coordinates. All training data in the database are traversed: according to ((Sx1, Sy1), (dx1, dy1)), ((Sx2, Sy2), (dx2, dy2)), the coordinates of the two pixels in each 2D image are calculated, the pixel difference value is compared with the threshold value, and all training data are divided into two parts (namely a left branch part and a right branch part); for each part, the sum of the differences between the face 3D model 3D-mesh of each group of training data and the average face model C_M of all training data is calculated.
The pixel pair ((Sx1, Sy1), (dx1, dy1)), ((Sx2, Sy2), (dx2, dy2)) that maximizes the difference between the two sample partitions is selected as the branching vertex of the first binary tree. This vertex is called the parent vertex.
The calculation formula of the difference is as follows:
dot(left_sum, left_sum)/left_count + dot(right_sum, right_sum)/right_count    Formula (2)
where dot denotes the dot product; left_sum is the sum of the differences between the face 3D model 3D-mesh of the training data of the left branch and the average face model C_M of all training data, and left_count is the number of left-branch samples; right_sum is the sum of the differences between the face 3D model 3D-mesh of the training data of the right branch and the average face model C_M of all training data, and right_count is the number of right-branch samples.
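As a small sketch of Formula (2) (array names and shapes are assumptions), the candidate split with the highest score is the one kept as the parent vertex:

```python
import numpy as np

def split_score(residuals, left_mask):
    """Score a candidate split with Formula (2).

    residuals: (N, D) flattened 3D-mesh differences of the N training samples.
    left_mask: boolean (N,), True where the pixel-difference test is below the
    threshold (left branch).
    """
    score = 0.0
    for part in (residuals[left_mask], residuals[~left_mask]):
        if len(part):
            s = part.sum(axis=0)               # left_sum / right_sum
            score += s.dot(s) / len(part)      # dot(sum, sum) / count
    return score
```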
Step S2142, the value of each of the left and right branches is the average of the sum of the differences between the face 3D model 3D-mesh of the training data falling into that branch and the average face model C_M of all training data;
optionally, the value of each of the left and right branches is the sum of the differences between the face 3D model 3D-mesh of the training data in that branch and the average face model C_M of all training data, divided by the number of samples in that branch.
Step S2143, updating the 3D-mesh difference value of each group of training data;
specifically, the updated 3D-mesh difference value of each group of training data is the difference between the original 3D-mesh value of that group of training data and the average face model C_M' of all the training data of the branch into which it falls.
Step S2144, returning to step S2141, B pairs of coordinates are randomly taken out of the A groups of coordinates again, and branches continue to be built downward with the training data of the left and right branches as the respective bases, until a complete first binary tree is built.
It should be noted that, here, the 3D-mesh difference used by each set of training data is the updated 3D-mesh difference in step S2143. The branch vertex at the bottommost part of the first binary tree is a leaf node.
Then, following steps S2141 to S2144, when the next first binary tree is established, the A groups of coordinates are randomly sampled from the training data again, B pairs of coordinates are then randomly taken out of the new A groups of coordinates, and the next first binary tree is built, until the first binary-tree random fern is complete.
It should be noted that the empirical value of the depth of each first binary tree in the first binary-tree random fern is 5, and when one first binary tree has been built, the next first binary tree is built starting from step S2141. There may be a plurality of first binary trees in the first binary-tree random fern; preferably, the empirical value of the number of first binary trees in the first binary-tree random fern is 10.
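To make steps S2141 to S2144 concrete, the following rough sketch grows one such tree, building on the TrainingSample, encode_pixels and split_score sketches above. The pixel_value helper, the dictionary node layout, and the folding of the per-branch residual update (steps S2142-S2143) into a single leaf mean are assumptions made for brevity; the net correction stored at a leaf equals the sum of the branch means along the path to it.

```python
import numpy as np

def pixel_value(sample, idx, off):
    """Grayscale value at (feature point idx + offset) in the sample's photo."""
    x, y = np.round(sample.shape_2d[idx] + off).astype(int)
    return float(sample.image[y, x].mean())          # average over color channels

def build_tree(samples, residuals, indices, offsets, depth=5, B=20):
    """Grow one tree of the first binary-tree random fern (steps S2141-S2144)."""
    if depth == 0 or len(samples) < 2:
        return {"leaf": residuals.mean(axis=0)}      # mean 3D-mesh difference of the leaf

    best = None
    for _ in range(B):                               # B candidate pixel pairs
        i, j = np.random.choice(len(indices), size=2, replace=False)
        thr = np.random.randint(0, 256)              # threshold in [0, 255]
        feats = np.array([pixel_value(s, indices[i], offsets[i]) -
                          pixel_value(s, indices[j], offsets[j]) for s in samples])
        left = feats < thr
        if left.all() or not left.any():
            continue                                 # the split must separate the data
        score = split_score(residuals, left)         # Formula (2)
        if best is None or score > best[0]:
            best = (score, i, j, thr, left)

    if best is None:
        return {"leaf": residuals.mean(axis=0)}
    _, i, j, thr, left = best
    pick = lambda mask: [s for s, m in zip(samples, mask) if m]
    return {"pair": (i, j), "thr": thr,
            "left":  build_tree(pick(left), residuals[left], indices, offsets, depth - 1, B),
            "right": build_tree(pick(~left), residuals[~left], indices, offsets, depth - 1, B)}
```

A fern of, say, 10 such trees (the empirical value above) is then obtained by calling build_tree repeatedly with freshly sampled pixel encodings.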
And step S22, a new expressionless photo containing 2D face feature points is provided, the face feature points are detected and put into the first binary-tree random fern, and regression is performed to obtain an expressionless 3D model of the user.
Specifically, a new expressionless face photo different from all the training data is given, the face feature points are detected to obtain the corresponding 2D-shape, and the 2D-shape is put into the first binary-tree random fern established in step S21 to perform regression.
First, according to the pixel pair ((Sx1, Sy1), (dx1, dy1)), ((Sx2, Sy2), (dx2, dy2)) of the parent node, the corresponding pair of pixel point coordinates is retrieved from the new expressionless photo containing the 2D face feature points, the pixel difference value is calculated and compared with the parent node threshold, and the branch into which the photo falls is determined. The average face 3D model C_M of all training data is used as the initial 3D-mesh value, and the user's 3D-mesh is updated by adding to it the average of the difference sums between the training data of the corresponding branch and the branch average face model C_M'. Traversing in this way down to the bottom of the tree gives the expressionless 3D-mesh of the expressionless photo for this first binary tree.
The expressionless 3D-mesh obtained from each first binary tree is averaged to obtain the expressionless 3D-mesh of the user, namely the user's expressionless 3D model (C_M * W_id).
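Continuing the same sketch, regression through the fern (step S22) walks each tree with the stored pixel tests and averages the per-tree results; the leaf value here plays the role of the accumulated branch updates described above:

```python
import numpy as np

def regress_neutral_mesh(photo, shape_2d, forest, C_M, indices, offsets):
    """Regress the user's expressionless 3D-mesh (C_M * W_id) from a new
    frontal expressionless photo. `forest` is a list of trees from build_tree."""
    sample = TrainingSample(photo, None, shape_2d, None, None)  # only image + 2D-shape needed
    meshes = []
    for tree in forest:
        node = tree
        while "leaf" not in node:                    # apply the parent-node pixel tests
            i, j = node["pair"]
            diff = (pixel_value(sample, indices[i], offsets[i]) -
                    pixel_value(sample, indices[j], offsets[j]))
            node = node["left"] if diff < node["thr"] else node["right"]
        meshes.append(C_M + node["leaf"])            # initial value C_M plus the leaf correction
    return np.mean(meshes, axis=0)                   # average over all first binary trees
```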
Step S3, after the expressionless 3D model of the user is obtained, the random fern regression algorithm is used for continuously learning the database and regressing the facial expression weight;
it should be explained that, the method for regressing the facial expression weight by using the random fern regression algorithm and continuously learning the database comprises the following steps:
and step S31, establishing a second binary tree random fern according to the expressive 3D model and the expressive 3D model of the user of the training data of each individual in the database.
The method for establishing the second binary tree random fern comprises the following steps:
step S311, for each group of training data of each individual, calculating the difference 3D-mesh-diff between the face 3D model of that group and the individual's expressionless 3D model.
And S312, E groups of 3D point coordinates are randomly generated according to the face 3D model in the database, and each 3D point coordinate is expressed into two 3D feature point coordinates and an interpolation coefficient thereof.
Illustratively, one 3D point coordinate is expressed as ((x3, y3, z3), (x4, y4, z4), delta),
where E is a natural number, (x3, y3, z3) and (x4, y4, z4) are the coordinates of two 3D feature points, and delta is the interpolation coefficient between the two points, taking a value from 0 to 1.
And step 313, establishing the vertex and the branch vertex of each second binary tree to form a second binary tree random fern.
The method for establishing the vertex and the branch vertex of each second binary tree to form the second binary tree random fern comprises the following steps:
step S3131, randomly taking out F pairs of 3D point coordinates from the E groups of 3D point coordinates, randomly setting a threshold value, and calculating a value of a second binary tree vertex;
a pair of 3D point coordinates may be expressed as: ((x)3,y3,z3),(x4,y4,z4),delta1),((x5,y5,z5),(x6,y6,z6),delta2);
It should be noted that F is a natural number, the empirical value of E is 400, the empirical value of F is 20, and when E is 400 and F is 20, the effect of the second binary tree random fern is better. The threshold value ranges from 0 to 255.
According to a 3D-Mesh projection formula, the corresponding relation between the face 2D-shape and the 3D-Mesh is known:
U = P * M * (C_M * W_id)    Formula (3)
where P is the 3D projection matrix, M represents the face pose matrix, C_M * W_id is the expressionless 3D model of the user, and U is the face 2D-shape.
From Formula (3), the 2D coordinate pairs (x3, y3), (x4, y4), (x5, y5), (x6, y6) corresponding to the 3D point coordinates ((x3, y3, z3), (x4, y4, z4), delta1), ((x5, y5, z5), (x6, y6, z6), delta2) can be calculated.
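For illustration, a sketch of this projection under the assumption of homogeneous coordinates with a 3x4 projection matrix P and a 4x4 pose matrix M (the text does not fix the matrix shapes, and a weak-perspective model without the final divide would work equally well), together with the interpolated 3D point of step S312:

```python
import numpy as np

def project_points(points_3d, P, M):
    """Project 3D mesh points to 2D image coordinates, following Formula (3)."""
    homo = np.hstack([points_3d, np.ones((len(points_3d), 1))])  # (n, 4) homogeneous points
    uvw = P @ (M @ homo.T)                                       # pose, then projection -> (3, n)
    return (uvw[:2] / uvw[2]).T                                  # perspective divide -> (n, 2)

def sample_point(mesh, idx_a, idx_b, delta):
    """3D point interpolated between two mesh feature points (step S312).
    The interpolation direction (delta weighting point b) is an assumption."""
    return (1.0 - delta) * mesh[idx_a] + delta * mesh[idx_b]
```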
And calculating pixel difference values of the 2D coordinates of all the training data in the database in the 2D picture, comparing the pixel difference values with a threshold value, dividing all the training data into two parts, and respectively calculating the sum of the 3D-mesh-diff of the two parts.
The pair of 3D point coordinates ((x7, y7, z7), (x8, y8, z8), delta3), ((x9, y9, z9), (x10, y10, z10), delta4) whose pixel pair maximizes the difference between the 3D-mesh-diff of the two sample partitions is selected as the vertex of the second binary tree. This vertex is the parent vertex of the second binary tree.
The difference calculation formula is as follows:
dot(Left_sum, Left_sum)/Left_count + dot(Right_sum, Right_sum)/Right_count    Formula (4)
where dot denotes the dot product; Left_sum is the sum of the differences between the face 3D model 3D-mesh of each group of training data of the left branch and the expressionless face model corresponding to that group of training data, and Left_count is the number of samples of the left branch; Right_sum is the sum of the differences between the face 3D model 3D-mesh of each group of training data of the right branch and the expressionless face model corresponding to that group of training data, and Right_count is the number of samples of the right branch.
Step S3132, the value of the left and right branch is the average of the sum of differences of the 3D-mesh of the face 3D model of the respective training data and the corresponding non-expressive face model of the set of training data.
Step S3133, updating the 3D-mesh difference value of each group of training data;
specifically, the updated 3D-mesh difference value of each set of training data is the difference value of the 3D-mesh value of each set of training data and the average value of the sum of the differences of the 3D-mesh of the face 3D model of the training data of each branch and the expressionless face model corresponding to the set of training data.
And step S3134, starting from step S3131, randomly extracting the F pairs of 3D point coordinates from the E sets of 3D point coordinates again, and continuing to establish branches downwards by using the training data of the left branch and the right branch as bases until a complete second binary tree is established.
It should be noted that, here, the 3D-mesh difference used by each set of training data is the updated 3D-mesh difference in step S3133. The vertex of the branch at the bottommost part of the second binary tree is a leaf node.
Then, following steps S3131 to S3133, when the next second binary tree is established, the E groups of 3D point coordinates are randomly sampled from the training data again, F pairs of 3D point coordinates are randomly taken out of them, and the next second binary tree is built, until the second binary-tree random fern is complete.
It should be noted that the depth empirical value of each tree of the second binary tree random fern is 5, and when the establishment of one second binary tree is completed, the next second binary tree is continuously established from step S3131. There may be a plurality of second binary trees in the second binary-tree random fern, and preferably, the empirical value of the number of second binary trees in the second binary-tree random fern is 10.
And step S32, a new expressive photo of a specific person containing 2D face feature points is given, the 2D face feature points are detected and put into the second binary-tree random fern, and the facial expression weight is regressed.
Wherein the expressive photos of the given 2D face feature points of the specific person are 2D photos that do not belong to all training data.
Specifically, the expressionless 3D-Mesh of the specific person needs to be regressed first: a frontal expressionless 2D photo of the specific person is given, and the person's expressionless 3D-Mesh C1 is regressed through the first binary-tree random fern established in step S21. Then an expressive 2D photo of the specific person containing the 2D face feature points is given, and the face pose matrix M is calculated. With C1 as the initial 3D-Mesh, it is put into the second binary-tree random fern to perform regression.
First, according to the parent node's 3D point coordinate pair ((x7, y7, z7), (x8, y8, z8), delta3), ((x9, y9, z9), (x10, y10, z10), delta4) and the pose matrix M calculated in the previous step, Formula (3) is used to find the corresponding pair of pixel coordinates in the new expressive 2D photo of the specific person from the currently regressed expressive 3D-mesh; the pixel difference value is calculated and compared with the parent node threshold to determine which branch the photo falls into, and the user's 3D-mesh is updated with the average of the difference sums between the face 3D models of the training data in that branch and their corresponding expressionless face models. Traversing down to the bottom of the tree in this way gives the expressive 3D-mesh of the expressive photo for this second binary tree, and the average of the facial expression weights of all training samples at the leaf node reached is the facial expression weight of the expressive 2D photo for this tree.
And averaging the facial expression weights obtained by each second binary tree to obtain the real facial expression weight of the 2D photo with the expression.
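Putting the pieces together, a rough sketch of step S32 on top of the earlier sketches; the node layout of the second fern (a "points" pair, a threshold, per-child "update" vectors carried by every child including leaves, and a leaf "w_exp") is an assumed structure mirroring build_tree, and gray_at is a hypothetical pixel lookup:

```python
import numpy as np

def gray_at(photo, xy):
    """Grayscale value of the photo at a 2D position (rounded to the nearest pixel)."""
    x, y = np.round(xy).astype(int)
    return float(photo[y, x].mean())

def regress_expression_weight(photo, second_forest, neutral_mesh, P, M):
    """Regress W_exp for an expressive photo of a specific person (step S32)."""
    weights = []
    for tree in second_forest:
        node, mesh = tree, neutral_mesh.copy()         # C1 is the initial 3D-Mesh
        while "w_exp" not in node:                     # internal node: apply the 3D-point test
            (a1, b1, d1), (a2, b2, d2) = node["points"]
            pts = np.array([sample_point(mesh.reshape(-1, 3), a1, b1, d1),
                            sample_point(mesh.reshape(-1, 3), a2, b2, d2)])
            uv = project_points(pts, P, M)             # Formula (3)
            diff = gray_at(photo, uv[0]) - gray_at(photo, uv[1])
            node = node["left"] if diff < node["thr"] else node["right"]
            mesh = mesh + node["update"]               # branch mean 3D-mesh-diff (step S3132)
        weights.append(node["w_exp"])                  # mean W_exp of the leaf's training samples
    return np.mean(weights, axis=0)                    # average over all second binary trees
```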
And step S4, aiming the camera at the face, and calculating the facial expression weight.
Optionally, step S4 includes the following sub-steps:
step S41, first, a frontal expressionless face photo is collected, and the expressionless 3D model of the face is regressed according to the first binary-tree random fern;
step S42, an expressive face photo is taken at will, and the facial expression weight W_exp is regressed according to the second binary-tree random fern.
Specifically, the cameras used for collecting the front non-expression face picture and the expression face in step S4 are both monocular cameras.
Step S5, applying the facial expression weight of the user (as blendshape weights) to the 3D Avatar model to obtain the fitted 3D model and realize expression reconstruction.
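As a final illustrative sketch, one common way to apply such weights is as blendshape coefficients over per-expression vertex offsets of the Avatar; the (16, n_vertices, 3) offset representation is an assumption, since the text only says the weights are used as blendshape values:

```python
import numpy as np

def apply_to_avatar(avatar_neutral, avatar_blendshapes, w_exp):
    """Deform the 3D Avatar with the regressed facial expression weights (step S5).

    avatar_neutral: (n_vertices, 3) neutral Avatar mesh.
    avatar_blendshapes: (16, n_vertices, 3) vertex offsets, one per expression.
    w_exp: (16,) facial expression weights regressed in step S4.
    """
    return avatar_neutral + np.tensordot(w_exp, avatar_blendshapes, axes=([0], [0]))
```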
According to the method for reconstructing the facial expression by the 3D Avatar, the expression reconstruction can be realized by using the monocular camera; in addition, the method for reconstructing the facial expression by the 3D Avatar has the advantages of strong anti-interference performance and high expression reconstruction precision.
Example two
The present application further provides a system for reconstructing a facial expression using 3D Avatar, including:
the database establishing module is used for establishing a database to obtain training data;
the expressionless 3D model acquisition module is used for learning the database, establishing a random fern regression forest, and acquiring an expressionless 3D model of the user according to a 2D frontal expressionless face photo;
the facial expression weight regression module is used for continuously learning the database by using a random fern regression algorithm and regressing the facial expression weight;
the facial expression weight calculation module is used for aligning the camera to the face and solving the facial expression weight;
and the expression reconstruction module is used for applying the facial expression weight of the user to the 3D Avatar model to obtain the fitted 3D model and realize expression reconstruction.
The system for reconstructing the facial expression by using the 3D Avatar has the same technical effects as the method for reconstructing the facial expression, and the detailed description is omitted here.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application. It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (6)

1. A method for reconstructing a facial expression by using 3D Avatar, comprising:
step S1, establishing a database to obtain training data;
step S2, learning the database, establishing a random fern regression forest, and acquiring an expressionless 3D model of a user according to a 2D frontal expressionless face photo;
step S3, using a random fern regression algorithm, continuously learning a database, and regressing the facial expression weight;
step S4, aiming the camera at the face, and calculating the facial expression weight;
step S5, applying the facial expression weight of the user to the 3D Avatar model to obtain a fitted 3D model and realize expression reconstruction;
wherein the method of step S1 includes:
step S11, each individual collects a group of photos, the photos are divided into three postures, namely a left side face, a right side face and a front face, each posture is respectively subjected to preset 16 expressions, and corresponding color images and depth images are respectively recorded;
step S12, calibrating a face posture matrix M, which comprises face orientation and scaling information;
step S13, calibrating the human face characteristic points of the 2D image;
step S14, generating a corresponding human face 3D model by using the 2D image, the feature point information and the depth image to obtain training data;
wherein the method of step S2 includes:
step S21, establishing a first binary tree random fern according to the training data in the database;
step S22, providing a new expressionless photo containing 2D face feature points, detecting the face feature points, putting them into the first binary-tree random fern, and performing regression to obtain an expressionless 3D model of the user;
wherein the method of step S21 includes the steps of:
step S211, all frontal expressionless face data are detected from the database, and then the average face 3D model C_M of all training data and the average 2D face feature point distribution 2D-shape-m are calculated;
step S212, according to the 2D-shape-m, in the face area, coordinates of a group A of 2D pixel points are randomly obtained and expressed as the coordinates of the nearest characteristic point in the 2D-shape-m and the deviation of the coordinates;
step S213, calculating the difference between the face 3D model of each group of training data and the average face 3D model C_M of all training data;
step S214, establishing a father vertex and a branch vertex of each binary tree to form a first binary tree random fern;
wherein the method of step S3 includes:
step S31, establishing a second binary tree random fern according to the expressive 3D model and the expressive 3D model of the user of the training data of each individual in the database;
and step S32, a new expressive photo of a specific person containing 2D face feature points is given, the 2D face feature points are detected and put into the second binary-tree random fern, and the facial expression weight is regressed.
2. The method of reconstructing a facial expression as claimed in claim 1, wherein the fitted 3D model has the formula:
C = C_M * W_id * W_exp    Formula (1)
where C is the fitted 3D model, C_M is the average face 3D model of all training data, W_id is the face individual weight, and W_exp is the facial expression weight.
3. The method of reconstructing facial expressions according to claim 2, wherein W_exp includes 16 individual facial expression weights, and the 16 expressions include: no expression, left eye closed, right eye closed, both eyes closed, eyebrows raised, eyebrows furrowed, smiling, sad, angry, mouth bent to the left, mouth bent to the right, mouth puckered, jaw moved to the left, jaw moved to the right, jaw moved forward, and both cheeks puffed.
4. The method of reconstructing facial expressions according to claim 1, wherein the face feature points of the 2D image are calibrated by the ESR algorithm or the SDM algorithm.
5. The method of reconstructing facial expressions according to claim 1, wherein each set of training data obtained comprises five data elements (I, M, 2D-shape, 3D-Mesh, W_exp), wherein I represents a 2D photo, M represents a face pose matrix, 2D-shape represents the feature point information on the 2D photo, 3D-Mesh represents the corresponding face 3D model, and W_exp represents the facial expression weight.
6. A system for reconstructing a facial expression using 3D Avatar, comprising:
the database establishing module is used for establishing a database to obtain training data; the establishing of the database to obtain the training data specifically comprises the following steps:
step S11, each individual collects a group of photos, the photos are divided into three postures, namely a left side face, a right side face and a front face, each posture is respectively subjected to preset 16 expressions, and corresponding color images and depth images are respectively recorded;
step S12, calibrating a face posture matrix M, which comprises face orientation and scaling information;
step S13, calibrating the human face characteristic points of the 2D image;
step S14, generating a corresponding human face 3D model by using the 2D image, the feature point information and the depth image to obtain training data;
the expressionless 3D model acquisition module is used for learning the database, establishing a random fern regression forest, and acquiring an expressionless 3D model of the user according to a 2D frontal expressionless face photo; the learning of the database, establishing the random fern regression forest, and acquiring the expressionless 3D model of the user according to the 2D frontal expressionless face photo specifically comprise:
step S21, establishing a first binary tree random fern according to the training data in the database;
step S22, providing a new expressionless photo containing 2D face feature points, detecting the face feature points, putting them into the first binary-tree random fern, and performing regression to obtain an expressionless 3D model of the user;
wherein the method of step S21 includes the steps of:
step S211, all frontal expressionless face data are detected from the database, and then the average face 3D model C_M of all training data and the average 2D face feature point distribution 2D-shape-m are calculated;
step S212, according to the 2D-shape-m, in the face area, coordinates of a group A of 2D pixel points are randomly obtained and expressed as the coordinates of the nearest characteristic point in the 2D-shape-m and the deviation of the coordinates;
step S213, calculating the difference between the face 3D model of each group of training data and the average face 3D model C_M of all training data;
step S214, establishing a father vertex and a branch vertex of each binary tree to form a first binary tree random fern;
the facial expression weight regression module is used for continuously learning the database by using a random fern regression algorithm and regressing the facial expression weight; the method for continuously learning the database and regressing the facial expression weight by using the random fern regression algorithm specifically comprises the following steps of:
step S31, establishing a second binary tree random fern according to the expressive 3D model and the expressive 3D model of the user of the training data of each individual in the database;
step S32, a new expressive photo of a specific person containing 2D face feature points is given, the 2D face feature points are detected and put into the second binary-tree random fern, and the facial expression weight is regressed;
the facial expression weight calculation module is used for aligning the camera to the face and solving the facial expression weight;
and the expression reconstruction module is used for applying the facial expression weight of the user to the 3D Avatar model to obtain the fitted 3D model and realize expression reconstruction.
CN201711314522.9A 2017-12-09 2017-12-09 Method and system for reconstructing facial expression by using 3D Avatar Active CN107886568B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711314522.9A CN107886568B (en) 2017-12-09 2017-12-09 Method and system for reconstructing facial expression by using 3D Avatar

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711314522.9A CN107886568B (en) 2017-12-09 2017-12-09 Method and system for reconstructing facial expression by using 3D Avatar

Publications (2)

Publication Number Publication Date
CN107886568A CN107886568A (en) 2018-04-06
CN107886568B true CN107886568B (en) 2020-03-03

Family

ID=61773701

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711314522.9A Active CN107886568B (en) 2017-12-09 2017-12-09 Method and system for reconstructing facial expression by using 3D Avatar

Country Status (1)

Country Link
CN (1) CN107886568B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673287B (en) * 2020-05-15 2023-09-12 深圳市光鉴科技有限公司 Depth reconstruction method, system, equipment and medium based on target time node
CN111797756A (en) * 2020-06-30 2020-10-20 平安国际智慧城市科技股份有限公司 Video analysis method, device and medium based on artificial intelligence
CN113221503B (en) * 2020-12-31 2024-05-31 芯和半导体科技(上海)股份有限公司 Passive device modeling simulation engine based on machine learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015192369A1 (en) * 2014-06-20 2015-12-23 Intel Corporation 3d face model reconstruction apparatus and method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101216889A (en) * 2008-01-14 2008-07-09 浙江大学 A face image super-resolution method with the amalgamation of global characteristics and local details information
CN101976453A (en) * 2010-09-26 2011-02-16 浙江大学 GPU-based three-dimensional face expression synthesis method
CN102779189A (en) * 2012-06-30 2012-11-14 北京神州泰岳软件股份有限公司 Method and system for analyzing expressions
CN105469042A (en) * 2015-11-20 2016-04-06 天津汉光祥云信息科技有限公司 Improved face image comparison method
CN106469465A (en) * 2016-08-31 2017-03-01 深圳市唯特视科技有限公司 A kind of three-dimensional facial reconstruction method based on gray scale and depth information
CN106600667A (en) * 2016-12-12 2017-04-26 南京大学 Method for driving face animation with video based on convolution neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Jian, "3D expression reconstruction from video streams fusing SFM and dynamic texture mapping", Journal of Computer-Aided Design & Computer Graphics, 30 June 2010, vol. 22, no. 6, pp. 949-958 *

Also Published As

Publication number Publication date
CN107886568A (en) 2018-04-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant