CN114067057A - Human body reconstruction method, model and device based on attention mechanism - Google Patents

Info

Publication number
CN114067057A
Authority
CN
China
Prior art keywords
human body
parameter
module
attention
smpl
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111382077.6A
Other languages
Chinese (zh)
Inventor
方贤勇
汪楷
汪粼波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Application filed by Anhui University
Priority to CN202111382077.6A
Publication of CN114067057A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the field of computer vision, and particularly relates to a human body reconstruction method, a human body reconstruction model and a human body reconstruction device based on an attention mechanism. The reconstruction method comprises the following steps. Step one: constructing a human body reconstruction network model comprising a feature extraction module, an attention module, a fusion module, a parameter inference module and an SMPL sub-module. Step two: acquiring a plurality of original images containing persons, and preprocessing the original images to form a training data set. Step three: training the human body reconstruction network model on the training data set from step two by minimizing a network loss function. Step four: preprocessing the human body image to be processed, inputting it into the trained network model, and generating a three-dimensional human body model with the specific pose. The method addresses the difficulty that existing methods have in reconstructing a three-dimensional human body model with accurate pose and shape from a single occluded human body image.

Description

Human body reconstruction method, model and device based on attention mechanism
Technical Field
The invention belongs to the field of computer vision, and particularly relates to a human body reconstruction method, a human body reconstruction model and a human body reconstruction device based on an attention mechanism.
Background
Virtual reality is an emerging technology and is widely applied in scenarios such as virtual fitting, body animation, and human motion simulation games. In these applications, three-dimensional modeling of the human body from images is an important step. Existing methods for reconstructing a three-dimensional human body model from an image fall mainly into two types: optimization-based methods and regression-based methods. The former fit a parameterized body model to the two-dimensional observations of a given image through an iterative optimization process, relying mainly on two-dimensional joint locations and contours to carry out the fitting. The latter construct a deep neural network that extracts features from a single input image to obtain information such as human body model parameters, a volumetric representation of the three-dimensional body, or model vertices, and then generate the three-dimensional human body model from this information.
Both types of method reconstruct the model well when the target person in the image is unoccluded or only slightly occluded. In practical applications, however, the target person is very often occluded by other people or objects, which limits the applicability of these methods. In particular, when a deep learning network is used to reconstruct the three-dimensional model, the network cannot effectively distinguish key information from redundant information in the human body image, and it predicts the parameters of the three-dimensional model from all pixel features in the image. The occluder therefore seriously interferes with the prediction and causes obvious errors, so that the pose and shape of the reconstructed three-dimensional model do not match the actual person.
Disclosure of Invention
To solve the problem that existing three-dimensional human body reconstruction methods have difficulty reconstructing a three-dimensional human body model with accurate pose and shape from a single occluded human body image, the invention provides a human body reconstruction method, model and device based on an attention mechanism.
The invention is realized by adopting the following technical scheme:
A human body reconstruction method based on an attention mechanism comprises the following steps:
Step one: constructing a human body reconstruction network model comprising a feature extraction module, an attention module, a fusion module, a parameter inference module and an SMPL sub-module. The feature extraction module generates an original feature map from the input human body image. The attention module comprises two pooling layers, a convolution layer and a Sigmoid operation layer; the two pooling layers are an average pooling layer and a maximum pooling layer, respectively. The attention module generates an attention map from the input original feature map. The fusion module fuses the original feature map with the attention map to obtain a body attention feature map. The parameter inference module comprises a pooling layer and three fully connected layers, and generates the SMPL parameters of the target person in the human body image from the input body attention feature map. The SMPL sub-module generates a three-dimensional human body model of the target person from the SMPL parameters.
Step two: acquiring a plurality of human body images containing the target person as original images, and preprocessing the original images to form a training data set, wherein the original images in the training data set include at least some human body images in which the person is partially occluded.
Step three: training the human body reconstruction network model on the training data set from step two by minimizing a network loss function.
Step four: storing the trained human body reconstruction network model; preprocessing the human body image to be processed, inputting it into the stored network model, and generating a three-dimensional human body model with the specific pose.
As a further improvement of the invention, the feature extraction module is obtained by simplifying and repackaging the deep convolutional neural network ResNet50, retaining only the convolutional part of the original network; the input human body image is processed by the convolutions of the feature extraction module to obtain the original feature map.
As a further improvement of the invention, the attention module takes the output of the feature extraction module as input, the input original feature map respectively passes through an average pooling layer and a maximum pooling layer in the attention module, and the two pooling results are subjected to feature splicing and then sequentially pass through convolution processing and Sigmoid operation to obtain the attention map.
In the attention module, the pooling operation of the average pooling layer is:
$$F_{avg} = \mathrm{AvgPool}(F)$$
The pooling operation of the maximum pooling layer is:
$$F_{max} = \mathrm{MaxPool}(F)$$
In the above formulas, $F$ denotes the original feature map, $F_{avg}$ the feature map after the average pooling operation, $F_{max}$ the feature map after the maximum pooling operation, $\mathrm{MaxPool}(\cdot)$ the maximum pooling operation, and $\mathrm{AvgPool}(\cdot)$ the average pooling operation.
The attention map is generated as:
$$M(F) = \sigma(f(\mathrm{cat}(F_{avg}, F_{max})))$$
In the above formula, $M(F)$ denotes the attention map; $\sigma(\cdot)$ denotes the Sigmoid activation function; $f(\cdot)$ denotes the convolution operation; and $\mathrm{cat}(\cdot)$ denotes the concatenation of feature maps.
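The attention-map formula can be illustrated with a minimal pure-Python sketch. Several details are assumptions that the text does not state: pooling is taken across the channel axis (as in common spatial-attention designs), the convolution $f$ is reduced to a 1×1 kernel with the hypothetical weights `w_avg`, `w_max` and `bias`, and feature maps are nested lists indexed `[channel][row][column]`.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention_map(F, w_avg, w_max, bias=0.0):
    """Compute M(F) = sigmoid(f(cat(F_avg, F_max))) for F given as [C][H][W].

    Assumptions (not from the source): pooling runs over channels and the
    convolution f is a 1x1 kernel with weights (w_avg, w_max) plus a bias.
    """
    C, H, W = len(F), len(F[0]), len(F[0][0])
    M = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            column = [F[c][y][x] for c in range(C)]
            f_avg = sum(column) / C   # average pooling over channels
            f_max = max(column)       # maximum pooling over channels
            # cat(F_avg, F_max) followed by the assumed 1x1 convolution
            M[y][x] = sigmoid(w_avg * f_avg + w_max * f_max + bias)
    return M

# Toy feature map with C=2 channels and a 2x2 spatial grid
F = [[[1.0, 0.0], [0.0, 0.0]],
     [[3.0, 0.0], [0.0, -2.0]]]
M = attention_map(F, w_avg=1.0, w_max=1.0)
```

Every entry of `M` lies strictly between 0 and 1, so the map can serve as a per-location weight. A real implementation would use a learned convolution with a larger kernel.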
As a further improvement of the invention, in the fusion module, the fused body attention feature map is obtained by multiplying the attention map and the original feature map element by element. The fusion operation is:
$$F' = M(F) \otimes F$$
In the above formula, $F'$ denotes the body attention feature map, $M(F)$ the attention map, $\otimes$ element-wise multiplication, and $F$ the original feature map.
As a further improvement of the present invention, the pooling layer in the parameter inference module is an average pooling layer. The first two of the three fully connected layers each have 1024 neurons and are connected through a Dropout operation. The third fully connected layer has 85 neurons and is directly connected to the second fully connected layer. The three fully connected layers form the iterative regression part of the parameter inference module.
As a further improvement of the present invention, in the parameter inference module, the SMPL parameter is generated as follows:
(1) The input body attention feature map $F'$ is average-pooled to obtain a feature $\phi$.
(2) The SMPL pose parameter $\theta$, the shape parameter $\beta$ and the camera parameter $c$ are concatenated:
$$\Theta = \mathrm{cat}(\theta, \beta, c)$$
In the above formula, $\theta$ denotes the pose parameter of the SMPL model; $\beta$ denotes the shape parameter of the SMPL model; $c$ denotes the camera parameter; and $\Theta$ denotes the parameter set formed by concatenating $\theta$, $\beta$ and $c$.
(3) An initial parameter set $\Theta_0$ is formed from the average pose parameter, the average shape parameter and the average camera parameter. The feature $\phi$ is concatenated with $\Theta_0$ and used as the input of the iterative regression part of the parameter inference module.
(4) The iterative regression part generates the residual of the parameter set for the current input, and the current parameter set is then updated as:
$$\Theta_{t+1} = \Theta_t + \Delta\Theta_t$$
In the above formula, $\Theta_t$ denotes the parameter set corresponding to the current input, $\Theta_{t+1}$ the updated parameter set, and $\Delta\Theta_t$ the residual of $\Theta_t$.
(5) The update operation of the previous step is iterated 3 times; in each iteration, the parameter set obtained from the previous update is concatenated with the feature $\phi$ as the input of the iterative regression part, and the parameter set is updated again.
(6) After the iterations are finished, the final SMPL parameters are obtained: the pose parameter $\theta$, the shape parameter $\beta$, and the corresponding camera parameter $c$.
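The iterative update in steps (3) to (5) can be sketched as a short loop. This is only an illustration: `toy_regressor` is a hypothetical stand-in for the three fully connected layers, and the two-element vectors are toy values rather than real SMPL parameters.

```python
def iterative_regression(phi, theta0, regressor, n_iter=3):
    """Apply Theta_{t+1} = Theta_t + delta(cat(phi, Theta_t)) for n_iter steps."""
    theta = list(theta0)
    for _ in range(n_iter):
        x = list(phi) + theta      # cat(phi, Theta_t): regressor input
        delta = regressor(x)       # residual predicted for the current set
        theta = [t + d for t, d in zip(theta, delta)]
    return theta

def toy_regressor(x, n_phi=2):
    # Hypothetical residual predictor: pulls every parameter halfway
    # toward the mean of the feature vector phi.
    target = sum(x[:n_phi]) / n_phi
    return [0.5 * (target - t) for t in x[n_phi:]]

phi = [1.0, 3.0]       # pooled feature (toy values)
theta0 = [0.0, 4.0]    # initial (mean) parameter set (toy values)
theta3 = iterative_regression(phi, theta0, toy_regressor)
# theta3 == [1.75, 2.25]: each step halves the distance to the target 2.0
```

The regressor only ever predicts a correction to the current estimate, which is why the loop starts from the mean parameters instead of from zero.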
As a further improvement of the invention, in the SMPL sub-module, the SMPL parameters are input into the SMPL function to obtain the three-dimensional human body model. The SMPL function is:
$$M(\beta, \theta) = W\left(\bar{T} + B_S(\beta) + B_P(\theta),\ J(\beta),\ \theta,\ \mathcal{W}\right)$$
In the above formula, $\bar{T}$ denotes the vertex coordinates of the three-dimensional human body model in the T pose; $B_P(\theta)$ and $B_S(\beta)$ denote the offsets from the vertices of the SMPL standard template caused by the pose parameter $\theta$ and the shape parameter $\beta$, respectively; $J(\beta)$ denotes the joint positions of the model corresponding to the shape parameter $\beta$; $W(\cdot)$ is the linear blend skinning function; and $\mathcal{W}$ denotes the blend weights.
As a further improvement of the present invention, the pre-processing process of the original image comprises:
(1) The target person in the human body image is located, and the image is cropped so that the target person lies in the central area of the image.
(2) The cropped human body image is resized so that every image has a uniform size of 224 × 224 pixels.
(3) The resized image is normalized to obtain a data element of the training data set.
As a further improvement of the invention, in the training process of the network model, all parameters in the network model are adjusted by adopting an Adam algorithm under the condition of minimizing a loss function, so as to train the network.
The loss function to be minimized is:
$$L = \lambda_{2D} L_{2Djoint} + \lambda_{3D} L_{3Djoint} + \lambda_{para} L_{SMPL}$$
In the above formula, $L_{2Djoint}$ denotes the 2D joint loss function; $L_{3Djoint}$ denotes the 3D joint loss function; $L_{SMPL}$ denotes the SMPL parameter loss function; $\lambda_{2D}$ denotes the weight coefficient of the 2D joint loss function; $\lambda_{3D}$ denotes the weight coefficient of the 3D joint loss function; and $\lambda_{para}$ denotes the weight coefficient of the SMPL parameter loss function.
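The weighted sum of the three loss terms can be written as a one-line helper. The default weights below are placeholders, since the text gives no values for the weight coefficients.

```python
def total_loss(l_2d, l_3d, l_smpl, lam_2d=1.0, lam_3d=1.0, lam_para=1.0):
    """Weighted sum of the 2D joint, 3D joint and SMPL parameter losses.

    The default weights are hypothetical; the source does not specify them.
    """
    return lam_2d * l_2d + lam_3d * l_3d + lam_para * l_smpl

# Toy example: the per-term losses and the weights are illustrative values only
loss = total_loss(2.0, 3.0, 4.0, lam_2d=0.5, lam_3d=1.0, lam_para=0.25)
# loss == 5.0
```

Setting a weight to zero removes the corresponding term, which is convenient when a training image lacks that kind of annotation.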
The 2D joint loss function $L_{2Djoint}$ is expressed as:
[Formula image BDA0003365961190000041]
In the above formula, $v_i$ denotes the visibility of the $i$-th 2D joint point and takes the value 0 (invisible) or 1 (visible); $N$ denotes the number of 2D joint points; $\hat{k}_i$ denotes the predicted value of the $i$-th 2D joint point; and $k_i$ denotes the true value of the $i$-th 2D joint point. The predicted 2D joint points $\hat{k}_i$ are obtained by projecting the predicted 3D joint points.
The 3D joint loss function $L_{3Djoint}$ is expressed as:
[Formula image BDA0003365961190000044]
In the above formula, $M$ denotes the number of images participating in the 3D joint point calculation; $\hat{J}_i$ denotes the predicted 3D joint points of the $i$-th image; and $J_i$ denotes the true 3D joint points of the $i$-th image.
The SMPL parameter loss function $L_{SMPL}$ is expressed as:
[Formula image BDA0003365961190000046]
In the above formula, $O$ denotes the number of images participating in the SMPL parameter calculation; $\hat{\theta}_i$ and $\hat{\beta}_i$ denote the predicted pose parameter and shape parameter of the $i$-th image, respectively; and $\theta_i$ and $\beta_i$ denote the true pose parameter and shape parameter of the $i$-th image, respectively.
The invention also comprises a human body reconstruction model, which is used by the above attention-based human body reconstruction method to process an occluded input human body image and generate a three-dimensional human body model of the target person in the image. The human body reconstruction model comprises a preprocessing module, a feature extraction module, an attention module, a fusion module, a parameter inference module and an SMPL sub-module.
The preprocessing module is used for: (1) locating the target person in the human body image and cropping the image so that the target person lies in the central area of the image; (2) resizing the cropped human body image to a uniform size of 224 × 224 pixels; and (3) normalizing the resized image.
The feature extraction module uses the convolutional part of the deep convolutional neural network ResNet50 as its backbone network. The output of the preprocessing module is the input of the feature extraction module, which extracts features from the preprocessed human body image through convolution operations and generates the corresponding original feature map.
The attention module comprises a maximum pooling sub-module, an average pooling sub-module, a feature concatenation sub-module, a convolution sub-module and a Sigmoid operation sub-module. The output of the feature extraction module is the input of the attention module. In the attention module, the original feature map is processed by the maximum pooling sub-module and the average pooling sub-module to obtain two feature maps, which are concatenated in the feature concatenation sub-module; the attention map is then obtained after convolution in the convolution sub-module and the Sigmoid operation in the Sigmoid operation sub-module.
The fusion module takes the original feature map output by the feature extraction module and the attention map output by the attention module, and multiplies them element by element to obtain the fused body attention feature map.
The parameter inference module comprises an average pooling layer, fully connected layer I, fully connected layer II and fully connected layer III. Fully connected layers I and II each have 1024 neurons and are connected through a Dropout operation; fully connected layer III has 85 neurons and is directly connected to fully connected layer II. Fully connected layers I, II and III form the iterative regression part of the network model. The output of the fusion module is the input of the parameter inference module, which generates the iteratively updated SMPL parameters for each input.
The SMPL sub-module generates the three-dimensional human body model of the target person in the human body image from the SMPL parameters output by the parameter inference module.
The technical scheme provided by the invention has the following beneficial effects:
In the three-dimensional human body reconstruction method based on the attention mechanism, the introduced attention mechanism processes the features of the human body image so that the network focuses on features containing important information and pays less attention to unimportant information. The original feature map is weighted by the attention map generated by the attention mechanism, so that the network concentrates on information related to the human body parts in the image, pays less attention to other information, and is less disturbed by the occluder. Meanwhile, the network model can infer the state of occluded body parts from the features of the visible body parts, which keeps the extracted feature information complete. As a result, the pose and shape of the finally constructed three-dimensional human body model match the actual person more closely.
Drawings
Fig. 1 is a flowchart illustrating steps of a human body reconstruction method based on an attention mechanism in embodiment 1 of the present invention.
Fig. 2 is a block diagram of an attention module in embodiment 1 of the present invention.
Fig. 3 is a flowchart of steps of a process of generating SMPL parameters of a target person in the parameter inference module according to embodiment 1 of the present invention.
Fig. 4 is a flowchart of steps of a three-dimensional human body model reconstruction process in embodiment 1 of the present invention.
Fig. 5 is a schematic block diagram of a human body reconstruction model provided in embodiment 2 of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Example 1
The present embodiment provides a human body reconstruction method based on an attention mechanism, as shown in fig. 1, the human body reconstruction method includes the following steps:
S1: Construct a human body reconstruction network model comprising a feature extraction module, an attention module, a fusion module, a parameter inference module and an SMPL sub-module. This specifically comprises the following steps:
S11: Simplify the deep convolutional neural network ResNet50, retaining only the convolutional part of the original network, and then repackage the simplified network to obtain the required feature extraction module. The feature extraction module generates the original feature map from the input human body image.
S12: Construct an attention module comprising two pooling layers, a convolution layer and a Sigmoid operation layer. The two pooling layers in the attention module are an average pooling layer and a maximum pooling layer, respectively. The attention module generates an attention map from the input original feature map. As shown in Fig. 2, the attention module operates as follows:
The input original feature map first passes through the average pooling layer and the maximum pooling layer of the attention module; the two pooling results are concatenated and then passed through the convolution and the Sigmoid operation in turn to obtain the attention map.
In the attention module, the pooling operation of the average pooling layer is:
$$F_{avg} = \mathrm{AvgPool}(F)$$
The pooling operation of the maximum pooling layer is:
$$F_{max} = \mathrm{MaxPool}(F)$$
In the above formulas, $F$ denotes the original feature map, $F_{avg}$ the feature map after the average pooling operation, $F_{max}$ the feature map after the maximum pooling operation, $\mathrm{MaxPool}(\cdot)$ the maximum pooling operation, and $\mathrm{AvgPool}(\cdot)$ the average pooling operation.
The attention map is generated as:
$$M(F) = \sigma(f(\mathrm{cat}(F_{avg}, F_{max})))$$
In the above formula, $M(F)$ denotes the attention map; $\sigma(\cdot)$ denotes the Sigmoid activation function; $f(\cdot)$ denotes the convolution operation; and $\mathrm{cat}(\cdot)$ denotes the concatenation of feature maps.
S13: Construct a fusion module that fuses the original feature map with the attention map to obtain a body attention feature map. The features are fused by multiplying the attention map and the original feature map element by element.
The fusion operation is:
$$F' = M(F) \otimes F$$
In the above formula, $F'$ denotes the body attention feature map, $M(F)$ the attention map, $\otimes$ element-wise multiplication, and $F$ the original feature map.
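The fusion step can be sketched in a few lines of pure Python. The sketch assumes that the attention map has a single channel of shape `[H][W]` and is broadcast across all channels of a feature map stored as `[C][H][W]`; the text does not spell out the shapes, so this layout is an assumption.

```python
def fuse(F, M):
    """Element-wise fusion F' = M(F) * F.

    F is [C][H][W]; M is [H][W] and is reused for every channel
    (an assumed broadcasting convention, not stated in the source).
    """
    return [[[F[c][y][x] * M[y][x] for x in range(len(M[0]))]
             for y in range(len(M))]
            for c in range(len(F))]

F = [[[2.0, 4.0]], [[-1.0, 8.0]]]   # C=2, H=1, W=2 (toy values)
M = [[0.5, 0.25]]                   # attention weights in (0, 1)
F_prime = fuse(F, M)
# F_prime == [[[1.0, 1.0]], [[-0.5, 2.0]]]
```

Locations with weights close to 0 are suppressed in every channel, which is how the attention map reduces the influence of occluder features.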
S14: Construct a parameter inference module comprising a pooling layer and three fully connected layers. The parameter inference module generates the SMPL parameters of the target person in the human body image from the input body attention feature map and the original feature map.
The pooling layer in the parameter inference module is an average pooling layer. The first two of the three fully connected layers each have 1024 neurons and are connected through a Dropout operation. The third fully connected layer has 85 neurons and is directly connected to the second fully connected layer. The three fully connected layers form the iterative regression part of the parameter inference module.
Specifically, in this embodiment, as shown in fig. 3, the SMPL parameter of the target person is generated as follows:
(1) The input body attention feature map $F'$ is average-pooled to obtain a feature $\phi$.
(2) The SMPL pose parameter $\theta$, the shape parameter $\beta$ and the camera parameter $c$ are concatenated:
$$\Theta = \mathrm{cat}(\theta, \beta, c)$$
In the above formula, $\theta$ denotes the pose parameter of the SMPL model; $\beta$ denotes the shape parameter of the SMPL model; $c$ denotes the camera parameter; and $\Theta$ denotes the parameter set formed by concatenating $\theta$, $\beta$ and $c$.
(3) An initial parameter set $\Theta_0$ is formed from the average pose parameter, the average shape parameter and the average camera parameter. The feature $\phi$ is concatenated with $\Theta_0$ and used as the input of the iterative regression part of the parameter inference module.
(4) The iterative regression part generates the residual of the parameter set for the current input, and the current parameter set is then updated as:
$$\Theta_{t+1} = \Theta_t + \Delta\Theta_t$$
In the above formula, $\Theta_t$ denotes the parameter set corresponding to the current input, $\Theta_{t+1}$ the updated parameter set, and $\Delta\Theta_t$ the residual of $\Theta_t$.
(5) The update operation of the previous step is iterated 3 times; in each iteration, the parameter set obtained from the previous update is concatenated with the feature $\phi$ as the input of the iterative regression part, and the parameter set is updated again.
(6) After the iterations are finished, the final SMPL parameters are obtained: the pose parameter $\theta$, the shape parameter $\beta$, and the corresponding camera parameter $c$.
S15: The parameter inference module is followed by an SMPL sub-module. SMPL (Skinned Multi-Person Linear Model) in this embodiment is a vertex-based three-dimensional model of the naked human body that can accurately represent different body shapes (shape) and poses (pose).
In the SMPL sub-module, the SMPL parameters are input into the SMPL function to obtain the three-dimensional human body model. The SMPL function is:
$$M(\beta, \theta) = W\left(\bar{T} + B_S(\beta) + B_P(\theta),\ J(\beta),\ \theta,\ \mathcal{W}\right)$$
In the above formula, $\bar{T}$ denotes the vertex coordinates of the three-dimensional human body model in the T pose; $B_P(\theta)$ and $B_S(\beta)$ denote the offsets from the vertices of the SMPL standard template caused by the pose parameter $\theta$ and the shape parameter $\beta$, respectively; $J(\beta)$ denotes the joint positions of the model corresponding to the shape parameter $\beta$; $W(\cdot)$ is the linear blend skinning function; and $\mathcal{W}$ denotes the blend weights.
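The role of the two offset terms can be illustrated on a toy mesh. The sketch below only adds the shape and pose offsets to the T-pose template. It assumes that the shape offset is a linear combination of per-vertex direction vectors weighted by the shape parameters, takes the pose offsets as precomputed inputs, and leaves out the skinning function $W$ entirely. All names and values are hypothetical.

```python
def offset_template(template, shape_dirs, betas, pose_offsets):
    """Return T_bar + B_S(beta) + B_P(theta) for every vertex.

    template:     list of (x, y, z) T-pose vertices
    shape_dirs:   shape_dirs[n][v] is the (x, y, z) direction of shape
                  component n at vertex v (assumed linear shape model)
    betas:        shape coefficients, one per component
    pose_offsets: precomputed (x, y, z) pose offsets per vertex
    """
    vertices = []
    for v, t in enumerate(template):
        vert = []
        for k in range(3):
            b_s = sum(b * shape_dirs[n][v][k] for n, b in enumerate(betas))
            vert.append(t[k] + b_s + pose_offsets[v][k])
        vertices.append(tuple(vert))
    return vertices

template = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]        # two toy vertices
shape_dirs = [[(0.0, 1.0, 0.0), (0.0, 0.0, 0.0)]]    # one shape component
betas = [2.0]
pose_offsets = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.5)]
verts = offset_template(template, shape_dirs, betas, pose_offsets)
# verts == [(0.0, 2.0, 0.0), (1.0, 0.0, 0.5)]
```

In a full model, these offset vertices would then be posed by linear blend skinning around the joints $J(\beta)$.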
S2: Acquire a plurality of human body images containing the target person as original images, and preprocess them to form a training data set, wherein the original images in the training data set include at least some human body images in which the person is partially occluded.
In this embodiment, the preprocessing of the original image comprises:
(1) The target person in the human body image is located, and the image is cropped so that the target person lies in the central area of the image.
(2) The cropped human body image is resized so that every image has a uniform size of 224 × 224 pixels.
(3) The resized image is normalized to obtain a data element of the training data set.
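The geometry of the crop and resize steps can be sketched without an image library. The sketch below computes a square crop box centred on a person bounding box, the scale factor needed to reach 224 pixels, and a per-pixel normalization. The bounding-box input, the `scale` margin, and the mean and standard deviation values (the widely used ImageNet statistics) are all assumptions; the text does not say how the person is located or which normalization is applied.

```python
def square_crop_box(bbox, scale=1.2):
    """Square box centred on bbox = (x_min, y_min, x_max, y_max).

    scale enlarges the box to leave a margin (hypothetical default).
    """
    x0, y0, x1, y1 = bbox
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    half = max(x1 - x0, y1 - y0) * scale / 2.0
    return (cx - half, cy - half, cx + half, cy + half)

def normalize_pixel(rgb, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """Map 8-bit RGB values to zero-centred floats (assumed statistics)."""
    return tuple((v / 255.0 - m) / s for v, m, s in zip(rgb, mean, std))

# Person detected in a 100 x 200 pixel box; crop a square and scale it to 224
box = square_crop_box((100, 50, 200, 250), scale=1.0)
resize_factor = 224 / (box[2] - box[0])
# box == (50.0, 50.0, 250.0, 250.0); resize_factor is 224 / 200
```

Using the longer side of the bounding box keeps the whole person inside the crop, at the cost of including some background on the shorter axis.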
S3: Train the human body reconstruction network model on the training data set from step S2 by minimizing a network loss function.
During training, all parameters of the network model are adjusted with the Adam algorithm so as to minimize the loss function.
The loss function is expressed as:
$$L = \lambda_{2D} L_{2Djoint} + \lambda_{3D} L_{3Djoint} + \lambda_{para} L_{SMPL}$$
In the above formula, $L_{2Djoint}$ denotes the 2D joint loss function; $L_{3Djoint}$ denotes the 3D joint loss function; $L_{SMPL}$ denotes the SMPL parameter loss function; $\lambda_{2D}$ denotes the weight coefficient of the 2D joint loss function; $\lambda_{3D}$ denotes the weight coefficient of the 3D joint loss function; and $\lambda_{para}$ denotes the weight coefficient of the SMPL parameter loss function.
The 2D joint loss function L_2Djoint is expressed as:
Figure BDA0003365961190000084
In the above formula, v_i represents the visibility of the i-th 2D joint point and takes the value 0 or 1, where 0 means invisible and 1 means visible; N represents the number of 2D joint points;
Figure BDA0003365961190000085
represents the predicted value of the i-th 2D joint point; k_i represents the ground-truth value of the i-th 2D joint point. The predicted values of the 2D joint points,
Figure BDA0003365961190000086
are obtained by orthographic projection of the predicted 3D joint points.
Specifically, in this embodiment, the projection formula is:
Figure BDA0003365961190000091
In the above formula,
Figure BDA0003365961190000092
represents the predicted 3D joint points,
Figure BDA0003365961190000093
represents the predicted 2D joint points corresponding to
Figure BDA0003365961190000094
and the function applied to the predicted 3D joint points is the projection function based on the camera parameters c.
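The projection and the visibility-weighted 2D joint loss can be sketched as below. The exact formulas are given only in figures that are not reproduced in this text, so the sketch makes two loudly labeled assumptions: the camera parameters are taken to be a weak-perspective triple c = (s, t_x, t_y) of scale and 2D translation, and the per-joint distance is taken to be an L1 distance. The function names are hypothetical.

```python
def project_orthographic(joints_3d, camera):
    """Project 3D joints to 2D by dropping depth, then scaling and shifting.

    camera = (s, tx, ty) is an ASSUMED weak-perspective parameterisation.
    """
    s, tx, ty = camera
    return [(s * x + tx, s * y + ty) for x, y, _z in joints_3d]


def joint_2d_loss(pred_2d, gt_2d, visibility):
    """Sum of per-joint distances, counted only for visible joints (v_i = 1).

    The L1 distance used here is an assumption; the patent's norm is in a figure.
    """
    total = 0.0
    for (px, py), (gx, gy), v in zip(pred_2d, gt_2d, visibility):
        total += v * (abs(px - gx) + abs(py - gy))
    return total


pred = project_orthographic([(1.0, 2.0, 5.0), (0.0, -1.0, 3.0)], (2.0, 0.5, -0.5))
loss = joint_2d_loss(pred, [(2.0, 3.0), (10.0, 10.0)], [1, 0])
```

Because the second joint is marked invisible, its large error does not contribute to the loss, which is how occluded joints without reliable 2D labels are kept from penalizing the network.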
The 3D joint loss function L_3Djoint is expressed as:
Figure BDA0003365961190000095
In the above formula, M represents the number of images participating in the calculation of the 3D joint points;
Figure BDA0003365961190000096
represents the predicted 3D joint points of the i-th image; J_i represents the ground-truth 3D joint points of the i-th image.
The SMPL parameter loss function L_SMPL is expressed as:
Figure BDA0003365961190000097
In the above formula, O represents the number of images participating in the calculation of the SMPL parameters,
Figure BDA0003365961190000098
and
Figure BDA0003365961190000099
represent the predicted pose parameter and shape parameter of the i-th image, respectively, and θ_i and β_i represent the ground-truth pose parameter and shape parameter of the i-th image, respectively.
S4: Save the trained human body reconstruction network model; preprocess the human body image to be processed, input it into the saved network model, and generate a three-dimensional human body model with the specific pose.
The processing procedure of the human body reconstruction network model is shown in fig. 4. The preprocessed human body image is first convolved by the feature extraction module, which extracts the features related to the target person in the image and generates the original feature map. The original feature map is then passed along two paths: one into the attention module, and the other toward the parameter inference module by way of the fusion module. In the attention module, the original feature map undergoes average pooling and maximum pooling to obtain an average-pooled feature map and a max-pooled feature map; the two pooled feature maps are concatenated and then passed through a convolution and a Sigmoid operation in sequence to obtain the attention map. The fusion module receives both the attention map output by the attention module and the original feature map output by the feature extraction module, and fuses them to obtain the body attention feature map. The body attention feature map is input into the parameter inference module, which generates the corresponding SMPL parameters from it and updates them iteratively. The SMPL parameters include pose parameters and shape parameters. Finally, the iteratively updated SMPL parameters are input into the SMPL submodule to generate the three-dimensional human body model of the target person.
By introducing an attention mechanism, the method provided by the invention processes the features in the human body image so that the network focuses on features that carry important information and pays less attention to unimportant information. Weighting the original feature map with the attention map makes the network concentrate on information related to the human body parts in the image and reduces the interference of occluder information on the network. At the same time, the network model can use the features of the visible body parts to infer the state of the occluded body parts. This ensures that the pose and shape of the finally constructed three-dimensional human body model are closer to reality.
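The attention and fusion steps described above can be illustrated with a small pure-Python sketch. It is a simplified illustration under assumptions, not the patented network: the pooling is assumed to be taken across the channel dimension at each spatial position, and the convolution is replaced by a 1×1 convolution with two hypothetical weights (`w_avg`, `w_max`) and a bias, since the kernel size is not stated here. Feature maps are nested lists indexed as [channel][row][column].

```python
import math


def spatial_attention(feature, w_avg=1.0, w_max=1.0, bias=0.0):
    """Return an attention map [H][W] with values in (0, 1).

    Steps: channel-wise average and max pooling, concatenation of the two
    pooled maps, an ASSUMED 1x1 convolution (w_avg, w_max, bias), Sigmoid.
    """
    channels = len(feature)
    height, width = len(feature[0]), len(feature[0][0])
    attention = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [feature[c][y][x] for c in range(channels)]
            f_avg = sum(values) / channels
            f_max = max(values)
            z = w_avg * f_avg + w_max * f_max + bias  # 1x1 conv on the 2 pooled maps
            row.append(1.0 / (1.0 + math.exp(-z)))    # Sigmoid
        attention.append(row)
    return attention


def fuse(feature, attention):
    """Element-wise product of every channel with the attention map."""
    return [[[v * attention[y][x] for x, v in enumerate(row)]
             for y, row in enumerate(channel)]
            for channel in feature]


feature_map = [[[1.0, -2.0]], [[3.0, 0.0]]]   # C=2, H=1, W=2
att = spatial_attention(feature_map)
fused = fuse(feature_map, att)
```

Positions with strong responses receive weights close to 1 and keep their features, while weak positions are scaled down, which is the weighting effect the paragraph above describes.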
Example 2
The present embodiment provides a human body reconstruction model, which uses the attention-based human body reconstruction method of embodiment 1 to process an input occluded human body image and generate a three-dimensional human body model of the target person in the image. As shown in fig. 5, the human body reconstruction model includes a preprocessing module, a feature extraction module, an attention module, a fusion module, a parameter inference module and an SMPL submodule.
The preprocessing module is used for: (1) locating the target person in the human body image and cropping the image so that the target person lies in the central area of the image; (2) resizing the cropped human body image to a uniform size of 224 × 224 pixels; (3) normalizing the resized image.
The feature extraction module uses the convolutional part of the deep convolutional neural network ResNet50 as its backbone network. The output of the preprocessing module serves as the input of the feature extraction module; the feature extraction module extracts the features in the preprocessed human body image through convolution operations and generates the corresponding original feature map.
The attention module comprises a maximum pooling submodule, an average pooling submodule, a feature concatenation submodule, a convolution submodule and a Sigmoid operation submodule. The output of the feature extraction module serves as the input of the attention module; in the attention module, the original feature map is processed by the maximum pooling submodule and the average pooling submodule to obtain two feature maps, which are concatenated in the feature concatenation submodule; the attention map is then obtained after convolution in the convolution submodule and the Sigmoid operation in the Sigmoid operation submodule. The detailed generation process of the attention map has been described in embodiment 1 and is not repeated here.
The fusion module takes the original feature map output by the feature extraction module and the attention map output by the attention module, and multiplies them element-wise to obtain the fused body attention feature map. The fusion operation is:
F' = M(F) ⊗ F
In the above formula, F' represents the body attention feature map; M(F) represents the attention map; ⊗ represents element-wise multiplication; F represents the original feature map.
The parameter inference module comprises an average pooling layer, a fully connected layer I, a fully connected layer II and a fully connected layer III. Fully connected layers I and II each have 1024 neurons and are connected through a Dropout operation; fully connected layer III has 85 neurons and is connected directly to fully connected layer II. Fully connected layers I, II and III form the iterative regression part of the network model. The output of the fusion module serves as the input of the parameter inference module, which generates the iteratively updated SMPL parameters for each input.
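The iterative regression can be sketched as a residual update loop. The sketch follows the update rule Θ_{t+1} = Θ_t + ΔΘ_t and the three iterations stated in claim 6, but several details are assumptions made only for illustration: the 85 outputs are assumed to split into 72 pose, 10 shape and 3 camera values (the usual SMPL layout), the pooled feature is assumed to have 2048 dimensions, and `toy_regressor` is a hypothetical stand-in for the three fully connected layers.

```python
POSE_DIM, SHAPE_DIM, CAM_DIM = 72, 10, 3      # ASSUMED split of the 85 outputs
PARAM_DIM = POSE_DIM + SHAPE_DIM + CAM_DIM    # 85


def iterative_regression(phi, theta_init, regressor, num_iters=3):
    """Refine the parameter set by repeatedly adding a predicted residual."""
    theta = list(theta_init)
    for _ in range(num_iters):
        delta = regressor(phi + theta)                  # input: cat(phi, theta_t)
        theta = [t + d for t, d in zip(theta, delta)]   # theta_{t+1} = theta_t + delta
    return theta


def split_parameters(theta):
    """Split the parameter set into pose, shape and camera parts."""
    pose = theta[:POSE_DIM]
    shape = theta[POSE_DIM:POSE_DIM + SHAPE_DIM]
    camera = theta[POSE_DIM + SHAPE_DIM:]
    return pose, shape, camera


def toy_regressor(x):
    """Hypothetical stand-in for the FC layers: ignores the image feature and
    moves every parameter halfway toward 1.0."""
    params = x[-PARAM_DIM:]
    return [0.5 * (1.0 - p) for p in params]


phi = [0.0] * 2048            # pooled feature (dimension is an assumption)
theta_0 = [0.0] * PARAM_DIM   # stands in for the mean parameters
theta_final = iterative_regression(phi, theta_0, toy_regressor)
pose, shape, camera = split_parameters(theta_final)
```

With the toy regressor each parameter moves 0 → 0.5 → 0.75 → 0.875 over the three iterations, which shows how each pass refines the previous estimate instead of predicting the parameters in a single step.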
The SMPL submodule generates the three-dimensional human body model of the target person in the human body image from the SMPL parameters output by the parameter inference module.
In other embodiments, the preprocessing module and the SMPL submodule may or may not be part of the human body reconstruction model. When the preprocessing module is not part of the human body reconstruction model, each human body image can be processed manually before it is input into the model so that it better meets the input requirements. Manual processing can also place the target person at the center of the image and reduce the proportion of the image occupied by occluding objects, which yields a more accurate three-dimensional human body model.
When the SMPL submodule is not part of the human body reconstruction model, an existing SMPL model can be called through a module-calling program: the SMPL parameters generated by the human body reconstruction model are input into the SMPL model, and the three-dimensional human body model it generates is retrieved. This approach simplifies the structure and scale of the human body reconstruction model, saves computing power and lowers the hardware requirements. Distributed operation can also be used under this architecture to increase the rate at which three-dimensional human body models are generated.
To verify the performance of the human body reconstruction model provided by this embodiment, the processing procedure of the model was also simulated. The simulation environment used an Intel(R) Xeon(R) CPU E5-2609V [email protected], 16 GB of memory, the Ubuntu 18.04 operating system and a GTX 1080 Ti graphics card; the programming environment was PyCharm and the deep learning framework was PyTorch 1.1.0. The data sets were the 2D data sets Leeds Sports Pose (LSP) and MPII and the 3D data sets 3DPW and Human3.6M.
The simulation shows that the human body reconstruction model provided by this embodiment retains good three-dimensional reconstruction performance on a variety of occluded human body images, and that the pose and shape of the reconstructed three-dimensional human body model are consistent with reality. The model therefore has good practical value and is suitable for scenarios that depend on three-dimensional human body modeling, such as virtual fitting, body animation and human motion simulation games.
Example 3
The present embodiment provides an attention-based human body reconstruction apparatus, which is a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of the attention-based human body reconstruction method according to embodiment 1.
The computer device may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack server, a blade server, a tower server or a cabinet server (including an independent server or a server cluster composed of multiple servers), or any other device capable of executing programs. The computer device of this embodiment includes at least, but is not limited to, a memory and a processor that are communicatively coupled to each other via a system bus.
In this embodiment, the memory (i.e., the readable storage medium) includes a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory may be an internal storage unit of the computer device, such as a hard disk or a memory of the computer device. In other embodiments, the memory may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. provided on the computer device. Of course, the memory may also include both internal and external storage devices for the computer device. In this embodiment, the memory is generally used for storing an operating system, various types of application software, and the like installed in the computer device. In addition, the memory may also be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor is typically used to control the overall operation of the computer device. In this embodiment, the processor runs the program code stored in the memory, or processes data, to carry out the attention-based human body reconstruction method of the foregoing embodiment, so as to construct the three-dimensional human body model of the target person from a single human body image of that person.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A human body reconstruction method based on an attention mechanism is characterized by comprising the following steps:
step one, constructing a human body reconstruction network model, wherein the human body reconstruction network model comprises a feature extraction module, an attention module, a fusion module, a parameter inference module and an SMPL submodule; the feature extraction module is used for generating a corresponding original feature map from the input human body image; the attention module comprises two pooling layers, a convolutional layer and a Sigmoid operation layer; the two pooling layers are an average pooling layer and a maximum pooling layer, respectively; the attention module is used for generating an attention map from the input original feature map; the fusion module is used for fusing the original feature map and the attention map to obtain a body attention feature map; the parameter inference module comprises a pooling layer and three fully connected layers; the parameter inference module is used for generating SMPL parameters of the corresponding target person in the human body image from the input body attention feature map; the SMPL submodule is used for generating a three-dimensional human body model of the target person from the SMPL parameters;
step two, acquiring a plurality of human body images containing the target person as original images, and preprocessing the original images to form a training data set, wherein the original images in the training data set include at least human body images in which part of the person is occluded;
step three, training the human body reconstruction network model on the training data set from the previous step by minimizing the network loss function;
step four, saving the trained human body reconstruction network model; preprocessing the human body image to be processed, inputting it into the saved network model, and generating a three-dimensional human body model with the specific pose.
2. The attention mechanism-based human body reconstruction method of claim 1, wherein: the feature extraction module is obtained by simplifying and repackaging a deep convolutional neural network Resnet50, and the simplification process only reserves the convolutional part in the original network model; and after the input human body image is subjected to convolution processing of the feature extraction module, the original feature map is obtained.
3. The attention mechanism-based human body reconstruction method of claim 1, wherein: the attention module takes the output of the feature extraction module as input, the input original feature map respectively passes through an average pooling layer and a maximum pooling layer in the attention module, and the two pooling results are subjected to feature splicing and then sequentially subjected to convolution processing and Sigmoid operation to obtain the attention map;
in the attention module, the pooling operation formula of the average pooling layer is:
F_avg = AvgPool(F);
the pooling operation formula of the maximum pooling layer is:
F_max = MaxPool(F);
in the above formulas, F represents the original feature map, F_avg represents the feature map after the average pooling operation, F_max represents the feature map after the maximum pooling operation, MaxPool(·) represents the maximum pooling operation, and AvgPool(·) represents the average pooling operation;
the attention map is generated by the formula:
M(F) = σ(f(cat(F_avg, F_max)));
in the above formula, M(F) represents the attention map; σ(·) represents the Sigmoid activation function; f(·) represents a convolution operation; cat(·) represents the concatenation of feature maps.
4. The attention mechanism-based human body reconstruction method of claim 1, wherein: in the fusion module, the fused body attention feature map is obtained by multiplying the attention map and the original feature map element-wise; wherein the fusion operation is:
F' = M(F) ⊗ F
in the above formula, F' represents the body attention feature map; M(F) represents the attention map; ⊗ represents element-wise multiplication; F represents the original feature map.
5. The attention mechanism-based human body reconstruction method of claim 1, wherein: the pooling layer in the parameter inference module is an average pooling layer; the first two of the three fully connected layers each have 1024 neurons and are connected through a Dropout operation; the third fully connected layer has 85 neurons and is directly connected to the preceding fully connected layer; wherein the three fully connected layers constitute an iterative regression part of the parameter inference module.
6. The attention mechanism-based human body reconstruction method of claim 5, wherein: in the parameter inference module, the SMPL parameter is generated as follows:
(1) obtaining a feature φ by average pooling of the input body attention feature map F';
(2) concatenating the SMPL pose parameter θ, shape parameter β and camera parameter c, formulated as:
Θ = cat(θ, β, c);
in the above formula, θ represents the pose parameter of the SMPL model; β represents the shape parameter of the SMPL model; c represents the camera parameter; Θ represents the parameter set formed by the pose parameter θ, the shape parameter β and the camera parameter c;
(3) forming an initial parameter set Θ_0 from the mean pose parameter, the mean shape parameter and the mean camera parameter, and concatenating the feature φ with the parameter set Θ_0 as the input of the iterative regression part of the parameter inference module;
(4) generating the residual of the parameter set corresponding to the current input, and then updating the current parameter set by the formula:
Θ_{t+1} = Θ_t + ΔΘ_t
in the above formula, Θ_t represents the parameter set corresponding to the current input, Θ_{t+1} represents the updated state of the parameter set Θ_t, and ΔΘ_t represents the residual of the parameter set Θ_t;
(5) iterating the update operation of the previous step 3 times; in each iteration, the parameter set obtained from the last update is concatenated with the feature φ as the current input of the iterative regression part of the parameter inference module, and the parameter set is updated;
(6) after the iterations are finished, obtaining the SMPL parameters comprising the final pose parameter θ and shape parameter β, together with the corresponding camera parameter c.
7. The attention mechanism-based human body reconstruction method of claim 1, wherein: in the SMPL sub-module, inputting SMPL parameters into an SMPL function, and mapping the morphological parameters and the posture parameters into vertexes of a model by the SMPL function to obtain the three-dimensional human body model; the expression of the SMPL function is:
Figure FDA0003365961180000031
in the above formula,
Figure FDA0003365961180000032
denotes the vertex coordinates of the three-dimensional human body model in the T-pose; B_P(θ) and B_S(β) denote the offsets from the vertices of the SMPL standard template caused by the pose parameter θ and the shape parameter β, respectively; J(β) is the position of the joint points of the model corresponding to the shape parameter β; W(·) is the linear blend skinning function;
Figure FDA0003365961180000036
is the blend weight.
8. The attention mechanism-based human body reconstruction method of claim 1, wherein: the preprocessing process of the original image comprises the following steps:
positioning a target person in the human body image, and performing cutting operation on the image to enable the target person to be located in the central area of the human body image;
resizing the cropped human body image to a uniform size of 224 × 224 pixels;
and carrying out normalization processing on the adjusted image to obtain data elements in the training data set.
9. The attention mechanism-based human body reconstruction method of claim 1, wherein: in the training process of the network model, all parameters in the network model are adjusted by adopting an Adam algorithm under the condition of minimizing a loss function, and the network is trained;
the expression of the loss function is as follows:
L = λ_2D L_2Djoint + λ_3D L_3Djoint + λ_para L_SMPL
in the above formula, L_2Djoint represents the 2D joint loss function; L_3Djoint represents the 3D joint loss function; L_SMPL represents the SMPL parameter loss function; λ_2D, λ_3D and λ_para represent the weight coefficients of the 2D joint loss function, the 3D joint loss function and the SMPL parameter loss function, respectively;
wherein the 2D joint loss function L_2Djoint is expressed as:
Figure FDA0003365961180000033
in the above formula, v_i represents the visibility of the i-th 2D joint point and takes the value 0 or 1, where 0 means invisible and 1 means visible; N represents the number of 2D joint points;
Figure FDA0003365961180000034
represents the predicted value of the i-th 2D joint point; k_i represents the ground-truth value of the i-th 2D joint point; wherein the predicted value of the 2D joint point,
Figure FDA0003365961180000035
is obtained by projecting the predicted 3D joint point;
the 3D joint loss function L_3Djoint is expressed as:
Figure FDA0003365961180000041
in the above formula, M represents the number of images participating in the calculation of the 3D joint points;
Figure FDA0003365961180000042
represents the predicted 3D joint points of the i-th image; J_i represents the ground-truth 3D joint points of the i-th image;
the SMPL parameter loss function L_SMPL is expressed as:
Figure FDA0003365961180000043
in the above formula, O represents the number of images participating in the calculation of the SMPL parameters,
Figure FDA0003365961180000044
and
Figure FDA0003365961180000045
represent the predicted pose parameter and shape parameter of the i-th image, respectively, and θ_i and β_i represent the ground-truth pose parameter and shape parameter of the i-th image, respectively.
10. A human body reconstruction model, characterized in that it uses the human body reconstruction method based on attention mechanism according to any one of claims 1 to 9 to process an input occluded human body image, so as to generate a three-dimensional human body model of the target person in the human body image; the human body reconstruction model comprises:
a preprocessing module used for: (1) locating the target person in the human body image and cropping the image so that the target person lies in the central area of the human body image; (2) resizing the cropped human body image to a uniform size of 224 × 224 pixels; (3) normalizing the resized image;
the feature extraction module adopts a convolution part in a deep convolution neural network Resnet50 as a backbone network; the output of the preprocessing module is used as the input of the feature extraction module; the feature extraction module is used for extracting features in the preprocessed human body image through convolution operation so as to generate a corresponding original feature map;
the attention module comprises a maximum pooling submodule, an average pooling submodule, a feature concatenation submodule, a convolution submodule and a Sigmoid operation submodule; the output of the feature extraction module is used as the input of the attention module; in the attention module, the original feature map is processed by the maximum pooling submodule and the average pooling submodule to obtain two feature maps, and the two feature maps are concatenated in the feature concatenation submodule; the attention map is obtained after convolution in the convolution submodule and the Sigmoid operation in the Sigmoid operation submodule;
the fusion module is used for simultaneously acquiring the original feature map output by the feature extraction module and the attention map output by the attention module, and then multiplying the original feature map and the attention map by corresponding elements to obtain a fused body attention feature map;
the parameter inference module comprises an average pooling layer, a fully connected layer I, a fully connected layer II and a fully connected layer III; fully connected layers I and II each have 1024 neurons and are connected through a Dropout operation; fully connected layer III has 85 neurons and is directly connected to fully connected layer II; fully connected layers I, II and III form the iterative regression part of the network model; the output of the fusion module is used as the input of the parameter inference module; the parameter inference module generates the iteratively updated SMPL parameters for each input;
and the SMPL submodule is used for generating a three-dimensional human body model of the target person corresponding to the human body image from the SMPL parameters output by the parameter inference module.
CN202111382077.6A 2021-11-22 2021-11-22 Human body reconstruction method, model and device based on attention mechanism Pending CN114067057A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111382077.6A CN114067057A (en) 2021-11-22 2021-11-22 Human body reconstruction method, model and device based on attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111382077.6A CN114067057A (en) 2021-11-22 2021-11-22 Human body reconstruction method, model and device based on attention mechanism

Publications (1)

Publication Number Publication Date
CN114067057A (en) 2022-02-18

Family

ID=80278834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111382077.6A Pending CN114067057A (en) 2021-11-22 2021-11-22 Human body reconstruction method, model and device based on attention mechanism

Country Status (1)

Country Link
CN (1) CN114067057A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114663591A (en) * 2022-03-24 2022-06-24 清华大学 Three-dimensional reconstruction method and device, electronic equipment and storage medium
CN115147547B (en) * 2022-06-30 2023-09-19 北京百度网讯科技有限公司 Human body reconstruction method and device
CN115147547A (en) * 2022-06-30 2022-10-04 北京百度网讯科技有限公司 Human body reconstruction method and device
CN115496864A (en) * 2022-11-18 2022-12-20 苏州浪潮智能科技有限公司 Model construction method, model reconstruction device, electronic equipment and storage medium
CN115496864B (en) * 2022-11-18 2023-04-07 苏州浪潮智能科技有限公司 Model construction method, model reconstruction device, electronic equipment and storage medium
CN115775300A (en) * 2022-12-23 2023-03-10 北京百度网讯科技有限公司 Reconstruction method of human body model, training method and device of human body reconstruction model
CN115775300B (en) * 2022-12-23 2024-06-11 北京百度网讯科技有限公司 Human body model reconstruction method, human body model reconstruction training method and device
CN116561591A (en) * 2023-07-10 2023-08-08 北京邮电大学 Training method for semantic feature extraction model of scientific and technological literature, feature extraction method and device
CN116561591B (en) * 2023-07-10 2023-10-31 北京邮电大学 Training method for semantic feature extraction model of scientific and technological literature, feature extraction method and device
CN116934972A (en) * 2023-07-26 2023-10-24 石家庄铁道大学 Three-dimensional human body reconstruction method based on double-flow network
CN117077723A (en) * 2023-08-15 2023-11-17 支付宝(杭州)信息技术有限公司 Digital human action production method and device
CN117115363A (en) * 2023-10-24 2023-11-24 清华大学 Human chest plane estimation method and device
CN117115363B (en) * 2023-10-24 2024-03-26 清华大学 Human chest plane estimation method and device

Similar Documents

Publication Publication Date Title
CN114067057A (en) Human body reconstruction method, model and device based on attention mechanism
CN109859296B (en) Training method of SMPL parameter prediction model, server and storage medium
CN108961369B (en) Method and device for generating 3D animation
CN110717977B (en) Method, device, computer equipment and storage medium for processing game character face
CN112614213B (en) Facial expression determining method, expression parameter determining model, medium and equipment
US11276218B2 (en) Method for skinning character model, device for skinning character model, storage medium and electronic device
CN114648613B (en) Three-dimensional head model reconstruction method and device based on deformable neural radiance field
CN113570684A (en) Image processing method, image processing device, computer equipment and storage medium
CN115239861A (en) Face data enhancement method and device, computer equipment and storage medium
CN113593001A (en) Target object three-dimensional reconstruction method and device, computer equipment and storage medium
CN114897136A (en) Multi-scale attention mechanism method and module and image processing method and device
CN112085835A (en) Three-dimensional cartoon face generation method and device, electronic equipment and storage medium
Vilanova et al. VirEn: A virtual endoscopy system
CN114677572B (en) Object description parameter generation method and deep learning model training method
CN114429518B (en) Face model reconstruction method, device, equipment and storage medium
CN117218300B (en) Three-dimensional model construction method, three-dimensional model construction training method and device
CN117292041B (en) Semantic perception multi-view three-dimensional human body reconstruction method, device and medium
CN115482557B (en) Human body image generation method, system, equipment and storage medium
EP4086853A2 (en) Method and apparatus for generating object model, electronic device and storage medium
CN116229548A (en) Model generation method and device, electronic equipment and storage medium
CN115775300A (en) Reconstruction method of human body model, training method and device of human body reconstruction model
CN115994944A (en) Three-dimensional key point prediction method, training method and related equipment
CN113592971A (en) Virtual human body image generation method, system, equipment and medium
Uzolas et al. MotionDreamer: Zero-Shot 3D Mesh Animation from Video Diffusion Models
CN117132501B (en) Human body point cloud cavity repairing method and system based on depth camera

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination