WO2023103600A1 - Expression generation method, apparatus, device, medium and computer program product - Google Patents

Expression generation method, apparatus, device, medium and computer program product

Info

Publication number
WO2023103600A1
Authority
WO
WIPO (PCT)
Prior art keywords
expression
face
key points
target
sample
Prior art date
Application number
PCT/CN2022/126077
Other languages
English (en)
French (fr)
Inventor
周志强
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to US 18/331,906 (published as US20230316623A1)
Publication of WO2023103600A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/174 - Facial expression recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00 - Animation
    • G06T 13/20 - 3D [Three Dimensional] animation
    • G06T 13/40 - 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • A - HUMAN NECESSITIES
    • A63 - SPORTS; GAMES; AMUSEMENTS
    • A63F - CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F 13/00 - Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F 13/60 - Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 17/20 - Finite element generation, e.g. wire-frame surface description, tesselation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 - Detection; Localisation; Normalisation
    • G06V 40/165 - Detection; Localisation; Normalisation using facial parts and geometric relationships
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 - Feature extraction; Face representation
    • G06V 40/171 - Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • A - HUMAN NECESSITIES
    • A63 - SPORTS; GAMES; AMUSEMENTS
    • A63F - CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F 2300/00 - Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F 2300/60 - Methods for processing data by generating or executing the game program
    • A63F 2300/66 - Methods for processing data by generating or executing the game program for rendering three dimensional images

Definitions

  • This application relates to artificial intelligence technology, and in particular to the field of game technology, and specifically to an expression generation method, apparatus, device, medium and computer program product.
  • Character expression generation technology refers to the technology of automatically generating complex expressions of characters through computers.
  • 3D (Three-Dimensional) character complex expressions can be automatically generated through 3D character expression generation technology.
  • In conventional approaches, the complex expression to be generated is first decomposed into multiple meta-expressions, all the meta-expressions corresponding to the complex expression are produced by an animator, and then all the meta-expressions are synthesized according to the degree value corresponding to each meta-expression to obtain the character's expression.
  • An expression generation method, apparatus, device, medium and computer program product are provided.
  • An expression generation method, applied to a terminal, comprising:
  • obtaining expression position difference information obtained based on a real face; the expression position difference information is used to represent the position difference of facial key points of the real face under a first expression and a second expression;
  • acquiring facial key points of a virtual face under the first expression to obtain initial virtual facial key points; the virtual face is the face of a virtual object;
  • An expression generating device comprising:
  • the obtaining module is used to obtain the expression position difference information obtained based on the real face;
  • the expression position difference information is used to represent the position difference of the facial key points of the real face under the first expression and the second expression;
  • the obtaining module is further used to acquire the facial key points of the virtual face under the first expression to obtain initial virtual facial key points;
  • the virtual face is the face of the virtual object;
  • a determining module configured to obtain target virtual facial key points of the virtual face under a second expression based on the expression position difference information and the initial virtual facial key points, extract key point distribution features based on the target virtual facial key points, and determine target control parameters based on the key point distribution features; the target control parameters are parameters used to control associated vertices, related to the second expression, in the face mesh of the virtual face;
  • a generating module configured to control movement of the associated vertices in the face mesh based on the target control parameters, so as to generate the second expression in the virtual face.
  • a computer device includes a memory and one or more processors, where computer-readable instructions are stored in the memory, and the one or more processors implement the steps in the method embodiments of the present application when executing the computer-readable instructions.
  • One or more computer-readable storage media store computer-readable instructions, and when the computer-readable instructions are executed by one or more processors, the steps in the various method embodiments of the present application are implemented.
  • A computer program product includes computer-readable instructions. When the computer-readable instructions are executed by one or more processors, the steps in the various method embodiments of the present application are implemented.
  • Fig. 1 is the application environment diagram of expression generating method in an embodiment
  • Fig. 2 is a schematic flow chart of an expression generation method in an embodiment
  • Fig. 3 is a schematic diagram of facial key points of a real face in an embodiment
  • Fig. 4 is a schematic diagram of the main interface of the expression generation application in one embodiment
  • Fig. 5 is a schematic diagram of the interface of marking the facial key points of the virtual face in one embodiment
  • Fig. 6 is a schematic diagram of the initial facial key points of the virtual face under the first expression in one embodiment
  • Fig. 7 is a schematic diagram of a facial grid of a virtual face in an embodiment
  • Fig. 8 is a schematic diagram of the target face key points of the virtual face under the second expression in an embodiment
  • FIG. 9 is a schematic diagram of a key point detection process in an embodiment
  • FIG. 10 is a schematic diagram of a parameter estimation process in an embodiment
  • Fig. 11 is a schematic diagram of the binding of the face mesh of the virtual face and the target controller in one embodiment
  • Fig. 12 is a schematic diagram of setting control parameters of the target controller in an embodiment
  • Fig. 13 is a schematic diagram of a virtual face under a second expression produced in one embodiment
  • Fig. 14 is a schematic flow chart of an expression generation method in another embodiment
  • Fig. 15 is a schematic flow chart of an expression generation method in yet another embodiment
  • Fig. 16 is a structural block diagram of an expression generation device in an embodiment
  • Fig. 17 is a structural block diagram of an expression generating device in another embodiment
  • Figure 18 is a diagram of the internal structure of a computer device in one embodiment.
  • The expression generation method provided in this application can be applied to the application environment shown in FIG. 1 .
  • the terminal 102 communicates with the server 104 through the network.
  • the terminal 102 can be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, portable wearable devices, and vehicle-mounted terminals
  • The server 104 can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.
  • the terminal 102 and the server 104 may be connected directly or indirectly through wired or wireless communication, which is not limited in this application.
  • the terminal 102 may acquire the first expression image of the real face under the first expression and the second expression image of the real face under the second expression from the server 104 .
  • The terminal 102 can acquire expression position difference information between the first expression and the second expression based on the first expression image and the second expression image; the expression position difference information is used to represent the position difference of the facial key points of the real face under the first expression and the second expression.
  • the terminal 102 can acquire facial key points of the virtual face under the first expression to obtain initial virtual facial key points; the virtual face is the face of the virtual object.
  • The terminal 102 can obtain the target virtual facial key points of the virtual face under the second expression based on the expression position difference information and the initial virtual facial key points, extract key point distribution features based on the target virtual facial key points, and determine the target control parameters based on the key point distribution features; the target control parameters are parameters used to control associated vertices, related to the second expression, in the face mesh of the virtual face.
  • the terminal 102 can control the movement of the associated vertices in the face mesh based on the target control parameters, so as to generate the second expression on the virtual face.
  • the key point distribution features corresponding to the key points of the target virtual face belong to the features extracted by using artificial intelligence technology, and the target control parameters also belong to the parameters predicted by using artificial intelligence technology.
  • the facial expression generation method in some embodiments of the present application also uses computer vision technology (Computer Vision, CV).
  • the first position information and the second position information of the facial key points of the real face belong to the information obtained from the first expression image and the second expression image respectively by using computer vision technology.
  • An expression generation method is provided; the method can be applied to a terminal, and can also be applied to an interaction process between a terminal and a server.
  • This embodiment is described by taking the method applied to the terminal 102 in FIG. 1 as an example, including the following steps:
  • Step 202 acquiring expression position difference information obtained based on the real face; the expression position difference information is used to represent the position difference of facial key points of the real face under the first expression and the second expression.
  • the real face is a face of a real object.
  • The real object is not limited to a person; it can also be any other object with facial expressions, such as an animal.
  • Face key points are key points used to locate key areas on the face.
  • the key points of the face may include key points of the contour of the face and key points of the facial organs.
  • the first expression is the expression of the first type
  • the second expression is the expression of the second type. It can be understood that the first expression and the second expression are different types of expressions.
  • For example, the first expression is expressionless and the second expression is laughing.
  • the first expression is laughing, and the second expression is crying.
  • the expression position difference information is the difference information between the position of the facial key points of the real face under the first expression and the position of the facial key points of the real face under the second expression.
  • The terminal can directly obtain the expression position difference information from the server; that is, the server can determine the expression position difference information in advance based on the position difference of the facial key points of the real face under the first expression and the second expression, and the terminal can directly obtain the expression position difference information from the server.
  • the terminal may also obtain position information of facial key points of the real face under the first expression, and obtain position information of facial key points of the real face under the second expression.
  • The terminal can compare the position information of the facial key points of the real face under the first expression with the position information of the facial key points of the real face under the second expression, so as to determine the expression position difference information of the facial key points of the real face under the first expression and the second expression.
  • the server stores the position information of the facial key points of the real face under the first expression, and the position information of the facial key points of the real face under the second expression.
  • the server may send the location information of the facial key points of the real face under the first expression and the location information of the facial key points of the real face under the second expression to the terminal.
  • the terminal may receive position information of key points of the real face under the first expression, and position information of key points of the real face under the second expression.
  • the terminal may perform key point detection processing on the real face under the first expression, and obtain position information of facial key points on the real face under the first expression.
  • the terminal can perform key point detection processing on the real face under the second expression, and obtain the position information of the key points of the real face under the second expression.
  • The figure shows the distribution of the facial key points of the real face under the second expression.
  • After the terminal obtains the position information of the facial key points of the real face under the first expression and the position information of the facial key points of the real face under the second expression, it can calculate the difference between the two, and then directly use the obtained difference as the expression position difference information of the facial key points of the real face under the first expression and the second expression.
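  • As a minimal illustration of this step (not taken from the patent text), assuming the detected facial key points of the real face are stored as N x 2 coordinate arrays, the difference can be computed element-wise:

```python
import numpy as np

def expression_position_difference(kp_first: np.ndarray, kp_second: np.ndarray) -> np.ndarray:
    """Per-key-point position difference of the real face between two expressions.

    kp_first  : (N, 2) key point coordinates under the first expression
    kp_second : (N, 2) key point coordinates under the second expression
    """
    assert kp_first.shape == kp_second.shape
    return kp_second - kp_first          # expression position difference information
```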
  • Step 204 acquiring key points of the virtual face under the first expression to obtain key points of the initial virtual face; the virtual face is the face of the virtual object.
  • The virtual object is a non-real object, for example, a three-dimensional virtual character in a game. It can be understood that the virtual object may also be a non-real object virtualized in other non-game scenes, for example, a virtual model in some design industries.
  • the initial virtual facial key points are facial key points of the virtual face under the first expression.
  • The terminal can display the virtual face under the first expression on the display interface; the user can perform key point labeling operations on the virtual face under the first expression, and the terminal can then obtain the facial key points marked by the user on the virtual face under the first expression, so as to obtain the initial virtual facial key points of the virtual face under the first expression.
  • an expression generating application runs in the terminal, and the expression generating application may provide a key point labeling interface, and the key point labeling interface may include a key point labeling trigger control.
  • The terminal can display the virtual face under the first expression on the key point labeling interface; the user can trigger the key point labeling control to perform key point labeling operations on the virtual face under the first expression, and the facial key points marked on the virtual face under the first expression are used to obtain the initial virtual facial key points of the virtual face under the first expression.
  • The expression generation application is an application program for generating expressions.
  • FIG. 4 is a schematic diagram of the main interface of the expression generation application.
  • The main interface includes file paths such as a project path and an avatar annotation file path, wherein the project path can be used to store the project corresponding to the facial key points of the virtual face, and the avatar annotation file can be used to record the position information of the facial key points of the virtual face.
  • The user can perform a key point marking operation on the virtual face under the first expression, and then the terminal can obtain the initial virtual face key points of the virtual face under the first expression.
  • The annotation page of the expression generation application can display a real face that has already been marked with facial key points; the user can then refer to this marked real face to perform the key point labeling operation on the virtual face under the first expression and obtain the initial virtual face key points. The distribution of the initial virtual face key points on the virtual face is shown in FIG. 6.
  • Step 206 based on the expression position difference information and the initial virtual face key points, obtain the target virtual face key points of the virtual face under the second expression.
  • the key points of the target virtual face are the key points of the virtual face under the second expression.
  • the terminal may determine the adjustment information of the key points of the initial virtual face based on the expression position difference information. Furthermore, the terminal can adjust the key points of the initial virtual face according to the determined adjustment information, and after the adjustment is completed, the target virtual face key points of the virtual face under the second expression can be obtained.
  • the adjustment information includes the movement distance.
  • the terminal can determine the current position of the key points of the initial virtual face, and determine the adjustment information of the key points of the initial virtual face based on the expression position difference information. Furthermore, the terminal can move the key points of the initial virtual face from the current position of the key points of the initial virtual face according to the determined moving distance, so as to obtain the key points of the target virtual face of the virtual face under the second expression.
  • Step 208, extract key point distribution features based on the target virtual face key points, and determine target control parameters based on the key point distribution features; the target control parameters are parameters used to control associated vertices, related to the second expression, in the face mesh of the virtual face.
  • the key point distribution feature is the distribution feature of the target virtual face key points on the virtual face.
  • a face mesh is a mesh constituting a virtual face. It can be understood that the facial grid is a group of facial shape grids acquired each time the expression changes.
  • a face mesh includes a plurality of mesh regions, and a mesh region is the smallest unit constituting a face mesh.
  • the face mesh includes a plurality of mesh lines and a plurality of mesh vertices, and a mesh vertex is a point where two or more mesh lines intersect.
  • the associated vertex is a mesh vertex related to the second expression in the face mesh. It can be understood that the associated vertices are the mesh vertices in the same mesh region as the key points of the target virtual face.
  • the virtual face can be set on a facial mesh, and each mesh includes a plurality of mesh vertices.
  • The terminal may determine the grid vertices related to the second expression (that is, the grid vertices corresponding to the target virtual face key points) as the associated vertices. For example, referring to FIG. 7, the figure shows one of the associated vertices related to the second expression, and 702 in the figure (i.e., the white point) is a target virtual face key point.
  • the key point distribution feature is used to represent the position distribution of the target virtual face key points on the virtual face.
  • Under different expressions, the distribution of facial key points on the face is different. Therefore, the terminal can perform distribution feature extraction processing on the target virtual face key points; that is, the terminal can analyze the relative position information between the target virtual face key points in the virtual face.
  • The relative position information of each target virtual face key point can reflect the position distribution of each target virtual face key point on the virtual face. Therefore, based on the relative position information between the target virtual face key points, the terminal can obtain the key point distribution features of the virtual face.
  • the terminal may perform parameter estimation processing on the key points of the target virtual face based on the key point distribution characteristics of the key points of the target virtual face on the virtual face, and obtain target control parameters corresponding to the key points of the target virtual face.
  • Step 210 control the movement of associated vertices in the face mesh based on the target control parameters, so as to generate a second expression on the virtual face.
  • the target control parameters have a binding relationship with associated vertices in the face mesh.
  • the terminal can control the associated vertices in the face mesh under the first expression to move, so as to control the transformation of the virtual face from the first expression to the second expression.
  • the facial key points of the virtual face have a corresponding relationship with the associated vertices, and the movement of the associated vertices can control the facial key points of the virtual face to also move accordingly.
  • Since the key point distribution features of the target virtual face key points on the virtual face are known, the associated vertices in the face mesh under the first expression can be controlled to move based on the target control parameters corresponding to the target virtual face key points, which in turn drives the facial key points of the virtual face to move accordingly; when the distribution of the facial key points of the virtual face satisfies the key point distribution features corresponding to the target virtual face key points, the virtual face has been transformed from the first expression into the second expression.
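  • The patent does not spell out how the target controller maps control parameters to vertex displacements; the sketch below assumes, purely for illustration, a linear blendshape-style mapping in which each control parameter scales a precomputed displacement of its associated vertices:

```python
import numpy as np

def apply_control_parameters(base_vertices: np.ndarray,
                             control_params: np.ndarray,
                             displacement_basis: np.ndarray) -> np.ndarray:
    """Move the associated vertices of the face mesh according to the target control parameters.

    base_vertices      : (V, 3) mesh vertex positions under the first expression
    control_params     : (P,)   target control parameters, one per controller
    displacement_basis : (P, V, 3) per-controller vertex displacements (zero for
                          vertices a controller is not bound to)
    """
    # Linear combination of per-controller displacements, added to the first-expression mesh.
    offsets = np.tensordot(control_params, displacement_basis, axes=1)   # (V, 3)
    return base_vertices + offsets
```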
  • the distribution of key points of the target virtual face on the virtual face can refer to FIG. 8 .
  • the key points of the target virtual face are key point information used to locate key areas on the virtual face.
  • In the figure, the target virtual face key points are represented by hollow circles, which more clearly shows the distribution of the target virtual face key points on the virtual face.
  • expression position difference information based on the real face is obtained, wherein the expression position difference information can be used to represent the position difference of facial key points of the real face under the first expression and the second expression.
  • the target virtual face key points of the virtual face under the second expression can be obtained, and the key point distribution features can be extracted based on the target virtual face key points.
  • the target control parameter can be quickly determined, wherein the target control parameter is a parameter used to control an associated vertex in the face mesh of the virtual face and related to the second expression.
  • the second expression can be generated in the virtual face.
  • The expression generation method of this application only needs to obtain the facial key points of the virtual face under the first expression and combine them with the expression position differences of the facial key points of the real face under different expressions; the virtual face can then be conveniently controlled to produce correspondingly changed expressions, which improves the expression generation efficiency.
  • traditional expression generation methods can only generate expressions that can be synthesized by meta-expressions produced by animators.
  • the expression generation method of the present application directly controls the virtual face to generate corresponding expressions through the target control parameters corresponding to the key points of the virtual face, so that any expression can be generated, and the flexibility of expression production is improved.
  • Obtaining the expression position difference information obtained based on the real face includes: obtaining a first expression image of the real face under the first expression; obtaining a second expression image of the real face under the second expression; locating the first position information of the facial key points of the real face from the first expression image; locating the second position information of the facial key points of the real face from the second expression image; and determining the difference between the first position information and the second position information to obtain the expression position difference information.
  • the first expression image refers to an image of a real face under the first expression.
  • the second expression image refers to an image of a real face under the second expression.
  • the first position information is the position information of the facial key points of the real face under the first expression.
  • the second position information is the position information of the facial key points of the real face under the second expression.
  • the terminal may acquire a first expression image of a real face under a first expression, and acquire a second expression image of a real face under a second expression.
  • the terminal may perform key point detection processing on the first expression image, so as to determine the first position information of facial key points of the real face from the first expression image.
  • the terminal may perform key point detection processing on the second expression image, so as to determine the second position information of facial key points of the real face from the second expression image.
  • the terminal may determine a difference between the first location information and the second location information, and determine expression location difference information based on the difference between the first location information and the second location information.
  • the terminal includes an image acquisition unit, and the terminal can acquire a first expression image of a real face under a first expression and a second expression image of a real face under a second expression through the image acquisition unit.
  • the first expression image of the real face under the first expression and the second expression image of the real face under the second expression are stored in the server.
  • the server may send the first expression image of the real face under the first expression and the second expression image of the real face under the second expression to the terminal.
  • the terminal may receive the first expression image of the real face under the first expression and the second expression image of the real face under the second expression sent by the server.
  • the main interface of the expression generation application further includes a picture path, a picture key point path, and a picture list.
  • the picture path can be used to store the first expression image and the second expression image
  • the picture key point path can be used to store the first position information and the second position information
  • The picture list can be used to display information about the first expression image and the second expression image, such as image serial number and image name.
  • the main interface of the expression generating application further includes a button 404 of "annotate picture".
  • the terminal can directly locate the real face from the first expression image.
  • the first expression image and the second expression image may specifically be RGB (Red, Green, Blue, red, green, blue) images.
  • the image of the real face can be directly and accurately obtained.
  • the first position information and the second position information are detected by a trained key point detection model;
  • The step of obtaining the trained key point detection model includes: obtaining sample data, the sample data including sample expression images and reference position information of facial key points marked for the sample expression images; inputting the sample expression images into the key point detection model to be trained to obtain predicted position information of facial key points; determining a first loss value based on the error between the predicted position information of the facial key points and the reference position information of the facial key points; and iteratively training the key point detection model to be trained in the direction of reducing the first loss value until the iteration stop condition is met, so as to obtain the trained key point detection model.
  • the sample data is the data used to train the key point detection model.
  • the sample expression image is the image used to train the key point detection model.
  • the reference location information is the location information of facial key points marked for the sample expression image.
  • the predicted position information is the position information of the facial key points obtained by predicting the sample expression images through the key point detection model to be trained.
  • the first loss value is a loss value determined based on an error between predicted position information of facial key points and reference position information of facial key points.
  • a trained key point detection model runs in the terminal.
  • the terminal may acquire the first expression image of the real face under the first expression, and acquire the second expression image of the real face under the second expression.
  • the terminal can input the first facial expression image into the trained key point detection model, and perform key point detection processing on the first facial expression image through the trained key point detection model, so as to obtain the key point detection process from the first facial expression image , the first position information of the face key points of the real face is detected.
  • the terminal can input the second expression image into the trained key point detection model, and perform key point detection processing on the second expression image through the trained key point detection model, so as to detect the real The second location information of facial key points of the face.
  • the terminal may determine a difference between the first location information and the second location information, and determine expression location difference information based on the difference between the first location information and the second location information.
  • the key point detection model to be trained runs in the terminal.
  • the terminal can obtain sample data, and input the sample expression images in the sample data to the key point detection model to be trained, and perform key point detection processing on the sample expression images through the key point detection model to be trained to obtain the facial key points predicted location information.
  • The terminal can determine the error between the predicted position information of the facial key points and the reference position information of the facial key points marked for the sample expression image, and determine the first loss value based on this error.
  • the terminal may perform iterative training on the key point detection model to be trained in the direction of reducing the first loss value, until the iteration stop condition is met, and the trained key point detection model is obtained.
  • the iteration stop condition may be that the first loss value reaches a preset loss threshold, or that the number of iteration rounds reaches a preset threshold.
  • The mean square error function can be used as the loss function for training the key point detection model, and the loss function can be expressed by the following formula:
  • L_loss = MeanSquaredError(K_predict, K_groundtruth)
  • where L_loss represents the first loss value, MeanSquaredError() represents the mean square error function, K_predict represents the predicted position information of the facial key points, and K_groundtruth represents the reference position information of the facial key points.
  • the key point detection model to be trained is iteratively trained through the sample expression image and the reference position information of the facial key points marked on the sample expression image, so as to improve the key point detection accuracy of the key point detection model.
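  • A minimal training sketch of this step, assuming a PyTorch setup with a generic key point detection network and a data loader yielding (sample expression image, reference key point) pairs; the names and hyperparameters are illustrative, not from the patent:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train_keypoint_detector(model: nn.Module, loader: DataLoader,
                            epochs: int = 10, lr: float = 1e-4) -> nn.Module:
    """Iteratively train a key point detection model with a mean-square-error loss."""
    criterion = nn.MSELoss()                       # L_loss = MeanSquaredError(K_predict, K_groundtruth)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):                        # iteration stop condition: fixed number of rounds
        for images, ref_keypoints in loader:       # sample expression images + reference key point positions
            pred_keypoints = model(images)         # K_predict: predicted key point positions
            loss = criterion(pred_keypoints, ref_keypoints)
            optimizer.zero_grad()
            loss.backward()                        # step toward reducing the first loss value
            optimizer.step()
    return model
```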
  • The expression generation method further includes: obtaining third position information of the initial virtual face key points; and performing normalization processing on the first position information, the second position information and the third position information respectively to obtain normalized first position information, normalized second position information and normalized third position information. Determining the difference between the first position information and the second position information to obtain the expression position difference information includes: obtaining the expression position difference information based on the difference between the normalized first position information and the normalized second position information. Obtaining the target virtual face key points of the virtual face under the second expression based on the expression position difference information and the initial virtual face key points includes: adjusting the normalized third position information according to the expression position difference information to obtain the target virtual face key points of the virtual face under the second expression.
  • the third position information is position information of key points of the initial virtual face under the first expression.
  • the user may mark the key points of the virtual face under the first expression, and the terminal may directly acquire the third position information of the key points of the initial virtual face based on the marked key points of the initial virtual face.
  • the terminal may perform normalization processing on the first location information, the second location information, and the third location information respectively to obtain the normalized first location information, the normalized second location information, and the normalized Third location information.
  • The terminal may determine the difference between the normalized first position information and the normalized second position information, and determine the expression position difference information of the facial key points of the real face between the first expression and the second expression based on this difference.
  • the terminal may adjust the normalized third position information according to the expression position difference information to obtain target virtual face key points of the virtual face under the second expression.
  • When the terminal performs normalization processing on each of the first position information, the second position information and the third position information, the position information to be normalized may be used as the current position information.
  • the terminal may determine a reference facial key point that satisfies the location stabilization condition.
  • the terminal may determine normalization standard information according to the reference face key points, and perform normalization processing on each of the first position information, the second position information and the third position information based on the normalization standard information.
  • the position stability condition is a condition that keeps the position of the key points of the face relatively stable
  • The reference facial key point is a facial key point used as a reference in the normalization process.
  • the normalization standard information is the standard information in the normalization process.
  • the position stability condition may specifically be that when switching from the first expression to the second expression, the moving distance of the facial key points on the corresponding face is less than a preset moving distance.
  • the position stability condition may also be that when switching from the first expression to the second expression, the key points of the face will not move on the corresponding face.
  • For example, the facial key points corresponding to the four corners of the eyes will basically not move on the corresponding face.
  • the facial key points corresponding to the temples on both sides of the face will basically not move on the corresponding face.
  • In this way, the face shape difference between the real face and the virtual face can be reduced. Furthermore, based on the difference between the normalized first position information and the normalized second position information, more accurate expression position difference information can be obtained. By adjusting the normalized third position information according to the expression position difference information, more accurate target virtual face key points of the virtual face under the second expression can be obtained, so that the second expression generated on the virtual face more closely matches the corresponding second expression of the real face.
  • Performing normalization processing on the first position information, the second position information and the third position information respectively includes: for each of the first position information, the second position information and the third position information, using the position information to be normalized as the current position information; for each piece of current position information, determining reference facial key points that meet the position stability condition; determining the reference point corresponding to the current position information based on the reference facial key points, and determining the relative distance between the reference facial key point and the reference point; determining the scaling ratio based on the distance between any two reference facial key points; and normalizing the current position information according to the relative distance and the scaling ratio.
  • the reference point corresponding to the current location information is a point used as a reference when normalizing the current location information.
  • the relative distance is the distance between the benchmark facial key point and the reference point.
  • the zoom ratio is a ratio used as a reference when normalizing the current location information.
  • the terminal may use the normalized location information as the current location information. For each piece of current position information, the terminal can filter out the reference facial key points satisfying the position stability condition from the facial key points of the corresponding face.
  • the terminal may determine the reference point corresponding to the current location information based on the reference facial key point, and may use the distance between the reference facial key point and the reference point as a relative distance.
  • the terminal may determine the distance between any two reference facial key points, and determine the scaling ratio based on the distance between any two reference facial key points.
  • the terminal may perform normalization processing on the current location information according to the relative distance and the zoom ratio.
  • The terminal determines the reference point corresponding to the current position information based on the reference facial key points. Specifically, the terminal may determine the reference point corresponding to the current position information based on all the selected reference facial key points; alternatively, the terminal may determine the reference point corresponding to the current position information based on some of the selected reference facial key points.
  • the face key points corresponding to the four corners of the eyes in the corresponding face belong to four of the selected reference face key points.
  • the terminal may determine the central point of the four reference facial key points as the reference point corresponding to the current position information.
  • the terminal may determine the scaling ratio based on the distance between key points on the face corresponding to temples on both sides of the corresponding face.
  • the terminal may subtract the coordinates of the reference point from the coordinates of the reference facial key point to obtain the subtracted coordinates.
  • the terminal may use the ratio of the subtracted coordinates and the scaling ratio as the normalized current position information, that is, the normalized coordinates of the key points of the face.
  • For the real face, the normalized coordinates of the facial key points can be calculated by the following formula:
  • NK_ri = (K_ri - O_ri) / L_ri
  • where NK_ri represents the normalized coordinates of the facial key points of the real face under the corresponding expression, K_ri represents the coordinates of the facial key points of the real face under the corresponding expression, O_ri represents the coordinates of the reference point of the real face under the corresponding expression, and L_ri is the scaling ratio of the real face under the corresponding expression. It can be understood that the corresponding expression refers to the first expression or the second expression.
  • For the virtual face, the normalized coordinates of the facial key points can be calculated by the following formula:
  • NK_g,neutral = (K_g,neutral - O_g,neutral) / L_g,neutral
  • where NK_g,neutral represents the normalized coordinates of the facial key points of the virtual face under the first expression, K_g,neutral represents the coordinates of the facial key points of the virtual face under the first expression, O_g,neutral represents the coordinates of the reference point of the virtual face under the first expression, and L_g,neutral represents the scaling ratio of the virtual face under the first expression.
  • In this embodiment, the reference point corresponding to the current position information can be determined, the relative distance between the reference facial key point and the reference point can be determined, and the scaling ratio can be determined from the distance between reference facial key points; the current position information can then be normalized according to the relative distance and the scaling ratio. In this way, the face shape difference between the real face and the virtual face can be further reduced, so that the second expression generated on the virtual face fits the corresponding second expression of the real face more closely.
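  • A sketch of this normalization under the assumptions above: the reference point is taken as the centre of the four eye-corner key points and the scaling ratio as the distance between the two temple key points; the index values are hypothetical placeholders, not from the patent:

```python
import numpy as np

# Hypothetical indices of stable reference key points in the key point array
# (these values follow a common 68-point layout and are NOT specified by the patent).
EYE_CORNER_IDX = [36, 39, 42, 45]   # four eye-corner key points
TEMPLE_IDX = (0, 16)                # key points near the two temples

def normalize_keypoints(keypoints: np.ndarray):
    """Normalize key point coordinates: NK = (K - O) / L.

    keypoints : (N, D) coordinates of one face under one expression
    Returns (normalized key points, reference point O, scaling ratio L),
    so that O and L can be reused later for denormalization.
    """
    reference_point = keypoints[EYE_CORNER_IDX].mean(axis=0)                     # O: centre of the eye corners
    scale = np.linalg.norm(keypoints[TEMPLE_IDX[0]] - keypoints[TEMPLE_IDX[1]])  # L: temple-to-temple distance
    return (keypoints - reference_point) / scale, reference_point, scale         # NK, O, L
```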
  • the normalized third position information is adjusted according to the expression position difference information to obtain the target virtual face key points of the virtual face under the second expression, including: normalized The third position information is adjusted according to the expression position difference information to obtain the intermediate state position information of the facial key points of the virtual face under the second expression; based on the relative distance and scaling ratio corresponding to the third position information, the intermediate state position information Denormalization processing is performed to obtain key points of the target virtual face of the virtual face under the second expression.
  • the intermediate state position information is the position information of the key points of the face of the virtual face under the second expression, which belongs to the intermediate state.
  • The intermediate state position information is equivalent to an intermediate result; it only exists during the processing and is not output as the processing result.
  • Specifically, the normalized third position information is adjusted according to the expression position difference information to obtain the intermediate state position information of the facial key points of the virtual face under the second expression; based on the relative distance and scaling ratio corresponding to the third position information, the intermediate state position information is denormalized to obtain the target virtual face key points of the virtual face under the second expression.
  • the intermediate state location information includes intermediate state coordinates.
  • the terminal can determine the expression position difference according to the difference between the normalized coordinates of the facial key points of the real face under the second expression and the normalized coordinates of the facial key points of the real face under the first expression information.
  • the terminal may adjust the normalized coordinates of the facial key points of the virtual face under the first expression according to the expression position difference information to obtain intermediate state coordinates.
  • The terminal may first multiply the intermediate state coordinates by the scaling ratio of the virtual face under the first expression, and then add the coordinates of the reference point of the virtual face under the first expression (that is, perform denormalization processing on the intermediate state position information), so as to obtain the coordinates of the target virtual face key points of the virtual face under the second expression.
  • The coordinates of the target virtual face key points of the virtual face under the second expression can be calculated by the following formula:
  • K_i,r2g = ((NK_ri - NK_r,neutral) + NK_g,neutral) * L_g,neutral + O_g,neutral
  • where K_i,r2g represents the coordinates of the target virtual face key points under the second expression, NK_ri represents the normalized coordinates of the facial key points of the real face under the second expression, and NK_r,neutral represents the normalized coordinates of the facial key points of the real face under the first expression.
  • In this embodiment, the intermediate state position information can be denormalized to reduce the position error caused by the normalization process, so that more accurate target virtual face key points of the virtual face under the second expression can be obtained.
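  • Putting the formulas above together, a sketch of the full transfer step (normalize, add the expression position difference, denormalize), reusing the hypothetical normalize_keypoints helper from the earlier normalization sketch:

```python
import numpy as np

def transfer_expression(real_first: np.ndarray,
                        real_second: np.ndarray,
                        virtual_first: np.ndarray) -> np.ndarray:
    """K_i,r2g = ((NK_ri - NK_r,neutral) + NK_g,neutral) * L_g,neutral + O_g,neutral."""
    # normalize_keypoints is the hypothetical helper from the normalization sketch above.
    nk_r_neutral, _, _ = normalize_keypoints(real_first)         # NK_r,neutral
    nk_r_second, _, _ = normalize_keypoints(real_second)         # NK_ri (real face, second expression)
    nk_g_neutral, o_g, l_g = normalize_keypoints(virtual_first)  # NK_g,neutral, O_g,neutral, L_g,neutral

    intermediate = (nk_r_second - nk_r_neutral) + nk_g_neutral   # intermediate state position information
    return intermediate * l_g + o_g                              # denormalize: target virtual face key points
```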
  • Extracting key point distribution features based on the target virtual face key points and determining target control parameters based on the key point distribution features includes: inputting the target virtual face key points into a trained parameter estimation model, extracting the distribution features of the target virtual face key points through the trained parameter estimation model to obtain the key point distribution features, performing parameter estimation based on the key point distribution features, and outputting the target control parameters.
  • a trained parameter estimation model is running in the terminal, as shown in Figure 10, the terminal can input the key points of the target virtual face into the trained parameter estimation model, and use the trained parameter estimation model to The key points of the target virtual face are extracted to obtain the distribution features of the key points. Furthermore, the terminal can predict the parameters of the key points of the target virtual face based on the key point distribution characteristics through the trained parameter estimation model, and output the target control parameters.
  • The target control parameters can be obtained by the following formula:
  • P_i = K2P(K_i,r2g)
  • where P_i represents the target control parameters and K2P() represents the trained parameter estimation model.
  • In this embodiment, distribution feature extraction is performed on the target virtual face key points to obtain the key point distribution features, parameter estimation is performed based on the key point distribution features, and the target control parameters are directly output. In this way, the efficiency of obtaining the target control parameters and the accuracy of the target control parameters can be improved.
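  • The patent treats K2P() as a learned mapping from key point coordinates to control parameters without fixing its architecture; the sketch below assumes a small fully connected network, purely as an illustration of such a mapping:

```python
import torch
import torch.nn as nn

class K2P(nn.Module):
    """Parameter estimation model: target virtual face key points -> target control parameters."""

    def __init__(self, num_keypoints: int, num_params: int, hidden: int = 256):
        super().__init__()
        self.feature_extractor = nn.Sequential(    # extracts key point distribution features
            nn.Flatten(),
            nn.Linear(num_keypoints * 3, hidden),  # assumes 3D key point coordinates
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
        )
        self.head = nn.Linear(hidden, num_params)  # parameter estimation from the distribution features

    def forward(self, keypoints: torch.Tensor) -> torch.Tensor:
        # keypoints: (batch, num_keypoints, 3) -> control parameters P_i: (batch, num_params)
        return self.head(self.feature_extractor(keypoints))
```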
  • The step of obtaining the trained parameter estimation model includes: obtaining reference control parameters used to generate sample expressions; obtaining sample facial key points corresponding to the reference control parameters; inputting the sample facial key points into the parameter estimation model to be trained to obtain predicted control parameters; determining a second loss value based on the error between the predicted control parameters and the reference control parameters; and iteratively training the parameter estimation model to be trained in the direction of reducing the second loss value until the iteration stop condition is met, so as to obtain the trained parameter estimation model.
  • the reference control parameter is a control parameter used to train the parameter estimation model.
  • the sample facial key points are the facial key points used to train the parameter estimation model.
  • the predictive control parameter is the control parameter obtained by predicting the parameters of the key points of the sample face by the parameter prediction model to be trained.
  • an expression generating application runs in the terminal, and the terminal can randomly generate reference control parameters for generating sample expressions through the expression generating application.
  • After the terminal acquires the reference control parameters, it can acquire the sample facial key points corresponding to the reference control parameters.
  • the terminal can input the key points of the sample face into the parameter estimation model to be trained, and perform parameter estimation on the key points of the sample face through the parameter estimation model to be trained to obtain the predictive control parameters.
  • the terminal may determine an error between the predicted control parameter and the reference control parameter, and determine the second loss value based on the error between the predicted control parameter and the reference control parameter.
  • the terminal may perform iterative training on the parameter estimation model to be trained in the direction of reducing the second loss value, until the iteration stop condition is met, and a trained parameter estimation model is obtained.
  • the main interface of the expression generation application further includes a "start training" button 401.
  • The terminal can obtain reference control parameters for generating sample expressions, acquire the sample facial key points corresponding to the reference control parameters, and train the parameter estimation model to be trained through the reference control parameters and the corresponding sample facial key points.
  • When the terminal obtains the sample facial key points corresponding to the reference control parameters, it can be understood that, under the control of the reference control parameters, the virtual face is deformed to generate the corresponding sample expression, and the terminal can directly obtain the position information of the facial key points of the virtual face under the sample expression.
  • The mean square error function can be used as the loss function for training the parameter estimation model, and the loss function can be expressed by the following formula:
  • Loss = MeanSquaredError(P_predict, P_groundtruth)
  • where Loss represents the second loss value, P_predict represents the predicted control parameters, and P_groundtruth represents the reference control parameters.
  • In this embodiment, the parameter estimation model to be trained is iteratively trained through the reference control parameters and the sample facial key points corresponding to the reference control parameters, so that the parameter estimation accuracy of the parameter estimation model can be improved.
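  • A sketch of the training loop described here, assuming a hypothetical rig_keypoints(params) function that deforms the virtual face under the given control parameters and returns the resulting sample facial key points (standing in for the expression generation application's rig), and assuming control parameters normalized to [0, 1]:

```python
import torch
import torch.nn as nn

def train_parameter_estimator(model: nn.Module, rig_keypoints, num_params: int,
                              steps: int = 10000, batch: int = 64, lr: float = 1e-4) -> nn.Module:
    """Train the parameter estimation model on randomly generated reference control parameters."""
    criterion = nn.MSELoss()                         # Loss = MeanSquaredError(P_predict, P_groundtruth)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        ref_params = torch.rand(batch, num_params)   # randomly generated reference control parameters
        with torch.no_grad():
            sample_kp = rig_keypoints(ref_params)    # sample facial key points under the sample expression
        pred_params = model(sample_kp)               # predicted control parameters P_predict
        loss = criterion(pred_params, ref_params)    # second loss value
        optimizer.zero_grad()
        loss.backward()                              # step toward reducing the second loss value
        optimizer.step()
    return model
```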
  • In some embodiments, the sample facial key point is located in a target grid area of the face mesh, and the sample facial key point corresponds to a plurality of target vertices constituting the target grid area. Obtaining the sample facial key points corresponding to the reference control parameters includes: for each sample facial key point, determining each target vertex corresponding to the sample facial key point and determining the spatial coordinates of each target vertex under the sample expression, the spatial coordinates being coordinates in the world coordinate system; determining the area coordinates of the sample facial key point in the target grid area, the area coordinates being coordinates in an area coordinate system established based on the target grid area; and performing coordinate transformation on the area coordinates of the sample facial key point based on the spatial coordinates of each target vertex to obtain the spatial coordinates of the sample facial key point, so as to obtain the sample facial key points corresponding to the reference control parameters.
  • the target grid area is a grid area in the face grid where the key points of the sample face are located.
  • the target vertices are the mesh vertices of the face mesh constituting the target mesh region.
  • the terminal may determine the mesh vertices constituting the corresponding target mesh area as each target vertex corresponding to the sample face key point. Since the spatial coordinates of the mesh vertices of the face mesh are known, the terminal can directly obtain the spatial coordinates of each target vertex under the sample expression. Based on the target vertex, the terminal can directly determine the area coordinates of the key points of the sample face in the target grid area.
  • the terminal can perform coordinate transformation on the area coordinates of the sample face key points based on the spatial coordinates of each target vertex, so as to convert the area coordinates of the sample face key points in the area coordinate system into spatial coordinates in the world coordinate system, thereby obtaining the spatial coordinates of the sample face key points and hence the sample face key points corresponding to the reference control parameters.
  • the spatial coordinates of each target vertex under the sample expression can be accurately determined.
  • the coordinate transformation of the area coordinates of the sample face key points can then be performed to obtain accurate spatial coordinates of the sample face key points, so as to obtain the sample face key points corresponding to the reference control parameters, which improves the accuracy of the sample face key points.
  • the area coordinate system is a coordinate system established with an area coordinate origin; the area coordinate origin is any one of the plurality of target vertices constituting the target grid area. Performing coordinate transformation on the area coordinates of the sample face key points based on the spatial coordinates of each target vertex to obtain the spatial coordinates of the sample face key points includes: determining, based on the spatial coordinates of each target vertex and the area coordinates of the sample face key point, the relative position of the sample face key point relative to the area coordinate origin in the world coordinate system; obtaining the spatial coordinates of the area coordinate origin in the world coordinate system; and determining the spatial coordinates of the sample face key point based on the spatial coordinates of the area coordinate origin and the relative position.
  • the terminal may determine the relative position of the sample face key points relative to the area coordinate origin in the world coordinate system. Since the spatial coordinates of the grid vertices of the face grid are known, and the area coordinate origin is one of the target vertices constituting the target grid area, the terminal can directly obtain the spatial coordinates of the area coordinate origin in the world coordinate system. Furthermore, the terminal may determine the spatial coordinates of the sample face key points based on the spatial coordinates of the area coordinate origin and the relative position.
  • the relative position of the key points of the sample face in the world coordinate system relative to the origin of the area coordinates can be accurately determined. Since the position of the origin of the area coordinates in the world coordinate system is known, the space coordinates of the origin of the area coordinates in the world coordinate system can be obtained directly. Furthermore, based on the spatial coordinates and relative positions of the origin of the region coordinates, the spatial coordinates of the key points of the sample face can be accurately determined, further improving the accuracy of the spatial coordinates of the key points of the sample face.
  • the plurality of target vertices are three target vertices. Determining, based on the spatial coordinates of each target vertex and the area coordinates of the sample face key point, the relative position of the sample face key point relative to the area coordinate origin in the world coordinate system includes: determining a first space vector and a second space vector according to the spatial coordinates of each target vertex, the first space vector and the second space vector being vectors pointing from the area coordinate origin to the other two target vertices; determining a first area vector and a second area vector according to the area coordinates of the sample face key point, the first area vector having the same direction as the first space vector and the second area vector having the same direction as the second space vector; determining a first conversion ratio based on the first space vector and the first area vector, and a second conversion ratio based on the second space vector and the second area vector; converting the first area vector according to the first conversion ratio to obtain a first intermediate vector, and converting the second area vector according to the second conversion ratio to obtain a second intermediate vector; and determining, based on the first intermediate vector and the second intermediate vector, the relative position of the sample face key point relative to the area coordinate origin in the world coordinate system.
  • the terminal determines the first conversion ratio based on the first space vector and the first area vector, and determines the second conversion ratio based on the second space vector and the second area vector, which includes: the terminal can directly determine the ratio of the first area vector to the first space vector as the first conversion ratio, and directly determine the ratio of the second area vector to the second space vector as the second conversion ratio.
  • the terminal determines the relative position of the sample face key point relative to the area coordinate origin in the world coordinate system, which includes: the terminal can add the first intermediate vector and the second intermediate vector together and, based on the coordinates of the resulting vector, directly determine the relative position of the sample face key point relative to the area coordinate origin in the world coordinate system.
  • the first space vector and the second space vector can be directly determined according to the space coordinates of each target vertex, and the first area vector and the second area vector can be directly determined according to the area coordinates of the key points of the sample face.
  • based on the first space vector and the first area vector, a first conversion ratio can be determined, and based on the second space vector and the second area vector, a second conversion ratio can be determined.
  • the first intermediate vector can be obtained by converting the first area vector according to the first conversion ratio, and the second intermediate vector can be obtained by converting the second area vector according to the second conversion ratio.
  • the relative position of the sample face key points relative to the area coordinate origin in the world coordinate system can be accurately determined, thereby further improving the accuracy of that relative position.
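As an illustration of the coordinate transformation described above, the following sketch (a simplification under stated assumptions, not the patent's code) converts a key point's area coordinates into world coordinates by scaling the two space vectors that run from the area coordinate origin to the other two target vertices and then adding the origin's world coordinates; the function name and the NumPy representation are illustrative.

```python
import numpy as np

def area_to_world(v0: np.ndarray, v1: np.ndarray, v2: np.ndarray, area_uv) -> np.ndarray:
    """v0, v1, v2: world-space coordinates of the three target vertices, with v0 taken
    as the area coordinate origin. area_uv: (u, v) area coordinates of the key point."""
    u, v = area_uv
    first_space_vec = v1 - v0            # from the origin to the second target vertex
    second_space_vec = v2 - v0           # from the origin to the third target vertex
    # Scaling each space vector by the corresponding area coordinate plays the role of the
    # first/second conversion ratios; the sum of the two intermediate vectors is the key
    # point's position relative to the area coordinate origin, in world coordinates.
    relative_position = u * first_space_vec + v * second_space_vec
    return v0 + relative_position        # add the origin's world coordinates

# Example: a key point with area coordinates (0.4207, 0.2293) inside a triangular patch.
p = area_to_world(np.array([0.0, 0.0, 0.0]),
                  np.array([1.0, 0.0, 0.0]),
                  np.array([0.0, 1.0, 0.0]),
                  (0.4207, 0.2293))
```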
  • determining the area coordinates of the sample face key points in the target grid area includes: obtaining a key point index file, which records the vertex identifiers of each target vertex corresponding to each sample face key point, together with the area coordinates of the corresponding sample face key point in the target grid area; and searching the key point index file, using the vertex identifiers of the target vertices corresponding to a sample face key point, for the area coordinates corresponding to those vertex identifiers, so as to obtain the area coordinates of the sample face key point in the target grid area.
  • the key point index file is a file used to look up the area coordinates of the sample face key points in the target grid area.
  • Vertex ID is a string used to uniquely identify a vertex.
  • the user can mark facial key points on the virtual face under the first expression, record the vertex identifiers of each target vertex corresponding to each sample facial key point under the first expression in the key point index file, and bind and record the area coordinates of the corresponding sample face key points in the target grid area in the key point index file.
  • the terminal can obtain the key point index file recorded by the user.
  • the terminal can search the key point index file, using the vertex identifiers of the target vertices corresponding to a sample face key point, for the area coordinates corresponding to those vertex identifiers, and obtain the area coordinates of the sample face key point in the target grid area.
  • in the case where the number of target vertices is three and the target grid area is a triangular patch, the key point index file can be expressed as the following Table 1:

Table 1

| Mesh name | Vertex identifiers of the target vertices | Area coordinates in the triangular patch |
| --- | --- | --- |
| Name 1 | 1565, 2286, 2246 | (0.4207, 0.2293) |

  • as can be seen from the table, for example, for the mesh named "Name 1", the vertex identifiers of the three target vertices corresponding to its sample face key point are 1565, 2286 and 2246, and the area coordinates of that sample face key point in the triangular patch are (0.4207, 0.2293).
  • the area coordinates corresponding to the vertex identifiers are searched for directly in the key point index file, and the area coordinates of the sample face key points in the target grid area are obtained, which can improve the efficiency of acquiring the area coordinates of the sample face key points in the target grid area, thereby further improving the expression generation efficiency.
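The lookup can be sketched as follows, assuming for illustration that the key point index file is stored as a small JSON document whose entries pair target-vertex identifiers with area coordinates, as in Table 1; the file name, field names and JSON format are assumptions rather than anything specified by the patent.

```python
import json

def load_keypoint_index(path: str = "keypoint_index.json"):
    """Load the key point index file, e.g.
    {"Name1": {"vertex_ids": [1565, 2286, 2246], "area_coords": [0.4207, 0.2293]}}"""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

def lookup_area_coords(index, vertex_ids):
    """Return the area coordinates bound to a given set of target-vertex identifiers."""
    for entry in index.values():
        if set(entry["vertex_ids"]) == set(vertex_ids):
            return tuple(entry["area_coords"])
    raise KeyError(f"no key point bound to vertices {vertex_ids}")
```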
  • controlling the movement of the associated vertices in the face mesh based on the target control parameters to generate the second expression in the virtual face includes: using the target control parameters as the control parameters of a target controller, the target controller having a binding relationship with each vertex of the face mesh of the virtual face; and controlling, through the target controller and based on the target control parameters, the movement of the associated vertices in the face mesh to generate the second expression in the virtual face.
  • a target controller may run in the terminal, and the terminal may use the target control parameter as the control parameter of the target controller. Furthermore, the terminal can control the movement of associated vertices in the facial mesh under the first expression based on the target control parameters through the target controller, so as to control the transformation of the virtual face from the first expression to the second expression.
  • the main interface of the expression generation application further includes a "file" path, which is used to store controller information.
  • the user can select a target controller from a plurality of controllers stored in the "file”, and bind the target controller to each vertex of the face mesh of the virtual face.
  • the main interface of the expression generation application also includes an "auto frame" button 402.
  • by triggering the "auto frame" button 402, the terminal can automatically use the target control parameters as the control parameters of the target controller and, through the target controller and based on the target control parameters, control the movement of the associated vertices in the face mesh under the first expression, thereby controlling the transformation of the virtual face from the first expression to the second expression.
  • the target controller has a binding relationship with each mesh vertex of the face mesh of the virtual face. It can be understood that the position of each mesh vertex of the face mesh can be controlled to change through the target controller, so as to control the deformation of the face mesh to generate corresponding expressions.
  • the terminal may use the target control parameter as the control parameter of the target controller.
  • the target controller can include multiple control parameters, such as transformation parameters, rotation parameters, level parameters, shear parameters, rotation sequence parameters, and rotation axis parameters, etc.
  • the target control parameters can be the transformation parameters among the control parameters of the target controller.
  • the transformation parameter is a parameter used to control the position of the associated vertex for movement and transformation. It can be understood that the transformation parameters can control the associated vertices to move to corresponding positions, so that the facial key points of the virtual face meet the corresponding key point distribution characteristics.
  • the terminal can control the movement of associated vertices in the facial mesh under the first expression based on the target control parameters through the target controller, so as to control the transformation of the virtual face from the first expression to the second expression.
  • the terminal can generate a second expression 1302 on the virtual face based on the second expression 1301 of the real face through the target controller. It can be understood that, for example, if the second expression is "wink”, the terminal can generate the expression “wink” on the virtual face through the target controller based on the expression of "wink” on the real face.
  • the target control parameters are directly used as the control parameters of the target controller, and then, through the target controller and based on the target control parameters, the movement of the associated vertices in the face mesh can be directly controlled to generate the second expression in the virtual face, which further improves the efficiency of expression generation.
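As a hedged illustration of how target control parameters can drive the bound vertices, the sketch below uses a simple linear binding in which each controller translation displaces its associated vertices according to fixed binding weights; an actual rig controller in a 3D animation tool exposes richer attributes (rotation, shear, rotation order and so on), so this is only a simplified stand-in, with all names being illustrative.

```python
import numpy as np

def apply_controller(base_vertices: np.ndarray,   # (V, 3) face mesh under the first expression
                     bind_weights: np.ndarray,    # (V, C) weight of each controller on each vertex
                     target_params: np.ndarray    # (C, 3) translation per controller (target control parameters)
                     ) -> np.ndarray:
    """Return deformed vertex positions intended to produce the second expression."""
    offsets = bind_weights @ target_params        # (V, 3) displacement of each associated vertex
    return base_vertices + offsets
```

Vertices whose binding weights for a given controller are zero are unaffected, which matches the idea that only the associated vertices related to the second expression are moved.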
  • the terminal can obtain a first expression image of the real face under the first expression and a second expression image of the real face under the second expression (that is, the expression images 1401 of the real face), locate, through the trained key point detection model, the first position information of the facial key points of the real face from the first expression image and the second position information of the facial key points of the real face from the second expression image (that is, perform face key point detection to obtain the facial key points of the real face), and determine the difference between the first position information and the second position information to obtain the expression position difference information.
  • the terminal may acquire facial key points of the virtual face under the first expression to obtain initial virtual facial key points.
  • based on the expression position difference information and the initial virtual face key points, the target virtual face key points of the virtual face under the second expression are obtained (that is, the face key points are converted to obtain the face key points of the virtual face under the second expression).
  • the terminal can input the key points of the target virtual face into the trained parameter estimation model for parameter estimation to obtain the target control parameters.
  • the terminal may use the target control parameter as the control parameter of the bound target controller, and control the generation of the second expression on the virtual face (that is, the expression image 1402 of the virtual face) based on the target control parameter through the target controller.
  • a method for generating facial expressions is provided, which specifically includes the following steps:
  • Step 1502 acquire sample data; the sample data includes sample expression images and reference position information of facial key points marked for the sample expression images.
  • Step 1504 input the sample expression image to the key point detection model to be trained, and obtain the predicted position information of the key point of the face.
  • Step 1506 Determine a first loss value based on the error between the predicted position information of the key points of the face and the reference position information of the key points of the face.
  • Step 1508 iteratively trains the key point detection model to be trained in the direction of reducing the first loss value, until the iteration stop condition is met, and a trained key point detection model is obtained.
  • Step 1510 obtain the reference control parameters used to generate the sample expression; the sample face key points corresponding to the reference control parameters are located in a target grid area in the face grid of the virtual face; the sample face key points correspond to a plurality of target vertices constituting the target grid area.
  • Step 1512 for each sample facial key point, determine each target vertex corresponding to the sample facial key point, and determine the space coordinates of each target vertex under the sample expression; the space coordinates are the coordinates in the world coordinate system.
  • Step 1514 determine the area coordinates of the key points of the sample face in the target grid area; the area coordinates are the coordinates in the area coordinate system established based on the target grid area.
  • Step 1516 based on the spatial coordinates of each target vertex, perform coordinate transformation on the area coordinates of the sample facial key points to obtain the spatial coordinates of the sample facial key points, so as to obtain the sample facial key points corresponding to the reference control parameters.
  • Step 1518 input the key points of the sample face into the parameter estimation model to be trained to obtain the predictive control parameters; based on the error between the predictive control parameters and the reference control parameters, determine the second loss value.
  • Step 1520 iteratively trains the parameter estimation model to be trained in the direction of reducing the second loss value until the iteration stop condition is met, and a trained parameter estimation model is obtained.
  • Step 1522 acquire the first expression image of the real face under the first expression; acquire the second expression image of the real face under the second expression.
  • Step 1524 using the trained key point detection model, locate the first position information of the facial key points of the real face from the first expression image, and locate the second position information of the facial key points of the real face from the second expression image.
  • Step 1526 acquire the facial key points of the virtual face under the first expression, and obtain the initial virtual facial key points; acquire the third position information of the initial virtual facial key points.
  • Step 1528 Perform normalization processing on the first position information, the second position information and the third position information respectively to obtain the normalized first position information, the normalized second position information and the normalized third position information (a sketch of this normalization and the subsequent transfer is given after the step list below).
  • Step 1530 based on the difference between the normalized first position information and the normalized second position information, obtain the expression position difference information; the expression position difference information is used to represent the position difference of the facial key points of the real face under the first expression and the second expression.
  • Step 1532 Adjust the normalized third position information according to the expression position difference information to obtain the target virtual face key points of the virtual face under the second expression.
  • Step 1534 input the key points of the target virtual face into the trained parameter estimation model, so as to extract the distribution feature of the key points of the target virtual face through the trained parameter estimation model, and obtain the key point distribution feature, based on the key point Parameter estimation is performed on the distribution feature, and a target control parameter is output;
  • the target control parameter is a parameter used to control an associated vertex in the face mesh of the virtual face and related to the second expression.
  • Step 1536 using the target control parameters as the control parameters of the target controller; the target controller has a binding relationship with each vertex of the face mesh of the virtual face.
  • Step 1538 through the target controller, control the movement of the associated vertices in the face mesh based on the target control parameters, so as to generate a second expression on the virtual face.
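The normalization, difference transfer and denormalization of steps 1528 to 1532 can be summarized with the following sketch, which assumes 2D or 3D key point arrays, uses the centre of the four eye-corner key points as the reference point and the distance between the two temple key points as the scaling ratio (one reading of the position stability condition), and is an illustration rather than the patent's implementation; index names are assumptions.

```python
import numpy as np

def normalize(keypoints, eye_corner_idx, temple_idx):
    """keypoints: (N, 2) or (N, 3). Returns normalized keypoints plus the reference
    point O and scaling ratio L needed later for denormalization."""
    O = keypoints[eye_corner_idx].mean(axis=0)                                   # reference point
    L = np.linalg.norm(keypoints[temple_idx[0]] - keypoints[temple_idx[1]])      # scaling ratio
    return (keypoints - O) / L, O, L

def transfer_expression(real_first, real_second, virtual_first,
                        eye_corner_idx, temple_idx):
    """Return the virtual face's target key points under the second expression."""
    nk_r1, _, _ = normalize(real_first, eye_corner_idx, temple_idx)
    nk_r2, _, _ = normalize(real_second, eye_corner_idx, temple_idx)
    nk_g1, O_g, L_g = normalize(virtual_first, eye_corner_idx, temple_idx)
    diff = nk_r2 - nk_r1                  # expression position difference information
    intermediate = nk_g1 + diff           # adjusted, still-normalized (intermediate state) positions
    return intermediate * L_g + O_g       # denormalize back to the virtual face's coordinates
```

The resulting target virtual face key points would then be fed to the trained parameter estimation model (step 1534) to obtain the target control parameters.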
  • the present application also provides an application scenario, where the above-mentioned expression generation method is applied.
  • the method for generating expressions can be applied to the scene of generating expressions of three-dimensional virtual characters in games.
  • the terminal can obtain sample data; the sample data includes sample facial expression images and reference location information of key points of human faces marked on the sample facial expression images.
  • the first loss value is determined based on an error between the predicted position information of the key points of the human face and the reference position information of the key points of the human face.
  • the face key point detection model to be trained is iteratively trained until the iteration stop condition is met, and the trained face key point detection model is obtained.
  • the terminal can obtain reference control parameters for generating sample facial expressions; the key points of the sample face corresponding to the reference control parameters are located in the target grid area in the face grid of the virtual face; the key points of the sample face and the constituent target Corresponding to multiple target vertices in the mesh area.
  • For each key point of the sample face, determine each target vertex corresponding to the key point of the sample face, and determine the spatial coordinates of each target vertex under the sample expression; the spatial coordinates are the coordinates in the world coordinate system.
  • the terminal may acquire a first expression image of a real human face under the first expression; acquire a second expression image of a real human face under the second expression.
  • through the trained face key point detection model, the first position information of the face key points of the real human face is located from the first expression image, and the second position information of the face key points of the real human face is located from the second expression image. The face key points of the virtual human face under the first expression are acquired to obtain the initial virtual human face key points, and the third position information of the initial virtual human face key points is obtained.
  • the first position information, the second position information and the third position information are respectively normalized to obtain the normalized first position information, the normalized second position information and the normalized third position information. location information.
  • based on the difference between the normalized first position information and the normalized second position information, the expression position difference information is obtained; the expression position difference information is used to represent the position difference of the face key points of the real face under the first expression and the second expression.
  • the normalized third position information is adjusted according to the expression position difference information to obtain the target virtual face key points of the virtual face under the second expression.
  • the terminal can input the target virtual face key points into the trained parameter estimation model, so as to extract the distribution features of the target virtual face key points through the trained parameter estimation model, obtain the key point distribution features, perform parameter estimation based on the key point distribution features, and output the target control parameters;
  • the target control parameter is a parameter used to control an associated vertex in the face grid of the virtual face and related to the second expression.
  • the target control parameter is used as the control parameter of the target controller; the target controller has a binding relationship with each vertex of the face grid of the virtual face.
  • the associated vertices in the face mesh are controlled to move based on the target control parameters, so as to generate the second expression in the virtual human face.
  • the present application further provides an application scenario, where the above-mentioned expression generation method is applied.
  • the expression generation method can be applied to a virtual animal expression generation scene.
  • the terminal can obtain expression position difference information obtained based on real animal faces; the expression position difference information is used to represent the position difference of animal face key points of real animal faces under the first expression and the second expression.
  • the animal face key points of the virtual animal face under the first expression are obtained to obtain the initial virtual animal face key points; the virtual animal face is the animal face of the virtual object.
  • the target virtual animal face key points of the virtual animal face under the second expression are obtained.
  • although the steps in the flow charts of the above embodiments are shown sequentially, these steps are not necessarily executed sequentially. Unless otherwise specified herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some of the steps in the above embodiments may include multiple sub-steps or multiple stages, which are not necessarily executed at the same time but may be executed at different times, and the order of their execution is not necessarily sequential; they may be performed in turn or alternately with at least a part of the other steps, or with the sub-steps or stages of the other steps.
  • an expression generation apparatus 1600 is provided.
  • This apparatus can adopt software modules or hardware modules, or a combination of the two to become a part of computer equipment.
  • the apparatus specifically includes:
  • the obtaining module 1601 is used to obtain the expression position difference information obtained based on the real face, where the expression position difference information is used to represent the position difference of the facial key points of the real face under the first expression and the second expression, and to acquire the face key points of the virtual face under the first expression to obtain the initial virtual face key points; the virtual face is the face of the virtual object.
  • the determination module 1602 is used to obtain the target virtual face key points of the virtual face under the second expression based on the expression position difference information and the initial virtual face key points; extract key point distribution features based on the target virtual face key points, and A target control parameter is determined based on the key point distribution feature; the target control parameter is a parameter used to control an associated vertex in the face mesh of the virtual face and related to the second expression.
  • a generation module 1603 configured to control movement of associated vertices in the face mesh based on the target control parameters, so as to generate a second expression on the virtual face.
  • the obtaining module 1601 is also used to obtain the first expression image of the real face under the first expression; obtain the second expression image of the real face under the second expression; from the first expression image, locate first location information of the facial key points of the real face; from the second expression image, locate the second location information of the facial key points of the real face; determine the difference between the first location information and the second location information , to get the expression position difference information.
  • the first position information and the second position information are detected through a trained key point detection model; the device also includes:
  • the first training module 1604 is used to obtain sample data, the sample data including sample expression images and the reference position information of the facial key points marked for the sample expression images; input the sample expression images into the key point detection model to be trained to obtain the predicted position information of the facial key points; determine the first loss value based on the error between the predicted position information of the facial key points and the reference position information of the facial key points; and iteratively train the key point detection model to be trained in the direction of reducing the first loss value until the iteration stop condition is satisfied, so as to obtain the trained key point detection model.
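A minimal sketch of the first training module's behaviour is given below, assuming the key point detection model directly regresses 2D key point coordinates from an expression image and is supervised with the mean-squared-error loss L_loss = MSE(K_predict, K_groundtruth); the ResNet-18 backbone, the use of PyTorch/torchvision and the function names are illustrative assumptions, not the patent's specified architecture.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class KeypointDetector(nn.Module):
    """Regresses 2D facial key point coordinates from an RGB expression image (illustrative)."""
    def __init__(self, num_keypoints: int):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Linear(backbone.fc.in_features, num_keypoints * 2)
        self.backbone = backbone
        self.num_keypoints = num_keypoints

    def forward(self, images):                       # images: (B, 3, H, W)
        return self.backbone(images).view(-1, self.num_keypoints, 2)

def detection_train_step(model, optimizer, images, reference_keypoints):
    """One iteration toward reducing the first loss value."""
    predicted = model(images)                                        # predicted position information
    loss = nn.functional.mse_loss(predicted, reference_keypoints)    # first loss value
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```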
  • the obtaining module 1601 is also used to obtain the third position information of the initial virtual face key points; normalize the first position information, the second position information and the third position information respectively to obtain the normalized first position information, the normalized second position information and the normalized third position information; and obtain the expression position difference information based on the difference between the normalized first position information and the normalized second position information. The determination module 1602 is also used to adjust the normalized third position information according to the expression position difference information to obtain the target virtual face key points of the virtual face under the second expression.
  • the acquiring module 1601 is further configured to: when normalizing each of the first position information, the second position information and the third position information, use the position information being normalized as the current position information; for each item of current position information, determine the reference facial key points that meet the position stability condition; determine the reference point corresponding to the current position information based on the reference facial key points, and determine the relative distance between the reference facial key points and the reference point; determine the zoom ratio based on the distance between any two reference facial key points; and normalize the current position information according to the relative distance and the zoom ratio.
  • the determining module 1602 is further configured to adjust the normalized third position information according to the expression position difference information to obtain the intermediate state position information of the facial key points of the virtual face under the second expression ; Based on the relative distance and scaling ratio corresponding to the third position information, the intermediate state position information is denormalized to obtain the target virtual face key points of the virtual face under the second expression.
  • the determining module 1602 is also used to input the key points of the target virtual face into the trained parameter estimation model, so as to perform distribution feature extraction on the key points of the target virtual face through the trained parameter estimation model, The distribution characteristics of the key points are obtained, and the parameters are estimated based on the distribution characteristics of the key points, and the target control parameters are output.
  • the device also includes:
  • the second training module 1605 is used to obtain reference control parameters for generating sample expressions; obtain sample facial key points corresponding to the reference control parameters; input sample facial key points to the parameter estimation model to be trained to obtain predictive control Parameters; based on the error between the predictive control parameters and the reference control parameters, determine the second loss value; towards the direction of reducing the second loss value, iteratively train the parameter estimation model to be trained until the iteration stop condition is satisfied , to get the trained parameter estimation model.
  • the sample face key points are located in a target grid area in the face grid; the sample face key points correspond to a plurality of target vertices forming the target grid area; the second training module 1605 is also used to: for each sample face key point, determine each target vertex corresponding to the sample face key point, and determine the spatial coordinates of each target vertex under the sample expression, the spatial coordinates being the coordinates in the world coordinate system; determine the area coordinates of the sample face key point in the target grid area, the area coordinates being the coordinates in the area coordinate system established based on the target grid area; and perform coordinate transformation on the area coordinates of the sample face key point based on the spatial coordinates of each target vertex to obtain the spatial coordinates of the sample face key point, so as to obtain the sample face key points corresponding to the reference control parameters.
  • the area coordinate system is a coordinate system established with the area coordinate origin; the area coordinate origin is any one of a plurality of target vertices constituting the target grid area; the second training module 1605 is also used to The spatial coordinates of the target vertex and the area coordinates of the key points of the sample face determine the relative position of the key points of the sample face in the world coordinate system relative to the origin of the area coordinates; obtain the space coordinates of the origin of the area coordinates in the world coordinate system; based on the area The spatial coordinates and relative position of the coordinate origin determine the spatial coordinates of the key points of the sample face.
  • the second training module 1605 is also used to obtain the key point index file, which records the vertex identifiers of each target vertex corresponding to each sample facial key point, together with the area coordinates, corresponding to those vertex identifiers, of the corresponding sample face key points in the target grid area; and to search the key point index file, using the vertex identifiers of the target vertices corresponding to the sample face key points, for the area coordinates corresponding to the vertex identifiers, so as to obtain the area coordinates of the sample face key points in the target grid area.
  • the generation module 1603 is also used to use the target control parameters as the control parameters of the target controller; the target controller has a binding relationship with each vertex of the face mesh of the virtual face; through the target controller, based on The target control parameters control movement of associated vertices in the face mesh to generate a second expression in the virtual face.
  • the expression generating device 1600 may further include a first training module 1604 and a second training module 1605 .
  • the above expression generation apparatus obtains expression position difference information based on a real face, where the expression position difference information can be used to represent the position differences of the facial key points of the real face under the first expression and the second expression, and acquires the face key points of the virtual face under the first expression to obtain the initial virtual face key points. Based on the expression position difference information and the initial virtual face key points, the target virtual face key points of the virtual face under the second expression can be obtained, the key point distribution features can be extracted based on the target virtual face key points, and based on the key point distribution features, the target control parameters can be quickly determined, where the target control parameters are parameters used to control the associated vertices, in the face mesh of the virtual face, that are related to the second expression.
  • furthermore, by controlling the movement of the associated vertices in the face mesh based on the target control parameters, the second expression can be generated in the virtual face.
  • compared with traditional expression generation methods, the expression generation method of this application only needs to obtain the face key points of the virtual face under the first expression and combine them with the expression position differences of the facial key points of the real face under different expressions to conveniently control the virtual face to produce the correspondingly changed expression, which improves the expression generation efficiency.
  • Each module in the above-mentioned expression generation device can be fully or partially realized by software, hardware and a combination thereof.
  • the above-mentioned modules can be embedded in or independent of the processor in the computer device in the form of hardware, and can also be stored in the memory of the computer device in the form of software, so that the processor can invoke and execute the corresponding operations of the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a terminal, and its internal structure may be as shown in FIG. 18 .
  • the computer device includes a processor, a memory, a communication interface, a display screen and an input device connected through a system bus. Wherein, the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and computer readable instructions.
  • the internal memory provides an environment for the execution of the operating system and computer readable instructions in the non-volatile storage medium.
  • the communication interface of the computer device is used to communicate with an external terminal in a wired or wireless manner, and the wireless manner can be realized through WIFI (Wireless Fidelity), an operator network, NFC (Near Field Communication) or other technologies.
  • when the computer-readable instructions are executed by the processor, a method for generating facial expressions is realized.
  • the display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen
  • the input device of the computer device may be a touch layer covered on the display screen, or a button, a trackball or a touch pad provided on the casing of the computer device , and can also be an external keyboard, touchpad, or mouse.
  • Figure 18 is only a block diagram of a partial structure related to the solution of this application, and does not constitute a limitation to the computer equipment on which the solution of this application is applied.
  • the specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
  • a computer device is provided, including a memory and one or more processors, where computer-readable instructions are stored in the memory, and the steps in the above method embodiments are implemented when the one or more processors execute the computer-readable instructions.
  • one or more computer-readable storage media are provided, storing computer-readable instructions, and when the computer-readable instructions are executed by one or more processors, the steps in the foregoing method embodiments are implemented.
  • a computer program product including computer-readable instructions, and when the computer-readable instructions are executed by one or more processors, the steps in the foregoing method embodiments are implemented.
  • Non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory or optical memory, etc.
  • Volatile memory can include Random Access Memory (RAM) or external cache memory.
  • RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).
  • the user information involved includes but is not limited to user equipment information, user personal information, etc., and the data involved includes but is not limited to data used for analysis, stored data, displayed data, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Geometry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Computer Graphics (AREA)
  • Processing Or Creating Images (AREA)

Abstract

An expression generation method, apparatus, device, medium and computer program product, belonging to the field of artificial intelligence technology and the field of basic game technology. The method includes: acquiring expression position difference information of facial key points of a real face under a first expression and a second expression (202); acquiring initial virtual face key points of a virtual face under the first expression (204); obtaining, based on the expression position difference information and the initial virtual face key points, target virtual face key points of the virtual face under the second expression (206); extracting key point distribution features based on the target virtual face key points, and determining, based on the key point distribution features, target control parameters for controlling associated vertices, in the face mesh of the virtual face, that are related to the second expression (208); and controlling movement of the associated vertices in the face mesh based on the target control parameters, so as to generate the second expression in the virtual face (210).

Description

表情生成方法、装置、设备、介质和计算机程序产品
本申请要求于2021年12月06日提交中国专利局,申请号为2021114733417、发明名称为“表情生成方法、装置、设备、介质和计算机程序产品”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及人工智能技术,更涉及游戏基础技术领域,特别是涉及一种表情生成方法、装置、设备、介质和计算机程序产品。
背景技术
随着人工智能技术的发展,出现了角色表情生成技术。角色表情生成技术,是指通过计算机自动生成角色复杂表情的技术,比如,可通过3D角色表情生成技术,自动生成3D(Three-Dimensional)角色复杂表情。传统技术中,首先将需要生成的复杂表情分解为多个元表情,然后通过动画师制作出该复杂表情对应的所有元表情,进而再基于每个元表情对应的程度值,将该复杂表情对应的所有元表情进行表情合成,得到角色的表情。
然而,传统的角色表情生成方法,针对不同的角色,均需要分别制作元表情,而每一个元表情的制作都需要花费大量的时间,从而导致表情生成效率低。
发明内容
根据本申请提供的各种实施例,提供一种表情生成方法、装置、设备、介质和计算机程序产品。
一种表情生成方法,应用于终端,所述方法包括:
获取基于真实脸部得到的表情位置差异信息;所述表情位置差异信息,用于表征真实脸部的脸部关键点在第一表情和第二表情下的位置差异;
获取在第一表情下的虚拟脸部的脸部关键点,得到初始虚拟脸部关键点;所述虚拟脸部是虚拟对象的脸部;
基于所述表情位置差异信息和所述初始虚拟脸部关键点,得到所述虚拟脸部在第二表情下的目标虚拟脸部关键点;
基于所述目标虚拟脸部关键点提取关键点分布特征,并基于所述关键点分布特征,确定目标控制参数;所述目标控制参数,是用于对所述虚拟脸部的脸部网格中的、且与所述第二表情相关的关联顶点进行控制的参数;
基于所述目标控制参数控制所述脸部网格中的所述关联顶点移动,以在所述虚拟脸部中产生所述第二表情。
一种表情生成装置,所述装置包括:
获取模块,用于获取基于真实脸部得到的表情位置差异信息;所述表情位置差异信息,用于表征真实脸部的脸部关键点在第一表情和第二表情下的位置差异;获取在第一表情下的虚拟脸部的脸部关键点,得到初始虚拟脸部关键点;所述虚拟脸部是虚拟对象的脸部;
确定模块,用于基于所述表情位置差异信息和所述初始虚拟脸部关键点,得到所述虚拟脸部在第二表情下的目标虚拟脸部关键点;基于所述目标虚拟脸部关键点提取关键点分布特征,并基于所述关键点分布特征,确定目标控制参数;所述目标控制参数,是用于对所述虚拟脸部的脸部网格中的、且与所述第二表情相关的关联顶点进行控制的参数;
生成模块,用于基于所述目标控制参数控制所述脸部网格中的所述关联顶点移动,以在所述虚拟脸部中产生所述第二表情。
一种计算机设备,包括存储器和一个或多个处理器,存储器中存储有计算机可读指令,该一个或多个处理器执行计算机可读指令时实现本申请各方法实施例中的步骤。
一个或多个计算机可读存储介质,存储有计算机可读指令,该计算机可读指令被一个 或多个处理器执行时实现本申请各方法实施例中的步骤。
一种计算机可读指令产品,包括计算机可读指令,计算机可读指令被一个或多个处理器执行时实现本申请各方法实施例中的步骤。
本申请的一个或多个实施例的细节在下面的附图和描述中提出。本申请的其它特征、目的和优点将从说明书、附图以及权利要求书变得明显。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为一个实施例中表情生成方法的应用环境图;
图2为一个实施例中表情生成方法的流程示意图;
图3为一个实施例中真实脸部的脸部关键点示意图;
图4为一个实施例中表情生成应用的主界面示意图;
图5为一个实施例中标注虚拟脸部的脸部关键点的界面示意图;
图6为一个实施例中第一表情下虚拟脸部的初始脸部关键点示意图;
图7为一个实施例中虚拟脸部的脸部网格示意图;
图8为一个实施例中第二表情下虚拟脸部的目标脸部关键点示意图;
图9为一个实施例中关键点检测流程示意图;
图10为一个实施例中参数预估流程示意图;
图11为一个实施例中虚拟脸部的脸部网格与目标控制器的绑定示意图;
图12为一个实施例中目标控制器的控制参数设置示意图;
图13为一个实施例中制作完成的在第二表情下的虚拟脸部示意图;
图14为另一个实施例中表情生成方法的流程示意图;
图15为又一个实施例中表情生成方法的流程示意图;
图16为一个实施例中表情生成装置的结构框图;
图17为另一个实施例中表情生成装置的结构框图;
图18为一个实施例中计算机设备的内部结构图。
具体实施方式
为了使本申请的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行进一步详细说明。应当理解,此处描述的具体实施例仅仅用以解释本申请,并不用于限定本申请。
本申请提供的表情生成方法,可以应用于如图1所示的应用环境中。其中,终端102通过网络与服务器104进行通信。其中,终端102可以但不限于是各种个人计算机、笔记本电脑、智能手机、平板电脑、便携式可穿戴设备和车载终端,服务器104可以是独立的物理服务器,也可以是多个物理服务器构成的服务器集群或者分布式***,还可以是提供云服务、云数据库、云计算、云函数、云存储、网络服务、云通信、中间件服务、域名服务、安全服务、CDN、以及大数据和人工智能平台等基础云计算服务的云服务器。终端102以及服务器104可以通过有线或无线通信方式进行直接或间接地连接,本申请在此不做限制。
终端102可从服务器104中获取真实脸部在第一表情下的第一表情图像,以及真实脸部在第二表情下的第二表情图像。终端102可基于第一表情图像和第二表情图像,获取第一表情和第二表情之间的表情位置差异信息;表情位置差异信息,用于表征真实脸部的脸部关键点在第一表情和第二表情下的位置差异。终端102可获取在第一表情下的虚拟脸部的脸部关键点,得到初始虚拟脸部关键点;虚拟脸部是虚拟对象的脸部。终端102可基于 表情位置差异信息和初始虚拟脸部关键点,得到虚拟脸部在第二表情下的目标虚拟脸部关键点,基于目标虚拟脸部关键点提取关键点分布特征,并基于关键点分布特征,确定目标控制参数;目标控制参数,是用于对虚拟脸部的脸部网格中的、且与第二表情相关的关联顶点进行控制的参数。终端102可基于目标控制参数控制脸部网格中的关联顶点移动,以在虚拟脸部中产生第二表情。
需要说明的是,本申请一些实施例中的表情生成方法使用到了人工智能技术。比如,目标虚拟脸部关键点对应的关键点分布特征,则属于使用人工智能技术提取得到的特征,以及,目标控制参数,也属于使用人工智能技术预测得到的参数。
此外,本申请一些实施例中的表情生成方法还使用到了计算机视觉技术(Computer Vision,CV)。比如,真实脸部的脸部关键点的第一位置信息和第二位置信息,则属于使用计算机视觉技术,分别从第一表情图像和第二表情图像中定位得到的信息。
在一个实施例中,如图2所示,提供了一种表情生成方法,该方法可应用于终端,也可应用于终端与服务器的交互过程。本实施例以该方法应用于图1中的终端102为例进行说明,包括以下步骤:
步骤202,获取基于真实脸部得到的表情位置差异信息;表情位置差异信息,用于表征真实脸部的脸部关键点在第一表情和第二表情下的位置差异。
其中,真实脸部是真实对象的脸部。比如,真实的人的脸部。可以理解,真实对象并不限定于人,还可以是其他任何具有脸部表情的对象,比如,动物。脸部关键点,是用于定位脸部上关键区域位置的关键点。脸部关键点可以包括脸部轮廓的关键点以及脸部器官的关键点。第一表情是第一种类型的表情,第二表情是第二种类型的表情,可以理解,第一表情和第二表情是不同类型的表情,比如,第一表情是无表情,第二表情是笑,又比如,第一表情是笑,第二表情是哭。表情位置差异信息,是真实脸部的脸部关键点在第一表情下的位置与真实脸部的脸部关键点在第二表情下的位置之间的差异信息。
可以理解,终端可以直接从服务器获取表情位置差异信息,即,服务器预先可以基于真实脸部的脸部关键点在第一表情和第二表情下的位置差异,确定表情位置差异信息,终端可以直接从服务器获取该表情位置差异信息。终端也可以获取真实脸部的脸部关键点在第一表情下的位置信息,以及获取真实脸部的脸部关键点在第二表情下的位置信息。终端可对真实脸部的脸部关键点在第一表情下的位置信息和真实脸部的脸部关键点在第二表情下的位置信息进行差异比对,以确定真实脸部的脸部关键点在第一表情和第二表情下的表情位置差异信息。
在一个实施例中,服务器中存储有真实脸部的脸部关键点在第一表情下的位置信息,以及真实脸部的脸部关键点在第二表情下的位置信息。服务器可将真实脸部的脸部关键点在第一表情下的位置信息,以及真实脸部的脸部关键点在第二表情下的位置信息发送至终端。终端可接收真实脸部的脸部关键点在第一表情下的位置信息,以及真实脸部的脸部关键点在第二表情下的位置信息。
在一个实施例中,终端可对在第一表情下的真实脸部进行关键点检测处理,获得真实脸部的脸部关键点在第一表情下的位置信息。终端可对在第二表情下的真实脸部进行关键点检测处理,获得真实脸部的脸部关键点在第二表情下的位置信息,参照图3,图中为真实脸部的脸部关键点在第二表情下的分布情况。
在一个实施例中,终端在获取真实脸部的脸部关键点在第一表情下的位置信息,以及获取真实脸部的脸部关键点在第二表情下的位置信息之后,可对真实脸部的脸部关键点在第一表情下的位置信息与该真实脸部的脸部关键点在第二表情下的位置信息求差值,然后将求取的差值直接作为真实脸部的脸部关键点在第一表情和第二表情下的表情位置差异信息。
步骤204,获取在第一表情下的虚拟脸部的脸部关键点,得到初始虚拟脸部关键点; 虚拟脸部是虚拟对象的脸部。
其中,虚拟对象,是虚拟的对象,比如,游戏中的三维虚拟角色。可以理解,虚拟对象还可以是其他非游戏场景中虚拟出的非真实的对象,比如,一些设计行业中虚拟出的模型。初始虚拟脸部关键点,是虚拟脸部在第一表情下的脸部关键点。
具体地,终端可在显示界面展示在第一表情下的虚拟脸部,用户可对在第一表情下的虚拟脸部进行关键点标注操作,进而,终端可获取用户在第一表情下的虚拟脸部所标注的脸部关键点,得到在第一表情下的虚拟脸部的初始虚拟脸部关键点。
在一个实施例中,终端中运行有表情生成应用,表情生成应用可提供关键点标注界面,关键点标注界面中可包括关键点标注触发控件。终端可在关键点标注界面展示在第一表情下的虚拟脸部,用户可基于关键点标注触发控件,对在第一表情下的虚拟脸部进行关键点标注操作,进而,终端可获取用户在第一表情下的虚拟脸部所标注的脸部关键点,得到在第一表情下的虚拟脸部的初始虚拟脸部关键点。其中,表情生成应用是用于生成表情的应用程序。
在一个实施例中,图4为表情生成应用的主界面示意图,该主界面包括项目路径和头像标注文件等文件路径,其中,项目路径可用于存储虚拟脸部的脸部关键点对应的项目,头像标注文件可用于记录虚拟脸部的脸部关键点的位置信息。用户可基于主界面中的“标注头像”按钮403,对在第一表情下的虚拟脸部进行关键点标注操作,进而终端可获取在第一表情下的虚拟脸部的初始虚拟脸部关键点。
在一个实施例中,如图5所示,在用户触发“标注头像”按钮403时,表情生成应用的标注页面可展示已标注有脸部关键点的真实脸部,进而,用户可参照该已标注有脸部关键点的真实脸部,对在第一表情下的虚拟脸部进行关键点标注操作,得到初始虚拟脸部关键点,且初始虚拟脸部关键点在虚拟脸部的分布情况如图6所示。
步骤206,基于表情位置差异信息和初始虚拟脸部关键点,得到虚拟脸部在第二表情下的目标虚拟脸部关键点。
其中,目标虚拟脸部关键点,是虚拟脸部在第二表情下的脸部关键点。
具体地,终端可基于表情位置差异信息,确定初始虚拟脸部关键点的调整信息。进而终端可将初始虚拟脸部关键点,按照确定的调整信息进行调整,调整完成后,得到虚拟脸部在第二表情下的目标虚拟脸部关键点。
在一个实施例中,调整信息包括移动距离。终端可确定初始虚拟脸部关键点当前所在的位置,以及基于表情位置差异信息,确定初始虚拟脸部关键点的调整信息。进而终端可将初始虚拟脸部关键点,按照确定的移动距离,从初始虚拟脸部关键点当前所在的位置进行移动,得到虚拟脸部在第二表情下的目标虚拟脸部关键点。
步骤208,基于目标虚拟脸部关键点提取关键点分布特征,并基于关键点分布特征,确定目标控制参数;目标控制参数,是用于对虚拟脸部的脸部网格中的、且与第二表情相关的关联顶点进行控制的参数。
其中,关键点分布特征,是目标虚拟脸部关键点在虚拟脸部的分布特征。脸部网格是构成虚拟脸部的网格。可以理解,脸部网格是在每次表情改变时所获取得到的一组脸部形态网格。脸部网格包括多个网格区域,网格区域是构成脸部网格的最小单元。脸部网格中包括多条网线条以及多个网格顶点,网格顶点是两条以上的网线条相交的点。关联顶点,是脸部网格中与第二表情相关的网格顶点。可以理解,关联顶点即是与目标虚拟脸部关键点同在一个网格区域的网格顶点。
在一个实施例中,虚拟脸部可设置于脸部网格上,每个网格包括多个网格顶点。终端可将与第二表情相关的网格顶点(即,目标虚拟脸部关键点对应的网格顶点)确定为关联顶点,比如,参考图7,图中的701(即,黑点)为与第二表情相关的其中一个关联顶点,图中的702(即,白点)为目标虚拟脸部关键点。
可以理解,关键点分布特征,用于表征目标虚拟脸部关键点在虚拟脸部上的位置分布情况。在不同的表情下,脸部关键点在脸部的位置分布情况不同。因而,终端可对目标虚拟脸部关键点进行分布特征提取处理,即,终端可以分析虚拟脸部中各目标虚拟脸部关键点之间的相对位置信息,由于各目标虚拟脸部关键点之间的相对位置信息能够反映各目标虚拟脸部关键点在虚拟脸部上的位置分布情况,因而,终端可以基于目标虚拟脸部关键点之间的相对位置信息,得到目标虚拟脸部关键点在虚拟脸部的关键点分布特征。进而,终端可基于目标虚拟脸部关键点在虚拟脸部的关键点分布特征,对目标虚拟脸部关键点进行参数预估处理,得到目标虚拟脸部关键点对应的目标控制参数。
步骤210,基于目标控制参数控制脸部网格中的关联顶点移动,以在虚拟脸部中产生第二表情。
具体地,目标控制参数与脸部网格中的关联顶点具有绑定关系。终端可基于目标控制参数,控制在第一表情下的脸部网格中的关联顶点产生移动,从而控制虚拟脸部从第一表情变换为第二表情。可以理解,虚拟脸部的脸部关键点与关联顶点具有对应关系,关联顶点移动可以控制虚拟脸部的脸部关键点也跟随发生移动。而目标虚拟脸部关键点在虚拟脸部的关键点分布特征是已知的,基于目标虚拟脸部关键点对应的目标控制参数,便可控制在第一表情下的脸部网格中的关联顶点产生移动,进而控制虚拟脸部的脸部关键点也跟随发生移动,直至虚拟脸部的脸部关键点的分布满足与目标虚拟脸部关键点对应的关键点分布特征时,虚拟脸部从第一表情变换为第二表情。
在一个实施例中,在第二表情下,目标虚拟脸部关键点在虚拟脸部的分布情况可参考图8。可以理解,目标虚拟脸部关键点,是用于定位虚拟脸部上关键区域位置的关键点信息,在图8中,用空心圆圈表示目标虚拟脸部关键点,可以更清楚地表征目标虚拟脸部关键点在虚拟脸部的分布情况。
上述表情生成方法中,获取基于真实脸部得到的表情位置差异信息,其中,表情位置差异信息,可用于表征真实脸部的脸部关键点在第一表情和第二表情下的位置差异。获取在第一表情下的虚拟脸部的脸部关键点,得到初始虚拟脸部关键点。基于表情位置差异信息和初始虚拟脸部关键点,可以得到虚拟脸部在第二表情下的目标虚拟脸部关键点,基于目标虚拟脸部关键点可以提取关键点分布特征,基于关键点分布特征,可以快速确定目标控制参数,其中,目标控制参数,是用于对虚拟脸部的脸部网格中的、且与第二表情相关的关联顶点进行控制的参数。进而,基于目标控制参数控制脸部网格中的关联顶点移动,可以在虚拟脸部中产生第二表情。相较于传统的表情生成方法,本申请的表情生成方法,只需获取虚拟脸部在第一表情下的脸部关键点,结合真实脸部的脸部关键点在不同表情下的表情位置差异,就能便捷地控制在虚拟脸部中产生相应变化的表情,提升了表情生成效率。
此外,传统的表情生成方法,只能生成动画师所制作的元表情所能够合成的表情。而本申请的表情生成方法,通过虚拟脸部的关键点对应的目标控制参数,直接控制虚拟脸部生成相应的表情,从而可以生成任意的表情,提升了表情制作的灵活性。
在一个实施例中,获取基于真实脸部得到的表情位置差异信息,包括:获取真实脸部在第一表情下的第一表情图像;获取真实脸部在第二表情下的第二表情图像;从第一表情图像中,定位真实脸部的脸部关键点的第一位置信息;从第二表情图像中,定位真实脸部的脸部关键点的第二位置信息;确定第一位置信息和第二位置信息之间的差异,得到表情位置差异信息。
其中,第一表情图像,是指真实脸部在第一表情下的图像。第二表情图像,是指真实脸部在第二表情下的图像。第一位置信息,是真实脸部的脸部关键点在第一表情下的位置信息。第二位置信息,是真实脸部的脸部关键点在第二表情下的位置信息。
具体地,终端可获取真实脸部在第一表情下的第一表情图像,以及,获取真实脸部 在第二表情下的第二表情图像。终端可对第一表情图像进行关键点检测处理,以从第一表情图像中,确定真实脸部的脸部关键点的第一位置信息。终端可对第二表情图像进行关键点检测处理,以从第二表情图像中,确定真实脸部的脸部关键点的第二位置信息。终端可确定第一位置信息和第二位置信息之间的差异,并基于第一位置信息和第二位置信息之间的差异,确定表情位置差异信息。
在一个实施例中,终端中包括图像采集单元,终端可通过图像采集单元采集真实脸部在第一表情下的第一表情图像,以及真实脸部在第二表情下的第二表情图像。
在一个实施例中,服务器中存储有真实脸部在第一表情下的第一表情图像,以及真实脸部在第二表情下的第二表情图像。服务器可将真实脸部在第一表情下的第一表情图像,以及真实脸部在第二表情下的第二表情图像发送至终端。终端可接收服务器所发送的真实脸部在第一表情下的第一表情图像,以及真实脸部在第二表情下的第二表情图像。
在一个实施例中,如图4所示,表情生成应用的主界面还包括图片路径、图片关键点路径、以及图片列表。其中,图片路径可用于存储第一表情图像和第二表情图像,图片关键点路径可用于存储第一位置信息和第二位置信息,图片列表可用于展示第一表情图像和第二表情图像的图像信息,比如,图像序号和图像名称等。
在一个实施例中,如图4所示,表情生成应用的主界面还包括“标注图片”按钮404,通过触发“标注图片”按钮404,终端可直接从第一表情图像中,定位真实脸部的脸部关键点的第一位置信息,以及从第二表情图像中,定位真实脸部的脸部关键点的第二位置信息。
在一个实施例中,第一表情图像和第二表情图像具体可以是RGB(Red,Green,Blue,红,绿,蓝)图像。
上述实施例中,通过对真实脸部在第一表情下的第一表情图像,以及真实脸部在第二表情下的第二表情图像,进行关键点检测,可以直接准确地获得真实脸部的脸部关键点的第一位置信息,以及真实脸部的脸部关键点的第二位置信息,进而再基于第一位置信息和第二位置信息之间的差异,得到表情位置差异信息,提升了表情位置差异信息的获取效率和准确率,从而进一步提升了表情生成效率。
在一个实施例中,第一位置信息和第二位置信息是通过已训练的关键点检测模型检测得到;得到已训练的关键点检测模型的步骤,包括:获取样本数据;样本数据包括样本表情图像和针对样本表情图像标注的脸部关键点的参考位置信息;将样本表情图像输入至待训练的关键点检测模型,得到脸部关键点的预测位置信息;基于脸部关键点的预测位置信息与脸部关键点的参考位置信息之间的误差,确定第一损失值;朝着使第一损失值减小的方向,对待训练的关键点检测模型进行迭代训练,直至满足迭代停止条件时,得到已训练的关键点检测模型。
其中,样本数据,是用于训练关键点检测模型的数据。样本表情图像,是用于训练关键点检测模型的图像。参考位置信息,是针对样本表情图像标注的脸部关键点的位置信息。预测位置信息,是通过待训练的关键点检测模型对样本表情图像进行预测得到的脸部关键点的位置信息。第一损失值,是基于脸部关键点的预测位置信息与脸部关键点的参考位置信息之间的误差所确定的损失值。
在一个实施例中,终端中运行有已训练的关键点检测模型。终端可获取真实脸部在第一表情下的第一表情图像,以及,获取真实脸部在第二表情下的第二表情图像。如图9所示,终端可将第一表情图像输入至已训练的关键点检测模型,并通过已训练的关键点检测模型对第一表情图像进行关键点检测处理,以从第一表情图像中,检测出真实脸部的脸部关键点的第一位置信息。以及,终端可将第二表情图像输入至已训练的关键点检测模型,并通过已训练的关键点检测模型对第二表情图像进行关键点检测处理,以从第二表情图像中,检测出真实脸部的脸部关键点的第二位置信息。进而,终端可确定第一位置信息和第 二位置信息之间的差异,并基于第一位置信息和第二位置信息之间的差异,确定表情位置差异信息。
具体地,终端中运行有待训练的关键点检测模型。终端可获取样本数据,并将样本数据中的样本表情图像,输入至待训练的关键点检测模型,通过待训练的关键点检测模型对样本表情图像进行关键点检测处理,得到脸部关键点的预测位置信息。终端可确定脸部关键点的预测位置信息与针对样本表情图像标注的脸部关键点的参考位置信息之间的误差,并基于脸部关键点的预测位置信息与针对样本表情图像标注的脸部关键点的参考位置信息之间的误差,确定第一损失值。终端可朝着使第一损失值减小的方向,对待训练的关键点检测模型进行迭代训练,直至满足迭代停止条件时,得到已训练的关键点检测模型。
在一个实施例中,迭代停止条件可以是第一损失值达到预先设置的损失值阈值,也可以是迭代轮数达到预先设置的轮数阈值。
在一个实施例中,可采用均方误差函数作为训练的关键点检测模型的损失函数,其损失函数可采用以下公式表示:
L_loss = MeanSquaredError(K_predict, K_groundtruth)
其中,L loss表示第一损失值,MeanSquaredError()表示均方误差函数,K predict表示脸部关键点的预测位置信息,K groundtruth表示脸部关键点的参考位置信息。
上述实施例中,通过样本表情图像,以及针对样本表情图像标注的脸部关键点的参考位置信息,对待训练的关键点检测模型进行迭代训练,可以提升关键点检测模型的关键点检测准确率。
在一个实施例中,上述表情生成方法还包括:获取初始虚拟脸部关键点的第三位置信息;对第一位置信息、第二位置信息和第三位置信息分别进行归一化处理,得到归一化后的第一位置信息、归一化后的第二位置信息以及归一化后的第三位置信息;确定第一位置信息和第二位置信息之间的差异,得到表情位置差异信息;包括:基于归一化后的第一位置信息与归一化后的第二位置信息之间的差异,得到表情位置差异信息;基于表情位置差异信息和初始虚拟脸部关键点,得到虚拟脸部在第二表情下的目标虚拟脸部关键点,包括:将归一化后的第三位置信息按照表情位置差异信息进行调整,得到虚拟脸部在第二表情下的目标虚拟脸部关键点。
其中,第三位置信息,是初始虚拟脸部关键点在第一表情下的位置信息。
具体地,用户可对在第一表情下的虚拟脸部进行关键点标注,终端可基于标注的初始虚拟脸部关键点,直接获取初始虚拟脸部关键点的第三位置信息。终端可对第一位置信息、第二位置信息和第三位置信息分别进行归一化处理,得到归一化后的第一位置信息、归一化后的第二位置信息以及归一化后的第三位置信息。终端可确定归一化后的第一位置信息与归一化后的第二位置信息之间的差异,并基于归一化后的第一位置信息与归一化后的第二位置信息之间的差异,确定真实脸部的脸部关键点在第一表情和第二表情下的表情位置差异信息。终端可将归一化后的第三位置信息按照表情位置差异信息进行调整,得到虚拟脸部在第二表情下的目标虚拟脸部关键点。
在一个实施例中,终端在对第一位置信息、第二位置信息和第三位置信息中的每一个进行归一化处理时,可将被进行归一化处理的位置信息作为当前位置信息。针对每个当前位置信息,终端可确定满足位置稳定条件的基准脸部关键点。进而,终端可根据基准脸部关键点确定归一化标准信息,并基于归一化标准信息,对第一位置信息、第二位置信息和第三位置信息中的每一个进行归一化处理。其中,位置稳定条件,是使得脸部关键点的位置保持相对稳定的条件,基准脸部关键点,是归一化过程中作为基准的脸部关键点。归 一化标准信息,是归一化过程中作为标准的信息。
在一个实施例中,位置稳定条件具体可以是从第一表情转换至第二表情时,脸部关键点在相应脸部上的移动距离小于预设移动距离。位置稳定条件具体还可以是从第一表情转换至第二表情时,脸部关键点在相应脸部上不会发生移动。比如,从第一表情转换至第二表情时,脸部中四个眼角所对应的脸部关键点在相应脸部上基本不会发生移动。再比如,从第一表情转换至第二表情时,脸部中两边太阳穴所对应的脸部关键点在相应脸部上也基本不会发生移动。
上述实施例中,通过对第一位置信息、第二位置信息和第三位置信息分别进行归一化处理,可以缩小真实脸部和虚拟脸部之间的脸型差异。进而,基于归一化后的第一位置信息与归一化后的第二位置信息之间的差异,可以得到更准确的表情位置差异信息。将归一化后的第三位置信息按照表情位置差异信息进行调整,可以得到虚拟脸部在第二表情下的更准确的目标虚拟脸部关键点,从而使得在虚拟脸部产生的第二表情更加贴合相应的真实脸部的第二表情。
在一个实施例中,对第一位置信息、第二位置信息和第三位置信息分别进行归一化处理,包括:在对第一位置信息、第二位置信息和第三位置信息中的每一个进行归一化处理时,将被进行归一化处理的位置信息作为当前位置信息;针对每个当前位置信息,确定满足位置稳定条件的基准脸部关键点;基于基准脸部关键点确定当前位置信息对应的参照点,并确定基准脸部关键点与参照点之间的相对距离;基于任意两个基准脸部关键点之间的距离,确定缩放比例;根据相对距离和缩放比例,对当前位置信息进行归一化处理。
其中,当前位置信息对应的参照点,是对当前位置信息进行归一化处理时作为参照的点。相对距离,是基准脸部关键点与参照点之间的距离。缩放比例,是对当前位置信息进行归一化处理时作为参照的比例。
具体地,在对第一位置信息、第二位置信息和第三位置信息中的每一个进行归一化处理时,终端可将被进行归一化处理的位置信息作为当前位置信息。针对每个当前位置信息,终端可从相应脸部的脸部关键点中,筛选出满足位置稳定条件的基准脸部关键点。终端可基于基准脸部关键点确定当前位置信息对应的参照点,并可将基准脸部关键点与参照点之间的距离,作为相对距离。终端可确定任意两个基准脸部关键点之间的距离,并基于该任意两个基准脸部关键点之间的距离,确定缩放比例。进而,终端可根据相对距离和缩放比例,对当前位置信息进行归一化处理。
在一个实施例中,终端基于基准脸部关键点确定当前位置信息对应的参照点,具体可以是,终端可基于筛选出的全部基准脸部关键点,确定当前位置信息对应的参照点,也可以是,终端可基于选出的全部基准脸部关键点中的其中部分基准脸部关键点,确定当前位置信息对应的参照点。
举例说明,相应脸部中四个眼角所对应的脸部关键点,属于选出的全部基准脸部关键点中的其中四个基准脸部关键点。终端可将四个基准脸部关键点的中心点,确定为当前位置信息对应的参照点。终端可将相应脸部中两边太阳穴所分别对应的脸部关键点之间的距离,确定缩放比例。
在一个实施例中,终端可将基准脸部关键点的坐标减去参照点的坐标,得到相减后的坐标。终端可将相减后的坐标与缩放比例的比值,作为归一化后的当前位置信息,即脸部关键点归一化后的坐标。
在一个实施例中,当前位置信息为第一位置信息或第二位置信息时,脸部关键点归一化后的坐标可通过以下公式计算得到:
NK_ri = (K_ri - O_ri) / L_ri
其中,NK ri表示真实脸部在相应表情下的脸部关键点归一化后的坐标,K ri表示真实脸部在相应表情下的基准脸部关键点的坐标,O ri表示真实脸部在相应表情下的参照点的坐标,L ri为真实脸部在相应表情下的缩放比例。可以理解,在相应表情,是指第一表情或第二表情。
在一个实施例中,当前位置信息为第三位置信息时,脸部关键点归一化后的坐标可通过以下公式计算得到:
NK_g,neutral = (K_g,neutral - O_g,neutral) / L_g,neutral
其中,NK g,neutral表示虚拟脸部在第一表情下的脸部关键点归一化后的坐标,K g,neutral表示虚拟脸部在第一表情下的基准脸部关键点的坐标,O g,neutral表示虚拟脸部在第一表情下的参照点的坐标,L g,neutral为虚拟脸部在第一表情下的缩放比例。
上述实施例中,基于不会发生大形变的基准脸部关键点,可以确定当前位置信息对应的参照点,并确定基准脸部关键点与参照点之间的相对距离,基于任意两个基准脸部关键点之间的距离,可以确定缩放比例,根据相对距离和缩放比例,对当前位置信息进行归一化处理,这样,可以进一步缩小真实脸部和虚拟脸部之间的脸型差异,从而进一步使得在虚拟脸部产生的第二表情更加贴合相应的真实脸部的第二表情。
在一个实施例中,将归一化后的第三位置信息按照表情位置差异信息进行调整,得到虚拟脸部在第二表情下的目标虚拟脸部关键点,包括:将归一化处理后的第三位置信息按照表情位置差异信息进行调整,得到虚拟脸部在第二表情下的脸部关键点的中间态位置信息;基于第三位置信息对应的相对距离和缩放比例,对中间态位置信息进行反归一化处理,得到虚拟脸部在第二表情下的目标虚拟脸部关键点。
其中,中间态位置信息,是虚拟脸部在第二表情下的脸部关键点的、属于中间状态的位置信息,可以理解,中间态位置信息,是指在反归一化处理过程中需要被进一步处理的位置信息,相当于一种中间结果,其只存在于反归一化处理过程中,而不作为反归一化处理的结果输出。
具体地,将归一化处理后的第三位置信息按照表情位置差异信息进行调整,得到虚拟脸部在第二表情下的脸部关键点的中间态位置信息;基于第三位置信息对应的相对距离和缩放比例,对中间态位置信息进行反归一化处理,得到虚拟脸部在第二表情下的目标虚拟脸部关键点。
在一个实施例中,中间态位置信息包括中间态坐标。终端可根据真实脸部在第二表情下的脸部关键点归一化后的坐标,与真实脸部在第一表情下的脸部关键点归一化后的坐标之差,确定表情位置差异信息。终端可将虚拟脸部在第一表情下的脸部关键点归一化后的坐标,按照表情位置差异信息进行调整,得到中间态坐标。终端可先将中间态坐标乘以虚拟脸部在第一表情下的缩放比例,进而再与虚拟脸部在第一表情下的参照点的坐标相加(即,对中间态位置信息进行反归一化处理),得到虚拟脸部在第二表情下的目标虚拟脸部关键点的坐标。
在一个实施例中,虚拟脸部在第二表情下的目标虚拟脸部关键点的坐标,可以通过 以下公式计算得到:
K_i,r2g = ((NK_ri - NK_r,neutral) + NK_g,neutral) * L_g,neutral + O_g,neutral
其中,K i,r2g表示虚拟脸部在第二表情下的目标虚拟脸部关键点的坐标,NK r,neutral表示真实脸部在第一表情下的脸部关键点归一化后的坐标。
上述实施例中,基于第三位置信息对应的相对距离和缩放比例,可以对中间态位置信息进行反归一化处理,减小归一化处理所带来的位置误差,从而可以得到更加准确的虚拟脸部在第二表情下的目标虚拟脸部关键点。
在一个实施例中,基于目标虚拟脸部关键点提取关键点分布特征,并基于关键点分布特征,确定目标控制参数,包括:将目标虚拟脸部关键点输入至已训练的参数预估模型,以通过已训练的参数预估模型对目标虚拟脸部关键点进行分布特征提取,得到关键点分布特征,并基于关键点分布特征进行参数预估,输出目标控制参数。
具体地,终端中运行有已训练的参数预估模型,如图10所示,终端可将目标虚拟脸部关键点输入至已训练的参数预估模型,并通过已训练的参数预估模型对目标虚拟脸部关键点进行分布特征提取,得到关键点分布特征。进而,终端可通过已训练的参数预估模型,基于关键点分布特征对目标虚拟脸部关键点进行参数预估,输出目标控制参数。
在一个实施例中,目标控制参数可通过以下公式获取得到:
P_i = K2P(K_i,r2g)
其中,P i代表目标控制参数,K2P()代表已训练的参数预估模型。
上述实施例中,通过预测效果较好的已训练的参数预估模型,对目标虚拟脸部关键点进行分布特征提取,得到关键点分布特征,并基于关键点分布特征进行参数预估,直接输出目标控制参数,这样,可以提升目标控制参数获取的效率,以及提升目标控制参数的准确性。
在一个实施例中,得到已训练的参数预估模型的步骤,包括:获取用于产生样本表情的参考控制参数;获取参考控制参数对应的样本脸部关键点;将样本脸部关键点输入至待训练的参数预估模型,得到预测控制参数;基于预测控制参数与参考控制参数之间的误差,确定第二损失值;朝着使第二损失值减小的方向,对待训练的参数预估模型进行迭代训练,直至满足迭代停止条件时,得到已训练的参数预估模型。
其中,参考控制参数,是用于训练参数预估模型的控制参数。样本脸部关键点,是用于训练参数预估模型的脸部关键点。预测控制参数,是待训练的参数预估模型对样本脸部关键点进行参数预估所得到的控制参数。
具体地,终端中运行有表情生成应用,终端可通过表情生成应用随机生成用于产生样本表情的参考控制参数。终端获取到参考控制参数之后,可获取与参考控制参数对应的样本脸部关键点。终端可将样本脸部关键点输入至待训练的参数预估模型,并通过待训练的参数预估模型对样本脸部关键点进行参数预估,得到预测控制参数。终端可确定预测控制参数与参考控制参数之间的误差,并基于预测控制参数与参考控制参数之间的误差,确定第二损失值。终端可朝着使第二损失值减小的方向,对待训练的参数预估模型进行迭代训练,直至满足迭代停止条件时,得到已训练的参数预估模型。
在一个实施例中,如图4所示,表情生成应用的主界面还包括“开始训练”按钮401,通过触发该“开始训练”按钮401,终端可获取用于产生样本表情的参考控制参数,以及获取与参考控制参数对应的样本脸部关键点,并通过参考控制参数和参考控制参数对待训练的参数预估模型进行训练。
在一个实施例中,终端获取与参考控制参数对应的样本脸部关键点,可以理解,在参考控制参数的控制下,虚拟脸部可发生形变产生相应的样本表情,终端可直接获取在该样本表情下的虚拟脸部的脸部关键点的位置信息。
在一个实施例中,可采用均方误差函数作为训练的参数预估模型的损失函数,其损失函数可采用以下公式表示:
Loss = MeanSquaredError(P_predict, P_groundtruth)
其中,Loss表示第二损失值,P predict表示预测控制参数,P groundtruth表示参考控制参数。
上述实施例中,通过参考控制参数,以及与参考控制参数对应的样本脸部关键点,对待训练的参数预估模型进行迭代训练,可以提升参数预估模型的参数预估准确率。
在一个实施例中,样本脸部关键点位于脸部网格中的目标网格区域内;样本脸部关键点与构成目标网格区域的多个目标顶点相对应;获取参考控制参数对应的样本脸部关键点,包括:针对每个样本脸部关键点,确定样本脸部关键点所对应的各个目标顶点,确定各个目标顶点在样本表情下的空间坐标;空间坐标是在世界坐标系下的坐标;确定样本脸部关键点在目标网格区域中的区域坐标;区域坐标,是在基于目标网格区域建立的区域坐标系下的坐标;基于各个目标顶点的空间坐标,对样本脸部关键点的区域坐标进行坐标转换,得到样本脸部关键点的空间坐标,以获得与参考控制参数对应的样本脸部关键点。
其中,目标网格区域,是样本脸部关键点所位于的脸部网格中的网格区域。目标顶点是构成目标网格区域的脸部网格的网格顶点。
具体地,针对每个样本脸部关键点,终端可将构成相应目标网格区域的网格顶点,确定为样本脸部关键点所对应的各个目标顶点。由于脸部网格的网格顶点的空间坐标是已知的,因此,终端可直接获取各个目标顶点在样本表情下的空间坐标。终端可基于目标顶点,直接确定样本脸部关键点在目标网格区域中的区域坐标。终端可基于各个目标顶点的空间坐标,对样本脸部关键点的区域坐标进行坐标转换,以将样本脸部关键点在区域坐标系下的区域坐标,转换至在世界坐标系下的空间坐标,得到样本脸部关键点的空间坐标,以获得与参考控制参数对应的样本脸部关键点。
上述实施例中,通过确定各样本脸部关键点所对应的各个目标顶点,进而可以准确地确定各个目标顶点在样本表情下的空间坐标。通过确定样本脸部关键点在目标网格区域中的区域坐标,进而可以基于各个目标顶点的空间坐标,对样本脸部关键点的区域坐标进行坐标转换,得到准确的样本脸部关键点的空间坐标,以获得与参考控制参数对应的样本脸部关键点,这样可提升样本脸部关键点的准确率。
在一个实施例中,区域坐标系,是以区域坐标原点建立的坐标系;区域坐标原点,是构成目标网格区域的多个目标顶点中的任意一个;基于各个目标顶点的空间坐标,对样本脸部关键点的区域坐标进行坐标转换,得到样本脸部关键点的空间坐标,包括:基于各个目标顶点的空间坐标和样本脸部关键点的区域坐标,确定样本脸部关键点在世界坐标系下相对于区域坐标原点的相对位置;获取区域坐标原点在世界坐标系下的空间坐标;基于区域坐标原点的空间坐标和相对位置,确定样本脸部关键点的空间坐标。
具体地,终端可基于各个目标顶点的空间坐标和样本脸部关键点的区域坐标,确定样本脸部关键点在世界坐标系下相对于区域坐标原点的相对位置。由于脸部网格的网格顶点的空间坐标是已知的,区域坐标原点又是构成目标网格区域的多个目标顶点中的任意一个,因此,终端可直接获取区域坐标原点在世界坐标系下的空间坐标。进而,终端可基于区域坐标原点的空间坐标和相对位置,确定样本脸部关键点的空间坐标。
上述实施例中,基于各个目标顶点的空间坐标和样本脸部关键点的区域坐标,可以 准确地确定样本脸部关键点在世界坐标系下相对于区域坐标原点的相对位置。由于区域坐标原点在世界坐标系下的位置是已知的,因此可直接获取区域坐标原点在世界坐标系下的空间坐标。进而,基于区域坐标原点的空间坐标和相对位置,可以准确地确定样本脸部关键点的空间坐标,进一步提升样本脸部关键点的空间坐标的准确性。
在一个实施例中,多个目标顶点为三个目标顶点;基于各个目标顶点的空间坐标和样本脸部关键点的区域坐标,确定样本脸部关键点在世界坐标系下相对于区域坐标原点的相对位置,包括:根据各个目标顶点的空间坐标,确定第一空间向量和第二空间向量;第一空间向量和第二空间向量,是从区域坐标原点分别指向除区域坐标原点外的其他两个目标顶点的向量;根据样本脸部关键点的区域坐标,确定第一区域向量和第二区域向量;第一区域向量与第一空间向量的向量方向一致;第二区域向量与第二空间向量的向量方向一致;基于第一空间向量和第一区域向量,确定第一转换比例,基于第二空间向量和第二区域向量,确定第二转换比例;将第一区域向量按照第一转换比例进行转换,得到第一中间向量,将第二区域向量按照第二转换比例进行转换,得到第二中间向量;基于第一中间向量和第二中间向量,确定样本脸部关键点在世界坐标系下相对于区域坐标原点的相对位置。
在一个实施例中,终端基于第一空间向量和第一区域向量,确定第一转换比例,基于第二空间向量和第二区域向量,确定第二转换比例,包括:终端可将第一区域向量与第一空间向量的比值,直接确定为第一转换比例,将第二区域向量与第二空间向量的比值,直接确定为第二转换比例。
在一个实施例中,终端基于第一中间向量和第二中间向量,确定样本脸部关键点在世界坐标系下相对于区域坐标原点的相对位置,包括:终端可将第一中间向量和第二中间向量进行向量相加,并基于相加后得到的向量的坐标,直接确定样本脸部关键点在世界坐标系下相对于区域坐标原点的相对位置。
上述实施例中,根据各个目标顶点的空间坐标,可以直接确定第一空间向量和第二空间向量,根据样本脸部关键点的区域坐标,可以直接确定第一区域向量和第二区域向量。基于第一空间向量和第一区域向量,可以确定第一转换比例,基于第二空间向量和第二区域向量,可以确定第二转换比例。将第一区域向量按照第一转换比例进行转换,可以得到第一中间向量,将第二区域向量按照第二转换比例进行转换,可以得到第二中间向量。进而,基于第一中间向量和第二中间向量,可以准确地确定样本脸部关键点在世界坐标系下相对于区域坐标原点的相对位置,从而进一步提升样本脸部关键点在世界坐标系下相对于区域坐标原点的相对位置的准确性。
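按照上述向量换算的一种常见理解，区域坐标可视为样本脸部关键点沿两个空间向量方向的比例系数，下面的NumPy片段据此将区域坐标转换为世界坐标系下的空间坐标；函数名与该几何解释均为示意性假设，并非本申请限定的唯一实现。

import numpy as np

def region_to_world(region_uv, v_origin, v_a, v_b):
    """将样本脸部关键点在三角面片中的区域坐标转换为世界坐标系下的空间坐标。

    region_uv: 样本脸部关键点的区域坐标 (u, v)，例如 (0.4207, 0.2293)
    v_origin : 作为区域坐标原点的目标顶点的空间坐标
    v_a, v_b : 其余两个目标顶点的空间坐标
    """
    u, v = region_uv
    origin = np.asarray(v_origin, dtype=float)
    # 第一、第二空间向量：从区域坐标原点分别指向另外两个目标顶点
    vec_a = np.asarray(v_a, dtype=float) - origin
    vec_b = np.asarray(v_b, dtype=float) - origin
    # 第一、第二中间向量：区域坐标按比例换算到世界坐标系方向上
    mid_a = u * vec_a
    mid_b = v * vec_b
    # 相对位置为两个中间向量之和；空间坐标 = 区域坐标原点的空间坐标 + 相对位置
    return origin + mid_a + mid_b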
在一个实施例中,确定样本脸部关键点在目标网格区域中的区域坐标,包括:获取关键点索引文件;关键点索引文件记录有各样本脸部关键点所分别对应的各个目标顶点的顶点标识,以及与顶点标识具有对应关系的相应样本脸部关键点在目标网格区域中的区域坐标;通过样本脸部关键点所对应的各个目标顶点的顶点标识,从关键点索引文件中查找与顶点标识具有对应关系的区域坐标,得到样本脸部关键点在目标网格区域中的区域坐标。
其中，关键点索引文件，是用于查找样本脸部关键点在目标网格区域中的区域坐标的文件。顶点标识，是用于唯一标识顶点的字符串。
具体地,用户可在第一表情下的虚拟脸部中标注脸部关键点,并将第一表情下的各样本脸部关键点所分别对应的各个目标顶点的顶点标识,记录在关键点索引文件中,将对应样本脸部关键点在目标网格区域中的区域坐标绑定记录在关键点索引文件中。终端可获取用户所记录的关键点索引文件。进而,终端可通过样本脸部关键点所对应的各个目标顶点的顶点标识,从关键点索引文件中查找与顶点标识具有对应关系的区域坐标,得到样本脸部关键点在目标网格区域中的区域坐标。
在一个实施例中，目标顶点的数量为三个，目标网格区域为三角面片，则关键点索引文件可以表示为如下表1：
表1
网格名称        三个目标顶点的顶点标识        区域坐标
名称1           1565、2286、2246              (0.4207, 0.2293)
从表中可知,比如,网格名称为“名称1”的网格,其对应的样本脸部关键点所对应的三个目标顶点的顶点标识,分别为1565、2286和2246,且该样本脸部关键点在三角面片中的区域坐标为(0.4207,0.2293)。
上述实施例中,通过样本脸部关键点所对应的各个目标顶点的顶点标识,直接从关键点索引文件中查找与顶点标识具有对应关系的区域坐标,得到样本脸部关键点在目标网格区域中的区域坐标,可以提升样本脸部关键点在目标网格区域中的区域坐标的获取效率,从而进一步提升了表情生成效率。
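关键点索引文件的具体存储格式本申请并未限定，下面以JSON文件为例给出一个示意性的读取与查找片段，其中的字段名（mesh、vertex_ids、region_uv）均为假设。

import json

def load_keypoint_index(path):
    """读取关键点索引文件（此处假设为JSON格式，每条记录形如：
    {"mesh": "名称1", "vertex_ids": [1565, 2286, 2246], "region_uv": [0.4207, 0.2293]}）。"""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

def lookup_region_uv(index_records, vertex_ids):
    """通过样本脸部关键点所对应的各个目标顶点的顶点标识，查找其区域坐标。"""
    key = tuple(sorted(vertex_ids))
    for record in index_records:
        if tuple(sorted(record["vertex_ids"])) == key:
            return tuple(record["region_uv"])
    raise KeyError(f"索引文件中未找到顶点标识 {vertex_ids} 对应的区域坐标")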
在一个实施例中,基于目标控制参数控制脸部网格中的关联顶点移动,以在虚拟脸部中产生第二表情,包括:将目标控制参数作为目标控制器的控制参数;目标控制器与虚拟脸部的脸部网格的各顶点具有绑定关系;通过目标控制器,基于目标控制参数控制脸部网格中的关联顶点移动,以在虚拟脸部中产生第二表情。
具体地,终端中可运行有目标控制器,终端可将目标控制参数作为目标控制器的控制参数。进而,终端可通过目标控制器,基于目标控制参数,控制在第一表情下的脸部网格中的关联顶点产生移动,从而控制虚拟脸部从第一表情变换为第二表情。
在一个实施例中,如图4所示,表情生成应用的主界面还包括“文件”路径,其用于存储控制器的信息。用户可从“文件”中存储的多个控制器中选择目标控制器,并将目标控制器与虚拟脸部的脸部网格的各顶点进行绑定。
在一个实施例中,如图4所示,表情生成应用的主界面还包括“自动帧”按钮402,通过触发该“自动帧”按钮402,终端可自动将目标控制参数作为目标控制器的控制参数,并通过目标控制器,基于目标控制参数,控制在第一表情下的脸部网格中的关联顶点产生移动,从而控制虚拟脸部从第一表情变换为第二表情。
在一个实施例中,如图11所示,目标控制器与虚拟脸部的脸部网格的各网格顶点具有绑定关系。可以理解,通过目标控制器可以控制脸部网格的各网格顶点的位置发生改变,从而控制脸部网格发生形变,以产生相应的表情。
在一个实施例中,终端可将目标控制参数作为目标控制器的控制参数。为便于理解控制器,现结合图12对控制器进行示意说明。如图12所示,目标控制器可以包括多项控制参数,比如转变参数、旋转参数、等级参数、剪切参数、旋转顺序参数和旋转轴参数等,目标控制参数可以是目标控制器中的转变参数。其中,转变参数,是用于控制关联顶点的位置进行移动转变的参数。可以理解,转变参数可以控制关联顶点移动至相应位置,从而使虚拟脸部的脸部关键点满足相应的关键点分布特征。进而,终端可通过目标控制器,基于目标控制参数,控制在第一表情下的脸部网格中的关联顶点产生移动,从而控制虚拟脸部从第一表情变换为第二表情。
在一个实施例中,参考图13,终端可通过目标控制器,基于真实脸部的第二表情1301,在虚拟脸部中产生第二表情1302。可以理解,比如,第二表情为“眨眼”,则终端可通过目标控制器,基于真实脸部的“眨眼”的表情,在虚拟脸部中产生“眨眼”这个表情。
上述实施例中,将目标控制参数直接作为目标控制器的控制参数,进而通过目标控制器,基于目标控制参数可以直接控制脸部网格中的关联顶点移动,以在虚拟脸部中产生第二表情,进一步提升了表情生成效率。
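实际项目中，目标控制器通常由DCC工具中已与脸部网格顶点绑定的控制器提供；下面仅以一个极简的Python类示意“将目标控制参数作为转变参数、按绑定关系移动关联顶点”的过程，其中的绑定权重及接口均为假设，并非本申请限定的控制器实现。

import numpy as np

class SimpleController:
    """示意性的目标控制器：与脸部网格顶点绑定，按转变参数移动关联顶点。"""

    def __init__(self, binding_weights):
        # binding_weights 形状为 (顶点数,)，表示控制器对各关联顶点的影响权重（假设）
        self.binding_weights = np.asarray(binding_weights, dtype=float)

    def apply(self, rest_vertices, translate):
        """rest_vertices: 第一表情下的脸部网格顶点坐标，形状 (顶点数, 3)；
        translate: 目标控制参数中的转变参数，形状 (3,)。返回移动后的顶点坐标。"""
        offsets = self.binding_weights[:, None] * np.asarray(translate, dtype=float)[None, :]
        # 关联顶点按绑定权重叠加位移，从而使脸部网格发生形变、产生相应表情
        return np.asarray(rest_vertices, dtype=float) + offsets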
在一个实施例中,如图14所示,终端可获取真实脸部在第一表情下的第一表情图像,以及获取真实脸部在第二表情下的第二表情图像(即,真实脸部的表情图像1401),通过已训练的关键点检测模型从第一表情图像中,定位真实脸部的脸部关键点的第一位置信息,以及从第二表情图像中,定位真实脸部的脸部关键点的第二位置信息(即,脸部关键点检测,得到真实脸部的脸部关键点),确定第一位置信息和第二位置信息之间的差异,得到表情位置差异信息。终端可获取在第一表情下的虚拟脸部的脸部关键点,得到初始虚拟脸部关键点。基于表情位置差异信息和初始虚拟脸部关键点,得到虚拟脸部在第二表情下的目标虚拟脸部关键点(即,脸部关键点转换,得到在第二表情下虚拟脸部的脸部关键点)。终端可将目标虚拟脸部关键点输入至已训练的参数预估模型进行参数预估,得到目标控制参数。终端可将目标控制参数作为已绑定的目标控制器的控制参数,通过目标控制器,基于目标控制参数控制在虚拟脸部产生第二表情(即,虚拟脸部的表情图像1402)。
如图15所示,在一个实施例中,提供了一种表情生成方法,该方法具体包括以下步骤:
步骤1502,获取样本数据;样本数据包括样本表情图像和针对样本表情图像标注的脸部关键点的参考位置信息。
步骤1504,将样本表情图像输入至待训练的关键点检测模型,得到脸部关键点的预测位置信息。
步骤1506,基于脸部关键点的预测位置信息与脸部关键点的参考位置信息之间的误差,确定第一损失值。
步骤1508,朝着使第一损失值减小的方向,对待训练的关键点检测模型进行迭代训练,直至满足迭代停止条件时,得到已训练的关键点检测模型。
步骤1510,获取用于产生样本表情的参考控制参数;参考控制参数对应的样本脸部关键点位于虚拟脸部的脸部网格中的目标网格区域内;样本脸部关键点与构成目标网格区域的多个目标顶点相对应。
步骤1512,针对每个样本脸部关键点,确定样本脸部关键点所对应的各个目标顶点,确定各个目标顶点在样本表情下的空间坐标;空间坐标是在世界坐标系下的坐标。
步骤1514,确定样本脸部关键点在目标网格区域中的区域坐标;区域坐标,是在基于目标网格区域建立的区域坐标系下的坐标。
步骤1516,基于各个目标顶点的空间坐标,对样本脸部关键点的区域坐标进行坐标转换,得到样本脸部关键点的空间坐标,以获得与参考控制参数对应的样本脸部关键点。
步骤1518,将样本脸部关键点输入至待训练的参数预估模型,得到预测控制参数;基于预测控制参数与参考控制参数之间的误差,确定第二损失值。
步骤1520,朝着使第二损失值减小的方向,对待训练的参数预估模型进行迭代训练,直至满足迭代停止条件时,得到已训练的参数预估模型。
需要说明的是,本申请对参数预估模型和关键点检测模型的训练先后顺序不做限定。
步骤1522,获取真实脸部在第一表情下的第一表情图像;获取真实脸部在第二表情下的第二表情图像。
步骤1524,通过已训练的关键点检测模型,从第一表情图像中,定位真实脸部的脸部关键点的第一位置信息,以及从第二表情图像中,定位真实脸部的脸部关键点的第二位置信息。
步骤1526,获取在第一表情下的虚拟脸部的脸部关键点,得到初始虚拟脸部关键点;获取初始虚拟脸部关键点的第三位置信息。
步骤1528,对第一位置信息、第二位置信息和第三位置信息分别进行归一化处理,得到归一化后的第一位置信息、归一化后的第二位置信息以及归一化后的第三位置信息。
步骤1530，基于归一化后的第一位置信息与归一化后的第二位置信息之间的差异，得到表情位置差异信息；表情位置差异信息，用于表征真实脸部的脸部关键点在第一表情和第二表情下的位置差异。
步骤1532,将归一化后的第三位置信息按照表情位置差异信息进行调整,得到虚拟脸部在第二表情下的目标虚拟脸部关键点。
步骤1534,将目标虚拟脸部关键点输入至已训练的参数预估模型,以通过已训练的参数预估模型对目标虚拟脸部关键点进行分布特征提取,得到关键点分布特征,基于关键点分布特征进行参数预估,输出目标控制参数;目标控制参数,是用于对虚拟脸部的脸部网格中的、且与第二表情相关的关联顶点进行控制的参数。
步骤1536,将目标控制参数作为目标控制器的控制参数;目标控制器与虚拟脸部的脸部网格的各顶点具有绑定关系。
步骤1538,通过目标控制器,基于目标控制参数控制脸部网格中的关联顶点移动,以在虚拟脸部中产生第二表情。
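结合上述步骤1522～1538，下面给出一个串联推理阶段的示意性函数；其中传入的各回调（关键点检测、归一化、反归一化、参数预估、控制器驱动）均为假设接口，可对应前文各示例片段，并非本申请限定的调用方式。

def generate_second_expression(first_img, second_img, virtual_kps_neutral,
                               detect_keypoints, normalize_real, normalize_virtual,
                               denormalize_virtual, estimate_controls, drive_controller):
    """串联步骤1522～1538的示意性推理流程，所有传入的函数均为假设接口。"""
    # 步骤1522～1524：检测真实脸部在第一、第二表情下的脸部关键点
    kps_real_first = detect_keypoints(first_img)
    kps_real_second = detect_keypoints(second_img)
    # 步骤1526～1530：归一化后求差，得到表情位置差异信息
    diff = normalize_real(kps_real_second) - normalize_real(kps_real_first)
    # 步骤1532：按差异调整虚拟脸部第一表情下的归一化关键点并反归一化，
    # 得到虚拟脸部在第二表情下的目标虚拟脸部关键点
    target_kps = denormalize_virtual(normalize_virtual(virtual_kps_neutral) + diff)
    # 步骤1534：参数预估，得到目标控制参数
    controls = estimate_controls(target_kps)
    # 步骤1536～1538：通过目标控制器控制脸部网格中的关联顶点移动，产生第二表情
    return drive_controller(controls)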
本申请还提供一种应用场景,该应用场景应用上述的表情生成方法。具体地,该表情生成方法可应用于游戏中的三维虚拟角色表情生成场景。终端可获取样本数据;样本数据包括样本人脸表情图像和针对样本人脸表情图像标注的人脸关键点的参考位置信息。将样本人脸表情图像输入至待训练的人脸关键点检测模型,得到人脸关键点的预测位置信息。基于人脸关键点的预测位置信息与人脸关键点的参考位置信息之间的误差,确定第一损失值。朝着使第一损失值减小的方向,对待训练的人脸关键点检测模型进行迭代训练,直至满足迭代停止条件时,得到已训练的人脸关键点检测模型。
终端可获取用于产生样本人脸表情的参考控制参数;参考控制参数对应的样本人脸关键点位于虚拟人脸的人脸网格中的目标网格区域内;样本人脸关键点与构成目标网格区域的多个目标顶点相对应。针对每个样本人脸关键点,确定样本人脸关键点所对应的各个目标顶点,确定各个目标顶点在样本人脸表情下的空间坐标;空间坐标是在世界坐标系下的坐标。确定样本人脸关键点在目标网格区域中的区域坐标;区域坐标,是在基于目标网格区域建立的区域坐标系下的坐标。基于各个目标顶点的空间坐标,对样本人脸关键点的区域坐标进行坐标转换,得到样本人脸关键点的空间坐标,以获得与参考控制参数对应的样本人脸关键点。将样本人脸关键点输入至待训练的参数预估模型,得到预测控制参数;基于预测控制参数与参考控制参数之间的误差,确定第二损失值。朝着使第二损失值减小的方向,对待训练的参数预估模型进行迭代训练,直至满足迭代停止条件时,得到已训练的参数预估模型。
终端可获取真实人脸在第一表情下的第一表情图像;获取真实人脸在第二表情下的第二表情图像。通过已训练的人脸关键点检测模型,从第一表情图像中,定位真实人脸的人脸关键点的第一位置信息,以及从第二表情图像中,定位真实人脸的人脸关键点的第二位置信息。获取在第一表情下的虚拟人脸的人脸关键点,得到初始虚拟人脸关键点;获取初始虚拟人脸关键点的第三位置信息。对第一位置信息、第二位置信息和第三位置信息分别进行归一化处理,得到归一化后的第一位置信息、归一化后的第二位置信息以及归一化后的第三位置信息。基于归一化后的第一位置信息与归一化后的第二位置信息之间的差异,得到表情位置差异信息;表情位置差异信息,用于表征真实人脸的人脸关键点在第一表情和第二表情下的位置差异。将归一化后的第三位置信息按照表情位置差异信息进行调整,得到虚拟人脸在第二表情下的目标虚拟人脸关键点。
终端可将目标虚拟人脸关键点输入至已训练的参数预估模型,以通过已训练的参数预估模型对目标虚拟人脸关键点进行分布特征提取,得到关键点分布特征,基于关键点分布特征进行参数预估,输出目标控制参数;目标控制参数,是用于对虚拟人脸的人脸网格中的、且与第二表情相关的关联顶点进行控制的参数。将目标控制参数作为目标控制器的控制参数;目标控制器与虚拟人脸的人脸网格的各顶点具有绑定关系。通过目标控制器,基 于目标控制参数控制人脸网格中的关联顶点移动,以在虚拟人脸中产生第二表情。
本申请还另外提供一种应用场景,该应用场景应用上述的表情生成方法。具体地,该表情生成方法可应用于虚拟动物表情生成场景。终端可获取基于真实动物脸部得到的表情位置差异信息;表情位置差异信息,用于表征真实动物脸部的动物脸部关键点在第一表情和第二表情下的位置差异。获取在第一表情下的虚拟动物脸部的动物脸部关键点,得到初始虚拟动物脸部关键点;虚拟动物脸部是虚拟对象的动物脸部。基于表情位置差异信息和初始虚拟动物脸部关键点,得到虚拟动物脸部在第二表情下的目标虚拟动物脸部关键点。基于目标虚拟动物脸部关键点提取关键点分布特征,并基于关键点分布特征,确定目标控制参数;目标控制参数,是用于对虚拟动物脸部的动物脸部网格中的、且与第二表情相关的关联顶点进行控制的参数。基于目标控制参数控制动物脸部网格中的关联顶点移动,以在虚拟动物脸部中产生第二表情。
应该理解的是,虽然上述各实施例的流程图中的各个步骤按照顺序依次显示,但是这些步骤并不是必然按照顺序依次执行。除非本文中有明确的说明,这些步骤的执行并没有严格的顺序限制,这些步骤可以以其它的顺序执行。而且,上述各实施例中的至少一部分步骤可以包括多个子步骤或者多个阶段,这些子步骤或者阶段并不必然是在同一时刻执行完成,而是可以在不同的时刻执行,这些子步骤或者阶段的执行顺序也不必然是依次进行,而是可以与其它步骤或者其它步骤的子步骤或者阶段的至少一部分轮流或者交替地执行。
在一个实施例中,如图16所示,提供了一种表情生成装置1600,该装置可以采用软件模块或硬件模块,或者是二者的结合成为计算机设备的一部分,该装置具体包括:
获取模块1601,用于获取基于真实脸部得到的表情位置差异信息;表情位置差异信息,用于表征真实脸部的脸部关键点在第一表情和第二表情下的位置差异;获取在第一表情下的虚拟脸部的脸部关键点,得到初始虚拟脸部关键点;虚拟脸部是虚拟对象的脸部。
确定模块1602,用于基于表情位置差异信息和初始虚拟脸部关键点,得到虚拟脸部在第二表情下的目标虚拟脸部关键点;基于目标虚拟脸部关键点提取关键点分布特征,并基于关键点分布特征,确定目标控制参数;目标控制参数,是用于对虚拟脸部的脸部网格中的、且与第二表情相关的关联顶点进行控制的参数。
生成模块1603,用于基于目标控制参数控制脸部网格中的关联顶点移动,以在虚拟脸部中产生第二表情。
在一个实施例中,获取模块1601还用于获取真实脸部在第一表情下的第一表情图像;获取真实脸部在第二表情下的第二表情图像;从第一表情图像中,定位真实脸部的脸部关键点的第一位置信息;从第二表情图像中,定位真实脸部的脸部关键点的第二位置信息;确定第一位置信息和第二位置信息之间的差异,得到表情位置差异信息。
在一个实施例中,第一位置信息和第二位置信息是通过已训练的关键点检测模型检测得到;装置还包括:
第一训练模块1604,用于获取样本数据;样本数据包括样本表情图像和针对样本表情图像标注的脸部关键点的参考位置信息;将样本表情图像输入至待训练的关键点检测模型,得到脸部关键点的预测位置信息;基于脸部关键点的预测位置信息与脸部关键点的参考位置信息之间的误差,确定第一损失值;朝着使第一损失值减小的方向,对待训练的关键点检测模型进行迭代训练,直至满足迭代停止条件时,得到已训练的关键点检测模型。
在一个实施例中,获取模块1601还用于获取初始虚拟脸部关键点的第三位置信息;对第一位置信息、第二位置信息和第三位置信息分别进行归一化处理,得到归一化后的第一位置信息、归一化后的第二位置信息以及归一化后的第三位置信息;基于归一化后的第一位置信息与归一化后的第二位置信息之间的差异,得到表情位置差异信息;确定模块1602还用于将归一化后的第三位置信息按照表情位置差异信息进行调整,得到虚拟脸部 在第二表情下的目标虚拟脸部关键点。
在一个实施例中,获取模块1601还用于在对第一位置信息、第二位置信息和第三位置信息中的每一个进行归一化处理时,将被进行归一化处理的位置信息作为当前位置信息;针对每个当前位置信息,确定满足位置稳定条件的基准脸部关键点;基于基准脸部关键点确定当前位置信息对应的参照点,并确定基准脸部关键点与参照点之间的相对距离;基于任意两个基准脸部关键点之间的距离,确定缩放比例;根据相对距离和缩放比例,对当前位置信息进行归一化处理。
在一个实施例中,确定模块1602还用于将归一化处理后的第三位置信息按照表情位置差异信息进行调整,得到虚拟脸部在第二表情下的脸部关键点的中间态位置信息;基于第三位置信息对应的相对距离和缩放比例,对中间态位置信息进行反归一化处理,得到虚拟脸部在第二表情下的目标虚拟脸部关键点。
在一个实施例中,确定模块1602还用于将目标虚拟脸部关键点输入至已训练的参数预估模型,以通过已训练的参数预估模型对目标虚拟脸部关键点进行分布特征提取,得到关键点分布特征,并基于关键点分布特征进行参数预估,输出目标控制参数。
在一个实施例中,装置还包括:
第二训练模块1605,用于获取用于产生样本表情的参考控制参数;获取参考控制参数对应的样本脸部关键点;将样本脸部关键点输入至待训练的参数预估模型,得到预测控制参数;基于预测控制参数与参考控制参数之间的误差,确定第二损失值;朝着使第二损失值减小的方向,对待训练的参数预估模型进行迭代训练,直至满足迭代停止条件时,得到已训练的参数预估模型。
在一个实施例中,样本脸部关键点位于脸部网格中的目标网格区域内;样本脸部关键点与构成目标网格区域的多个目标顶点相对应;第二训练模块1605还用于针对每个样本脸部关键点,确定样本脸部关键点所对应的各个目标顶点,确定各个目标顶点在样本表情下的空间坐标;空间坐标是在世界坐标系下的坐标;确定样本脸部关键点在目标网格区域中的区域坐标;区域坐标,是在基于目标网格区域建立的区域坐标系下的坐标;基于各个目标顶点的空间坐标,对样本脸部关键点的区域坐标进行坐标转换,得到样本脸部关键点的空间坐标,以获得与参考控制参数对应的样本脸部关键点。
在一个实施例中,区域坐标系,是以区域坐标原点建立的坐标系;区域坐标原点,是构成目标网格区域的多个目标顶点中的任意一个;第二训练模块1605还用于基于各个目标顶点的空间坐标和样本脸部关键点的区域坐标,确定样本脸部关键点在世界坐标系下相对于区域坐标原点的相对位置;获取区域坐标原点在世界坐标系下的空间坐标;基于区域坐标原点的空间坐标和相对位置,确定样本脸部关键点的空间坐标。
在一个实施例中,第二训练模块1605还用于获取关键点索引文件;关键点索引文件记录有各样本脸部关键点所分别对应的各个目标顶点的顶点标识,以及与顶点标识具有对应关系的相应样本脸部关键点在目标网格区域中的区域坐标;通过样本脸部关键点所对应的各个目标顶点的顶点标识,从关键点索引文件中查找与顶点标识具有对应关系的区域坐标,得到样本脸部关键点在目标网格区域中的区域坐标。
在一个实施例中,生成模块1603还用于将目标控制参数作为目标控制器的控制参数;目标控制器与虚拟脸部的脸部网格的各顶点具有绑定关系;通过目标控制器,基于目标控制参数控制脸部网格中的关联顶点移动,以在虚拟脸部中产生第二表情。
参考图17,在一个实施例中,表情生成装置1600还可以包括第一训练模块1604和第二训练模块1605。
上述表情生成装置,获取基于真实脸部得到的表情位置差异信息,其中,表情位置差异信息,可用于表征真实脸部的脸部关键点在第一表情和第二表情下的位置差异。获取在第一表情下的虚拟脸部的脸部关键点,得到初始虚拟脸部关键点。基于表情位置差异信息 和初始虚拟脸部关键点,可以得到虚拟脸部在第二表情下的目标虚拟脸部关键点,基于目标虚拟脸部关键点可以提取关键点分布特征,基于关键点分布特征,可以快速确定目标控制参数,其中,目标控制参数,是用于对虚拟脸部的脸部网格中的、且与第二表情相关的关联顶点进行控制的参数。进而,基于目标控制参数控制脸部网格中的关联顶点移动,可以在虚拟脸部中产生第二表情。相较于传统的表情生成方法,本申请的表情生成方法,只需要获取虚拟脸部在第一表情下的脸部关键点,结合真实脸部的脸部关键点在不同表情下的表情位置差异,就能便捷地控制在虚拟脸部中产生相应变化的表情,提升了表情生成效率。
关于表情生成装置的具体限定可以参见上文中对于表情生成方法的限定,在此不再赘述。上述表情生成装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算机设备中的处理器中,也可以以软件形式存储于计算机设备中的存储器中,以便于处理器调用执行以上各个模块对应的操作。
在一个实施例中,提供了一种计算机设备,该计算机设备可以是终端,其内部结构图可以如图18所示。该计算机设备包括通过***总线连接的处理器、存储器、通信接口、显示屏和输入装置。其中,该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包括非易失性存储介质、内存储器。该非易失性存储介质存储有操作***和计算机可读指令。该内存储器为非易失性存储介质中的操作***和计算机可读指令的运行提供环境。该计算机设备的通信接口用于与外部的终端进行有线或无线方式的通信,无线方式可通过WIFI、运营商网络、NFC(近场通信)或其他技术实现。该计算机可读指令被处理器执行时以实现一种表情生成方法。该计算机设备的显示屏可以是液晶显示屏或者电子墨水显示屏,该计算机设备的输入装置可以是显示屏上覆盖的触摸层,也可以是计算机设备外壳上设置的按键、轨迹球或触控板,还可以是外接的键盘、触控板或鼠标等。
本领域技术人员可以理解,图18中示出的结构,仅仅是与本申请方案相关的部分结构的框图,并不构成对本申请方案所应用于其上的计算机设备的限定,具体的计算机设备可以包括比图中所示更多或更少的部件,或者组合某些部件,或者具有不同的部件布置。
在一个实施例中,还提供了一种计算机设备,包括存储器和一个或多个处理器,存储器中存储有计算机可读指令,该一个或多个处理器执行计算机可读指令时实现上述各方法实施例中的步骤。
在一个实施例中,提供了一个或多个计算机可读存储介质,存储有计算机可读指令,该计算机可读指令被一个或多个处理器执行时实现上述各方法实施例中的步骤。
在一个实施例中,提供了一种计算机程序产品,包括计算机可读指令,计算机可读指令被一个或多个处理器执行时实现上述各方法实施例中的步骤。
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,是可以通过计算机可读指令来指令相关的硬件来完成,所述的计算机可读指令可存储于一非易失性计算机可读取存储介质中,该计算机可读指令在执行时,可包括如上述各方法的实施例的流程。其中,本申请所提供的各实施例中所使用的对存储器、存储、数据库或其它介质的任何引用,均可包括非易失性和易失性存储器中的至少一种。非易失性存储器可包括只读存储器(Read-Only Memory,ROM)、磁带、软盘、闪存或光存储器等。易失性存储器可包括随机存取存储器(Random Access Memory,RAM)或外部高速缓冲存储器。作为说明而非局限,RAM可以是多种形式,比如静态随机存取存储器(Static Random Access Memory,SRAM)或动态随机存取存储器(Dynamic Random Access Memory,DRAM)等。
需要说明的是,本申请所涉及的用户信息(包括但不限于用户设备信息、用户个人信息等)和数据(包括但不限于用于分析的数据、存储的数据、展示的数据等),均为经用户授权或者经过各方充分授权的信息和数据,且相关数据的收集、使用和处理需要遵守 相关国家和地区的相关法律法规和标准。
以上实施例的各技术特征可以进行任意的组合,为使描述简洁,未对上述实施例中的各个技术特征所有可能的组合都进行描述,然而,只要这些技术特征的组合不存在矛盾,都应当认为是本说明书记载的范围。
以上所述实施例仅表达了本申请的几种实施方式,其描述较为具体和详细,但并不能因此而理解为对发明专利范围的限制。应当指出的是,对于本领域的普通技术人员来说,在不脱离本申请构思的前提下,还可以做出若干变形和改进,这些都属于本申请的保护范围。因此,本申请专利的保护范围应以所附权利要求为准。

Claims (18)

  1. 一种表情生成方法,其特征在于,应用于终端,所述方法包括:
    获取基于真实脸部得到的表情位置差异信息;所述表情位置差异信息,用于表征真实脸部的脸部关键点在第一表情和第二表情下的位置差异;
    获取在第一表情下的虚拟脸部的脸部关键点,得到初始虚拟脸部关键点;
    基于所述表情位置差异信息和所述初始虚拟脸部关键点,得到所述虚拟脸部在第二表情下的目标虚拟脸部关键点;
    基于所述目标虚拟脸部关键点提取关键点分布特征,并基于所述关键点分布特征,确定目标控制参数;所述目标控制参数,是用于对所述虚拟脸部的脸部网格中的、且与所述第二表情相关的关联顶点进行控制的参数;
    基于所述目标控制参数控制所述脸部网格中的所述关联顶点移动,以在所述虚拟脸部中产生所述第二表情。
  2. 根据权利要求1所述的方法,其特征在于,所述获取基于真实脸部得到的表情位置差异信息,包括:
    获取真实脸部在第一表情下的第一表情图像;
    获取所述真实脸部在第二表情下的第二表情图像;
    从所述第一表情图像中,定位所述真实脸部的脸部关键点的第一位置信息;
    从所述第二表情图像中,定位所述真实脸部的脸部关键点的第二位置信息;
    确定所述第一位置信息和所述第二位置信息之间的差异,得到表情位置差异信息。
  3. 根据权利要求2所述的方法,其特征在于,所述第一位置信息和所述第二位置信息是通过已训练的关键点检测模型检测得到;得到所述已训练的关键点检测模型的步骤,包括:
    获取样本数据;所述样本数据包括样本表情图像和针对所述样本表情图像标注的脸部关键点的参考位置信息;
    将所述样本表情图像输入至待训练的关键点检测模型,得到脸部关键点的预测位置信息;
    基于所述脸部关键点的预测位置信息与所述脸部关键点的参考位置信息之间的误差,确定第一损失值;
    朝着使所述第一损失值减小的方向,对所述待训练的关键点检测模型进行迭代训练,直至满足迭代停止条件时,得到已训练的关键点检测模型。
  4. 根据权利要求2所述的方法,其特征在于,所述方法还包括:
    获取所述初始虚拟脸部关键点的第三位置信息;
    对所述第一位置信息、所述第二位置信息和所述第三位置信息分别进行归一化处理,得到归一化后的第一位置信息、归一化后的第二位置信息以及归一化后的第三位置信息;
    所述确定所述第一位置信息和所述第二位置信息之间的差异，得到表情位置差异信息，包括：
    基于归一化后的第一位置信息与归一化后的第二位置信息之间的差异,得到表情位置差异信息;
    所述基于所述表情位置差异信息和所述初始虚拟脸部关键点,得到所述虚拟脸部在第二表情下的目标虚拟脸部关键点,包括:
    将归一化后的第三位置信息按照所述表情位置差异信息进行调整,得到所述虚拟脸部在第二表情下的目标虚拟脸部关键点。
  5. 根据权利要求4所述的方法,其特征在于,所述对所述第一位置信息、所述第二位置信息和所述第三位置信息分别进行归一化处理,包括:
    在对所述第一位置信息、所述第二位置信息和所述第三位置信息中的每一个进行归一化处理时，将被进行归一化处理的位置信息作为当前位置信息；
    针对每个所述当前位置信息,确定满足位置稳定条件的基准脸部关键点;
    基于所述基准脸部关键点确定所述当前位置信息对应的参照点,并确定所述基准脸部关键点与所述参照点之间的相对距离;
    基于任意两个所述基准脸部关键点之间的距离,确定缩放比例;
    根据所述相对距离和所述缩放比例,对所述当前位置信息进行归一化处理。
  6. 根据权利要求5所述的方法,其特征在于,所述位置稳定条件包括以下至少一种情况:
    从第一表情转换至第二表情时,脸部关键点在虚拟脸部上的移动距离小于预设移动距离;
    从第一表情转换至第二表情时,脸部关键点在虚拟脸部上不发生移动。
  7. 根据权利要求5所述的方法,其特征在于,所述将归一化后的第三位置信息按照所述表情位置差异信息进行调整,得到所述虚拟脸部在第二表情下的目标虚拟脸部关键点,包括:
    将归一化处理后的所述第三位置信息按照所述表情位置差异信息进行调整,得到所述虚拟脸部在第二表情下的脸部关键点的中间态位置信息;
    基于所述第三位置信息对应的相对距离和缩放比例,对所述中间态位置信息进行反归一化处理,得到所述虚拟脸部在第二表情下的目标虚拟脸部关键点。
  8. 根据权利要求1所述的方法,其特征在于,所述基于所述目标虚拟脸部关键点提取关键点分布特征,并基于所述关键点分布特征,确定目标控制参数,包括:
    将所述目标虚拟脸部关键点输入至已训练的参数预估模型,以通过所述已训练的参数预估模型对所述目标虚拟脸部关键点进行分布特征提取,得到关键点分布特征,并基于所述关键点分布特征进行参数预估,输出目标控制参数。
  9. 根据权利要求8所述的方法,其特征在于,得到所述已训练的参数预估模型的步骤,包括:
    获取用于产生样本表情的参考控制参数;
    获取所述参考控制参数对应的样本脸部关键点;
    将所述样本脸部关键点输入至待训练的参数预估模型,得到预测控制参数;
    基于所述预测控制参数与所述参考控制参数之间的误差,确定第二损失值;
    朝着使所述第二损失值减小的方向,对所述待训练的参数预估模型进行迭代训练,直至满足迭代停止条件时,得到已训练的参数预估模型。
  10. 根据权利要求9所述的方法,其特征在于,所述样本脸部关键点位于脸部网格中的目标网格区域内;所述样本脸部关键点与构成所述目标网格区域的多个目标顶点相对应;所述获取所述参考控制参数对应的样本脸部关键点,包括:
    针对每个样本脸部关键点,确定所述样本脸部关键点所对应的各个目标顶点,确定所述各个目标顶点在所述样本表情下的空间坐标;所述空间坐标是在世界坐标系下的坐标;
    确定所述样本脸部关键点在目标网格区域中的区域坐标;所述区域坐标,是在基于所述目标网格区域建立的区域坐标系下的坐标;
    基于所述各个目标顶点的所述空间坐标,对所述样本脸部关键点的所述区域坐标进行坐标转换,得到所述样本脸部关键点的空间坐标,以获得与所述参考控制参数对应的样本脸部关键点。
  11. 根据权利要求10所述的方法,其特征在于,所述区域坐标系,是以区域坐标原点建立的坐标系;所述区域坐标原点,是构成所述目标网格区域的多个目标顶点中的任意一个;
    所述基于所述各个目标顶点的所述空间坐标，对所述样本脸部关键点的所述区域坐标进行坐标转换，得到所述样本脸部关键点的空间坐标，包括：
    基于所述各个目标顶点的所述空间坐标和所述样本脸部关键点的区域坐标,确定所述样本脸部关键点在世界坐标系下相对于所述区域坐标原点的相对位置;
    获取区域坐标原点在所述世界坐标系下的空间坐标;
    基于所述区域坐标原点的空间坐标和所述相对位置,确定所述样本脸部关键点的空间坐标。
  12. 根据权利要求10所述的方法,其特征在于,所述确定所述样本脸部关键点在目标网格区域中的区域坐标,包括:
    获取关键点索引文件;所述关键点索引文件记录有各样本脸部关键点所分别对应的各个目标顶点的顶点标识,以及与所述顶点标识具有对应关系的相应样本脸部关键点在目标网格区域中的区域坐标;
    通过所述样本脸部关键点所对应的各个目标顶点的顶点标识,从所述关键点索引文件中查找与所述顶点标识具有对应关系的区域坐标,得到所述样本脸部关键点在目标网格区域中的区域坐标。
  13. 根据权利要求1至12中任一项所述的方法,其特征在于,所述基于所述目标控制参数控制所述脸部网格中的所述关联顶点移动,以在所述虚拟脸部中产生所述第二表情,包括:
    将所述目标控制参数作为目标控制器的控制参数;所述目标控制器与所述虚拟脸部的脸部网格的各顶点具有绑定关系;
    通过所述目标控制器,基于所述目标控制参数控制所述脸部网格中的所述关联顶点移动,以在所述虚拟脸部中产生所述第二表情。
  14. 根据权利要求13所述的方法,其特征在于,所述目标控制器的控制参数包括转变参数、旋转参数、等级参数、剪切参数、旋转顺序参数和旋转轴参数等中的至少一种。
  15. 一种表情生成装置,其特征在于,所述装置包括:
    获取模块,用于获取基于真实脸部得到的表情位置差异信息;所述表情位置差异信息,用于表征真实脸部的脸部关键点在第一表情和第二表情下的位置差异;获取在第一表情下的虚拟脸部的脸部关键点,得到初始虚拟脸部关键点;
    确定模块,用于基于所述表情位置差异信息和所述初始虚拟脸部关键点,得到所述虚拟脸部在第二表情下的目标虚拟脸部关键点;基于所述目标虚拟脸部关键点提取关键点分布特征,并基于所述关键点分布特征,确定目标控制参数;所述目标控制参数,是用于对所述虚拟脸部的脸部网格中的、且与所述第二表情相关的关联顶点进行控制的参数;
    生成模块,用于基于所述目标控制参数控制所述脸部网格中的所述关联顶点移动,以在所述虚拟脸部中产生所述第二表情。
  16. 一种计算机设备,包括存储器和一个或多个处理器,所述存储器存储有计算机可读指令,其特征在于,所述一个或多个处理器执行所述计算机可读指令时实现权利要求1至14中任一项所述的方法的步骤。
  17. 一个或多个计算机可读存储介质,存储有计算机可读指令,其特征在于,所述计算机可读指令被一个或多个处理器执行时实现权利要求1至14中任一项所述的方法的步骤。
  18. 一种计算机程序产品,包括计算机可读指令,其特征在于,所述计算机可读指令被一个或多个处理器执行时实现权利要求1至14中任一项所述的方法的步骤。
PCT/CN2022/126077 2021-12-06 2022-10-19 表情生成方法、装置、设备、介质和计算机程序产品 WO2023103600A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/331,906 US20230316623A1 (en) 2021-12-06 2023-06-08 Expression generation method and apparatus, device, and medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111473341.7 2021-12-06
CN202111473341.7A CN113870401B (zh) 2021-12-06 2021-12-06 表情生成方法、装置、设备、介质和计算机程序产品

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/331,906 Continuation US20230316623A1 (en) 2021-12-06 2023-06-08 Expression generation method and apparatus, device, and medium

Publications (1)

Publication Number Publication Date
WO2023103600A1 true WO2023103600A1 (zh) 2023-06-15

Family

ID=78986087

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/126077 WO2023103600A1 (zh) 2021-12-06 2022-10-19 表情生成方法、装置、设备、介质和计算机程序产品

Country Status (3)

Country Link
US (1) US20230316623A1 (zh)
CN (1) CN113870401B (zh)
WO (1) WO2023103600A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113870401B (zh) * 2021-12-06 2022-02-25 腾讯科技(深圳)有限公司 表情生成方法、装置、设备、介质和计算机程序产品
CN116778107A (zh) * 2022-03-11 2023-09-19 腾讯科技(深圳)有限公司 表情模型的生成方法、装置、设备及介质
CN115239860B (zh) * 2022-09-01 2023-08-01 北京达佳互联信息技术有限公司 表情数据生成方法、装置、电子设备及存储介质
CN115393486B (zh) * 2022-10-27 2023-03-24 科大讯飞股份有限公司 虚拟形象的生成方法、装置、设备及存储介质
CN117152382A (zh) * 2023-10-30 2023-12-01 海马云(天津)信息技术有限公司 虚拟数字人面部表情计算方法和装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080037836A1 (en) * 2006-08-09 2008-02-14 Arcsoft, Inc. Method for driving virtual facial expressions by automatically detecting facial expressions of a face image
CN109410298A (zh) * 2018-11-02 2019-03-01 北京恒信彩虹科技有限公司 一种虚拟模型的制作方法及表情变化方法
US20190325633A1 (en) * 2018-04-23 2019-10-24 Magic Leap, Inc. Avatar facial expression representation in multidimensional space
CN111632374A (zh) * 2020-06-01 2020-09-08 网易(杭州)网络有限公司 游戏中虚拟角色的脸部处理方法、装置及可读存储介质
CN113870401A (zh) * 2021-12-06 2021-12-31 腾讯科技(深圳)有限公司 表情生成方法、装置、设备、介质和计算机程序产品

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103854306A (zh) * 2012-12-07 2014-06-11 山东财经大学 一种高真实感的动态表情建模方法
CN108257162B (zh) * 2016-12-29 2024-03-05 北京三星通信技术研究有限公司 合成脸部表情图像的方法和装置
US10860841B2 (en) * 2016-12-29 2020-12-08 Samsung Electronics Co., Ltd. Facial expression image processing method and apparatus
CN109087380B (zh) * 2018-08-02 2023-10-20 咪咕文化科技有限公司 一种漫画动图生成方法、装置及存储介质
CN109147024A (zh) * 2018-08-16 2019-01-04 Oppo广东移动通信有限公司 基于三维模型的表情更换方法和装置
CN112766027A (zh) * 2019-11-05 2021-05-07 广州虎牙科技有限公司 图像处理方法、装置、设备及存储介质

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080037836A1 (en) * 2006-08-09 2008-02-14 Arcsoft, Inc. Method for driving virtual facial expressions by automatically detecting facial expressions of a face image
US20190325633A1 (en) * 2018-04-23 2019-10-24 Magic Leap, Inc. Avatar facial expression representation in multidimensional space
CN109410298A (zh) * 2018-11-02 2019-03-01 北京恒信彩虹科技有限公司 一种虚拟模型的制作方法及表情变化方法
CN111632374A (zh) * 2020-06-01 2020-09-08 网易(杭州)网络有限公司 游戏中虚拟角色的脸部处理方法、装置及可读存储介质
CN113870401A (zh) * 2021-12-06 2021-12-31 腾讯科技(深圳)有限公司 表情生成方法、装置、设备、介质和计算机程序产品

Also Published As

Publication number Publication date
CN113870401A (zh) 2021-12-31
US20230316623A1 (en) 2023-10-05
CN113870401B (zh) 2022-02-25

Similar Documents

Publication Publication Date Title
WO2023103600A1 (zh) 表情生成方法、装置、设备、介质和计算机程序产品
AU2016201655B2 (en) Estimating depth from a single image
WO2021120834A1 (zh) 基于生物识别的手势识别方法、装置、计算机设备及介质
US11308655B2 (en) Image synthesis method and apparatus
CN103425964B (zh) 图像处理设备和图像处理方法
CN111652974B (zh) 三维人脸模型的构建方法、装置、设备及存储介质
KR102599977B1 (ko) 정보 처리 방법과 장치, 전자 기기, 컴퓨터 판독가능 저장 매체 및 매체에 저장된 컴퓨터 프로그램
WO2021232609A1 (zh) Rgb-d图像的语义分割方法、***、介质及电子设备
JP7443647B2 (ja) キーポイント検出及びモデル訓練方法、装置、デバイス、記憶媒体、並びにコンピュータプログラム
US20220358675A1 (en) Method for training model, method for processing video, device and storage medium
CN113705297A (zh) 检测模型的训练方法、装置、计算机设备和存储介质
WO2023020358A1 (zh) 面部图像处理方法、面部图像处理模型的训练方法、装置、设备、存储介质及程序产品
WO2021223738A1 (zh) 模型参数的更新方法、装置、设备及存储介质
CN105096353A (zh) 一种图像处理方法及装置
US20240037898A1 (en) Method for predicting reconstructabilit, computer device and storage medium
CN110956131A (zh) 单目标追踪方法、装置及***
KR20230132350A (ko) 연합 감지 모델 트레이닝, 연합 감지 방법, 장치, 설비 및 매체
CN110717405A (zh) 人脸特征点定位方法、装置、介质及电子设备
CN113902848A (zh) 对象重建方法、装置、电子设备及存储介质
CN115953330B (zh) 虚拟场景图像的纹理优化方法、装置、设备和存储介质
CN112233161A (zh) 手部图像深度确定方法、装置、电子设备及存储介质
EP4086853A2 (en) Method and apparatus for generating object model, electronic device and storage medium
Ng et al. Syntable: A synthetic data generation pipeline for unseen object amodal instance segmentation of cluttered tabletop scenes
Galbrun et al. Interactive redescription mining
CN112700481B (zh) 基于深度学习的纹理图自动生成方法、装置、计算机设备和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22903024

Country of ref document: EP

Kind code of ref document: A1