CN113570634A - Object three-dimensional reconstruction method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113570634A
Authority
CN
China
Prior art keywords
difference
dimensional
key information
image
vertex
Prior art date
Legal status
Pending
Application number
CN202010348666.1A
Other languages
Chinese (zh)
Inventor
马里千
张国鑫
孙佳佳
张博宁
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202010348666.1A
Publication of CN113570634A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 17/20 - Finite element generation, e.g. wire-frame surface description, tesselation
    • G06T 17/205 - Re-meshing
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/10 - Segmentation; Edge detection
    • G06T 7/194 - Segmentation; Edge detection involving foreground-background segmentation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/60 - Analysis of geometric attributes
    • G06T 7/62 - Analysis of geometric attributes of area, perimeter, diameter or volume
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/70 - Determining position or orientation of objects or cameras
    • G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/90 - Determination of colour characteristics

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The disclosure relates to the technical field of computer vision, and in particular to an object three-dimensional reconstruction method and device, an electronic device, and a storage medium, which are used to solve the problem of low efficiency in three-dimensional reconstruction of an object. The disclosed method comprises the following steps: acquiring key information of each part of an object in a two-dimensional target image; obtaining, based on the key information, a loss function determined by a first difference and a second difference, wherein the first difference is the difference between the key information determined from the projection of a three-dimensional target mesh and the key information obtained from the two-dimensional target image, and the second difference is the difference between the relative positions of each vertex and its adjacent vertices in the three-dimensional target mesh and the relative positions of each vertex and its adjacent vertices in a preset three-dimensional mesh; and performing three-dimensional reconstruction from the position coordinates of each vertex in the three-dimensional target mesh used when the loss function takes its minimum value, to generate a three-dimensional model corresponding to the object. Because the method does not construct the three-dimensional model directly in three-dimensional space, but obtains the three-dimensional mesh from the two-dimensional image and then generates the three-dimensional model, it is more efficient.

Description

Object three-dimensional reconstruction method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer vision technologies, and in particular, to a method and an apparatus for three-dimensional reconstruction of an object, an electronic device, and a storage medium.
Background
With the popularity of functions such as selfies and short videos, users' demands for beautification and related face special effects have grown steadily. Many of these special effects depend on 3D (Three-Dimensional) head models, for example movie and animation special effects that include anime-style 3D head models. How to build a large number of anime-style 3D head models for online special effects with limited design resources has therefore become a very important problem. In the related art, every anime-style 3D head model is designed by a 3D designer, and the rough outline of a single head model alone consumes about two person-weeks, so the design workload is enormous.
In summary, there is currently no simple and efficient method available to designers, and designing an anime-style 3D head model is therefore time-consuming and inefficient.
Disclosure of Invention
The present disclosure provides an object three-dimensional reconstruction method, an object three-dimensional reconstruction device, an electronic device, and a storage medium, so as to at least solve the problem of low object three-dimensional reconstruction efficiency in the related art. The technical scheme of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided an object three-dimensional reconstruction method, including:
acquiring key information of each part in an object contained in a two-dimensional target image, wherein the key information comprises key points and/or mask images;
obtaining a loss function determined by a first difference and a second difference based on the key information, wherein the first difference is a difference between the key information determined according to the projection of the three-dimensional target mesh of the object in the two-dimensional space and the key information obtained according to the two-dimensional target image, and the second difference is a difference between the relative position of each vertex and the adjacent vertex in the three-dimensional target mesh and the relative position of each vertex and the adjacent vertex in the preset three-dimensional mesh of the object;
determining the position coordinates of each vertex in the three-dimensional target grid used when the loss function takes the minimum value;
and performing three-dimensional reconstruction according to the determined position coordinates of each vertex to generate a three-dimensional model corresponding to the object.
In an optional implementation manner, the two-dimensional target image is obtained by superimposing layered images of the respective portions according to a preset layer order, each layered image includes a portion of at least one object, and different layered images include different portions;
the acquiring key information of each part in the object contained in the two-dimensional target image comprises:
acquiring a layered image of each part in an object contained in the two-dimensional target image, wherein the layered image of each part is generated according to a preset layered template of each part;
and extracting key information corresponding to each part from the layered images of each part.
In an optional implementation manner, the acquiring key information of each part in the object included in the two-dimensional target image includes:
for any part, converting an image corresponding to the part in the two-dimensional target image into a mask image, and taking the mask image as key information of the part; or
for any part, extracting the key points corresponding to the part from the two-dimensional target image, and taking the extracted key points as the key information of the part.
In an optional embodiment, before the obtaining the loss function determined by the first difference and the second difference based on the key information, the method further includes:
if the key information of all the parts is a mask image, taking the sum of the distances between the first mask vector and the second mask vector corresponding to each part as the first difference; wherein the first mask vector corresponding to a part is obtained by combining the position coordinates of all pixel points in the mask image of the part obtained according to the projection, and the second mask vector corresponding to the part is obtained by combining the position coordinates of all pixel points in the mask image of the part obtained according to the two-dimensional target image.
In an optional embodiment, before the obtaining the loss function determined by the first difference and the second difference based on the key information, the method further includes:
if the key information of all the parts is the key point, taking the sum of the distances between the first position vector and the second position vector corresponding to each part as the first difference; and the first position vector corresponding to the part is the position vector of the key point of the part obtained according to the projection, and the second position vector corresponding to the part is the position vector of the key point of the part obtained according to the two-dimensional target image.
In an optional embodiment, before the obtaining the loss function determined by the first difference and the second difference based on the key information, the method further includes:
if the key information of the first target part in all the parts is a mask image and the key information of the second target part is a key point, taking the sum of the distances between the first mask vector and the second mask vector corresponding to each first target part as a mask difference, taking the sum of the distances between the first position vector and the second position vector corresponding to each second target part as a key point difference, and determining the first difference according to the mask difference and the key point difference.
In an optional embodiment, before the obtaining the loss function determined by the first difference and the second difference based on the key information, the method further includes:
for any part, determining the area of the part in the two-dimensional target image;
if the area occupied by the part is larger than a preset threshold value, determining the part as a first target part;
and if the area occupied by the part is not larger than a preset threshold value, determining the part as a second target part.
In an optional embodiment, before the obtaining the loss function determined by the first difference and the second difference based on the key information, the method further includes:
and taking the distance between a first coordinate vector obtained by combining the Laplace coordinates of all vertexes in the three-dimensional target grid and a second coordinate vector obtained by combining the Laplace coordinates of all vertexes in the preset three-dimensional grid as the second difference.
In an alternative embodiment, the obtaining a loss function determined by the first difference and the second difference based on the key information includes:
and determining the loss function according to the first difference, the second difference and a third difference, wherein the third difference comprises a normal vector difference between a normal vector of each vertex in the three-dimensional target grid and a normal vector of each vertex in the preset three-dimensional grid, and/or a side length difference between a side length of each side in the three-dimensional target grid and a side length of each side in the preset three-dimensional grid.
In an optional embodiment, before the obtaining the loss function determined by the first difference and the second difference based on the key information, the method further includes:
and combining the normal vector coordinates of each vertex in the three-dimensional target grid to obtain a first normal vector, and combining the normal vector coordinates of each vertex in the preset three-dimensional grid to obtain a second normal vector, wherein the distance between the first normal vector and the second normal vector is used as the normal vector difference.
In an optional embodiment, before the obtaining the loss function determined by the first difference and the second difference based on the key information, the method further includes:
and taking the distance between a first side length vector obtained by combining the side lengths of all sides in the three-dimensional target grid and a second side length vector obtained by combining the side lengths of all sides in the preset three-dimensional grid as the side length difference.
In an alternative embodiment, the three-dimensional reconstruction according to the determined position coordinates of each vertex to generate a three-dimensional model corresponding to the object in the two-dimensional target image includes:
adjusting the positions of the vertexes in the preset three-dimensional grid according to the position coordinates of the vertexes;
and performing texture rendering on the adjusted preset three-dimensional grid to generate a three-dimensional model corresponding to the two-dimensional target face image.
According to a second aspect of the embodiments of the present disclosure, there is provided an apparatus for three-dimensional reconstruction of an object, including:
an information acquisition unit configured to perform acquisition of key information of each part in an object contained in a two-dimensional target image, wherein the key information includes a key point and/or a mask image;
a function construction unit configured to perform obtaining a loss function determined by a first difference and a second difference based on the key information, wherein the first difference is a difference between key information determined according to a projection of a three-dimensional target mesh of the object in a two-dimensional space and key information obtained according to the two-dimensional target image, and the second difference is a difference between a relative position of each vertex and an adjacent vertex in the three-dimensional target mesh and a relative position of each vertex and an adjacent vertex in a preset three-dimensional mesh of the object;
a vertex determination unit configured to perform determination of position coordinates of respective vertices in a three-dimensional target mesh used when the loss function takes a minimum value;
a three-dimensional reconstruction unit configured to perform three-dimensional reconstruction from the determined position coordinates of the respective vertices, generating a three-dimensional model corresponding to the object.
In an optional implementation manner, the two-dimensional target image is obtained by superimposing layered images of the respective portions according to a preset layer order, each layered image includes a portion of at least one object, and different layered images include different portions;
the information acquisition unit is specifically configured to perform:
acquiring a layered image of each part in an object contained in the two-dimensional target image, wherein the layered image of each part is generated according to a preset layered template of each part;
and extracting key information corresponding to each part from the layered images of each part.
In an alternative embodiment, the information obtaining unit is specifically configured to perform:
for any part, converting an image corresponding to the part in the two-dimensional target image into a mask image, and taking the mask image as key information of the part; or
for any part, extracting the key points corresponding to the part from the two-dimensional target image, and taking the extracted key points as the key information of the part.
In an alternative embodiment, before obtaining the loss function determined by the first difference and the second difference based on the key information, the function construction unit is further configured to perform:
if the key information of all the parts is a mask image, taking the sum of the distances between the first mask vector and the second mask vector corresponding to each part as the first difference; wherein the first mask vector corresponding to a part is obtained by combining the position coordinates of all pixel points in the mask image of the part obtained according to the projection, and the second mask vector corresponding to the part is obtained by combining the position coordinates of all pixel points in the mask image of the part obtained according to the two-dimensional target image.
In an alternative embodiment, before obtaining the loss function determined by the first difference and the second difference based on the key information, the function construction unit is further configured to perform:
if the key information of all the parts is the key point, taking the sum of the distances between the first position vector and the second position vector corresponding to each part as the first difference; and the first position vector corresponding to the part is the position vector of the key point of the part obtained according to the projection, and the second position vector corresponding to the part is the position vector of the key point of the part obtained according to the two-dimensional target image.
In an alternative embodiment, before obtaining the loss function determined by the first difference and the second difference based on the key information, the function construction unit is further configured to perform:
if the key information of the first target part in all the parts is a mask image and the key information of the second target part is a key point, taking the sum of the distances between the first mask vector and the second mask vector corresponding to each first target part as a mask difference, taking the sum of the distances between the first position vector and the second position vector corresponding to each second target part as a key point difference, and determining the first difference according to the mask difference and the key point difference.
In an optional implementation manner, the key information of the first target region in all the regions is a mask image, and the key information of the second target region is a key point;
before obtaining, based on the key information, a loss function determined by the first difference and the second difference, the function construction unit is further configured to perform:
for any part, determining the area of the part in the two-dimensional target image;
if the area occupied by the part is larger than a preset threshold value, determining the part as a first target part;
and if the area occupied by the part is not larger than a preset threshold value, determining the part as a second target part.
In an alternative embodiment, before obtaining the loss function determined by the first difference and the second difference based on the key information, the function construction unit is further configured to perform:
and taking the distance between a first coordinate vector obtained by combining the Laplace coordinates of all vertexes in the three-dimensional target grid and a second coordinate vector obtained by combining the Laplace coordinates of all vertexes in the preset three-dimensional grid as the second difference.
In an alternative embodiment, the function building unit is specifically configured to perform:
and determining the loss function according to the first difference, the second difference and a third difference, wherein the third difference comprises a normal vector difference between a normal vector of each vertex in the three-dimensional target grid and a normal vector of each vertex in the preset three-dimensional grid, and/or a side length difference between a side length of each side in the three-dimensional target grid and a side length of each side in the preset three-dimensional grid.
In an alternative embodiment, before obtaining the loss function determined by the first difference and the second difference based on the key information, the function construction unit is further configured to perform:
and combining the normal vector coordinates of each vertex in the three-dimensional target grid to obtain a first normal vector, and combining the normal vector coordinates of each vertex in the preset three-dimensional grid to obtain a second normal vector, wherein the distance between the first normal vector and the second normal vector is used as the normal vector difference.
In an alternative embodiment, before obtaining the loss function determined by the first difference and the second difference based on the key information, the function construction unit is further configured to perform:
and taking the distance between a first side length vector obtained by combining the side lengths of all sides in the three-dimensional target grid and a second side length vector obtained by combining the side lengths of all sides in the preset three-dimensional grid as the side length difference.
In an alternative embodiment, the three-dimensional reconstruction unit is specifically configured to perform:
adjusting the positions of the vertexes in the preset three-dimensional grid according to the position coordinates of the vertexes;
and performing texture rendering on the adjusted preset three-dimensional grid to generate a three-dimensional model corresponding to the two-dimensional target face image.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the method for three-dimensional reconstruction of an object according to any one of the first aspect of the embodiments of the present disclosure.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a non-transitory readable storage medium, wherein instructions, when executed by a processor of an electronic device, enable the electronic device to perform the method for three-dimensional reconstruction of an object according to any one of the first aspect of the embodiments of the present disclosure.
According to a fifth aspect of the embodiments of the present disclosure, there is provided a computer program product which, when run on an electronic device, causes the electronic device to perform the object three-dimensional reconstruction method according to any one of the first aspect of the embodiments of the present disclosure.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
The embodiments of the present disclosure do not construct the three-dimensional model directly; instead, the construction is based on a two-dimensional target image, and a standard three-dimensional mesh of the object, i.e. a preset three-dimensional mesh of the object, is set in advance. Based on the key information, a first difference between the key information determined from the projection of the three-dimensional target mesh to be determined in two-dimensional space and the key information determined from the two-dimensional target image, and a second difference between the three-dimensional target mesh of the object and the preset three-dimensional mesh, are obtained. The coordinates of the vertices of the three-dimensional target mesh used when the loss function determined by the first difference and the second difference takes its minimum value, i.e. the solution obtained when the objective function constructed from the first difference and the second difference is minimized, directly give the position coordinates of the vertices of the three-dimensional target mesh corresponding to the two-dimensional target image. A three-dimensional mesh can be constructed from these vertex coordinates, and the three-dimensional model corresponding to the object can then be obtained directly.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a flow chart illustrating a method for three-dimensional reconstruction of an object in accordance with an exemplary embodiment;
FIG. 2 is a schematic diagram illustrating a two-dimensional target image in accordance with an exemplary embodiment;
FIG. 3 is a schematic diagram illustrating a facial skin layer in accordance with an exemplary embodiment;
FIG. 4 is a schematic diagram illustrating a mouth layer in accordance with an exemplary embodiment;
FIG. 5 is a schematic diagram illustrating a nose map layer in accordance with an exemplary embodiment;
FIG. 6 is a schematic diagram illustrating a left eye layer in accordance with an exemplary embodiment;
FIG. 7 is a schematic diagram illustrating a right eye layer in accordance with an exemplary embodiment;
FIG. 8A is a schematic illustration of another two-dimensional target image shown in accordance with an exemplary embodiment;
FIG. 8B is a schematic diagram illustrating a three-dimensional face image in accordance with an exemplary embodiment;
FIG. 9 is a flowchart illustrating a complete method for three-dimensional reconstruction of an object in accordance with an exemplary embodiment;
FIG. 10 is a schematic diagram illustrating a complete method for three-dimensional reconstruction of a human face in accordance with an exemplary embodiment;
FIG. 11 is a block diagram illustrating an apparatus for three-dimensional reconstruction of an object in accordance with an exemplary embodiment;
FIG. 12 is a block diagram illustrating an electronic device in accordance with an exemplary embodiment;
FIG. 13 is a block diagram illustrating a computing device in accordance with an example embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Some of the words that appear in the text are explained below:
1. The term "and/or" in the embodiments of the present disclosure describes an association relationship between associated objects and means that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
2. The term "electronic device" in the embodiments of the present disclosure may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
3. The term "three-dimensional reconstruction" in the embodiments of the present disclosure refers to establishing, for a three-dimensional object, a mathematical model suitable for computer representation and processing. It is the basis for processing, operating on, and analyzing the properties of three-dimensional objects in a computer environment, and is also a key technology for establishing, in a computer, a virtual reality that expresses the objective world. In the embodiments of the present disclosure, three-dimensional reconstruction refers to determining the key information of each part of an object from a two-dimensional target image containing the object, and then building a three-dimensional model of the object from that key information on the basis of the object's preset three-dimensional mesh.
4. The term "rendering" in the embodiments of the present disclosure means projecting a three-dimensional object onto a two-dimensional plane, i.e., representing a three-dimensional graphic with a two-dimensional plane: given the vertex, color and triangle information of a three-dimensional model, the two-dimensional plane image corresponding to the three-dimensional model is obtained.
5. The term "mask" in the embodiments of the present disclosure refers to an image that separates a region of interest from the rest of the image. The mask image in the embodiments of the present disclosure is an image that represents the outline of a part. For example, in the mask image corresponding to the left-eye part, the pixel value of pixel points in the left-eye region is 1.0 and the pixel value of pixel points in the background region (the non-left-eye region) is 0.0, where a pixel value of 1 appears white in the mask image and a pixel value of 0 appears black. In the embodiments of the present disclosure, the mask image of any part may be converted directly from a layered image, or may be obtained by first determining the region (image) corresponding to the part from a non-layered two-dimensional image and then converting that region.
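As a minimal sketch of such a mask image (the toy image size and the way the left-eye region is labelled are assumptions for illustration only):

import numpy as np

def region_mask(label_map, region_label):
    # Pixels belonging to the region get value 1.0 (white); all other pixels get 0.0 (black).
    return (label_map == region_label).astype(np.float32)

# Toy 8 x 8 labelling in which label 3 marks the left-eye region.
label_map = np.zeros((8, 8), dtype=np.int32)
label_map[2:4, 1:4] = 3
left_eye_mask = region_mask(label_map, region_label=3)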
6. The term "Laplace (Laplacian) coordinates" in the embodiments of the present disclosure refers to the coordinates obtained by converting the Euclidean space coordinates of the mesh vertices with the Laplace operator. The Laplace operator is a second-order differential operator in n-dimensional Euclidean space, defined as the divergence of the gradient: Δf = ∇·(∇f) = ∇²f.
The Laplacian coordinate contains the local detail characteristics of the grid, and can better keep the local detail of the grid model. Therefore, the Laplacian coordinate is efficient and robust.
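For illustration, a common discrete form of this idea computes the Laplace coordinate of each vertex as its offset from the average of its neighbors; the uniform weighting below is an assumption, since the disclosure does not fix a particular weighting scheme.

import numpy as np

def laplace_coordinates(vertices, neighbors):
    # vertices: (N, 3) Euclidean coordinates; neighbors: list of adjacent-vertex index lists.
    lap = np.zeros_like(vertices)
    for i, nbrs in enumerate(neighbors):
        lap[i] = vertices[i] - vertices[nbrs].mean(axis=0)
    return lap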
7. The term "object" in the embodiments of the present disclosure refers to the target object of the three-dimensional reconstruction, such as a human face (which may be an anime-style face), a cartoon animal face, or another face-like target object.
8. The term "key information" in the embodiments of the present disclosure refers to information related to key positions, regions, or contours, etc., related to various parts of an object in an image, such as a mask image or key points, etc.
The application scenarios described in the embodiments of the present disclosure are intended to illustrate the technical solutions of the embodiments more clearly and do not limit the technical solutions provided by the embodiments of the present disclosure; as a person of ordinary skill in the art knows, with the emergence of new application scenarios, the technical solutions provided by the embodiments of the present disclosure are equally applicable to similar technical problems. In the description of the present disclosure, unless otherwise indicated, "a plurality" means two or more.
Fig. 1 is a flowchart illustrating a method for three-dimensional reconstruction of an object according to an exemplary embodiment, as shown in fig. 1, including the following steps.
In step S11, key information of each part in the object included in the two-dimensional target image is acquired, wherein the key information includes key points and/or mask images;
in step S12, based on the key information, obtaining a loss function determined by a first difference between the key information determined from the projection of the three-dimensional target mesh of the object in the two-dimensional space and the key information obtained from the two-dimensional target image and a second difference between the relative positions of the vertices and the adjacent vertices in the three-dimensional target mesh and the relative positions of the vertices and the adjacent vertices in the preset three-dimensional mesh of the object;
the three-dimensional mesh composed of triangular surfaces is taken as an example, adjacent triangular surfaces have shared edges, and the shared edges are the case. The relative position of each vertex relative to the surrounding vertices can be determined according to the relative positions of each vertex and the adjacent vertices, and when the relative position of each vertex relative to the surrounding vertices in the three-dimensional target mesh is consistent with the relative position of the vertex relative to the surrounding vertices in the preset three-dimensional mesh, the relative positions are relatively smooth.
In step S13, position coordinates of each vertex in the three-dimensional target mesh used when the loss function takes the minimum value are determined;
in step S14, a three-dimensional model corresponding to the object is generated by performing three-dimensional reconstruction from the position coordinates of the determined vertices.
In the embodiment of the present disclosure, the objects are different, and the preset three-dimensional meshes of the objects are also different, for example, when the object is a human face, the preset three-dimensional meshes are three-dimensional mesh models of the human face, and when the object is a cartoon animal face, the preset three-dimensional meshes are three-dimensional mesh models of the cartoon animal face.
Both the three-dimensional target mesh and the preset three-dimensional mesh are three-dimensional meshes. A 3D model spliced together from a mesh is generally formed by splicing triangular patches; that is, the cells of the three-dimensional mesh are triangles, and adjacent triangles share some of their vertices.
It should be noted that the preset three-dimensional mesh in the embodiment of the present disclosure is a standard three-dimensional mesh made in advance. Taking a face as an example of the object, the preset three-dimensional mesh of the face may be a standard 3D head model whose vertex positions are fixed and set in advance. The three-dimensional target mesh, in contrast, corresponds to the object in the two-dimensional target image. In the embodiment of the present disclosure, the position coordinates of the vertices of the three-dimensional target mesh are the independent variables, so the loss function determined from the first difference and the second difference is a dependent variable of those independent variables; the solution obtained when the loss function takes its minimum value is the set of position coordinates of the vertices of the three-dimensional target mesh used when the loss function takes the minimum value. Based on the position coordinates determined in this way, a three-dimensional model conforming to the object in the two-dimensional target image can be generated.
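To make the role of the vertex coordinates as independent variables concrete, the following sketch minimizes a loss of the form described above by gradient descent; PyTorch, the Adam optimizer, the step count, the learning rate, and the equal weighting of the two differences are all assumptions for illustration and are not prescribed by the disclosure. first_difference and second_difference stand for any differentiable implementations of the two differences.

import torch

def reconstruct_vertices(preset_vertices, first_difference, second_difference,
                         steps=500, lr=1e-2, w1=1.0, w2=1.0):
    # Independent variable: the vertex positions of the three-dimensional target mesh,
    # initialized from the preset three-dimensional mesh.
    target_vertices = preset_vertices.detach().clone().requires_grad_(True)
    optimizer = torch.optim.Adam([target_vertices], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = w1 * first_difference(target_vertices) + w2 * second_difference(target_vertices)
        loss.backward()
        optimizer.step()
    # Approximate position coordinates used when the loss function takes its minimum value.
    return target_vertices.detach()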
In the above-described embodiment, the present disclosure does not construct the three-dimensional model directly; instead, the construction is based on a two-dimensional target image, and a standard three-dimensional mesh of the object, i.e. a preset three-dimensional mesh of the object, is set in advance. Based on the key information, a first difference between the key information determined from the projection of the three-dimensional target mesh to be determined in two-dimensional space and the key information determined from the two-dimensional target image, and a second difference between the three-dimensional target mesh of the object and the preset three-dimensional mesh, are obtained. The coordinates of the vertices of the three-dimensional target mesh used when the loss function determined by the first difference and the second difference takes its minimum value, i.e. the solution obtained when the objective function constructed from the first difference and the second difference is minimized, directly give the position coordinates of the vertices of the three-dimensional target mesh corresponding to the two-dimensional target image. A three-dimensional mesh can then be constructed from these vertex coordinates, and the three-dimensional model corresponding to the object obtained directly.
In an optional implementation manner, the two-dimensional target image is obtained by superimposing layered images: the object contained in the two-dimensional target image is obtained by superimposing the layered images of the respective parts according to a preset layer order, each layered image contains one part of at least one object, and different layered images contain different parts. In this case, the key information of each part of the object contained in the two-dimensional target image is acquired in the following specific manner:
acquiring a layered image of each part in an object contained in a two-dimensional target image; key information corresponding to each part is extracted from the layered image of each part.
In the embodiment of the present disclosure, the layered images of each part are generated according to a preset layered template of each part, and specifically, the layered images may be designed by (animation) designers according to their own will or needs when making the three-dimensional model, or may be preset templates that are designed in advance and stored in a database. For example, the database has layered images related to each part, and the layered images corresponding to different parts include various types, for example, for a left eye or a right eye, the part may be divided into a large eye, a small eye, a circular eye, a crescent eye, and the like, and each part selects a preset template, and the two-dimensional target image can be obtained by stacking according to a preset image layer sequence.
In the embodiment of the disclosure, layering the image makes it convenient to identify the key information corresponding to each part, and the texture of a three-dimensional model generated from layered images is more complete. For example, without layering, an eye may occlude part of the facial skin, and when the three-dimensional model generated from such an image performs certain actions, such as blinking, the texture information of the eyelid is missing in the closed-eye state. When layered images are used, the complete texture features of each part can be obtained, the texture of the generated three-dimensional model is more complete, and the eyelid texture is complete in the closed-eye state.
Taking the object as a human face as an example, the layered image of each part includes but is not limited to part or all of the following:
the face skin picture layer, mouth picture layer, nose picture layer, left eye picture layer, right eye picture layer.
Fig. 2 is a diagram obtained by superimposing a plurality of layered images according to a preset layer order, wherein the preset layer order is a facial skin layer, a mouth layer, a nose layer, a left eye layer, and a right eye layer from bottom to top. As shown in fig. 3 to 7, the two-dimensional target image shown in fig. 2 includes layered images, where fig. 3 is a face skin layer, fig. 4 is a mouth layer, fig. 5 is a nose layer, fig. 6 is a left eye layer, and fig. 7 is a right eye layer.
In an optional implementation manner, when key information of each part in an object included in a two-dimensional target image is acquired, for any part, the following two manners may be adopted:
the method comprises the steps of obtaining a first mode, converting an image corresponding to a certain part in a two-dimensional target image into a mask image, and taking the mask image as key information of the part.
The image can be segmented based on image segmentation and other methods, wherein a certain part in the two-dimensional target image is used as a foreground, and the other parts are used as backgrounds, so that a mask image corresponding to the certain part is obtained.
If the part is a face, in the face mask image, pixel points in the face area are white, and pixel points in other areas are black; or the pixels in the face area are black, and the pixels in other areas are white, and the like.
It should be noted that the above-mentioned method of converting an image corresponding to a certain portion in a two-dimensional target image into a mask image is merely an example, and any method of acquiring a mask image is applicable to the embodiments of the present disclosure.
In this manner, if the two-dimensional target image includes layered images, the image corresponding to a part is that part's layered image, and the key information of the part can be acquired directly from the layered image.
Taking the facial skin layered image shown in fig. 3 as an example, key information of the facial skin part can be obtained by directly converting the layered image into a mask image.
If the two-dimensional target image does not include a layered image, it is necessary to recognize a region (image) corresponding to the facial skin region from the two-dimensional target image and convert the image of the partial region into a mask image as key information of the facial skin region.
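One way this conversion could look in practice, assuming the layered image carries an alpha channel that marks the drawn part (the file layout and threshold are assumptions; the disclosure does not fix the layer format):

import numpy as np

def layer_to_mask(layer_rgba, alpha_threshold=0.5):
    # layer_rgba: (H, W, 4) array with values in [0, 1]; pixels drawn in the layer become 1.0.
    alpha = layer_rgba[..., 3]
    return (alpha > alpha_threshold).astype(np.float32)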
Mode two: extracting the key points corresponding to a certain part in the two-dimensional target image, and taking the extracted key points as the key information of the part.
In the embodiment of the present disclosure, when acquiring a keypoint corresponding to a certain portion based on a two-dimensional target image, a mask image of the certain portion may be acquired first, and an outline of the certain portion may be determined based on the mask image, and at this time, the outline may be equally divided by a distance to determine a plurality of keypoints.
Alternatively, the key points can be extracted directly from the layered image. Taking the mouth layer image shown in fig. 4 as an example, the left and right end points of the mouth layer can be taken directly as the key points corresponding to the mouth, i.e., the left mouth corner point and the right mouth corner point.
Taking the nose layered image shown in fig. 5 as an example, the central point of the obtained nose layer can be directly used as a key point corresponding to the nose, i.e., a nose tip point.
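A minimal sketch of extracting such key points from a part's binary mask; taking the left/right extremes as mouth corner points and the central point as the nose tip follows the examples above, while everything else (NumPy, the exact tie-breaking) is an assumption.

import numpy as np

def endpoints_and_center(mask):
    # mask: (H, W) binary array for a single part.
    ys, xs = np.nonzero(mask)
    left = (int(xs.min()), int(ys[xs.argmin()]))    # left-most point, e.g. the left mouth corner
    right = (int(xs.max()), int(ys[xs.argmax()]))   # right-most point, e.g. the right mouth corner
    center = (int(xs.mean()), int(ys.mean()))       # central point, e.g. the nose tip
    return left, right, center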
Based on the two acquisition modes, when the key information of each part of the object is acquired, the key information of all the parts may be mask images; the key information of all the parts may be key points; or the key information of some parts may be mask images while that of the other parts may be key points.
Taking the object being a human face as an example, fig. 2 is a schematic diagram of a two-dimensional target image in the embodiment of the present disclosure; the object contained in the image is a human face, and the parts of the face may be divided into: facial skin, left eye, right eye, nose, and mouth. When acquiring the key information of each part of the object, mask images obtained by converting the images (regions) of each of these 5 parts may be acquired and taken as the key information of each part; alternatively, the key points corresponding to each of the 5 parts may be acquired, for example 24 key points for the facial skin, 6 key points for the left eye, 6 key points for the right eye, 2 key points for the mouth (the left mouth corner and the right mouth corner), and 1 key point for the nose (the nose tip).
In addition, key information of the face skin and the left and right eyes may be mask images, and key information of the nose and mouth may be key points. Specifically, a face mask image obtained by converting a face skin layer is used as key information corresponding to a face skin part, a left eye mask image obtained by converting a left eye layer is used as key information corresponding to a left eye part, and a right eye mask image obtained by converting a right eye layer is used as key information corresponding to a right eye part; and respectively taking the key points of the mouth extracted from the mouth image layer as key information corresponding to the mouth part, and taking the key points of the nose extracted from the nose image layer as key information corresponding to the nose part.
In the embodiment, different types of key information are extracted according to different parts, so that the calculation amount can be effectively reduced on the basis of ensuring that the generated three-dimensional model conforms to the object in the two-dimensional target image.
In the embodiments of the present disclosure, the first difference mainly refers to a difference between key information of each part obtained from projection of a three-dimensional target mesh to be determined and key information of each part obtained from a two-dimensional target image. Specifically, according to the type of the key information of each part, the first difference can be divided into the following three forms, which are described in detail below:
in the first form, if the key information of all the parts is mask images, the sum of the distances between the first mask vector and the second mask vector corresponding to each part is used as a first difference; the first mask vector corresponding to the part is obtained by combining the position coordinates of all pixel points in the mask image of the part obtained by projection, and the second mask vector corresponding to the part is obtained by combining the position coordinates of all pixel points in the mask image of the part obtained by the two-dimensional target image; the first difference D1, for example, is as follows:
D1 = Σ_{i=0}^{N1-1} || R_i(M) - I_i ||

where N1 is the total number of parts in the object, M is the three-dimensional target mesh, and R_i(M) is a rendering operator: R_i(M) computes the projection contour, in two-dimensional space, of the part of the three-dimensional target mesh corresponding to the region where the i-th part is located, and, expressed in vector form, gives the first mask vector obtained by combining the position coordinates of all pixel points in the mask image of the part obtained from the projection; I_i denotes the second mask vector obtained by combining the position coordinates of all pixel points in the mask image of the part obtained from the two-dimensional target image, and different values of i denote the mask images of different parts.
For example, when the facial skin region is 50 × 50 pixels, the first mask vector and the second mask vector are long vectors formed by splicing the position coordinates of 2500 pixel points.
Still taking the face shown in fig. 2 as an example, the face includes 5 parts in total, so N1 is 5 and i ranges from 0 to 4, different values of i representing different parts of the face: i = 0 denotes the facial skin, i = 1 the left eye, i = 2 the right eye, i = 3 the nose, and i = 4 the mouth. In this case, the first difference is the sum of five vector differences: the difference between the first mask vector and the second mask vector corresponding to the facial skin, to the left eye, to the right eye, to the nose, and to the mouth respectively, where the difference between two vectors means the distance between the two vectors.
In this way, the constraint based on the first difference (mask image) can ensure that the projection profile of the corresponding region of the three-dimensional target grid on the two-dimensional space is as consistent as possible with the mask images of the respective sites acquired based on the two-dimensional target image.
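A sketch of the first form, assuming the projected mask vectors R_i(M) and the target mask vectors I_i have already been flattened into equally sized one-dimensional vectors; the unsquared Euclidean distance mirrors the "sum of the distances" wording.

import numpy as np

def first_difference_masks(projected_mask_vectors, target_mask_vectors):
    # One vector per part: the combined pixel position coordinates of that part's mask,
    # from the projection and from the two-dimensional target image respectively.
    return float(sum(np.linalg.norm(p - t)
                     for p, t in zip(projected_mask_vectors, target_mask_vectors)))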
In the second form, if the key information of all the parts is key points, the sum of the distances between the first position vector and the second position vector corresponding to each part is used as the first difference; the first position vector corresponding to a part is the position vector of the key point of the part obtained from the projection, and the second position vector corresponding to the part is the position vector of the key point of the part obtained from the two-dimensional target image. For example, the first difference D2 is given by the following formula:

D2 = Σ_{j=0}^{N2-1} || LM_j(M) - P_j ||

where N2 is the total number of key points corresponding to all parts, and LM_j(M) is a key-point projection operator used to compute the projection of the j-th key point in two-dimensional space; expressed in vector form, it gives the first position vector of the key point of a part. P_j denotes the second position vector formed by the position coordinates of the corresponding key point of the part obtained from the two-dimensional target image, and different values of j denote different key points.
Still taking the face shown in fig. 2 as an example, the face includes 5 parts in total and the total number N2 of key points corresponding to these 5 parts is 39, so j ranges from 0 to 38, different values of j representing the different key points of the parts of the face, for example j = 0 represents the nose tip point, j = 1 the left mouth corner point, j = 2 the right mouth corner point, j = 3 the first eye key point, j = 4 the second eye key point, and so on. In this case, the first difference is the sum of the 39 vector differences corresponding to the 39 key points: the difference between the first position vector and the second position vector corresponding to the nose tip point, to the left mouth corner point, to the right mouth corner point, to the first eye key point, and so on.
In this way, the constraint based on the first difference (keypoint) can ensure that the projection of the keypoint corresponding to the three-dimensional target mesh on the two-dimensional space is as consistent as possible with the keypoint corresponding to each part acquired based on the two-dimensional target image.
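The second form can be sketched in the same spirit, assuming the projected and target key points are stacked into (N2, 2) arrays in the same order:

import numpy as np

def first_difference_keypoints(projected_keypoints, target_keypoints):
    # Distance between corresponding key points, summed over all N2 key points.
    return float(np.linalg.norm(projected_keypoints - target_keypoints, axis=1).sum())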
In the third form, if the key information of the first target parts among all the parts is a mask image and the key information of the second target parts is key points, the sum of the distances between the first mask vector and the second mask vector corresponding to each first target part is taken as a mask difference, the sum of the distances between the first position vector and the second position vector corresponding to each second target part is taken as a key point difference, and the first difference is determined from the mask difference and the key point difference. The first difference therefore comprises two parts, namely the mask difference D3 and the key point difference D4, calculated as follows:
D3 = Σ_{i=0}^{N3-1} || R_i(M) - I_i ||

D4 = Σ_{j=0}^{N4-1} || LM_j(M) - P_j ||
where N3 represents the total number of first target parts (N3 is less than N1) and N4 represents the total number of key points corresponding to the second target parts (N4 is less than N2). The first difference is thus composed of two parts: the mask difference determined from the first target parts and the key point difference determined from the second target parts.
In an alternative embodiment, the first target parts and the second target parts may be selected from the respective parts as follows:
determining the area of any part in the two-dimensional target image;
if the area occupied by the part is larger than a preset threshold value, determining the part as a first target part;
and if the face occupied by the part is not larger than the preset threshold value, determining the part as a second target part.
Still taking the human face shown in fig. 2 as an example, the area occupied by the facial skin, the left eye and the right eye is large, while the area occupied by the mouth and the nose is small. Assume that the area occupied by the facial skin region shown in fig. 3 is A1, the area occupied by the mouth region shown in fig. 4 is A2, the area occupied by the nose region shown in fig. 5 is A3, the area occupied by the left eye region shown in fig. 6 is A4, the area occupied by the right eye region shown in fig. 7 is A5, and the preset threshold is B, with A1 > A4 = A5 > B > A2 > A3. The first target parts determined according to the preset threshold are therefore the facial skin, the left eye and the right eye, and the second target parts are the nose and the mouth, so N3 is 3 and N4 is 3.
Wherein the mask differences are: the difference between the first mask vector and the second mask vector corresponding to the facial skin, the difference between the first mask vector and the second mask vector corresponding to the left eye part, the difference between the first mask vector and the second mask vector corresponding to the right eye part, and the sum of the differences of the three vectors.
The key point differences are: the difference between the first and second position vectors corresponding to the nose, and the difference between the first and second position vectors corresponding to the mouth, the sum of these two vector differences.
In the above embodiment, the parts are divided into two types according to the area of the region each occupies, namely first target parts and second target parts, and the two types of parts use different types of key information. A part occupying a larger area would require more key points to describe it, whereas a part occupying a smaller area corresponds to fewer key points and can be depicted from a small number of them; this ensures that every part can be effectively depicted from the acquired key information while the amount of computation is reduced.
It should be noted that, in the embodiment of the present disclosure, in addition to distinguishing the first target portion and the second target portion according to the area, the first target portion and the second target portion may also be distinguished according to the number of pixel points included in the area occupied by each portion, and the like, which is not specifically limited herein.
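A sketch of this selection step; the part names, area values and threshold below merely mirror the A1 > A4 = A5 > B > A2 > A3 example and are assumptions for illustration.

def split_parts_by_area(part_areas, threshold):
    # part_areas: mapping from part name to the area that part occupies in the two-dimensional target image.
    first_target = [part for part, area in part_areas.items() if area > threshold]    # use mask images
    second_target = [part for part, area in part_areas.items() if area <= threshold]  # use key points
    return first_target, second_target

areas = {"facial_skin": 9000, "left_eye": 1200, "right_eye": 1200, "mouth": 400, "nose": 300}
first_target_parts, second_target_parts = split_parts_by_area(areas, threshold=800)
# first_target_parts: ['facial_skin', 'left_eye', 'right_eye']; second_target_parts: ['mouth', 'nose']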
Having described the several forms of the first difference, the second difference in the embodiment of the present disclosure is described below. The second difference refers to the difference between the relative position of each vertex and its adjacent vertices in the three-dimensional target mesh and the relative position of each vertex and its adjacent vertices in the preset three-dimensional mesh of the object.
In an alternative embodiment, the Laplace coordinates of each vertex are obtained by applying a Laplace transformation to the Euclidean space coordinates of that vertex. The Laplace coordinates carry the local detail features of the mesh and can be used to determine the relative position between vertices, so the second difference can be constructed from the Laplace coordinates of the vertices in the mesh, specifically:
and taking the distance between a first coordinate vector obtained by combining the Laplace coordinates of all vertexes in the three-dimensional target grid and a second coordinate vector obtained by combining the Laplace coordinates of all vertexes in the preset three-dimensional grid as a second difference.
In the embodiment of the present disclosure, the Laplace coordinate is obtained by converting the Euclidean space coordinate of a vertex through the Laplace operator, and it carries the local detail features of the mesh, so the local detail of the mesh model can be better maintained. Therefore, the second difference is constructed from the coordinate vectors obtained by combining the Laplace coordinates of the vertices, which constrains the Laplace coordinates of the vertices in the three-dimensional target mesh to stay as consistent as possible with those in the preset three-dimensional mesh. The second difference D5 is calculated as follows:
D_5 = \left\| \mathrm{Lap}(M) - \mathrm{Lap}(M_0) \right\|
where M is the three-dimensional target mesh and M_0 is the preset three-dimensional mesh. Lap(·) is the Laplace coordinate operator: Lap(M), the coordinate vector obtained by splicing the vertex-by-vertex Laplace coordinates of the three-dimensional target mesh M, is the first coordinate vector, and Lap(M_0), the coordinate vector obtained by splicing the vertex-by-vertex Laplace coordinates of the preset three-dimensional mesh M_0, is the second coordinate vector.
In the embodiments of the disclosure, M and M_0 contain the same number of vertices; only the coordinate values may differ. The Laplace coordinate of one vertex is a three-dimensional vector, so if the mesh M has 1000 vertices, the first coordinate vector is a 3000-dimensional vector, and the second coordinate vector is likewise a 3000-dimensional vector. For example, if the Laplace coordinates of vertex A0 are [x0, y0, z0], those of vertex A1 are [x1, y1, z1], ..., and those of vertex A999 are [x999, y999, z999], then the first coordinate vector is [x0, y0, z0, x1, y1, z1, ..., x999, y999, z999], and the second coordinate vector is formed in the same way.
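A sketch of how such coordinate vectors might be assembled is given below. The uniform (neighbour-average) Laplacian is an assumption made for illustration, since the disclosure only states that the Euclidean coordinates of each vertex are transformed by a Laplace operator.

```python
import numpy as np

def laplace_coordinate_vector(vertices, neighbors):
    """Concatenate per-vertex Laplace coordinates into one long vector.

    vertices:  (V, 3) array of Euclidean vertex coordinates.
    neighbors: list where neighbors[i] holds the indices of the vertices
               adjacent to vertex i.
    """
    lap = np.empty_like(vertices, dtype=float)
    for i, nbrs in enumerate(neighbors):
        # uniform Laplacian: the vertex minus the mean of its neighbours
        lap[i] = vertices[i] - vertices[nbrs].mean(axis=0)
    return lap.reshape(-1)            # 3000-dimensional when V = 1000

def second_difference(target_vertices, preset_vertices, neighbors):
    """D5: distance between the first and second coordinate vectors."""
    return np.linalg.norm(laplace_coordinate_vector(target_vertices, neighbors)
                          - laplace_coordinate_vector(preset_vertices, neighbors))
```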
In the above embodiment, constructing the second difference based on the laplacian coordinates is more efficient and robust.
In an alternative embodiment, when obtaining, based on the key information, the position coordinates of each vertex in the three-dimensional target mesh used when the loss function determined by the first difference and the second difference takes its minimum value, two cases can be distinguished:
In the first case, the loss function is determined according to the first difference and the second difference, and the coordinate position of each vertex in the corresponding three-dimensional target mesh when the loss function takes its minimum value is taken as the position coordinates of each vertex in the three-dimensional target mesh used when the loss function takes the minimum value.
Optionally, the loss function is a sum of products of each of the first difference and the second difference and the corresponding weight.
Assuming that the first difference includes only the mask difference and taking a human face as the object, the first difference covers 5 parts in total, so i ranges from 0 to 4, and the loss function L1 is:
L_1 = \alpha_1 \sum_{i=0}^{4} \left\| V_i^{(1)} - V_i^{(2)} \right\| + \alpha_2 \left\| \mathrm{Lap}(M) - \mathrm{Lap}(M_0) \right\|
where α1 is the weight corresponding to the first difference and α2 is the weight corresponding to the second difference.
Assuming that the first difference includes only the key point difference and again taking a human face as an example, the 5 parts contain 39 key points in total and i ranges over all of them; the loss function L2 can then be expressed as:
L_2 = \beta_1 \sum_{i} \left\| P_i^{(1)} - P_i^{(2)} \right\| + \beta_2 \left\| \mathrm{Lap}(M) - \mathrm{Lap}(M_0) \right\|
where the sum runs over all key points of the parts, β1 is the weight corresponding to the first difference, and β2 is the weight corresponding to the second difference.
Assuming that the first difference includes both the mask difference and the key point difference, with the first target parts being the facial skin, the left eye and the right eye listed above and the second target parts being the nose and the mouth listed above, the loss function L3 can be expressed as:
L_3 = \gamma_1 D_3 + \gamma_2 D_4 + \gamma_3 D_5
At this time, the first difference and the second difference together comprise three differences: the mask difference, the key point difference and the second difference, where γ1 is the weight corresponding to the mask difference, γ2 is the weight corresponding to the key point difference, and γ3 is the weight corresponding to the second difference.
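A minimal sketch of the weighted combination in this case is shown below; the default weight values are arbitrary placeholders, not values taken from the disclosure, and the three difference terms are assumed to have been computed already.

```python
def loss_case_one(d3_mask, d4_keypoint, d5_laplace,
                  gamma1=1.0, gamma2=1.0, gamma3=0.1):
    """L3: sum of each difference multiplied by its corresponding weight."""
    return gamma1 * d3_mask + gamma2 * d4_keypoint + gamma3 * d5_laplace
```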
In the second case, the loss function is determined according to the first difference, the second difference and a third difference, and the coordinate position of each vertex in the corresponding three-dimensional target mesh when the loss function takes its minimum value is taken as the position coordinates of each vertex in the three-dimensional target mesh used when the loss function takes the minimum value.
The third difference comprises a normal vector difference between a normal vector of each vertex in the three-dimensional target grid and a normal vector of each vertex in the preset three-dimensional grid, and/or a side length difference between the side length of each side in the three-dimensional target grid and the side length of each side in the preset three-dimensional grid. That is, the third difference includes one or more of a normal vector difference and a side length difference, similar to the first difference.
The third difference is described in detail below:
The normal vector difference specifically refers to the distance between a first normal vector obtained by combining the normal vector coordinates of each vertex in the three-dimensional target mesh and a second normal vector obtained by combining the normal vector coordinates of each vertex in the preset three-dimensional mesh. When the third difference is the normal vector difference, the third difference D6 is calculated as follows:
D_6 = \left\| \mathrm{Norm}(M) - \mathrm{Norm}(M_0) \right\|
In the embodiment of the disclosure, the normal vector difference constrains the vertex normal directions of the three-dimensional target mesh to stay as consistent as possible with those of the preset three-dimensional mesh. Norm(·) is the vertex normal operator, used to compute the per-vertex normals of the three-dimensional target mesh M and of the preset three-dimensional mesh M_0. Norm(M), the vector obtained by splicing the normal vector coordinates of every vertex of the three-dimensional target mesh M, is the first normal vector, and Norm(M_0), obtained in the same way from the preset three-dimensional mesh M_0, is the second normal vector.
In the embodiments of the disclosure, M and M_0 contain the same number of vertices. The normal coordinates of each vertex form a three-dimensional vector, so if the mesh M has 1000 vertices, the first normal vector is a 3000-dimensional vector, and the second normal vector is likewise a 3000-dimensional vector. For example, if the vertex normal coordinates of vertex A0 are [a0, b0, c0], those of vertex A1 are [a1, b1, c1], ..., and those of vertex A999 are [a999, b999, c999], then the first normal vector is [a0, b0, c0, a1, b1, c1, ..., a999, b999, c999], and the second normal vector is formed in the same way.
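One common way to realise a vertex normal operator such as Norm(·) is to accumulate area-weighted face normals onto the vertices; this particular choice is an assumption made for illustration, since the disclosure does not specify how the per-vertex normals are computed.

```python
import numpy as np

def normal_vector(vertices, faces):
    """Concatenate per-vertex unit normals into one long vector (e.g. Norm(M)).

    vertices: (V, 3) float array; faces: (F, 3) integer array of vertex indices.
    """
    normals = np.zeros_like(vertices, dtype=float)
    v0 = vertices[faces[:, 0]]
    v1 = vertices[faces[:, 1]]
    v2 = vertices[faces[:, 2]]
    face_normals = np.cross(v1 - v0, v2 - v0)        # area-weighted face normals
    for k in range(3):                               # accumulate onto the three corners
        np.add.at(normals, faces[:, k], face_normals)
    normals /= np.linalg.norm(normals, axis=1, keepdims=True) + 1e-12
    return normals.reshape(-1)                       # 3000-dimensional when V = 1000

def normal_vector_difference(target_vertices, preset_vertices, faces):
    """D6: distance between the first and second normal vectors."""
    return np.linalg.norm(normal_vector(target_vertices, faces)
                          - normal_vector(preset_vertices, faces))
```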
The side length difference specifically refers to a distance between a first side length vector obtained by combining the side lengths of all sides in the three-dimensional target grid and a second side length vector obtained by combining the side lengths of all sides in the preset three-dimensional grid. When the third difference is the side length difference, the third difference D7 is calculated as follows:
D_7 = \left\| \mathrm{EL}(M) - \mathrm{EL}(M_0) \right\|
In the embodiment of the disclosure, the side length difference constrains the edge lengths of the three-dimensional target mesh to stay as consistent as possible with those of the preset three-dimensional mesh. EL(·) is the mesh side length operator, used to compute the length of every edge of the three-dimensional target mesh M and of the preset three-dimensional mesh M_0. EL(M), the vector obtained by splicing the lengths of every edge of the three-dimensional target mesh M, is the first side length vector, and EL(M_0), obtained in the same way from the preset three-dimensional mesh M_0, is the second side length vector.
Likewise, in the embodiments of the disclosure, M and M_0 contain the same number of vertices and the same number of edges. Each side length is a single scalar value, so if the mesh M has 1000 edges, the first side length vector is a 1000-dimensional vector, and the second side length vector is likewise a 1000-dimensional vector. For example, if edge EL0 has length el0, edge EL1 has length el1, ..., and edge EL999 has length el999, then the first side length vector is [el0, el1, ..., el999], and the second side length vector is formed in the same way.
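Similarly, the side length operator EL(·) can be sketched as follows, under the assumption that the mesh edges are given as an explicit list of vertex index pairs.

```python
import numpy as np

def edge_length_vector(vertices, edges):
    """Concatenate all edge lengths into one vector (e.g. EL(M)).

    vertices: (V, 3) float array; edges: (E, 2) integer array of vertex index pairs.
    """
    diffs = vertices[edges[:, 0]] - vertices[edges[:, 1]]
    return np.linalg.norm(diffs, axis=1)             # 1000-dimensional when E = 1000

def side_length_difference(target_vertices, preset_vertices, edges):
    """D7: distance between the first and second side length vectors."""
    return np.linalg.norm(edge_length_vector(target_vertices, edges)
                          - edge_length_vector(preset_vertices, edges))
```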
Furthermore, the third difference may also include both the normal vector difference and the side length difference, in which case the third difference includes both D6 and D7.
Optionally, the loss function is a sum of products of each of the first difference, the second difference, and the third difference and the corresponding weight.
For example, when the first difference includes two parts, namely a mask difference and a key point difference, and the third difference includes two parts, namely a normal vector difference and a side length difference, taking the object as a face as an example, the loss function L4 is:
L_4 = \lambda_1 D_3 + \lambda_2 D_4 + \lambda_3 D_5 + \lambda_4 D_6 + \lambda_5 D_7
At this time, the first difference, the second difference and the third difference include 5 differences in total: λ1 is the weight corresponding to the mask difference included in the first difference, λ2 is the weight corresponding to the key point difference included in the first difference, λ3 is the weight corresponding to the second difference, λ4 is the weight corresponding to the normal vector difference included in the third difference, and λ5 is the weight corresponding to the side length difference included in the third difference.
When the first difference only includes one of the mask difference and the keypoint difference, or the third difference only includes one of the normal vector difference and the side length difference, the loss function may also have the following forms:
(Formula images for the loss functions L5 to L12: the eight remaining weighted combinations in which the first difference contains only one of the mask difference and the key point difference and/or the third difference contains only one of the normal vector difference and the side length difference.)
It should be noted that the eight loss functions L5 to L12 listed above are also determined according to the first difference, the second difference and the third difference, so the embodiment of the present disclosure gives a total of 12 calculation formulas for the loss function, all of which are applicable to the embodiment of the present disclosure. Among the 12 calculation modes, the position coordinates of each vertex in the three-dimensional target mesh determined when L5 takes its minimum value are optimal, because it is then guaranteed, on the premise of preserving the relative positions between the vertices and the key information of the three-dimensional target mesh, that the three-dimensional target mesh corresponding to the two-dimensional target image can be obtained by as small a deformation of the preset three-dimensional mesh as possible.
It should also be noted that when the vectors in the embodiment of the present disclosure are obtained by splicing per-vertex coordinates, the first vector and the second vector are spliced in the same order. For example, when the first coordinate vector is formed by splicing the Laplace coordinates of 1000 vertices numbered A0 to A999, and the vertices of M0 corresponding to A0 to A999 are numbered B0 to B999 respectively, the second coordinate vector is formed by splicing the Laplace coordinates of the 1000 vertices of M0 in the order B0 to B999. The other vectors are handled in the same way and are not described again here.
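For illustration, this consistent splicing order can be expressed through a hypothetical vertex correspondence array, as in the sketch below.

```python
import numpy as np

def spliced_in_corresponding_order(preset_per_vertex, correspondence):
    """Splice the preset mesh's per-vertex data in the order B0, B1, ... that
    matches vertices A0, A1, ... of the target mesh.

    preset_per_vertex: (V, 3) array, e.g. the Laplace coordinates of M0.
    correspondence:    length-V index array; correspondence[i] is the index in
                       M0 of the vertex corresponding to vertex i of M.
    """
    return np.asarray(preset_per_vertex)[np.asarray(correspondence)].reshape(-1)
```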
In an alternative embodiment, after the position coordinates of each vertex are determined, a three-dimensional model corresponding to the object in the two-dimensional target image may be generated from them through the following specific process:
adjusting the positions of the vertexes in the preset three-dimensional grid according to the position coordinates of the vertexes;
and performing texture rendering on the adjusted preset three-dimensional grid to generate a three-dimensional model corresponding to the two-dimensional target face image.
In the embodiment of the present disclosure, the positions of the vertices to be adjusted in the preset three-dimensional mesh are adjusted according to the acquired position coordinates, so that each adjusted vertex has the same position coordinates as the corresponding determined vertex. The adjusted preset three-dimensional mesh is in fact identical to a three-dimensional mesh constructed from the determined position coordinates of the vertices. In the embodiment of the disclosure, the three-dimensional target mesh corresponding to the object in the two-dimensional target image is obtained simply by adjusting and deforming the preset three-dimensional mesh, without building a three-dimensional mesh from scratch, which is more efficient; after the three-dimensional target mesh is obtained, texture rendering generates the three-dimensional model corresponding to the object in the two-dimensional target image. On this basis, three-dimensional face reconstruction can be realized and an anime-style 3D head can be generated efficiently and quickly, for use in scenarios such as film and animation special effects.
For example, as shown in fig. 8A and 8B, fig. 8A is a schematic diagram of a two-dimensional target image (containing only a face) in the embodiment of the present disclosure, and fig. 8B is a schematic diagram of the three-dimensional model corresponding to the face in fig. 8A. After the position coordinates of each vertex in the three-dimensional target mesh corresponding to fig. 8A are determined, the corresponding three-dimensional target mesh is obtained; rendering this mesh with the texture in the two-dimensional target image then yields the three-dimensional model shown in fig. 8B, an anime-style 3D head. Once the 3D head is obtained in this way, dynamic expressions and the like can be further designed for it.
In the embodiment of the present disclosure, there are many ways to find the optimal solution of the loss function, such as least square method, gradient descent method, iteration method, etc.
The least square method can obtain the optimal solution directly in one pass. In the gradient descent method, the iterative method and the like, the position coordinates of each vertex in the three-dimensional target mesh (the independent variables) are adjusted repeatedly until the position coordinates used when the loss function takes its minimum value are obtained; the position coordinates used at that point constitute a set of optimal solutions.
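As an illustration of the iterative route, the sketch below adjusts the flattened vertex coordinates by plain gradient descent with numerical gradients; the step size, iteration count and the generic `loss_fn` argument are assumptions, and a practical implementation would normally use analytic gradients.

```python
import numpy as np

def minimize_loss(loss_fn, init_vertices, lr=1e-2, steps=100, eps=1e-4):
    """Adjust the vertex coordinates (the independent variables) to reduce the loss.

    loss_fn:       maps a (V, 3) vertex array to a scalar loss value.
    init_vertices: starting point, e.g. the vertices of the preset mesh M0.
    """
    x = np.array(init_vertices, dtype=float)
    for _ in range(steps):
        grad = np.zeros_like(x)
        flat, gflat = x.reshape(-1), grad.reshape(-1)   # views onto x and grad
        for i in range(flat.size):                      # central-difference gradient
            old = flat[i]
            flat[i] = old + eps
            up = loss_fn(x)
            flat[i] = old - eps
            down = loss_fn(x)
            flat[i] = old
            gflat[i] = (up - down) / (2.0 * eps)
        x -= lr * grad                                   # descend along the gradient
    return x                                             # coordinates near the minimum
```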
Fig. 9 is a flowchart illustrating a complete method for three-dimensional reconstruction of an object according to an exemplary embodiment, which specifically includes the following steps:
s91: acquiring an object layered design draft designed by a planar designer according to a planar design layered template of an object;
s92: reading a mask image of a first target part and key point information of a second target part from the object layered design draft;
s93: constructing a mask difference included in the first difference based on a difference between a mask image determined according to a projection of an area corresponding to each first target portion in a three-dimensional target grid of the object in a two-dimensional space and a mask image acquired according to the two-dimensional target image;
s94: constructing a key point difference contained in the first difference based on the difference between the projection position of the key point corresponding to each second target part in the three-dimensional target grid in the two-dimensional space and the key point position obtained according to the two-dimensional target image;
s95: constructing a second difference based on the difference between the Laplace coordinates of each vertex in the three-dimensional target mesh and the Laplace coordinates of each vertex in the preset three-dimensional mesh of the object;
s96: constructing a normal vector difference contained in a third difference based on the difference between the normal vector of each vertex in the three-dimensional target mesh and the normal vector of each vertex in the preset three-dimensional mesh;
s97: constructing a side length difference contained in the third difference based on the difference between the length of each side in the three-dimensional target grid and the length of each side in the preset three-dimensional grid;
s98: taking the sum of products of each difference and corresponding weight in the first difference, the second difference and the third difference as a loss function, and obtaining the position coordinates of each vertex in the three-dimensional target grid used when the loss function obtains the minimum value;
s99: adjusting the positions of the vertexes in the preset three-dimensional grid according to the position coordinates of the vertexes;
s910: and performing texture rendering on the adjusted preset three-dimensional grid to generate a three-dimensional model corresponding to the object in the two-dimensional target image.
Taking an object as a face as an example, as shown in fig. 10, a schematic diagram of a complete method for three-dimensional reconstruction of a face provided by the embodiment of the present disclosure is shown.
Firstly, a layered head design draft, produced by a 2D designer according to the anime-style head plane-design layered template, is acquired. From this layered design draft, the face mask image I_1, the left-eye mask image I_2, the right-eye mask image I_3, and the key point information of the nose and mouth are read. Based on this information and the standard anime-style 3D head mesh M_0 (designed in 3D in advance), a mask constraint term, a key point constraint term, a Laplace coordinate constraint term, a mesh vertex normal constraint term and a mesh side length constraint term are constructed. The sum of the products of each constraint term and its corresponding weight is taken as the loss function, the loss function is solved as an optimization problem, and finally the face is three-dimensionally reconstructed from the obtained optimal solution, yielding the anime-style 3D head corresponding to the two-dimensional image. The loss function is:
L = \lambda_1 D_3 + \lambda_2 D_4 + \lambda_3 D_5 + \lambda_4 D_6 + \lambda_5 D_7
the mask constraint item is mask difference included in the first difference, the key point constraint item is key point difference included in the first difference, the laplace coordinate constraint item is second difference, the grid vertex normal constraint item is normal vector difference included in the third difference, and the grid side length constraint item is side length difference in the third difference.
In the three-dimensional target mesh of the embodiment of the present disclosure, the position coordinates of each vertex are unknown; they are the independent variables of the loss function. The optimal solution obtained by solving the optimization problem gives the position coordinates of each vertex in the three-dimensional target mesh corresponding to the two-dimensional target image, and the three-dimensional target mesh determined from these coordinates is the result of optimally deforming the preset three-dimensional mesh so that it conforms to the object in the two-dimensional target image.
In the above embodiment, an anime-style head plane-design layered template is designed and a matching three-dimensional reconstruction algorithm is provided. A 2D designer only needs to produce the layered materials strictly according to the template, and the algorithm then automatically generates a high-quality anime-style 3D head mesh model. With this method, designing each 3D head takes about two person-days of design resources, compared with two people working for a week in the related art, which greatly reduces the human resources consumed in 3D head design. By building the design template and assisting it with the matching algorithm, the generation efficiency of the anime-style 3D head is greatly improved.
Fig. 11 is a block diagram illustrating an apparatus 1100 for three-dimensional reconstruction of an object according to an exemplary embodiment. Referring to fig. 11, the apparatus includes an information acquisition unit 1101, a function construction unit 1102, a vertex determination unit 1103, and a three-dimensional reconstruction unit 1104.
An information acquisition unit 1101 configured to perform acquiring key information of each part in an object included in a two-dimensional target image, wherein the key information includes a key point and/or a mask image;
a function building unit 1102 configured to perform obtaining a loss function determined by a first difference and a second difference based on the key information, wherein the first difference is a difference between key information determined according to a projection of a three-dimensional target mesh of the object in a two-dimensional space and key information obtained according to the two-dimensional target image, and the second difference is a difference between a relative position of each vertex and an adjacent vertex in the three-dimensional target mesh and a relative position of each vertex and an adjacent vertex in a preset three-dimensional mesh of the object;
a vertex determination unit 1103 configured to perform determination of position coordinates of respective vertices in a three-dimensional target mesh used when the loss function takes a minimum value;
a three-dimensional reconstruction unit 1104 configured to perform three-dimensional reconstruction from the determined position coordinates of the respective vertices, generating a three-dimensional model corresponding to the object.
In an optional implementation manner, the two-dimensional target image is obtained by superimposing layered images of the respective portions according to a preset layer order, each layered image includes a portion of at least one object, and different layered images include different portions;
the information acquisition unit 1101 is specifically configured to perform:
acquiring a layered image of each part in an object contained in the two-dimensional target image, wherein the layered image of each part is generated according to a preset layered template of each part;
and extracting key information corresponding to each part from the layered images of each part.
In an optional implementation, the information obtaining unit 1101 is specifically configured to perform:
for any part, converting an image corresponding to the part in the two-dimensional target image into a mask image, and taking the mask image as key information of the part; or
And aiming at any part, extracting corresponding key points of the part in the two-dimensional target image, and taking the extracted key points as key information of the part.
In an alternative embodiment, before obtaining the loss function determined by the first difference and the second difference based on the key information, the function building unit 1102 is further configured to perform:
if the key information of all the parts is the mask image, taking the sum of the distances between the first mask vector and the second mask vector corresponding to each part as the first difference; and the second mask vector corresponding to the part is obtained by combining the position coordinates of all pixel points in the mask image of the part obtained according to the two-dimensional target image.
In an alternative embodiment, before obtaining the loss function determined by the first difference and the second difference based on the key information, the function building unit 1102 is further configured to perform:
if the key information of all the parts is the key point, taking the sum of the distances between the first position vector and the second position vector corresponding to each part as the first difference; and the first position vector corresponding to the part is the position vector of the key point of the part obtained according to the projection, and the second position vector corresponding to the part is the position vector of the key point of the part obtained according to the two-dimensional target image.
In an alternative embodiment, before obtaining the loss function determined by the first difference and the second difference based on the key information, the function building unit 1102 is further configured to perform:
if the key information of the first target part in all the parts is a mask image and the key information of the second target part is a key point, taking the sum of the distances between the first mask vector and the second mask vector corresponding to each first target part as a mask difference, taking the sum of the distances between the first position vector and the second position vector corresponding to each second target part as a key point difference, and determining the first difference according to the mask difference and the key point difference.
In an alternative embodiment, before obtaining the loss function determined by the first difference and the second difference based on the key information, the function building unit 1102 is further configured to perform:
for any part, determining the area of the part in the two-dimensional target image;
if the area occupied by the part is larger than a preset threshold value, determining the part as a first target part;
and if the area occupied by the part is not larger than a preset threshold value, determining the part as a second target part.
In an alternative embodiment, before obtaining the loss function determined by the first difference and the second difference based on the key information, the function building unit 1102 is further configured to perform:
and taking the distance between a first coordinate vector obtained by combining the Laplace coordinates of all vertexes in the three-dimensional target grid and a second coordinate vector obtained by combining the Laplace coordinates of all vertexes in the preset three-dimensional grid as the second difference.
In an alternative embodiment, the function building unit 1102 is specifically configured to perform:
and determining the loss function according to the first difference, the second difference and a third difference, wherein the third difference comprises a normal vector difference between a normal vector of each vertex in the three-dimensional target grid and a normal vector of each vertex in the preset three-dimensional grid, and/or a side length difference between a side length of each side in the three-dimensional target grid and a side length of each side in the preset three-dimensional grid.
In an alternative embodiment, before obtaining the loss function determined by the first difference and the second difference based on the key information, the function building unit 1102 is further configured to perform:
and combining the normal vector coordinates of each vertex in the three-dimensional target grid to obtain a first normal vector, and combining the normal vector coordinates of each vertex in the preset three-dimensional grid to obtain a second normal vector, wherein the distance between the first normal vector and the second normal vector is used as the normal vector difference.
In an optional implementation manner, before determining, according to the first difference and the second difference, the position coordinates of each vertex in the three-dimensional target mesh corresponding to the object in the two-dimensional target image, the function building unit 1102 is further configured to perform:
and taking the distance between a first side length vector obtained by combining the side lengths of all sides in the three-dimensional target grid and a second side length vector obtained by combining the side lengths of all sides in the preset three-dimensional grid as the side length difference.
In an alternative embodiment, the three-dimensional reconstruction unit 1104 is specifically configured to perform:
adjusting the positions of the vertexes in the preset three-dimensional grid according to the position coordinates of the vertexes;
and performing texture rendering on the adjusted preset three-dimensional grid to generate a three-dimensional model corresponding to the two-dimensional target face image.
With regard to the apparatus in the above-described embodiment, the specific manner in which each unit executes the request has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 12 is a block diagram illustrating an electronic device 1200 according to an example embodiment, the apparatus comprising:
a processor 1210;
a memory 1220 for storing instructions executable by the processor 1210;
wherein the processor 1210 is configured to execute the instructions to implement a method of three-dimensional reconstruction of an object in an embodiment of the present disclosure.
In an exemplary embodiment, a storage medium comprising instructions, such as the memory 1220 comprising instructions, executable by the processor 1210 of the electronic device 1200 to perform the above-described method is also provided. Alternatively, the storage medium may be a non-transitory computer readable storage medium, which may be, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In some possible implementations, the disclosed embodiments also provide a computing device that may include at least one processing unit and at least one storage unit. The storage unit stores program code which, when executed by the processing unit, causes the processing unit to perform the steps of the object three-dimensional reconstruction method according to various exemplary embodiments of the present disclosure described above in this specification. For example, the processing unit may perform the steps as shown in fig. 1.
The computing device 130 according to this embodiment of the present disclosure is described below with reference to fig. 13. The computing device 130 of fig. 13 is only one example and should not impose any limitations on the functionality or scope of use of embodiments of the disclosure.
As shown in fig. 13, computing device 130 is embodied in the form of a general purpose computing device. Components of computing device 130 may include, but are not limited to: the at least one processing unit 131, the at least one memory unit 132, and a bus 133 connecting various system components (including the memory unit 132 and the processing unit 131).
Bus 133 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a processor, or a local bus using any of a variety of bus architectures.
The storage unit 132 may include readable media in the form of volatile memory, such as Random Access Memory (RAM)1321 and/or cache memory unit 1322, and may further include Read Only Memory (ROM) 1323.
Storage unit 132 may also include a program/utility 1325 having a set (at least one) of program modules 1324, such program modules 1324 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Computing device 130 may also communicate with one or more external devices 134 (e.g., keyboard, pointing device, etc.), with one or more devices that enable a user to interact with computing device 130, and/or with any devices (e.g., router, modem, etc.) that enable computing device 130 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 135. Also, computing device 130 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via network adapter 136. As shown, network adapter 136 communicates with other modules for computing device 130 over bus 133. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with computing device 130, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
In some possible embodiments, each aspect of the object three-dimensional reconstruction method provided by the present disclosure may also be implemented in the form of a program product including program code for causing a computer device to perform the steps in the object three-dimensional reconstruction method according to various exemplary embodiments of the present disclosure described above in this specification when the program product is run on the computer device; for example, the computer device may perform the steps as shown in fig. 1.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The program product of embodiments of the present disclosure may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a computing device. However, the program product of the present disclosure is not limited thereto, and in this document, the readable storage medium may be any tangible medium containing or storing a program that can be used by or in connection with a command execution system, apparatus, or device.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with a command execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C + + or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user computing device, partly on the user equipment, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method of three-dimensional reconstruction of an object, comprising:
acquiring key information of each part in an object contained in a two-dimensional target image, wherein the key information comprises key points and/or mask images;
obtaining a loss function determined by a first difference and a second difference based on the key information, wherein the first difference is a difference between the key information determined according to the projection of the three-dimensional target mesh of the object in the two-dimensional space and the key information obtained according to the two-dimensional target image, and the second difference is a difference between the relative position of each vertex and the adjacent vertex in the three-dimensional target mesh and the relative position of each vertex and the adjacent vertex in the preset three-dimensional mesh of the object;
determining the position coordinates of each vertex in the three-dimensional target grid used when the loss function takes the minimum value;
and performing three-dimensional reconstruction according to the determined position coordinates of each vertex to generate a three-dimensional model corresponding to the object.
2. The method according to claim 1, wherein the two-dimensional target image is obtained by superimposing layered images of the respective portions according to a preset layer order, each of the layered images including a portion of at least one object, different layered images including different portions;
the acquiring key information of each part in the object contained in the two-dimensional target image comprises:
acquiring a layered image of each part in an object contained in the two-dimensional target image, wherein the layered image of each part is generated according to a preset layered template of each part;
and extracting key information corresponding to each part from the layered images of each part.
3. The method of claim 1, wherein the obtaining key information of each part in the object contained in the two-dimensional target image comprises:
for any part, converting an image corresponding to the part in the two-dimensional target image into a mask image, and taking the mask image as key information of the part; or
And aiming at any part, extracting corresponding key points of the part in the two-dimensional target image, and taking the extracted key points as key information of the part.
4. The method of claim 3, wherein prior to said obtaining a loss function determined by a first difference and a second difference based on said key information, further comprising:
if the key information of all the parts is the mask image, taking the sum of the distances between the first mask vector and the second mask vector corresponding to each part as the first difference; and the second mask vector corresponding to the part is obtained by combining the position coordinates of all pixel points in the mask image of the part obtained according to the two-dimensional target image.
5. The method of claim 3, wherein prior to said obtaining a loss function determined by a first difference and a second difference based on said key information, further comprising:
if the key information of all the parts is the key point, taking the sum of the distances between the first position vector and the second position vector corresponding to each part as the first difference; and the first position vector corresponding to the part is the position vector of the key point of the part obtained according to the projection, and the second position vector corresponding to the part is the position vector of the key point of the part obtained according to the two-dimensional target image.
6. The method of claim 3, wherein prior to said obtaining a loss function determined by a first difference and a second difference based on said key information, further comprising:
if the key information of the first target part in all the parts is a mask image and the key information of the second target part is a key point, taking the sum of the distances between the first mask vector and the second mask vector corresponding to each first target part as a mask difference, taking the sum of the distances between the first position vector and the second position vector corresponding to each second target part as a key point difference, and determining the first difference according to the mask difference and the key point difference.
7. The method of claim 6, wherein prior to said obtaining a loss function determined by a first difference and a second difference based on said key information, further comprising:
for any part, determining the area of the part in the two-dimensional target image;
if the area occupied by the part is larger than a preset threshold value, determining the part as a first target part;
and if the area occupied by the part is not larger than a preset threshold value, determining the part as a second target part.
8. An apparatus for three-dimensional reconstruction of an object, comprising:
an information acquisition unit configured to perform acquisition of key information of each part in an object contained in a two-dimensional target image, wherein the key information includes a key point and/or a mask image;
a function construction unit configured to perform obtaining a loss function determined by a first difference and a second difference based on the key information, wherein the first difference is a difference between key information determined according to a projection of a three-dimensional target mesh of the object in a two-dimensional space and key information obtained according to the two-dimensional target image, and the second difference is a difference between a relative position of each vertex and an adjacent vertex in the three-dimensional target mesh and a relative position of each vertex and an adjacent vertex in a preset three-dimensional mesh of the object;
a vertex determination unit configured to perform determination of position coordinates of respective vertices in a three-dimensional target mesh used when the loss function takes a minimum value;
a three-dimensional reconstruction unit configured to perform three-dimensional reconstruction from the determined position coordinates of the respective vertices, generating a three-dimensional model corresponding to the object.
9. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the method of three-dimensional reconstruction of an object as claimed in any one of claims 1 to 7.
10. A storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the object three-dimensional reconstruction method of any one of claims 1 to 7.
CN202010348666.1A 2020-04-28 2020-04-28 Object three-dimensional reconstruction method and device, electronic equipment and storage medium Pending CN113570634A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010348666.1A CN113570634A (en) 2020-04-28 2020-04-28 Object three-dimensional reconstruction method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010348666.1A CN113570634A (en) 2020-04-28 2020-04-28 Object three-dimensional reconstruction method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113570634A true CN113570634A (en) 2021-10-29

Family

ID=78157974

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010348666.1A Pending CN113570634A (en) 2020-04-28 2020-04-28 Object three-dimensional reconstruction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113570634A (en)


Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663087A (en) * 2012-04-09 2012-09-12 北京邮电大学 Three-dimensional model search method based on topology and visual feature
CN104517317A (en) * 2015-01-08 2015-04-15 东华大学 Three-dimensional reconstruction method of vehicle-borne infrared images
WO2017085325A1 (en) * 2015-11-20 2017-05-26 Bitmanagement Software GmbH Device and method for superimposing at least one part of an object with a virtual surface
CN110168608A (en) * 2016-11-22 2019-08-23 乐高公司 The system that 3 dimension words for obtaining physical object indicate
US20200089097A1 (en) * 2016-12-22 2020-03-19 Eva - Esthetic Visual Analytics Ltd. Real-time tracking for three-dimensional imaging
CN107146199A (en) * 2017-05-02 2017-09-08 厦门美图之家科技有限公司 A kind of fusion method of facial image, device and computing device
CN108876708A (en) * 2018-05-31 2018-11-23 Oppo广东移动通信有限公司 Image processing method, device, electronic equipment and storage medium
CN108765351A (en) * 2018-05-31 2018-11-06 Oppo广东移动通信有限公司 Image processing method, device, electronic equipment and storage medium
CN108805977A (en) * 2018-06-06 2018-11-13 浙江大学 A kind of face three-dimensional rebuilding method based on end-to-end convolutional neural networks
CN109147024A (en) * 2018-08-16 2019-01-04 Oppo广东移动通信有限公司 Expression replacing options and device based on threedimensional model
CN109191552A (en) * 2018-08-16 2019-01-11 Oppo广东移动通信有限公司 Threedimensional model processing method, device, electronic equipment and storage medium
CN109978930A (en) * 2019-03-27 2019-07-05 杭州相芯科技有限公司 A kind of stylized human face three-dimensional model automatic generation method based on single image
CN110675489A (en) * 2019-09-25 2020-01-10 北京达佳互联信息技术有限公司 Image processing method and device, electronic equipment and storage medium
CN110689609A (en) * 2019-09-27 2020-01-14 北京达佳互联信息技术有限公司 Image processing method, image processing device, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
QINYAN ZHANG: ""3D face model reconstruction based on stretching algorithm"", 《2012 IEEE 2ND INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS》, 14 November 2013 (2013-11-14) *
张春元: ""基于卷积神经网络的三维物体识别研究"", 《中国优秀硕士学位论文全文数据库 信息科技辑》, 15 January 2020 (2020-01-15) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114373056A (en) * 2021-12-17 2022-04-19 云南联合视觉科技有限公司 Three-dimensional reconstruction method and device, terminal equipment and storage medium
CN117437362A (en) * 2023-11-17 2024-01-23 深圳市坤腾动漫有限公司 Three-dimensional animation model generation method and system
CN117437362B (en) * 2023-11-17 2024-06-21 深圳市坤腾动漫有限公司 Three-dimensional animation model generation method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination