CN117671146A - Single-view three-dimensional reconstruction method based on cyclic diffusion model - Google Patents


Info

Publication number
CN117671146A
Authority
CN
China
Prior art keywords
point cloud
noise
denoising
network
diffusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311650864.3A
Other languages
Chinese (zh)
Inventor
周燕
叶德旺
周月霞
刘翔宇
许业文
李文俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Foshan University
Original Assignee
Foshan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Foshan University filed Critical Foshan University
Priority to CN202311650864.3A priority Critical patent/CN117671146A/en
Publication of CN117671146A publication Critical patent/CN117671146A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a single-view three-dimensional reconstruction method based on a cyclic diffusion model, which comprises the following steps: extracting and fusing view features to obtain a fused feature vector, and predicting the mean and variance of an image condition feature vector from it; sampling from the predicted mean and variance to obtain the image condition feature vector; performing farthest point sampling to obtain a three-dimensional model point cloud, and training the denoising network in a diffusion model; improving the guiding capability of the image condition feature vector over the denoising network through cyclic denoising; and guiding the denoising network with a single input view to gradually denoise a pure-noise point cloud drawn from the standard Gaussian distribution, finally obtaining a three-dimensional model point cloud consistent with the geometric structure of the single view. The invention aims to provide a single-view three-dimensional reconstruction method based on a cyclic diffusion model that trains stably, runs efficiently, and improves the guiding capability of the view.

Description

Single-view three-dimensional reconstruction method based on cyclic diffusion model
Technical Field
The invention relates to the technical field of machine learning, in particular to a single-view three-dimensional reconstruction method based on a cyclic diffusion model.
Background
Single view three-dimensional model reconstruction is a challenging task in the fields of computer vision, augmented reality, and industrial manufacturing, with the goal of generating a corresponding three-dimensional structure from a single image.
In recent years, with the continued development of deep learning, several voxel-based reconstruction methods have been proposed. These methods can generate shapes from a single image, but voxel-based methods have a significant drawback: it is difficult to balance sampling resolution against network efficiency. To overcome this limitation, researchers have explored point-cloud-based reconstruction methods, which employ generative models (e.g., variational auto-encoders, generative adversarial networks, normalizing flow models, and diffusion probability models) for reconstruction. However, in the earlier methods using variational auto-encoders or generative adversarial networks, the number of output points is coupled to the network architecture design, so a different network must be retrained to obtain point clouds with different numbers of points. Methods based on normalizing flow models and diffusion probability models instead model the distribution of the point cloud with a network, so generating a point cloud can be regarded as sampling from that distribution, and the number of sampled points can be set arbitrarily. The main drawbacks of current normalizing-flow and diffusion-based methods are that they introduce more noise, which limits the quality of the generated point cloud, and that both network training and sample generation take a long time.
Disclosure of Invention
The invention aims to provide a single-view three-dimensional reconstruction method based on a cyclic diffusion model, which adopts a diffusion model to generate the three-dimensional model point cloud and therefore trains stably; improves the guiding capability of the view by training the denoising network with a cyclic denoising scheme; and maintains the quality of the generated three-dimensional model point cloud while achieving high operation efficiency.
To achieve this purpose, the invention adopts the following technical scheme: a single-view three-dimensional reconstruction method based on a cyclic diffusion model comprises the following steps:
step S1: extracting and fusing features with an image feature extraction algorithm fused by random inactivation (dropout) to obtain a fused feature vector, and predicting the mean and variance of the image condition feature vector from the fused feature vector;
step S2: sampling from the predicted mean and variance via the reparameterization trick to obtain the image condition feature vector;
step S3: performing farthest point sampling on the triangular patches of the three-dimensional model to obtain a three-dimensional model point cloud, gradually adding noise to the point cloud according to a noise-adding strategy, and training the denoising network in the diffusion model on the noised point cloud;
step S4: in each iteration of training the diffusion model, improving the guiding capability of the image condition feature vector over the denoising network through cyclic denoising;
step S5: guiding the denoising network with the single input view to gradually denoise a pure-noise point cloud drawn from the standard Gaussian distribution, finally obtaining a three-dimensional model point cloud consistent with the geometric structure of the single view.
Preferably, step S1 comprises the following sub-steps:
substep S11: input the i rendered views {v_1, v_2, ..., v_i} of the three-dimensional model into a two-dimensional backbone network ψ and extract features to obtain a set {f_1, f_2, ..., f_i} of i view feature vectors, the dimension of each view feature vector being set to 512; randomly inactivate the set of view feature vectors with probability p to obtain the set F of view feature vectors after random inactivation, with the specific formula:

F = VD_p({f_1, f_2, ..., f_i});

wherein: F represents the set of view feature vectors after random inactivation, VD_p denotes the random inactivation (dropout) operation, and p represents the probability of random inactivation;
substep S12: fuse the set F of randomly inactivated view feature vectors by max pooling to obtain the final fused feature vector.
Preferably, step S2 comprises the following sub-steps:
substep S21: construct a network for the mean and a network for the variance of the image condition feature vector, each comprising two fully connected layers, a batch normalization layer, and a linear rectification (ReLU) layer; the number of network output channels is set to 256;
substep S22: sample from the mean and variance predicted by the networks to obtain the image condition feature vector used to guide the denoising network, with the specific formula:

z = μ + σ ⊙ ε;

wherein: z represents the image condition feature vector finally used to guide the denoising network, σ represents the variance predicted by the network, μ represents the mean predicted by the network, ε represents noise sampled from the standard Gaussian distribution N(0, I), and I represents the identity matrix.
Preferably, step S3 comprises the following sub-steps:
substep S31: the original three-dimensional model point cloud is M^(0) = {p_i}_{i=1}^{N}, where p_i = (x_i, y_i, z_i) ∈ R^3, N represents the number of points, x_i represents the value of the ith point in the point cloud on the x-axis, y_i its value on the y-axis, and z_i its value on the z-axis;
substep S32: the process of gradually adding noise to the three-dimensional model point cloud is regarded as a Markov chain, and the formula for gradually adding noise to the three-dimensional model point cloud is:

q(M^(1:T) | M^(0)) = ∏_{t=1}^{T} q(M^(t) | M^(t-1));

wherein: T represents the total number of diffusion steps, M^(0) represents the three-dimensional model point cloud, M^(T) is the pure-noise point cloud following the standard Gaussian distribution N(0, I), and M^(t) represents the intermediate noise point cloud at diffusion step t during noise addition; q(M^(t) | M^(t-1)) represents the noise-adding operation, which models the distribution of the noised point cloud M^(t) obtained by adding noise to the point cloud M^(t-1);
substep S33: the noise-adding operation q(M^(t) | M^(t-1)) on the three-dimensional model point cloud at each diffusion step is:

q(M^(t) | M^(t-1)) := N(M^(t); √(1-β_t) M^(t-1), β_t I);

wherein: β_t denotes the hyperparameter controlling the noise intensity according to time step t, and β_t increases linearly from β_1 = 0.0004 to β_T = 0.02; I represents the identity matrix; N represents the number of points; M^(t) represents the intermediate point cloud during noise addition.
Preferably, step S4 comprises the following sub-steps:
substep S41: regard M^(0) as an independent sample from the original three-dimensional model point cloud distribution q(M^(0)); z is the image condition feature vector obtained by reparameterization in step S2;
substep S42: the process of gradually denoising towards the three-dimensional model point cloud is likewise treated as a Markov chain, and the formula for gradually denoising the initial three-dimensional Gaussian noise under the guidance of the image condition feature vector is:

p_θ(M^(0:T) | z) = p(M^(T)) ∏_{t=1}^{T} p_θ(M^(t-1) | M^(t), z);

wherein: T represents the total number of diffusion steps, M^(0) represents the original three-dimensional model point cloud, M^(T) is the pure-noise point cloud, M^(t) represents the intermediate point cloud during denoising, and p_θ(M^(t-1) | M^(t), z) represents the denoising operation;
the formula for denoising the point set M^(t) at the current moment into the point set M^(t-1) at the next moment, according to the image condition feature vector z, is:

p_θ(M^(t-1) | M^(t), z) := N(M^(t-1); μ_θ(M^(t), t, z), β_t I);

wherein: μ_θ is the mean of the noise added to the point set at the current time step, as predicted by the denoising network; M^(t) represents the noise point cloud at diffusion step t; β_t represents the noise intensity at diffusion step t; N represents the number of points in the point cloud; I represents the identity matrix;
substep S43: the guiding capability of the image over network denoising at each time step is improved through cyclic diffusion; the cyclic diffusion comprises: first, given the point set M^(t) at a certain time step, take it as the first input point set M^(t)_1 of the cyclic denoising network; one denoising operation yields the input point set M^(t)_2 of the second cycle; after C cycles M^(t)_C is obtained, i.e. the point set M^(t-1) of the next time step is finally output;
substep S44: the formula for optimizing the network parameters by the gradient descent algorithm is:

L = E_q [ Σ_{t=2}^{T} D_KL( q(M^(t-1) | M^(t), M^(0)) ‖ p_θ(M^(t-1) | M^(t), z) ) − log p_θ(M^(0) | M^(1), z) ];

wherein: D_KL represents the KL divergence between the two probability distributions q(M^(t-1) | M^(t), M^(0)) and p_θ(M^(t-1) | M^(t), z); p_θ(M^(t-1) | M^(t), z) models the probability distribution of the noise point cloud at the next diffusion step t-1 by feeding the noise point cloud at diffusion step t and the image condition feature vector into the denoising network; p_θ(M^(0) | M^(1), z) models the probability distribution of the point cloud M^(0) at the next moment by feeding the noise point cloud M^(1) at diffusion step 1 and the image condition feature vector into the denoising network;
q(M^(t-1) | M^(t), M^(0)) is the posterior probability distribution of the forward process, computed as:

q(M^(t-1) | M^(t), M^(0)) = N(M^(t-1); μ̃_t(M^(t), M^(0)), β̃_t I);

wherein: μ̃_t(M^(t), M^(0)) = (√ᾱ_{t-1} β_t / (1 − ᾱ_t)) M^(0) + (√α_t (1 − ᾱ_{t-1}) / (1 − ᾱ_t)) M^(t), β̃_t = ((1 − ᾱ_{t-1}) / (1 − ᾱ_t)) β_t, with α_t = 1 − β_t and ᾱ_t = ∏_{s=1}^{t} α_s.
preferably, step S5 comprises the following sub-steps:
substep S51: randomly select a view v, v ∈ V, from the atlas V of rendered views in step S1, and set the total number of iteration steps of the diffusion model to T = 100;
substep S52: extract the image features of the single view and sample the image condition feature vector z as in steps S1 and S2;
substep S53: sample a point set M^(T) from the standard Gaussian distribution;
substep S54: based on the Euler method, use the trained denoising model to denoise M^(T) step by step until the number of denoising steps reaches the total number of iteration steps T, obtaining the three-dimensional model point cloud M^(0).
The technical scheme of the invention has the following beneficial effects: compared with existing methods based on variational auto-encoders and generative adversarial networks, the diffusion-model-based method for generating the three-dimensional model point cloud trains stably; because it models the data distribution, the number of points output by the network can be set arbitrarily, without modifying network parameters and retraining.
To improve the correlation between the generated three-dimensional model point cloud and the input view, a cyclic denoising scheme is adopted, which improves the guiding capability of the view while training the denoising network.
The method maintains the quality of the generated three-dimensional model point cloud while using far fewer sampling steps than other existing diffusion-model-based methods, and therefore has higher operation efficiency.
Drawings
FIG. 1 is a schematic flow diagram of one embodiment of the present invention;
FIG. 2 is a schematic diagram of a multi-view feature extraction and fusion process for random inactivation according to one embodiment of the invention;
FIG. 3 is a flow chart of image conditional feature vector sampling according to one embodiment of the present invention;
FIG. 4 is a schematic flow chart of cyclic denoising according to one embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further described below by the specific embodiments with reference to the accompanying drawings.
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
Referring to fig. 1 to 4, a single-view three-dimensional reconstruction method based on a cyclic diffusion model comprises the following steps:
step S1: extracting and fusing features with an image feature extraction algorithm fused by random inactivation (dropout) to obtain a fused feature vector, and predicting the mean and variance of the image condition feature vector from the fused feature vector;
step S2: sampling from the predicted mean and variance via the reparameterization trick to obtain the image condition feature vector;
step S3: performing farthest point sampling on the triangular patches of the three-dimensional model to obtain a three-dimensional model point cloud, gradually adding noise to the point cloud according to a noise-adding strategy, and training the denoising network in the diffusion model on the noised point cloud; the noise point cloud M^(t) at each diffusion step t of the diffusion process is thereby obtained, and is used to train the denoising network to predict the noise point cloud M^(t-1) at the next diffusion step t-1;
step S4: in each iteration of training the diffusion model, improving the guiding capability of the image condition feature vector over the denoising network through cyclic denoising;
step S5: guiding the denoising network with the single input view to gradually denoise a pure-noise point cloud drawn from the standard Gaussian distribution, finally obtaining a three-dimensional model point cloud consistent with the geometric structure of the single view.
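Step S3 above begins with farthest point sampling of the mesh surface. As an illustrative sketch (not the patent's implementation; the function name, random first pick, and greedy strategy are assumptions), the classic greedy algorithm repeatedly selects the point farthest from those already chosen:

```python
import numpy as np

def farthest_point_sampling(points, n_samples, seed=0):
    """Greedy farthest-point sampling: repeatedly pick the point
    farthest from the set of points already chosen."""
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    chosen = [int(rng.integers(n))]
    # distance from every point to the nearest chosen point
    dist = np.linalg.norm(points - points[chosen[0]], axis=1)
    for _ in range(n_samples - 1):
        idx = int(np.argmax(dist))  # farthest remaining point
        chosen.append(idx)
        dist = np.minimum(dist, np.linalg.norm(points - points[idx], axis=1))
    return points[np.array(chosen)]

cloud = np.random.default_rng(1).normal(size=(500, 3))
sampled = farthest_point_sampling(cloud, 64)
print(sampled.shape)  # (64, 3)
```

This greedy scheme gives an even spatial cover of the surface samples, which is why it is the usual choice for turning dense mesh samples into a fixed-size training point cloud.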
Compared with existing methods based on variational auto-encoders and generative adversarial networks, the diffusion-model-based method trains stably; because it models the data distribution, the number of points output by the network can be set arbitrarily, without modifying network parameters and retraining.
To improve the correlation between the generated three-dimensional model point cloud and the input view, a cyclic denoising scheme is adopted, which improves the guiding capability of the view while training the denoising network.
The method maintains the quality of the generated three-dimensional model point cloud while using far fewer sampling steps than other existing diffusion-model-based methods, and therefore has higher operation efficiency.
Preferably, step S1 comprises the following sub-steps:
substep S11: input the i rendered views {v_1, v_2, ..., v_i} of the three-dimensional model into a two-dimensional backbone network ψ and extract features to obtain a set {f_1, f_2, ..., f_i} of i view feature vectors, the dimension of each view feature vector being set to 512; randomly inactivate the set of view feature vectors with probability p to obtain the set F of view feature vectors after random inactivation, with the specific formula:

F = VD_p({f_1, f_2, ..., f_i});

wherein: F represents the set of view feature vectors after random inactivation, VD_p denotes the random inactivation (dropout) operation, and p represents the probability of random inactivation;
substep S12: fuse the set F of randomly inactivated view feature vectors by max pooling to obtain the final fused feature vector.
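The random-inactivation fusion of substeps S11 and S12 can be sketched as follows (a minimal NumPy illustration; the function name and the guard that keeps at least one surviving view are assumptions not stated in the patent):

```python
import numpy as np

def fuse_views(features, p=0.25, seed=0):
    """features: (i, 512) array of view feature vectors.
    Randomly drop whole view vectors with probability p (keeping at
    least one), then fuse the survivors by element-wise max pooling."""
    rng = np.random.default_rng(seed)
    keep = rng.random(features.shape[0]) >= p
    if not keep.any():  # guard: never drop every view
        keep[rng.integers(features.shape[0])] = True
    return features[keep].max(axis=0)

views = np.random.default_rng(2).normal(size=(4, 512))
fused = fuse_views(views, p=0.25)
print(fused.shape)  # (512,)
```

Max pooling over whichever views survive the dropout keeps the fused vector's dimension fixed at 512 regardless of how many views are dropped.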
Specifically, step S2 includes the following sub-steps:
substep S21: construct a network for the mean and a network for the variance of the image condition feature vector, each comprising two fully connected layers, a batch normalization layer, and a linear rectification (ReLU) layer; the number of network output channels is set to 256;
substep S22: sample from the mean and variance predicted by the networks to obtain the image condition feature vector used to guide the denoising network, with the specific formula:

z = μ + σ ⊙ ε;

wherein: z represents the image condition feature vector finally used to guide the denoising network, σ represents the variance predicted by the network, μ represents the mean predicted by the network, ε represents noise sampled from the standard Gaussian distribution N(0, I), and I represents the identity matrix.
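The reparameterization trick of substep S22 moves the randomness into an external noise variable ε so that sampling z stays differentiable with respect to the predicted μ and σ. A minimal sketch (function name assumed):

```python
import numpy as np

def reparameterize(mu, sigma, seed=0):
    """Sample z = mu + sigma * eps with eps ~ N(0, I); gradients can
    then flow through mu and sigma during training."""
    eps = np.random.default_rng(seed).standard_normal(mu.shape)
    return mu + sigma * eps

mu = np.zeros(256)
sigma = np.ones(256)
z = reparameterize(mu, sigma)
print(z.shape)  # (256,)
```

With σ = 0 the sample collapses to μ, which is a quick sanity check that the noise enters only through the scale term.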
Preferably, step S3 comprises the following sub-steps:
substep S31: the original three-dimensional model point cloud is M^(0) = {p_i}_{i=1}^{N}, where p_i = (x_i, y_i, z_i) ∈ R^3, N represents the number of points, x_i represents the value of the ith point in the point cloud on the x-axis, y_i its value on the y-axis, and z_i its value on the z-axis;
substep S32: the process of gradually adding noise to the three-dimensional model point cloud is regarded as a Markov chain, and the formula for gradually adding noise to the three-dimensional model point cloud is:

q(M^(1:T) | M^(0)) = ∏_{t=1}^{T} q(M^(t) | M^(t-1));

wherein: T represents the total number of diffusion steps, M^(0) represents the three-dimensional model point cloud, M^(T) is the pure-noise point cloud following the standard Gaussian distribution N(0, I), and M^(t) represents the intermediate noise point cloud at diffusion step t during noise addition; q(M^(t) | M^(t-1)) represents the noise-adding operation, which models the distribution of the noised point cloud M^(t) obtained by adding noise to the point cloud M^(t-1);
substep S33: the noise-adding operation q(M^(t) | M^(t-1)) on the three-dimensional model point cloud at each diffusion step is:

q(M^(t) | M^(t-1)) := N(M^(t); √(1-β_t) M^(t-1), β_t I);

wherein: β_t denotes the hyperparameter controlling the noise intensity according to time step t, and β_t increases linearly from β_1 = 0.0004 to β_T = 0.02; I represents the identity matrix; N represents the number of points; M^(t) represents the intermediate point cloud during noise addition.
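The stepwise noising of substep S33 admits the standard closed form M^(t) = √ᾱ_t · M^(0) + √(1 − ᾱ_t) · ε with α_t = 1 − β_t and ᾱ_t = ∏_{s=1}^{t} α_s, which lets any step be sampled directly. A sketch using the patent's linear schedule (T = 100 is taken from substep S51; the closed form itself is a standard consequence, not stated in the patent):

```python
import numpy as np

T = 100
betas = np.linspace(0.0004, 0.02, T)  # beta_1 .. beta_T, linear schedule
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)        # abar_t = prod of alphas up to t

def add_noise(m0, t, seed=0):
    """Closed-form forward diffusion: sample M^(t) directly from M^(0)
    as sqrt(abar_t) * M^(0) + sqrt(1 - abar_t) * eps, eps ~ N(0, I).
    t is 1-based."""
    eps = np.random.default_rng(seed).standard_normal(m0.shape)
    ab = alpha_bar[t - 1]
    return np.sqrt(ab) * m0 + np.sqrt(1.0 - ab) * eps

m0 = np.random.default_rng(3).normal(size=(2048, 3))
mT = add_noise(m0, T)
print(mT.shape)  # (2048, 3)
```

At t = T the coefficient √ᾱ_T is small, so M^(T) is close to pure standard Gaussian noise, matching the definition of M^(T) above.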
Specifically, step S4 includes the following substeps:
substep S41: regard M^(0) as an independent sample from the original three-dimensional model point cloud distribution q(M^(0)); z is the image condition feature vector obtained by reparameterization in step S2;
substep S42: the process of gradually denoising towards the three-dimensional model point cloud is likewise treated as a Markov chain, and the formula for gradually denoising the initial three-dimensional Gaussian noise under the guidance of the image condition feature vector is:

p_θ(M^(0:T) | z) = p(M^(T)) ∏_{t=1}^{T} p_θ(M^(t-1) | M^(t), z);

wherein: T represents the total number of diffusion steps, M^(0) represents the original three-dimensional model point cloud, M^(T) is the pure-noise point cloud, M^(t) represents the intermediate point cloud during denoising, and p_θ(M^(t-1) | M^(t), z) represents the denoising operation;
the formula for denoising the point set M^(t) at the current moment into the point set M^(t-1) at the next moment, according to the image condition feature vector z, is:

p_θ(M^(t-1) | M^(t), z) := N(M^(t-1); μ_θ(M^(t), t, z), β_t I);

wherein: μ_θ is the mean of the noise added to the point set at the current time step, as predicted by the denoising network; M^(t) represents the noise point cloud at diffusion step t; β_t represents the noise intensity at diffusion step t; N represents the number of points in the point cloud; I represents the identity matrix;
substep S43: the guiding capability of the image over network denoising at each time step is improved through cyclic diffusion; the cyclic diffusion comprises: first, given the point set M^(t) at a certain time step, take it as the first input point set M^(t)_1 of the cyclic denoising network; one denoising operation yields the input point set M^(t)_2 of the second cycle; after C cycles M^(t)_C is obtained, i.e. the point set M^(t-1) of the next time step is finally output.
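The cyclic diffusion of substep S43 amounts to applying the denoising operation C times within a single time step, feeding each output back as the next input. A sketch (the `denoise_step` callable and its signature are hypothetical stand-ins for the trained denoising network):

```python
import numpy as np

def cyclic_denoise(m_t, t, z, denoise_step, cycles=3):
    """Apply the denoising operation `cycles` times at one time step:
    each pass's output becomes the next pass's input, and the final
    pass yields the point set for time step t-1."""
    x = m_t
    for _ in range(cycles):
        x = denoise_step(x, t, z)
    return x

# toy denoiser that just shrinks the input a little per pass
toy = lambda x, t, z: 0.9 * x
out = cyclic_denoise(np.ones((8, 3)), t=50, z=None, denoise_step=toy, cycles=3)
print(round(float(out[0, 0]), 3))  # 0.729
```

Each extra cycle gives the image condition vector z another chance to steer the intermediate point set, which is the stated purpose of the cyclic scheme.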
Substep S44: the formula for optimizing the network parameters by the gradient descent algorithm is as follows:
wherein: d (D) KL Represents q (M) (t-1) |M (t) ,M (0) ) AndKL divergence between two probability distributions;modeling probability distribution of noise point cloud of the next diffusion step number t-1 by inputting noise point cloud of the diffusion step number t and image condition feature vector into a denoising network; p is p θ (M (0) |M (1) Z) is a noise point cloud M with a spread number of 1 by input (1) Modeling noise point cloud M of next moment by image condition feature vector into denoising network (0) Probability distribution of (2);
q(M (t-1) |M (t) ,M (0) ) For the prior probability distribution, the calculation formula is:
wherein:
preferably, step S5 comprises the following sub-steps:
substep S51: randomly select a view v, v ∈ V, from the atlas V of rendered views in step S1, and set the total number of iteration steps of the diffusion model to T = 100;
substep S52: extract the image features of the single view and sample the image condition feature vector z as in steps S1 and S2;
substep S53: sample a point set M^(T) from the standard Gaussian distribution N(0, I);
substep S54: based on the Euler method, use the trained denoising model to denoise M^(T) step by step until the number of denoising steps reaches the total number of iteration steps T, obtaining the three-dimensional model point cloud.
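Substeps S51 to S54 can be sketched as a single reverse-sampling loop: start from pure Gaussian noise M^(T) and apply the denoising step T times under the guidance of z. The `denoise_step` callable below is a hypothetical stand-in for the trained network (a constant shrink for illustration), not the patent's model:

```python
import numpy as np

def sample_point_cloud(denoise_step, z, n_points=2048, T=100, seed=0):
    """Start from pure Gaussian noise M^(T) and apply the denoising
    step T times, conditioned on the image feature z, to obtain M^(0)."""
    m = np.random.default_rng(seed).standard_normal((n_points, 3))
    for t in range(T, 0, -1):  # t = T, T-1, ..., 1
        m = denoise_step(m, t, z)
    return m

toy = lambda m, t, z: 0.99 * m  # placeholder denoiser
cloud = sample_point_cloud(toy, z=None, n_points=1024, T=100)
print(cloud.shape)  # (1024, 3)
```

Because the number of points appears only as the shape of the initial noise, the same trained network can emit point clouds of any size, which is the decoupling advantage the beneficial-effects section claims.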
In the description herein, reference to the term "embodiment," "example," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The technical principle of the present invention is described above in connection with the specific embodiments. The description is made for the purpose of illustrating the general principles of the invention and should not be taken in any way as limiting the scope of the invention. Other embodiments of the invention will be apparent to those skilled in the art from consideration of this specification without undue burden.

Claims (6)

1. The single-view three-dimensional reconstruction method based on the cyclic diffusion model is characterized by comprising the following steps of:
step S1: extracting and fusing features with an image feature extraction algorithm fused by random inactivation (dropout) to obtain a fused feature vector, and predicting the mean and variance of the image condition feature vector from the fused feature vector;
step S2: sampling from the predicted mean and variance via the reparameterization trick to obtain the image condition feature vector;
step S3: performing farthest point sampling on the triangular patches of the three-dimensional model to obtain a three-dimensional model point cloud, gradually adding noise to the point cloud according to a noise-adding strategy, and training the denoising network in the diffusion model on the noised point cloud;
step S4: in each iteration of training the diffusion model, improving the guiding capability of the image condition feature vector over the denoising network through cyclic denoising;
step S5: guiding the denoising network with the single input view to gradually denoise a pure-noise point cloud drawn from the standard Gaussian distribution, finally obtaining a three-dimensional model point cloud consistent with the geometric structure of the single view.
2. The single view three dimensional reconstruction method based on a cyclic diffusion model according to claim 1, wherein step S1 comprises the sub-steps of:
substep S11: input the i rendered views {v_1, v_2, ..., v_i} of the three-dimensional model into a two-dimensional backbone network ψ and extract features to obtain a set {f_1, f_2, ..., f_i} of i view feature vectors, the dimension of each view feature vector being set to 512; randomly inactivate the set of view feature vectors with probability p to obtain the set F of view feature vectors after random inactivation, with the specific formula:

F = VD_p({f_1, f_2, ..., f_i});

wherein: F represents the set of view feature vectors after random inactivation, VD_p denotes the random inactivation (dropout) operation, and p represents the probability of random inactivation;
substep S12: fuse the set F of randomly inactivated view feature vectors by max pooling to obtain the final fused feature vector.
3. The single view three dimensional reconstruction method based on a cyclic diffusion model according to claim 1, wherein step S2 comprises the sub-steps of:
substep S21: construct a network for the mean and a network for the variance of the image condition feature vector, each comprising two fully connected layers, a batch normalization layer, and a linear rectification (ReLU) layer; the number of network output channels is set to 256;
substep S22: sample from the mean and variance predicted by the networks to obtain the image condition feature vector used to guide the denoising network, with the specific formula:

z = μ + σ ⊙ ε;

wherein: z represents the image condition feature vector finally used to guide the denoising network, σ represents the variance predicted by the network, μ represents the mean predicted by the network, ε represents noise sampled from the standard Gaussian distribution N(0, I), and I represents the identity matrix.
4. The single view three dimensional reconstruction method based on a cyclic diffusion model according to claim 1, wherein step S3 comprises the sub-steps of:
substep S31: the original three-dimensional model point cloud isWherein->N represents the number of points and,representing the value of the ith point in the point cloud on the x-axis; />Representing the value of the ith point in the point cloud on the y-axis; />Representing the value of the ith point in the point cloud on the z-axis;
substep S32: the process of gradually adding noise to the three-dimensional model point cloud is regarded as a Markov chain, and the formula for gradually adding noise to the three-dimensional model point cloud is as follows:
wherein: t represents the total number of diffusion steps, M (0) Representing a three-dimensional model point cloud, M (T) To follow the pure noise point cloud of the standard Gaussian distribution N (0,I), M (t) Indicating the number of diffusion steps in the noise adding process ast, middle noise point cloud; q (M) (t) |M (t-1) ) Representing a noise adding operation, adding noise points to point M (t-1) On the noise point cloud M (t) Modeling the distribution of (2);
substep S33: the formula of the noise-adding operation q(M^(t) | M^(t-1)) applied to the three-dimensional model point cloud at each diffusion step is:

q(M^(t) | M^(t-1)) = N(M^(t); √(1 − β_t) · M^(t-1), β_t · I);

wherein: β_t represents the hyperparameter controlling the noise-adding intensity at time step t, increasing linearly from β_1 = 0.0004 to β_T = 0.02; N denotes the Gaussian distribution; I represents the identity matrix; M^(t) represents the intermediate point cloud during noise addition.
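The forward noising chain of substeps S32–S33 can be sketched directly from the formula; the point count N = 2048 and the random starting cloud are illustrative assumptions:

```python
import numpy as np

T = 100                               # total diffusion steps (matches substep S51)
betas = np.linspace(0.0004, 0.02, T)  # beta_t rises linearly from 0.0004 to 0.02

def q_step(M_prev, beta, rng):
    """One Markov noising step: M^(t) ~ N(sqrt(1 - beta_t) * M^(t-1), beta_t * I)."""
    eps = rng.standard_normal(M_prev.shape)
    return np.sqrt(1.0 - beta) * M_prev + np.sqrt(beta) * eps

rng = np.random.default_rng(0)
M = rng.standard_normal((2048, 3))    # assumed point cloud with N = 2048 points
for beta in betas:
    M = q_step(M, beta, rng)          # after T steps M is close to pure noise
```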
5. The single view three dimensional reconstruction method based on a cyclic diffusion model according to claim 1, wherein step S4 comprises the sub-steps of:
substep S41: denote by M^(0) the original three-dimensional model point cloud, and let z be the image condition feature vector obtained by reparameterization in step S2;
substep S42: the process of gradually denoising the three-dimensional model point cloud is likewise regarded as a Markov chain, and the formula for gradually denoising the initial three-dimensional Gaussian noise under the guidance of the image condition feature vector is:

p_θ(M^(0:T-1) | M^(T), z) = ∏_{t=1}^{T} p_θ(M^(t-1) | M^(t), z);

wherein: T represents the total number of diffusion steps, M^(0) represents the original three-dimensional model point cloud, M^(T) is the pure-noise point cloud, M^(t) represents the intermediate point cloud in the denoising process, and p_θ(M^(t-1) | M^(t), z) represents the denoising operation;
the formula for denoising the point set M^(t) at the current moment according to the image condition feature vector z, obtaining the point set M^(t-1) at the next moment, is:

p_θ(M^(t-1) | M^(t), z) := N(M^(t-1); μ_θ(M^(t), t, z), β_t · I);

wherein: μ_θ is the noise mean predicted by the denoising network for the current time step; M^(t) represents the noise point cloud at diffusion step t; β_t represents the noise-adding intensity at diffusion step t; N denotes the Gaussian distribution; I represents the identity matrix;
substep S43: the guiding capability of the image on network denoising at each time step is improved through cyclic diffusion; the cyclic diffusion comprises: first, given the point set M^(t) at a certain time step, taking it as the first input point set M_1^(t) of the cyclic denoising network; a denoising operation yields the input point set M_2^(t) of the second cycle; after C cycles, M_C^(t) is obtained and output as the point set M^(t-1) of the next time step;
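A minimal sketch of the cyclic denoising of substep S43: the denoising operation is applied C times at one time step and the C-th output is taken as M^(t-1). Here `denoiser(M, t, z)` is a hypothetical stand-in for the trained network predicting μ_θ, and C = 3 is an assumed cycle count:

```python
import numpy as np

def cyclic_denoise_step(M_t, z, t, denoiser, beta_t, C=3, rng=None):
    """Apply the denoising operation C times at one time step (cyclic diffusion);
    each pass samples from N(mu_theta, beta_t * I)."""
    rng = np.random.default_rng(rng)
    M = M_t
    for _ in range(C):
        mu = denoiser(M, t, z)
        noise = rng.standard_normal(M.shape) if t > 1 else np.zeros_like(M)
        M = mu + np.sqrt(beta_t) * noise
    return M

# usage with a dummy denoiser that shrinks the cloud toward the origin
dummy = lambda M, t, z: 0.9 * M
M_prev = cyclic_denoise_step(np.ones((16, 3)), None, t=5, denoiser=dummy,
                             beta_t=0.01, rng=0)
```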
Substep S44: the formula for optimizing the network parameters by the gradient descent algorithm is as follows:
wherein: d (D) KL Represents q (M) (t-1) |M (t) ,M (0) ) AndKL divergence between two probability distributions;modeling probability distribution of noise point cloud of the next diffusion step number t-1 by inputting noise point cloud of the diffusion step number t and image condition feature vector into a denoising network;
p θ (M (0) |M (1) z) is a noise point cloud M with a spread number of 1 by input (1) Modeling noise point cloud M of next moment by image condition feature vector into denoising network (0) Probability distribution of (2);
q(M^(t-1) | M^(t), M^(0)) is the prior probability distribution, with the calculation formula:

q(M^(t-1) | M^(t), M^(0)) = N(M^(t-1); μ̃_t(M^(t), M^(0)), β̃_t · I),
μ̃_t(M^(t), M^(0)) = (√ᾱ_{t-1} · β_t / (1 − ᾱ_t)) · M^(0) + (√α_t · (1 − ᾱ_{t-1}) / (1 − ᾱ_t)) · M^(t),
β̃_t = ((1 − ᾱ_{t-1}) / (1 − ᾱ_t)) · β_t;

wherein: α_t = 1 − β_t and ᾱ_t = ∏_{s=1}^{t} α_s.
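For the prior distribution q(M^(t-1) | M^(t), M^(0)), its mean and variance follow in closed form from α_t = 1 − β_t; a numpy sketch, assuming the standard DDPM-style posterior and 1-based indexing of t:

```python
import numpy as np

def posterior_params(M0, Mt, t, betas):
    """Mean and variance of q(M^(t-1) | M^(t), M^(0)) for a chain with
    alpha_t = 1 - beta_t and alpha_bar_t = prod_{s<=t} alpha_s."""
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    ab_t = alpha_bars[t - 1]                        # alpha_bar_t (t is 1-based)
    ab_prev = alpha_bars[t - 2] if t > 1 else 1.0   # alpha_bar_{t-1}
    beta_t, a_t = betas[t - 1], alphas[t - 1]
    mean = (np.sqrt(ab_prev) * beta_t / (1.0 - ab_t)) * M0 \
         + (np.sqrt(a_t) * (1.0 - ab_prev) / (1.0 - ab_t)) * Mt
    var = (1.0 - ab_prev) / (1.0 - ab_t) * beta_t
    return mean, var

betas = np.linspace(0.0004, 0.02, 100)
mean, var = posterior_params(np.zeros((4, 3)), np.ones((4, 3)), t=10, betas=betas)
```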
6. The single view three dimensional reconstruction method based on a cyclic diffusion model according to claim 1, wherein step S5 comprises the sub-steps of:
substep S51: randomly selecting a view v, v ∈ V, from the atlas V of rendered views in step S1, and setting the total number of iteration steps of the diffusion model to T = 100;
substep S52: extracting the image features of the single view through step S2 and step S3, and sampling to obtain the image condition feature vector z;
substep S53: sampling an initial point set M^(T) from the standard Gaussian distribution N(0, I);
substep S54: based on the Euler method, using the trained denoising model, denoising M^(T) step by step until the number of denoising steps reaches the total number of iteration steps T, obtaining the three-dimensional model point cloud M^(0).
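The inference loop of substeps S53–S54 can be sketched as follows; `denoiser(M, t, z)` is again a hypothetical stand-in for the trained network's predicted mean μ_θ, and the point count is an assumption:

```python
import numpy as np

def sample_point_cloud(denoiser, z, n_points=2048, T=100, rng=None):
    """Reverse diffusion: start from M^(T) ~ N(0, I) and denoise step by step
    to M^(0), sampling from N(mu_theta, beta_t * I) at each step."""
    rng = np.random.default_rng(rng)
    betas = np.linspace(0.0004, 0.02, T)
    M = rng.standard_normal((n_points, 3))        # initial Gaussian point set M^(T)
    for t in range(T, 0, -1):
        mu = denoiser(M, t, z)
        noise = rng.standard_normal(M.shape) if t > 1 else np.zeros_like(M)
        M = mu + np.sqrt(betas[t - 1]) * noise    # one reverse diffusion step
    return M

# usage with a dummy denoiser that contracts the cloud
cloud = sample_point_cloud(lambda M, t, z: 0.95 * M, z=None, n_points=64, rng=0)
```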
CN202311650864.3A 2023-12-04 2023-12-04 Single-view three-dimensional reconstruction method based on cyclic diffusion model Pending CN117671146A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311650864.3A CN117671146A (en) 2023-12-04 2023-12-04 Single-view three-dimensional reconstruction method based on cyclic diffusion model

Publications (1)

Publication Number Publication Date
CN117671146A true CN117671146A (en) 2024-03-08

Family

ID=90076446

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311650864.3A Pending CN117671146A (en) 2023-12-04 2023-12-04 Single-view three-dimensional reconstruction method based on cyclic diffusion model

Country Status (1)

Country Link
CN (1) CN117671146A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117953544A (en) * 2024-03-26 2024-04-30 安徽农业大学 Target behavior monitoring method and system


Similar Documents

Publication Publication Date Title
Labach et al. Survey of dropout methods for deep neural networks
CN110378844B (en) Image blind motion blur removing method based on cyclic multi-scale generation countermeasure network
WO2021027759A1 (en) Facial image processing
WO2022267641A1 (en) Image defogging method and system based on cyclic generative adversarial network
CN109165735B (en) Method for generating sample picture based on generation of confrontation network and adaptive proportion
CN110992351B (en) sMRI image classification method and device based on multi-input convolution neural network
Li et al. Exploring compositional high order pattern potentials for structured output learning
CN110516724B (en) High-performance multi-layer dictionary learning characteristic image processing method for visual battle scene
CN113112534B (en) Three-dimensional biomedical image registration method based on iterative self-supervision
CN112215339B (en) Medical data expansion method based on generation countermeasure network
CN114663685B (en) Pedestrian re-recognition model training method, device and equipment
CN112883756A (en) Generation method of age-transformed face image and generation countermeasure network model
JP2019197311A (en) Learning method, learning program, and learning device
Sun et al. Deep Evolutionary 3D Diffusion Heat Maps for Large-pose Face Alignment.
CN112509154B (en) Training method of image generation model, image generation method and device
CN117671146A (en) Single-view three-dimensional reconstruction method based on cyclic diffusion model
CN116258632A (en) Text image super-resolution reconstruction method based on text assistance
CN116091885A (en) RAU-GAN-based lung nodule data enhancement method
WO2022236647A1 (en) Methods, devices, and computer readable media for training a keypoint estimation network using cgan-based data augmentation
CN104134091B (en) Neural network training method
CN113393582A (en) Three-dimensional object reconstruction algorithm based on deep learning
CN112784800A (en) Face key point detection method based on neural network and shape constraint
CN112581513A (en) Cone beam computed tomography image feature extraction and corresponding method
US20220343162A1 (en) Method for structure learning and model compression for deep neural network
KR20240005426A (en) Method for extracting the center line of a heart wall

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination