WO2023133675A1 - Method and apparatus for reconstructing 3d image on the basis of 2d image, device, and storage medium - Google Patents

Method and apparatus for reconstructing 3d image on the basis of 2d image, device, and storage medium Download PDF

Info

Publication number
WO2023133675A1
Authority
WO
WIPO (PCT)
Prior art keywords
point cloud
network
image
loss function
progressive
Application number
PCT/CN2022/071269
Other languages
French (fr)
Chinese (zh)
Inventor
王书强
胡博闻
申妍燕
Original Assignee
深圳先进技术研究院
Application filed by 深圳先进技术研究院 filed Critical 深圳先进技术研究院
Priority to PCT/CN2022/071269 priority Critical patent/WO2023133675A1/en
Publication of WO2023133675A1 publication Critical patent/WO2023133675A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects

Definitions

  • the present application relates to the technical field of image processing, and in particular to a method, apparatus, device, and storage medium for reconstructing a 3D image based on a 2D image.
  • a brain-computer interface is a direct connection created between the human or animal brain and an external device, which exchanges information between the brain and the device so as to influence the device's operation.
  • research on brain-computer interfaces has great social and practical significance: it helps improve the working efficiency of machines and enables scenarios such as humans controlling machines and living organisms, further liberating social productivity.
  • the most important factor in ensuring that a brain-computer interface can accurately perform complex tasks is installing a large number of brain electrodes in designated areas of the operator's brain through minimally invasive surgery to capture brain signals.
  • with the development of brain-computer interface technology, the number of electrodes required to detect brain signals is growing exponentially, which brings new challenges to the brain-computer interface surgery that installs the electrodes on the brain.
  • the accurate installation of large-scale electrode arrays depends not only on the surgeon's skill but also, to a large extent, on precise intraoperative modeling and digital positioning of the brain being operated on.
  • the current mainstream intraoperative modeling and positioning method is 3D reconstruction of the target, that is, reconstructing a partial or complete 3D structure of the brain from source material (such as 2D images or real-time video) using imaging algorithms.
  • in current 3D reconstruction work, the point cloud is the mainstream information carrier.
  • however, current work typically uses 2048 as the number of points in a point cloud. With the rapid increase in the number of electrodes installed in brain-computer interface surgery, point clouds of this order of magnitude can no longer meet the practical needs of surgery; generating higher-density point clouds is therefore an urgent problem.
  • the present application provides a method, apparatus, device, and storage medium for reconstructing a 3D image based on a 2D image, so as to solve the problem that the low density of the point clouds constructed by existing solutions leads to high errors in the reconstructed 3D image.
  • a technical solution adopted by the present application is to provide a method for reconstructing a 3D image based on a 2D image, including: extracting the encoded feature vector of a 2D image using the distributed feature encoding network of a pre-trained point cloud reconstruction model,
  • where the point cloud reconstruction model includes the distributed feature encoding network, a first progressive map generator for shape processing, and a second progressive map generator for structural detail processing; processing the encoded feature vector with the first progressive map generator to generate a sparse point cloud; processing the sparse point cloud with the second progressive map generator to generate a high-density point cloud; and constructing a 3D image from the high-density point cloud.
  • the first progressive map generator includes a tree-structured cross-domain multi-structure graph convolutional network and a high-scale shape discriminant network; processing the encoded feature vector with the first progressive map generator to generate a sparse point cloud includes: inputting the encoded feature vector into the tree-structured cross-domain multi-structure graph convolutional network to obtain a first initial point cloud; and using the high-scale shape discriminant network to judge the shape completeness of the first initial point cloud on a first preset scale and guide the tree-structured cross-domain multi-structure graph convolutional network to optimize the first initial point cloud, obtaining the sparse point cloud.
  • the second progressive map generator includes a stacked cross-domain multi-structure graph convolutional network and a low-scale detail discrimination network; processing the sparse point cloud with the second progressive map generator to generate a high-density point cloud includes: using the stacked cross-domain multi-structure graph convolutional network to upsample the sparse point cloud, generating a second initial point cloud; and using the low-scale detail discrimination network to judge the structural credibility of the second initial point cloud on a second preset scale and guide the stacked cross-domain multi-structure graph convolutional network to optimize the second initial point cloud, obtaining the high-density point cloud, where the precision of the second preset scale is higher than that of the first preset scale.
  • the step of pre-training the point cloud reconstruction model includes: obtaining 2D sample images and the real point clouds corresponding to the 2D sample images, and constructing the point cloud reconstruction model to be trained; using the distributed feature encoding network to extract a sample encoding feature vector from a 2D sample image, and calculating the KL divergence of the sample encoding feature vector; inputting the sample encoding feature vector into the first progressive map generator to generate a sample sparse point cloud, and calculating the first chamfer distance and the first earth mover's distance between the sparse point cloud and the real point cloud using preset rules; inputting the sample sparse point cloud into the second progressive map generator to generate a sample high-density point cloud, and calculating the second chamfer distance and the second earth mover's distance between the high-density point cloud and the real point cloud using preset rules; and inversely updating the first progressive map generator with a first loss function constructed from the KL divergence, the first chamfer distance, and the first earth mover's distance, and inversely updating the second progressive map generator with a second loss function constructed from the KL divergence, the second chamfer distance, and the second earth mover's distance.
  • the first loss function includes a tree-structured cross-domain multi-structure graph convolutional network loss function and a high-scale shape discriminant network loss function;
  • the loss function of the tree-structured cross-domain multi-structure graph convolutional network is:
  • $L_{G1} = \lambda_1 L_{KL} + \lambda_2 L_{CD1} + \lambda_3 L_{EMD1} - \mathbb{E}_{Y' \sim P_g}[D_1(Y')]$
  • where $L_{G1}$ is the tree-structured cross-domain multi-structure graph convolutional network loss function, $\lambda_1$, $\lambda_2$, $\lambda_3$ are adjustable parameters preset in the experiment, $L_{KL}$ is the KL divergence, $L_{CD1}$ is the first chamfer distance, $L_{EMD1}$ is the first earth mover's distance, and $\mathbb{E}_{Y' \sim P_g}[D_1(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic the point cloud produced by the generator is;
  • the high-scale shape discriminant network loss function is:
  • $L_{D1} = \mathbb{E}_{Y' \sim P_g}[D_1(Y')] - \mathbb{E}_{Y \sim P_r}[D_1(Y)] + \lambda_{gp} L_{gp}$
  • where $L_{D1}$ is the high-scale shape discriminant network loss function, $\lambda_{gp}$ is an adjustable parameter preset in the experiment weighting the gradient-penalty term $L_{gp}$, $\mathbb{E}_{Y \sim P_r}[D_1(Y)]$ is the mathematical expectation of the discriminator's judgment of how realistic the real sample point cloud is, and $\mathbb{E}_{Y' \sim P_g}[D_1(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic each point of the generated point cloud is.
  • the second loss function includes a stacked cross-domain multi-structure graph convolutional network loss function and a low-scale detail discriminant network loss function;
  • the loss function of the stacked cross-domain multi-structure graph convolutional network is:
  • $L_{G2} = \lambda_1 L_{KL} + \lambda_2 L_{CD2} + \lambda_3 L_{EMD2} - \mathbb{E}_{Y' \sim P_g}[D_2(Y')]$
  • where $L_{G2}$ is the stacked cross-domain multi-structure graph convolutional network loss function, $\lambda_1$, $\lambda_2$, $\lambda_3$ are adjustable parameters preset in the experiment, $L_{KL}$ is the KL divergence, $L_{CD2}$ is the second chamfer distance, $L_{EMD2}$ is the second earth mover's distance, and $\mathbb{E}_{Y' \sim P_g}[D_2(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic the point cloud produced by the generator is;
  • the low-scale detail discriminant network loss function is:
  • $L_{D2} = \mathbb{E}_{Y' \sim P_g}[D_2(Y')] - \mathbb{E}_{Y \sim P_r}[D_2(Y)] + \lambda_{gp} L_{gp}$
  • where $L_{D2}$ is the low-scale detail discriminant network loss function, $\lambda_{gp}$ is an adjustable parameter preset in the experiment weighting the gradient-penalty term $L_{gp}$, $\mathbb{E}_{Y \sim P_r}[D_2(Y)]$ is the mathematical expectation of the discriminator's judgment of how realistic the real sample point cloud is, and $\mathbb{E}_{Y' \sim P_g}[D_2(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic each point of the generated point cloud is.
  • the distributed feature encoding network is based on an efficient residual convolutional neural network and a self-attention network.
  • another technical solution adopted by the present application is to provide an apparatus for reconstructing a 3D image based on a 2D image, including: a feature extraction module, used to extract the encoded feature vector of the 2D image using the distributed feature encoding network of a pre-trained point cloud reconstruction model,
  • where the point cloud reconstruction model includes the distributed feature encoding network, a first progressive map generator for shape processing, and a second progressive map generator for structural detail processing;
  • a first generation module, used to process the encoded feature vector with the first progressive map generator to generate a sparse point cloud;
  • a second generation module, used to process the sparse point cloud with the second progressive map generator to generate a high-density point cloud;
  • and a construction module, used to construct a 3D image from the high-density point cloud.
  • a further technical solution adopted by the present application is to provide a computer device: the computer device includes a processor and a memory coupled to the processor, where program instructions are stored in the memory; when the program instructions are executed by the processor, the processor is caused to execute the steps of the above method for reconstructing a 3D image based on a 2D image.
  • another technical solution adopted by the present application is to provide a storage medium storing program instructions capable of implementing the above-mentioned method for reconstructing a 3D image based on a 2D image.
  • the beneficial effects of the present application are: the method for reconstructing a 3D image based on a 2D image extracts the encoded feature vector of the 2D image, inputs the encoded feature vector into the first progressive map generator, which constructs the overall shape, to generate a sparse point cloud, then inputs the sparse point cloud into the second progressive map generator, which constructs the structural details, to obtain a high-density point cloud, and then reconstructs a 3D image from the high-density point cloud. Based on this multi-stage progressive mapping generation mechanism, the overall shape is reconstructed first to obtain a sparse point cloud, and the sparse point cloud is then upsampled with respect to its structural details to reconstruct a high-density point cloud, so that the 3D image constructed from the high-density point cloud is more accurate.
  • FIG. 1 is a schematic flow diagram of a method for reconstructing a 3D image based on a 2D image according to an embodiment of the present invention
  • Fig. 2 is the schematic flow chart of the point cloud reconstruction model training process of the embodiment of the present invention.
  • FIG. 3 is a schematic diagram of functional modules of an apparatus for reconstructing a 3D image based on a 2D image according to an embodiment of the present invention
  • FIG. 4 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a storage medium according to an embodiment of the present invention.
  • the terms "first", "second", and "third" in this application are used for descriptive purposes only and cannot be interpreted as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Thus, features defined as "first", "second", or "third" may explicitly or implicitly include at least one such feature.
  • "plurality" means at least two, such as two, three, etc., unless otherwise specifically defined. All directional indications (such as up, down, left, right, front, back...) in the embodiments of the present application are only used to explain the relative positional relationships, motion states, etc., of the components in a certain posture (as shown in the drawings); if the specific posture changes, the directional indication changes accordingly.
  • FIG. 1 is a schematic flowchart of a method for reconstructing a 3D image based on a 2D image according to a first embodiment of the present invention. It should be noted that the method of the present invention is not limited to the flow sequence shown in FIG. 1 if substantially the same result is obtained. As shown in Figure 1, the method includes steps:
  • Step S101: use the distributed feature encoding network of the pre-trained point cloud reconstruction model to extract the encoded feature vector of the 2D image.
  • the point cloud reconstruction model includes a distributed feature encoding network, a first progressive map generator for shape processing, and a second progressive map generator for structural detail processing.
  • the purpose of constructing the first progressive map generator is to reconstruct the object in the 2D image in terms of its overall shape, building a sparse point cloud whose overall shape is similar to that of the object; the purpose of constructing the second progressive map generator is to reconstruct the details of the sparse point cloud in terms of structural details, so that the constructed point cloud has a higher density and a higher structural similarity to the object in the 2D image.
  • specifically, the 2D image is input into the pre-trained point cloud reconstruction model, and the encoded feature vector of the 2D image is extracted by the distributed feature encoding network of the point cloud reconstruction model, so that a point cloud of the 2D image can subsequently be generated from the encoded feature vector.
  • the distributed feature encoding network is based on an efficient residual convolutional neural network and a self-attention network.
  • the distributed feature encoding network consists of an efficient residual convolutional neural network (ResNet) and a self-attention network.
  • the efficient residual convolutional neural network consists of 18 residual blocks.
  • the self-attention network can compute the importance of each element of an input within the input itself, strengthen the weights and accuracy of high-importance elements, and merge unimportant elements; it obeys the following formula to calculate the attention score of each element in the input:
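  • a standard scaled dot-product self-attention score consistent with this description (an assumed standard form; the patent's own formula is not reproduced in this text) is $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(QK^{\top} / \sqrt{d_k}\right) V$, where $Q$, $K$, and $V$ are query, key, and value projections of the input elements and $d_k$ is the key dimension.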
  • the distributed feature encoding network can collect and extract accurate and detailed structural features from a single input 2D image; its computation is fast and its robustness strong. It uses KL divergence to mine the distributional features of images within the real image distribution rather than a single feature and, by perceiving the importance of the mined features, strengthens the weights and accuracy of high-importance features and merges unimportant ones, providing a good feature environment for the subsequent reconstruction work.
  • Step S102: process the encoded feature vector with the first progressive map generator to generate a sparse point cloud.
  • specifically, the encoded feature vector is input into the first progressive map generator, which is used to generate a sparse point cloud similar in overall shape to the object in the 2D image.
  • in this embodiment, the first progressive map generator includes a tree-structured cross-domain multi-structure graph convolutional network and a high-scale shape discriminant network. Therefore, the step of processing the encoded feature vector with the first progressive map generator to generate a sparse point cloud specifically includes:
  • the tree-structured cross-domain multi-structure graph convolutional network uses the graph convolutional network as the skeleton, and is designed to connect various topological structures to complete different cross-domain generation tasks.
  • the calculation of the graph convolutional network serving as the skeleton obeys a propagation formula in which $b_l$ is the bias term and $\sigma(\cdot)$ is the activation function.
  • the tree-structured cross-domain multi-structure graph convolutional network is connected with branch modules between each layer of the network to form a tree-shaped topology structure.
  • the purpose is to construct the logical structure of reconstructing a point cloud from a single feature vector, thereby reducing generation errors; the branch modules obey a preset branching formula.
  • the high-scale shape discriminant network also uses the above-mentioned graph convolutional network as the skeleton, and constructs a fully connected layer network at the end of the graph convolutional network to normalize the output structure of the graph convolutional network.
  • the input of the high-scale shape discrimination network is a 3D point cloud, and the output is a 1×1 vector with a value range of 0–1. The closer the output is to 1, the more authentic the network judges the input point cloud to be; otherwise, it considers the input point cloud not credible.
  • in this embodiment, the tree-structured cross-domain multi-structure graph convolutional network has a strong point cloud generation capability: its skeleton graph convolutional network can make full use of the graph connectivity between the points in the point cloud in the algebraic sense and reconstruct the point cloud in the geometric sense, while the tree structure establishes the reconstruction logic from a vector with one column to a point cloud with 2048 columns, ensuring topological validity.
  • by means of the high-scale shape discriminant network, a Nash-equilibrium training strategy conforming to the principles of game theory is constructed.
  • the shape similarity judged on the high scale helps the first progressive map generator better complete the task of outlining the target's shape and contour.
  • Step S103: process the sparse point cloud with the second progressive map generator to generate a high-density point cloud.
  • specifically, the sparse point cloud is input into the second progressive map generator, which processes the structural details to generate a high-density point cloud that reflects the structural details of the object in the 2D image.
  • in this embodiment, the second progressive map generator includes a stacked cross-domain multi-structure graph convolutional network and a low-scale detail discrimination network. Therefore, the step of processing the sparse point cloud with the second progressive map generator to generate a high-density point cloud specifically includes:
  • the stacked cross-domain multi-structure graph convolutional network first connects a convolutional neural network to complete the feature aggregation process, then connects a fully connected layer network to complete the feature upsampling process, and finally connects a convolutional neural network to complete the coordinate reconstruction process.
  • the entire network forms a stacked topology.
  • the feature aggregation process, the feature upsampling process, and the coordinate reconstruction process respectively aggregate the sparse point cloud features, upsample the aggregated sparse point cloud features into high-density point cloud features, and reconstruct the high-density features into point coordinates.
  • for example, the scale of the sparse point cloud is 2048×3, the scale of the aggregated sparse point cloud feature is 2048×128, the upsampled high-density point cloud feature is 4096×128, and the scale of the high-density point cloud after coordinate reconstruction is 4096×3.
  • the low-scale detail discrimination network is also based on the above graph convolutional network,
  • with a fully connected layer network constructed at its end to normalize the output structure of the graph convolutional network.
  • the input of the low-scale detail discriminant network is a high-density point cloud, and the output is likewise a 1×1 vector with a value range of 0–1. The closer the output is to 1, the more authentic the network judges the input point cloud to be; otherwise, it considers the input point cloud not credible.
  • in this embodiment, a stacked cross-domain multi-structure graph convolutional network comprising a feature aggregation process, a feature upsampling process, and a coordinate reconstruction process is designed: it aggregates the sparse point cloud features, upsamples them
  • into high-density point cloud features, and then reconstructs the high-density point cloud features into a high-density point cloud with three-dimensional coordinates.
  • while reconstructing the sparse point cloud into a high-density point cloud, the stacked cross-domain multi-structure graph convolutional network corrects the local microstructure of the point cloud to ensure its reliability.
  • in addition, a low-scale detail discrimination network with normalization capability is designed, and a Nash-equilibrium training strategy is likewise constructed; a self-attention network is also built to inspect the fine structure of the high-density point cloud and fine-tune the coordinates of individual points.
  • Step S104: construct a 3D image from the high-density point cloud.
  • in this embodiment, the point cloud reconstruction model is trained in advance. Referring to Figure 2, the steps of pre-training the point cloud reconstruction model include:
  • Step S201: acquire 2D sample images and the real point clouds corresponding to the 2D sample images, and construct the point cloud reconstruction model to be trained.
  • the point cloud reconstruction model includes a distributed feature encoding network, a first progressive map generator and a second progressive map generator.
  • Step S202: use the distributed feature encoding network to extract the sample encoding feature vector from the 2D sample image, and calculate the KL divergence of the sample encoding feature vector:
  • $L_{KL} = \sum_{x \in X} P(x) \log \dfrac{P(x)}{Q(x)}$
  • where $L_{KL}$ refers to the KL divergence, $P$ represents the probability distribution of $x$ under the generated distribution, $Q$ represents the probability distribution of $x$ in the real space, and $X$ represents the input features.
  • the 2D sample images need to be preprocessed, including but not limited to operations such as cleaning and denoising.
  • the 2D sample image is a brain MRI image
  • operations such as skull removal, neck bone removal, and slicing need to be performed, and a 2D slice $I_{H \times W}$ near the optimal plane is selected, where $H$ and $W$ are the length and width of the 2D image.
  • Step S203: input the sample encoded feature vector into the first progressive map generator to generate a sample sparse point cloud, and calculate the first chamfer distance and the first earth mover's distance between the sparse point cloud and the real point cloud using preset rules.
  • in this embodiment, the preset rules include a chamfer distance calculation formula and an earth mover's distance calculation formula, where the chamfer distance calculation formula is:
  • $L_{CD} = \sum_{y' \in Y'} \min_{y \in Y} \|y' - y\|_2^2 + \sum_{y \in Y} \min_{y' \in Y'} \|y - y'\|_2^2$
  • where $L_{CD}$ represents the chamfer distance, $y'$ represents a vector in $Y'$, and $y$ represents a vector in $Y$.
  • the earth mover's distance calculation formula is:
  • $L_{EMD} = \min_{\phi: Y \to Y'} \sum_{x \in Y} \|x - \phi(x)\|_2$
  • where $L_{EMD}$ represents the earth mover's distance, $Y$ represents the real point cloud, $Y'$ represents the generated point cloud, $\phi$ is a bijection between the two point sets, and $x$ represents a vector in $Y$.
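  • a direct (unoptimized) implementation of both distances is sketched below; here the earth mover's distance is computed as the cost of an optimal one-to-one assignment between equal-sized point sets, which is one common reading of the formula above:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def chamfer_distance(Y_gen, Y_real):
    """L_CD: squared nearest-neighbor distances summed in both directions."""
    d = cdist(Y_gen, Y_real)                    # pairwise Euclidean distances
    return (d.min(axis=1) ** 2).sum() + (d.min(axis=0) ** 2).sum()

def earth_mover_distance(Y_gen, Y_real):
    """L_EMD: total cost of the optimal bijection phi between the two point sets."""
    d = cdist(Y_real, Y_gen)
    rows, cols = linear_sum_assignment(d)       # optimal matching (Hungarian algorithm)
    return d[rows, cols].sum()

Y_real = np.random.rand(2048, 3)                # stand-in for the real point cloud
Y_gen = np.random.rand(2048, 3)                 # stand-in for the generated point cloud
print(chamfer_distance(Y_gen, Y_real), earth_mover_distance(Y_gen, Y_real))
```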
  • Step S204: input the sample sparse point cloud into the second progressive map generator to generate the sample high-density point cloud, and calculate the second chamfer distance and the second earth mover's distance between the high-density point cloud and the real point cloud using the preset rules.
  • Step S205: use the first loss function constructed from the KL divergence, the first chamfer distance, and the first earth mover's distance to inversely update the first progressive map generator, and use the second loss function constructed from the KL divergence, the second chamfer distance, and the second earth mover's distance to inversely update the second progressive map generator.
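  • one training iteration of the first stage might look like the following sketch. The optimizer setup and the helpers chamfer, emd (differentiable versions of the distances above), and gradient_penalty (sketched after the discriminator loss below) are assumptions for illustration, not the patent's implementation:

```python
import torch

def train_step_stage1(encoder, G1, D1, opt_G, opt_D, image, real_pc,
                      lam1=0.1, lam2=1.0, lam3=1.0, lam_gp=10.0):
    """Alternating update: discriminator first, then encoder + generator."""
    z, l_kl = encoder(image)
    fake_pc = G1(z)

    # discriminator update: raise scores on real clouds, lower them on generated ones
    opt_D.zero_grad()
    loss_D = D1(fake_pc.detach()).mean() - D1(real_pc).mean() \
             + lam_gp * gradient_penalty(D1, real_pc, fake_pc.detach())
    loss_D.backward()
    opt_D.step()

    # generator update: L_G1 = lam1*L_KL + lam2*L_CD1 + lam3*L_EMD1 - E[D1(fake)]
    # (opt_G is assumed to cover both the encoder and G1 parameters)
    opt_G.zero_grad()
    loss_G = lam1 * l_kl + lam2 * chamfer(fake_pc, real_pc) \
             + lam3 * emd(fake_pc, real_pc) - D1(fake_pc).mean()
    loss_G.backward()
    opt_G.step()
    return loss_G.item(), loss_D.item()
```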
  • in this embodiment, the first loss function includes a tree-structured cross-domain multi-structure graph convolutional network loss function and a high-scale shape discriminant network loss function;
  • the loss function of the tree-structured cross-domain multi-structure graph convolutional network is:
  • $L_{G1} = \lambda_1 L_{KL} + \lambda_2 L_{CD1} + \lambda_3 L_{EMD1} - \mathbb{E}_{Y' \sim P_g}[D_1(Y')]$
  • where $L_{G1}$ is the tree-structured cross-domain multi-structure graph convolutional network loss function, $\lambda_1$, $\lambda_2$, $\lambda_3$ are adjustable parameters preset in the experiment, $L_{KL}$ is the KL divergence, $L_{CD1}$ is the first chamfer distance, $L_{EMD1}$ is the first earth mover's distance, and $\mathbb{E}_{Y' \sim P_g}[D_1(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic the point cloud produced by the generator is;
  • the high-scale shape discriminant network loss function is:
  • $L_{D1} = \mathbb{E}_{Y' \sim P_g}[D_1(Y')] - \mathbb{E}_{Y \sim P_r}[D_1(Y)] + \lambda_{gp} L_{gp}$
  • where $L_{D1}$ is the high-scale shape discriminant network loss function, $\lambda_{gp}$ is an adjustable parameter preset in the experiment weighting the gradient-penalty term $L_{gp}$, $\mathbb{E}_{Y \sim P_r}[D_1(Y)]$ is the mathematical expectation of the discriminator's judgment of how realistic the real sample point cloud is, and $\mathbb{E}_{Y' \sim P_g}[D_1(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic each point of the generated point cloud is.
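  • the $\lambda_{gp}$-weighted term above is characteristic of a WGAN-GP-style objective; a common implementation of such a gradient-penalty term $L_{gp}$ is sketched below (this specific form is an assumption, since the original formula is not reproduced in this text):

```python
import torch

def gradient_penalty(D, real_pc, fake_pc):
    """L_gp: penalize the discriminator's gradient norm deviating from 1 at
    points interpolated between real and generated clouds (WGAN-GP style)."""
    alpha = torch.rand(real_pc.size(0), 1, 1, device=real_pc.device)
    interp = (alpha * real_pc + (1 - alpha) * fake_pc).requires_grad_(True)
    scores = D(interp)
    grads, = torch.autograd.grad(outputs=scores, inputs=interp,
                                 grad_outputs=torch.ones_like(scores),
                                 create_graph=True)
    return ((grads.flatten(1).norm(2, dim=1) - 1.0) ** 2).mean()
```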
  • the second loss function includes a stacked cross-domain multi-structure graph convolutional network loss function and a low-scale detail discriminant network loss function;
  • the loss function of the stacked cross-domain multi-structure graph convolutional network is:
  • $L_{G2} = \lambda_1 L_{KL} + \lambda_2 L_{CD2} + \lambda_3 L_{EMD2} - \mathbb{E}_{Y' \sim P_g}[D_2(Y')]$
  • where $L_{G2}$ is the stacked cross-domain multi-structure graph convolutional network loss function, $\lambda_1$, $\lambda_2$, $\lambda_3$ are adjustable parameters preset in the experiment, $L_{KL}$ is the KL divergence, $L_{CD2}$ is the second chamfer distance, $L_{EMD2}$ is the second earth mover's distance, and $\mathbb{E}_{Y' \sim P_g}[D_2(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic the point cloud produced by the generator is;
  • the low-scale detail discriminant network loss function is:
  • $L_{D2} = \mathbb{E}_{Y' \sim P_g}[D_2(Y')] - \mathbb{E}_{Y \sim P_r}[D_2(Y)] + \lambda_{gp} L_{gp}$
  • where $L_{D2}$ is the low-scale detail discriminant network loss function, $\lambda_{gp}$ is an adjustable parameter preset in the experiment weighting the gradient-penalty term $L_{gp}$, $\mathbb{E}_{Y \sim P_r}[D_2(Y)]$ is the mathematical expectation of the discriminator's judgment of how realistic the real sample point cloud is, and $\mathbb{E}_{Y' \sim P_g}[D_2(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic each point of the generated point cloud is.
  • the trained point cloud reconstruction network can be obtained by cyclically executing the above training process.
  • after training the point cloud reconstruction network, the method further includes: testing the trained point cloud reconstruction network.
  • specifically, the 2D sample images and the real point clouds corresponding to the 2D sample images are divided into a training set, a validation set, and a test set at a ratio of 80%:10%:10%. The 80% training set is used to optimize the model according to the above steps, and the 10% validation set is used for validation during each training iteration; after the iterations, the optimal model is selected according to the validation results, yielding the trained point cloud reconstruction model.
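  • an 80%:10%:10% split of the paired samples can be sketched as follows (a generic utility, assuming the image/point-cloud pairs are held in parallel lists):

```python
import random

def split_dataset(images, point_clouds, seed=0):
    """Shuffle paired (2D image, real point cloud) samples and split 80/10/10."""
    pairs = list(zip(images, point_clouds))
    random.Random(seed).shuffle(pairs)
    n_train = int(0.8 * len(pairs))
    n_val = int(0.1 * len(pairs))
    train = pairs[:n_train]                      # used to optimize the model
    val = pairs[n_train:n_train + n_val]         # used for per-iteration validation
    test = pairs[n_train + n_val:]               # held out for the final test
    return train, val, test
```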
  • the method for reconstructing 3D images based on 2D images in this embodiment is suitable for 3D modeling in brain-computer interface surgery. A multi-stage progressive mapping generation mechanism is proposed, which reconstructs the point cloud gradually in different stages, so that each reconstruction stage can focus on depicting the general outline of the brain or on fine-tuning the detailed features of the brain, thereby reducing reconstruction errors.
  • after the encoded feature vector is extracted, it is input into the first progressive map generator, which constructs the overall shape, to generate a sparse point cloud; the sparse point cloud is then input into the second progressive map generator, which constructs the structural details, to obtain a high-density point cloud, from which a 3D image is reconstructed.
  • based on this multi-stage progressive mapping generation mechanism, the overall shape is reconstructed first to obtain a sparse point cloud, and the sparse point cloud is then upsampled with respect to its structural details to reconstruct a high-density point cloud, so that the 3D image constructed from the high-density point cloud is more accurate.
  • FIG. 3 is a schematic diagram of functional modules of an apparatus for reconstructing a 3D image based on a 2D image according to an embodiment of the present invention.
  • the device 40 includes:
  • the feature extraction module 41, used to extract the encoded feature vector of the 2D image using the distributed feature encoding network of the pre-trained point cloud reconstruction model,
  • where the point cloud reconstruction model includes the distributed feature encoding network, a first progressive map generator for shape processing, and a second progressive map generator for structural detail processing;
  • the first generation module 42 is used to process the encoded feature vector according to the first progressive map generator to generate a sparse point cloud;
  • the second generation module 43 is used to process the sparse point cloud according to the second progressive map generator to generate a high-density point cloud;
  • the construction module 44 is used for constructing a 3D image according to the high-density point cloud.
  • further, the first progressive map generator includes a tree-structured cross-domain multi-structure graph convolutional network and a high-scale shape discriminant network; the operation of the first generation module 42 processing the encoded feature vector according to the first progressive map generator to generate the sparse point cloud may also be: inputting the encoded feature vector into the tree-structured cross-domain multi-structure graph convolutional network to obtain the first initial point cloud; and using the high-scale shape discrimination network to judge the shape completeness of the first initial point cloud on the first preset scale and guide the tree-structured cross-domain multi-structure graph convolutional network to optimize the first initial point cloud, obtaining the sparse point cloud.
  • further, the second progressive map generator includes a stacked cross-domain multi-structure graph convolutional network and a low-scale detail discrimination network; the operation of the second generation module 43 processing the sparse point cloud according to the second progressive map generator to generate the high-density point cloud may also be: using the stacked cross-domain multi-structure graph convolutional network to upsample the sparse point cloud to generate the second initial point cloud; and using the low-scale detail discrimination network to judge the structural credibility of the second initial point cloud on the second preset scale and guide the stacked cross-domain multi-structure graph convolutional network to optimize the second initial point cloud, obtaining the high-density point cloud,
  • where the precision of the second preset scale is higher than that of the first preset scale.
  • further, the device also includes a training module for pre-training the point cloud reconstruction model, and the operation of training the point cloud reconstruction model specifically includes: obtaining 2D sample images and the real point clouds corresponding to the 2D sample images, and constructing the point cloud reconstruction model to be trained; using the distributed feature encoding network to extract the sample encoding feature vector from the 2D sample image, and calculating the KL divergence of the sample encoding feature vector; inputting the sample encoding feature vector into the first progressive map generator to generate a sample sparse point cloud, and calculating the first chamfer distance and the first earth mover's distance between the sparse point cloud and the real point cloud using the preset rules; inputting the sample sparse point cloud into the second progressive map generator to generate a sample high-density point cloud, and calculating the second chamfer distance and the second earth mover's distance between the high-density point cloud and the real point cloud using the preset rules; and inversely updating the first progressive map generator with the first loss function constructed from the KL divergence, the first chamfer distance, and the first earth mover's distance, and inversely updating the second progressive map generator with the second loss function constructed from the KL divergence, the second chamfer distance, and the second earth mover's distance.
  • the first loss function includes a tree-structured cross-domain multi-structure graph convolutional network loss function and a high-scale shape discriminant network loss function;
  • the loss function of the tree-structured cross-domain multi-structure graph convolutional network is:
  • $L_{G1} = \lambda_1 L_{KL} + \lambda_2 L_{CD1} + \lambda_3 L_{EMD1} - \mathbb{E}_{Y' \sim P_g}[D_1(Y')]$
  • where $L_{G1}$ is the tree-structured cross-domain multi-structure graph convolutional network loss function, $\lambda_1$, $\lambda_2$, $\lambda_3$ are adjustable parameters preset in the experiment, $L_{KL}$ is the KL divergence, $L_{CD1}$ is the first chamfer distance, $L_{EMD1}$ is the first earth mover's distance, and $\mathbb{E}_{Y' \sim P_g}[D_1(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic the point cloud produced by the generator is;
  • the high-scale shape discriminant network loss function is:
  • $L_{D1} = \mathbb{E}_{Y' \sim P_g}[D_1(Y')] - \mathbb{E}_{Y \sim P_r}[D_1(Y)] + \lambda_{gp} L_{gp}$
  • where $L_{D1}$ is the high-scale shape discriminant network loss function, $\lambda_{gp}$ is an adjustable parameter preset in the experiment weighting the gradient-penalty term $L_{gp}$, $\mathbb{E}_{Y \sim P_r}[D_1(Y)]$ is the mathematical expectation of the discriminator's judgment of how realistic the real sample point cloud is, and $\mathbb{E}_{Y' \sim P_g}[D_1(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic each point of the generated point cloud is.
  • the second loss function includes a stacked cross-domain multi-structure graph convolutional network loss function and a low-scale detail discriminant network loss function;
  • the loss function of the stacked cross-domain multi-structure graph convolutional network is:
  • $L_{G2} = \lambda_1 L_{KL} + \lambda_2 L_{CD2} + \lambda_3 L_{EMD2} - \mathbb{E}_{Y' \sim P_g}[D_2(Y')]$
  • where $L_{G2}$ is the stacked cross-domain multi-structure graph convolutional network loss function, $\lambda_1$, $\lambda_2$, $\lambda_3$ are adjustable parameters preset in the experiment, $L_{KL}$ is the KL divergence, $L_{CD2}$ is the second chamfer distance, $L_{EMD2}$ is the second earth mover's distance, and $\mathbb{E}_{Y' \sim P_g}[D_2(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic the point cloud produced by the generator is;
  • the low-scale detail discriminant network loss function is:
  • $L_{D2} = \mathbb{E}_{Y' \sim P_g}[D_2(Y')] - \mathbb{E}_{Y \sim P_r}[D_2(Y)] + \lambda_{gp} L_{gp}$
  • where $L_{D2}$ is the low-scale detail discriminant network loss function, $\lambda_{gp}$ is an adjustable parameter preset in the experiment weighting the gradient-penalty term $L_{gp}$, $\mathbb{E}_{Y \sim P_r}[D_2(Y)]$ is the mathematical expectation of the discriminator's judgment of how realistic the real sample point cloud is, and $\mathbb{E}_{Y' \sim P_g}[D_2(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic each point of the generated point cloud is.
  • the distributed feature encoding network is based on an efficient residual convolutional neural network and a self-attention network.
  • each embodiment in this specification is described in a progressive manner, and each embodiment focuses on the differences from other embodiments.
  • for the parts that are the same or similar among the embodiments, refer to one another.
  • since the apparatus embodiment is basically similar to the method embodiments, its description is relatively simple; for related parts, refer to the description of the method embodiments.
  • FIG. 4 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
  • the computer device 60 includes a processor 61 and a memory 62 coupled to the processor 61.
  • Program instructions are stored in the memory 62.
  • when the program instructions are executed by the processor 61, the processor 61 executes the steps of the method for reconstructing a 3D image based on a 2D image described in any of the above embodiments.
  • the processor 61 may also be called a CPU (Central Processing Unit, central processing unit).
  • the processor 61 may be an integrated circuit chip with signal processing capabilities.
  • the processor 61 can also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • FIG. 5 is a schematic structural diagram of a storage medium according to an embodiment of the present invention.
  • the storage medium of the embodiment of the present invention stores program instructions 71 capable of implementing all of the above methods. The program instructions 71 may be stored in the storage medium in the form of a software product, which includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc, as well as computer equipment such as computers, servers, mobile phones, and tablets.
  • the disclosed computer equipment, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division into units is only a logical functional division, and there may be other division methods in actual implementation;
  • for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units. The above is only an implementation of the present application and does not limit the patent scope of the present application; any equivalent structural or process transformation made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, is likewise included within the patent protection scope of the present application.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are a method and an apparatus for reconstructing a 3D image on the basis of a 2D image, a device, and a storage medium. The method comprises: extracting coding feature vectors of a 2D image by using a distribution feature coding network of a pre-trained point cloud reconstruction model (S101), the point cloud reconstruction model comprising the distribution feature coding network, a first progressive mapping generator used for shape processing, and a second progressive mapping generator used for structure detail processing; and by means of the first progressive mapping generator, processing the coding feature vectors to generate a sparse point cloud (S102); by means of the second progressive mapping generator, processing the sparse point cloud to generate a high-density point cloud (S103); and according to the high-density point cloud, constructing a 3D image (S104). In the method, a high-density point cloud of a 2D image is reconstructed by means of a multi-stage progressive mapping generation mechanism, so that the error of reconstructing a 3D image is reduced.

Description

Method, device, equipment and storage medium for reconstructing a 3D image based on a 2D image

Technical Field

The present application relates to the technical field of image processing, and in particular to a method, apparatus, device, and storage medium for reconstructing a 3D image based on a 2D image.
Background Art
A brain-computer interface is a direct connection created between the human or animal brain and an external device, which exchanges information between the brain and the device so as to influence the device's operation. Research on brain-computer interfaces has great social and practical significance: it helps improve the working efficiency of machines and enables scenarios such as humans controlling machines and living organisms, further liberating social productivity. The most important factor in ensuring that a brain-computer interface can accurately perform complex tasks is installing a large number of brain electrodes in designated areas of the operator's brain through minimally invasive surgery to capture brain signals. At present, with the development of brain-computer interface technology, the number of electrodes required to detect brain signals is growing exponentially, which brings new challenges to the brain-computer interface surgery that installs the electrodes on the brain. In brain-computer interface surgery, the accurate installation of large-scale electrode arrays depends not only on the surgeon's skill but also, to a large extent, on precise intraoperative modeling and digital positioning of the brain being operated on.

The current mainstream intraoperative modeling and positioning method is 3D reconstruction of the target, that is, reconstructing a partial or complete 3D structure of the brain from source material (such as 2D images or real-time video) using imaging algorithms. In current 3D reconstruction work, the point cloud is the mainstream information carrier. However, current work typically uses 2048 as the number of points in a point cloud; with the rapid increase in the number of electrodes installed in brain-computer interface surgery, point clouds of this order of magnitude can no longer meet the practical needs of surgery. Generating higher-density point clouds is therefore an urgent problem.
Summary of the Invention
The present application provides a method, apparatus, device, and storage medium for reconstructing a 3D image based on a 2D image, so as to solve the problem that the low density of the point clouds constructed by existing solutions leads to high errors in the reconstructed 3D image.

To solve the above technical problem, one technical solution adopted by the present application is to provide a method for reconstructing a 3D image based on a 2D image, including: extracting the encoded feature vector of a 2D image using the distributed feature encoding network of a pre-trained point cloud reconstruction model, where the point cloud reconstruction model includes the distributed feature encoding network, a first progressive map generator for shape processing, and a second progressive map generator for structural detail processing; processing the encoded feature vector with the first progressive map generator to generate a sparse point cloud; processing the sparse point cloud with the second progressive map generator to generate a high-density point cloud; and constructing a 3D image from the high-density point cloud.

As a further improvement of the present application, the first progressive map generator includes a tree-structured cross-domain multi-structure graph convolutional network and a high-scale shape discriminant network; processing the encoded feature vector with the first progressive map generator to generate a sparse point cloud includes: inputting the encoded feature vector into the tree-structured cross-domain multi-structure graph convolutional network to obtain a first initial point cloud; and using the high-scale shape discriminant network to judge the shape completeness of the first initial point cloud on a first preset scale and guide the tree-structured cross-domain multi-structure graph convolutional network to optimize the first initial point cloud, obtaining the sparse point cloud.

As a further improvement of the present application, the second progressive map generator includes a stacked cross-domain multi-structure graph convolutional network and a low-scale detail discrimination network; processing the sparse point cloud with the second progressive map generator to generate a high-density point cloud includes: using the stacked cross-domain multi-structure graph convolutional network to upsample the sparse point cloud, generating a second initial point cloud; and using the low-scale detail discrimination network to judge the structural credibility of the second initial point cloud on a second preset scale and guide the stacked cross-domain multi-structure graph convolutional network to optimize the second initial point cloud, obtaining the high-density point cloud, where the precision of the second preset scale is higher than that of the first preset scale.

As a further improvement of the present application, the step of pre-training the point cloud reconstruction model includes: obtaining 2D sample images and the real point clouds corresponding to the 2D sample images, and constructing the point cloud reconstruction model to be trained; using the distributed feature encoding network to extract a sample encoding feature vector from a 2D sample image, and calculating the KL divergence of the sample encoding feature vector; inputting the sample encoding feature vector into the first progressive map generator to generate a sample sparse point cloud, and calculating the first chamfer distance and the first earth mover's distance between the sparse point cloud and the real point cloud using preset rules; inputting the sample sparse point cloud into the second progressive map generator to generate a sample high-density point cloud, and calculating the second chamfer distance and the second earth mover's distance between the high-density point cloud and the real point cloud using preset rules; and inversely updating the first progressive map generator with a first loss function constructed from the KL divergence, the first chamfer distance, and the first earth mover's distance, and inversely updating the second progressive map generator with a second loss function constructed from the KL divergence, the second chamfer distance, and the second earth mover's distance.
As a further improvement of the present application, the first loss function includes a tree-structured cross-domain multi-structure graph convolutional network loss function and a high-scale shape discriminant network loss function.

The loss function of the tree-structured cross-domain multi-structure graph convolutional network is:

$L_{G1} = \lambda_1 L_{KL} + \lambda_2 L_{CD1} + \lambda_3 L_{EMD1} - \mathbb{E}_{Y' \sim P_g}[D_1(Y')]$

where $L_{G1}$ is the tree-structured cross-domain multi-structure graph convolutional network loss function, $\lambda_1$, $\lambda_2$, $\lambda_3$ are adjustable parameters preset in the experiment, $L_{KL}$ is the KL divergence, $L_{CD1}$ is the first chamfer distance, $L_{EMD1}$ is the first earth mover's distance, and $\mathbb{E}_{Y' \sim P_g}[D_1(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic the point cloud produced by the generator is.

The high-scale shape discriminant network loss function is:

$L_{D1} = \mathbb{E}_{Y' \sim P_g}[D_1(Y')] - \mathbb{E}_{Y \sim P_r}[D_1(Y)] + \lambda_{gp} L_{gp}$

where $L_{D1}$ is the high-scale shape discriminant network loss function, $\lambda_{gp}$ is an adjustable parameter preset in the experiment weighting the gradient-penalty term $L_{gp}$, $\mathbb{E}_{Y \sim P_r}[D_1(Y)]$ is the mathematical expectation of the discriminator's judgment of how realistic the real sample point cloud is, and $\mathbb{E}_{Y' \sim P_g}[D_1(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic each point of the generated point cloud is.
As a further improvement of the present application, the second loss function includes a stacked cross-domain multi-structure graph convolutional network loss function and a low-scale detail discriminant network loss function.

The loss function of the stacked cross-domain multi-structure graph convolutional network is:

$L_{G2} = \lambda_1 L_{KL} + \lambda_2 L_{CD2} + \lambda_3 L_{EMD2} - \mathbb{E}_{Y' \sim P_g}[D_2(Y')]$

where $L_{G2}$ is the stacked cross-domain multi-structure graph convolutional network loss function, $\lambda_1$, $\lambda_2$, $\lambda_3$ are adjustable parameters preset in the experiment, $L_{KL}$ is the KL divergence, $L_{CD2}$ is the second chamfer distance, $L_{EMD2}$ is the second earth mover's distance, and $\mathbb{E}_{Y' \sim P_g}[D_2(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic the point cloud produced by the generator is.

The low-scale detail discriminant network loss function is:

$L_{D2} = \mathbb{E}_{Y' \sim P_g}[D_2(Y')] - \mathbb{E}_{Y \sim P_r}[D_2(Y)] + \lambda_{gp} L_{gp}$

where $L_{D2}$ is the low-scale detail discriminant network loss function, $\lambda_{gp}$ is an adjustable parameter preset in the experiment weighting the gradient-penalty term $L_{gp}$, $\mathbb{E}_{Y \sim P_r}[D_2(Y)]$ is the mathematical expectation of the discriminator's judgment of how realistic the real sample point cloud is, and $\mathbb{E}_{Y' \sim P_g}[D_2(Y')]$ is the mathematical expectation of the discriminator's judgment of how realistic each point of the generated point cloud is.
As a further improvement of the present application, the distributed feature encoding network is based on an efficient residual convolutional neural network and a self-attention network.
To solve the above technical problem, another technical solution adopted by the present application is to provide an apparatus for reconstructing a 3D image based on a 2D image, including: a feature extraction module, used to extract the encoded feature vector of the 2D image using the distributed feature encoding network of a pre-trained point cloud reconstruction model, where the point cloud reconstruction model includes the distributed feature encoding network, a first progressive map generator for shape processing, and a second progressive map generator for structural detail processing; a first generation module, used to process the encoded feature vector with the first progressive map generator to generate a sparse point cloud; a second generation module, used to process the sparse point cloud with the second progressive map generator to generate a high-density point cloud; and a construction module, used to construct a 3D image from the high-density point cloud.
To solve the above technical problem, a further technical solution adopted by the present application is to provide a computer device: the computer device includes a processor and a memory coupled to the processor, where program instructions are stored in the memory; when the program instructions are executed by the processor, the processor is caused to execute the steps of the above method for reconstructing a 3D image based on a 2D image.

To solve the above technical problem, a further technical solution adopted by the present application is to provide a storage medium storing program instructions capable of implementing the above method for reconstructing a 3D image based on a 2D image.
The beneficial effects of the present application are: the method for reconstructing a 3D image based on a 2D image extracts the encoded feature vector of the 2D image, inputs the encoded feature vector into the first progressive map generator, which constructs the overall shape, to generate a sparse point cloud, then inputs the sparse point cloud into the second progressive map generator, which constructs the structural details, to obtain a high-density point cloud, and then reconstructs a 3D image from the high-density point cloud. Based on this multi-stage progressive mapping generation mechanism, the overall shape is reconstructed first to obtain a sparse point cloud, and the sparse point cloud is then upsampled with respect to its structural details to reconstruct a high-density point cloud, so that the 3D image constructed from the high-density point cloud is more accurate.
Description of the Drawings

Fig. 1 is a schematic flowchart of a method for reconstructing a 3D image based on a 2D image according to an embodiment of the present invention;

Fig. 2 is a schematic flowchart of the training process of the point cloud reconstruction model according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of the functional modules of an apparatus for reconstructing a 3D image based on a 2D image according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a computer device according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a storage medium according to an embodiment of the present invention.
具体实施方式Detailed ways
In order to make the purpose, technical solutions, and advantages of the present application clearer, the present application is further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present application, not to limit it.

The terms "first", "second", and "third" in this application are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. Accordingly, a feature qualified by "first", "second", or "third" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, for example two or three, unless otherwise explicitly and specifically defined. All directional indications (such as up, down, left, right, front, rear, and so on) in the embodiments of the present application are used only to explain the relative positional relationship, movement, and the like of the components in a particular posture (as shown in the drawings); if that particular posture changes, the directional indication changes accordingly. In addition, the terms "include" and "have", as well as any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device comprising a series of steps or units is not limited to the listed steps or units, but optionally further includes steps or units that are not listed, or optionally further includes other steps or units inherent to the process, method, product, or device.

Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The occurrences of this phrase at various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
FIG. 1 is a schematic flowchart of a method for reconstructing a 3D image based on a 2D image according to a first embodiment of the present invention. It should be noted that, provided substantially the same result is obtained, the method of the present invention is not limited to the flow sequence shown in FIG. 1. As shown in FIG. 1, the method includes the following steps.

Step S101: extract an encoded feature vector of a 2D image using the distribution feature encoding network of a pre-trained point cloud reconstruction model.

Here, the point cloud reconstruction model includes a distribution feature encoding network, a first progressive mapping generator for shape processing, and a second progressive mapping generator for structural detail processing.

The purpose of constructing the first progressive mapping generator is to reconstruct the object in the 2D image in terms of overall shape, producing a sparse point cloud that is structurally similar to the object in overall shape; the purpose of constructing the second progressive mapping generator is to reconstruct the fine parts of the sparse point cloud in terms of structural detail, so that the constructed point cloud is denser and structurally more similar to the object in the 2D image.

Specifically, after the 2D image input by the user is received, the 2D image is fed into the pre-trained point cloud reconstruction model, and the distribution feature encoding network of the model extracts the encoded feature vector of the 2D image, so that the point cloud of the 2D image can subsequently be generated from this encoded feature vector.

Further, in this embodiment, the distribution feature encoding network is composed of an efficient residual convolutional neural network and a self-attention network.

Specifically, the distribution feature encoding network consists of an efficient residual convolutional neural network (ResNet) and a self-attention network; it computes quickly, is robust, and can mine the distributional features of images (rather than single-image features). The efficient residual convolutional network consists of 18 residual blocks. The self-attention network computes the degree of importance of each element of an input to the input itself, strengthens the weights and precision of high-importance elements, and merges unimportant elements; the attention score of each element of the input is computed according to the following formula:
$$w_{i,j} = \frac{\exp\left(f_1(p_i)^{T} f_2(p_j)\right)}{\sum_{p_k \in P} \exp\left(f_1(p_i)^{T} f_2(p_k)\right)}$$

where $w_{i,j}$ denotes the attention score of an element, $f_i$ denotes the $i$-th fully connected layer, $T$ denotes the transpose of a matrix, and $P$ is the input; the formula computes the attention score of any two vectors $p_i$ and $p_j$ in the input $P$. Here $p_i$ and $p_j$ are the vectors being scored, and $p_k$ is the index of summation: all vectors $p$ are taken from $P$ in turn, with $p_k$ referring to the vector currently taken. In this embodiment, two further fully connected networks $f_3$ and $f_4$ are used to update $p_i$ based on the attention scores so as to make it geometrically more reliable; the update function is defined as

$$\tilde{p}_i = f_3\left( \sum_{j} w_{i,j}\, f_4(p_j) \right) + p_i$$

In this embodiment, the distribution feature encoding network can collect and extract accurate and detailed structural features from a single input 2D image. It computes quickly and is robust, and it uses the KL divergence to mine the image's distributional features within the real image distribution rather than any single-image feature; based on the importance of the perceptually mined features, it strengthens the weights and precision of high-importance features and merges unimportant ones, providing a good feature environment for the subsequent reconstruction work.
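The following is a minimal PyTorch sketch of a self-attention block of this kind over a set of feature vectors, assuming the standard softmax reading of the score formula above; the module name PointSelfAttention, the layer names f1 through f4, and the feature width of 128 are illustrative assumptions rather than identifiers from the original application.

```python
import torch
import torch.nn as nn

class PointSelfAttention(nn.Module):
    """Hypothetical sketch: scores every pair of input vectors and
    re-weights each vector by its attended neighbours (residual update)."""

    def __init__(self, dim: int = 128):
        super().__init__()
        # f1/f2 produce the pairwise scores, f3/f4 perform the update.
        self.f1 = nn.Linear(dim, dim)
        self.f2 = nn.Linear(dim, dim)
        self.f3 = nn.Linear(dim, dim)
        self.f4 = nn.Linear(dim, dim)

    def forward(self, p: torch.Tensor) -> torch.Tensor:
        # p: (batch, n_elements, dim)
        scores = torch.softmax(
            torch.bmm(self.f1(p), self.f2(p).transpose(1, 2)), dim=-1
        )  # scores[b, i, j]: attention of element i to element j
        # Residual update keeps the original vector and adds the attended mix.
        return self.f3(torch.bmm(scores, self.f4(p))) + p

# Toy usage: 2048 feature vectors of width 128.
features = torch.randn(1, 2048, 128)
print(PointSelfAttention(128)(features).shape)  # torch.Size([1, 2048, 128])
```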
Step S102: process the encoded feature vector with the first progressive mapping generator to generate a sparse point cloud.

Specifically, after the encoded feature vector is obtained, it is fed into the first progressive mapping generator, which generates a sparse point cloud similar in overall shape to the object in the 2D image.

Further, in order to reconstruct a point cloud with high accuracy, in some embodiments the first progressive mapping generator includes a tree-structured cross-domain multi-structure graph convolutional network and a high-scale shape discrimination network; accordingly, the step of processing the encoded feature vector with the first progressive mapping generator to generate a sparse point cloud specifically includes:

1.1. Feeding the encoded feature vector into the tree-structured cross-domain multi-structure graph convolutional network to obtain a first initial point cloud.

Specifically, the tree-structured cross-domain multi-structure graph convolutional network takes a graph convolutional network as its backbone and is designed so that various topological structures can be attached to it to complete different cross-domain generation tasks; the computation of the backbone graph convolutional network obeys the formula:
$$p_i^{l+1} = \sigma\left( F^{l}\left(p_i^{l}\right) + \sum_{q_j \in A\left(p_i^{l}\right)} U^{l} q_j + b^{l} \right)$$

where $F^{l}(\cdot)$ is a perception layer used to compute the relation from the node's upper layer to the current layer, the ancestor term $\sum_{q_j \in A(p_i^{l})} U^{l} q_j$ computes the feature distribution from each ancestor node of the current node to the node itself, $b^{l}$ is a bias term, and $\sigma(\cdot)$ is the activation function.

In the first progressive mapping generator, the tree-structured cross-domain multi-structure graph convolutional network attaches a branching module between successive network layers to form a tree-shaped topology; the purpose is to construct the logical structure for reconstructing a point cloud from a single feature vector, thereby reducing generation error. The branching module obeys a formula given in the original publication only as an equation image:

[Branching-module formula: equation image PCTCN2022071269-appb-000020 in the original publication.]
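Below is a compact PyTorch sketch of one such tree-structured layer: a per-node loop term plus an aggregated ancestor term, bias, and activation, following the backbone formula above, then a simple duplicate-into-children branching step. Since the branching module is only available as an equation image in the original, the branching-by-copy choice, all names, and the layer widths are assumptions for illustration.

```python
import torch
import torch.nn as nn

class TreeGCNLayer(nn.Module):
    """Sketch of one tree-structured GCN layer: per-node loop term,
    an aggregated ancestor term, bias, activation, then branching."""

    def __init__(self, in_dim: int, out_dim: int, branch: int = 2):
        super().__init__()
        self.loop = nn.Linear(in_dim, out_dim)      # F^l: upper layer -> this layer
        self.ancestor = nn.Linear(in_dim, out_dim)  # U^l: ancestor feature term
        self.bias = nn.Parameter(torch.zeros(out_dim))
        self.act = nn.LeakyReLU(0.2)
        self.branch = branch

    def forward(self, nodes: torch.Tensor, ancestors: torch.Tensor) -> torch.Tensor:
        # nodes: (batch, n, in_dim); ancestors: (batch, n, in_dim) summed ancestor features
        out = self.act(self.loop(nodes) + self.ancestor(ancestors) + self.bias)
        # Branching: duplicate every node into `branch` children so that
        # stacking layers grows the point set from a single root outward.
        return out.repeat_interleave(self.branch, dim=1)

# Toy usage: a single root vector grows to 2 children, then 4, and so on.
root = torch.randn(1, 1, 96)
layer = TreeGCNLayer(96, 96)
children = layer(root, root)       # (1, 2, 96)
grand = layer(children, children)  # (1, 4, 96)
print(children.shape, grand.shape)
```

In the first generator of the application, layers of this kind would be stacked until the leaves of the tree form the 2048-point sparse cloud.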
1.2. Using the high-scale shape discrimination network to judge the degree of shape completeness of the first initial point cloud at a first preset scale, and guiding the tree-structured cross-domain multi-structure graph convolutional network to optimize the first initial point cloud, obtaining the sparse point cloud.

Specifically, the high-scale shape discrimination network likewise takes the above graph convolutional network as its backbone, and a fully connected network is built at the end of the graph convolutional network to normalize its output structure. In contrast to the tree-structured cross-domain multi-structure graph convolutional network, the high-scale shape discrimination network takes a 3D point cloud as input and outputs a 1×1 vector with values in the range 0–1: the closer the output is to 1, the more the network accepts the input point cloud as real; conversely, it considers the input point cloud untrustworthy.
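As a rough sketch of such a discrimination head in PyTorch: the graph-convolution backbone is simplified here to a per-point MLP with a global max-pool, followed by a fully connected head that normalizes the output to a single 0–1 realism score; the pooling choice and the layer widths are assumptions, not details from the original.

```python
import torch
import torch.nn as nn

class ShapeDiscriminator(nn.Module):
    """Sketch: maps a point cloud (batch, n, 3) to one realism score in [0, 1]."""

    def __init__(self, width: int = 128):
        super().__init__()
        self.point_net = nn.Sequential(nn.Linear(3, width), nn.LeakyReLU(0.2),
                                       nn.Linear(width, width), nn.LeakyReLU(0.2))
        self.head = nn.Sequential(nn.Linear(width, 1), nn.Sigmoid())

    def forward(self, cloud: torch.Tensor) -> torch.Tensor:
        feat = self.point_net(cloud).max(dim=1).values  # global max-pool over points
        return self.head(feat)                          # (batch, 1); near 1 = "real"

cloud = torch.randn(4, 2048, 3)
print(ShapeDiscriminator()(cloud).shape)  # torch.Size([4, 1])
```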
It should be noted that the tree-structured cross-domain multi-structure graph convolutional network has a strong point cloud generation capability: its backbone graph convolutional network can make full use of the graph connectivity among the points of the point cloud in space, reconstructing the point cloud not only in an algebraic but also in a geometric sense, while the tree structure constructs the reconstruction logic from a single-column vector to a 2048-point point cloud, guaranteeing its topological validity. Moreover, by designing the high-scale shape discrimination network and using an adversarial learning algorithm, a Nash equilibrium training strategy conforming to the principles of game theory is constructed, and the graph convolutional network is used to effectively judge, at a larger scale, the shape similarity between the generated sparse point cloud and the ground-truth target, helping the first progressive mapping generator better complete the task of outlining the target's shape and contour.

Step S103: process the sparse point cloud with the second progressive mapping generator to generate a high-density point cloud.

Specifically, after the sparse point cloud is obtained, it is fed into the second progressive mapping generator, which processes structural details and generates a high-density point cloud that reflects the structural details of the object in the 2D image.

Further, in order to further improve the accuracy of the reconstructed point cloud, in some embodiments the second progressive mapping generator includes a stacked cross-domain multi-structure graph convolutional network and a low-scale detail discrimination network; accordingly, the step of processing the sparse point cloud with the second progressive mapping generator to generate a high-density point cloud specifically includes:

2.1. Upsampling the sparse point cloud with the stacked cross-domain multi-structure graph convolutional network to generate a second initial point cloud.

Specifically, the stacked cross-domain multi-structure graph convolutional network first attaches a convolutional neural network for the feature aggregation process, then attaches a fully connected network to complete the feature upsampling process, and finally attaches another convolutional neural network to complete the coordinate reconstruction process, the whole network forming a stacked topology. The feature aggregation, feature upsampling, and coordinate reconstruction processes respectively aggregate the sparse point cloud features, upsample the aggregated sparse point cloud features into high-density point cloud features, and reconstruct the high-density point cloud features into a high-density point cloud with three-dimensional coordinates. For example, for a sparse point cloud of size 2048×3, the aggregated sparse point cloud features have size 2048×128, the upsampled high-density point cloud features have size 4096×128, and the coordinate-reconstructed high-density point cloud has size 4096×3.
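A minimal PyTorch sketch of that three-stage stack follows, using the example sizes from the text (2048×3 to 2048×128 to 4096×128 to 4096×3); the specific convolution and fully connected choices are illustrative assumptions.

```python
import torch
import torch.nn as nn

class StackedUpsampler(nn.Module):
    """Sketch: feature aggregation -> feature upsampling -> coordinate reconstruction."""

    def __init__(self, feat: int = 128, ratio: int = 2):
        super().__init__()
        self.ratio = ratio
        # 1) Feature aggregation: per-point 1x1 convolution, 3 -> feat channels.
        self.aggregate = nn.Conv1d(3, feat, kernel_size=1)
        # 2) Feature upsampling: fully connected expansion, 1 point -> `ratio` points.
        self.expand = nn.Linear(feat, ratio * feat)
        # 3) Coordinate reconstruction: feat channels -> 3D coordinates.
        self.reconstruct = nn.Conv1d(feat, 3, kernel_size=1)

    def forward(self, sparse: torch.Tensor) -> torch.Tensor:
        b, n, _ = sparse.shape                              # (b, 2048, 3)
        f = self.aggregate(sparse.transpose(1, 2))          # (b, 128, 2048)
        f = self.expand(f.transpose(1, 2))                  # (b, 2048, 2*128)
        f = f.reshape(b, n * self.ratio, -1)                # (b, 4096, 128)
        return self.reconstruct(f.transpose(1, 2)).transpose(1, 2)  # (b, 4096, 3)

print(StackedUpsampler()(torch.randn(1, 2048, 3)).shape)  # torch.Size([1, 4096, 3])
```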
2.2. Using the low-scale detail discrimination network to judge the structural credibility of the second initial point cloud at a second preset scale, and guiding the stacked cross-domain multi-structure graph convolutional network to optimize the second initial point cloud, obtaining the high-density point cloud, where the precision of the second preset scale is higher than that of the first preset scale.

Specifically, the low-scale detail discrimination network also takes the above graph convolutional network as its backbone. At the end of the graph convolutional network, besides a fully connected network built to normalize the output structure, a self-attention network is built to examine the fine structure of the high-density point cloud and fine-tune the coordinates of each of its points to guarantee the rationality of the final output. In contrast to the cross-domain multi-structure graph convolutional network, the low-scale detail discrimination network takes a high-density point cloud as input, and its output is likewise a 1×1 vector with values in the range 0–1: the closer the output is to 1, the more the network accepts the input point cloud as real; conversely, it considers the input point cloud untrustworthy.

In this embodiment, a stacked cross-domain multi-structure graph convolutional network with a feature aggregation process, a feature upsampling process, and a coordinate reconstruction process is designed: it aggregates the sparse point cloud features, upsamples the aggregated features into high-density point cloud features, and then reconstructs those features into a high-density point cloud with three-dimensional coordinates. Through this process, the stacked network corrects the local microstructure of the point cloud while reconstructing the sparse point cloud into a high-density one, guaranteeing its reliability. In addition, the low-scale detail discrimination network is designed with the same Nash equilibrium training strategy and normalization capability, and a self-attention network is further built to examine the fine structure of the high-density point cloud and fine-tune the coordinates of each point.

Step S104: construct a 3D image from the high-density point cloud.
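Taken together, steps S101 through S104 can be sketched end to end as follows; the stand-in modules and the 224×224 single-channel input size are assumptions for illustration only, not the application's actual networks.

```python
import torch
import torch.nn as nn

# Stand-in modules; in practice these are the trained encoder and the
# two progressive mapping generators described above.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(224 * 224, 96))  # 2D image -> feature vector
shape_generator = nn.Linear(96, 2048 * 3)                        # feature -> sparse cloud
detail_upsampler = nn.Linear(2048 * 3, 4096 * 3)                 # sparse -> dense cloud

image = torch.randn(1, 1, 224, 224)                       # single-channel 2D slice (assumed size)
feature = encoder(image)                                  # S101: encoded feature vector
sparse = shape_generator(feature).view(1, 2048, 3)        # S102: overall shape first
dense = detail_upsampler(sparse.flatten(1)).view(1, 4096, 3)  # S103: then structural detail
print(dense.shape)  # torch.Size([1, 4096, 3]) -> S104: input for 3D reconstruction
```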
It should be noted that the point cloud reconstruction model is trained in advance. Referring to FIG. 2, the step of pre-training the point cloud reconstruction model includes:

Step S201: acquire a 2D sample image and the real point cloud corresponding to the 2D sample image, and construct the point cloud reconstruction model to be trained.

Specifically, the point cloud reconstruction model includes the distribution feature encoding network, the first progressive mapping generator, and the second progressive mapping generator.

Step S202: extract a sample encoded feature vector from the 2D sample image using the distribution feature encoding network, and compute the KL divergence of the sample encoded feature vector.
Specifically, the KL divergence is computed as

$$L_{KL} = \sum_{x \in X} P(x)\,\log\frac{P(x)}{Q(x)}$$

where $L_{KL}$ denotes the KL divergence, $P$ denotes the probability distribution of $x$ under the generated distribution, $Q$ denotes the probability distribution of $x$ in the real space, and $X$ denotes the input features.
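For reference, a small sketch of this discrete KL computation, assuming P and Q are supplied as normalized histograms; the epsilon guard is an added numerical-safety assumption:

```python
import torch

def kl_divergence(p: torch.Tensor, q: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """L_KL = sum_x P(x) * log(P(x) / Q(x)) for two discrete distributions."""
    p = p / p.sum()
    q = q / q.sum()
    return (p * torch.log((p + eps) / (q + eps))).sum()

generated = torch.tensor([0.1, 0.4, 0.5])  # P: distribution under the generator
real = torch.tensor([0.2, 0.3, 0.5])       # Q: distribution in the real space
print(kl_divergence(generated, real))
```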
It should be understood that, before training, the 2D sample images also need to be preprocessed, including but not limited to operations such as cleaning and denoising. For example, when a 2D sample image is a brain MRI image, operations such as skull removal, neck removal, and slicing need to be performed, and a 2D slice $I_{H \times W}$ near the optimal plane is selected, where $H$ and $W$ are the length and width of the 2D image.

Step S203: feed the sample encoded feature vector into the first progressive mapping generator to generate a sample sparse point cloud, and compute the first chamfer distance and the first earth mover's distance between the sparse point cloud and the real point cloud using preset rules.
Specifically, the preset rules include a chamfer distance formula and an earth mover's distance formula, where the chamfer distance formula is

$$L_{CD} = \sum_{y' \in Y'} \min_{y \in Y} \left\| y' - y \right\|_2^2 + \sum_{y \in Y} \min_{y' \in Y'} \left\| y - y' \right\|_2^2$$

where $L_{CD}$ denotes the chamfer distance, $y'$ denotes a vector in $Y'$, and $y$ denotes a vector in $Y$.

The earth mover's distance formula is

$$L_{EMD} = \min_{\phi:\, Y \to Y'} \sum_{x \in Y} \left\| x - \phi(x) \right\|_2$$

where $L_{EMD}$ denotes the earth mover's distance, $Y$ denotes the real point cloud, $Y'$ denotes the generated point cloud, $x$ denotes a vector in $Y$, and $\phi$ ranges over the bijections from $Y$ to $Y'$.
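A brute-force sketch of both distances for small, equal-sized clouds follows; practical implementations use GPU kernels or approximate matching, and the exact bipartite matching via scipy is an illustrative choice:

```python
import torch
from scipy.optimize import linear_sum_assignment

def chamfer_distance(y_gen: torch.Tensor, y_real: torch.Tensor) -> torch.Tensor:
    """Symmetric chamfer distance between point clouds of shape (n, 3) and (m, 3)."""
    d = torch.cdist(y_gen, y_real) ** 2            # pairwise squared distances
    return d.min(dim=1).values.sum() + d.min(dim=0).values.sum()

def earth_mover_distance(y_gen: torch.Tensor, y_real: torch.Tensor) -> torch.Tensor:
    """Exact EMD for equal-sized clouds via optimal bipartite matching."""
    d = torch.cdist(y_real, y_gen)                 # (n, n) pairwise distances
    row, col = linear_sum_assignment(d.detach().numpy())
    return d[torch.as_tensor(row), torch.as_tensor(col)].sum()

a, b = torch.randn(64, 3), torch.randn(64, 3)
print(chamfer_distance(a, b), earth_mover_distance(a, b))
```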
Step S204: feed the sample sparse point cloud into the second progressive mapping generator to generate a sample high-density point cloud, and compute the second chamfer distance and the second earth mover's distance between the high-density point cloud and the real point cloud using the preset rules.

Step S205: inversely update the first progressive mapping generator with a first loss function constructed from the KL divergence, the first chamfer distance, and the first earth mover's distance, and inversely update the second progressive mapping generator with a second loss function constructed from the KL divergence, the second chamfer distance, and the second earth mover's distance.
Here, the first loss function includes the loss function of the tree-structured cross-domain multi-structure graph convolutional network and the loss function of the high-scale shape discrimination network.

The loss function of the tree-structured cross-domain multi-structure graph convolutional network is

$$L_{G1} = \lambda_1 L_{KL} + \lambda_2 L_{CD1} + \lambda_3 L_{EMD1} - \mathbb{E}_{Y'}\left[ D_1(Y') \right]$$

where $L_{G1}$ is the loss function of the tree-structured cross-domain multi-structure graph convolutional network; $\lambda_1$, $\lambda_2$, and $\lambda_3$ are adjustable parameters preset in the experiments; $L_{KL}$ is the KL divergence; $L_{CD1}$ is the first chamfer distance; $L_{EMD1}$ is the first earth mover's distance; and $\mathbb{E}_{Y'}[D_1(Y')]$ is the mathematical expectation of the discriminator's judgment of how real the point cloud generated by the generator is.

The loss function of the high-scale shape discrimination network is

$$L_{D1} = \mathbb{E}_{Y'}\left[ D_1(Y') \right] - \mathbb{E}_{Y}\left[ D_1(Y) \right] + \lambda_{gp}\,\mathbb{E}_{\hat{y}}\left[ \left( \left\| \nabla_{\hat{y}} D_1(\hat{y}) \right\|_2 - 1 \right)^2 \right]$$

where $L_{D1}$ is the loss function of the high-scale shape discrimination network; $\lambda_{gp}$ is an adjustable parameter preset in the experiments; $\mathbb{E}_{Y}[D_1(Y)]$ is the mathematical expectation of the discriminator's judgment of how real the real sample point cloud is; and the final term is the mathematical expectation of the discriminator's judgment of how real each point $\hat{y}$ of the generated point cloud is, written here in gradient-penalty form.
The second loss function includes the loss function of the stacked cross-domain multi-structure graph convolutional network and the loss function of the low-scale detail discrimination network.

The loss function of the stacked cross-domain multi-structure graph convolutional network is

$$L_{G2} = \lambda_1 L_{KL} + \lambda_2 L_{CD2} + \lambda_3 L_{EMD2} - \mathbb{E}_{Y'}\left[ D_2(Y') \right]$$

where $L_{G2}$ is the loss function of the stacked cross-domain multi-structure graph convolutional network; $\lambda_1$, $\lambda_2$, and $\lambda_3$ are adjustable parameters preset in the experiments; $L_{KL}$ is the KL divergence; $L_{CD2}$ is the second chamfer distance; $L_{EMD2}$ is the second earth mover's distance; and $\mathbb{E}_{Y'}[D_2(Y')]$ is the mathematical expectation of the discriminator's judgment of how real the point cloud generated by the generator is.

The loss function of the low-scale detail discrimination network is

$$L_{D2} = \mathbb{E}_{Y'}\left[ D_2(Y') \right] - \mathbb{E}_{Y}\left[ D_2(Y) \right] + \lambda_{gp}\,\mathbb{E}_{\hat{y}}\left[ \left( \left\| \nabla_{\hat{y}} D_2(\hat{y}) \right\|_2 - 1 \right)^2 \right]$$

where $L_{D2}$ is the loss function of the low-scale detail discrimination network; $\lambda_{gp}$ is an adjustable parameter preset in the experiments; $\mathbb{E}_{Y}[D_2(Y)]$ is the mathematical expectation of the discriminator's judgment of how real the real sample point cloud is; and the final term is the mathematical expectation of the discriminator's judgment of how real each point $\hat{y}$ of the generated point cloud is, written here in gradient-penalty form.
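Under the WGAN-GP reading of the formulas above, the loss assembly might be sketched as follows; the score tensors, distance values, lambda weights, and the precomputed gradient-penalty scalar are placeholders, not values from the application:

```python
import torch

def generator_loss(l_kl, l_cd, l_emd, d_fake_scores,
                   lam1=0.1, lam2=1.0, lam3=1.0):
    """L_G = lam1*L_KL + lam2*L_CD + lam3*L_EMD - E[D(generated)]."""
    return lam1 * l_kl + lam2 * l_cd + lam3 * l_emd - d_fake_scores.mean()

def discriminator_loss(d_real_scores, d_fake_scores, gradient_penalty, lam_gp=10.0):
    """L_D = E[D(generated)] - E[D(real)] + lam_gp * gradient penalty."""
    return d_fake_scores.mean() - d_real_scores.mean() + lam_gp * gradient_penalty

# Toy values only; in training these come from the networks and distance terms.
g = generator_loss(torch.tensor(0.2), torch.tensor(0.5), torch.tensor(0.3),
                   torch.rand(8))
d = discriminator_loss(torch.rand(8), torch.rand(8), torch.tensor(0.01))
print(g, d)
```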
In this embodiment, the trained point cloud reconstruction network is obtained by cyclically executing the above training process.

Further, after the point cloud reconstruction network is trained, the method also includes: testing the trained point cloud reconstruction network.

Specifically, the 2D sample images and the corresponding real point clouds are divided into a training set, a validation set, and a test set at a ratio of 80%:10%:10%. The 80% training set is used for model training and optimization according to the above steps, and during each training iteration the 10% validation set is used for validation. After the iterations end, the optimal model is selected according to the validation results; through the above steps, the trained point cloud reconstruction model is obtained.
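The 80%:10%:10% split can be sketched as follows; the index-shuffling scheme and seed are assumptions, since the original does not specify how samples are drawn:

```python
import random

def split_dataset(n_samples: int, seed: int = 0):
    """Shuffle indices and split 80% / 10% / 10% into train / val / test."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train, n_val = int(0.8 * n_samples), int(0.1 * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_dataset(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 800 100 100
```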
Preferably, the method for reconstructing a 3D image based on a 2D image in this embodiment is applicable to 3D modeling in brain-computer interface surgery. A multi-stage progressive mapping generation mechanism is proposed that reconstructs the point cloud progressively at different stages, so that each reconstruction stage can concentrate on depicting the overall outline of the brain or on fine-tuning the brain's detailed features, thereby reducing the reconstruction error.

In the method for reconstructing a 3D image based on a 2D image according to the embodiment of the present invention, after the encoded feature vector of the 2D image is extracted, it is fed into the first progressive mapping generator, which performs construction in terms of overall shape to generate a sparse point cloud; the sparse point cloud is then fed into the second progressive mapping generator, which performs construction in terms of structural detail to obtain a high-density point cloud, and the 3D image is reconstructed from the high-density point cloud. Under this multi-stage progressive mapping generation mechanism, reconstruction first takes place on the overall shape, yielding a sparse point cloud, which is then upsampled at the level of structural detail into a high-density point cloud, so that the 3D image constructed from the high-density point cloud is more accurate.
FIG. 3 is a schematic diagram of the functional modules of an apparatus for reconstructing a 3D image based on a 2D image according to an embodiment of the present invention. As shown in FIG. 3, the apparatus 40 includes:

a feature extraction module 41, configured to extract an encoded feature vector of a 2D image using the distribution feature encoding network of a pre-trained point cloud reconstruction model, the point cloud reconstruction model including the distribution feature encoding network, a first progressive mapping generator for shape processing, and a second progressive mapping generator for structural detail processing;

a first generation module 42, configured to process the encoded feature vector with the first progressive mapping generator to generate a sparse point cloud;

a second generation module 43, configured to process the sparse point cloud with the second progressive mapping generator to generate a high-density point cloud; and

a construction module 44, configured to construct a 3D image from the high-density point cloud.
Optionally, the first progressive mapping generator includes a tree-structured cross-domain multi-structure graph convolutional network and a high-scale shape discrimination network. The operation in which the first generation module 42 processes the encoded feature vector with the first progressive mapping generator to generate a sparse point cloud may also be: feeding the encoded feature vector into the tree-structured cross-domain multi-structure graph convolutional network to obtain a first initial point cloud; and using the high-scale shape discrimination network to judge the degree of shape completeness of the first initial point cloud at a first preset scale and guide the tree-structured cross-domain multi-structure graph convolutional network to optimize the first initial point cloud, obtaining the sparse point cloud.

Optionally, the second progressive mapping generator includes a stacked cross-domain multi-structure graph convolutional network and a low-scale detail discrimination network. The operation in which the second generation module 43 processes the sparse point cloud with the second progressive mapping generator to generate a high-density point cloud may also be: upsampling the sparse point cloud with the stacked cross-domain multi-structure graph convolutional network to generate a second initial point cloud; and using the low-scale detail discrimination network to judge the structural credibility of the second initial point cloud at a second preset scale and guide the stacked cross-domain multi-structure graph convolutional network to optimize the second initial point cloud, obtaining the high-density point cloud, where the precision of the second preset scale is higher than that of the first preset scale.

Optionally, the apparatus further includes a training module for pre-training the point cloud reconstruction model. The operation of training the point cloud reconstruction model specifically includes: acquiring a 2D sample image and the real point cloud corresponding to the 2D sample image, and constructing the point cloud reconstruction model to be trained; extracting a sample encoded feature vector from the 2D sample image using the distribution feature encoding network, and computing the KL divergence of the sample encoded feature vector; feeding the sample encoded feature vector into the first progressive mapping generator to generate a sample sparse point cloud, and computing the first chamfer distance and the first earth mover's distance between the sparse point cloud and the real point cloud using preset rules; feeding the sample sparse point cloud into the second progressive mapping generator to generate a sample high-density point cloud, and computing the second chamfer distance and the second earth mover's distance between the high-density point cloud and the real point cloud using the preset rules; and inversely updating the first progressive mapping generator with a first loss function constructed from the KL divergence, the first chamfer distance, and the first earth mover's distance, and inversely updating the second progressive mapping generator with a second loss function constructed from the KL divergence, the second chamfer distance, and the second earth mover's distance.
Optionally, the first loss function includes the loss function of the tree-structured cross-domain multi-structure graph convolutional network and the loss function of the high-scale shape discrimination network.

The loss function of the tree-structured cross-domain multi-structure graph convolutional network is

$$L_{G1} = \lambda_1 L_{KL} + \lambda_2 L_{CD1} + \lambda_3 L_{EMD1} - \mathbb{E}_{Y'}\left[ D_1(Y') \right]$$

where $L_{G1}$ is the loss function of the tree-structured cross-domain multi-structure graph convolutional network; $\lambda_1$, $\lambda_2$, and $\lambda_3$ are adjustable parameters preset in the experiments; $L_{KL}$ is the KL divergence; $L_{CD1}$ is the first chamfer distance; $L_{EMD1}$ is the first earth mover's distance; and $\mathbb{E}_{Y'}[D_1(Y')]$ is the mathematical expectation of the discriminator's judgment of how real the point cloud generated by the generator is.

The loss function of the high-scale shape discrimination network is

$$L_{D1} = \mathbb{E}_{Y'}\left[ D_1(Y') \right] - \mathbb{E}_{Y}\left[ D_1(Y) \right] + \lambda_{gp}\,\mathbb{E}_{\hat{y}}\left[ \left( \left\| \nabla_{\hat{y}} D_1(\hat{y}) \right\|_2 - 1 \right)^2 \right]$$

where $L_{D1}$ is the loss function of the high-scale shape discrimination network; $\lambda_{gp}$ is an adjustable parameter preset in the experiments; $\mathbb{E}_{Y}[D_1(Y)]$ is the mathematical expectation of the discriminator's judgment of how real the real sample point cloud is; and the final term is the mathematical expectation of the discriminator's judgment of how real each point $\hat{y}$ of the generated point cloud is, written here in gradient-penalty form.
Optionally, the second loss function includes the loss function of the stacked cross-domain multi-structure graph convolutional network and the loss function of the low-scale detail discrimination network.

The loss function of the stacked cross-domain multi-structure graph convolutional network is

$$L_{G2} = \lambda_1 L_{KL} + \lambda_2 L_{CD2} + \lambda_3 L_{EMD2} - \mathbb{E}_{Y'}\left[ D_2(Y') \right]$$

where $L_{G2}$ is the loss function of the stacked cross-domain multi-structure graph convolutional network; $\lambda_1$, $\lambda_2$, and $\lambda_3$ are adjustable parameters preset in the experiments; $L_{KL}$ is the KL divergence; $L_{CD2}$ is the second chamfer distance; $L_{EMD2}$ is the second earth mover's distance; and $\mathbb{E}_{Y'}[D_2(Y')]$ is the mathematical expectation of the discriminator's judgment of how real the point cloud generated by the generator is.

The loss function of the low-scale detail discrimination network is

$$L_{D2} = \mathbb{E}_{Y'}\left[ D_2(Y') \right] - \mathbb{E}_{Y}\left[ D_2(Y) \right] + \lambda_{gp}\,\mathbb{E}_{\hat{y}}\left[ \left( \left\| \nabla_{\hat{y}} D_2(\hat{y}) \right\|_2 - 1 \right)^2 \right]$$

where $L_{D2}$ is the loss function of the low-scale detail discrimination network; $\lambda_{gp}$ is an adjustable parameter preset in the experiments; $\mathbb{E}_{Y}[D_2(Y)]$ is the mathematical expectation of the discriminator's judgment of how real the real sample point cloud is; and the final term is the mathematical expectation of the discriminator's judgment of how real each point $\hat{y}$ of the generated point cloud is, written here in gradient-penalty form.
Optionally, the distribution feature encoding network is composed of an efficient residual convolutional neural network and a self-attention network.

For other details of the technical solutions implemented by the modules of the apparatus for reconstructing a 3D image based on a 2D image in the above embodiment, reference may be made to the description of the method for reconstructing a 3D image based on a 2D image in the above embodiment, which is not repeated here.

It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar between embodiments, reference may be made to one another. As the apparatus embodiments are substantially similar to the method embodiments, their description is relatively brief, and for the relevant points reference may be made to the partial description of the method embodiments.

Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a computer device according to an embodiment of the present invention. As shown in FIG. 4, the computer device 60 includes a processor 61 and a memory 62 coupled to the processor 61. The memory 62 stores program instructions which, when executed by the processor 61, cause the processor 61 to perform the steps of the method for reconstructing a 3D image based on a 2D image described in any of the above embodiments.

The processor 61 may also be called a CPU (Central Processing Unit). The processor 61 may be an integrated circuit chip with signal processing capability. The processor 61 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor, and so on.

Referring to FIG. 5, FIG. 5 is a schematic structural diagram of a storage medium according to an embodiment of the present invention. The storage medium of the embodiment of the present invention stores program instructions 71 capable of implementing all of the above methods. The program instructions 71 may be stored in the above storage medium in the form of a software product and include several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc, or computer devices such as computers, servers, mobile phones, and tablets.

In the several embodiments provided in the present application, it should be understood that the disclosed computer device, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division into units is only a division by logical function, and there may be other divisions in actual implementation, for example multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit. The above is only an implementation of the present application and does not thereby limit the patent scope of the present application; any equivalent structural or process transformation made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (10)

1. A method for reconstructing a 3D image based on a 2D image, comprising:

    extracting an encoded feature vector of a 2D image using a distribution feature encoding network of a pre-trained point cloud reconstruction model, wherein the point cloud reconstruction model comprises the distribution feature encoding network, a first progressive mapping generator for shape processing, and a second progressive mapping generator for structural detail processing;

    processing the encoded feature vector with the first progressive mapping generator to generate a sparse point cloud;

    processing the sparse point cloud with the second progressive mapping generator to generate a high-density point cloud; and

    constructing a 3D image from the high-density point cloud.
2. The method for reconstructing a 3D image based on a 2D image according to claim 1, wherein the first progressive mapping generator comprises a tree-structured cross-domain multi-structure graph convolutional network and a high-scale shape discrimination network, and the processing the encoded feature vector with the first progressive mapping generator to generate a sparse point cloud comprises:

    feeding the encoded feature vector into the tree-structured cross-domain multi-structure graph convolutional network to obtain a first initial point cloud; and

    using the high-scale shape discrimination network to judge the degree of shape completeness of the first initial point cloud at a first preset scale, and guiding the tree-structured cross-domain multi-structure graph convolutional network to optimize the first initial point cloud to obtain the sparse point cloud.
3. The method for reconstructing a 3D image based on a 2D image according to claim 2, wherein the second progressive mapping generator comprises a stacked cross-domain multi-structure graph convolutional network and a low-scale detail discrimination network, and the processing the sparse point cloud with the second progressive mapping generator to generate a high-density point cloud comprises:

    upsampling the sparse point cloud with the stacked cross-domain multi-structure graph convolutional network to generate a second initial point cloud; and

    using the low-scale detail discrimination network to judge the structural credibility of the second initial point cloud at a second preset scale, and guiding the stacked cross-domain multi-structure graph convolutional network to optimize the second initial point cloud to obtain the high-density point cloud, wherein the precision of the second preset scale is higher than that of the first preset scale.
4. The method for reconstructing a 3D image based on a 2D image according to claim 1, wherein the step of pre-training the point cloud reconstruction model comprises:

    acquiring a 2D sample image and a real point cloud corresponding to the 2D sample image, and constructing the point cloud reconstruction model to be trained;

    extracting a sample encoded feature vector from the 2D sample image using the distribution feature encoding network, and computing a KL divergence of the sample encoded feature vector;

    feeding the sample encoded feature vector into the first progressive mapping generator to generate a sample sparse point cloud, and computing a first chamfer distance and a first earth mover's distance between the sparse point cloud and the real point cloud using preset rules;

    feeding the sample sparse point cloud into the second progressive mapping generator to generate a sample high-density point cloud, and computing a second chamfer distance and a second earth mover's distance between the high-density point cloud and the real point cloud using the preset rules; and

    inversely updating the first progressive mapping generator with a first loss function constructed from the KL divergence, the first chamfer distance, and the first earth mover's distance, and inversely updating the second progressive mapping generator with a second loss function constructed from the KL divergence, the second chamfer distance, and the second earth mover's distance.
5. The method for reconstructing a 3D image based on a 2D image according to claim 4, wherein the first loss function comprises a loss function of the tree-structured cross-domain multi-structure graph convolutional network and a loss function of the high-scale shape discrimination network;

    the loss function of the tree-structured cross-domain multi-structure graph convolutional network is

    $$L_{G1} = \lambda_1 L_{KL} + \lambda_2 L_{CD1} + \lambda_3 L_{EMD1} - \mathbb{E}_{Y'}\left[ D_1(Y') \right]$$

    wherein $L_{G1}$ is the loss function of the tree-structured cross-domain multi-structure graph convolutional network, $\lambda_1$, $\lambda_2$, and $\lambda_3$ are adjustable parameters preset in experiments, $L_{KL}$ is the KL divergence, $L_{CD1}$ is the first chamfer distance, $L_{EMD1}$ is the first earth mover's distance, and $\mathbb{E}_{Y'}[D_1(Y')]$ is the mathematical expectation of the discriminator's judgment of how real the point cloud generated by the generator is; and

    the loss function of the high-scale shape discrimination network is

    $$L_{D1} = \mathbb{E}_{Y'}\left[ D_1(Y') \right] - \mathbb{E}_{Y}\left[ D_1(Y) \right] + \lambda_{gp}\,\mathbb{E}_{\hat{y}}\left[ \left( \left\| \nabla_{\hat{y}} D_1(\hat{y}) \right\|_2 - 1 \right)^2 \right]$$

    wherein $L_{D1}$ is the loss function of the high-scale shape discrimination network, $\lambda_{gp}$ is an adjustable parameter preset in experiments, $\mathbb{E}_{Y}[D_1(Y)]$ is the mathematical expectation of the discriminator's judgment of how real the real sample point cloud is, and the final term is the mathematical expectation of the discriminator's judgment of how real each point $\hat{y}$ of the generated point cloud is, written here in gradient-penalty form.
6. The method for reconstructing a 3D image based on a 2D image according to claim 4, wherein the second loss function comprises a loss function of the stacked cross-domain multi-structure graph convolutional network and a loss function of the low-scale detail discrimination network;

    the loss function of the stacked cross-domain multi-structure graph convolutional network is

    $$L_{G2} = \lambda_1 L_{KL} + \lambda_2 L_{CD2} + \lambda_3 L_{EMD2} - \mathbb{E}_{Y'}\left[ D_2(Y') \right]$$

    wherein $L_{G2}$ is the loss function of the stacked cross-domain multi-structure graph convolutional network, $\lambda_1$, $\lambda_2$, and $\lambda_3$ are adjustable parameters preset in experiments, $L_{KL}$ is the KL divergence, $L_{CD2}$ is the second chamfer distance, $L_{EMD2}$ is the second earth mover's distance, and $\mathbb{E}_{Y'}[D_2(Y')]$ is the mathematical expectation of the discriminator's judgment of how real the point cloud generated by the generator is; and

    the loss function of the low-scale detail discrimination network is

    $$L_{D2} = \mathbb{E}_{Y'}\left[ D_2(Y') \right] - \mathbb{E}_{Y}\left[ D_2(Y) \right] + \lambda_{gp}\,\mathbb{E}_{\hat{y}}\left[ \left( \left\| \nabla_{\hat{y}} D_2(\hat{y}) \right\|_2 - 1 \right)^2 \right]$$

    wherein $L_{D2}$ is the loss function of the low-scale detail discrimination network, $\lambda_{gp}$ is an adjustable parameter preset in experiments, $\mathbb{E}_{Y}[D_2(Y)]$ is the mathematical expectation of the discriminator's judgment of how real the real sample point cloud is, and the final term is the mathematical expectation of the discriminator's judgment of how real each point $\hat{y}$ of the generated point cloud is, written here in gradient-penalty form.
7. The method for reconstructing a 3D image based on a 2D image according to claim 1, wherein the distribution feature encoding network is composed of an efficient residual convolutional neural network and a self-attention network.
8. An apparatus for reconstructing a 3D image based on a 2D image, comprising:

    a feature extraction module, configured to extract an encoded feature vector of a 2D image using a distribution feature encoding network of a pre-trained point cloud reconstruction model, wherein the point cloud reconstruction model comprises the distribution feature encoding network, a first progressive mapping generator for shape processing, and a second progressive mapping generator for structural detail processing;

    a first generation module, configured to process the encoded feature vector with the first progressive mapping generator to generate a sparse point cloud;

    a second generation module, configured to process the sparse point cloud with the second progressive mapping generator to generate a high-density point cloud; and

    a construction module, configured to construct a 3D image from the high-density point cloud.
9. A computer device, comprising a processor and a memory coupled to the processor, wherein the memory stores program instructions which, when executed by the processor, cause the processor to perform the steps of the method for reconstructing a 3D image based on a 2D image according to any one of claims 1-7.
10. A storage medium, storing program instructions capable of implementing the method for reconstructing a 3D image based on a 2D image according to any one of claims 1-7.
PCT/CN2022/071269 2022-01-11 2022-01-11 Method and apparatus for reconstructing 3d image on the basis of 2d image, device, and storage medium WO2023133675A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/071269 WO2023133675A1 (en) 2022-01-11 2022-01-11 Method and apparatus for reconstructing 3d image on the basis of 2d image, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/071269 WO2023133675A1 (en) 2022-01-11 2022-01-11 Method and apparatus for reconstructing 3d image on the basis of 2d image, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023133675A1 2023-07-20

Family

ID=87279868

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/071269 WO2023133675A1 (en) 2022-01-11 2022-01-11 Method and apparatus for reconstructing 3d image on the basis of 2d image, device, and storage medium

Country Status (1)

Country Link
WO (1) WO2023133675A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109389671A (en) * 2018-09-25 2019-02-26 南京大学 A kind of single image three-dimensional rebuilding method based on multistage neural network
CN111598998A (en) * 2020-05-13 2020-08-28 腾讯科技(深圳)有限公司 Three-dimensional virtual model reconstruction method and device, computer equipment and storage medium
WO2021232687A1 (en) * 2020-05-19 2021-11-25 华南理工大学 Deep learning-based point cloud upsampling method
CN112258625A (en) * 2020-09-18 2021-01-22 山东师范大学 Single image to three-dimensional point cloud model reconstruction method and system based on attention mechanism
CN112598790A (en) * 2021-01-08 2021-04-02 中国科学院深圳先进技术研究院 Brain structure three-dimensional reconstruction method and device and terminal equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BOWEN HU; BAIYING LEI; SHUQIANG WANG; YONG LIU; BINGCHUAN WANG; MIN GAN; YANYAN SHEN: "3D Brain Reconstruction by Hierarchical Shape-Perception Network from a Single Incomplete Image", ARXIV.ORG, 12 October 2021 (2021-10-12), XP091068180 *

Similar Documents

Publication Publication Date Title
Xie et al. Point clouds learning with attention-based graph convolution networks
JP7182021B2 (en) KEYPOINT DETECTION METHOD, KEYPOINT DETECTION DEVICE, ELECTRONIC DEVICE, AND STORAGE MEDIUM
WO2015139574A1 (en) Static object reconstruction method and system
WO2023134063A1 (en) Comparative learning-based method, apparatus, and device for predicting properties of drug molecule
JP2019076699A (en) Nodule detection with false positive reduction
WO2016183464A1 (en) Deepstereo: learning to predict new views from real world imagery
US20220222925A1 (en) Artificial intelligence-based image processing method and apparatus, device, and storage medium
WO2023151237A1 (en) Face pose estimation method and apparatus, electronic device, and storage medium
JP7443647B2 (en) Keypoint detection and model training method, apparatus, device, storage medium, and computer program
KR102188732B1 (en) System and Method for Data Processing using Sphere Generative Adversarial Network Based on Geometric Moment Matching
CN114298997B (en) Fake picture detection method, fake picture detection device and storage medium
US20210365718A1 (en) Object functionality predication methods, computer device, and storage medium
CN109948575A (en) Eyeball dividing method in ultrasound image
Kaul et al. FatNet: A feature-attentive network for 3D point cloud processing
CN111091010A (en) Similarity determination method, similarity determination device, network training device, network searching device and storage medium
CN115830375A (en) Point cloud classification method and device
CN113127697B (en) Method and system for optimizing graph layout, electronic device and readable storage medium
CN111192320A (en) Position information determining method, device, equipment and storage medium
CN114782503A (en) Point cloud registration method and system based on multi-scale feature similarity constraint
CN114092653A (en) Method, device and equipment for reconstructing 3D image based on 2D image and storage medium
WO2023133675A1 (en) Method and apparatus for reconstructing 3d image on the basis of 2d image, device, and storage medium
WO2024021641A1 (en) Blood vessel segmentation method and apparatus, device, and storage medium
Wu et al. Active 3-D shape cosegmentation with graph convolutional networks
CN115170599A (en) Method and device for vessel segmentation through link prediction of graph neural network
Bergamasco et al. A bipartite graph approach to retrieve similar 3D models with different resolution and types of cardiomyopathies

Legal Events

Date Code Title Description
121 Ep: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22919341

Country of ref document: EP

Kind code of ref document: A1