CN113920243A - Three-dimensional reconstruction method and device of brain structure in extreme environment and readable storage medium - Google Patents


Info

Publication number
CN113920243A
CN113920243A (application number CN202111108509.4A)
Authority
CN
China
Prior art keywords
point cloud
modules
model
layer
information
Prior art date
Legal status
Pending
Application number
CN202111108509.4A
Other languages
Chinese (zh)
Inventor
王书强
胡博闻
申妍燕
Current Assignee
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN202111108509.4A priority Critical patent/CN113920243A/en
Publication of CN113920243A publication Critical patent/CN113920243A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 30/00 - ICT specially adapted for the handling or processing of medical images

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a three-dimensional reconstruction method and device for a brain structure in an extreme environment and a readable storage medium, relates to the field of image processing, and aims to reconstruct a complete three-dimensional structure from incomplete images. The method comprises the following steps: acquiring a two-dimensional image of a target object, wherein the two-dimensional image presents a partial region of the target object; converting the two-dimensional image into a first point cloud, wherein the first point cloud is used for describing the three-dimensional structure of the partial region; and processing the first point cloud through a trained point cloud completion model to obtain a second point cloud, wherein the second point cloud is used for describing the overall three-dimensional structure of the target object. The point cloud completion model comprises point cloud compression modules of N levels and point cloud expansion modules of N levels; the point cloud compression modules of the N levels are used for compressing the first point cloud to obtain compressed information of N different resolutions, and the point cloud expansion modules of the N levels are used for reconstructing the compressed information of the N different resolutions to obtain the second point cloud, where N ≥ 2 and N is an integer.

Description

Three-dimensional reconstruction method and device of brain structure in extreme environment and readable storage medium
Technical Field
The application belongs to the field of image processing, and particularly relates to a three-dimensional reconstruction method and device of a brain structure in an extreme environment and a readable storage medium.
Background
With the continuous development of medical technology, minimally invasive surgery and robot-guided interventional techniques have gradually been applied to brain surgery; owing to smaller surgical wounds and shorter recovery times, they provide patients with a better treatment experience. However, because the lesion site cannot be observed directly during such an operation, the surgeon cannot operate on the target under direct vision.
At present, an image of the brain can be acquired by an optical sensing device and reconstructed to obtain the three-dimensional structure of the brain, thereby assisting the doctor in observing the lesion site and completing the operation. However, the lighting environment of the optical sensing device and visual pollution that may occur during the operation (for example, local bleeding) may cause the image collected by the optical sensing device to be a defective image (i.e. a two-dimensional image showing only a partial region of the brain instead of the complete brain). The three-dimensional structure reconstructed from such an image is incomplete, which affects the doctor's observation and judgment.
Disclosure of Invention
Embodiments of the present application provide a method, an apparatus, a device and a readable storage medium for three-dimensional reconstruction of a brain structure in an extreme environment, which can solve the problem of how to reconstruct a complete three-dimensional structure from incomplete images.
In a first aspect, an embodiment of the present application provides a three-dimensional reconstruction method, including: acquiring a two-dimensional image of a target object, wherein the two-dimensional image presents a partial region of the target object; converting the two-dimensional image into a first point cloud, wherein the first point cloud is used for describing the three-dimensional structure of the partial area; processing the first point cloud through the trained point cloud completion model to obtain a second point cloud, wherein the second point cloud is used for describing the overall three-dimensional structure of the target object; the point cloud complementing model comprises N levels of point cloud compression modules and N levels of point cloud expansion modules, the N levels of point cloud compression modules are used for compressing a first point cloud to obtain N pieces of compressed information with different resolutions, the N levels of point cloud expansion modules are used for reconstructing the N pieces of compressed information with different resolutions to obtain a second point cloud, N is not less than 2, and N is an integer.
Based on the three-dimensional reconstruction method provided by the application, after the incomplete two-dimensional image is converted into the first point cloud, the first point cloud is complemented by using the point cloud complementing model, and the second point cloud is obtained. The point cloud complementing model can extract compressed information of different resolutions in the first point cloud, perform point cloud expansion based on the compressed information of different resolutions, and reconstruct to obtain a second point cloud capable of describing the overall three-dimensional structure of the target object. Therefore, the method provided by the application can be used for reconstructing the complete three-dimensional structure of the target object based on the incomplete image of the target object.
In a possible implementation manner, the input information of the point cloud compression module at the 1st layer is the first point cloud, and the output information of the former of any two adjacent layers of point cloud compression modules is the input information of the latter; the input information of the point cloud expansion module at the 1st layer is the output information of the point cloud compression modules at the Nth layer and the (N-1)th layer; the input information of the point cloud expansion module at the mth layer is the output information of the point cloud compression module at the (N-m)th layer and the output information of the point cloud expansion module at the (m-1)th layer; the input information of the point cloud expansion module at the Nth layer is the output information of the point cloud expansion module at the (N-1)th layer, and the output information of the point cloud expansion module at the Nth layer is the second point cloud, where 2 ≤ m < N and m is an integer.
In one possible implementation manner, the point cloud expansion module includes a plurality of dynamic information gate modules, a plurality of fully connected layers are arranged between the dynamic information gate modules, and the dynamic information gate modules are used for performing attention mechanism calculation on the input information of the dynamic information gate modules.
In one possible implementation, the point cloud compression module is a PointNet++ network structure.
In one possible implementation, converting the two-dimensional image into the first point cloud includes: and converting the two-dimensional image into a first point cloud by using the trained point cloud reconstruction model.
In one possible implementation manner, the point cloud reconstruction model includes a ResNet encoder and a graph convolution neural network, which are sequentially connected, the graph convolution neural network includes a plurality of sets of graph convolution modules and branch modules, which are alternately arranged, the graph convolution modules are used for adjusting position coordinates of the point cloud, and the branch modules are used for expanding the number of the point clouds.
In one possible implementation manner, the point cloud completion model and the point cloud reconstruction model are trained as follows: constructing a completion integration initial model, which comprises a point cloud reconstruction initial model, a discriminator and a point cloud completion initial model; and performing adversarial training on the completion integration initial model according to a preset loss function and a training set, so as to train the point cloud reconstruction initial model into the point cloud reconstruction model and train the point cloud completion initial model into the point cloud completion model. The training set comprises a plurality of two-dimensional image samples presenting partial regions of object samples, and a partial-region point cloud sample and a whole point cloud sample corresponding to each two-dimensional image sample; the loss function includes loss calculations based on the relative entropy, the chamfer distance and the bulldozer (earth mover's) distance.
In a second aspect, an embodiment of the present application provides a three-dimensional reconstruction apparatus, including:
an acquisition unit configured to acquire a two-dimensional image of a target object, the two-dimensional image representing a partial region of the target object;
the conversion unit is used for converting the two-dimensional image into a first point cloud, and the first point cloud is used for describing the three-dimensional structure of the partial area;
the completion unit is used for processing the first point cloud through the trained point cloud completion model to obtain a second point cloud, and the second point cloud is used for describing the overall three-dimensional structure of the target object; the point cloud complementing model comprises N levels of point cloud compression modules and N levels of point cloud expansion modules, the N levels of point cloud compression modules are used for compressing a first point cloud to obtain N pieces of compressed information with different resolutions, the N levels of point cloud expansion modules are used for reconstructing the N pieces of compressed information with different resolutions to obtain a second point cloud, N is not less than 2, and N is an integer.
In a third aspect, an embodiment of the present application provides a terminal device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the method of the first aspect or any optional manner of the first aspect is implemented.
In a fourth aspect, this application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of the first aspect or any alternative form of the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer program product, which, when run on a terminal device, causes the terminal device to execute the method of the first aspect or any alternative manner of the first aspect.
It is understood that the beneficial effects of the second aspect to the fifth aspect can be referred to the related description of the first aspect, and are not described herein again.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained by those of ordinary skill in the art based on these drawings without creative effort.
Fig. 1 is a schematic flowchart of an embodiment of a three-dimensional reconstruction method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a network structure of a completion integration model provided in the present application;
FIG. 3 is a schematic diagram of a network structure of a point cloud expansion module provided in the present application;
FIG. 4 is a schematic diagram of a network structure of a completion integration initial model provided in the present application;
fig. 5 is a schematic structural diagram of a three-dimensional reconstruction apparatus provided in the present application;
fig. 6 is a schematic structural diagram of a terminal device provided in the present application.
Detailed Description
A point cloud is a data structure that describes the shape of a specific object in three-dimensional space, represented as a set of scattered points. For example, given a three-dimensional structure of the brain, its point cloud can be written as a matrix Y of size y × 3, where y represents the number of points and 3 represents the three-dimensional coordinates of each point. In particular, swapping any two rows of the matrix only exchanges the storage locations of two points in the point cloud; because a point cloud is an unordered set, all of its properties remain unchanged. A point cloud has the advantages of low space complexity, a simple storage form and high computational performance. Compared with the flat space of a two-dimensional image, a point cloud contains more spatial structure information and can provide more visual information to a doctor, thereby assisting the doctor in making better diagnosis and treatment decisions. Therefore, reconstructing a two-dimensional image into an accurate and clear point cloud can provide additional visual and diagnostic information for the doctor and assist in making real-time decisions.
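As an informal illustration (not part of the patent text), the following NumPy sketch shows a point cloud stored as a y × 3 matrix and demonstrates that permuting its rows leaves the described shape unchanged; the array names and sizes are illustrative only.

```python
import numpy as np

# A point cloud with y points, stored as a y x 3 matrix of (x, y, z) coordinates.
y = 2048
point_cloud = np.random.rand(y, 3).astype(np.float32)

# Swapping rows only changes where individual points are stored, not the shape that
# the unordered set of points describes.
permuted = point_cloud[np.random.permutation(y)]

# Both matrices contain exactly the same set of points.
same_points = set(map(tuple, point_cloud.tolist())) == set(map(tuple, permuted.tolist()))
print(same_points)  # True
```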
For example, in the observation and analysis of the brain, a two-dimensional image of the brain (e.g., a Magnetic Resonance Imaging (MRI) image, a Computed Tomography (CT) image, etc.) may be acquired by an optical sensing apparatus, and then the two-dimensional image is reconstructed to obtain a point cloud of the brain to assist a doctor in observing a three-dimensional structure of the brain, thereby analyzing a location of a lesion. However, in some extreme environments, such as abnormal light environment, or local bleeding of the brain during operation, the two-dimensional image collected by the optical sensing device may be a defective image that presents two-dimensional information of a partial region of the brain instead of complete two-dimensional information. This results in an incomplete reconstruction of the three-dimensional structure of the brain, which can affect the physician's observation and judgment.
Aiming at the problem that the three-dimensional structure obtained by reconstructing the incomplete image is incomplete, the application provides a three-dimensional reconstruction method, after the incomplete two-dimensional image is converted into first point cloud, the first point cloud is complemented by using a point cloud complementing model, and second point cloud is obtained. The point cloud complementing model can extract compressed information of different resolutions in the first point cloud, perform point cloud expansion based on the compressed information of different resolutions, and reconstruct to obtain a second point cloud capable of describing the overall three-dimensional structure of the target object. Therefore, the complete three-dimensional structure of the target object is reconstructed based on the incomplete image of the target object.
The three-dimensional reconstruction method provided by the present application is exemplarily described below with reference to specific embodiments.
Referring to fig. 1, which is a flowchart of an embodiment of a three-dimensional reconstruction method provided in an embodiment of the present application, the execution subject of the method may be an image data acquisition device, such as a positron emission tomography (PET) device, a CT device, an MRI device, a diffusion tensor imaging (DTI) device, a functional magnetic resonance imaging (FMRI) device or a camera. It may also be another terminal device, such as a control device, a computer, a robot or a mobile terminal, that acquires the two-dimensional image from the image data acquisition device. As shown in fig. 1, the method includes:
s101, acquiring a two-dimensional image of the target object, wherein the two-dimensional image presents a partial region of the target object.
The target object may be a human or animal body, or an organ of the human or animal body, such as a brain, a heart, a lung, or the like. Other living or non-living entities are also possible. The two-dimensional image may be a PET image, an MRI image, a CT image, a DTI image, an FMRI image, or an image taken by a camera.
In this embodiment, the two-dimensional image presents only a partial region of the target object; that is, due to an extreme environment or another special reason, the captured two-dimensional image contains the two-dimensional structure information of only a partial region of the target object.
And S102, converting the two-dimensional image into a first point cloud, wherein the first point cloud is used for describing the three-dimensional structure of the partial area.
For example, the two-dimensional image can be converted into a point cloud by a neural network model, or by a traditional depth-camera-based algorithm.
In one example, the neural network model may be a point cloud reconstruction model as shown in fig. 2, including a ResNet encoder and a graph convolution neural network.
The ResNet encoder is used to quantize the two-dimensional image into a feature vector that follows a Gaussian distribution with a certain mean μ and variance σ, to randomly sample a 96-dimensional encoding feature vector z from it, and to transmit the encoding feature vector z to the graph convolutional neural network. The encoding feature vector serves as the initial point cloud input to the graph convolutional neural network: it contains a single point whose coordinate dimension is 96.
The graph convolutional neural network comprises a plurality of alternately arranged branching modules and graph convolution modules. A branching module maps one point into several points, so the single initial point can be gradually expanded into a target number of points through the multiple branching modules. A graph convolution module adjusts the position coordinates of each point, raising or lowering the coordinate dimension of its input, so that through the multiple graph convolution modules the coordinate dimension is gradually reduced from 96 to 3. Therefore, through the alternately arranged graph convolution modules and branching modules, the graph convolutional neural network finally generates a first point cloud with a specific number of points, each having 3-dimensional position coordinates.
Wherein the branching module obeys formula (1):
p_i^l → { p_i^{l+1}, p_{i+1}^{l+1}, …, p_{i+n}^{l+1} }   (1)
In formula (1), p_i^l represents the ith point in the lth layer of the graph convolutional neural network; p_i^{l+1}, p_{i+1}^{l+1}, …, p_{i+n}^{l+1} represent the ith, (i+1)th, …, (i+n)th points in the (l+1)th layer of the graph convolutional neural network, respectively.
That is, in this embodiment, the branching module copies the coordinates of each point in the upper layer n times. If the upper layer has a points (i ∈ {1, …, a}) and the coordinates of each point are copied n times, the branching module of this layer expands the number of points to a × n and transmits the a × n point coordinates to the next layer. If the graph convolutional neural network comprises b branching modules (b ≥ 1, b a positive integer) and every branching module has the same expansion factor n, then after the ResNet encoder inputs one initial point into the graph convolutional neural network, each branching module copies the coordinates of each point n times, and the predicted first point cloud finally generated by the graph convolutional neural network contains n^b points.
Of course, the expansion factor of each branching module may also be different. For example, if the expansion factor of the first-layer branching module is 5, it expands the single initial point input by the ResNet encoder into 5 points; if the expansion factor of the second-layer branching module is 10, the second layer expands the 5 received points into 50 points.
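As a hedged sketch of the copy-and-expand behaviour described above (PyTorch is assumed; the function name and shapes are illustrative, not taken from the patent), a branching step might look as follows:

```python
import torch

def branching(points: torch.Tensor, n: int) -> torch.Tensor:
    """Copy the coordinates of each point n times, expanding an (a, d) layer into (a * n, d)."""
    a, d = points.shape
    # Point i of layer l becomes n identical points in layer l + 1.
    return points.unsqueeze(1).expand(a, n, d).reshape(a * n, d)

# Example: one 96-dimensional initial point expanded by b = 3 branching modules of factor n = 8.
out = torch.randn(1, 96)        # initial point cloud produced by the ResNet encoder
for _ in range(3):
    out = branching(out, n=8)
print(out.shape)                # torch.Size([512, 96]) -> n**b = 8**3 = 512 points
```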
The graph convolution module obeys formula (2):
p_i^{l+1} = σ( F_K^l(p_i^l) + Σ_{q_j ∈ A(p_i^l)} S^l · U^l(q_j) + b^l )   (2)
In formula (2), F_K^l denotes the K perceptrons in the lth layer; U^l is a fully connected layer representing the mapping relation between the nodes of the lth layer and the nodes of the (l+1)th layer; p_i^l represents the ith node in the lth layer; A(p_i^l) is the set of all nodes of levels 1 to l-1 corresponding to p_i^l (i.e. its ancestor nodes); S^l is a sparse matrix; the term S^l · U^l(q_j) represents the distribution of features from the ancestor nodes of level l to the level-(l+1) nodes, where q_j represents the jth ancestor node; b^l is a bias parameter; σ(·) denotes the activation function.
In this example, the ResNet encoder can effectively extract the encoding feature information of the two-dimensional image, and the encoding feature information can guide the graph convolution neural network to accurately construct the first point cloud, so that the two-dimensional image containing limited information can be reconstructed into the first point cloud with richer and more accurate information.
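A simplified, hedged sketch of this reconstruction pipeline is given below (PyTorch and a torchvision ResNet-18 backbone are assumptions; the patent's graph convolution of formula (2) is replaced here by a plain per-point linear map, so the sketch only illustrates the overall flow from image to point cloud):

```python
import torch
import torch.nn as nn
import torchvision

class PointCloudReconstructionSketch(nn.Module):
    """Image -> 96-dim code z -> alternating branching / coordinate-mapping -> N x 3 points."""

    def __init__(self, branch_factor: int = 8):
        super().__init__()
        backbone = torchvision.models.resnet18(weights=None)
        backbone.fc = nn.Identity()
        self.encoder = backbone                       # image -> 512-dim feature
        self.to_mu = nn.Linear(512, 96)
        self.to_logvar = nn.Linear(512, 96)
        dims = [96, 48, 16, 3]                        # coordinate dimension: 96 -> ... -> 3
        self.coord_maps = nn.ModuleList(nn.Linear(dims[i], dims[i + 1]) for i in range(3))
        self.branch_factor = branch_factor

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        feat = self.encoder(image)                                   # (1, 512)
        mu, logvar = self.to_mu(feat), self.to_logvar(feat)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)      # sampled 96-dim code
        points = z                                                   # 1 point, 96-dim coordinates
        for i, fc in enumerate(self.coord_maps):
            points = points.repeat_interleave(self.branch_factor, dim=0)   # branching step
            points = fc(points) if i == 2 else torch.relu(fc(points))      # stand-in for graph conv
        return points                                                # (branch_factor**3, 3)

pc = PointCloudReconstructionSketch()(torch.randn(1, 3, 224, 224))
print(pc.shape)   # torch.Size([512, 3])
```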
And S103, processing the first point cloud through the trained point cloud completion model to obtain a second point cloud, wherein the second point cloud is used for describing the overall three-dimensional structure of the target object.
The point cloud completion model comprises point cloud compression modules of N levels and point cloud expansion modules of N levels. The point cloud compression modules of the N levels are used for compressing the first point cloud to obtain compressed information of N different resolutions, and the point cloud expansion modules of the N levels are used for reconstructing the compressed information of the N different resolutions to obtain a second point cloud capable of describing the whole three-dimensional structure of the target object, where N ≥ 2 and N is an integer.
In one example, the point cloud compression module may be a PointNet++ network structure. PointNet++ is an encoder-decoder network structure with multi-level feature extraction: the encoder obtains global features of different scales through multi-level down-sampling, and the decoder recovers point-level features (i.e. the compressed information) at the corresponding resolution through up-sampling.
By adopting the PointNet++ network structure, local features can be fully captured, good generalization to complex scenes is achieved, and detail information is well preserved, so that more structural information is retained in the resulting compressed information, which benefits the subsequent point cloud expansion modules during point cloud reconstruction.
In one example, the point cloud compression module at the (N-m)th layer is connected with the point cloud expansion module at the mth layer through a hierarchical dynamic information pipeline, and the point cloud compression module at the (N-1)th layer is connected with the point cloud expansion module at the 1st layer through a hierarchical dynamic information pipeline, where 2 ≤ m < N and m is an integer. In this way, the output information of the point cloud compression modules from the 1st layer to the (N-1)th layer can be skip-transmitted to the point cloud expansion modules of the corresponding levels, so that these point cloud expansion modules have more prior information and feature information during point cloud expansion.
That is to say, in the point cloud completion model, the input information of the point cloud compression module at the 1st layer is the first point cloud, and the output information of the former of any two adjacent layers of point cloud compression modules is the input information of the latter; the input information of the point cloud expansion module at the 1st layer is the output information of the point cloud compression modules at the Nth layer and the (N-1)th layer; the input information of the point cloud expansion module at the mth layer is the output information of the point cloud compression module at the (N-m)th layer and the output information of the point cloud expansion module at the (m-1)th layer; the input information of the point cloud expansion module at the Nth layer is the output information of the point cloud expansion module at the (N-1)th layer, and the output information of the point cloud expansion module at the Nth layer is the second point cloud.
For example, taking N = 3, m can only take the value 2, and the network structure of the corresponding point cloud completion model may be as shown in fig. 2. The 3 layers of point cloud compression modules are denoted in order as point cloud compression module 1, point cloud compression module 2 and point cloud compression module 3, and the 3 layers of point cloud expansion modules are denoted in order as point cloud expansion module 1, point cloud expansion module 2 and point cloud expansion module 3.
The input information of point cloud compression module 1 is the first point cloud, the input information of point cloud compression module 2 is the output information of point cloud compression module 1, and the input information of point cloud compression module 3 is the output information of point cloud compression module 2. Point cloud compression module 1 is connected with point cloud expansion module 2 through a hierarchical dynamic information pipeline, and point cloud compression module 2 is connected with point cloud expansion module 1 through a hierarchical dynamic information pipeline. The input information of point cloud expansion module 1 is the output information of point cloud compression module 3 and point cloud compression module 2, the input information of point cloud expansion module 2 is the output information of point cloud compression module 1 and point cloud expansion module 1, and the input information of point cloud expansion module 3 is the output information of point cloud expansion module 2.
In one example, the point cloud expansion module may include a plurality of dynamic information gate modules, wherein a plurality of fully connected layers are disposed between the dynamic information gate modules, and the dynamic information gate modules are used for performing attention mechanism calculation on the input information of the dynamic information gate modules.
For example, taking 3 dynamic information gate modules as an example, the network structure of the point cloud expansion module may be as shown in fig. 3 (the dynamic information gate module is denoted by AGB in fig. 3). The input information of a dynamic information gate module comprises two attention sets K and R. K and R may be the output information of two point cloud compression modules, of one point cloud compression module and one point cloud expansion module, or of the same point cloud expansion module or the same fully connected layer (i.e. K and R are identical in this case).
For example, K and R of the 1st dynamic information gate module in point cloud expansion module 1 are the output information of point cloud compression module 2 and point cloud compression module 3; K and R of the 1st dynamic information gate module in point cloud expansion module 2 are the output information of point cloud compression module 1 and point cloud expansion module 1; K and R of the 1st dynamic information gate module in point cloud expansion module 3 are the output information of point cloud expansion module 2; and K and R of the 2nd and 3rd dynamic information gate modules in each point cloud expansion module are the output information of the fully connected layer connected to that dynamic information gate module.
As shown in fig. 3, the dynamic information gate module includes 4 fully connected layers, an attention gate and a softmax module. The 4 fully connected layers are F_1, F_2, F_3 and F_4, respectively. The attention score between any element of the attention set K and any element of the attention set R can then be calculated by formula (3):
a_{i,j} = softmax( F_1(k_i)^T · F_2(r_j) )   (3)
wherein k_i denotes the ith element in K, r_j denotes the jth element in R, T denotes the matrix transpose, and a_{i,j} represents the attention score between k_i and r_j.
Then, based on the attention scores, each element in K is updated by the following formula (4):
k_i' = F_4( Σ_j a_{i,j} · F_3(r_j) )   (4)
wherein, K may specifically be output information from a network layer above the dynamic information gate module, and R may specifically be output information from a network layer above the dynamic information gate module or compressed information of different resolutions output from a hierarchical dynamic information pipeline. And the updated K is the output information of the dynamic information gate module.
Illustratively, taking the point cloud completion model with three layers of compression and expansion modules shown in fig. 2 and fig. 3 as an example, assume that the first point cloud, which describes the three-dimensional structure of a partial region of the target object (e.g. the brain shown in fig. 2), is represented as a 2048 × 3 matrix. After the 2048 × 3 matrix is input to the point cloud completion model, point cloud compression module 1 compresses it into a 512 × 128 matrix and outputs it, point cloud compression module 2 compresses the output of point cloud compression module 1 into a 256 × 256 matrix and outputs it, and point cloud compression module 3 compresses the output of point cloud compression module 2 into a 1 × 512 feature vector and outputs it.
Next, point cloud expansion module 1 reconstructs the 1 × 512 feature vector output by point cloud compression module 3 together with the 256 × 256 matrix output by point cloud compression module 2 into a 256 × 256 matrix and outputs it; point cloud expansion module 2 reconstructs the 256 × 256 matrix output by point cloud expansion module 1 together with the 512 × 128 matrix output by point cloud compression module 1 into a 512 × 128 matrix and outputs it; and point cloud expansion module 3 reconstructs the 512 × 128 matrix output by point cloud expansion module 2 into a 2048 × 3 point cloud capable of describing the overall three-dimensional structure of the brain.
In the point cloud expansion module, the fully connected layers adjust both the size and the values of the matrix, whereas the AGB adjusts only the values, i.e. the input and the output of an AGB have the same size. Thus, if an AGB receives a feature matrix R transmitted through a hierarchical dynamic information pipeline, the fully connected layer ensures that the input K it passes to the AGB has the same size as R (the same numbers of rows and columns).
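The hierarchical wiring and matrix sizes described above can be summarized with the following schematic PyTorch sketch. The "Toy" modules are crude stand-ins (random subsampling instead of PointNet++ set abstraction, simple feature fusion instead of AGB-based expansion); only the data flow and the shapes from the worked example are meant to match the description:

```python
import torch
import torch.nn as nn

class ToyCompress(nn.Module):
    """Stand-in for a PointNet++-style compression module: (n_in, d_in) -> (n_out, d_out)."""
    def __init__(self, d_in: int, d_out: int, n_out: int):
        super().__init__()
        self.mlp, self.n_out = nn.Linear(d_in, d_out), n_out
    def forward(self, x):
        feats = torch.relu(self.mlp(x))                       # per-point feature lift
        idx = torch.randperm(feats.shape[0])[: self.n_out]    # crude stand-in for point sampling
        return feats[idx]

class ToyExpand(nn.Module):
    """Stand-in for an AGB-based expansion module fusing a coarse input with a skip connection."""
    def __init__(self, d_in: int, d_skip: int, d_out: int, n_out: int):
        super().__init__()
        self.mlp, self.n_out = nn.Linear(d_in + d_skip, d_out), n_out
    def forward(self, x, skip):
        x_up = x.mean(dim=0, keepdim=True).expand(skip.shape[0], -1)   # crude upsampling
        fused = torch.relu(self.mlp(torch.cat([x_up, skip], dim=-1)))
        reps = -(-self.n_out // fused.shape[0])                        # repeat points if needed
        return fused.repeat(reps, 1)[: self.n_out]

c1, c2, c3 = ToyCompress(3, 128, 512), ToyCompress(128, 256, 256), ToyCompress(256, 512, 1)
e1, e2, e3 = ToyExpand(512, 256, 256, 256), ToyExpand(256, 128, 128, 512), ToyExpand(128, 128, 3, 2048)

p1 = torch.rand(2048, 3)   # first point cloud (partial structure)
x1 = c1(p1)                # 2048 x 3   -> 512 x 128
x2 = c2(x1)                # 512 x 128  -> 256 x 256
x3 = c3(x2)                # 256 x 256  -> 1 x 512
y1 = e1(x3, x2)            # skip from compression module 2 -> 256 x 256
y2 = e2(y1, x1)            # skip from compression module 1 -> 512 x 128
p2 = e3(y2, y2)            # K and R identical at the last level -> 2048 x 3 (second point cloud)
print(p2.shape)            # torch.Size([2048, 3])
```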
In the embodiment of the application, the dynamic information gate modules compute attention scores inside the point cloud expansion module, so that, for the compressed information of different resolutions, useless features are merged and the weights of useful features are adjusted, making the expansion of the first point cloud more accurate.
In summary, by using the three-dimensional reconstruction method provided by the application, after the incomplete two-dimensional image is converted into the first point cloud, the first point cloud is complemented by using the point cloud complementing model, so as to obtain the second point cloud. The point cloud complementing model can extract compressed information of different resolutions in the first point cloud, perform point cloud expansion based on the compressed information of different resolutions, and reconstruct to obtain a second point cloud capable of describing the overall three-dimensional structure of the target object. The problem of incomplete three-dimensional structure of the target object reconstructed based on the incomplete image of the target object at present is solved.
The above-mentioned training method of the point cloud completion model and the point cloud reconstruction model is exemplarily described below.
The point cloud completion model and the point cloud reconstruction model can be trained independently. And a complementary integrated initial model can be constructed for comprehensive training.
For example, as shown in fig. 4, the completion integration initial model may include a point cloud reconstruction initial model, a discriminator and a point cloud completion initial model. Adversarial training is performed on the completion integration initial model according to a preset loss function and a training set, so as to train the point cloud reconstruction initial model into the point cloud reconstruction model and the point cloud completion initial model into the point cloud completion model.
The training set comprises a plurality of two-dimensional image samples presenting part areas of the object sample, and a part area point cloud sample and an integral point cloud sample corresponding to each two-dimensional image sample.
For example, if a model capable of reconstructing the three-dimensional structure of the brain needs to be trained, the object samples are the brains of a plurality of different individuals. After the three-dimensional MRI image of each brain is acquired by MRI equipment, preprocessing is performed to remove noise and to strip the skull and neck bones; the preprocessed three-dimensional MRI image of the brain is then sliced from different angles, and a two-dimensional slice image near the optimal plane is selected. An extreme environment is simulated to visually pollute the two-dimensional slice image so that it presents only a partial region of the brain, yielding a two-dimensional image sample. The two-dimensional image sample may be denoted as I_{H×W}, where H and W represent the length and width of the two-dimensional image sample, respectively.
Correspondingly, a point cloud sample of the whole brain is obtained according to the three-dimensional MRI image.
During training, firstly, a two-dimensional image sample is input into a point cloud reconstruction initial model to obtain a predicted third point cloud. And the third point cloud is the point cloud of a partial area presented in the two-dimensional image sample predicted by the point cloud reconstruction initial model.
For example, I_{H×W} is input into the ResNet encoder of the point cloud reconstruction initial model. The ResNet encoder converts the features of I_{H×W} into a Gaussian-distributed vector with a specific mean μ and variance σ, randomly samples a 96-dimensional encoding feature vector z ~ N(μ, σ²) from it, and transmits the encoding feature vector z to the graph convolutional neural network, which reconstructs it into the third point cloud.
Wherein the KL divergence of the ResNet encoder can be calculated by the following formula (5):
L_KL = Σ_{x=1}^{X} Q(x) · log( Q(x) / P(x) )   (5)
wherein L_KL is the KL divergence; X is the total number of Q values (or P values); Q(x) is the xth probability value obtained by the encoder according to the encoding feature vector z; P(x) is the preset xth probability value.
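As an illustrative snippet (PyTorch is assumed; names are not from the patent), formula (5) can be evaluated directly once Q and P are available as discrete probability vectors:

```python
import torch

def kl_divergence(Q: torch.Tensor, P: torch.Tensor) -> torch.Tensor:
    """L_KL = sum_x Q(x) * log(Q(x) / P(x)), assuming Q and P are valid probability vectors."""
    return torch.sum(Q * torch.log(Q / P))

Q = torch.softmax(torch.randn(96), dim=0)   # distribution derived from the encoder output
P = torch.full((96,), 1.0 / 96)             # preset reference distribution (uniform here)
print(kl_divergence(Q, P))
```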
Then, inputting the third point cloud and the point cloud sample into a discriminator for discrimination; and inputting the third point cloud into the point cloud completion initial model to obtain a reconstructed fourth point cloud. The fourth point cloud is the whole point cloud of the object sample predicted by the point cloud completion initial model.
The loss function used in the training process includes loss calculations based on the relative entropy, the chamfer distance and the bulldozer (earth mover's) distance. Illustratively, it comprises a first loss function for training the point cloud reconstruction initial model, a second loss function for training the discriminator and a third loss function for training the point cloud completion initial model, with which the point cloud reconstruction initial model, the discriminator and the point cloud completion initial model are trained respectively.
Wherein the first loss function L_{E,G} is shown in the following formula (6):
L_{E,G} = λ_1 · L_KL + λ_2 · L_CD + E_{z~Z}[ D(G(z)) ]   (6)
wherein λ_1 and λ_2 are constants; L_KL is the KL divergence of the ResNet encoder; Z is the distribution of the encoding feature vectors generated by the ResNet encoder; z represents an encoding feature vector; G(z) is the third point cloud; D(G(z)) represents the value obtained after the third point cloud is input into the discriminator; E(·) represents expectation; L_CD is the chamfer distance (CD) between the third point cloud and the partial-region point cloud sample, which can be expressed as formula (7):
L_CD = Σ_{y∈Y} min_{y'∈Y'} || y − y' ||_2 + Σ_{y'∈Y'} min_{y∈Y} || y' − y ||_2   (7)
In formula (7), Y is the coordinate matrix of all points in the partial-region point cloud sample, and y is a point coordinate vector in the matrix Y; Y' is the coordinate matrix of all points in the third point cloud, and y' is a point coordinate vector in the matrix Y'. For example, if Y is a v × 3 matrix composed of v point coordinates, then y is a 1 × 3 coordinate vector corresponding to one point in Y.
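A common PyTorch implementation of the chamfer distance of formula (7) is sketched below; whether the patent sums or averages each direction (and whether it squares the norm) is not specified, so this is one standard variant rather than necessarily the exact one used:

```python
import torch

def chamfer_distance(Y: torch.Tensor, Y_pred: torch.Tensor) -> torch.Tensor:
    """Bidirectional chamfer distance between two (num_points, 3) coordinate matrices."""
    dists = torch.cdist(Y, Y_pred)                    # pairwise Euclidean distances
    return dists.min(dim=1).values.mean() + dists.min(dim=0).values.mean()

Y = torch.rand(2048, 3)        # partial-region point cloud sample
Y_pred = torch.rand(2048, 3)   # third point cloud predicted by the reconstruction model
print(chamfer_distance(Y, Y_pred))
```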
The second loss function L_D of the discriminator is derived from the earth mover's (bulldozer) distance (EMD) loss L_EMD and can be expressed as formula (8):
L_D = E_{z~Z}[ D(G(z)) ] − E_{Y~R}[ D(Y) ] + λ_gp · E_{Ŷ}[ ( || ∇_Ŷ D(Ŷ) ||_2 − 1 )² ]   (8)
In formula (8), Ŷ denotes a sample taken on the line between the third point cloud and the partial-region point cloud sample, which can be represented as Ŷ = ε·Y + (1−ε)·G(z) with ε sampled from [0, 1]; E(·) is the expectation; D(G(z)) represents the value obtained by inputting the third point cloud G(z) into the discriminator; D(Y) represents the value obtained after the partial-region point cloud sample Y is input into the discriminator; R is the distribution of the partial-region point cloud samples; λ_gp is a constant; ∇ is the gradient operator.
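Formula (8) has the form of the widely used WGAN-GP discriminator objective; a hedged PyTorch sketch of such an objective is given below, assuming D maps an (N, 3) point cloud to a scalar score (batching and other engineering details are omitted):

```python
import torch

def discriminator_loss(D, real: torch.Tensor, fake: torch.Tensor, lambda_gp: float = 10.0):
    """E[D(G(z))] - E[D(Y)] + lambda_gp * E[(||grad D(Y_hat)||_2 - 1)^2].

    real: partial-region point cloud sample Y, shape (N, 3)
    fake: third point cloud G(z) from the reconstruction model, shape (N, 3)
    """
    eps = torch.rand(1)
    interp = (eps * real + (1.0 - eps) * fake).requires_grad_(True)   # sample between Y and G(z)
    d_interp = D(interp)
    grads = torch.autograd.grad(d_interp, interp,
                                grad_outputs=torch.ones_like(d_interp),
                                create_graph=True)[0]
    gradient_penalty = (grads.norm(2) - 1.0) ** 2
    return D(fake).mean() - D(real).mean() + lambda_gp * gradient_penalty
```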
The third loss function of the point cloud completion initial model can be shown as the following formula (9):
L_P = λ_3 · L_EMD + λ_4 · L_CD   (9)
wherein λ_3 and λ_4 are constants, and L_EMD can be expressed as the following formula (10):
L_EMD = min_{φ: Y'→Y} Σ_{y'∈Y'} || y' − φ(y') ||_2   (10)
In formula (10), φ is a bijective function between the two point clouds being compared.
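The earth mover's distance of formula (10) requires an optimal bijection φ between two equally sized point sets; for small clouds it can be computed exactly with the Hungarian algorithm, as in the SciPy sketch below (practical implementations usually rely on faster approximations, and whether the matched distances are summed or averaged is not specified in the patent):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def emd_loss(Y: np.ndarray, Y_pred: np.ndarray) -> float:
    """EMD between two (N, 3) point clouds: find the bijection phi minimizing sum ||y' - phi(y')||."""
    cost = np.linalg.norm(Y_pred[:, None, :] - Y[None, :, :], axis=-1)   # (N, N) pairwise distances
    rows, cols = linear_sum_assignment(cost)                             # optimal one-to-one matching
    return float(cost[rows, cols].mean())

print(emd_loss(np.random.rand(256, 3), np.random.rand(256, 3)))
```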
It can be understood that the point cloud reconstruction model and the point cloud completion model obtained by training the completion integration initial model can be used together as a complete completion integration model (i.e. the model shown in fig. 2) to reconstruct an incomplete two-dimensional image into a whole point cloud, or they can be used separately.
During training, the completion integration initial model fuses the relative entropy, the chamfer distance and the bulldozer distance in its loss calculation, which ensures global Nash equilibrium in the solution space, improves the global perception field of the trained completion integration model, and enhances the generalization ability, computational efficiency and accuracy of the model.
The number of the point clouds is expanded and the position coordinates of the point clouds are adjusted by alternately using the graph convolution module and the branch module, so that the first point cloud obtained by the graph convolution neural network is more accurate. Aiming at the first point cloud, extracting compressed information with different resolutions through a point cloud compression module designed by a hierarchy, and designing a hierarchy dynamic information pipeline to jump and transmit the compressed information to a point cloud expansion module of a corresponding hierarchy, so that the point cloud expansion module carries out combination of useless features and weight adjustment of the useful features by calculating internal attention based on the compressed information with different resolutions, thereby realizing point cloud expansion and reconstructing to obtain the point cloud capable of describing the whole three-dimensional structure of a target object.
In the following, a part of known efficient networks are used to replace part of models in the completion integration model provided by the application, and training is performed. The obtained model is compared with the experimental data of the completion integration model provided by the application, so as to further explain the effect of the completion integration model provided by the application.
For the replacement of the point cloud reconstruction model part, with the chamfer distance (CD) as the evaluation index, the experimental data comparison is shown in Table 1:
TABLE 1
Model         Completion integration model    No-D     PointOutNet
CD (×10⁻¹)    4.461                           5.309    5.492
Wherein No-D represents the model obtained by training after the discriminator is removed during model training, and PointOutNet represents the model obtained by replacing the point cloud reconstruction model in the completion integration model with a PointOutNet structure.
For the replacement of the point cloud completion model part, with CD as the evaluation index, the experimental data comparison is shown in Table 2:
TABLE 2
Model         Completion integration model    FC        FN       TN
CD (×10⁻¹)    4.461                           10.572    9.863    6.225
Wherein FC represents the model obtained by replacing the point cloud completion model in the completion integration model with a fully connected network, FN represents the model obtained by replacing it with a FoldingNet network, and TN represents the model obtained by replacing it with a TopNet network.
In addition, a plurality of regions (for example, region 1, region 2 and region 3) are randomly sampled from the same sample and reconstructed with the completion integration model and the other three models referred to in Table 2; the reconstruction error between the reconstruction result of each region and the whole sample is then calculated, yielding point-to-point error data. The point-to-point error of the completion integration model is about 2, that of the FC model about 5, that of the FN model about 4.5, and that of the TN model about 3.
From these quantitative comparisons it can be seen that the completion integration model provided by the application outperforms the existing methods in both chamfer distance and point-to-point error.
It can be understood that the execution subject of the method for performing three-dimensional reconstruction using the trained point cloud reconstruction model and the trained point cloud completion model, and the execution subject that trains these two models, may be the same terminal device or different terminal devices.
In addition, the sequence numbers of the steps in the foregoing embodiments do not mean the execution sequence, and the execution sequence of each process should be determined by the function and the internal logic of the process, and should not constitute any limitation on the implementation process of the embodiments of the present application.
Fig. 5 is a structural diagram of an embodiment of a three-dimensional reconstruction apparatus according to an embodiment of the present application, which corresponds to the three-dimensional reconstruction method described in the foregoing embodiment. Referring to fig. 5, the three-dimensional reconstruction apparatus may include:
an acquiring unit 501 is configured to acquire a two-dimensional image of a target object, where the two-dimensional image represents a partial region of the target object.
A converting unit 502, configured to convert the two-dimensional image into a first point cloud, where the first point cloud is used to describe a three-dimensional structure of the partial region.
And a completion unit 503, configured to process the first point cloud through the trained point cloud completion model to obtain a second point cloud, where the second point cloud is used to describe a three-dimensional structure of the whole target object.
The point cloud complementing model comprises N levels of point cloud compression modules and N levels of point cloud expansion modules, the N levels of point cloud compression modules are used for compressing a first point cloud to obtain N pieces of compressed information with different resolutions, the N levels of point cloud expansion modules are used for reconstructing the N pieces of compressed information with different resolutions to obtain the second point cloud, N is not less than 2, and N is an integer.
Optionally, the input information of the point cloud compression module at the 1st layer is the first point cloud, and the output information of the former of any two adjacent layers of point cloud compression modules is the input information of the latter; the input information of the point cloud expansion module at the 1st layer is the output information of the point cloud compression modules at the Nth layer and the (N-1)th layer; the input information of the point cloud expansion module at the mth layer is the output information of the point cloud compression module at the (N-m)th layer and the output information of the point cloud expansion module at the (m-1)th layer; the input information of the point cloud expansion module at the Nth layer is the output information of the point cloud expansion module at the (N-1)th layer, and the output information of the point cloud expansion module at the Nth layer is the second point cloud, where 2 ≤ m < N and m is an integer.
Optionally, the point cloud expansion module includes a plurality of dynamic information gate modules, a plurality of fully connected layers are disposed between the dynamic information gate modules, and the dynamic information gate modules are configured to perform attention mechanism calculation on the input information of the dynamic information gate modules.
Optionally, the point cloud compression module is a PointNet++ network structure.
Optionally, the converting unit 502 converts the two-dimensional image into the first point cloud, including:
converting the two-dimensional image into the first point cloud using a trained point cloud reconstruction model.
Optionally, the point cloud reconstruction model includes a ResNet encoder and a graph convolution neural network, which are sequentially connected, where the graph convolution neural network includes multiple sets of graph convolution modules and branch modules, which are alternately arranged, the graph convolution modules are used to adjust position coordinates of the point cloud, and the branch modules are used to expand the number of the point clouds.
Optionally, the point cloud completion model and the point cloud reconstruction model are trained in the following manners:
and constructing a completion integration initial model which comprises a point cloud reconstruction initial model, a discriminator and a point cloud completion initial model.
And performing countermeasure training on the completion integration initial model according to a preset loss function and a training set so as to train the point cloud reconstruction initial model into the point cloud reconstruction model and train the point cloud completion initial model into the point cloud completion model.
The training set comprises a plurality of two-dimensional image samples presenting part areas of object samples, and a part area point cloud sample and an integral point cloud sample corresponding to each two-dimensional image sample; the loss function includes a loss calculation based on relative entropy, chamfer distance, and bulldozer distance.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses, modules and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Fig. 6 shows a schematic block diagram of a terminal device provided in an embodiment of the present application, and only shows a part related to the embodiment of the present application for convenience of description.
As shown in fig. 6, the terminal device 6 of this embodiment includes: a processor 60, a memory 61 and a computer program 62 stored in the memory 61 and executable on the processor 60. The processor 60, when executing the computer program 62, implements the steps in the above-described three-dimensional reconstruction method embodiments, such as steps S101 to S103 shown in fig. 1. Alternatively, the processor 60, when executing the computer program 62, implements the functions of each module/unit in each device embodiment, for example, the functions of the acquiring unit 501, the converting unit 502 and the completion unit 503 shown in fig. 5.
Illustratively, the computer program 62 may be partitioned into one or more modules/units that are stored in the memory 61 and executed by the processor 60 to accomplish the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 62 in the terminal device 6.
It will be understood by those skilled in the art that fig. 6 is only an example of the terminal device 6, and does not constitute a limitation to the terminal device 6, and may include more or less components than those shown, or combine some components, or different components, for example, the terminal device 6 may further include an input-output device, a network access device, a bus, etc.
The Processor 60 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 61 may be an internal storage unit of the terminal device 6, such as a hard disk or a memory of the terminal device 6. The memory 61 may also be an external storage device of the terminal device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the terminal device 6. Further, the memory 61 may also include both an internal storage unit and an external storage device of the terminal device 6. The memory 61 is used for storing the computer programs and other programs and data required by the terminal device 6. The memory 61 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps in the above-mentioned method embodiments.
The embodiments of the present application also provide a computer program product which, when run on a terminal device, causes the terminal device to implement the steps in the above method embodiments.
In the description above, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments", or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment", "in some embodiments", "in other embodiments", or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising", "including", "having", and variations thereof mean "including but not limited to" unless expressly specified otherwise.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/network device and method may be implemented in other ways. For example, the above-described apparatus/network device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implementing, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same. Although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: modifications of the technical solutions described in the embodiments or equivalent replacements of some technical features may still be made. Such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A method of three-dimensional reconstruction, comprising:
acquiring a two-dimensional image of a target object, wherein the two-dimensional image presents a partial region of the target object;
converting the two-dimensional image into a first point cloud, wherein the first point cloud is used for describing a three-dimensional structure of the partial area;
processing the first point cloud through the trained point cloud completion model to obtain a second point cloud, wherein the second point cloud is used for describing the overall three-dimensional structure of the target object;
the point cloud complementing model comprises N levels of point cloud compression modules and N levels of point cloud expansion modules, the N levels of point cloud compression modules are used for compressing a first point cloud to obtain N pieces of compressed information with different resolutions, the N levels of point cloud expansion modules are used for reconstructing the N pieces of compressed information with different resolutions to obtain the second point cloud, N is not less than 2, and N is an integer.
2. The method according to claim 1, wherein the input information of the point cloud compression module at the 1st layer is the first point cloud; for two adjacent layers of point cloud compression modules, the output information of the previous layer is the input information of the next layer; the input information of the point cloud expansion module at the 1st layer is the output information of the point cloud compression modules at the Nth layer and the (N-1)th layer; the input information of the point cloud expansion module at the m-th layer is the output information of the point cloud compression module at the (N-m)th layer and the output information of the point cloud expansion module at the (m-1)th layer; the input information of the point cloud expansion module at the Nth layer is the output information of the point cloud expansion module at the (N-1)th layer; the output information of the point cloud expansion module at the Nth layer is the second point cloud; m is greater than or equal to 2 and less than N, and m is an integer.
3. The method of claim 2, wherein the point cloud expansion module comprises a plurality of dynamic information gate modules, a plurality of fully connected layers are arranged between the dynamic information gate modules, and the dynamic information gate modules are used for performing attention mechanism calculation on the input information of the dynamic information gate modules.
4. The method of claim 1, wherein the point cloud compression module is a PointNet++ network structure.
5. The method of claim 1, wherein converting the two-dimensional image into a first point cloud comprises:
converting the two-dimensional image into the first point cloud using a trained point cloud reconstruction model.
6. The method of claim 5, wherein the point cloud reconstruction model comprises a ResNet encoder and a graph convolution neural network which are connected in sequence, the graph convolution neural network comprises a plurality of groups of alternately arranged graph convolution modules and branch modules, the graph convolution modules are used for adjusting the position coordinates of points in the point cloud, and the branch modules are used for expanding the number of points in the point cloud.
7. The method of claim 5, wherein the point cloud completion model and the point cloud reconstruction model are trained by:
constructing a completion integration initial model which comprises a point cloud reconstruction initial model, a discriminator and a point cloud completion initial model;
performing adversarial training on the completion integration initial model according to a preset loss function and a training set, so as to train the point cloud reconstruction initial model into the point cloud reconstruction model and to train the point cloud completion initial model into the point cloud completion model;
wherein the training set comprises a plurality of two-dimensional image samples each presenting a partial region of an object sample, and a partial-region point cloud sample and a whole point cloud sample corresponding to each two-dimensional image sample; and the loss function includes loss calculations based on relative entropy, chamfer distance, and earth mover's distance.
8. A three-dimensional reconstruction apparatus, comprising:
an acquisition unit configured to acquire a two-dimensional image of a target object, the two-dimensional image representing a partial region of the target object;
a conversion unit, configured to convert the two-dimensional image into a first point cloud, where the first point cloud is used to describe a three-dimensional structure of the partial region;
the completion unit is used for processing the first point cloud through the trained point cloud completion model to obtain a second point cloud, and the second point cloud is used for describing the overall three-dimensional structure of the target object;
the point cloud complementing model comprises N levels of point cloud compression modules and N levels of point cloud expansion modules, the N levels of point cloud compression modules are used for compressing a first point cloud to obtain N pieces of compressed information with different resolutions, the N levels of point cloud expansion modules are used for reconstructing the N pieces of compressed information with different resolutions to obtain the second point cloud, N is not less than 2, and N is an integer.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
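For explanatory purposes only, and without forming part of the claims, the hierarchical completion structure recited in claims 1 to 4 may be sketched in Python (PyTorch) as follows. The three-level depth, the stride-2 subsampling used as a stand-in for a PointNet++ compression module, the multi-head-attention form of the dynamic information gate, the simplified skip pairing and all layer sizes are assumptions made for illustration; they are not asserted to be the claimed implementation.

# Hedged sketch of an N-level compress/expand completion model in the spirit of
# claims 1 to 4. Every design detail below (depth, widths, the subsampling stand-in
# for PointNet++, the attention gate, the skip pairing) is an illustrative assumption.
import torch
import torch.nn as nn


class CompressionLevel(nn.Module):
    """Lifts per-point features and halves the point resolution (PointNet-style stand-in)."""

    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):                          # x: (B, P, dim)
        return self.mlp(x)[:, ::2, :]              # keep every other point -> (B, P/2, dim)


class DynamicInfoGate(nn.Module):
    """Attention-style gate mixing decoder features with same-resolution encoder features."""

    def __init__(self, dim):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)

    def forward(self, dec, enc):                   # both (B, P, dim)
        gated, _ = self.attn(query=dec, key=enc, value=enc)
        return dec + gated


class ExpansionLevel(nn.Module):
    """Applies the gate and fully connected layers, then doubles the point resolution."""

    def __init__(self, dim):
        super().__init__()
        self.gate = DynamicInfoGate(dim)
        self.fc = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, dec, enc):
        x = self.fc(self.gate(dec, enc))
        return x.repeat_interleave(2, dim=1)       # upsample: (B, 2P, dim)


class CompletionSketch(nn.Module):
    """N compression levels and N expansion levels joined by skip connections."""

    def __init__(self, n_levels=3, dim=64):
        super().__init__()
        self.inp = nn.Linear(3, dim)
        self.enc = nn.ModuleList([CompressionLevel(dim) for _ in range(n_levels)])
        self.dec = nn.ModuleList([ExpansionLevel(dim) for _ in range(n_levels)])
        self.out = nn.Linear(dim, 3)

    def forward(self, pts):                        # pts: (B, P, 3), P divisible by 2**n_levels
        x = self.inp(pts)
        skips = []
        for level in self.enc:                     # N compressed feature maps at decreasing resolution
            x = level(x)
            skips.append(x)
        y = skips[-1]                              # deepest compressed information
        for level, skip in zip(self.dec, reversed(skips)):
            y = level(y, skip)                     # each expansion level reads a same-resolution skip
        return self.out(y)                         # coordinates of the completed ("second") point cloud


if __name__ == "__main__":
    net = CompletionSketch()
    partial = torch.rand(2, 256, 3)                # a toy "first point cloud" with 256 points
    print(net(partial).shape)                      # torch.Size([2, 256, 3])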
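Likewise, the reconstruction path of claims 5 and 6, a ResNet encoder followed by alternately arranged graph convolution modules (adjusting point coordinates) and branch modules (expanding the number of points), may be sketched as below. The torchvision resnet18 encoder, the k-nearest-neighbour averaging used as the graph convolution, and the point-doubling branch rule are illustrative assumptions rather than the claimed design.

# Hedged sketch of an image-to-point-cloud reconstruction model in the spirit of
# claims 5 and 6: encoder, then alternating coordinate refinement and point doubling.
import torch
import torch.nn as nn
from torchvision.models import resnet18


class GraphConvModule(nn.Module):
    """Adjusts point coordinates using the mean of each point's k nearest neighbours."""

    def __init__(self, k=8):
        super().__init__()
        self.k = k
        self.update = nn.Linear(6, 3)              # [point, neighbour mean] -> coordinate offset

    def forward(self, pts):                        # pts: (B, P, 3)
        idx = torch.cdist(pts, pts).topk(self.k, largest=False).indices      # (B, P, k)
        neigh = torch.gather(
            pts.unsqueeze(1).expand(-1, pts.size(1), -1, -1), 2,
            idx.unsqueeze(-1).expand(-1, -1, -1, 3)).mean(dim=2)             # (B, P, 3)
        return pts + self.update(torch.cat([pts, neigh], dim=-1))


class BranchModule(nn.Module):
    """Expands the point count by emitting two offset children per parent point."""

    def __init__(self):
        super().__init__()
        self.offsets = nn.Linear(3, 6)             # two 3-D offsets per parent point

    def forward(self, pts):                        # (B, P, 3) -> (B, 2P, 3)
        children = pts.unsqueeze(2) + self.offsets(pts).view(*pts.shape[:2], 2, 3)
        return children.reshape(pts.size(0), -1, 3)


class ReconstructionSketch(nn.Module):
    def __init__(self, seed_points=64, stages=3):
        super().__init__()
        self.encoder = resnet18(num_classes=seed_points * 3)   # image -> seed coordinates
        self.stages = nn.ModuleList()
        for _ in range(stages):                    # alternately arranged modules, as in claim 6
            self.stages.append(GraphConvModule())
            self.stages.append(BranchModule())

    def forward(self, image):                      # image: (B, 3, H, W)
        pts = self.encoder(image).view(image.size(0), -1, 3)
        for stage in self.stages:
            pts = stage(pts)
        return pts                                 # "first point cloud", (B, seed_points * 2**stages, 3)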
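Finally, claim 7 names relative entropy, chamfer distance and an earth mover's distance among the ingredients of the loss used in the adversarial training. A hedged sketch of how such terms could be computed and combined is given below; the chamfer formulation is the standard one, and the weights, as well as the decision to take the earth-mover's-distance value as an externally supplied input (it is usually obtained from a separate approximation such as a Sinkhorn or auction solver), are assumptions, since the claim does not fix these details.

# Hedged sketch of the loss ingredients named in claim 7; the weights are placeholders.
import torch
import torch.nn.functional as F


def chamfer_distance(a, b):
    """Symmetric chamfer distance between point sets a: (B, N, 3) and b: (B, M, 3)."""
    d = torch.cdist(a, b)                          # (B, N, M) pairwise euclidean distances
    return d.min(dim=2).values.mean(dim=1) + d.min(dim=1).values.mean(dim=1)


def relative_entropy(pred_logits, target_probs):
    """Relative-entropy (KL divergence) term between two distributions."""
    return F.kl_div(F.log_softmax(pred_logits, dim=-1), target_probs, reduction="batchmean")


def total_loss(pred_pts, gt_pts, pred_logits, target_probs, emd_value,
               w_cd=1.0, w_kl=0.1, w_emd=1.0):
    # emd_value is assumed to come from an external earth-mover's-distance approximation.
    return (w_cd * chamfer_distance(pred_pts, gt_pts).mean()
            + w_kl * relative_entropy(pred_logits, target_probs)
            + w_emd * emd_value)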
CN202111108509.4A 2021-09-22 2021-09-22 Three-dimensional reconstruction method and device of brain structure in extreme environment and readable storage medium Pending CN113920243A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111108509.4A CN113920243A (en) 2021-09-22 2021-09-22 Three-dimensional reconstruction method and device of brain structure in extreme environment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111108509.4A CN113920243A (en) 2021-09-22 2021-09-22 Three-dimensional reconstruction method and device of brain structure in extreme environment and readable storage medium

Publications (1)

Publication Number Publication Date
CN113920243A CN113920243A (en) 2022-01-11

Family

ID=79235588

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111108509.4A Pending CN113920243A (en) 2021-09-22 2021-09-22 Three-dimensional reconstruction method and device of brain structure in extreme environment and readable storage medium

Country Status (1)

Country Link
CN (1) CN113920243A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108010127A (en) * 2017-11-30 2018-05-08 郑州云海信息技术有限公司 A kind of telescopic cloud data method for reconstructing
CN112598790A (en) * 2021-01-08 2021-04-02 中国科学院深圳先进技术研究院 Brain structure three-dimensional reconstruction method and device and terminal equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BOWEN HU et al.: "3D Brain Reconstruction by Hierarchical Shape-Perception Network from a Single Incomplete Image", arXiv:2107.11010v1 [eess.IV], 23 July 2021 (2021-07-23), pages 1-9 *
XIN WEN et al.: "Point Cloud Completion by Skip-Attention Network With Hierarchical Folding", arXiv:2005.03871v2 [cs.CV], 18 May 2020 (2020-05-18), pages 1-8 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI796278B (en) * 2022-10-13 2023-03-11 長庚大學 Chemo-brain image visualization classifying system and operating method thereof
WO2024092962A1 (en) * 2022-10-31 2024-05-10 中国科学院深圳先进技术研究院 Three-dimensional reconstruction method, device and computer storage medium

Similar Documents

Publication Publication Date Title
Ueda et al. Technical and clinical overview of deep learning in radiology
CN109409503B (en) Neural network training method, image conversion method, device, equipment and medium
CN113506334B (en) Multi-mode medical image fusion method and system based on deep learning
CN112735570B (en) Image-driven brain atlas construction method, device, equipment and storage medium
CN113450294A (en) Multi-modal medical image registration and fusion method and device and electronic equipment
CN110517238B (en) AI three-dimensional reconstruction and human-computer interaction visualization network system for CT medical image
CN110033019B (en) Method and device for detecting abnormality of human body part and storage medium
WO2023044605A1 (en) Three-dimensional reconstruction method and apparatus for brain structure in extreme environments, and readable storage medium
WO2022059315A1 (en) Image encoding device, method and program, image decoding device, method and program, image processing device, learning device, method and program, and similar image search device, method and program
CN111368849A (en) Image processing method, image processing device, electronic equipment and storage medium
CN113920243A (en) Three-dimensional reconstruction method and device of brain structure in extreme environment and readable storage medium
CN113284149B (en) COVID-19 chest CT image identification method and device and electronic equipment
Chen et al. Generative adversarial U-Net for domain-free medical image augmentation
CN112598790A (en) Brain structure three-dimensional reconstruction method and device and terminal equipment
CN114863225A (en) Image processing model training method, image processing model generation device, image processing equipment and image processing medium
CN116563533A (en) Medical image segmentation method and system based on target position priori information
CN113706442A (en) Medical image processing method and device based on artificial intelligence and electronic equipment
CN115239674A (en) Computer angiography imaging synthesis method based on multi-scale discrimination
CN114331849A (en) Cross-mode nuclear magnetic resonance hyper-resolution network and image super-resolution method
CN113850796A (en) Lung disease identification method and device based on CT data, medium and electronic equipment
CN115965785A (en) Image segmentation method, device, equipment, program product and medium
WO2022163402A1 (en) Learned model generation method, machine learning system, program, and medical image processing device
CN113327221B (en) Image synthesis method, device, electronic equipment and medium for fusing ROI (region of interest)
CN113850794A (en) Image processing method and device
CN114638745B (en) Medical image intelligent conversion method based on multi-borrowing information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination