CN114359310B - 3D ventricular nuclear magnetic resonance video segmentation optimization system based on deep learning

Info

Publication number
CN114359310B
Authority
CN
China
Prior art keywords
ventricular
layer
feature
convolution
module
Prior art date
Legal status
Active
Application number
CN202210035567.7A
Other languages
Chinese (zh)
Other versions
CN114359310A (en)
Inventor
Shunjie Dong (董舜杰)
Zixuan Pan (潘子宣)
Cheng Zhuo (卓成)
Yu Fu (付钰)
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202210035567.7A priority Critical patent/CN114359310B/en
Publication of CN114359310A publication Critical patent/CN114359310A/en
Application granted granted Critical
Publication of CN114359310B publication Critical patent/CN114359310B/en



Landscapes

  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a 3D ventricular nuclear magnetic resonance video segmentation optimization system based on deep learning. A depth spatiotemporal deformable convolution fusion module TDAM acquires high-dimensional image features from an MRI video segment; an enhanced deformable convolution attention network EDAN fuses feature maps of different scales by using the spatiotemporal information in these high-dimensional image features and outputs a feature map carrying multi-scale information; a probabilistic noise correction module PNCM models the distribution of the high-dimensional image features and outputs an embedded vector containing the mean and variance of that distribution; the feature map output by EDAN is spliced with the expanded embedded vector output by PNCM and convolved to obtain the prediction result. The trained network model directly segments newly input 3D ventricular MRI video; by introducing multi-frame image compensation, deformable convolution and a multi-scale attention mechanism, the system effectively improves the accuracy and efficiency of ventricular segmentation and offers high robustness.

Description

3D ventricular nuclear magnetic resonance video segmentation optimization system based on deep learning
Technical Field
The invention relates to the technical field of medical image engineering, in particular to a 3D ventricular nuclear magnetic resonance video segmentation optimization system based on deep learning.
Background
With the development of medical imaging technology and artificial intelligence, automated and semi-automated systems in computer-aided diagnosis are gradually replacing traditional manual diagnostic systems for accurate diagnosis and treatment. Magnetic resonance imaging (MRI) is currently widely used in ventricular diagnostics because it causes no radiation damage and offers high resolution. To better understand the ventricular condition of a patient, accurate segmentation systems are required to correctly delineate the positions of the various parts of the ventricle; however, the conventional clinical procedure of visually assessing three-dimensional MRI images is time consuming and depends on the clinical experience of the physician. It is therefore important to find a system that improves the accuracy and efficiency of diagnosis for the various parts of the ventricle.
The challenges faced by the prior art are mainly: 1. Complex motion and blood flow within the heart cause the imaging data to contain significant amounts of motion artifacts, intensity non-uniformity, and noise, so a detection system may miss subtle changes, reducing detection sensitivity. 2. Most existing algorithms are only suitable for processing two-dimensional natural images, while MRI data are three-dimensional structures composed of parallel scanned image frames, so two-dimensional positioning algorithms ignore important inter-frame information. 3. The shape of the heart may vary greatly across different states. This deformation is particularly pronounced in patients suffering from heart disease: the heart chambers can deform severely with breathing, so regions of the same nature become greatly deformed, especially the myocardium surrounding the left ventricle and the right ventricular portion, which creates significant interference and challenges for a segmentation system. 4. Because medical image data volumes are small and high-quality labeled data and training samples are lacking, trained models generalize poorly and are prone to overfitting.
In summary, providing a deep-learning-based 3D ventricular nuclear magnetic resonance video segmentation optimization system that improves the accuracy and efficiency of ventricular segmentation by exploiting the continuity information within and between MRI video image frames has become an important technical problem to be solved urgently.
Disclosure of Invention
Aiming at the defects of the prior art of ventricular segmentation of the current medical image, the invention provides a 3D ventricular nuclear magnetic resonance video segmentation optimization system based on deep learning, which is used for automatically segmenting the positions of all parts of a ventricle, and has high accuracy of a positioning result and higher robustness of a model.
The aim of the invention is realized by the following technical scheme: a 3D ventricular MRI video segmentation optimization system based on deep learning comprises a 3D ventricular MRI video data preprocessing module, a deformable convolution depth attention network DeU-Net and an image detection module;
The 3D ventricular MRI video data preprocessing module performs normalization processing on each frame of MRI image in the existing 3D ventricular MRI video data set and inputs the result into the deformable convolution depth attention network DeU-Net;
The deformable convolution depth attention network DeU-Net includes a depth spatiotemporal deformable convolution fusion module TDAM, an enhanced deformable convolution attention network EDAN and a probabilistic noise correction module PNCM:
the depth spatiotemporal deformable convolution fusion module TDAM: the module comprises a U-Net network and a deformable convolution layer; the U-Net network takes the 3D ventricular MRI video data set as input and outputs a high-dimensional feature compensation region (offset field) for the images in the MRI video segment; the deformable convolution layer takes the 3D ventricular MRI video data set and the high-dimensional feature compensation region as input to obtain the compensated high-dimensional image features;
the enhanced deformable convolution attention network EDAN: the module comprises a downsampling channel and an upsampling channel. EDAN takes the high-dimensional image features output by TDAM as input; L layers of downsampling yield the feature F_L^down, which is convolved to obtain the original feature F_L^up of the L-th layer of the upsampling channel. F_L^down and F_L^up are spliced, and the spliced feature is input together with F_L^down into the DeConv(a) module, which obtains the offset Δ_L and the fused L-th layer feature F̃_L through deformable convolution. The specific calculation is:

Δ_L = Conv(Concat(F_L^down, F_L^up)),  F̃_L = Conv(DeformConv(F_L^down, Δ_L))

The original feature F_i^up of the i-th layer of the upsampling channel is spliced with the feature F_i^down of the corresponding layer in the downsampling channel; the spliced feature, Δ_{i+1} and F̃_{i+1} are input together into the DeConv(b) module, where i ∈ [1, L−1]. Finally the fused feature F̃_1 of the last layer of the upsampling channel is obtained.

The DeConv(b) module includes a multi-scale attention module MSAM and a deformable convolution layer. F̃_{i+1} is bilinearly interpolated, spliced with Δ_{i+1} and convolved to obtain Δ_i. The inputs of MSAM are the feature x (H×W) obtained by deformable convolution of the spliced feature with Δ_i and the convolved feature y of F̃_{i+1}, where H and W denote the height and width of the input feature, respectively. The calculation formula of MSAM is:

z_k = (1/σ(x)) Σ_j φ(x_k, y_j) θ(y_j)

where k and j index the inputs x, y and the output z, k ∈ [1, H×W], σ(·) is a scalar normalization function, φ(·) computes the pairwise correlation between x and y, and θ(·) is the feature transfer function.

The feature obtained by bilinear interpolation of F̃_{i+1} is spliced with z and x (H×W) and then convolved to obtain F̃_i.
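For concreteness, the MSAM computation above can be illustrated with a short PyTorch sketch. This is a minimal illustration only: the 1×1 convolutions standing in for W_f, W_g, W_θ and the softmax used as the normalization σ(·) are assumptions not fixed by the text, not the patented implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MSAM(nn.Module):
        """Sketch of z_k = (1/sigma(x)) * sum_j phi(x_k, y_j) * theta(y_j),
        with phi(x_k, y_j) = f(x_k)^T g(y_j); f, g, theta are 1x1 convolutions
        (an assumption), and softmax over j plays the role of sigma(.)."""
        def __init__(self, c):
            super().__init__()
            self.f = nn.Conv2d(c, c, kernel_size=1)      # W_f
            self.g = nn.Conv2d(c, c, kernel_size=1)      # W_g
            self.theta = nn.Conv2d(c, c, kernel_size=1)  # W_theta

        def forward(self, x, y):
            b, c, h, w = x.shape
            fx = self.f(x).flatten(2)                    # B x C x (H*W)
            gy = self.g(y).flatten(2)                    # B x C x N_y
            ty = self.theta(y).flatten(2)                # B x C x N_y
            corr = torch.einsum('bck,bcj->bkj', fx, gy)  # pairwise correlation phi
            attn = F.softmax(corr, dim=-1)               # normalization sigma(.)
            z = torch.einsum('bkj,bcj->bck', attn, ty)   # aggregate theta(y_j)
            return z.reshape(b, c, h, w)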
The probabilistic noise correction module PNCM: the module comprises a weight-sharing twin (Siamese) neural network formed by convolution layers, two independent feature extraction layers formed by fully connected layers, and a reparameterization operation. The twin neural network takes the high-dimensional image features output by TDAM as input; its output is fed into the two feature extraction layers, one of which outputs the mean of the high-dimensional features of the input image and the other the variance. Through the reparameterization operation the variance is combined with the mean to generate an embedded vector as the final output;
the image detection module is used for 3D ventricular region segmentation; a probability heat map of each 3D ventricular MRI image in the test set is calculated using the trained deformable convolution depth attention network DeU-Net. Specifically, the feature map F̃_1 obtained by EDAN is spliced with the expanded embedded vector obtained by PNCM, and the probability heat map is obtained through a convolution layer. The probability heat map corresponding to each ventricular MRI image is then partitioned according to the segmentation probabilities to obtain the segmentation result, namely the left ventricular region, the myocardial region and the right ventricular region.
Further, the data enhancement module expands the data set by rotation, contrast adjustment and scaling during image processing. The 3D ventricular MRI video is organized along four axes x, y, z and t, where x, y and z form the spatial coordinate system and t is the time axis; video frames of the x-y plane are selected.
Further, before each frame of MRI image is input into the Deformable U-Net network, the r frames before and after the target frame along the time (t) axis are selected together with the target frame as the network input, i.e. 2r+1 frames of MRI images.
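A minimal sketch of this frame selection (boundary handling by clamping is an assumption the patent does not specify):

    def frame_window(video, t, r):
        """Return the 2r+1 frames centered on target frame t from a sequence
        indexed along the time axis, clamping indices at the boundaries."""
        T = len(video)
        idx = [min(max(i, 0), T - 1) for i in range(t - r, t + r + 1)]
        return [video[i] for i in idx]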
Further, in the depth spatiotemporal deformable convolution fusion module TDAM, the calculation process of the convolution layer is as follows:

conv_out = (conv_in + 2 × padding − kernel_size) / stride + 1

where conv_out is the output image size of the convolution layer, conv_in is the input image size, padding is the number of pixels filled around the image, kernel_size is the convolution kernel size, and stride is the step size of the convolution kernel;
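A one-line check of this size formula (floor division, as in standard convolution arithmetic):

    def conv_out_size(conv_in, padding, kernel_size, stride):
        # conv_out = (conv_in + 2*padding - kernel_size) // stride + 1
        return (conv_in + 2 * padding - kernel_size) // stride + 1

    # e.g. a 256-pixel input with a 3x3 kernel, padding 1, stride 1 keeps its size:
    assert conv_out_size(256, 1, 3, 1) == 256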
The 3×3 convolution kernel R is defined as R = {(−1, −1), (−1, 0), …, (0, 1), (1, 1)}, and the feature y(p_0) of the convolution layer is:

y(p_0) = Σ_{p_n ∈ R} w(p_n) · x(p_0 + p_n)

where p_n enumerates the positions in R, w(·) is the weight, x(·) is the feature of the input image, and p_0 is the initial position;
in the deformable convolution layer, the convolution kernel R is augmented with offsets {Δp_n | n = 1, …, N}, where N = |R|; thus, the feature y′(p_0) of the deformable convolution is:

y′(p_0) = Σ_{p_n ∈ R} w(p_n) · x(p_0 + p_n + Δp_n)

where Δp_n is the high-dimensional feature compensation region obtained through the U-Net network.
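The deformable feature y′(p_0) corresponds directly to torchvision's deform_conv2d. The sketch below uses a plain convolution as a stand-in for the U-Net offset predictor; all tensor sizes are example assumptions.

    import torch
    from torchvision.ops import deform_conv2d

    x = torch.randn(1, 16, 64, 64)                    # input features x(.)
    weight = torch.randn(32, 16, 3, 3)                # weights w(.) for the 3x3 kernel R
    offset_net = torch.nn.Conv2d(16, 2 * 3 * 3, 3, padding=1)  # stand-in offset predictor
    offset = offset_net(x)                            # offsets {Delta p_n}: 2 values per p_n
    y = deform_conv2d(x, offset, weight, padding=1)   # y'(p0) = sum_n w(p_n) x(p0 + p_n + Dp_n)
    print(y.shape)                                    # torch.Size([1, 32, 64, 64])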
Further, the enhanced deformable convolution attention network EDAN includes an upsampling channel and a downsampling channel, each containing an L-layer structure. Each layer in the downsampling channel comprises two 3×3 convolution layers and a 2×2 max-pooling layer with stride 2, each convolution layer being followed by a batch normalization operation and a ReLU function. Each layer in the upsampling channel uses a 3×3 up-convolution that doubles the feature resolution, followed by a batch normalization operation and a ReLU function, as sketched below.
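One downsampling layer of EDAN can be sketched as follows; the channel counts are illustrative assumptions:

    import torch.nn as nn

    def edan_down_block(in_c, out_c):
        """One EDAN downsampling layer: two 3x3 convolutions, each followed by
        batch normalization and ReLU, then 2x2 max-pooling with stride 2."""
        return nn.Sequential(
            nn.Conv2d(in_c, out_c, 3, padding=1),
            nn.BatchNorm2d(out_c), nn.ReLU(inplace=True),
            nn.Conv2d(out_c, out_c, 3, padding=1),
            nn.BatchNorm2d(out_c), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )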
Further, in the multi-scale attention module MSAM, the pairwise correlation function between x and y and the feature transfer function are calculated as follows:

φ(x_k, y_j) = f(x_k)^T g(y_j)

where f(x_k) = W_f x_k and g(y_j) = W_g y_j; W_f and W_g are convolution layers;

θ(y_j) generates a new representation of y_j through a convolution layer:

θ(y_j) = W_θ y_j

where W_θ is a convolution layer.
Further, the probabilistic noise correction module PNCM includes a weight-sharing twin neural network formed by convolution layers, two independent feature extraction layers formed by fully connected layers, and a reparameterization operation; the module regards the fused feature map output by TDAM as a distribution and calculates the mean and variance as:

μ = h_{θ_μ}(g_φ(F_fused)),  Σ = h_{θ_Σ}(g_φ(F_fused))

where μ and Σ are the mean and variance of the features, respectively, F_fused is the fused feature output by TDAM, g_φ(·) is the weight-sharing twin neural network consisting of convolution layers, and h_{θ_μ}(·) and h_{θ_Σ}(·) are the feature extraction layers with weight parameters θ_μ and θ_Σ.

Random noise ε is sampled from the standard Gaussian distribution N(0, I) by the reparameterization operation, and the embedded vector s = μ + ε ⊙ Σ is obtained and serves as the output of the probabilistic noise correction module.
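A minimal PyTorch sketch of PNCM under these equations; the trunk depth, pooling and layer widths are assumptions for illustration, and positivity of the variance head is not enforced here.

    import torch
    import torch.nn as nn

    class PNCM(nn.Module):
        """Sketch of PNCM: a weight-sharing convolutional trunk g_phi (the twin
        branches share weights, so a single pass is equivalent), two fully
        connected heads h_mu and h_Sigma, and the reparameterization
        s = mu + eps * Sigma with eps ~ N(0, I)."""
        def __init__(self, in_c, dim):
            super().__init__()
            self.g_phi = nn.Sequential(
                nn.Conv2d(in_c, 32, 3, padding=1), nn.ReLU(inplace=True),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
            self.h_mu = nn.Linear(32, dim)       # h_{theta_mu}
            self.h_sigma = nn.Linear(32, dim)    # h_{theta_Sigma}

        def forward(self, f_fused):
            feat = self.g_phi(f_fused)
            mu, sigma = self.h_mu(feat), self.h_sigma(feat)
            eps = torch.randn_like(sigma)        # eps ~ N(0, I)
            return mu + eps * sigma              # embedded vector s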
Further, during training the DeU-Net adopts a total loss function L_total as the loss function of the network, calculated as:

L_total(y, p_DeU-Net) = L_CE(y, p_DeU-Net) + α · L_U

where y is the segmentation label, p_DeU-Net is the prediction of DeU-Net, and α is a hyperparameter balancing the uncertainty loss L_U and the cross-entropy function L_CE.

The uncertainty loss L_U is calculated as:

L_U = (1/n) Σ_{i=1}^{n} q_i

where n is the batch size and q_i is calculated as:

q_i = (1/m) Σ_{j=1}^{m} [diag(Σ_i)]_j

where diag(·) is the diagonal vector of the input tensor, m is the total feature dimension, and Σ_i is the variance of the i-th slice.

The cross entropy L_CE is calculated as:

L_CE = − Σ_{c=1}^{M} y_c log(p_c)

where M is the number of classes, y_c is a one-hot vector, and p_c is the probability predicted by DeU-Net of belonging to class c;

the weight parameters θ are updated by gradient descent with a standard Adam optimizer:

θ_{k+1} = θ_k − η · ∇_θ L_total(θ_k)

where η is the learning rate and θ_k is the weight parameter at step k.
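A sketch of this training loss in PyTorch, assuming the combination L_total = L_CE + α·L_U reconstructed above; α = 0.1 is an arbitrary example value, and sigma is taken to be the per-sample variance vector diag(Σ_i):

    import torch
    import torch.nn.functional as F

    def total_loss(logits, target, sigma, alpha=0.1):
        """logits: (n, M, ...) class scores; target: (n, ...) labels;
        sigma: (n, m) per-sample feature variances from PNCM."""
        l_ce = F.cross_entropy(logits, target)   # L_CE over M classes
        q = sigma.mean(dim=1)                    # q_i = (1/m) sum_j diag(Sigma_i)_j
        l_u = q.mean()                           # L_U = (1/n) sum_i q_i
        return l_ce + alpha * l_u

    # Weight updates then use a standard Adam optimizer with learning rate eta:
    # optimizer = torch.optim.Adam(model.parameters(), lr=eta)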
The beneficial effects of the invention are as follows:
1) Depth features in 3D ventricular MRI video data can be learned automatically. Traditional visual assessment requires a doctor to observe and judge frame by frame, depends heavily on the doctor's experience and skill, and consumes a lot of time. DeU-Net automatically learns the high-dimensional features in 3D ventricular MRI video data and thereby discovers the intrinsic links between MRI images and the parts of the ventricle. Compared with traditional ventricular segmentation systems, the proposed system can learn high-order features that are difficult for the human eye to identify.
2) Accurate segmentation of the parts of the ventricle can be realized. Compared with existing depth-network-based segmentation algorithms, the proposed system accurately segments the patient's ventricular images; the left ventricle, myocardium and right ventricle regions it segments agree more closely with the visual evaluation of doctors while maintaining higher accuracy and efficiency. It is therefore of great value in helping the physician locate the patient's ventricular regions and in subsequent surgical treatment.
3) The system can be adapted to organ segmentation detection on different equipment and in different formats, such as CT images, ultrasound images and X-ray images. The proposed system is effective for every part of the ventricle over the whole time period.
4) Network training with a small data volume can be realized. The invention increases the sample size through image enhancement and trains and tests the model on that basis, avoiding overfitting during network training and improving its robustness. In addition, to improve the segmentation of tiny structures during ventricular contraction, the invention uses multi-frame quality enhancement to acquire the inter-frame spatiotemporal information in the 3D ventricular MRI video to compensate the target image, and uses deformable convolution to better fuse the compensation information into the target image, enhancing segmentation precision.
Drawings
FIG. 1 is a block diagram of the deep-learning-based 3D ventricular NMR video segmentation optimization system according to an embodiment of the invention;
FIG. 2 is a flow chart of an implementation of the deep-learning-based 3D ventricular NMR video segmentation optimization system according to an embodiment of the invention;
FIG. 3 is a schematic diagram of TDAM in one embodiment of the invention;
FIG. 4 is a schematic diagram of EDAN in one embodiment of the invention;
FIG. 5 is a schematic diagram of the DeConv structure in one embodiment of the invention;
FIG. 6 is a schematic diagram of MSAM in one embodiment of the invention;
FIG. 7 is a schematic diagram of PNCM in one embodiment of the invention;
FIG. 8 is a graph of ventricular segmentation results of DeU-Net in one embodiment of the invention.
Detailed Description
The invention will be described in further detail with reference to the drawings and the specific examples.
As shown in fig. 1 and fig. 2, the 3D ventricular MRI video segmentation optimization system provided by the present invention includes a 3D ventricular magnetic resonance imaging (MRI) video data preprocessing module, a deformable convolution depth attention network Deformable U-Net (DeU-Net) and an image detection module;
The 3D ventricular MRI video data preprocessing module comprises a data enhancement module and a data division module:
The data enhancement module: the existing 3D ventricular MRI video data set is split into individual MRI frames, the data set is expanded by rotation, contrast adjustment and scaling, and the image size is normalized. The 3D ventricular MRI video is organized along four axes x, y, z and t, where x, y and z form the spatial coordinate system and t is the time axis; video frames of the x-y plane are selected.
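A torchvision-based sketch of this augmentation step; the rotation, contrast and scale ranges below are illustrative assumptions, not values fixed by the patent.

    import torchvision.transforms as T

    # Expand the data set by rotation, contrast adjustment and scaling,
    # then normalize the image; the ranges are example values only.
    augment = T.Compose([
        T.RandomRotation(degrees=15),
        T.ColorJitter(contrast=0.2),
        T.RandomResizedCrop(size=256, scale=(0.8, 1.0)),
        T.ToTensor(),
        T.Normalize(mean=[0.5], std=[0.5]),
    ])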
The data division module: the enhanced image data are divided into a training set and a test set, each containing complete 3D ventricular MRI images. Before an image is input into the network, the r frames before and after the target frame along the time (t) axis are selected together with the target frame as input, i.e. 2r+1 frames of MRI images.
The deformable convolution depth attention network Deformable U-Net (DeU-Net) includes a depth spatiotemporal deformable convolution fusion module (Temporal Deformable Aggregation Module, TDAM), an enhanced deformable convolution attention network (Enhanced Deformable Attention Network, EDAN) and a probabilistic noise correction module (Probabilistic Noise Correction Module, PNCM):
As shown in fig. 3, the depth spatiotemporal deformable convolution fusion module TDAM has a 9-layer structure. The first layer comprises a convolution layer and a ReLU function and converts the (2r+1)×in_c input channels into nf channels, where in_c is the number of input image channels and nf is the number of output channels of the custom convolution layer. Layers 2 to 4 are downsampling structures, each comprising two convolution layers and two ReLU functions. Layers 5 to 6 are upsampling structures, each comprising a convolution layer, a deconvolution layer and two ReLU functions. Layer 7 is a skip-transfer structure that processes the features obtained by downsampling and fuses them with the upsampling result; it comprises two convolution layers, one deconvolution layer and three ReLU functions. Layer 8 is an offset output structure comprising two convolution layers and a ReLU function; its second convolution layer outputs (2r+1)×2×(deform_ks)² channels, where deform_ks is the deformable convolution kernel size. The ninth layer comprises a deformable convolution and a ReLU function; it takes the input images and the offsets as inputs and produces the fused high-dimensional feature maps of the image. All convolutions and deconvolutions in the upsampling and downsampling structures have stride 2, padding 1 and an unchanged number of channels; the remaining convolutions have stride 1 and padding 0, preserving the feature size. The calculation process of the convolution layer is as follows:

conv_out = (conv_in + 2 × padding − kernel_size) / stride + 1

where conv_out is the output image size of the convolution layer, conv_in is the input image size, padding is the number of pixels filled around the image, kernel_size is the convolution kernel size, and stride is the step size of the convolution kernel;
The 3×3 convolution kernel R is defined as R = {(−1, −1), (−1, 0), …, (0, 1), (1, 1)}, and the feature y(p_0) of the convolution layer is:

y(p_0) = Σ_{p_n ∈ R} w(p_n) · x(p_0 + p_n)

where p_n enumerates the positions in R, w(·) is the weight, x(·) is the feature of the input image, and p_0 is the initial position.

In the deformable convolution layer, the convolution kernel R is augmented with offsets {Δp_n | n = 1, …, N}, where N = |R|. Thus, the feature y′(p_0) of the deformable convolution is:

y′(p_0) = Σ_{p_n ∈ R} w(p_n) · x(p_0 + p_n + Δp_n)

where Δp_n is the high-dimensional feature compensation region obtained through the U-Net network.
Denoting the MRI frames of the 3D ventricle fed into the Deformable U-Net network by I_h (h = 1, …, 2r+1), TDAM outputs the fused feature according to:

F(k) = Σ_{h=1}^{2r+1} Σ_{s=1}^{S} w_{h_0,h}(k_s) · I_h(k + k_s)

where F(k) is the resulting feature, S is the convolution kernel size, w_{h_0,h} is the kernel for the h-th channel, I_h is the image of the h-th channel, h_0 is the current channel, k is any spatial position, and k_s is the deformable convolution sampling offset. TDAM additionally provides a learnable offset δ_{(h,k),s} so that

k_s ← k_s + δ_{(h,k),s}

where δ_{(h,k),s} is the learned sampling offset at position k for the h-th channel, and the whole offset prediction network produces the offset field:

δ = G(I_1, …, I_{2r+1})

where G(·) is a U-Net network.
The activation function used in TDAM is the linear rectification unit, except that the last layer uses linear activation. The linear rectification unit g(z) is calculated as:

g(z) = max(0, z)

The linear activation function g(z) is calculated as:

g(z) = z
The input data of TDAM is B×(2r+1)×3×H×W, where B is the batch size, 2r+1 is the number of input MRI images, 3 is the number of image channels, H is the image height and W is the image width. In this embodiment the input MRI size is 12×3×3×256×256, which the first layer structure changes to 12×32×256×256. After the three downsamplings of layers 2 to 4, the data sizes are in turn 12×32×128×128, 12×32×64×64 and 12×32×32×32. After the upsampling of layers 5 to 6, the image feature sizes are in turn 12×32×64×64 and 12×32×128×128. The image feature size obtained through the layer-7 skip-transfer structure is 12×32×128×128, which is combined with the upsampled input before being passed into the upsampling structure. Layer 8, the offset output structure, produces image features of size 12×54×256×256. The ninth-layer spatiotemporal deformable convolution structure yields the fused feature maps with image feature size 12×64×256×256.
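As a sanity check on these sizes, the layer-8 offset channel count follows directly from (2r+1)×2×(deform_ks)²; for this embodiment, r = 1 and deform_ks = 3:

    r, deform_ks = 1, 3                  # embodiment values: 2r+1 = 3 input frames
    offset_channels = (2 * r + 1) * 2 * deform_ks ** 2
    print(offset_channels)               # 54, matching the 12x54x256x256 offset feature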
As shown in fig. 4, the enhanced deformable convolution attention network EDAN contains one downsampling channel and one upsampling channel, each with an L-layer (L = 4) structure. Each layer in the downsampling channel comprises two 3×3 convolution layers and a 2×2 max-pooling layer with stride 2, each convolution layer being followed by a batch normalization operation and a ReLU function. EDAN takes the high-dimensional image features output by TDAM as input; L layers of downsampling yield the feature F_L^down, which is convolved to obtain the original feature F_L^up of the L-th layer of the upsampling channel. F_L^down and F_L^up are spliced and input together with F_L^down into the DeConv(a) module, yielding the offset Δ_L and the fused L-th layer feature F̃_L. The original feature F_i^up of the i-th layer of the upsampling channel is computed from the fused (i+1)-th layer feature F̃_{i+1} by a 3×3 up-convolution doubling operation, each up-convolution being followed by a batch normalization operation and a ReLU function, i ∈ [1, L−1]. The original feature F_i^up of the i-th layer is spliced with the feature F_i^down of the corresponding layer in the downsampling channel; the spliced feature, Δ_{i+1} and F̃_{i+1} are input together into the DeConv(b) module, finally yielding the fused feature F̃_1 of the last layer of the upsampling channel.
As shown in fig. 5, the DeConv(a) module contains one deformable convolution layer and three convolution layers. The spliced feature Concat(F_L^down, F_L^up) is passed through two convolution layers to compute the offset Δ_L; Δ_L and F_L^down are then input into the deformable convolution layer, and one further convolution layer yields the fused L-th layer feature F̃_L. The specific calculation is:

Δ_L = Conv(Conv(Concat(F_L^down, F_L^up))),  F̃_L = Conv(DeformConv(F_L^down, Δ_L))
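Under the reconstruction above, DeConv(a) can be sketched as follows. This is a minimal PyTorch illustration; the channel widths, the weight initialization and the use of torchvision's deform_conv2d are assumptions, not the patented implementation.

    import torch
    import torch.nn as nn
    from torchvision.ops import deform_conv2d

    class DeConvA(nn.Module):
        """Sketch of DeConv(a): two convolutions predict the offset Delta_L from
        the spliced features, a deformable convolution applies it to F_L^down,
        and a final convolution yields the fused feature F~_L."""
        def __init__(self, c, ks=3):
            super().__init__()
            self.offset = nn.Sequential(
                nn.Conv2d(2 * c, c, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(c, 2 * ks * ks, 3, padding=1))
            self.weight = nn.Parameter(torch.randn(c, c, ks, ks) * 0.01)
            self.out = nn.Conv2d(c, c, 3, padding=1)

        def forward(self, f_down, f_up):
            delta = self.offset(torch.cat([f_down, f_up], dim=1))      # Delta_L
            fused = deform_conv2d(f_down, delta, self.weight, padding=1)
            return delta, self.out(fused)                              # Delta_L, F~_L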
As shown in fig. 5, the DeConv(b) module comprises a multi-scale attention module MSAM, one deformable convolution layer, two bilinear interpolation operations and four convolution layers. F̃_{i+1} is bilinearly interpolated and spliced with the convolved feature of the (i+1)-th layer offset Δ_{i+1}, and the i-th layer offset Δ_i is calculated by a convolution layer; Δ_i and the spliced feature are input together into the deformable convolution layer, yielding the feature x (H×W).
As shown in fig. 6, the multi-scale attention module MSAM takes x (H×W) and the convolved feature y of F̃_{i+1} as inputs and derives the attention feature z, where H and W denote the height and width of the input feature, respectively. The calculation formula of MSAM is:

z_k = (1/σ(x)) Σ_j φ(x_k, y_j) θ(y_j)

where k and j index the inputs x, y and the output z, k ∈ [1, H×W], σ(·) is a scalar normalization function, and φ(·) computes the pairwise correlation between x and y as:

φ(x_k, y_j) = f(x_k)^T g(y_j)

where f(x_k) = W_f x_k and g(y_j) = W_g y_j; W_f and W_g are convolution layers;

θ(y_j) generates a new representation of y_j through a convolution layer:

θ(y_j) = W_θ y_j

where W_θ is a convolution layer.

The feature obtained by bilinear interpolation of F̃_{i+1} is spliced with the attention feature z and x (H×W) and then convolved to obtain F̃_i. The calculation formula of DeConv(b) is:

F̃_i = Conv(Concat(Bilinear(F̃_{i+1}), z, x))
As shown in fig. 7, the probabilistic noise correction module PNCM comprises a weight-sharing twin neural network formed by convolution layers, two independent feature extraction layers formed by fully connected layers, and a reparameterization operation. The module regards the fused feature map output by TDAM as a distribution and calculates the mean and variance as:

μ = h_{θ_μ}(g_φ(F_fused)),  Σ = h_{θ_Σ}(g_φ(F_fused))

where μ and Σ are the mean and variance of the features, respectively, F_fused is the fused feature of the target frame output by TDAM, g_φ(·) is the weight-sharing twin neural network consisting of convolution layers, and h_{θ_μ}(·) and h_{θ_Σ}(·) are the feature extraction layers with parameters θ_μ and θ_Σ.

The reparameterization samples random noise ε from the standard Gaussian distribution N(0, I) and yields the embedded vector s = μ + ε ⊙ Σ as the module output.
During model training, DeU-Net is optimized with the total loss function L_total:

L_total(y, p_DeU-Net) = L_CE(y, p_DeU-Net) + α · L_U

where y is the segmentation label, p_DeU-Net is the prediction of DeU-Net, and α is a hyperparameter balancing the uncertainty loss L_U and the cross-entropy function L_CE.

The uncertainty loss L_U is calculated as:

L_U = (1/n) Σ_{i=1}^{n} q_i

where n is the batch size and q_i is calculated as:

q_i = (1/m) Σ_{j=1}^{m} [diag(Σ_i)]_j

where diag(·) is the diagonal vector of the input tensor, m is the total feature dimension, and Σ_i is the variance of the i-th slice.

The cross entropy L_CE is calculated as:

L_CE = − Σ_{c=1}^{M} y_c log(p_c)

where M is the number of classes, y_c is a one-hot vector, and p_c is the probability predicted by the network model of belonging to class c;

the weight parameters θ are updated by gradient descent with a standard Adam optimizer:

θ_{k+1} = θ_k − η · ∇_θ L_total(θ_k)

where η is the learning rate and θ_k is the weight parameter at step k.
The image detection module is used for segmenting the 3D ventricular regions; the probability heat map of each 3D ventricular MRI image in the test set is calculated with the trained network. Specifically, the feature map obtained by EDAN is expanded and spliced with the embedded vector obtained by PNCM, and a probability heat map is obtained through a convolution layer. The probability heat map corresponding to each ventricular MRI image is then partitioned according to the segmentation probabilities to obtain the segmentation result, namely the left ventricular region, the myocardial region and the right ventricular region.
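The final prediction step can be sketched as follows; the fusion head and all channel sizes are hypothetical stand-ins for the convolution layer described above.

    import torch
    import torch.nn as nn

    def predict_segmentation(feat_map, embed, head):
        """feat_map: EDAN output (B, C, H, W); embed: PNCM vector (B, D);
        head: a convolution layer, e.g. nn.Conv2d(C + D, 4, kernel_size=1).
        The embedding is expanded over the spatial grid, spliced with the
        feature map, and convolved into a probability heat map; argmax over
        classes gives background / left ventricle / myocardium / right ventricle."""
        b, _, h, w = feat_map.shape
        embed_map = embed[:, :, None, None].expand(b, embed.shape[1], h, w)
        logits = head(torch.cat([feat_map, embed_map], dim=1))
        probs = torch.softmax(logits, dim=1)     # probability heat map
        return probs.argmax(dim=1)               # per-pixel segmentation labels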
In a specific application of the system of this embodiment, as shown in fig. 2, the collected 3D ventricular MRI data set is first divided into a training set and a test set. A U-Net network is used to build the depth spatiotemporal deformable convolution fusion module TDAM, which obtains the offsets of the target image within a 3D ventricular MRI video segment; the offsets are fused into the target image by deformable convolution. The result is then input into the deformable convolution attention network EDAN and the probabilistic noise correction module PNCM, which respectively produce a feature map fused with spatiotemporal information and an embedded vector representing uncertainty. The feature map is spliced with the expanded embedded vector, and the segmentation result image is finally obtained, as shown in fig. 8, realizing accurate segmentation of the patient's ventricle in the 3D ventricular MRI video. The final Dice score of the whole video segmentation result is 92.9%; compared with existing segmentation algorithms based on deep neural networks, the left ventricle, myocardium and right ventricle regions segmented by the system agree more closely with doctors' visual assessment while maintaining higher accuracy and efficiency.
The present invention is not limited to the above-described preferred embodiment. Any other form of deep-learning-based 3D ventricular nuclear magnetic resonance video segmentation optimization system obtained under the teaching of the present invention shall fall within the scope of the present invention.

Claims (8)

1. The 3D ventricular nuclear magnetic resonance video segmentation optimization system based on deep learning is characterized by comprising a 3D ventricular nuclear magnetic resonance MRI video data preprocessing module, a deformable convolution depth attention network DeU-Net and an image detection module;
The 3D ventricular MRI video data preprocessing module performs normalization processing on each frame of MRI image in the existing 3D ventricular MRI video data set and inputs the result into the deformable convolution depth attention network DeU-Net;
The deformable convolution depth attention network DeU-Net includes a depth spatiotemporal deformable convolution fusion module TDAM, an enhanced deformable convolution attention network EDAN and a probabilistic noise correction module PNCM:
the depth spatiotemporal deformable convolution fusion module TDAM: the module comprises a U-Net network and a deformable convolution layer; the U-Net network takes the 3D ventricular MRI video data set as input and outputs a high-dimensional feature compensation region for the images in the MRI video segment; the deformable convolution layer takes the 3D ventricular MRI video data set and the high-dimensional feature compensation region as input to obtain the compensated high-dimensional image features;
the enhanced deformable convolution attention network EDAN: the module comprises a downsampling channel and an upsampling channel; EDAN takes the high-dimensional image features output by TDAM as input; L layers of downsampling yield the feature F_L^down, which is convolved to obtain the original feature F_L^up of the L-th layer of the upsampling channel; F_L^down and F_L^up are spliced, and the spliced feature is input together with F_L^down into the DeConv(a) module, which obtains the offset Δ_L and the fused L-th layer feature F̃_L through deformable convolution:

Δ_L = Conv(Concat(F_L^down, F_L^up)),  F̃_L = Conv(DeformConv(F_L^down, Δ_L))

the original feature F_i^up of the i-th layer of the upsampling channel is spliced with the feature F_i^down of the corresponding layer in the downsampling channel; the spliced feature, Δ_{i+1} and F̃_{i+1} are input together into the DeConv(b) module, where i ∈ [1, L−1]; finally the fused feature F̃_1 of the last layer of the upsampling channel is obtained;

the DeConv(b) module includes a multi-scale attention module MSAM and a deformable convolution layer; F̃_{i+1} is bilinearly interpolated, spliced with Δ_{i+1} and convolved to obtain Δ_i; the inputs of MSAM are the feature x (H×W) obtained by deformable convolution of the spliced feature with Δ_i and the convolved feature y of F̃_{i+1}, where H and W denote the height and width of the input feature, respectively; the calculation formula of MSAM is:

z_k = (1/σ(x)) Σ_j φ(x_k, y_j) θ(y_j)

where k and j index the inputs x, y and the output z, k ∈ [1, H×W], σ(·) is a scalar normalization function, φ(·) computes the pairwise correlation between x and y, and θ(·) is the feature transfer function;

the feature obtained by bilinear interpolation of F̃_{i+1} is spliced with z and x (H×W) and then convolved to obtain F̃_i;
the probabilistic noise correction module PNCM: the module comprises a weight-sharing twin neural network formed by convolution layers, two independent feature extraction layers formed by fully connected layers, and a reparameterization operation; the twin neural network takes the high-dimensional image features output by TDAM as input; its output is fed into the two feature extraction layers, one of which outputs the mean of the high-dimensional features of the input image and the other the variance; through the reparameterization operation the variance is combined with the mean to generate an embedded vector as the final output;
the image detection module is used for 3D ventricular region segmentation; a probability heat map of each 3D ventricular MRI image in the test set is calculated using the trained deformable convolution depth attention network DeU-Net; specifically, the feature map F̃_1 obtained by EDAN is spliced with the embedded vector obtained by PNCM, and the probability heat map is obtained through a convolution layer; the probability heat map corresponding to each ventricular MRI image is then partitioned according to the segmentation probabilities to obtain the segmentation result, namely the left ventricular region, the myocardial region and the right ventricular region.
2. The deep-learning-based 3D ventricular MRI video segmentation optimization system of claim 1, wherein the 3D ventricular MRI video data preprocessing module comprises a data enhancement module, the data enhancement module expands the data set by rotation, contrast adjustment and scaling during image processing, the 3D ventricular MRI video is organized along four axes x, y, z and t, where x, y and z form the spatial coordinate system and t is the time axis, and video frames of the x-y plane are selected.
3. The deep-learning-based 3D ventricular nuclear magnetic resonance video segmentation optimization system of claim 1, wherein before each frame of MRI image is input into the Deformable U-Net network, the r frames before and after the target frame along the time (t) axis are selected together with the target frame as the input of the Deformable U-Net network, i.e. 2r+1 frames of MRI images.
4. The deep-learning-based 3D ventricular nuclear magnetic resonance video segmentation optimization system of claim 1, wherein in the depth spatiotemporal deformable convolution fusion module TDAM, the calculation process of the convolution layer is as follows:

conv_out = (conv_in + 2 × padding − kernel_size) / stride + 1

where conv_out is the output image size of the convolution layer, conv_in is the input image size, padding is the number of pixels filled around the image, kernel_size is the convolution kernel size, and stride is the step size of the convolution kernel;

the 3×3 convolution kernel R is defined as R = {(−1, −1), (−1, 0), …, (0, 1), (1, 1)}, and the feature y(p_0) of the convolution layer is:

y(p_0) = Σ_{p_n ∈ R} w(p_n) · x(p_0 + p_n)

where p_n enumerates the positions in R, w(·) is the weight, x(·) is the feature of the input image, and p_0 is the initial position;

in the deformable convolution layer, the convolution kernel R is augmented with offsets {Δp_n | n = 1, …, N}, where N = |R|; thus, the feature y′(p_0) of the deformable convolution is:

y′(p_0) = Σ_{p_n ∈ R} w(p_n) · x(p_0 + p_n + Δp_n)

where Δp_n is the high-dimensional feature compensation region obtained through the U-Net network.
5. The deep-learning-based 3D ventricular nuclear magnetic resonance video segmentation optimization system of claim 1, wherein the enhanced deformable convolution attention network EDAN comprises an upsampling channel and a downsampling channel, each containing an L-layer structure; each layer in the downsampling channel comprises two 3×3 convolution layers and a 2×2 max-pooling layer with stride 2, each convolution layer being followed by a batch normalization operation and a ReLU function; each layer in the upsampling channel uses a 3×3 up-convolution that doubles the feature resolution, followed by a batch normalization operation and a ReLU function.
6. The deep-learning-based 3D ventricular nuclear magnetic resonance video segmentation optimization system of claim 1, wherein in the multi-scale attention module MSAM, the pairwise correlation function between x and y and the feature transfer function are calculated as follows:

φ(x_k, y_j) = f(x_k)^T g(y_j)

where f(x_k) = W_f x_k and g(y_j) = W_g y_j; W_f and W_g are convolution layers;

θ(y_j) generates a new representation of y_j through a convolution layer:

θ(y_j) = W_θ y_j

where W_θ is a convolution layer.
7. The deep-learning-based 3D ventricular nuclear magnetic resonance video segmentation optimization system of claim 1, wherein the probabilistic noise correction module PNCM comprises a weight-sharing twin neural network formed by convolution layers, two independent feature extraction layers formed by fully connected layers, and a reparameterization operation; the module regards the fused feature map output by TDAM as a distribution and calculates the mean and variance as:

μ = h_{θ_μ}(g_φ(F_fused)),  Σ = h_{θ_Σ}(g_φ(F_fused))

where μ and Σ are the mean and variance of the features, respectively, F_fused is the fused feature output by TDAM, g_φ(·) is the weight-sharing twin neural network consisting of convolution layers, and h_{θ_μ}(·) and h_{θ_Σ}(·) are the feature extraction layers with weight parameters θ_μ and θ_Σ;

random noise ε is sampled from the standard Gaussian distribution N(0, I) by the reparameterization operation, and the embedded vector s = μ + ε ⊙ Σ is obtained and serves as the output of the probabilistic noise correction module.
8. The deep-learning-based 3D ventricular nuclear magnetic resonance video segmentation optimization system of claim 1, wherein during training the DeU-Net adopts a total loss function L_total as the loss function of the network, calculated as:

L_total(y, p_DeU-Net) = L_CE(y, p_DeU-Net) + α · L_U

where y is the segmentation label, p_DeU-Net is the prediction of DeU-Net, and α is a hyperparameter balancing the uncertainty loss L_U and the cross-entropy function L_CE;

the uncertainty loss L_U is calculated as:

L_U = (1/n) Σ_{i=1}^{n} q_i

where n is the batch size and q_i is calculated as:

q_i = (1/m) Σ_{j=1}^{m} [diag(Σ_i)]_j

where diag(·) is the diagonal vector of the input tensor, m is the total feature dimension, and Σ_i is the variance of the i-th slice; the cross entropy L_CE is calculated as:

L_CE = − Σ_{c=1}^{M} y_c log(p_c)

where M is the number of classes, y_c is a one-hot vector, and p_c is the probability predicted by DeU-Net of belonging to class c; the weight parameters θ are updated by gradient descent with a standard Adam optimizer:

θ_{k+1} = θ_k − η · ∇_θ L_total(θ_k)

where η is the learning rate and θ_k is the weight parameter at step k.
CN202210035567.7A 2022-01-13 2022-01-13 3D ventricular nuclear magnetic resonance video segmentation optimization system based on deep learning Active CN114359310B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210035567.7A CN114359310B (en) 2022-01-13 2022-01-13 3D ventricular nuclear magnetic resonance video segmentation optimization system based on deep learning


Publications (2)

Publication Number Publication Date
CN114359310A CN114359310A (en) 2022-04-15
CN114359310B (en) 2024-06-04

Family

ID=81110097

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210035567.7A Active CN114359310B (en) 2022-01-13 2022-01-13 3D ventricular nuclear magnetic resonance video segmentation optimization system based on deep learning

Country Status (1)

Country Link
CN (1) CN114359310B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115239765B (en) * 2022-08-02 2024-03-29 合肥工业大学 Infrared image target tracking system and method based on multi-scale deformable attention
CN116152285B (en) * 2023-02-15 2023-08-18 哈尔滨工业大学 Image segmentation system based on deep learning and gray information
CN117789153B (en) * 2024-02-26 2024-05-03 浙江驿公里智能科技有限公司 Automobile oil tank outer cover positioning system and method based on computer vision
CN118096785B (en) * 2024-04-28 2024-06-25 北明成功软件(山东)有限公司 Image segmentation method and system based on cascade attention and multi-scale feature fusion


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109191476A (en) * 2018-09-10 2019-01-11 重庆邮电大学 The automatic segmentation of Biomedical Image based on U-net network structure
WO2020246996A1 (en) * 2019-06-06 2020-12-10 Elekta, Inc. Sct image generation using cyclegan with deformable layers
CN111311592A (en) * 2020-03-13 2020-06-19 中南大学 Three-dimensional medical image automatic segmentation method based on deep learning
CN111768432A (en) * 2020-06-30 2020-10-13 中国科学院自动化研究所 Moving target segmentation method and system based on twin deep neural network
CN111932550A (en) * 2020-07-01 2020-11-13 浙江大学 3D ventricle nuclear magnetic resonance video segmentation system based on deep learning
CN112017198A (en) * 2020-10-16 2020-12-01 湖南师范大学 Right ventricle segmentation method and device based on self-attention mechanism multi-scale features
CN113516659A (en) * 2021-09-15 2021-10-19 浙江大学 Medical image automatic segmentation method based on deep learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DeU-Net 2.0: Enhanced deformable U-Net for 3D cardiac cine MRI segmentation; Shunjie Dong et al.; Medical Image Analysis; 2022-05-01; full text *
DeU-Net: Deformable U-Net for 3D Cardiac MRI Video Segmentation; Shunjie Dong et al.; Medical Image Computing and Computer Assisted Intervention – MICCAI 2020; 2020-09-29; full text *
Left ventricular image segmentation method based on fully convolutional neural networks; Xie Wenxin; Yuan Jinhui; Hu Xiaofei; Software Guide (软件导刊); 2020-05-15 (05); full text *

Also Published As

Publication number Publication date
CN114359310A (en) 2022-04-15

Similar Documents

Publication Publication Date Title
CN111932550B (en) 3D ventricle nuclear magnetic resonance video segmentation system based on deep learning
CN114359310B (en) 3D ventricular nuclear magnetic resonance video segmentation optimization system based on deep learning
US11645748B2 (en) Three-dimensional automatic location system for epileptogenic focus based on deep learning
CN110930416B (en) MRI image prostate segmentation method based on U-shaped network
US11430140B2 (en) Medical image generation, localizaton, registration system
US20220028085A1 (en) Method and system for providing an at least 3-dimensional medical image segmentation of a structure of an internal organ
WO2022121100A1 (en) Darts network-based multi-modal medical image fusion method
CN115496771A (en) Brain tumor segmentation method based on brain three-dimensional MRI image design
CN112785632B (en) Cross-modal automatic registration method for DR and DRR images in image-guided radiotherapy based on EPID
CN111584066B (en) Brain medical image diagnosis method based on convolutional neural network and symmetric information
JP7270319B2 (en) Left ventricle automatic segmentation method of SPECT three-dimensional reconstructed image
CN115830016B (en) Medical image registration model training method and equipment
Liu et al. 3-D prostate MR and TRUS images detection and segmentation for puncture biopsy
Jeevakala et al. Artificial intelligence in detection and segmentation of internal auditory canal and its nerves using deep learning techniques
CN117422788B (en) Method for generating DWI image based on CT brain stem image
CN113269774B (en) Parkinson disease classification and lesion region labeling method of MRI (magnetic resonance imaging) image
CN113361689A (en) Training method of super-resolution reconstruction network model and scanning image processing method
CN113344940A (en) Liver blood vessel image segmentation method based on deep learning
CN112750131A (en) Pelvis nuclear magnetic resonance image musculoskeletal segmentation method based on scale and sequence relation
CN115424319A (en) Strabismus recognition system based on deep learning
CN114359309A (en) Medical image segmentation method based on index point detection and shape gray scale model matching
CN115272386A (en) Multi-branch segmentation system for cerebral hemorrhage and peripheral edema based on automatic generation label
Han Cerebellum parcellation from magnetic resonance imaging using deep learning
CN115272385A (en) Automatic label generation based cooperative cross segmentation system for cerebral hemorrhage and peripheral edema
Rohini et al. ConvNet based detection and segmentation of brain tumor from MR images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant