CN111932550B - 3D ventricle nuclear magnetic resonance video segmentation system based on deep learning - Google Patents
- Publication number
- CN111932550B (application CN202010622947.1A)
- Authority
- CN
- China
- Prior art keywords
- convolution
- deformable
- image
- module
- mri
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10088—Magnetic resonance imaging [MRI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30048—Heart; Cardiac
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Magnetic Resonance Imaging Apparatus (AREA)
Abstract
The invention discloses a deep-learning-based 3D ventricular nuclear magnetic resonance video segmentation system comprising: an MRI data preprocessing module; a deformable-convolution depth attention network, in which the depth space-time deformable convolution fusion module TDAM feeds consecutive slice images of the 3D ventricular MRI video along the time axis into the network to obtain a compensation (offset) field for the high-dimensional image within the MRI video segment, and a deformable convolution layer is used to obtain the compensated high-dimensional image features; a deformable convolution attention module is then constructed to obtain an attention feature map, and an additive attention module suppresses irrelevant background, finally yielding the network model. A newly input 3D ventricular MRI video is segmented directly with the trained network model. By introducing multi-frame image compensation, deformable convolution, and an additive attention mechanism, the accuracy and efficiency of ventricular segmentation are effectively improved, and the system has high robustness.
Description
Technical Field
The invention relates to the technical field of medical image engineering, in particular to a 3D ventricle nuclear magnetic resonance video segmentation system based on deep learning.
Background
With the development of medical imaging technology and artificial intelligence, automatic and semi-automatic computer-aided diagnosis systems are gradually replacing traditional manual diagnosis for accurate diagnosis and treatment. Magnetic resonance imaging (MRI) is now widely used in ventricular diagnostics because it causes no radiation damage and offers high resolution. To understand the condition of a patient's ventricle, the position of each part of the ventricle must be segmented correctly with an accurate segmentation system; however, the conventional clinical procedure of visually evaluating three-dimensional MRI images is time-consuming and depends on the clinical experience of the doctor. A system that improves the accuracy and efficiency of diagnosing the parts of the heart chamber is therefore important.
The challenges faced by the prior art are mainly: 1. Magnetic resonance imaging is very sensitive to patient body movement and prone to artifacts, yet subtle changes are ignored by detection systems, reducing detection sensitivity. 2. Most existing algorithms are suited only to two-dimensional natural images, whereas an MRI sequence is a three-dimensional structure formed by parallel-scanned image frames, so two-dimensional localization algorithms ignore important inter-frame information. 3. The patient's heart chambers deform severely with breathing, so regions of the same tissue can deform greatly, especially the myocardium surrounding the left ventricle and the right ventricle; this is a significant source of interference and a challenge for a segmentation system. 4. Because medical image data are scarce and high-quality annotations and training samples are lacking, a trained model may overfit or generalize poorly.
In summary, providing a deep-learning-based 3D ventricular nuclear magnetic resonance video segmentation system that exploits continuity information within and between MRI video frames to improve the accuracy and efficiency of ventricular segmentation has become an important and urgent technical problem.
Disclosure of Invention
Addressing the defects of current medical-image ventricular segmentation, the invention aims to provide a deep-learning-based 3D ventricular nuclear magnetic resonance video segmentation system that automatically segments the positions of all parts of the ventricle, with highly accurate localization results and a highly robust model.
The purpose of the invention is realized by the following technical scheme: a 3D ventricular magnetic resonance video segmentation system based on deep learning, characterized by comprising a 3D ventricular magnetic resonance (MRI) video data preprocessing module, a deformable-convolution depth attention network Deformable U-Net (DeU-Net), and an image detection module;
the 3D ventricular magnetic resonance MRI video data preprocessing module comprises a data enhancement module and a data division module:
the data enhancement module: splitting the existing 3D ventricular MRI video data set into MRI images of each frame, expanding the data set and carrying out normalization processing on the image size;
the data dividing module: dividing the enhanced image data into a training set and a testing set; the training set and the test set both comprise complete 3D ventricular MRI images, and the training set is used for training a Deformable convolution depth attention network Deformable U-Net;
the Deformable convolution depth attention network Deformable U-Net comprises a depth space-time Deformable convolution fusion module TDAM and a depth Deformable convolution global attention module DGPA:
the depth space-time deformable convolution fusion module TDAM: the module comprises a U-Net network and a deformable convolution layer. The TDAM feeds each frame of the continuous 3D ventricular MRI video along the time axis into the U-Net network, which outputs a high-dimensional-feature compensation (offset) field for the image within the MRI video segment; the offset field and the input image are then passed into the deformable convolution layer, which computes the compensated high-dimensional fused feature maps of the image, i.e. fused features containing information from the preceding and following frames;
the deep deformable convolution global attention module DGPA: the module comprises a U-Net network, a deformable convolution attention module, three additive attention modules, and an output layer. The deformable convolution attention module builds on a spatial attention module by passing its input through a deformable convolution layer and adding the computed deformable convolution output to the spatial attention output, which yields the final output of the deformable convolution attention module. A deformable convolution attention module is added to the first-layer skip connection of the U-Net network, and an additive attention module is added to each of the other three skip connections. The compensated high-dimensional image features are input into the U-Net network, the high-dimensional features it computes are passed to the output layer to obtain an attention feature map, and a softmax regression function then yields the segmentation probability, i.e. the probability that a given region in the MRI image belongs to the left ventricle, the myocardium, or the right ventricle;
the image detection module is used to segment the 3D ventricular regions: the probability heat map of each test-set 3D ventricular MRI image is computed with the trained network, and each heat map is segmented according to the segmentation probability obtained by the DGPA to obtain the segmentation results, namely the left ventricle, myocardium, and right ventricle regions.
Further, during image processing, the data enhancement module expands the data set by rotation, contrast adjustment, and zooming, and organizes the 3D ventricular MRI video along four axes x, y, z, and t, where x, y, and z form a spatial coordinate system and t is the time axis; video frames of the x-y plane are selected.
Furthermore, before each MRI frame is input into the Deformable U-Net network, the r frames before and after the target frame along the time-t axis are selected together with it as the network input, i.e. 2r+1 MRI frames in total.
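As a concrete illustration of this frame-selection step, the sketch below builds the 2r+1 frame indices around a target frame; clamping at the video boundaries is an assumption, since the patent does not specify how boundary frames are handled:

```python
def frame_window(num_frames: int, target: int, r: int) -> list:
    """Select the 2r+1 frame indices centred on `target` along the time axis.

    Boundary handling (clamping to the valid index range) is an assumption;
    the patent only specifies that 2r+1 frames are used.
    """
    return [min(max(target + d, 0), num_frames - 1) for d in range(-r, r + 1)]

# A video with 10 frames and r = 1: each target frame contributes 3 input frames.
print(frame_window(10, 5, 1))  # [4, 5, 6]
print(frame_window(10, 0, 1))  # [0, 0, 1]  (clamped at the start)
```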
Furthermore, the depth space-time deformable convolution fusion module TDAM comprises a 9-layer structure. The first layer comprises a convolution layer and a ReLU function and converts the input channel count (2r+1) × in_c into nf, where in_c is the number of input image channels and nf is the output channel count of the custom convolution layer. Layers 2 to 4 are down-sampling structures comprising two convolution layers and two ReLU functions. Layers 5 to 6 are up-sampling structures comprising a convolution layer, a deconvolution layer, and two ReLU functions. Layer 7 is a skip-transfer structure that processes the features obtained by down-sampling and fuses them with the up-sampling result; it comprises two convolution layers, a deconvolution layer, and three ReLU functions. Layer 8 is the offset output structure, comprising two convolution layers and a ReLU function; its second convolution layer outputs (2r+1) × 2 × (deform_ks)² channels, where deform_ks is the deformable convolution kernel size. The ninth layer is a deformable convolution and a ReLU function that take the input image and the offsets as input and produce the high-dimensional fused feature maps of the image.
Further, in the depth space-time deformable convolution fusion module TDAM, the output size of a convolution layer is computed as:

conv_out = (conv_in + 2 × padding − kernel_size) / stride + 1

where conv_out is the convolution layer output image size, conv_in is the input image size, padding is the number of pixels filled around the image, kernel_size is the convolution kernel size, and stride is the step size of the convolution kernel;
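The output-size rule for a convolution layer can be checked with a small helper (the function name is illustrative):

```python
def conv_out_size(conv_in: int, kernel_size: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution layer:
    (conv_in + 2*padding - kernel_size) // stride + 1."""
    return (conv_in + 2 * padding - kernel_size) // stride + 1

# 256x256 input, 3x3 kernel, stride 2, padding 1 -> 128 (one TDAM down-sampling step)
print(conv_out_size(256, 3, stride=2, padding=1))  # 128
# stride 1, padding 1 preserves the size
print(conv_out_size(256, 3, stride=1, padding=1))  # 256
```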
The 3 × 3 convolution kernel grid R is defined as R = {(−1,−1), (−1,0), …, (0,1), (1,1)}, and the feature y(p0) of the convolution layer is:

y(p0) = Σ_{pn∈R} w(pn) · x(p0 + pn)

where pn is a position in R, w(·) is the weight, x(·) is the feature of the input image, and p0 is the initial position;
In the deformable convolution layer, the kernel grid R is augmented with offsets {Δpn | n = 1, …, N}, where N = |R|; thus the feature y′(p0) of the deformable convolution is:

y′(p0) = Σ_{pn∈R} w(pn) · x(p0 + pn + Δpn)

where Δpn is the high-dimensional-feature compensation field obtained from the U-Net network.
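The deformable sampling y′(p0) = Σ w(pn)·x(p0 + pn + Δpn) can be sketched in miniature as follows. This toy version uses integer offsets and zero padding outside the image; a real deformable convolution interpolates x bilinearly at fractional sampling positions. All names are illustrative:

```python
def deformable_conv_at(x, w, p0, offsets):
    """Feature y'(p0) = sum_n w(p_n) * x(p0 + p_n + dp_n) at one output position.

    Minimal sketch with integer offsets; a real deformable convolution
    bilinearly interpolates x at fractional sampling positions.
    x: 2-D list (H x W), w: dict mapping grid position p_n -> weight,
    offsets: dict mapping p_n -> learned offset dp_n (default (0, 0)).
    """
    H, W = len(x), len(x[0])
    y = 0.0
    for pn, weight in w.items():
        dy, dx = offsets.get(pn, (0, 0))
        i = p0[0] + pn[0] + dy
        j = p0[1] + pn[1] + dx
        if 0 <= i < H and 0 <= j < W:  # zero-pad outside the image
            y += weight * x[i][j]
    return y

# 3x3 all-ones kernel over the grid R = {(-1,-1), ..., (1,1)}
R = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)]
w = {pn: 1.0 for pn in R}
x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(deformable_conv_at(x, w, (1, 1), {}))                # 45.0 (plain convolution)
print(deformable_conv_at(x, w, (1, 1), {(0, 0): (1, 0)}))  # centre sample shifted down: 48.0
```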
Furthermore, in the depth space-time deformable convolution fusion module TDAM, each frame of the 3D ventricular MRI video fed into the Deformable U-Net network is denoted I_h; the fusion feature output by the TDAM is computed as:

F(k) = Σ_h Σ_{s=1}^{S} K_{h0,h}(s) · I_h(k + k_s)

where F(k) is the resulting feature, S is the convolution kernel size, K_{h0,h} is the kernel of the h-th channel, I_h is the image of the h-th channel, h0 is the current channel, k is an arbitrary spatial position, and k_s is the sampling offset of the deformable convolution. TDAM provides an additional learnable offset δ so that

k_s ← k_s + δ_(h,k),s

where δ_(h,k) is the learned offset field of the h-th channel at spatial position k and δ_(h,k),s is its s-th sampling offset.
Further, the DGPA can extract correlated features across global pixel points. The compensated high-dimensional MRI features I are input into a 3 × 3 deformable convolution kernel in the deformable convolution attention module to obtain an output O.
The compensated high-dimensional features I are also input into three 1 × 1 convolution kernels in the spatial attention module to generate new feature maps B, C, D ∈ R^{N×M}, where N is the number of channels of the feature map, M = H × W is the number of pixels, and H and W are the height and width of the feature map, respectively. B and the transposed C are matrix-multiplied, and the result is passed through the softmax formula to obtain the spatial attention map S ∈ R^{M×M}, each element of which is computed as:

s_ji = exp(B_i · C_j^T) / Σ_{i=1}^{M} exp(B_i · C_j^T)

where s_ji is the element in row i and column j of S, B_i is the i-th row of feature map B, and C_j^T is the j-th column of the transposed feature map C. The spatial attention map S is matrix-multiplied with the feature map D, and the result is added to the earlier deformable convolution output O to obtain the final result E:

E_j = α Σ_{i=1}^{M} (s_ji · D_i) + O_j

where α is a weight coefficient, D_i is the i-th row of feature map D, and O_j is the j-th column of output O.
Further, in the deep deformable convolution global attention module DGPA, an additive attention module is used to suppress irrelevant background. G is the N × W × H feature map of the down-sampling stage and X is the N × W × H feature map of the up-sampling stage; each passes through a 1 × 1 convolution kernel that changes its dimension to F_int × W × H, where F_int is a preset dimension parameter of the U-Net network. The two resulting matrices are added point by point and passed through a ReLU activation layer; a 1 × 1 convolution kernel then changes the dimension of the result to 1 × W × H, and a sigmoid function yields the weight coefficient α. The input X is multiplied by the weight coefficient α to obtain the attention feature map.
Further, the Deformable U-Net adopts the cross-entropy function as the loss function of the network during training; the cross entropy Loss_seg is computed as:

Loss_seg = − Σ_{c=1}^{M} y_c log(p_c)

where M is the number of categories, y_c is the one-hot label vector, and p_c is the probability predicted by the network model for class c;
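A minimal numeric sketch of the cross-entropy Loss_seg for a single pixel; the three-class split used here (left ventricle, myocardium, right ventricle) follows the abstract, and all names are illustrative:

```python
import math

def cross_entropy(y_onehot, p):
    """Loss_seg = -sum_c y_c * log(p_c) over M classes for one prediction."""
    return -sum(y * math.log(pc) for y, pc in zip(y_onehot, p) if y > 0)

# True class: myocardium (index 1); a confident prediction gives a small loss.
print(round(cross_entropy([0, 1, 0], [0.1, 0.8, 0.1]), 4))  # 0.2231
print(round(cross_entropy([0, 1, 0], [0.6, 0.2, 0.2]), 4))  # 1.6094
```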
The weight parameter θ of the Deformable U-Net is updated by gradient descent with a standard Adam optimizer; in simplified form:

θ_{k+1} = θ_k − η · ∇_θ Loss_seg(θ_k)

where η is the learning rate and θ_k is the weight parameter at the k-th step.
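For concreteness, a single Adam update step can be sketched as below; the hyper-parameter values (η, β1, β2, ε) are common defaults, not values from the patent:

```python
def adam_step(theta, grad, m, v, k, eta=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update of the weight vector theta (a minimal sketch;
    hyper-parameter values are illustrative, not from the patent)."""
    new_theta, new_m, new_v = [], [], []
    for t, g, mi, vi in zip(theta, grad, m, v):
        mi = b1 * mi + (1 - b1) * g        # first-moment estimate
        vi = b2 * vi + (1 - b2) * g * g    # second-moment estimate
        m_hat = mi / (1 - b1 ** k)         # bias correction
        v_hat = vi / (1 - b2 ** k)
        new_theta.append(t - eta * m_hat / (v_hat ** 0.5 + eps))
        new_m.append(mi)
        new_v.append(vi)
    return new_theta, new_m, new_v

theta, m, v = [0.5], [0.0], [0.0]
theta, m, v = adam_step(theta, [0.2], m, v, k=1)
print(round(theta[0], 6))  # one full-size step of ~eta: 0.499
```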
The invention has the following beneficial effects:
1) Depth features in 3D ventricular MRI video data can be learned automatically. Conventional visual assessment requires a doctor to observe and judge frame by frame, depends heavily on the doctor's experience and skill, and consumes a great deal of time. DeU-Net automatically learns high-dimensional features in 3D ventricular MRI video data to discover the intrinsic associations between MRI images and the parts of the ventricle. Compared with traditional ventricular segmentation systems, the proposed system can learn high-order features that are difficult for the human eye to recognize.
2) Accurate segmentation of each part of the ventricle can be realized. The proposed system segments the patient's ventricular images accurately; compared with existing depth-network-based segmentation algorithms, the left ventricle, myocardium, and right ventricle regions it segments agree better with a doctor's visual evaluation while maintaining high accuracy and efficiency. It is therefore valuable for helping doctors locate a patient's ventricular regions and for subsequent surgical treatment.
3) The system can be adapted to organ segmentation in images of different formats from different devices, such as CT, ultrasound, and X-ray images. The proposed system is effective for every part of the ventricle and over the full time period.
4) Network training with a small data volume can be realized. The invention enlarges the sample size through image enhancement and trains and tests the model on that basis, avoiding over-fitting and improving the robustness of network training. In addition, to improve the segmentation of small structures during ventricular contraction, the invention uses multi-frame quality enhancement to acquire the spatio-temporal information between frames of the 3D ventricular MRI video to compensate the target image, and uses deformable convolution to better fuse the compensated information into the target image, enhancing segmentation precision.
Drawings
FIG. 1 is a block diagram of a deep learning based 3D ventricular MRI video segmentation system according to an embodiment of the present invention;
FIG. 2 is a flow chart of an implementation of a deep learning based 3D ventricular nuclear magnetic resonance video segmentation system according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating the construction of DeU-Net according to one embodiment of the present invention;
FIG. 4 is a schematic diagram of DeU-Net configuration according to one embodiment of the present invention;
FIG. 5 is a graph of DeU-Net ventricular segmentation results, in accordance with one embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and specific examples.
As shown in fig. 1 and fig. 2, the 3D ventricular MRI video segmentation system provided by the present invention includes a 3D ventricular nuclear magnetic resonance (MRI) video data preprocessing module, a deformable-convolution depth attention network Deformable U-Net (DeU-Net), and an image detection module;
The 3D ventricular nuclear magnetic resonance (MRI) video data preprocessing module comprises a data enhancement module and a data partitioning module:
the data enhancement module: the method comprises the steps of splitting an existing 3D ventricular MRI video data set into MRI images of frames, expanding the data set in a rotating, contrast adjusting and scaling mode, and normalizing the size of the images. The 3D ventricular MRI video is divided into four directions of x, y, z and t, wherein the x, y and z represent a space coordinate system, the t represents a time axis, and video frames of an x-y plane are selected.
The data dividing module: dividing the enhanced image data into a training set and a testing set; both the training set and the test set contain complete 3D ventricular MRI images. Before an image is input into a network, front and back r frames of a target frame along the time t axis direction are selected as Deformable U-Net network input, namely 2r +1 frame MRI images.
As shown in fig. 3 and 4, the Deformable convolved deep attention network Deformable U-Net (DeU-Net) includes a deep spatiotemporal Deformable convolution fusion module TDAM and a deep Deformable convolved global attention module DGPA:
the depth space-time deformable convolution fusion module TDAM comprises 9 layers of structures, the first layer comprises a convolution layer and a Relu function, and the input channel number (2r +1) × in _ c is converted into nf, wherein in _ c is the input image channel number, and nf is the output channel number of the custom convolution layer. Layers 2 to 4 are downsampled structures comprising two convolutional layers and two Relu functions. Layers 5 to 6 are upsampled structures comprising a convolutional layer, an anti-convolutional layer and two Relu functions. The 7 th layer is a jump transfer structure, and the features obtained by down sampling are processed and then fused with an up sampling result, and the jump transfer structure comprises two convolution layers, an anti-convolution layer and three Relu functions. The 8 th layer is an offset output structure and comprises two convolution layers and a Relu function, and the second convolution layer outputs the channel number (2r +1) × 2 × (defem _ ks)2Where deform _ ks is the deformable convolution kernel size. The ninth layer structure is a deformable convolution and a Relu function, and the input image and the offset are used as the layer input to obtain the high-dimensional characteristic fused feature maps of the image. In all convolution and deconvolution in upsampling and downsampling, the step size is 2, padding is 1, and the number of channels is the same. The rest(s)Convolution step size is 1 and padding is 0 to preserve feature size. The calculation process of the convolutional layer is as follows:
conv_out = (conv_in + 2 × padding − kernel_size) / stride + 1

where conv_out is the convolution layer output image size, conv_in is the input image size, padding is the number of pixels filled around the image, kernel_size is the convolution kernel size, and stride is the step size of the convolution kernel.
The 3 × 3 convolution kernel grid R is defined as R = {(−1,−1), (−1,0), …, (0,1), (1,1)}, and the feature y(p0) of the convolution layer is:

y(p0) = Σ_{pn∈R} w(pn) · x(p0 + pn)

where pn is a position in R, w(·) is the weight, x(·) is the feature of the input image, and p0 is the initial position.
In the deformable convolution layer, the kernel grid R is augmented with offsets {Δpn | n = 1, …, N}, where N = |R|; thus the feature y′(p0) of the deformable convolution is:

y′(p0) = Σ_{pn∈R} w(pn) · x(p0 + pn + Δpn)

where Δpn is the high-dimensional-feature compensation field obtained from the U-Net network.
Each frame of the 3D ventricular MRI video fed into the Deformable U-Net network is denoted I_h; the fusion feature output by the TDAM is computed as:

F(k) = Σ_h Σ_{s=1}^{S} K_{h0,h}(s) · I_h(k + k_s)

where F(k) is the resulting feature, S is the convolution kernel size, K_{h0,h} is the kernel of the h-th channel, I_h is the image of the h-th channel, h0 is the current channel, k is an arbitrary spatial position, and k_s is the sampling offset of the deformable convolution. TDAM provides an additional learnable offset δ so that

k_s ← k_s + δ_(h,k),s

where δ_(h,k) is the learnable offset field of the h-th channel at position k and δ_(h,k),s is its s-th sampling offset; the overall offset-prediction network thus outputs an offset field with (2r+1) × 2 × (deform_ks)² channels.
The activation functions used in TDAM are rectified linear units (ReLU), except for the last layer, which uses linear activation. The ReLU g(z)′ is computed as:

g(z)′ = max(0, z)

The linear activation function g(z) is computed as:

g(z) = z
The input data in TDAM is B × (2r+1) × 3 × H × W, where B is the batch size, 2r+1 is the number of input MRI frames, 3 is the number of image channels, H is the image height, and W is the image width. In this embodiment, the input MRI size is 12 × 3 × 3 × 256 × 256, which becomes 12 × 32 × 256 × 256 after the first-layer structure. After the three down-sampling steps of layers 2 to 4, the data sizes are 12 × 32 × 128 × 128, 12 × 32 × 64 × 64, and 12 × 32 × 32 × 32 in turn. After the two up-sampling steps of layers 5 to 6, the feature sizes are 12 × 32 × 64 × 64 and 12 × 32 × 128 × 128 in turn. The layer-7 skip-transfer structure yields features of size 12 × 32 × 128 × 128, which are merged with the up-sampled input before being passed to the up-sampling structure. Layer 8 is the offset output structure, yielding features of size 12 × 54 × 256 × 256. The fused feature maps produced by the ninth-layer space-time deformable convolution structure have size 12 × 64 × 256 × 256.
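The layer-8 channel count quoted in this embodiment can be verified arithmetically; for r = 1 (3 input frames) and a 3 × 3 deformable kernel, the formula (2r+1) × 2 × (deform_ks)² reproduces the 54 offset channels of the 12 × 54 × 256 × 256 tensor:

```python
def offset_channels(r: int, deform_ks: int) -> int:
    """Layer-8 offset output channels: (2r+1) x 2 x deform_ks^2 --
    one (dy, dx) pair per kernel sampling point per input frame."""
    return (2 * r + 1) * 2 * deform_ks ** 2

print(offset_channels(1, 3))  # 54
```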
The depth deformable convolution global attention module DGPA comprises a U-Net network, a deformable convolution attention module, three additive attention modules (Attention Gates), and an output layer. The deformable convolution attention module builds on the spatial attention module by passing its input through a deformable convolution layer and adding the computed deformable convolution output to the spatial attention output, which yields the final output of the deformable convolution attention module. A deformable convolution attention module is added to the first-layer skip connection of the U-Net network, and an additive attention module (Attention Gates) is added to each of the other three skip connections. The compensated high-dimensional fused feature maps of the images are input into the U-Net network; the number of input channels becomes 64 and the image sizes are unchanged.
Down-sampling is performed 4 times. Each down-sampling operation includes a convolution with a 3 × 3 kernel that doubles the number of channels, so the channel count becomes 128, 256, 512, and 1024 in turn; each convolution is followed by a ReLU activation to obtain nonlinear features. A max-pooling operation with a 2 × 2 pooling kernel halves the picture size, i.e. to 128 × 128, 64 × 64, 32 × 32, and 16 × 16 in turn.
Up-sampling is then performed 4 times. Each up-sampling operation includes a convolution with a 3 × 3 kernel that halves the number of channels, so the channel count becomes 1024, 512, 256, and 128 in turn; each convolution is followed by a ReLU activation to obtain nonlinear features. Linear interpolation doubles the picture size, i.e. to 32 × 32, 64 × 64, 128 × 128, and 256 × 256 in turn.
Meanwhile, the input also enters the DGPA's attention path to extract globally correlated pixel features, and the result is concatenated with the network output. The results of the first three down-sampling steps enter the additive attention network to suppress irrelevant background and are then concatenated with the results of the first three up-sampling steps.
In the depth deformable convolution global attention module DGPA, a deformable-convolution global attention network is adopted to extract globally correlated pixel features. The compensated high-dimensional MRI features I are input into a 3 × 3 deformable convolution kernel in the DGPA module to obtain an output O.
The compensated high-dimensional features I are also input into three 1 × 1 convolution kernels in the spatial attention module to generate new feature maps B, C, D ∈ R^{N×M}, where N is the number of channels of the feature map, M = H × W is the number of pixels, and H and W are the height and width of the feature map, respectively. B and the transposed C are matrix-multiplied, and the result is passed through the softmax formula to obtain the spatial attention map S ∈ R^{M×M}, each element of which is computed as:

s_ji = exp(B_i · C_j^T) / Σ_{i=1}^{M} exp(B_i · C_j^T)

where s_ji is the element in row i and column j of S, B_i is the i-th row of feature map B, and C_j^T is the j-th column of the transposed feature map C. The spatial attention map S is matrix-multiplied with the feature map D, and the result is added to the earlier deformable convolution output O to obtain the final result E:

E_j = α Σ_{i=1}^{M} (s_ji · D_i) + O_j

where α is a weight coefficient, D_i is the i-th row of feature map D, and O_j is the j-th column of output O.
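A toy numeric sketch of this spatial-attention computation on flattened feature maps; all names are illustrative, and a practical implementation would use batched tensor operations rather than nested lists:

```python
import math

def spatial_attention(B, C, D, O, alpha=1.0):
    """Minimal sketch of the DGPA spatial-attention branch.

    B, C, D: N x M feature maps (channels x pixels); O: output of the
    deformable convolution branch, also N x M.  The attention weight
    S[j][i] is the softmax-normalised affinity B_i . C_j, and the output
    is E[n][j] = alpha * sum_i S[j][i] * D[n][i] + O[n][j].
    """
    N, M = len(B), len(B[0])
    # affinity a[j][i] = B_i . C_j (dot product over channels)
    a = [[sum(B[n][i] * C[n][j] for n in range(N)) for i in range(M)]
         for j in range(M)]
    S = []
    for row in a:  # softmax over i for each target pixel j
        e = [math.exp(x) for x in row]
        z = sum(e)
        S.append([x / z for x in e])
    return [[alpha * sum(S[j][i] * D[n][i] for i in range(M)) + O[n][j]
             for j in range(M)] for n in range(N)]

# 1 channel, 2 pixels: identical affinities -> uniform attention averages D.
E = spatial_attention([[0.0, 0.0]], [[0.0, 0.0]], [[2.0, 4.0]], [[0.0, 0.0]])
print(E)  # [[3.0, 3.0]]
```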
In the deep deformable convolution global attention module DGPA, an additive attention module is used to suppress irrelevant background. G is the N × W × H feature map of the down-sampling stage and X is the N × W × H feature map of the up-sampling stage; each passes through a 1 × 1 convolution kernel that changes its dimension to F_int × W × H, where F_int is a preset dimension parameter of the U-Net network. The two resulting matrices are added point by point and passed through a ReLU activation layer; a 1 × 1 convolution kernel then changes the dimension of the result to 1 × W × H, and a sigmoid function yields the weight coefficient α. The input X is multiplied by the weight coefficient α to obtain the attention feature map. The output is thereby normalized to between 0 and 1, and the magnitude of each value represents the correlation between that point and the recognition result: the larger the value, the more likely the point contains the recognition target. Irrelevant areas are thus suppressed, improving the accuracy of image recognition.
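The additive attention gate can be illustrated in one dimension; the scalar weights wg, wx, and psi stand in for the 1 × 1 convolutions and are illustrative assumptions:

```python
import math

def attention_gate(g, x, wg, wx, psi):
    """Minimal 1-D sketch of the additive attention gate: the gating
    signal g (down-sampling path) and the skip feature x (up-sampling
    path) are projected (wg, wx stand in for the 1x1 convolutions),
    summed, passed through ReLU, projected to one channel (psi),
    squashed by a sigmoid, and used to scale x.
    """
    out = []
    for gi, xi in zip(g, x):
        relu = max(0.0, wg * gi + wx * xi)
        alpha = 1.0 / (1.0 + math.exp(-psi * relu))  # sigmoid weight in (0, 1)
        out.append(alpha * xi)
    return out

# Positions with small activations (background) are damped more strongly.
gated = attention_gate([0.0, 3.0], [1.0, 1.0], wg=1.0, wx=1.0, psi=1.0)
print(gated)
```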
In model training, a cross entropy function is adopted as a Loss function of a network, and a calculation formula of the cross entropy Loss _ seg is as follows:
wherein M represents the number of categories, ycIs a one-hot vector, pcIs the probability that the network model predicts belongs to sample c;
The weight parameter $\theta$ is updated by gradient descent with a standard Adam optimizer:

$$\theta_{k+1} = \theta_k - \eta \, \frac{\hat{m}_k}{\sqrt{\hat{v}_k} + \epsilon}$$

where $\eta$ is the learning rate, $\theta_k$ is the weight parameter at the $k$-th iteration, and $\hat{m}_k$, $\hat{v}_k$ are Adam's bias-corrected first- and second-moment estimates of the gradient.
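Since the patent specifies only "standard Adam", the following sketch uses the textbook Adam update with default hyper-parameters (β₁ = 0.9, β₂ = 0.999); these defaults are assumptions, not values from the patent:

```python
import numpy as np

def adam_step(theta, grad, m, v, k, eta=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One standard Adam update at iteration k (k >= 1): update the biased
    moment estimates, apply bias correction, then descend on theta."""
    m = b1 * m + (1 - b1) * grad            # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * grad ** 2       # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** k)               # bias correction
    v_hat = v / (1 - b2 ** k)
    theta = theta - eta * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```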
The image detection module is used to segment the 3D ventricle region: the probability heat map of each 3D ventricular MRI image in the test set is calculated with the trained network, and each probability heat map is segmented according to the segmentation probability obtained from the DGPA to yield the segmentation results, namely the left ventricle region, the myocardial region and the right ventricle region.
In a specific application of the system of this embodiment, as shown in fig. 5, the acquired 3D ventricular MRI dataset is first divided into a training set and a test set. The deep spatio-temporal deformable convolution fusion module TDAM is constructed with a U-Net network to obtain the offsets of the target frame within the 3D ventricular MRI video segment, and the obtained offsets are fused into the target frame by deformable convolution. The result is then input into the deep deformable convolution global attention module DGPA to extract globally correlated pixel features, while the additive attention network suppresses irrelevant portions, yielding the segmentation result map and thus an accurate segmentation of the patient's ventricles in the 3D ventricular MRI video. The Dice score of the whole video segmentation result is 90.1%; compared with existing segmentation algorithms based on deep neural networks, the left ventricle, myocardium and right ventricle regions segmented by this system agree better with visual evaluation while maintaining high accuracy and efficiency.
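The Dice score reported above measures the overlap between predicted and ground-truth masks; for reference, a minimal binary-mask version (the standard definition, not code from the patent):

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary segmentation masks;
    eps keeps the ratio defined when both masks are empty."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
```

For the three-class case (left ventricle, myocardium, right ventricle), the score is typically computed per class and averaged.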
The present invention is not limited to the above-described preferred embodiments. Any person may, following the teaching of the present invention, derive various other forms of deep-learning-based 3D ventricular nuclear magnetic resonance video segmentation systems, and all equivalent changes and modifications made within the scope of the present invention shall fall within the scope of the present invention.
Claims (9)
1. A 3D ventricle magnetic resonance video segmentation system based on deep learning, characterized by comprising a 3D ventricular magnetic resonance MRI video data preprocessing module, a Deformable convolution depth attention network Deformable U-Net, and an image detection module;
the 3D ventricular magnetic resonance MRI video data preprocessing module comprises a data enhancement module and a data division module:
the data enhancement module: splitting the existing 3D ventricular MRI video data set into MRI images of each frame, expanding the data set and carrying out normalization processing on the image size;
the data dividing module: dividing the enhanced image data into a training set and a testing set; the training set and the test set both comprise complete 3D ventricular MRI images, and the training set is used for training a Deformable convolution depth attention network Deformable U-Net;
the Deformable convolution depth attention network Deformable U-Net comprises a depth space-time Deformable convolution fusion module TDAM and a depth Deformable convolution global attention module DGPA:
the depth space-time deformable convolution fusion module TDAM: the module comprises a U-Net network and a deformable convolution layer; the TDAM inputs each frame image of the continuous 3D ventricular MRI video on a time axis into a U-Net network, outputs the image as a high-dimensional characteristic compensation area offset field of the image in the MRI video segment, transmits the high-dimensional characteristic compensation area offset field and the input image into a deformable convolution layer, and calculates to obtain high-dimensional characteristic fused feature maps of the compensated image, namely the fused feature containing the information of the front frame and the rear frame;
the deep deformable convolution global attention module DGPA: the module comprises a U-Net network, a deformable convolution attention module, three summation attention modules and an output layer, wherein the deformable convolution attention module passes through a deformable convolution layer on the basis of a spatial attention module, and adds the output of the deformable convolution layer obtained through calculation with the output of the spatial attention module to finally obtain the output of the deformable convolution attention module; adding a deformable convolution attention module in a first layer skip connection of the U-Net network, and adding a sum attention module in the other three layers of skip connections; inputting the compensated high-dimensional features of the image into a U-Net network, transmitting the high-dimensional features calculated by the U-Net network into an output layer to obtain an attention feature map, and then obtaining a segmentation probability by adopting a softmax regression function, namely the probability that a certain region in the MRI image belongs to the left ventricle, the myocardium or the right ventricle;
the image detection module is used to segment the 3D ventricle region: the probability heat map of each 3D ventricular MRI image in the test set is calculated with the trained network, and each probability heat map is segmented according to the segmentation probability obtained from the DGPA to obtain the segmentation results, namely the left ventricle region, the myocardial region and the right ventricle region.
2. The deep-learning-based 3D ventricular MRI video segmentation system according to claim 1, wherein the data enhancement module expands the data set by rotation, contrast adjustment and scaling during image processing; the 3D ventricular MRI video is divided along four directions x, y, z and t, where x, y and z represent the spatial coordinate system and t represents the time axis, and video frames of the x-y plane are selected.
3. The deep-learning-based 3D ventricular MRI video segmentation system according to claim 1, wherein before each frame of MRI image is input into the Deformable U-Net network, the r frames preceding and the r frames following the target frame along the time t axis are selected together with it as the Deformable U-Net network input, i.e. 2r + 1 frames of MRI images.
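The 2r + 1 frame selection can be sketched as an index computation (a hypothetical helper; the patent does not specify how the window is handled at the start and end of the sequence, so boundary clamping here is an assumption):

```python
def temporal_window(num_frames, t, r):
    """Indices of the 2r+1 frames centred on target frame t along the
    time axis, clamping at the sequence boundaries."""
    return [min(max(i, 0), num_frames - 1) for i in range(t - r, t + r + 1)]
```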
4. The deep-learning-based 3D ventricular MRI video segmentation system according to claim 1, wherein the deep spatio-temporal deformable convolution fusion module TDAM comprises a 9-layer structure: the first layer comprises a convolution layer and a ReLU function and converts the input channel number (2r + 1) × in_c to nf, where in_c is the number of input image channels and nf is the user-defined number of convolution layer output channels; layers 2 to 4 are down-sampling structures, each comprising two convolution layers and two ReLU functions; layers 5 to 6 are up-sampling structures, each comprising a convolution layer, a deconvolution layer and two ReLU functions; layer 7 is a skip-transfer structure that processes the features obtained by down-sampling and fuses them with the up-sampling result, comprising two convolution layers, a deconvolution layer and three ReLU functions; layer 8 is an offset output structure comprising two convolution layers and a ReLU function, the second convolution layer outputting (2r + 1) × 2 × (deform_ks)² channels, where deform_ks is the deformable convolution kernel size; the ninth layer comprises a deformable convolution and a ReLU function, taking the input image and the offsets as the layer input to obtain the high-dimensional-feature-fused feature maps of the image.
5. The deep learning-based 3D ventricular nuclear magnetic resonance video segmentation system according to claim 1, wherein in the depth spatiotemporal deformable convolution fusion module TDAM, the computation process of the convolution layer is as follows:
$$conv_{out} = \frac{conv_{in} + 2 \times padding - kernel\_size}{stride} + 1$$

where $conv_{out}$ is the output image size of the convolution layer, $conv_{in}$ is the input image size, $padding$ is the number of pixels filled around the image, $kernel\_size$ is the convolution kernel size, and $stride$ is the step size of the convolution kernel;

the 3 × 3 convolution kernel R is defined as R = {(−1, −1), (−1, 0), …, (0, 1), (1, 1)}, and the feature $y(p_0)$ of the convolution layer is:

$$y(p_0) = \sum_{p_n \in R} w(p_n) \cdot x(p_0 + p_n)$$

where $p_n$ is a position in R, $w(\cdot)$ is the weight, $x(\cdot)$ is the feature of the input image, and $p_0$ is the initial position;

in the deformable convolution layer, the convolution kernel R is augmented with offsets $\{\Delta p_n \mid n = 1, \ldots, N\}$, where $N = |R|$; thus the feature $y'(p_0)$ of the deformable convolution is:

$$y'(p_0) = \sum_{p_n \in R} w(p_n) \cdot x(p_0 + p_n + \Delta p_n)$$

where $\Delta p_n$ is the high-dimensional feature compensation region obtained by the U-Net network.
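A naive NumPy sketch of the deformable sampling in the formula above (single channel, single output point, bilinear interpolation at fractional positions; illustrative only, not the patented implementation):

```python
import numpy as np

def bilinear(x, py, px):
    """Bilinearly sample image x at the fractional location (py, px);
    samples falling outside the image contribute zero."""
    H, W = x.shape
    y0, x0 = int(np.floor(py)), int(np.floor(px))
    out = 0.0
    for dy in (0, 1):
        for dx in (0, 1):
            yy, xx = y0 + dy, x0 + dx
            if 0 <= yy < H and 0 <= xx < W:
                out += (1 - abs(py - yy)) * (1 - abs(px - xx)) * x[yy, xx]
    return out

def deformable_conv_point(x, w, p0, offsets):
    """y'(p0) = sum_n w(p_n) * x(p0 + p_n + delta_p_n) over the 3x3 grid R."""
    R = [(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)]
    y = 0.0
    for n, (pi, pj) in enumerate(R):
        dy, dx = offsets[n]               # learned offset delta_p_n
        y += w[pi + 1, pj + 1] * bilinear(x, p0[0] + pi + dy, p0[1] + pj + dx)
    return y
```

With all offsets zero the computation reduces to an ordinary 3 × 3 convolution at $p_0$, matching $y(p_0)$ in the claim.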
6. The deep-learning-based 3D ventricular MRI video segmentation system according to claim 1, wherein in the deep spatio-temporal deformable convolution fusion module TDAM, for each frame of 3D ventricular MRI image transmitted into the Deformable U-Net network, the fusion feature output by the TDAM is calculated as:

$$F(k) = \sum_{h} \sum_{s=1}^{S} c_h^s \cdot x_h(k + k_s)$$

where $F(k)$ is the resulting feature, $S$ is the convolution kernel size, $c_h^s$ is the kernel of the $h$-th channel, $x_h$ is the image of the $h$-th channel, $h_0$ is the current channel, $k$ is an arbitrary spatial position, and $k_s$ is a sampling offset of the deformable convolution; the TDAM provides an additional learnable offset $\delta$ such that

$$k_s \leftarrow k_s + \delta_{(h,k),s}$$

where $\delta_{(h,k)}$ is the offset learned for the $h$-th channel at spatial position $k$, and $\delta_{(h,k),s}$ is the $s$-th sampling offset of the learned offset.
7. The deep-learning-based 3D ventricular MRI video segmentation system according to claim 1, wherein the DGPA can extract globally correlated pixel features: the compensated high-dimensional features $I$ of the MRI image are input into a 3 × 3 deformable convolution kernel in the deformable convolution attention module to obtain an output $O$; the compensated high-dimensional features $I$ are also input into three 1 × 1 convolution kernels in the spatial attention module to generate new feature maps $B, C, D \in \mathbb{R}^{N \times M}$, where $N$ is the number of channels of the feature map, $M = H \times W$ is the number of pixels of the feature map, and $H$ and $W$ are the height and width of the feature map, respectively; $C$ is transposed and matrix-multiplied with $B$, and a spatial attention map $S \in \mathbb{R}^{M \times M}$ is obtained from the result by the softmax formula, each element of $S$ being calculated as:
$$s_{ji} = \frac{\exp(B_i \cdot C_j^T)}{\sum_{i=1}^{M} \exp(B_i \cdot C_j^T)}$$

where $s_{ji}$ is the element in the $i$-th row and $j$-th column of $S$, $B_i$ is the $i$-th row of the feature map $B$, and $C_j^T$ is the transposed $j$-th column of the feature map $C$; the spatial attention map $S$ is matrix-multiplied with the feature map $D$, and the result is added to the output $O$ of the preceding deformable convolution to obtain the final result $E$:

$$E_j = \alpha \sum_{i=1}^{M} (s_{ji} D_i) + O_j$$

where $\alpha$ is a weight coefficient, $D_i$ is the $i$-th row of the feature map $D$, and $O_j$ is the $j$-th column of the output $O$.
8. The deep-learning-based 3D ventricular nuclear magnetic resonance video segmentation system according to claim 7, wherein in the deep deformable convolution global attention module DGPA, a summation attention module is used to suppress irrelevant background: $G$ is a feature map of dimension N × W × H from the down-sampling stage and $X$ is a feature map of dimension N × W × H from the up-sampling stage; each passes through a 1 × 1 convolution kernel that changes its dimension to $F_{int}$ × W × H, where $F_{int}$ is a preset dimension parameter of the U-Net network; the two resulting matrices are added point by point and passed through a ReLU activation layer; the result is reduced to dimension 1 × W × H by a 1 × 1 convolution kernel and passed through a sigmoid function to obtain a weight coefficient $\alpha$; and the input $X$ is multiplied by the weight coefficient $\alpha$ to obtain the attention feature map.
9. The deep-learning-based 3D ventricular nuclear magnetic resonance video segmentation system according to claim 1, wherein the Deformable U-Net adopts a cross-entropy function as the loss function of the network during training, the cross entropy Loss_seg being calculated as:

$$Loss\_seg = -\sum_{c=1}^{M} y_c \log(p_c)$$

where $M$ is the number of classes, $y_c$ is the one-hot label vector, and $p_c$ is the probability predicted by the network model that the sample belongs to class $c$;
the weight parameter $\theta$ of the Deformable U-Net is updated by gradient descent with a standard Adam optimizer:

$$\theta_{k+1} = \theta_k - \eta \, \frac{\hat{m}_k}{\sqrt{\hat{v}_k} + \epsilon}$$

where $\eta$ is the learning rate, $\theta_k$ is the weight parameter at the $k$-th iteration, and $\hat{m}_k$, $\hat{v}_k$ are Adam's bias-corrected first- and second-moment estimates of the gradient.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010622947.1A CN111932550B (en) | 2020-07-01 | 2020-07-01 | 3D ventricle nuclear magnetic resonance video segmentation system based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111932550A CN111932550A (en) | 2020-11-13 |
CN111932550B true CN111932550B (en) | 2021-04-30 |
Family
ID=73316977
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010622947.1A Active CN111932550B (en) | 2020-07-01 | 2020-07-01 | 3D ventricle nuclear magnetic resonance video segmentation system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111932550B (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112330683B (en) * | 2020-11-16 | 2022-07-29 | 的卢技术有限公司 | Lineation parking space segmentation method based on multi-scale convolution feature fusion |
CN112733672B (en) * | 2020-12-31 | 2024-06-18 | 深圳一清创新科技有限公司 | Three-dimensional target detection method and device based on monocular camera and computer equipment |
CN112766195B (en) * | 2021-01-26 | 2022-03-29 | 西南交通大学 | Electrified railway bow net arcing visual detection method |
CN113436139A (en) * | 2021-05-10 | 2021-09-24 | 上海大学 | Small intestine nuclear magnetic resonance image identification and physiological information extraction system and method based on deep learning |
CN113283529B (en) * | 2021-06-08 | 2022-09-06 | 南通大学 | Neural network construction method for multi-modal image visibility detection |
CN114004847B (en) * | 2021-11-01 | 2023-06-16 | 中国科学技术大学 | Medical image segmentation method based on graph reversible neural network |
CN114155208B (en) * | 2021-11-15 | 2022-07-08 | 中国科学院深圳先进技术研究院 | Atrial fibrillation assessment method and device based on deep learning |
CN114359310B (en) * | 2022-01-13 | 2024-06-04 | 浙江大学 | 3D ventricular nuclear magnetic resonance video segmentation optimization system based on deep learning |
CN116612131B (en) * | 2023-05-22 | 2024-02-13 | 山东省人工智能研究院 | Cardiac MRI structure segmentation method based on ADC-UNet model |
CN116630628B (en) * | 2023-07-17 | 2023-10-03 | 四川大学 | Aortic valve calcification segmentation method, system, equipment and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109447008A (en) * | 2018-11-02 | 2019-03-08 | 中山大学 | Population analysis method based on attention mechanism and deformable convolutional neural networks |
CN109685813A (en) * | 2018-12-27 | 2019-04-26 | 江西理工大学 | A kind of U-shaped Segmentation Method of Retinal Blood Vessels of adaptive scale information |
CN110120033A (en) * | 2019-04-12 | 2019-08-13 | 天津大学 | Based on improved U-Net neural network three-dimensional brain tumor image partition method |
CN110163876A (en) * | 2019-05-24 | 2019-08-23 | 山东师范大学 | Left ventricle dividing method, system, equipment and medium based on multi-feature fusion |
CN111161273A (en) * | 2019-12-31 | 2020-05-15 | 电子科技大学 | Medical ultrasonic image segmentation method based on deep learning |
CN111275755A (en) * | 2020-04-28 | 2020-06-12 | 中国人民解放军总医院 | Mitral valve orifice area detection method, system and equipment based on artificial intelligence |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111260651B (en) * | 2018-11-30 | 2023-11-10 | 西安电子科技大学 | Stomach low-quality MRI image segmentation method based on deep migration learning |
CN111192245B (en) * | 2019-12-26 | 2023-04-07 | 河南工业大学 | Brain tumor segmentation network and method based on U-Net network |
Non-Patent Citations (4)
Title |
---|
Attention U-Net: Learning Where to Look for the Pancreas;Ozan Oktay等;《arXiv:1804.03999v3 [cs.CV]》;20180520;全文 * |
Deformable Convolutional Networks;Jifeng Dai等;《arXiv:1703.06211v3 [cs.CV]》;20170605;全文 * |
DUNet: A deformable network for retinal vessel segmentation;Qiangguo Jin等;《JOURNAL OF LATEX CLASS FILES》;20150831;第14卷(第8期);全文 * |
Research Status and Development of Left Ventricle Segmentation Methods Based on MRI Images; Zhou Qin et al.; Computer Engineering and Applications; 20191231; Vol. 55, No. 2; full text *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111932550B (en) | 3D ventricle nuclear magnetic resonance video segmentation system based on deep learning | |
CN111145170B (en) | Medical image segmentation method based on deep learning | |
CN110390351B (en) | Epileptic focus three-dimensional automatic positioning system based on deep learning | |
CN107492071B (en) | Medical image processing method and equipment | |
US11430140B2 (en) | Medical image generation, localizaton, registration system | |
US20200167929A1 (en) | Image processing method, image processing apparatus, and computer-program product | |
CN114359310B (en) | 3D ventricular nuclear magnetic resonance video segmentation optimization system based on deep learning | |
CN110930416A (en) | MRI image prostate segmentation method based on U-shaped network | |
CN110599528A (en) | Unsupervised three-dimensional medical image registration method and system based on neural network | |
CN111444896A (en) | Method for positioning human meridian key points through far infrared thermal imaging | |
CN111951288A (en) | Skin cancer lesion segmentation method based on deep learning | |
CN111161271A (en) | Ultrasonic image segmentation method | |
CN110648331A (en) | Detection method for medical image segmentation, medical image segmentation method and device | |
CN116258933A (en) | Medical image segmentation device based on global information perception | |
CN116258732A (en) | Esophageal cancer tumor target region segmentation method based on cross-modal feature fusion of PET/CT images | |
CN117392312A (en) | New view image generation method of monocular endoscope based on deformable nerve radiation field | |
CN114119635B (en) | Fatty liver CT image segmentation method based on cavity convolution | |
CN117523204A (en) | Liver tumor image segmentation method and device oriented to medical scene and readable storage medium | |
CN113269774A (en) | Parkinson disease classification and lesion region labeling method of MRI (magnetic resonance imaging) image | |
CN116758087A (en) | Lumbar vertebra CT bone window side recess gap detection method and device | |
CN116757982A (en) | Multi-mode medical image fusion method based on multi-scale codec | |
CN116309754A (en) | Brain medical image registration method and system based on local-global information collaboration | |
CN115424319A (en) | Strabismus recognition system based on deep learning | |
CN117274282B (en) | Medical image segmentation method, system and equipment based on knowledge distillation | |
Sarker et al. | MobileGAN: Skin Lesion Segmentation Using a Lightweight Generative Adversarial Network | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||