CN112215850A - Method for segmenting brain tumors with a cascaded dilated convolution network with an attention mechanism - Google Patents

Method for segmenting brain tumors with a cascaded dilated convolution network with an attention mechanism

Info

Publication number
CN112215850A
CN112215850A
Authority
CN
China
Prior art keywords
layer
convolution
decoder
segmentation
cascade
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010848879.0A
Other languages
Chinese (zh)
Inventor
褚晶辉 (Chu Jinghui)
黄凯隆 (Huang Kailong)
吕卫 (Lü Wei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN202010848879.0A
Publication of CN112215850A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30096Tumor; Lesion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a method for segmenting brain tumors with a cascaded dilated (atrous) convolution network with an attention mechanism, comprising the following steps: data preprocessing; building the network structure: a cascaded dilated convolution network with an attention mechanism is established, and a three-level cascade framework is adopted to simplify the multi-class segmentation task into three binary segmentation tasks; the three segmentation networks are W-Net, T-Net, and E-Net, used to segment the whole tumor (WT), tumor core (TC), and enhancing tumor (ET) regions, respectively. Each stage segments along the axial, sagittal, and coronal directions, and the segmentation results in the three directions are then averaged to obtain a more accurate result. Each level of the three-level cascade framework is an encoder-decoder fully convolutional network, divided into four parts: encoder, decoder, skip connections, and multi-layer feature map fusion.

Description

Method for segmenting brain tumors with a cascaded dilated convolution network with an attention mechanism
Technical Field
The invention relates to the field of image processing, in particular to a method for segmenting a three-dimensional medical brain tumor image.
Background
A brain tumor is an intracranial tumor with high lethality. According to histological heterogeneity and tumor aggressiveness, brain tumors are divided into high-grade gliomas (HGG) and low-grade gliomas (LGG), and the tumor itself is further divided into an edema region, a tumor core region, an enhancing tumor core region, a non-enhancing tumor core region, and a necrotic region. The four modalities of brain magnetic resonance (MR) imaging, T1, T1ce, T2, and FLAIR, highlight different tumor regions of interest and provide complementary information. Brain tumor segmentation delineates the different tumor regions in a brain image and is important for assessing the patient's disease, formulating a treatment plan, and follow-up observation and study. However, manual segmentation by doctors is time-consuming and labor-intensive, errors easily occur after long periods of manual labeling, and segmentation results differ between doctors with different levels of experience, so an automatic, high-accuracy brain tumor segmentation method is needed.
With the development of deep learning, deep-learning-based brain tumor segmentation has become the most accurate approach; the most common networks are FCN [1], U-Net [2], and V-Net [3]. The FCN removes the fully connected layers of a convolutional neural network to obtain a segmentation map of the same size as the input image. U-Net improves on the FCN with a symmetric encoder-decoder structure. V-Net changes the convolution, pooling, and upsampling layers of U-Net into their 3D variants and adds residual connections to alleviate network degradation. DeepLab [4] adds dilated (atrous) convolution to the fully convolutional network to enlarge the receptive field of the convolution kernel. Oktay et al. [5] proposed adding spatial attention to the 3D U-Net structure for image segmentation, applying attention derived from the decoder feature maps to the encoder feature-map path.
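As a concrete illustration of how dilation enlarges the receptive field without adding parameters (the principle DeepLab-style dilated convolution relies on), the following sketch, which is illustrative and not taken from the patent, computes a 1-D dilated convolution and the effective receptive field of a single layer:

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """'Valid' 1-D dilated convolution (cross-correlation, no kernel flip)."""
    k = len(kernel)
    span = (k - 1) * dilation + 1          # effective extent of the dilated kernel
    out = np.empty(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = sum(kernel[j] * x[i + j * dilation] for j in range(k))
    return out

def receptive_field(kernel_size, dilation):
    """Effective receptive field of one dilated convolution layer."""
    return (kernel_size - 1) * dilation + 1
```

A 3-tap kernel with dilation 3 covers 7 input samples with the same 3 weights, which is why the deeper residual blocks described below can widen their receptive field without extra down-sampling.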
Reference to the literature
[1] Shen H, Zhang J, Zheng W. Efficient symmetry-driven fully convolutional network for multimodal brain tumor segmentation. 2017 IEEE International Conference on Image Processing (ICIP), IEEE, 2017: 3864-3868.
[2] Ronneberger O, Fischer P, Brox T. U-Net: Convolutional networks for biomedical image segmentation. 2015, 9351: 234-241.
[3] Milletari F, Navab N, Ahmadi S A. V-Net: Fully convolutional neural networks for volumetric medical image segmentation. 2016 Fourth International Conference on 3D Vision (3DV), IEEE, 2016: 565-571.
[4] Chen L C, Papandreou G, Kokkinos I, Murphy K, Yuille A L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40: 834-848.
[5] Oktay O, Schlemper J, Folgoc L L, et al. Attention U-Net: Learning where to look for the pancreas. 2018.
[6] http://braintumorsegmentation.org/
[7] Isensee F, Kickingereder P, Wick W, Bendszus M, Maier-Hein K H. No new-net. International MICCAI Brainlesion Workshop, Springer, 2018: 234-244.
[8] Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015.
[9] Xu B, Wang N, Chen T, et al. Empirical evaluation of rectified activations in convolutional network. arXiv preprint arXiv:1505.00853, 2015.
[10] Noh H, Hong S, Han B. Learning deconvolution network for semantic segmentation. Proceedings of the IEEE International Conference on Computer Vision, 2015: 1520-1528.
Disclosure of Invention
The invention aims to provide a brain tumor segmentation method that improves segmentation precision. A three-level cascaded fully convolutional network segments the brain tumor magnetic resonance image into the whole tumor (WT), tumor core (TC), and enhancing tumor (ET) regions. The networks at each level are similar: starting from a fully convolutional network, an encoder-decoder structure is adopted, each 3D convolution kernel is factored into an intra-slice and an inter-slice convolution, dilated convolutions with different dilation rates are added, an attention mechanism is added, and multi-layer feature map fusion is added to the decoder, thereby improving segmentation precision. The technical scheme is as follows:
A method for segmenting brain tumors with a cascaded dilated convolution network with an attention mechanism comprises the following steps:
(1) data preprocessing:
Select 3D MR images, construct a training set and a validation set containing images of different brain tumor types, and preprocess them.
(2) The method for building the network structure comprises the following steps:
A cascaded dilated convolution network with an attention mechanism is established. A three-level cascade framework simplifies the multi-class segmentation task into three binary segmentation tasks, reducing the segmentation difficulty and the number of network parameters. The three segmentation networks are W-Net, T-Net, and E-Net, used to segment the whole tumor (WT), tumor core (TC), and enhancing tumor (ET) regions, respectively. Each stage segments along the axial, sagittal, and coronal directions, and the segmentation results in the three directions are then averaged to obtain a more accurate result;
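The nesting of the three binary tasks, where each later network only operates inside the region the previous one marked as tumor, can be sketched as follows (a hypothetical illustration; the label values here are illustrative and are not the BraTS label encoding):

```python
import numpy as np

def cascade_labels(wt, tc, et):
    """Combine three nested binary masks (WT contains TC contains ET) into one
    label map, mirroring the cascade: T-Net only sees voxels W-Net marked as
    tumor, and E-Net only sees voxels T-Net marked as core.
    Illustrative labels: 0 background, 1 whole tumor, 2 tumor core, 3 enhancing tumor."""
    tc = tc & wt                    # enforce nesting: core lies inside whole tumor
    et = et & tc                    # enhancing region lies inside tumor core
    labels = np.zeros(wt.shape, dtype=np.int32)
    labels[wt] = 1
    labels[tc] = 2
    labels[et] = 3
    return labels
```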
The network at each level of the three-level cascade framework is an encoder-decoder fully convolutional network, divided into four parts: encoder, decoder, skip connections, and multi-layer feature map fusion.
The encoder has a four-layer structure. The first layer contains four intra-slice convolution layers with kernel size 3×3×1; each intra-slice convolution layer is followed by a batch normalization (BN) layer and a PReLU layer for nonlinearity, forming an intra-slice convolution block, and every two intra-slice convolution blocks are joined by a residual connection to form a residual block. The second layer contains two residual blocks, a down-sampling layer, and an inter-slice convolution layer with kernel size 1×1×3, which together with its following BN layer and PReLU layer forms an inter-slice convolution block. The third layer has three residual blocks, an inter-slice convolution block, and a down-sampling layer; dilated convolutions with dilation rates 2 and 3 are added to the second and third residual blocks, respectively. The fourth layer contains a down-sampling layer, three residual blocks with dilation rate 3, and a deconvolution layer.
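The saving from factoring a full 3-D kernel into an intra-slice (3×3×1) plus an inter-slice (1×1×3) convolution can be quantified with a small parameter count; the channel width of 32 here is an assumption for illustration:

```python
def conv_params(c_in, c_out, kernel):
    """Number of weights in one convolution layer, ignoring bias terms."""
    kx, ky, kz = kernel
    return c_in * c_out * kx * ky * kz

c = 32                                                  # assumed channel width
full_3d = conv_params(c, c, (3, 3, 3))                  # one ordinary 3-D convolution
factored = conv_params(c, c, (3, 3, 1)) + conv_params(c, c, (1, 1, 3))
# The intra-slice + inter-slice pair needs 12/27 of the weights of the full 3-D kernel.
```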
The decoding path has a three-layer structure. The first layer has three residual blocks, an inter-slice convolution block, and an up-sampling layer; dilated convolutions with dilation rates 3 and 2 are added to the first and second residual blocks, respectively. The second layer has two residual blocks, an inter-slice convolution block, and a deconvolution layer. The third layer contains two residual blocks.
There are three skip connections. In the first, the feature map output by the first layer of the encoder, after four intra-slice convolution block operations, is given attention derived from the feature map output by the second layer of the decoder, and is concatenated with that decoder feature map as the input of the third layer of the decoder. In the second, the feature map output by the second layer of the encoder, after two intra-slice convolution block operations, is given attention derived from the feature map output by the first layer of the decoder, and is concatenated with that decoder feature map as the input of the second layer of the decoder. In the third, the feature map output by the third layer of the encoder is given attention derived from the feature map output by the fourth layer of the encoder, and is concatenated with that fourth-layer feature map as the input of the first layer of the decoder.
In the multi-layer feature map fusion, the fourth-layer output feature map of the encoder undergoes two inter-slice convolutions and two deconvolution operations, and the first-layer output feature map of the decoder undergoes one inter-slice convolution and one deconvolution operation; these are concatenated with the second-layer and third-layer output feature maps of the decoder, and the final output is obtained through a binary-segmentation convolution.
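The fusion of decoder feature maps at different resolutions can be sketched as below; the shapes are illustrative, and nearest-neighbour upsampling stands in for the deconvolution layers of the patent:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of an (H, W, C) feature map,
    a stand-in for one deconvolution in the fusion path."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def fuse_multilayer(f_low, f_mid, f_high):
    """Bring feature maps of three resolutions to the highest resolution
    and concatenate along channels (shapes are illustrative)."""
    f_low = upsample2x(upsample2x(f_low))   # two upsampling steps, like the two deconvolutions
    f_mid = upsample2x(f_mid)               # one step
    return np.concatenate([f_low, f_mid, f_high], axis=-1)
```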
(3) Model training and optimization.
The invention has the following beneficial effects:
1: the three-level cascade framework of the cascade cavity convolution network with the attention mechanism reduces a plurality of classes of segmentation tasks into three two classes of segmentation tasks to divide the subareas of the brain tumors, thereby reducing the segmentation difficulty and the complexity of each class of network, limiting the segmentation range of the next class of network on the segmentation result output by the previous class of network by the cascade structure, reducing the problems of misjudgment and unbalanced number among classes, and improving the segmentation precision; the segmentation is respectively carried out on three dimensions and the average is calculated, so that the segmentation result is more reliable;
2: the use of the intra-frame convolution and the inter-frame convolution in the cascade void convolution network with the attention mechanism fully utilizes the space information of three dimensions of the slice, and reduces the consumption of network parameters and video memory; the cavity convolution in the intra-frame convolution can increase the receptive field of a convolution kernel on the basis of not damaging the image resolution, can reduce the number of down-sampling and is beneficial to protecting the image resolution;
3: the cascade void convolutional network with the attention mechanism is added with a spatial attention module in a layer jump structure, so that in the process that the network fuses a feature map containing context information of an encoder to a feature map containing detail information of a decoder, the weight of the feature map containing the detail information of the decoder is multiplied to the feature map containing the context information of the encoder, the part containing the detail information is endowed with a larger weight, the attention is concentrated on the fine part of the feature map, and the precision of a segmentation result is improved;
4: the cascade hole convolution network with attention mechanism adds multilayer characteristic diagram fusion in the decoder to further integrate the context information and the detail information, wherein the characteristic diagram of a lower layer in the decoder contains more context information, the characteristic diagram of a higher layer contains more detail information, the multilayer characteristic diagram fusion is connected with the characteristic diagrams of different layers in the decoder, and the context characteristic and the detail characteristic are integrated together, so that the accuracy of the segmentation result is further improved.
Drawings
FIG. 1 is a schematic diagram of the cascaded dilated convolution network architecture with attention mechanism
FIG. 2 shows the structure of the spatial attention module
Detailed Description
First, the technical scheme for brain tumor segmentation with the cascaded dilated convolution network with attention mechanism is introduced; the steps are as follows:
(1) data preprocessing:
the invention uses the published BraTS 2018[6]The data set comprises 285 training sets and 66 verification sets, wherein the training setsThe method comprises the steps of including HGG 210 cases and LGG 75 cases, wherein each case has 3D MR images of T1, T1ce, T2 and FLAIR four modalities, and each size is 240 x 155; there were 66 validation sets, with no differentiation between tumor types. The slices are divided into 144 × 144 × 19 slices in the axial, sagittal, and coronal directions, respectively, as raw inputs.
(2) The method for building the network structure comprises the following steps:
The cascaded dilated convolution network with attention mechanism adopts a three-level cascade framework, simplifying the multi-class segmentation task into three binary segmentation tasks and reducing the segmentation difficulty and the number of network parameters. The three segmentation networks are W-Net, T-Net, and E-Net, used to segment the whole tumor (WT), tumor core (TC), and enhancing tumor (ET) regions, respectively. Each stage segments along the axial, sagittal, and coronal directions, and the segmentation results in the three directions are then averaged to obtain a more accurate result;
The network structures at each level of the three-level cascade framework are similar: each is an encoder-decoder fully convolutional network, divided into four parts: encoder, decoder, skip connections, and multi-layer feature map fusion.
the encoder has a four-layer structure, the first layer comprises four convolution kernels, each convolution kernel is a 3 × 3 × 1 intra convolution layer, and each intra convolution layer is followed by a batch normalization layer (BN)[8]And a PReLU for non-linearization[9]The layer, form the intra-frame and roll up the block, every two intra-frame roll up the block to connect and form the residual block through the residual; the second layer contains two residual blocks (four intra-frame convolution blocks), a down-sampling layer and a convolution kernel which are inter-frame convolution layers with the size of 1 multiplied by 3, and a BN layer and a PReLU layer which are arranged behind the inter-frame convolution layers form the inter-frame convolution blocks; the third layer has three residual blocks (six intra-frame convolution blocks), an inter-frame convolution block and a down-sampling layer; respectively adding the second residual block and the third residual block into a cavity convolution with expansion rates of 2 and 3; the fourth layer comprises a down-sampling layer, three residual blocks with expansion rate of 3 (six intra-frame convolution blocks) and a deconvolution layer[10]
The decoding path has a three-layer structure. The first layer has three residual blocks (six intra-slice convolution blocks), an inter-slice convolution block, and an up-sampling layer; dilated convolutions with dilation rates 3 and 2 are added to the first and second residual blocks, respectively. The second layer has two residual blocks (four intra-slice convolution blocks), an inter-slice convolution block, and a deconvolution layer. The third layer contains two residual blocks (four intra-slice convolution blocks).
There are three skip connections. In the first, the feature map output by the first layer of the encoder, after four intra-slice convolution block operations, is given attention (multiplied by weights) derived from the feature map output by the second layer of the decoder, and is concatenated with that decoder feature map as the input of the third layer of the decoder. In the second, the feature map output by the second layer of the encoder, after two intra-slice convolution block operations, is given attention (multiplied by weights) derived from the feature map output by the first layer of the decoder, and is concatenated with that decoder feature map as the input of the second layer of the decoder. In the third, the feature map output by the third layer of the encoder is given attention (multiplied by weights) derived from the feature map output by the fourth layer of the encoder, and is concatenated with that fourth-layer feature map as the input of the first layer of the decoder.
In the multi-layer feature map fusion, the fourth-layer output feature map of the encoder undergoes two inter-slice convolutions and two deconvolution operations, and the first-layer output feature map of the decoder undergoes one inter-slice convolution and one deconvolution operation; these are concatenated with the second-layer and third-layer output feature maps of the decoder, and the final output is obtained through a binary-segmentation convolution.
(3) Model training and optimization:
The 285 cases (each with four modalities) of 3D MR images of the BraTS 2018 training set are cut into blocks of size 144×144×19 and fed to the network. The initial learning rate is 10⁻⁴ and ADAM optimization is used; the weights are updated by continuous back-propagation, and the trained model is saved.
Selection of the learning rate: too small a learning rate leads to long training without convergence and wastes resources; too large a learning rate can cause the optimization to become trapped in a local minimum. Therefore a learning rate of 10⁻⁴ was chosen for this experiment.
Model optimization: ADAM optimization continuously back-propagates according to the loss function and updates the weights.
The embodiments will be described in further detail below with reference to the accompanying drawings:
first, a data set is prepared:
the invention uses the publication Brain Tumor Segmentation 2018(BraTS 2018)[6]) The data set is divided into a training set and a verification set, wherein the training set comprises 285 cases, including 210 cases of high glial tumors (HGG) and 75 cases of low glial tumors (LGG), the verification set comprises 66 cases, and tumor types are not distinguished; each 3D nuclear magnetic resonance image containing four modes of T1, T1ce, T2 and FLAIR has the size of 240 x 155, the 285 images of the training set are cut, the images are cut into pieces with the size of 144 x 19, the pieces are sent to a network for training a model, and the 66 images of the verification set are used as a test set test model;
and secondly, constructing a cascade cavity convolution network with an attention mechanism by using a deep learning frame Tensorflow, wherein the whole network frame is in three-stage cascade, the network is trained in three directions of an axial direction, a sagittal direction and a coronal direction in each stage, and the network structure of each stage is similar. As shown in FIG. 1, FIG. 1 shows a network (W-Net) structure for dividing the whole brain tumor;
(1) The axial blocks are fed into the network. The two residual blocks of the first layer of the encoder (four intra-slice convolution blocks, each containing an intra-slice convolution layer with kernel size 3×3×1, a BN layer, and a PReLU layer) produce a feature map with 32 channels and size 144×144×19. In the second layer of the encoder, two residual blocks (four intra-slice convolution blocks), an inter-slice convolution block (containing an inter-slice convolution layer with kernel size 1×1×3, a BN layer, and a PReLU layer), and a down-sampling layer produce a feature map with 32 channels and size 72×72×17. The third layer of the encoder, with three residual blocks (six intra-slice convolution blocks; the dilation rates of the three residual blocks are 1, 2, and 3, respectively), an inter-slice convolution block, and a down-sampling layer, produces a feature map with 32 channels and size 36×36×15. The fourth layer of the encoder, with a down-sampling layer, three residual blocks with dilation rate 3 (six intra-slice convolution blocks), and an up-sampling layer, produces a feature map with 32 channels and size 36×36×15. The skip connection gives the output of the third layer of the encoder attention (multiplied by weights) derived from the output feature map of the fourth layer of the encoder, concatenates it with that fourth-layer feature map, and passes the result into the first layer of the decoder. The first layer of the decoder, through three residual blocks (six intra-slice convolution blocks; the dilation rates of the three residual blocks are 3, 2, and 1, respectively), an inter-slice convolution block, and a deconvolution layer, produces a feature map with 32 channels and size 72×72×13. The skip connection gives the output of the second layer of the encoder, after two intra-slice convolution block operations, attention (multiplied by weights) derived from the output feature map of the first layer of the decoder, concatenates it with that decoder feature map, and passes the result into the second layer of the decoder. The second layer of the decoder, through two residual blocks (four intra-slice convolution blocks), an inter-slice convolution block, and a deconvolution layer, produces a feature map with 32 channels and size 144×144×11. The skip connection gives the output of the first layer of the encoder, after four intra-slice convolution block operations, attention (multiplied by weights) derived from the output feature map of the second layer of the decoder, concatenates it with that decoder feature map, and passes the result into the third layer of the decoder. The third layer of the decoder, with two residual blocks (four intra-slice convolution blocks), produces a feature map with 32 channels and size 144×144×11. The output feature map of the fourth layer of the encoder undergoes two inter-slice convolution and two deconvolution operations, the output feature map of the first layer of the decoder undergoes one inter-slice convolution and one deconvolution operation, these are concatenated with the output feature maps of the second and third layers of the decoder, and a binary-segmentation convolution yields a 2-channel output of size 144×144×11: the whole-tumor segmentation result in the axial direction;
The attention module is shown in FIG. 2. Feature maps A and B are each convolved with a 1×1×1 kernel with C output channels and added; the sum passes through a ReLU nonlinearity and another 1×1×1 convolution with one output channel, and the result, after a Sigmoid function, is multiplied with feature map B, giving feature map B its weights (attention);
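The attention module of FIG. 2 can be sketched with per-voxel matrix products standing in for the 1×1×1 convolutions (weight matrices `w_a`, `w_b`, `w_psi` and all shapes are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(feat_a, feat_b, w_a, w_b, w_psi):
    """Additive spatial attention as in FIG. 2: project A and B to C channels
    with 1x1x1 convolutions (here per-voxel matmuls), add, apply ReLU, project
    to 1 channel, apply Sigmoid, and scale feature map B by the weight map.
    feat_a, feat_b: (H, W, C_in); w_a, w_b: (C_in, C); w_psi: (C, 1)."""
    s = np.maximum(feat_a @ w_a + feat_b @ w_b, 0.0)   # add, then ReLU
    alpha = sigmoid(s @ w_psi)                          # (H, W, 1) attention map
    return feat_b * alpha                               # weighted feature map B
```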
(2) The output obtained in (1) is cropped: the whole tumor segmented in (1) is cropped and used as the input of the second level of the cascade network. The structure of the second-level network is similar to that in (1), and the procedure is the same as (1). Compared with the network in (1), one down-sampling operation is removed from the second layer of the encoder, one deconvolution layer is removed from the second layer of the decoder, and one deconvolution operation each is removed from the fourth layer of the encoder and the first layer of the decoder in the multi-layer feature map fusion. This yields the tumor core segmentation result in the axial direction;
(3) The output obtained in (2) is cropped: the tumor core segmented in (2) is cropped and used as the input of the third level of the cascade network. The structure of the third-level network is similar to the network in FIG. 1, and the procedure is the same as (1). Compared with the network in FIG. 1, one down-sampling operation is removed from each of the first and second layers of the encoder, one deconvolution layer is removed from each of the second and third layers of the decoder, and, in the multi-layer feature map fusion, two deconvolution operations are removed from the fourth layer of the encoder and one from the first layer of the decoder. This yields the enhancing-tumor segmentation result in the axial direction;
(4) The sagittal blocks are fed into the network, and the whole-tumor, tumor-core, and enhancing-tumor segmentation results in the sagittal direction are obtained through the three-level network segmentation; the procedure is the same as (1)-(3);
(5) The coronal blocks are fed into the network, and the whole-tumor, tumor-core, and enhancing-tumor segmentation results in the coronal direction are obtained through the three-level network segmentation; the procedure is the same as (1)-(3);
(6) The segmentation results in the axial, sagittal, and coronal directions are averaged to obtain the final segmentation result.
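The three-view averaging of step (6) can be sketched as follows; the thresholding step and its value are assumptions, since the text only states that the three results are averaged:

```python
import numpy as np

def fuse_views(p_axial, p_sagittal, p_coronal, threshold=0.5):
    """Average per-voxel tumor probabilities predicted from the three
    anatomical views, then binarise (the 0.5 threshold is an assumed choice)."""
    p = (p_axial + p_sagittal + p_coronal) / 3.0
    return (p > threshold).astype(np.uint8)
```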
Third, the network is trained: the 285 cases (each with four modalities) of 3D MR images of the BraTS 2018 training set are cut into blocks of size 144×144×19 and fed into the network; the initial learning rate is 10⁻⁴, ADAM optimization is used, the weights are updated by continuous back-propagation, and the trained model is saved. The loss functions selected in this embodiment are the Dice loss function and the cross-entropy loss function adopted in reference [7].
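A minimal sketch of the soft Dice loss named above (the smoothing constant `eps` is an assumed detail; the exact formulation used in the embodiment follows reference [7]):

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between a predicted probability map and a binary target.
    eps is a smoothing constant (assumed) that avoids division by zero."""
    pred, target = pred.ravel(), target.ravel()
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
```

A perfect prediction drives the loss to 0; fully disjoint prediction and target drive it toward 1.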
Fourthly, testing the model: the 66 cases (each containing four modalities) of the BraTS 2018 validation set are used as the test set, and the trained model segments it to obtain Dice scores and Hausdorff distances. Dice scores: WT 0.90462, TC 0.81727, ET 0.80091; Hausdorff distances: WT 4.81871, TC 8.75708, ET 2.98508.
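The two reported metrics can be computed as below (a brute-force NumPy sketch for small masks; note that BraTS in practice reports the 95th-percentile Hausdorff distance, which replaces the maximum with a percentile to reduce outlier sensitivity):

```python
import numpy as np

def dice_score(a, b):
    """Dice overlap between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    if not (a.any() or b.any()):
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two non-empty binary masks."""
    pa, pb = np.argwhere(a), np.argwhere(b)
    d = np.linalg.norm(pa[:, None, :].astype(float) - pb[None, :, :], axis=-1)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))

a = np.zeros((5, 5), dtype=int); a[1, 1] = 1
b = np.zeros((5, 5), dtype=int); b[1, 1] = 1; b[1, 2] = 1
```

For these toy masks the Dice score is 2·1/(1+2) = 2/3, and the farthest unmatched point of `b` lies one voxel away, giving a Hausdorff distance of 1.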
The invention has the following substantive characteristics and beneficial effects:
(1) A three-level cascade framework segments the three sub-regions of the brain tumor image separately, reducing the difficulty of the segmentation task at each stage. The encoder extracts contextual feature information through convolution and enlarges the receptive field of the convolution kernels through down-sampling layers and dilated (atrous, also rendered "cavity") convolutions; the decoder extracts detail information through convolution and restores the image resolution through deconvolution. Skip connections join feature maps of the same level in the encoder and decoder, fusing their feature information.
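The claim that down-sampling and dilated convolutions enlarge the receptive field follows from the standard 1-D recurrence: a dilated kernel acts like an effective kernel of d·(k−1)+1 taps, and each layer grows the receptive field by (k_eff − 1) times the cumulative stride. A generic sketch (layer counts are illustrative, not the patent's exact architecture):

```python
def receptive_field(layers):
    """1-D receptive field of a stack of (kernel, stride, dilation) layers."""
    rf, jump = 1, 1
    for k, s, d in layers:
        k_eff = d * (k - 1) + 1        # effective kernel of a dilated conv
        rf += (k_eff - 1) * jump       # growth scaled by cumulative stride
        jump *= s
    return rf

plain   = receptive_field([(3, 1, 1)] * 3)                    # three plain 3-tap convs
dilated = receptive_field([(3, 1, 1), (3, 1, 2), (3, 1, 3)])  # dilation rates 1, 2, 3
pooled  = receptive_field([(3, 1, 1), (2, 2, 1), (3, 1, 1)])  # conv, 2x pool, conv
```

Three plain 3-tap convolutions see 7 samples, while the same three layers with dilation rates 1, 2, 3 see 13 samples at no extra parameter cost; interposing a stride-2 pooling layer likewise widens the field of the layers above it.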
(2) Each 3D convolution kernel is factorized into an intra-slice kernel and an inter-slice kernel, which extract feature information within slices and between slices respectively, while reducing the parameter count and GPU-memory consumption of the network model; dilated convolutions with different dilation rates are added to the intra-slice convolutions to obtain kernels with different receptive fields.
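The parameter saving from the factorization is easy to quantify. Assuming, purely for illustration, 64 input and output channels, a 3 × 3 × 1 intra-slice kernel followed by a 1 × 1 × 3 inter-slice kernel, and ignoring biases:

```python
c = 64  # channel count, an illustrative choice

full_3d    = c * c * 3 * 3 * 3                       # one full 3x3x3 kernel
factorized = c * c * 3 * 3 * 1 + c * c * 1 * 1 * 3   # intra-slice + inter-slice
saving     = 1.0 - factorized / full_3d              # fraction of weights removed
```

Under these assumptions the full kernel needs 110,592 weights and the factorized pair 49,152, a saving of about 56% per convolution, which is what makes the per-stage networks affordable in video memory.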
(3) A spatial attention module is added to the skip connections: when encoder and decoder feature maps of the same stage are concatenated, a weight map derived from the decoder feature map multiplies the encoder feature map, focusing the convolutional network's attention on detail features.
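Per voxel, this gating reduces to multiplying the encoder feature map by a weight in (0, 1) derived from the decoder feature map. A toy NumPy sketch (the scalar projection `w`, `b` stands in for the patent's learned attention module):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(enc_feat, dec_feat, w=1.0, b=0.0):
    """Gate encoder features with per-voxel weights from decoder features."""
    weights = sigmoid(w * dec_feat + b)   # weight map in (0, 1)
    return enc_feat * weights

enc = np.ones((2, 2))                          # toy encoder feature map
dec = np.array([[10.0, -10.0], [0.0, 0.0]])    # toy decoder "relevance" logits
gated = spatial_attention(enc, dec)
```

Locations the decoder marks as relevant pass through almost unchanged, locations it marks as irrelevant are suppressed toward zero, and neutral locations are halved, so the subsequent concatenation emphasises the detail features the decoder cares about.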
(4) Multi-level feature-map fusion is added to the decoder: the feature map of each decoder layer is restored to the original resolution by deconvolution and then concatenated, fusing global and detail features and improving segmentation accuracy.
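The fusion step can be sketched by restoring each decoder output to full resolution and stacking the results channel-wise; here nearest-neighbour upsampling stands in for the learned deconvolution:

```python
import numpy as np

def upsample_nn(x, factor):
    """Nearest-neighbour 2-D upsampling, a stand-in for deconvolution."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def fuse_levels(feats):
    """Restore each (map, factor) pair to full resolution and stack them."""
    return np.stack([upsample_nn(f, s) for f, s in feats], axis=0)

deep    = np.full((2, 2), 0.5)   # coarse map carrying global context
shallow = np.ones((4, 4))        # full-resolution map carrying detail
fused = fuse_levels([(deep, 2), (shallow, 1)])
```

After fusion the final convolution sees both the upsampled global context and the full-resolution detail at every voxel, which is the stated source of the accuracy gain.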

Claims (1)

1. A method for segmenting brain tumors with a cascaded dilated ("cavity") convolution network with an attention mechanism, comprising the following steps:
(1) data preprocessing:
selecting 3D MR images, constructing a training set and a validation set containing images of different brain tumor types, and preprocessing them;
(2) building the network structure:
establishing a cascaded dilated convolution network with an attention mechanism: a three-level cascade framework is adopted, simplifying the multi-class segmentation task into three binary segmentation tasks, which reduces both the segmentation difficulty and the number of network parameters; the three levels of segmentation networks are W-Net, T-Net and E-Net, used respectively for segmenting the whole tumor (WT) region, the tumor core (TC) region and the enhancing tumor (ET) region; at each level, segmentation is performed from the axial, sagittal and coronal directions, and the results in the three directions are averaged to obtain a more accurate segmentation result;
each level of the three-level cascaded dilated convolution framework with an attention mechanism uses an encoder-decoder fully convolutional network structure, divided into four parts: an encoder, a decoder, skip connections, and multi-level feature-map fusion:
the encoder has a four-layer structure: the first layer contains four intra-slice convolution layers with 3 × 3 × 1 kernels; each intra-slice convolution layer is followed by a batch normalization (BN) layer and a PReLU layer for non-linearity, forming an intra-slice convolution block, and every two intra-slice convolution blocks are joined by a residual connection to form a residual block; the second layer contains two residual blocks, a down-sampling layer and an inter-slice convolution layer with a 1 × 1 × 3 kernel, the inter-slice convolution layer followed by a BN layer and a PReLU layer forming an inter-slice convolution block; the third layer has three residual blocks, an inter-slice convolution block and a down-sampling layer, with dilated convolutions of dilation rates 2 and 3 added to the second and third residual blocks respectively; the fourth layer comprises a down-sampling layer, three residual blocks with dilation rate 3, and a deconvolution layer;
the decoding path has a three-layer structure: the first layer has three residual blocks, an inter-slice convolution block and an up-sampling layer, with dilated convolutions of dilation rates 3 and 2 added to the first and second residual blocks respectively; the second layer has two residual blocks, an inter-slice convolution block and a deconvolution layer; the third layer contains two residual blocks;
there are three skip connections: in the first, the feature map output by the first encoder layer, after four intra-slice convolution block operations, receives attention from the feature map output by the second decoder layer, and is concatenated with that decoder feature map as the input of the third decoder layer; in the second, the feature map output by the second encoder layer, after two intra-slice convolution block operations, receives attention from the feature map output by the first decoder layer, and is concatenated with it as the input of the second decoder layer; in the third, the output feature map of the third encoder layer receives attention from the feature map output by the fourth encoder layer, and is concatenated with the fourth-layer output as the input of the first decoder layer;
in the multi-level feature-map fusion, the output feature map of the fourth encoder layer undergoes two inter-slice convolution and two deconvolution operations, and the output feature map of the first decoder layer undergoes one inter-slice convolution and one deconvolution operation; these are concatenated with the output feature maps of the second and third decoder layers, and the final output is then obtained through a final convolution that produces the binary segmentation;
(3) training and optimizing the model.
CN202010848879.0A 2020-08-21 2020-08-21 Method for segmenting brain tumor by using cascade void convolution network with attention mechanism Pending CN112215850A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010848879.0A CN112215850A (en) 2020-08-21 2020-08-21 Method for segmenting brain tumor by using cascade void convolution network with attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010848879.0A CN112215850A (en) 2020-08-21 2020-08-21 Method for segmenting brain tumor by using cascade void convolution network with attention mechanism

Publications (1)

Publication Number Publication Date
CN112215850A true CN112215850A (en) 2021-01-12

Family

ID=74058700

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010848879.0A Pending CN112215850A (en) 2020-08-21 2020-08-21 Method for segmenting brain tumor by using cascade void convolution network with attention mechanism

Country Status (1)

Country Link
CN (1) CN112215850A (en)


Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109754404A (en) * 2019-01-02 2019-05-14 清华大学深圳研究生院 A kind of lesion segmentation approach end to end based on more attention mechanism
CN109872306A (en) * 2019-01-28 2019-06-11 腾讯科技(深圳)有限公司 Medical image cutting method, device and storage medium
CN110059717A (en) * 2019-03-13 2019-07-26 山东大学 Convolutional neural networks automatic division method and system for breast molybdenum target data set
CN110189342A (en) * 2019-06-27 2019-08-30 中国科学技术大学 Glioma region automatic division method
CN110675379A (en) * 2019-09-23 2020-01-10 河南工业大学 U-shaped brain tumor segmentation network fusing cavity convolution
CN110992414A (en) * 2019-11-05 2020-04-10 天津大学 Indoor monocular scene depth estimation method based on convolutional neural network
CN111028242A (en) * 2019-11-27 2020-04-17 中国科学院深圳先进技术研究院 Automatic tumor segmentation system and method and electronic equipment
CN111046921A (en) * 2019-11-25 2020-04-21 天津大学 Brain tumor segmentation method based on U-Net network and multi-view fusion
CN111259906A (en) * 2020-01-17 2020-06-09 陕西师范大学 Method for generating and resisting remote sensing image target segmentation under condition containing multilevel channel attention
CN111340828A (en) * 2020-01-10 2020-06-26 南京航空航天大学 Brain glioma segmentation based on cascaded convolutional neural networks
CN111401480A (en) * 2020-04-27 2020-07-10 上海市同济医院 Novel breast MRI (magnetic resonance imaging) automatic auxiliary diagnosis method based on fusion attention mechanism


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI DAXIANG et al.: "Retinal vessel image segmentation algorithm based on improved U-Net", 《光学学报》 (Acta Optica Sinica) *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112767417A (en) * 2021-01-20 2021-05-07 合肥工业大学 Multi-modal image segmentation method based on cascaded U-Net network
CN112767417B (en) * 2021-01-20 2022-09-13 合肥工业大学 Multi-modal image segmentation method based on cascaded U-Net network
CN113658142A (en) * 2021-08-19 2021-11-16 江苏金马扬名信息技术股份有限公司 Hip joint femur near-end segmentation method based on improved U-Net neural network
CN113658142B (en) * 2021-08-19 2024-03-12 江苏金马扬名信息技术股份有限公司 Hip joint femur near-end segmentation method based on improved U-Net neural network
CN113888555A (en) * 2021-09-02 2022-01-04 山东师范大学 Multi-modal brain tumor image segmentation system based on attention mechanism
CN114187296A (en) * 2021-11-09 2022-03-15 元化智能科技(深圳)有限公司 Capsule endoscope image focus segmentation method, server and system
CN114187296B (en) * 2021-11-09 2022-12-13 元化智能科技(深圳)有限公司 Capsule endoscope image focus segmentation method, server and system
CN114170244A (en) * 2021-11-24 2022-03-11 北京航空航天大学 Brain glioma segmentation method based on cascade neural network structure
CN114170244B (en) * 2021-11-24 2024-05-28 北京航空航天大学 Brain glioma segmentation method based on cascade neural network structure
CN114565628A (en) * 2022-03-23 2022-05-31 中南大学 Image segmentation method and system based on boundary perception attention
CN114565628B (en) * 2022-03-23 2022-09-13 中南大学 Image segmentation method and system based on boundary perception attention

Similar Documents

Publication Publication Date Title
CN112215850A (en) Method for segmenting brain tumor by using cascade void convolution network with attention mechanism
CN110782462B (en) Semantic segmentation method based on double-flow feature fusion
CN111046921B (en) Brain tumor segmentation method based on U-Net network and multi-view fusion
He et al. H2Former: An efficient hierarchical hybrid transformer for medical image segmentation
CN108846473B (en) Light field depth estimation method based on direction and scale self-adaptive convolutional neural network
CN109584161A (en) The Remote sensed image super-resolution reconstruction method of convolutional neural networks based on channel attention
Zhang et al. Progressive hard-mining network for monocular depth estimation
CN112258526A (en) CT (computed tomography) kidney region cascade segmentation method based on dual attention mechanism
CN112785593B (en) Brain image segmentation method based on deep learning
CN112329780B (en) Depth image semantic segmentation method based on deep learning
CN113592026A (en) Binocular vision stereo matching method based on void volume and cascade cost volume
CN114266939B (en) Brain extraction method based on ResTLU-Net model
CN110472634A (en) Change detecting method based on multiple dimensioned depth characteristic difference converged network
CN112215291A (en) Method for extracting and classifying medical image features under cascade neural network
CN114387161B (en) Video super-resolution reconstruction method
KR20220139541A (en) A method and apparatus for image segmentation using global attention
CN117036380A (en) Brain tumor segmentation method based on cascade transducer
CN114821050A (en) Named image segmentation method based on transformer
CN113744284B (en) Brain tumor image region segmentation method and device, neural network and electronic equipment
CN113284079B (en) Multi-modal medical image fusion method
CN114332047A (en) Construction method and application of surface defect detection model
CN115995002B (en) Network construction method and urban scene real-time semantic segmentation method
CN111210416A (en) Anatomical structure prior-guided brain region-of-interest rapid segmentation method and system
CN116403212B (en) Method for identifying small particles in pixels of metallographic image based on improved U-net network
CN116579988A (en) Cerebral apoplexy focus segmentation method based on progressive fusion network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210112