CN114926477A - Brain tumor multi-modal MRI (magnetic resonance imaging) image segmentation method based on deep learning - Google Patents


Info

Publication number
CN114926477A
Authority
CN
China
Prior art keywords
brain tumor
network
convolution
layer
module
Prior art date
Legal status
Pending
Application number
CN202210526936.2A
Other languages
Chinese (zh)
Inventor
叶晨
唐鹏
陈嘉雷
付冲
Current Assignee
Northeastern University China
Original Assignee
Northeastern University China
Priority date
Filing date
Publication date
Application filed by Northeastern University China
Priority to CN202210526936.2A
Publication of CN114926477A
Legal status: Pending

Classifications

    • G06T 7/11 — Image analysis; Region-based segmentation
    • G06N 3/045 — Neural network architectures; Combinations of networks
    • G06V 10/774 — Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 10/806 — Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
    • G06V 10/82 — Image or video recognition or understanding using neural networks
    • G06T 2207/10088 — Image acquisition modality: Magnetic resonance imaging [MRI]
    • G06T 2207/20081 — Training; Learning
    • G06T 2207/20084 — Artificial neural networks [ANN]
    • G06T 2207/30016 — Subject of image: Brain
    • G06T 2207/30096 — Subject of image: Tumor; Lesion


Abstract

The invention discloses a brain tumor multi-modal MRI (magnetic resonance imaging) image segmentation method based on deep learning, which comprises the following steps. Step 1: preprocess a brain tumor multi-modal MRI image data set and divide it into a training set and a test set. Step 2: construct a cavity multi-scale phantom module. Step 3: construct a channel attention module. Step 4: establish a brain tumor MRI image segmentation deep convolutional neural network on the basis of the cavity multi-scale phantom module and the channel attention module. Step 5: train the deep convolutional neural network with the training set. Step 6: verify the usability of the saved model using the test set. Step 7: segment the lesion regions in a patient's multi-modal MRI images using the trained deep convolutional neural network. The method can accurately and quickly segment the whole brain tumor region in the image, can serve as a brain tumor lesion segmentation system, has a high network parameter utilization rate and low computational complexity, and is of practical significance.

Description

Brain tumor multi-mode MRI image segmentation method based on deep learning
Technical Field
The invention belongs to the technical field of medical image post-processing, and relates to a brain tumor multi-modal MRI image segmentation method based on deep learning.
Background
The brain is the highest-level part of the human nervous system and keeps the human body operating normally, but brain diseases usually carry a high disability rate, and brain diseases and tumors come in many types with high recurrence rates. Among them, brain glioma is a relatively common and highly harmful primary and secondary malignant brain tumor: it is invasive, carries a high risk of recurrence, can occur in people of all ages, and accounts for about 45%-50% of intracranial tumors.
Studies have shown that the central necrotic lesion of a glioma is surrounded by highly dense tumor cells, and a number of more active glioma cells are also present in the edematous region. As the cell density continues to expand, the tumor undergoes invasive growth. Defining the overall extent of the brain tumor lesion and segmenting it is therefore an essential step in the diagnostic process. Clinical practice shows that both hereditary and acquired factors may cause brain tumors, and people with a family medical history should undergo brain imaging examinations regularly. Magnetic resonance imaging (MRI) and its contrast-agent-enhanced modalities are sensitive to soft tissue, offer excellent image contrast, can clearly reflect the heterogeneity of tumors and anatomical properties such as their extent and position, and are the medical imaging modality of first choice for the diagnosis of brain diseases.
With the development and application of computer vision technology, computer-aided imaging analysis for physicians has become a mainstream research direction, and computerized segmentation of lesion regions in medical images is one of the most important and promising task areas. At present, most traditional segmentation algorithms are based on region growing, edge detection, SVMs, and other machine learning methods. However, owing to the lack of rich feature information, these methods suffer from blurred boundaries, poor results, sensitivity to noise, and weak robustness in their segmentation outputs. In addition, the traditional approach depends too heavily on manual feature extraction and design; when applied to multi-modal MRI brain tumors, its segmentation accuracy is low, its design process is cumbersome, and automation is difficult to achieve.
In recent years, deep learning has developed rapidly and hardware devices have advanced. Convolutional neural networks can help a computer capture image features better, automatically explore the internal relations among image data, and make the semantic representation of images more accurate. This removes the cumbersome step of manually designing feature extractors and provides a new idea for brain tumor segmentation. Researchers have therefore begun to explore brain tumor segmentation models based on convolutional neural networks, aiming to determine lesion regions with advanced algorithms. Although brain tumor segmentation algorithms based on convolutional neural networks have made great progress compared with traditional approaches, the following shortcomings remain:
(1) At present, most convolutional-neural-network-based brain tumor segmentation frameworks are transplanted from natural image tasks. The semantic information of medical images is comparatively simple, so these networks suffer from structural redundancy, huge parameter counts, and similar problems, and lack practicality.
(2) Existing medical segmentation networks are general-purpose networks that emphasize generalization across various medical images; they are not adjusted or optimized for practical image characteristics such as the heterogeneity of brain tumors, so their specificity is poor.
(3) Medical images suffer from serious class imbalance, and brain tumor MRI is no exception. Existing brain tumor segmentation networks are not designed well enough to mitigate this problem, so they fit smaller targets poorly, their segmentation accuracy is low, and they are not practical for clinical application.
Disclosure of Invention
In order to solve the above technical problems, the invention aims to provide a brain tumor multi-modal MRI image segmentation method based on deep learning, which greatly reduces the number of parameters required by the network while maintaining high brain tumor segmentation accuracy.
The brain tumor multi-modal MRI (magnetic resonance imaging) image segmentation method based on deep learning provided by the invention comprises the following steps:
step 1: acquiring an existing brain tumor multi-modal MRI image data set, preprocessing the brain tumor multi-modal MRI image data set, and dividing the brain tumor multi-modal MRI image data set into a training set and a testing set according to a proportion;
step 2: constructing a cavity multi-scale phantom module, which generates phantom features with linear convolution kernels to reduce the number of network parameters and sets different dilation (void) rates so that the module adapts to targets of different scales, improving the generalization of the module and enhancing the utilization of features;
step 3: constructing a channel attention module, and adjusting the utilization of different features by adaptively calculating an attention weight for each channel feature;
step 4: establishing a brain tumor MRI image segmentation deep convolutional neural network on the basis of the cavity multi-scale phantom module and the channel attention module, designing a U-shaped structure with cross-layer connections to fuse the information of the down-sampling and up-sampling paths, and using the channel attention module on the high-level semantic information layer to improve network performance;
step 5: training the deep convolutional neural network with the training set, and screening and saving the training models with good performance;
step 6: verifying the usability of the saved model using the test set;
step 7: segmenting the lesion regions in the patient's multi-modal MRI images using the deep convolutional neural network trained in step 5 and verified in step 6.
In the method for segmenting the brain tumor multi-modal MRI image based on the deep learning, the step 1 comprises the following steps:
step 1.1: the brain tumor multi-modal MRI image data set comes from the 2020 BraTS challenge data set, which provides MRI images of the four modalities T1, T2, T1ce, and FLAIR, together with sample label files, for 293 high-grade glioblastoma patients; the specific data are stored in NIFTI format; the NIFTI files of the MRI images of the four modalities are normalized using the z-score method, which is specifically defined as formula (1):
z = (x - μ) / σ    (1)
where x is the pixel matrix of the MRI image, μ and σ are the mean and standard deviation of all pixels, and z is the z-score result;
step 1.2: making a data set in a general picture format; reading the normalized NIFTI files with ITK-SNAP software, slicing the MRI images, screening out the brain MRI slices that contain brain tumor lesions, center-cropping the slices and labels to obtain data of size 160 × 160, combining the slice images of the T1 and FLAIR modalities to obtain an enhanced slice image T1-FLAIR, and storing it in a general image format; the combination is specifically defined as formula (2):
[Formula (2), defining t1f_ij in terms of t1_ij and f_ij, is provided as an image in the original filing and is not reproduced here.]
where t1f_ij denotes the value of the pixel in row i, column j of the enhanced slice image T1-FLAIR, t1_ij denotes the value of the pixel in row i, column j of T1, and f_ij denotes the value of the pixel in row i, column j of FLAIR;
splicing the slice images of T1-FLAIR, T2, T1ce, and FLAIR of the corresponding sequence along the channel dimension to obtain a four-channel slice combination as one sample;
step 1.3: making label pictures; binarizing each label into a label image containing only 1 and 0, where 1 denotes the whole lesion region and 0 denotes the background and other brain tissues; ensuring a one-to-one correspondence between the sequences of the original images and the labels, and storing them in a general image format, thereby completing the preprocessing of the brain tumor multi-modal MRI image data set;
step 1.4: the brain tumor multi-modal MRI image data set is divided into a training set and a testing set according to the ratio of 9: 1.
In the deep learning-based multi-modality brain tumor MRI image segmentation method of the present invention, the step 2 includes:
step 2.1: for the output feature map with I channels from the previous layer of the convolutional network, generating a feature map with O channels using a 3 × 3 standard convolution, and then applying batch normalization and a ReLU activation function once to obtain an internal feature map;
step 2.2: applying a 3 × 3 grouped convolution with a dilation (void) rate of 1 to the internal feature map to obtain a feature map with O/2 channels, and then applying batch normalization and a ReLU activation function once to obtain the first group of phantom feature maps;
step 2.3: applying a 3 × 3 grouped convolution with a dilation (void) rate of 2 to the internal feature map to obtain a feature map with O/2 channels, and then applying batch normalization and a ReLU activation function once to obtain the second group of phantom feature maps;
step 2.4: concatenating the internal feature map, the first group of phantom feature maps, and the second group of phantom feature maps along the channel dimension to obtain an output feature map with 2O channels as the output of the cavity multi-scale phantom module.
In the deep learning-based multi-modality brain tumor MRI image segmentation method of the present invention, the step 3 includes:
step 3.1: for an output feature map X of size I × W × H from a layer of the convolutional network, using I convolution kernels of size 1 × W × H, applied channel by channel, to obtain a one-dimensional spatial context description vector T of length I;
step 3.2: for the one-dimensional spatial context description vector T generated in step 3.1, further aggregating the channel features with a one-dimensional convolution to generate a feature aggregation vector P with I/16 elements;
step 3.3: restoring the feature aggregation vector P generated in step 3.2 to a channel attention map Q of length I using a one-dimensional convolution;
step 3.4: multiplying the channel attention map Q generated in step 3.3 with the module's input feature map X to obtain an attention-weighted output feature map of size I × W × H.
In the deep learning-based multi-modality brain tumor MRI image segmentation method of the present invention, the step 4 includes:
step 4.1: constructing the down-sampling network layers: first performing two convolution operations with a 3 × 3 kernel, stride 1, and padding 1, each with 64 output channels and leaving the feature map size unchanged, with a BN layer and a ReLU layer after each convolution; then alternating four times between a 2 × 2 max pooling layer and a cavity multi-scale phantom module, where each max pooling keeps the number of channels of the feature map and halves its width and height, and the feature maps output by the successive cavity multi-scale phantom modules have 128, 256, 512, and 512 channels; applying the channel attention module to the output feature map of the fourth cavity multi-scale phantom module, calculating the one-dimensional channel attention mapping vector, and multiplying it with that module's input feature map to obtain an enhanced feature map of unchanged size;
step 4.2: constructing the up-sampling network layers: alternating three times between up-sampling and a cavity multi-scale phantom module, where the up-sampling uses bilinear interpolation without trainable parameters, each up-sampling doubles the width and height of the feature map and keeps the number of channels, the feature map of the corresponding down-sampling layer is concatenated along the channel dimension before being fed into the cavity multi-scale phantom module, and the three cavity multi-scale phantom modules output 256, 128, and 64 channels respectively; after one further up-sampling the feature map is restored to the original image size with 64 channels, concatenated along the channel dimension with the corresponding feature map of the down-sampling network layer, and then passed through two 3 × 3 convolutions, each followed by a BN layer and a ReLU activation function, with 64 output channels each; the output result is obtained after a convolution layer with a 1 × 1 kernel;
step 4.3: adding cross-layer skip connections to the down-sampling and up-sampling structure, concatenating the output feature maps of corresponding size from the down-sampling and up-sampling layers along the channel dimension, so that the feature information extracted by the shallow and deep layers is fused and the loss of detail caused by down-sampling is avoided; at this point, the overall framework of the brain tumor MRI image segmentation deep convolutional neural network has been built.
In the deep learning-based multi-modality brain tumor MRI image segmentation method of the present invention, the step 5 includes:
step 5.1: setting a joint loss function, specifically a joint loss function of a Focal loss function for pixel imbalance mitigation and a region-based Dice loss function, which is specifically defined as formula (3):
Loss_WT = L_Dice + 0.5 × L_focal    (3)
wherein, the specific definition of the Focal loss function is as follows (4):
L_focal = -(1/N) Σ_n Σ_k α (1 - y_nk)^γ · t_nk · log(y_nk)    (4)
the specific definition of the Dice loss function is given by formula (5):
L_Dice = 1 - 2|X ∩ Y| / (|X| + |Y|)    (5)
in formula (4), N is the total number of samples, log is the natural logarithm, y_nk is the predicted output of the network for the nth sample, t_nk is the corresponding ground-truth label value, and α and γ are hyper-parameters used to coordinate and control the weights of the different classes; in formula (5), X and Y denote the set of predicted values and the set of true values (labels) of the network;
step 5.2: adjusting the hyper-parameters, the learning rate, the training rounds, the attenuation factors and the network optimizer in the training process;
step 5.3: sending the divided training set into a network, randomly extracting a data set with a certain proportion as a verification set for use, and starting to train the model;
step 5.4: and in the training process, recording the performance of each round model on the verification set, and screening and storing the optimal model.
In the deep learning-based multi-modality brain tumor MRI image segmentation method of the present invention, the step 6 includes:
step 6.1: selecting five indices — the Dice index, intersection over union, precision, sensitivity, and the Hausdorff distance — as the evaluation indices for the test set, and comparing and analyzing them against existing classical networks;
step 6.2: recording the network parameter quantity, and analyzing and comparing with the classical network;
step 6.3: and performing a segmentation test on the test set to verify the actual segmentation effect of the model.
For the three shortcomings of existing convolutional-neural-network-based brain tumor segmentation algorithms set forth in the background section, the deep-learning-based brain tumor multi-modal MRI image segmentation method of the invention has the following beneficial effects:
1. A lightweight module is used to simplify the brain tumor segmentation network. Drawing on the ideas of the phantom (ghost) module and dilated (cavity) convolution, a large number of grouped convolutions are used to reduce the number of parameters, and the receptive field of the network is enlarged without adding extra parameters; network parameters are reduced by reducing feature redundancy, which improves the practicality of the model.
2. To address the heterogeneity of brain tumors, which vary in shape and size and have no fixed position, the cavity multi-scale phantom module uses linear convolution kernels to generate phantom features and reduce the number of network parameters, and sets different dilation (void) rates to give the network receptive fields of different scales, so that the module adapts to targets of different scales. From the perspective of rich feature extraction, the module keeps the network's performance stable and close to ideal, refines edges, and improves generalization. In addition, the slice images of the T1 and FLAIR modalities are combined to obtain the enhanced image T1-FLAIR, which is richer in detail and higher in contrast than the existing modalities; meanwhile, the channel attention module lets the network use the features of different channels effectively, improving network performance.
3. A joint loss function is used so that different types of loss functions act on the network simultaneously during training; the hyper-parameters α and γ of the Focal loss function are set to alleviate the foreground-background imbalance in brain tumor MRI, and preprocessing steps such as screening out slices without lesions and cropping away useless background information at the edges also help to alleviate the class imbalance.
Drawings
Fig. 1 is a flowchart of a brain tumor multi-modal MRI image segmentation method based on deep learning according to the present invention;
FIG. 2 is a schematic diagram of the prepared brain tumor multi-modal MRI image data set;
FIG. 3 is a schematic structural diagram of the cavity multi-scale phantom module;
FIG. 4 is a schematic structural diagram of the channel attention module;
FIG. 5 is a schematic diagram of the overall structure of the brain tumor MRI image segmentation deep convolutional neural network of the present invention;
FIG. 6 is a diagram illustrating the relationship between predicted values and actual values;
FIG. 7 is a schematic diagram of the Hausdorff distance calculation;
FIG. 8 is a comparison of segmentation examples on the test set;
FIG. 9 is a comparison of the segmentation results of the algorithm of the present invention and the baseline network on the test set.
Detailed Description
As shown in fig. 1, the method for multi-modality MRI image segmentation of brain tumor based on deep learning of the present invention specifically includes:
Step 1: obtain an existing brain tumor multi-modal MRI image data set, preprocess it, and divide it proportionally into a training set and a test set; the specific steps are as follows:
step 1.1: the brain tumor multi-modal MRI image data set comes from the 2020 BraTS challenge data set, which provides MRI images of the four modalities T1, T2, T1ce, and FLAIR, together with sample label files, for 293 high-grade glioblastoma patients; the specific data are provided as sequence images in NIFTI (nii.gz) format; the NIFTI files of the MRI images of the four modalities are normalized using the z-score method, which is specifically defined as formula (1):
z = (x - μ) / σ    (1)
where x is the pixel matrix of the MRI image, μ and σ are the mean and standard deviation of all pixels, and z is the z-score result;
step 1.2: making a data set in a general picture format (png); reading the normalized NIFTI files with ITK-SNAP software, slicing the MRI images, screening out the brain MRI slices that contain brain tumor lesions, center-cropping the slice images and labels to obtain data of size 160 × 160, combining the slice images of the T1 and FLAIR modalities to obtain enhanced slice images T1-FLAIR, and storing them in a general image format (png); the combination is specifically defined as formula (2):
[Formula (2), defining t1f_ij in terms of t1_ij and f_ij, is provided as an image in the original filing and is not reproduced here.]
where t1f_ij denotes the value of the pixel in row i, column j of the enhanced slice image T1-FLAIR, t1_ij denotes the value of the pixel in row i, column j of T1, and f_ij denotes the value of the pixel in row i, column j of FLAIR;
splicing the slice images of T1-FLAIR, T2, T1ce, and FLAIR of the corresponding sequence along the channel dimension to obtain a four-channel slice combination as one sample;
step 1.3: making label pictures (png format); binarizing each label into a label image containing only 1 and 0, where 1 denotes the whole lesion region and 0 denotes the background and other brain tissues; ensuring a one-to-one correspondence between the sequences of the original images and the labels, and storing them in a general image format (png), thereby completing the preprocessing of the brain tumor multi-modal MRI image data set. The network uses the slice combination of T1-FLAIR, T2, T1ce, and FLAIR as the input for one sample; a sample and its corresponding label are shown in fig. 2, where Label is the label picture.
Step 1.4: the brain tumor multi-modal MRI image data set is divided into a training set and a testing set according to the ratio of 9: 1. The training set contained multi-modality MRI slice images of about 263 patients, and the test set contained multi-modality MRI slice images of about 30 other patients.
Step 2: construct a cavity multi-scale phantom module, which generates phantom features with linear convolution kernels to reduce the number of network parameters and sets different dilation (void) rates so that the module adapts to targets of different scales, improving its generalization and enhancing the utilization of features; the specific steps are as follows:
step 2.1: for the output feature map with I channels from the previous layer of the convolutional network, generating a feature map with O channels using a 3 × 3 standard convolution, and then applying batch normalization and a ReLU activation function once to obtain an internal feature map;
step 2.2: applying a 3 × 3 grouped convolution with a dilation (void) rate of 1 to the internal feature map to obtain a feature map with O/2 channels, and then applying batch normalization and a ReLU activation function once to obtain the first group of phantom feature maps;
step 2.3: applying a 3 × 3 grouped convolution with a dilation (void) rate of 2 to the internal feature map to obtain a feature map with O/2 channels, and then applying batch normalization and a ReLU activation function once to obtain the second group of phantom feature maps;
step 2.4: concatenating the internal feature map, the first group of phantom feature maps, and the second group of phantom feature maps along the channel dimension to obtain an output feature map with 2O channels as the output of the cavity multi-scale phantom module. The structure of the cavity multi-scale phantom module is shown in FIG. 3.
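A minimal PyTorch sketch of steps 2.1-2.4 follows: a 3 × 3 standard convolution produces O internal channels, two 3 × 3 grouped convolutions with dilation (void) rates 1 and 2 each produce O/2 phantom channels, and the three maps are concatenated into 2O output channels. The class name and the `groups` value are assumptions (the text does not state the number of groups).

```python
import torch
import torch.nn as nn

class DilatedMultiScaleGhostModule(nn.Module):
    """Sketch of the cavity multi-scale phantom module (steps 2.1-2.4)."""

    def __init__(self, in_ch, out_internal, groups=4):
        super().__init__()
        # Step 2.1: 3x3 standard convolution, I -> O channels, then BN + ReLU.
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, out_internal, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_internal), nn.ReLU(inplace=True))
        # Step 2.2: 3x3 grouped convolution, dilation (void rate) 1, O -> O/2 channels.
        self.ghost1 = nn.Sequential(
            nn.Conv2d(out_internal, out_internal // 2, 3, padding=1,
                      dilation=1, groups=groups, bias=False),
            nn.BatchNorm2d(out_internal // 2), nn.ReLU(inplace=True))
        # Step 2.3: 3x3 grouped convolution, dilation (void rate) 2, O -> O/2 channels.
        self.ghost2 = nn.Sequential(
            nn.Conv2d(out_internal, out_internal // 2, 3, padding=2,
                      dilation=2, groups=groups, bias=False),
            nn.BatchNorm2d(out_internal // 2), nn.ReLU(inplace=True))

    def forward(self, x):
        inner = self.primary(x)
        # Step 2.4: concatenate internal + both phantom feature maps -> 2O channels.
        return torch.cat([inner, self.ghost1(inner), self.ghost2(inner)], dim=1)
```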
Step 3: construct a channel attention module, which adaptively calculates an attention weight for each channel feature and adjusts the utilization of different features; the specific steps are as follows:
step 3.1: for an output feature map X of size I × W × H from a layer of the convolutional network, using I convolution kernels of size 1 × W × H, applied channel by channel, to obtain a one-dimensional spatial context description vector T of length I;
step 3.2: for the one-dimensional spatial context description vector T generated in step 3.1, further aggregating the channel features with a one-dimensional convolution to generate a feature aggregation vector P with I/16 elements;
step 3.3: restoring the feature aggregation vector P generated in step 3.2 to a channel attention map Q of length I using a one-dimensional convolution;
step 3.4: multiplying the channel attention map Q generated in step 3.3 with the module's input feature map X to obtain an attention-weighted output feature map of size I × W × H. The structure of the channel attention module is shown in fig. 4.
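A minimal PyTorch sketch of steps 3.1-3.4, under stated assumptions: the per-channel spatial aggregation of step 3.1 (I kernels of size 1 × W × H, applied channel by channel) is realized as a depthwise convolution whose kernel covers the whole spatial extent, the two "one-dimensional convolutions" of steps 3.2-3.3 are realized as fully connected layers on the channel vector, and a sigmoid is assumed to bound the attention map before the channel-wise multiplication (the text does not state the final activation).

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Sketch of the channel attention module (steps 3.1-3.4)."""

    def __init__(self, channels, height, width, reduction=16):
        super().__init__()
        # Step 3.1: one learned 1 x W x H kernel per channel -> vector T of length I.
        self.spatial_pool = nn.Conv2d(channels, channels, kernel_size=(height, width),
                                      groups=channels, bias=False)
        # Steps 3.2-3.3: aggregate the channels to I/16 elements, then restore length I.
        self.squeeze = nn.Linear(channels, channels // reduction)
        self.excite = nn.Linear(channels // reduction, channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):                       # x: (B, I, H, W)
        t = self.spatial_pool(x).flatten(1)     # (B, I)    context vector T
        p = self.act(self.squeeze(t))           # (B, I/16) aggregation vector P
        q = torch.sigmoid(self.excite(p))       # (B, I)    attention map Q (sigmoid assumed)
        # Step 3.4: weight the input feature map channel-wise.
        return x * q.unsqueeze(-1).unsqueeze(-1)
```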
Step 4: establish the brain tumor MRI image segmentation deep convolutional neural network on the basis of the cavity multi-scale phantom module and the channel attention module; it comprises a down-sampling network layer and an up-sampling network layer. A U-shaped structure with cross-layer connections is designed to fuse the information of the down-sampling and up-sampling paths, and the channel attention module is used on the high-level semantic information layer to improve network performance. The specific steps are as follows:
step 4.1: constructing the down-sampling network layers: first performing two convolution operations with a 3 × 3 kernel, stride 1, and padding 1, each with 64 output channels and leaving the feature map size unchanged, with a BN layer and a ReLU layer after each convolution; then alternating four times between a 2 × 2 max pooling layer and a cavity multi-scale phantom module, where each max pooling keeps the number of channels of the feature map and halves its width and height, and the feature maps output by the successive cavity multi-scale phantom modules have 128, 256, 512, and 512 channels; applying the channel attention module to the output feature map of the fourth cavity multi-scale phantom module, calculating the one-dimensional channel attention mapping vector, and multiplying it with that module's input feature map to obtain an enhanced feature map of unchanged size;
step 4.2: constructing the up-sampling network layers: alternating three times between up-sampling and a cavity multi-scale phantom module, where the up-sampling uses bilinear interpolation without trainable parameters, each up-sampling doubles the width and height of the feature map and keeps the number of channels, the feature map of the corresponding down-sampling layer is concatenated along the channel dimension before being fed into the cavity multi-scale phantom module, and the three cavity multi-scale phantom modules output 256, 128, and 64 channels respectively; after one further up-sampling the feature map is restored to the original image size with 64 channels, concatenated along the channel dimension with the corresponding feature map of the down-sampling network layer, and then passed through two 3 × 3 convolutions, each followed by a BN layer and a ReLU activation function, with 64 output channels each; the output result is obtained after a convolution layer with a 1 × 1 kernel;
step 4.3: adding cross-layer skip connections to the down-sampling and up-sampling structure, concatenating the output feature maps of corresponding size from the down-sampling and up-sampling layers along the channel dimension, so that the feature information extracted by the shallow and deep layers is fused and the loss of detail caused by down-sampling is avoided; at this point, the overall framework of the brain tumor MRI image segmentation deep convolutional neural network has been built. The overall structure of the network is shown in fig. 5, and the parameter details are listed in Table 1, where ( ) denotes parallel branches, { } denotes a cascade, and > denotes where the branch output feature maps are concatenated by channel; s denotes the stride, p the padding, d the dilation (void) rate, and g the number of groups (a value of 1 is not labelled); Bilinear denotes bilinear interpolation up-sampling; the BN layer and ReLU activation after each convolution are not listed separately, and the parameter counts in the rightmost column of the table exclude the BN-layer parameters.
Table 1:
[Table 1 is provided as an image in the original filing and is not reproduced here.]
where Conv is a convolutional layer, MaxPool is a max pooling layer, DMG Module is the cavity multi-scale phantom module, CAM is the channel attention module, UpSample is an up-sampling layer, and OutConv is the output layer.
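To make the assembly of steps 4.1-4.3 concrete, the following PyTorch sketch wires the two sketch modules above into the U-shaped network described: two initial 3 × 3 convolutions, four max-pool + cavity-multi-scale-phantom (DMG) stages in the down-sampling path, channel attention at the deepest layer, three bilinear-upsample + DMG stages plus one final upsample and two 3 × 3 convolutions in the up-sampling path, skip connections by channel concatenation, and a 1 × 1 output convolution. Channel counts follow the text; any detail the text does not fix (the exact skip-connection pairing, the assumed 160 × 160 input, the class and variable names) is an assumption of this sketch, and Table 1 of the filing remains authoritative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def double_conv(in_ch, out_ch):
    """Two 3x3 convolutions (stride 1, padding 1), each followed by BN + ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

class BrainTumorSegNet(nn.Module):
    """Sketch of the U-shaped segmentation network of steps 4.1-4.3."""

    def __init__(self, in_ch=4, n_classes=1, input_size=160):
        super().__init__()
        self.stem = double_conv(in_ch, 64)                       # 160x160, 64 ch
        self.pool = nn.MaxPool2d(2)
        # Down path: each DMG outputs 2x its internal width -> 128, 256, 512, 512 channels.
        self.down1 = DilatedMultiScaleGhostModule(64, 64)        # -> 128 ch, 80x80
        self.down2 = DilatedMultiScaleGhostModule(128, 128)      # -> 256 ch, 40x40
        self.down3 = DilatedMultiScaleGhostModule(256, 256)      # -> 512 ch, 20x20
        self.down4 = DilatedMultiScaleGhostModule(512, 256)      # -> 512 ch, 10x10
        self.cam = ChannelAttention(512, input_size // 16, input_size // 16)
        # Up path: bilinear upsampling (no trainable parameters) + skip concat + DMG.
        self.up1 = DilatedMultiScaleGhostModule(512 + 512, 128)  # -> 256 ch, 20x20
        self.up2 = DilatedMultiScaleGhostModule(256 + 256, 64)   # -> 128 ch, 40x40
        self.up3 = DilatedMultiScaleGhostModule(128 + 128, 32)   # -> 64 ch,  80x80
        self.head = double_conv(64 + 64, 64)                     # 160x160, 64 ch
        self.out = nn.Conv2d(64, n_classes, kernel_size=1)       # 1x1 output convolution

    @staticmethod
    def _up(x):
        return F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)

    def forward(self, x):
        s0 = self.stem(x)                                        # skip, 64 ch
        s1 = self.down1(self.pool(s0))                           # skip, 128 ch
        s2 = self.down2(self.pool(s1))                           # skip, 256 ch
        s3 = self.down3(self.pool(s2))                           # skip, 512 ch
        bottom = self.cam(self.down4(self.pool(s3)))             # 512 ch, attention-weighted
        u1 = self.up1(torch.cat([self._up(bottom), s3], dim=1))  # 256 ch
        u2 = self.up2(torch.cat([self._up(u1), s2], dim=1))      # 128 ch
        u3 = self.up3(torch.cat([self._up(u2), s1], dim=1))      # 64 ch
        u4 = self.head(torch.cat([self._up(u3), s0], dim=1))     # 64 ch
        return self.out(u4)                                      # lesion logits
```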
Step 5: train the deep convolutional neural network using the training set, and screen and save the training models with good performance; the specific steps are as follows:
step 5.1: setting a joint loss function, specifically a joint loss function of a Focal loss function for pixel imbalance mitigation and a region-based Dice loss function, which is specifically defined as formula (3):
Loss_WT = L_Dice + 0.5 × L_focal    (3)
wherein, the specific definition of the Focal loss function is as formula (4):
L_focal = -(1/N) Σ_n Σ_k α (1 - y_nk)^γ · t_nk · log(y_nk)    (4)
the specific definition of the Dice loss function is given by formula (5):
L_Dice = 1 - 2|X ∩ Y| / (|X| + |Y|)    (5)
in formula (4), N is the total number of samples, log is the natural logarithm, y_nk is the predicted output of the network for the nth sample, t_nk is the corresponding ground-truth label value, and α and γ are hyper-parameters used to coordinate and control the weights of the different classes; in formula (5), X and Y denote the set of predicted values and the set of true values (labels) of the network;
step 5.2: adjusting the hyper-parameters, the learning rate, the training rounds, the attenuation factors and the network optimizer in the training process;
step 5.3: sending the divided training set into a network, randomly extracting a data set with a certain proportion as a verification set for use, and starting to train the model;
step 5.4: and in the training process, recording the performance of each round model on the verification set, and screening and storing the optimal model.
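A minimal PyTorch sketch of the joint loss of formula (3) and the training/validation loop of steps 5.2-5.4, reusing the network sketch above. The loss terms are written as the standard binary Focal and Dice losses (a reconstruction, since formulas (4)-(5) appear only as images); α = 0.65 and γ = 2 follow the values reported later in the description, while the Adam optimizer, learning rate, epoch count, and checkpoint path are illustrative assumptions.

```python
import torch
import torch.nn as nn

class JointLoss(nn.Module):
    """Loss_WT = L_Dice + 0.5 * L_focal (formula (3)), written for binary lesion masks."""

    def __init__(self, alpha=0.65, gamma=2.0, eps=1e-6):
        super().__init__()
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def forward(self, logits, target):          # logits, target: (B, 1, H, W)
        p = torch.sigmoid(logits)
        # Focal loss (formula (4)), binary form with class weight alpha.
        focal = -(self.alpha * target * (1 - p) ** self.gamma * torch.log(p + self.eps)
                  + (1 - self.alpha) * (1 - target) * p ** self.gamma
                  * torch.log(1 - p + self.eps)).mean()
        # Dice loss (formula (5)): 1 - 2|X ∩ Y| / (|X| + |Y|).
        inter = (p * target).sum()
        dice = 1 - (2 * inter + self.eps) / (p.sum() + target.sum() + self.eps)
        return dice + 0.5 * focal

def train(model, train_loader, val_loader, epochs=100, lr=1e-3, device="cuda"):
    """Illustrative training/validation loop for steps 5.2-5.4."""
    model.to(device)
    criterion = JointLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)   # optimizer choice is an assumption
    best_dice = 0.0
    for epoch in range(epochs):
        model.train()
        for images, masks in train_loader:                    # masks: (B, 1, H, W) in {0, 1}
            images, masks = images.to(device), masks.to(device).float()
            optimizer.zero_grad()
            loss = criterion(model(images), masks)
            loss.backward()
            optimizer.step()
        # Validation Dice, used to screen and keep the best model (step 5.4).
        model.eval()
        scores = []
        with torch.no_grad():
            for images, masks in val_loader:
                pred = (torch.sigmoid(model(images.to(device))) > 0.5).float()
                masks = masks.to(device).float()
                scores.append((2 * (pred * masks).sum() /
                               (pred.sum() + masks.sum() + 1e-6)).item())
        val_dice = sum(scores) / max(len(scores), 1)
        if val_dice > best_dice:
            best_dice = val_dice
            torch.save(model.state_dict(), "best_model.pth")
```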
Step 6: verify the usability of the saved model using the test set; the specific steps are as follows:
step 6.1: selecting five indices — the Dice index, intersection over union (IoU), Precision, Sensitivity (Sensitive), and the Hausdorff distance (HD-95) — as the evaluation indices for the test set, and comparing and analyzing them against the baseline network U-Net and existing classical networks;
step 6.2: recording the network parameter number (Params), and performing analysis comparison with the classical network;
step 6.3: and performing a segmentation test on the test set to verify the actual segmentation effect of the model.
Step 7: segment the lesion regions in the patient's multi-modal MRI images using the deep convolutional neural network trained in step 5 and verified in step 6.
Specifically, the MRI volumes of the four modalities are each preprocessed and then sliced; the corresponding T1-FLAIR is calculated; the T1-FLAIR, T2, T1ce, and FLAIR slices at the same position are concatenated along the channel dimension to obtain four-channel slice combinations; the slice combinations are fed in order into the trained brain tumor MRI image segmentation deep convolutional neural network; after computation by the network, a binarized lesion segmentation result is output for each slice image; and the slice-wise segmentation results are stacked in order to obtain a three-dimensional segmentation result, completing the automatic marking and segmentation of the lesion region in the brain tumor MRI image.
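A short sketch of this slice-wise inference procedure, reusing the `center_crop` helper and the network sketch above; the 0.5 threshold used for binarization and the averaging stand-in for formula (2) are assumptions of this sketch.

```python
import numpy as np
import torch

def segment_volume(model, t1, t2, t1ce, flair, device="cuda"):
    """Slice-wise inference of step 7: returns a 3-D binary lesion mask."""
    model.eval().to(device)
    masks = []
    with torch.no_grad():
        for k in range(t1.shape[2]):
            t1_s, t2_s, t1ce_s, flair_s = [center_crop(m[:, :, k])
                                           for m in (t1, t2, t1ce, flair)]
            t1_flair = (t1_s + flair_s) / 2.0              # assumed stand-in for formula (2)
            x = torch.from_numpy(
                np.stack([t1_flair, t2_s, t1ce_s, flair_s], axis=0)[None]
            ).float().to(device)
            prob = torch.sigmoid(model(x))[0, 0].cpu().numpy()
            masks.append((prob > 0.5).astype(np.uint8))    # binarized slice result
    return np.stack(masks, axis=-1)                         # stacked 3-D segmentation
```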
Results and analysis of the experiments
1. Segmentation evaluation index, experimental result and analysis
(1) Index for evaluation of segmentation
Performance is evaluated using five indices: the Dice index, intersection over union (IoU), Precision, Sensitivity (Sensitive), and the Hausdorff distance (HD-95). Semantic segmentation is in fact a pixel-level classification task; as shown in fig. 6, the following four relationships generally exist between the set of predicted values and the set of true values (labels):
1) the True Positive (TP), representing the number of positive samples that the model judges as positive samples, i.e. the number of positive samples that are correctly judged.
2) False Positives (FP), which represent the number of negative samples that the model calls positive samples, i.e. the number of negative samples that are misjudged.
3) And a True Negative (TN) representing the number of negative samples judged by the model as negative samples, namely the number of negative samples judged correctly.
4) False Negatives (FN), representing the number of positive samples that the model judges as negative, i.e. the number of false positive samples.
The performance evaluation index obtains an objective numerical value to evaluate the model through the analysis and calculation of the four relations, so that a researcher can take the objective numerical value as a reference to improve the model in a targeted manner.
Precision indicates the proportion of correctly judged pixels (TP) among all pixels the model judges to be positive samples (TP and FP); it reflects the degree of false detection of the model, and a higher value means the model's positive-sample judgments are more accurate and its false detection rate is lower. It is specifically defined as formula (6):
Precision = TP / (TP + FP)    (6)
the Sensitivity (Sensitivity), also called Recall ratio (Recall), represents the proportion of positive samples (TP) successfully predicted by the model among all pixels with true positive samples (TP and FN), and reflects the degree of missing detection of the model, and the higher this number value is, the more comprehensive the judgment of the model on the positive samples is, the lower the missing detection rate is, which is defined as formula (7):
Sensitivity = TP / (TP + FN)    (7)
precision and Sensitivity reflect whether the model is sufficient to find all positive sample pixels and whether it is accurate enough. However, these two indexes have a trade-off relationship, if one index can reach the maximum, the other index may become poor in performance, and if the model reaches the ideal result, the balance point of the two indexes needs to be found, thereby leading to the concept of F1 score (F1 score), which is specifically defined as formula (8):
F1 = 2 × Precision × Sensitivity / (Precision + Sensitivity)    (8)
the F1 score is an index for measuring the performance of the binary classification model, and can be regarded as a harmonic mean value of the accuracy and sensitivity in consideration of the accuracy and sensitivity of the classification model. The Dice index is a set similarity measure, and is generally used for calculating the similarity between two sample sets, and is the same as the F1 score to some extent, and the specific relationship is as follows:
Dice = 2|X ∩ Y| / (|X| + |Y|) = 2TP / (2TP + FP + FN) = F1    (9)
an intersection over (IoU) is one of performance evaluation indexes commonly used in semantic segmentation and target detection, and simply means an overlapping repetition rate of a prediction value set and a real value set output by a model, and a higher numerical value indicates that a prediction result is closer to a real situation, which is specifically defined as formula (10):
IoU = |X ∩ Y| / |X ∪ Y| = TP / (TP + FP + FN)    (10)
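The four region-based indices of formulas (6)-(10) reduce to counts of TP, FP, and FN over binary masks; a small sketch, assuming NumPy arrays and a small epsilon to avoid division by zero:

```python
import numpy as np

def region_metrics(pred, truth):
    """Precision, sensitivity, Dice (= F1), and IoU from binary masks (formulas (6)-(10))."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    eps = 1e-8
    precision = tp / (tp + fp + eps)            # formula (6)
    sensitivity = tp / (tp + fn + eps)          # formula (7)
    dice = 2 * tp / (2 * tp + fp + fn + eps)    # formulas (8)/(9): Dice equals the F1 score
    iou = tp / (tp + fp + fn + eps)             # formula (10)
    return precision, sensitivity, dice, iou
```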
the performance evaluation indexes are all performance evaluation indexes based on regions and are sensitive to filling inside the divided regions, and the Hosdov distance is an evaluation index based on boundaries and is sensitive to the boundaries of the divided regions. The Hosdov distance describes the maximum minimum distance between two sets of points, which measures the maximum degree of mismatch between the two sets of points. As shown in fig. 7, the concept of the hoursoff distance is intuitively shown, several typical points a to G are respectively selected from the set of real values and the set of predicted values, wherein A, B, F, G is a point in the set of predicted values, C, D, E is a point in the set of real values, the hoursoff distance firstly respectively calculates the shortest euclidean distance from each point in the set of points to another set of points, and then selects the maximum value from the shortest distances. The shortest distance between points in the overlapping area of the two point sets, such as points E and F, is 0; non-coincident portions, the shortest distance must be greater than 0, such as A, B, C, D, G; the smaller the value of the shortest distance, the higher the coincidence ratio between the two point sets, i.e. the smaller the maximum value of the shortest distance, the better the segmentation effect of the model. However, a very small number of outlier subsets may have a large influence on the value of the hounsfield distance (e.g., G point), so that the evaluation result has a large deviation on the determination of the actual effect, and the final result of the hounsfield distance is multiplied by 95% to obtain HD-95, so as to eliminate the influence of the outlier subsets.
(2) Results of segmentation experiment
The experiment was performed on a test data set containing brain slices of approximately 30 patients, with the Dice index, intersection over union (IoU), Precision, Sensitivity (Sensitive), and Hausdorff distance (HD-95) as evaluation indices. Over multiple tests, the segmentation model of the invention reaches 87.9%, 80.1%, 86.5%, 91.3%, and 3.145 pixels on the five indices respectively; its performance is superior to other classical segmentation network models, and it requires only 3.70M parameters.
(3) Analysis of ablation Experimental results
TABLE 2 ablation experiment
[Table 2 is provided as an image in the original filing and is not reproduced here.]
Table 2 details the network metrics for different module and loss configurations. A "√" in the T1-F column indicates that T1 among the four input modalities is replaced by T1-FLAIR; a "√" in the CAM column indicates that the channel attention module is added at the highest-level semantic information layer; a "√" in the DMG column indicates that the two convolution operations at all levels of the network except the lowest-level semantic information layer are replaced by cavity multi-scale phantom modules; and a "√" in the joint-loss column indicates that the joint Dice-Focal loss function is used, otherwise the BCE loss function is used. According to the purpose and setting of the experiments, the 13 experiments in Table 2 can be divided into five groups:
1) The first group contains only experiment 1, which does not include any of the designed modules and uses the binary cross-entropy loss function, i.e., the baseline network U-Net.
2) The second group contains experiments 2-5, in which each of the four improvements acts on the baseline network U-Net individually while everything else is kept identical; the data in the table show that each improvement raises the performance of the model.
3) The third group contains experiments 6-8 and explores the mutual gain between pairs of modules. Compared with the first group, each pairwise combination gives higher Dice, IoU, and Sensitivity values and a shorter HD-95 than the modules used alone, but Precision drops somewhat while Sensitivity rises markedly: the missed detection rate falls while the false detection rate rises, because the model becomes more sensitive to positive samples, detects more of them, and at the same time produces more false-positive predictions.
4) The fourth group contains experiments 9-12, in which the four improvements are removed one at a time from the brain tumor MRI image segmentation deep convolutional neural network, or equivalently any three improvements are combined on top of the baseline network U-Net. Compared with the third group, the experimental results generally improve, and again Precision drops to some extent.
5) The fifth group contains only experiment 13, in which all four improvements act on the baseline network U-Net at the same time, i.e., the brain tumor MRI image segmentation deep convolutional neural network. Compared with the baseline network U-Net of the first group, the overall performance improves: Dice rises by about 4.2 percentage points, IoU by about 5.4 percentage points, and Sensitivity by about 5.9 percentage points, while Precision falls by only about 0.6 percentage points and HD-95 shortens by about 0.149 pixels.
The settings of the second, third, and fourth groups of experiments demonstrate the effect of each module and of the joint loss function, and the model performance improves under their complementary action. The trade-off between Sensitivity and Precision always exists: the brain tumor MRI image segmentation deep convolutional neural network proposed in this application loses about 0.6 percentage points of Precision but gains about 5.9 percentage points of Sensitivity. Considering the practical realities of medical imaging, the increase in Sensitivity (i.e., the decrease in the missed detection rate) matters more than an increase in Precision: missing a lesion is worse than misjudging healthy tissue as tumor, and since the results of computer-aided diagnosis are reviewed again by a physician, and regions suspected of being tumor can be examined by other means, both concerns can be addressed at the same time.
(4) Comparative analysis of different feature extraction modules
TABLE 3 influence of different feature extraction modules on the experimental results
[Table 3 is provided as an image in the original filing and is not reproduced here.]
The brain tumor MRI image segmentation deep convolutional neural network designed in the invention replaces part of the two successive 3 × 3 convolutions of the baseline network U-Net with the cavity multi-scale phantom module. The module improves on the phantom (ghost) idea: the cheap linear convolutions let the network extract richer and more diverse features and reduce feature redundancy, while dilated convolution provides receptive fields of different scales without adding parameters, improving network performance. Table 3 shows that the cavity multi-scale phantom module designed in the invention performs best on the five indices used, improves network performance, and is highly practical.
(5) Comparative analysis of different loss functions
TABLE 4 Effect of different loss functions on the results of the experiment
[Table 4 is provided as an image in the original filing and is not reproduced here.]
Table 4 shows the network performance of the segmentation method of the invention under different loss functions. The parameters α and γ of the Focal loss function are set to 0.65 and 2 respectively, the coefficient ratio of the BCE loss to the Dice loss in the BCE-Dice joint loss is 1:0.5, and the coefficient ratio of the Dice loss to the Focal loss in the Dice-Focal joint loss is 1:0.5. Under the Dice-Focal joint loss the segmentation method performs better than with any single loss function used alone: it is best on the Dice index and the intersection over union IoU, and second best on Sensitivity and HD-95, making it the optimal choice on comprehensive analysis.
(6) Analysis by contrast with other deep learning algorithms
Table 5 shows experimental comparison results of the segmentation method of the present invention with the classical segmentation method. From the overall experimental data, the algorithm of the invention has the highest Dice index, IoU index, Sensitive index and the shortest HD-95 among the compared algorithms, and has excellent segmentation performance.
Table 5 compares the results with the classical segmentation algorithm
[Table 5 is provided as an image in the original filing and is not reproduced here.]
Finally, in order to visually demonstrate the superiority of the method design of the present invention, fig. 8 shows a visual example diagram of the algorithm segmentation result of the present invention. The first column is a slice image of a T2 mode, the second column is a slice image of a FLAIR mode, the third column is a Label image, and the fourth column is an output result of the brain tumor MRI image segmentation depth convolution neural network.
FIG. 9 shows the superiority of the algorithm of the present invention compared with the baseline network U-Net, which is more sensitive to boundaries and small targets, better in segmentation effect, and stronger in practicability.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the scope of the present invention, which is defined by the appended claims.

Claims (7)

1. A brain tumor multi-modality MRI image segmentation method based on deep learning is characterized by comprising the following steps:
step 1: acquiring an existing brain tumor multi-modal MRI image data set, preprocessing the brain tumor multi-modal MRI image data set, and dividing the brain tumor multi-modal MRI image data set into a training set and a testing set according to a proportion;
step 2: constructing a cavity multi-scale phantom module, generating phantom characteristics by using a linear convolution kernel to reduce the number of network parameters, setting different cavity rates to enable the module to adapt to targets with different scales, improving the generalization of the module and enhancing the utilization rate of the characteristics;
and step 3: constructing a channel attention module, and adjusting the utilization rate of different characteristics by adaptively calculating the attention weight of each channel characteristic;
and 4, step 4: establishing a brain tumor MRI image segmentation depth convolution neural network on the basis of a cavity multi-scale phantom module and a channel attention module, fusing upper and lower sampling path information by designing a U-shaped structure through cross-layer connection, and improving the network performance by using the channel attention module on a high-level semantic information layer;
step 5: training the deep convolutional neural network with the training set, and screening and saving the training models with good performance;
step 6: verifying the usability of the saved model using the test set;
step 7: using the deep convolutional neural network validated in step 6 to segment the lesion region in the multi-modality MRI images of a patient.
2. The deep learning-based brain tumor multi-modality MRI image segmentation method according to claim 1, wherein the step 1 comprises:
step 1.1: the brain tumor multi-modality MRI image data set comes from the BraTS 2020 challenge data set, which provides MRI images of the four modalities T1, T2, T1ce and FLAIR for 293 high-grade glioblastoma patients together with sample label files; the specific data of the data set are stored in NIFTI format; the NIFTI files of the MRI images of the four modalities are normalized using the z-score method, which is specifically defined as formula (1):
z = (x - μ) / σ    (1)
wherein x is the pixel matrix of the MRI image, μ and σ are the mean and standard deviation of all pixels, and z is the z-score result;
step 1.2: making a data set in a general picture format; the normalized NIFTI files are read with ITK-SNAP software, the MRI volumes are sliced, brain MRI slices containing brain tumor lesions are screened out, and the slices and labels are center-cropped to a size of 160 × 160; the T1 and FLAIR slice images are then combined to obtain an enhanced slice image T1-FLAIR, which is stored in a general image format and is specifically defined as formula (2) (an illustrative preprocessing sketch is given after this claim):
[Formula (2), defining t1f_ij in terms of t1_ij and f_ij, is reproduced as an image in the original publication.]
wherein t1f_ij denotes the value of the pixel in row i and column j of the enhanced slice image T1-FLAIR, t1_ij denotes the value of the pixel in row i and column j of T1, and f_ij denotes the value of the pixel in row i and column j of FLAIR;
the T1-FLAIR, T2, T1ce and FLAIR slice images of corresponding sequences are concatenated along the channel dimension to obtain a four-channel slice combination as one sample;
step 1.3: making label pictures; the label is binarized into a label image containing only 1 and 0, where 1 represents the whole lesion region and 0 represents the background and other brain tissues; the one-to-one correspondence between the sequences of the original images and the labels is ensured, and both are stored in a general image format, completing the preprocessing of the brain tumor multi-modal MRI image data set;
step 1.4: the brain tumor multi-modal MRI image data set is divided into a training set and a test set in a ratio of 9:1.
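The following is a minimal sketch, not part of the claims, of how the preprocessing of claim 2 might be implemented with NumPy; NIFTI file reading via ITK-SNAP is omitted, and because formula (2) is reproduced only as an image, the element-wise mean used for the T1-FLAIR combination is purely an assumed stand-in.

```python
import numpy as np

def z_score_normalize(volume: np.ndarray) -> np.ndarray:
    """Z-score normalization of an MRI volume as in formula (1): z = (x - mu) / sigma."""
    mu, sigma = volume.mean(), volume.std()
    return (volume - mu) / (sigma + 1e-8)  # small epsilon guards against a zero std

def center_crop(slice_2d: np.ndarray, size: int = 160) -> np.ndarray:
    """Center-crop a 2-D slice (and, identically, its label) to size x size pixels."""
    h, w = slice_2d.shape
    top, left = (h - size) // 2, (w - size) // 2
    return slice_2d[top:top + size, left:left + size]

def enhance_t1_flair(t1: np.ndarray, flair: np.ndarray) -> np.ndarray:
    """Pixel-wise combination of T1 and FLAIR into T1-FLAIR.
    NOTE: formula (2) is shown only as an image in the original document;
    the element-wise mean used here is an assumed stand-in, not the claimed formula."""
    return (t1 + flair) / 2.0

def make_sample(t1, t2, t1ce, flair) -> np.ndarray:
    """Stack T1-FLAIR, T2, T1ce and FLAIR slices along the channel axis (4-channel sample)."""
    t1f = enhance_t1_flair(t1, flair)
    return np.stack([t1f, t2, t1ce, flair], axis=0)

if __name__ == "__main__":
    # Toy usage with random data standing in for real BraTS slices.
    fake = [z_score_normalize(np.random.rand(240, 240)) for _ in range(4)]
    cropped = [center_crop(s) for s in fake]
    print(make_sample(*cropped).shape)  # (4, 160, 160)
```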
3. The deep learning-based brain tumor multi-modality MRI image segmentation method according to claim 1, wherein the step 2 comprises:
step 2.1: a 3 × 3 standard convolution is applied to the I-channel output feature map of the previous layer of the convolutional network to generate an O-channel feature map, followed by one batch normalization and one ReLU activation, yielding the internal feature map;
step 2.2: a 3 × 3 grouped convolution with a dilation rate of 1 is applied to the internal feature map to obtain an O/2-channel feature map, followed by one batch normalization and one ReLU activation, yielding the first group of ghost feature maps;
step 2.3: a 3 × 3 grouped convolution with a dilation rate of 2 is applied to the internal feature map to obtain an O/2-channel feature map, followed by one batch normalization and one ReLU activation, yielding the second group of ghost feature maps;
step 2.4: the internal feature map, the first group of ghost feature maps and the second group of ghost feature maps are concatenated along the channel dimension to obtain an output feature map with 2O channels as the output of the dilated multi-scale ghost module (an illustrative sketch of this module follows the claim).
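A minimal PyTorch sketch of the dilated multi-scale ghost module of claim 3 follows; it is an illustration, not the claimed implementation. The grouping factor of the grouped convolutions is not specified in the claim, so groups = O/2 is an assumption, and the class name DilatedMultiScaleGhostModule is introduced here for illustration only.

```python
import torch
import torch.nn as nn

class DilatedMultiScaleGhostModule(nn.Module):
    """Sketch of the dilated multi-scale ghost module of claim 3.

    in_ch -> 3x3 standard conv -> O channels (internal features)
    internal -> 3x3 grouped conv, dilation 1 -> O/2 channels (first ghost set)
    internal -> 3x3 grouped conv, dilation 2 -> O/2 channels (second ghost set)
    output = concat(internal, ghost1, ghost2) -> 2*O channels
    """

    def __init__(self, in_ch: int, o_ch: int):
        super().__init__()
        assert o_ch % 2 == 0, "O must be even so each ghost branch outputs O/2 channels"
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, o_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(o_ch),
            nn.ReLU(inplace=True),
        )

        def ghost_branch(dilation: int) -> nn.Sequential:
            # groups = O/2 is an assumption; the claim does not state the grouping factor.
            return nn.Sequential(
                nn.Conv2d(o_ch, o_ch // 2, kernel_size=3, padding=dilation,
                          dilation=dilation, groups=o_ch // 2, bias=False),
                nn.BatchNorm2d(o_ch // 2),
                nn.ReLU(inplace=True),
            )

        self.ghost1 = ghost_branch(dilation=1)
        self.ghost2 = ghost_branch(dilation=2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        internal = self.primary(x)
        return torch.cat([internal, self.ghost1(internal), self.ghost2(internal)], dim=1)

if __name__ == "__main__":
    # Toy usage: 64-channel input -> 2*64 = 128-channel output, spatial size preserved.
    m = DilatedMultiScaleGhostModule(in_ch=64, o_ch=64)
    print(m(torch.randn(1, 64, 80, 80)).shape)  # torch.Size([1, 128, 80, 80])
```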
4. The deep learning-based brain tumor multi-modality MRI image segmentation method according to claim 1, wherein the step 3 comprises:
step 3.1: for the output feature map X of size I × W × H from the previous layer of the convolutional network, I convolution kernels of size 1 × W × H are applied channel by channel to obtain a one-dimensional spatial context description vector T of length I;
step 3.2: for the one-dimensional spatial context description vector T generated in step 3.1, the channel features are further aggregated using a one-dimensional convolution to generate a feature aggregation vector P with I/16 elements;
step 3.3: the feature aggregation vector P generated in step 3.2 is restored by a one-dimensional convolution to a channel attention map Q of length I;
step 3.4: the channel attention map Q generated in step 3.3 is multiplied with the module input feature map X to obtain an attention-weighted output feature map of size I × W × H (an illustrative sketch of this module follows the claim).
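Below is a hedged PyTorch sketch of the channel attention module of claim 4, introduced for illustration as the class ChannelAttention. The 1 × W × H per-channel kernels of step 3.1 are realized as a depthwise convolution covering the whole spatial extent; the sigmoid gating and the handling of the I/16 reduction for small I are assumptions, since the claim does not state them.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Sketch of the channel attention module of claim 4.

    Step 3.1: per-channel 1 x W x H kernels turn the I x W x H input into a
              context vector T of length I (depthwise conv over the full spatial extent).
    Step 3.2: a 1-D convolution squeezes T to I/16 elements (P).
    Step 3.3: a 1-D convolution restores P to length I (attention map Q).
    Step 3.4: Q re-weights the input channels.
    """

    def __init__(self, channels: int, height: int, width: int, reduction: int = 16):
        super().__init__()
        self.spatial_context = nn.Conv2d(channels, channels, kernel_size=(height, width),
                                         groups=channels, bias=False)   # -> (B, I, 1, 1)
        hidden = max(channels // reduction, 1)                           # I/16 elements
        self.squeeze = nn.Conv1d(channels, hidden, kernel_size=1)        # T -> P
        self.excite = nn.Conv1d(hidden, channels, kernel_size=1)         # P -> Q
        self.act = nn.ReLU(inplace=True)
        self.gate = nn.Sigmoid()  # gating nonlinearity is an assumption

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        t = self.spatial_context(x).view(b, c, 1)    # context vector T, length I
        q = self.excite(self.act(self.squeeze(t)))   # attention map Q, length I
        q = self.gate(q).view(b, c, 1, 1)
        return x * q                                  # attention-weighted feature map

if __name__ == "__main__":
    attn = ChannelAttention(channels=512, height=10, width=10)
    print(attn(torch.randn(2, 512, 10, 10)).shape)  # torch.Size([2, 512, 10, 10])
```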
5. The deep learning-based brain tumor multi-modality MRI image segmentation method according to claim 1, wherein the step 4 comprises:
step 4.1: constructing the down-sampling network layers: two convolution operations with a 3 × 3 kernel, stride 1 and padding 1 are performed, each with 64 output channels and unchanged feature map size, and each convolution is followed by a BN layer and a ReLU layer; the feature maps then alternately pass through a 2 × 2 max-pooling layer and a dilated multi-scale ghost module four times, where each max-pooling keeps the number of channels unchanged and halves the width and height, and the numbers of channels after the four ghost module operations are 128, 256, 512 and 512, respectively; the channel attention module is applied to the output feature map of the fourth dilated multi-scale ghost module, a one-dimensional channel attention mapping vector is computed and multiplied with that input feature map to obtain an enhanced feature map of unchanged size;
step 4.2: constructing the up-sampling network layers: up-sampling and the dilated multi-scale ghost module are applied alternately three times, where the up-sampling uses bilinear interpolation without trainable parameters, each up-sampling doubles the width and height of the feature map and keeps the number of channels unchanged, the feature map of the corresponding down-sampling layer is concatenated along the channel dimension and then fed into the ghost module, and the output channels of the three ghost modules are 256, 128 and 64, respectively; after a fourth up-sampling, the feature map is restored to the original image size with 64 channels, the feature map of the corresponding down-sampling layer is concatenated along the channel dimension, and two 3 × 3 convolutions are then performed, each followed by BN and ReLU computation, with 64 output channels each; the output result is obtained after a convolution layer with a 1 × 1 kernel;
step 4.3: adding cross-layer skip connections to the down-sampling and up-sampling structures: the output feature maps of the down-sampling and up-sampling layers of corresponding size are concatenated along the channel dimension, fusing the feature information extracted by shallow and deep network layers and avoiding the loss of detail caused by down-sampling; at this point, the overall framework of the brain tumor MRI image segmentation deep convolutional neural network is complete (an illustrative sketch of the assembled network follows the claim).
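The sketch below assembles the network of claim 5. It reuses the DilatedMultiScaleGhostModule and ChannelAttention sketches given after claims 3 and 4 (so it is not self-contained on its own) and assumes 160 × 160 four-channel inputs; it illustrates the channel bookkeeping only and is not the claimed implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# DilatedMultiScaleGhostModule and ChannelAttention refer to the sketches after claims 3 and 4.

def double_conv(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3 convolutions, each followed by BN and ReLU (steps 4.1 / 4.2)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )

class BrainTumorSegNet(nn.Module):
    """Sketch of the U-shaped segmentation network of claim 5 (160 x 160 inputs assumed)."""

    def __init__(self, in_ch: int = 4, num_classes: int = 1):
        super().__init__()
        self.stem = double_conv(in_ch, 64)
        self.pool = nn.MaxPool2d(2)
        # Encoder ghost modules; each outputs 2*O channels: 128, 256, 512, 512.
        self.enc1 = DilatedMultiScaleGhostModule(64, 64)     # -> 128
        self.enc2 = DilatedMultiScaleGhostModule(128, 128)   # -> 256
        self.enc3 = DilatedMultiScaleGhostModule(256, 256)   # -> 512
        self.enc4 = DilatedMultiScaleGhostModule(512, 256)   # -> 512
        self.attn = ChannelAttention(512, height=10, width=10)
        # Decoder ghost modules after skip concatenation; outputs 256, 128, 64.
        self.dec3 = DilatedMultiScaleGhostModule(1024, 128)  # -> 256
        self.dec2 = DilatedMultiScaleGhostModule(512, 64)    # -> 128
        self.dec1 = DilatedMultiScaleGhostModule(256, 32)    # -> 64
        self.final_conv = double_conv(128, 64)
        self.head = nn.Conv2d(64, num_classes, kernel_size=1)

    @staticmethod
    def up(x: torch.Tensor) -> torch.Tensor:
        """Parameter-free bilinear up-sampling that doubles width and height."""
        return F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)

    def forward(self, x):
        e0 = self.stem(x)                                  # 64  x 160 x 160
        e1 = self.enc1(self.pool(e0))                      # 128 x 80 x 80
        e2 = self.enc2(self.pool(e1))                      # 256 x 40 x 40
        e3 = self.enc3(self.pool(e2))                      # 512 x 20 x 20
        e4 = self.attn(self.enc4(self.pool(e3)))           # 512 x 10 x 10 (attention-weighted)
        d3 = self.dec3(torch.cat([self.up(e4), e3], 1))    # 256 x 20 x 20
        d2 = self.dec2(torch.cat([self.up(d3), e2], 1))    # 128 x 40 x 40
        d1 = self.dec1(torch.cat([self.up(d2), e1], 1))    # 64  x 80 x 80
        out = self.final_conv(torch.cat([self.up(d1), e0], 1))
        return self.head(out)                               # 1 x 160 x 160 logits

if __name__ == "__main__":
    net = BrainTumorSegNet()
    print(net(torch.randn(1, 4, 160, 160)).shape)  # torch.Size([1, 1, 160, 160])
```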
6. The deep learning-based brain tumor multi-modality MRI image segmentation method according to claim 1, wherein the step 5 comprises:
step 5.1: setting a joint loss function, specifically a combination of the Focal loss function, which mitigates pixel-class imbalance, and the region-based Dice loss function, specifically defined as formula (3) (an illustrative sketch follows the claim):
Loss_WT = L_Dice + 0.5 × L_Focal    (3)
wherein, the specific definition of the Focal loss function is as formula (4):
L_Focal = -(1/N) Σ_n Σ_k α (1 - y_nk)^γ t_nk log(y_nk)    (4)
the specific definition of the Dice loss function is given by formula (5):
L_Dice = 1 - 2|X ∩ Y| / (|X| + |Y|)    (5)
in formula (4), N is the total number of samples, log denotes the natural logarithm (base e), y_nk is the predicted output of the network for the nth sample, t_nk is the corresponding true label value, and α and γ are hyper-parameters used to coordinate and control the weights of the different classes; in formula (5), X and Y denote the set of predicted values and the set of true values (labels) of the network;
step 5.2: adjusting the hyper-parameters, learning rate, number of training epochs, decay factors and network optimizer used in the training process;
step 5.3: feeding the divided training set into the network, randomly extracting a certain proportion of it as a validation set, and starting to train the model;
step 5.4: during training, recording the performance of the model of each epoch on the validation set, and screening and saving the optimal model.
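A minimal PyTorch sketch of the joint loss of step 5.1 follows, assuming binary (lesion vs. background) segmentation with sigmoid outputs. The per-pixel form of the focal term is one common reading of formula (4) for the two-class case rather than the claimed definition; the weighting 0.5 and the values α = 0.65, γ = 2 follow the description.

```python
import torch
import torch.nn as nn

class FocalDiceLoss(nn.Module):
    """Sketch of the combined loss of formula (3): L = L_Dice + 0.5 * L_Focal (binary case)."""

    def __init__(self, alpha: float = 0.65, gamma: float = 2.0,
                 focal_weight: float = 0.5, smooth: float = 1e-6):
        super().__init__()
        self.alpha, self.gamma = alpha, gamma
        self.focal_weight = focal_weight
        self.smooth = smooth

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        prob = torch.sigmoid(logits)
        # Dice loss over predicted set X and label set Y: 1 - 2|X ∩ Y| / (|X| + |Y|)
        inter = (prob * target).sum()
        dice_loss = 1.0 - (2.0 * inter + self.smooth) / (prob.sum() + target.sum() + self.smooth)
        # Focal loss: alpha balances the classes, (1 - pt)^gamma down-weights easy pixels
        pt = torch.where(target > 0.5, prob, 1.0 - prob)   # probability of the true class
        at = torch.where(target > 0.5,
                         torch.full_like(prob, self.alpha),
                         torch.full_like(prob, 1.0 - self.alpha))
        focal_loss = (-at * (1.0 - pt).pow(self.gamma) * torch.log(pt.clamp_min(1e-8))).mean()
        return dice_loss + self.focal_weight * focal_loss

if __name__ == "__main__":
    criterion = FocalDiceLoss()
    logits = torch.randn(2, 1, 160, 160)
    labels = torch.randint(0, 2, (2, 1, 160, 160)).float()
    print(criterion(logits, labels).item())
```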
7. The deep learning-based brain tumor multi-modality MRI image segmentation method according to claim 1, wherein the step 6 comprises:
step 6.1: five indexes, namely the Dice coefficient, intersection over union (IoU), accuracy, sensitivity and the Hausdorff distance, are selected as evaluation indexes on the test set and are compared and analyzed against existing classical networks (an illustrative metric sketch follows the claim);
step 6.2: recording the number of network parameters and comparing it with classical networks;
step 6.3: performing a segmentation test on the test set to verify the actual segmentation effect of the model.
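The following sketch illustrates the overlap-based evaluation indexes of step 6.1 for binary masks; the Hausdorff distance (HD-95) is omitted and could in practice be computed with a library such as MedPy, and the function name segmentation_metrics is introduced here only for illustration.

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, label: np.ndarray, eps: float = 1e-8) -> dict:
    """Dice, IoU, accuracy and sensitivity for binary segmentation masks."""
    pred = pred.astype(bool)
    label = label.astype(bool)
    tp = np.logical_and(pred, label).sum()
    fp = np.logical_and(pred, ~label).sum()
    fn = np.logical_and(~pred, label).sum()
    tn = np.logical_and(~pred, ~label).sum()
    return {
        "dice": 2 * tp / (2 * tp + fp + fn + eps),
        "iou": tp / (tp + fp + fn + eps),
        "accuracy": (tp + tn) / (tp + tn + fp + fn + eps),
        "sensitivity": tp / (tp + fn + eps),
    }

if __name__ == "__main__":
    # Toy usage with random masks standing in for network output and ground truth.
    pred = np.random.rand(160, 160) > 0.5
    label = np.random.rand(160, 160) > 0.5
    print(segmentation_metrics(pred, label))
```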
CN202210526936.2A 2022-05-16 2022-05-16 Brain tumor multi-modal MRI (magnetic resonance imaging) image segmentation method based on deep learning Pending CN114926477A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210526936.2A CN114926477A (en) 2022-05-16 2022-05-16 Brain tumor multi-modal MRI (magnetic resonance imaging) image segmentation method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210526936.2A CN114926477A (en) 2022-05-16 2022-05-16 Brain tumor multi-modal MRI (magnetic resonance imaging) image segmentation method based on deep learning

Publications (1)

Publication Number Publication Date
CN114926477A true CN114926477A (en) 2022-08-19

Family

ID=82807774

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210526936.2A Pending CN114926477A (en) 2022-05-16 2022-05-16 Brain tumor multi-modal MRI (magnetic resonance imaging) image segmentation method based on deep learning

Country Status (1)

Country Link
CN (1) CN114926477A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115661096A (en) * 2022-11-02 2023-01-31 中国人民解放军海军军医大学第一附属医院 Image judgment method for length of invasion of lower section of esophagus cancer at gastroesophageal junction
CN115661096B (en) * 2022-11-02 2023-08-08 中国人民解放军海军军医大学第一附属医院 Image judging method for invasion length of lower esophageal segment of gastroesophageal junction cancer
WO2024103284A1 (en) * 2022-11-16 2024-05-23 中国科学院深圳先进技术研究院 Survival analysis method and system for brain tumor patient
CN116386043A (en) * 2023-03-27 2023-07-04 北京市神经外科研究所 Method and system for rapidly marking glioma area of brain nerve medical image
CN116543166A (en) * 2023-07-04 2023-08-04 北京科技大学 Early brain tumor segmentation method and system
CN116543166B (en) * 2023-07-04 2023-09-05 北京科技大学 Early brain tumor segmentation method and system
CN116912214A (en) * 2023-07-19 2023-10-20 首都医科大学宣武医院 Method, apparatus and storage medium for segmenting aneurysm detection image
CN116912214B (en) * 2023-07-19 2024-03-22 首都医科大学宣武医院 Method, apparatus and storage medium for segmenting aneurysm detection image
CN117409019A (en) * 2023-09-15 2024-01-16 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Multi-mode brain tumor image segmentation method and system based on ensemble learning

Similar Documents

Publication Publication Date Title
CN114926477A (en) Brain tumor multi-modal MRI (magnetic resonance imaging) image segmentation method based on deep learning
Liu et al. A framework of wound segmentation based on deep convolutional networks
CN109325942A (en) Eye fundus image Structural Techniques based on full convolutional neural networks
Nazir et al. Ecsu-net: an embedded clustering sliced u-net coupled with fusing strategy for efficient intervertebral disc segmentation and classification
Yang et al. DCU-Net: Multi-scale U-Net for brain tumor segmentation
CN113113130A (en) Tumor individualized diagnosis and treatment scheme recommendation method
US20220301168A1 (en) Comprehensive detection device and method for cancerous region
CN111210441A (en) Tumor prediction method and device, cloud platform and computer-readable storage medium
Hameurlaine et al. Survey of brain tumor segmentation techniques on magnetic resonance imaging
CN114119515A (en) Brain tumor detection method based on attention mechanism and MRI multi-mode fusion
CN112561869B (en) Pancreatic neuroendocrine tumor postoperative recurrence risk prediction method
CN112862756A (en) Method for identifying pathological change type and gene mutation in thyroid tumor pathological image
WO2020154562A1 (en) Method and system for automatic multiple lesion annotation of medical images
CN114038564A (en) Noninvasive risk prediction method for diabetes
CN116524248B (en) Medical data processing device, method and classification model training device
CN114821176B (en) Viral encephalitis classification system for MR (magnetic resonance) images of children brain
URAL Computer aided deep learning based assessment of stroke from brain radiological ct images
Chen et al. Ischemic stroke subtyping method combining convolutional neural network and radiomics
Rajive Gandhi et al. A contrast adaptive histogram equalization with neural learning quantization (CAHE-NLQ) for blood clot detection in brain
CN113160256B (en) MR image placenta segmentation method for multitasking countermeasure model
LillyMaheepa et al. A technical survey on brain tumor segmentation using CNN
CN114723937A (en) Method and system for classifying blood vessel surrounding gaps based on nuclear magnetic resonance image
CN114359194A (en) Multi-mode stroke infarct area image processing method based on improved U-Net network
Wu et al. Mscan: Multi-scale channel attention for fundus retinal vessel segmentation
CN115619810B (en) Prostate partition segmentation method, system and equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination