CN114119637A - Brain white matter high signal segmentation method based on multi-scale fusion and split attention - Google Patents

Brain white matter high signal segmentation method based on multi-scale fusion and split attention Download PDF

Info

Publication number
CN114119637A
CN114119637A (application CN202111429055.0A)
Authority
CN
China
Prior art keywords
attention
white matter
high signal
training
matter high
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111429055.0A
Other languages
Chinese (zh)
Other versions
CN114119637B (en)
Inventor
赵欣
张银平
苗延巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University
Original Assignee
Dalian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University filed Critical Dalian University
Priority to CN202111429055.0A priority Critical patent/CN114119637B/en
Publication of CN114119637A publication Critical patent/CN114119637A/en
Application granted granted Critical
Publication of CN114119637B publication Critical patent/CN114119637B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/0002: Inspection of images, e.g. flaw detection
    • G06T 7/0012: Biomedical image inspection
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00: Measuring for diagnostic purposes; Identification of persons
    • A61B 5/05: Detecting, measuring or recording for diagnosis by means of electric currents or magnetic fields; Measuring using microwaves or radio waves
    • A61B 5/055: Detecting, measuring or recording for diagnosis by means of electric currents or magnetic fields; Measuring using microwaves or radio waves involving electronic [EMR] or nuclear [NMR] magnetic resonance, e.g. magnetic resonance imaging
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00: Measuring for diagnostic purposes; Identification of persons
    • A61B 5/72: Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B 5/7235: Details of waveform analysis
    • A61B 5/7264: Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B 5/7267: Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/25: Fusion techniques
    • G06F 18/253: Fusion techniques of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/10: Segmentation; Edge detection
    • G06T 7/11: Region-based segmentation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30004: Biomedical image processing
    • G06T 2207/30016: Brain
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Pathology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Veterinary Medicine (AREA)
  • General Engineering & Computer Science (AREA)
  • Public Health (AREA)
  • Animal Behavior & Ethology (AREA)
  • Radiology & Medical Imaging (AREA)
  • Surgery (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Quality & Reliability (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • High Energy & Nuclear Physics (AREA)
  • Fuzzy Systems (AREA)
  • Physiology (AREA)
  • Psychiatry (AREA)
  • Signal Processing (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)
  • Image Processing (AREA)

Abstract

A brain white matter high signal segmentation method based on multi-scale fusion and split attention, belonging to the technical field of brain magnetic resonance image processing. The technical points are as follows: acquiring a white matter high signal FLAIR image data set, then dividing and preprocessing it; constructing and training the white matter high signal segmentation model based on multi-scale fusion and split attention, obtaining the trained model once training meets a termination condition; and inputting each image in the test set into the trained model for testing. Advantages: the brain white matter high signal segmentation method based on multi-scale fusion and split attention can effectively improve WMH segmentation accuracy and, in particular, identifies tiny lesions well. The invention also provides a reference for the segmentation of other medical images and has positive significance for promoting computer-aided diagnosis of brain diseases.

Description

Brain white matter high signal segmentation method based on multi-scale fusion and split attention
Technical Field
The invention relates to the technical field of brain magnetic resonance image processing, in particular to a 3D U-Net fully convolutional model adopting multi-scale convolution and split attention, used to solve the automatic segmentation problem of white matter high signal (white matter hyperintensity, WMH).
Background
White matter high signal (white matter hyperintensity, WMH) refers to the localized high-brightness areas that appear on T2-weighted and FLAIR magnetic resonance images, also known as white matter lesions. WMH is commonly found in brain magnetic resonance imaging (MRI) of patients with neurodegenerative diseases (e.g., dementia, Alzheimer's disease), stroke and cerebral small vessel disease, as well as in the brains of healthy elderly people over 70. Studies have shown that the size, location, number and shape of WMH provide valuable information for exploring the etiology and development of brain diseases and for evaluating therapeutic effects; accurately segmenting WMH regions on MRI and analyzing them quantitatively and qualitatively is therefore of great importance in clinical diagnosis.
However, relying on physicians to manually segment WMH regions is time-consuming, because manual delineation must be performed frame by frame (each brain MRI volume typically contains tens or even hundreds of slices), which is laborious for the physician. Moreover, because manual segmentation depends on the physician's subjective judgment, omissions are unavoidable, and differences in experience between physicians lead to inconsistent diagnostic results. If a machine could automatically segment the white matter lesion areas in place of the physician, physicians would be relieved of the heavy segmentation workload, and the objectivity and accuracy of segmentation and diagnosis would be better ensured. In recent years, researchers have therefore proposed many automatic WMH segmentation methods, mainly comprising early methods based on conventional machine learning and current methods based on deep learning. Deep learning methods outperform traditional methods in segmentation because they can autonomously learn the complex features hidden in images. However, when solving the WMH segmentation problem, most existing deep learning methods directly adopt the classical fully convolutional U-Net model from the image segmentation field or only make simple improvements to it, such as merely adding skip connections or applying attention indiscriminately in the bottleneck layer or in all layers. Because the characteristics of WMH are not fully considered in the model design, the feature extraction capability of such models is insufficient, which affects the segmentation results.
WMH is variable in shape, random in location and uneven in signal: some WMHs are sheet-like while others are punctate; some occur around the ventricles while others occur deep in the white matter; some show a distinctly bright signal while others are similar in brightness to the surrounding tissue. All of this makes automatic identification of WMH difficult. Therefore, an accurate and effective automatic segmentation technology designed around the characteristics of WMH is urgently needed to solve the problems of missed small lesions, inaccurate boundaries and low accuracy in current WMH segmentation.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a white matter high signal fully convolutional segmentation method based on multi-scale fusion and split attention, which improves the accuracy of automatic WMH segmentation by enhancing the multi-scale feature extraction capability of the model and its attention to the segmentation target; it can segment white matter high signal regions well and, in particular, can segment many tiny lesions.
The technical scheme is as follows:
a brain white matter high signal segmentation method based on multi-scale fusion and split attention comprises the following steps:
step one, acquiring a white matter high signal FLAIR image data set;
step two, dividing the acquired white matter high signal FLAIR image data set into a training set, a verification set and a test set;
step three, preprocessing the acquired white matter high signal FLAIR image data set;
step four, constructing and training the brain white matter high signal segmentation model based on multi-scale fusion and split attention;
s4.1, constructing the brain white matter high signal segmentation model based on multi-scale fusion and split attention;
The brain white matter high signal segmentation model based on multi-scale fusion and split attention is built on the 3D U-Net framework and comprises an encoding part and a decoding part. The encoding part consists of three feature extraction submodules with the same structure and a bottleneck layer; each feature extraction submodule comprises 2 multi-scale convolution modules, 1 attention module and 1 hybrid downsampling module; the bottleneck layer comprises, in sequence, 1 multi-scale convolution module, 1 attention module and 1 multi-scale convolution module. The decoding part consists of three decoding submodules with the same structure and 1 pixel classification layer, where each decoding submodule comprises 1 deconvolution and 2 convolutions with residual connection. In addition, the model contains three skip connections, each of which sends the output of the attention module in a feature extraction submodule to be spliced with the deconvolution result in the corresponding decoding submodule. The split attention is embodied in that the attention modules in the 1st and 2nd coding layers are spatial attention modules, while the attention modules in the 3rd coding layer and the bottleneck layer are channel attention modules.
S4.2, inputting the preprocessed image data of the training and verification sets into the constructed model for training. The training error is calculated with a loss function. During training, the training images are fed to the model for multiple rounds; the result of each round is verified on the verification set; when training meets the termination condition, the optimal weight parameters of the model on the training set are obtained and training stops, yielding the trained segmentation model.
step five, inputting each image in the test set into the trained brain white matter high signal segmentation model based on multi-scale fusion and split attention for testing;
S5.1, testing the trained model on the test set and displaying the results;
S5.2, evaluating the test results.
Further, in step three, the preprocessing comprises: data enhancement, unified image size, and data normalization.
Further, in step 4.2, the termination condition of the training process is: training stops when the loss function value on the verification set no longer decreases within n iteration cycles, or when the upper limit on the number of iterations is reached. The learning rate, the optimization method and the number of iterations must be set before model training.
Further, in step 4.1, the multi-scale convolution module includes 3 convolution branches and 1 residual branch. The 3 convolution branches respectively adopt 16×n convolutions of size 1×1×1, 16×n convolutions of size 3×3×3, and two consecutive 16×n convolutions of size 3×3×3, where n is the index of the coding layer (including the bottleneck layer) and n ∈ {1,2,3,4}. The outputs of the 3 convolution branches are added voxel by voxel, then added to the output of the residual branch, and finally passed through a ReLU activation function.
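For illustration, a minimal PyTorch sketch of such a multi-scale convolution module is given below; the padding choices and the 1×1×1 projection on the residual branch when the channel counts differ are assumptions of this sketch, not details taken from the patent.

import torch
import torch.nn as nn

class MultiScaleConv3D(nn.Module):
    # Three parallel branches (1x1x1, 3x3x3, and two stacked 3x3x3 convolutions,
    # i.e. an effective 5x5x5 receptive field) plus a residual branch; the branch
    # outputs are added voxel by voxel and passed through ReLU.
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.branch1 = nn.Conv3d(in_ch, out_ch, kernel_size=1)
        self.branch3 = nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1)
        self.branch5 = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
        )
        # Assumed: project the residual when in_ch != out_ch so the sum is valid.
        self.residual = nn.Identity() if in_ch == out_ch else nn.Conv3d(in_ch, out_ch, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.branch1(x) + self.branch3(x) + self.branch5(x)
        return self.relu(y + self.residual(x))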
Further, in step 4.1, in the hybrid downsampling module:
performing maximum pooling on the input feature matrix while, in parallel, performing a convolution with kernel size 3×3×3 and stride 2; splicing the two resulting feature matrices; and fusing the features with a convolution of kernel size 1×1×1 and stride 1 to realize information compensation;
expressed by the following formula:

$$Y_{out} = C_{1\times1\times1}\big[\,\mathrm{Maxpool}(X_{in}) \oplus C_{3\times3\times3}(X_{in})\,\big]$$

where $X_{in}$ denotes the input features; $Y_{out}$ the output of the hybrid downsampling module; $C_{1\times1\times1}$ a convolution with kernel size 1×1×1 and stride 1; $C_{3\times3\times3}$ a convolution with kernel size 3×3×3 and stride 2; $\mathrm{Maxpool}$ the maximum pooling; and $\oplus$ the splicing (concatenation) operation.
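Read literally, the formula above admits the following PyTorch sketch (an illustrative reading, not the patent's own code):

import torch
import torch.nn as nn

class HybridDownsample(nn.Module):
    # Max pooling (stride 2) and a stride-2 3x3x3 convolution run in parallel;
    # their outputs are spliced along the channel axis and fused back to the
    # target channel count by a 1x1x1 convolution with stride 1.
    def __init__(self, channels: int):
        super().__init__()
        self.pool = nn.MaxPool3d(kernel_size=2, stride=2)
        self.conv = nn.Conv3d(channels, channels, kernel_size=3, stride=2, padding=1)
        self.fuse = nn.Conv3d(2 * channels, channels, kernel_size=1, stride=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = torch.cat([self.pool(x), self.conv(x)], dim=1)
        return self.fuse(y)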
Further, in step 4.1, in the channel attention module:
firstly, the input feature map is compressed along the spatial dimensions by global average pooling and global maximum pooling, respectively, giving two one-dimensional feature vectors. The two vectors are then fed into a multilayer perceptron with one hidden layer for encoding; the encoded results are added at the voxel level and passed through a Sigmoid activation, and the output is the channel weight vector;
the channel attention is calculated as follows:

$$M_c(F) = \sigma\{x_{MLP}[x_{AvgPool}(F)] + x_{MLP}[x_{MaxPool}(F)]\} = \sigma\{W_1[W_0(F_{c,avg})] + W_1[W_0(F_{c,max})]\}$$

where $F$ denotes the input feature map; $\sigma$ the Sigmoid function; $x_{AvgPool}(\cdot)$ the average pooling function; $x_{MaxPool}(\cdot)$ the maximum pooling function; $F_{c,avg}$ and $F_{c,max}$ the feature maps after average pooling and maximum pooling, respectively; and $W_0$ and $W_1$ the parameters of the two MLP layers.
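A hedged PyTorch sketch of this channel attention module follows; the reduction ratio r of the hidden layer is an assumption of the sketch (the patent does not state it).

import torch
import torch.nn as nn

class ChannelAttention3D(nn.Module):
    # Global average and global max pooling over the spatial dimensions feed a
    # shared two-layer MLP (one hidden layer, weights W0 and W1); the two encoded
    # vectors are added and passed through a Sigmoid, giving one weight per channel.
    def __init__(self, channels: int, r: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // r),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c = x.shape[:2]
        avg = x.mean(dim=(2, 3, 4))                     # global average pooling
        mx = x.amax(dim=(2, 3, 4))                      # global maximum pooling
        w = self.sigmoid(self.mlp(avg) + self.mlp(mx))  # channel weight vector M_c(F)
        return x * w.view(b, c, 1, 1, 1)                # reweight the input feature map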
further, in step 4.1, in the spatial attention module:
firstly, global average pooling and global maximum pooling are performed on the input feature matrix along the channel dimension, giving two spatial feature matrices; the two matrices are spliced, passed through a convolution with kernel size 7×7×7 and stride 1, and then through a Sigmoid activation function, generating the spatial attention feature map;
the spatial attention is calculated as follows:

$$M_s(F') = \sigma\{f^{7\times7\times7}[x_{AvgPool}(F');\, x_{MaxPool}(F')]\} = \sigma[f^{7\times7\times7}(F_{s,avg};\, F_{s,max})]$$

where $\sigma$ denotes the Sigmoid function; $F'$ the input feature map; $F_{s,avg}$ and $F_{s,max}$ the feature maps after average pooling and maximum pooling, respectively; and $f^{7\times7\times7}$ a convolution with kernel size 7×7×7.
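The spatial attention module admits an equally short sketch (again illustrative; applying the attention map back to the input is our assumption of how the module is used):

import torch
import torch.nn as nn

class SpatialAttention3D(nn.Module):
    # Channel-wise average and maximum maps are spliced and passed through a
    # 7x7x7 convolution with stride 1 and a Sigmoid, giving the map M_s(F').
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv3d(2, 1, kernel_size=7, stride=1, padding=3)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)   # average pooling along channels
        mx, _ = x.max(dim=1, keepdim=True)  # maximum pooling along channels
        attn = self.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn                     # reweight every voxel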
Further, in step 4.2, the loss function adopts the Tversky loss, which can be expressed as:

$$L_c = \sum_c (1 - I_c),$$

where

$$I_c = \frac{\sum_{i=1}^{N} p_{ic}\,g_{ic} + \epsilon}{\sum_{i=1}^{N} p_{ic}\,g_{ic} + \alpha \sum_{i=1}^{N} p_{i\bar{c}}\,g_{ic} + \beta \sum_{i=1}^{N} p_{ic}\,g_{i\bar{c}} + \epsilon}.$$

In the formula: $c$ denotes the lesion class; $g_{ic} \in \{0,1\}$ and $p_{ic} \in \{0,1\}$ denote the real label and the prediction result, respectively; $g_{i\bar{c}}$ and $p_{i\bar{c}}$ denote background voxels in the real label and in the prediction result; $p_{i\bar{c}}\,g_{ic}$ corresponds to false negatives and $p_{ic}\,g_{i\bar{c}}$ to false positives in the segmentation result; $N$ denotes the total number of voxels in the image; $\epsilon$ is a freely selectable constant; $\alpha$ and $\beta$ are the penalty weights controlling false negatives and false positives, respectively, where increasing one weight increases the penalty on the corresponding error type. Here $\alpha$ is set to 0.7 and $\beta$ to 0.3.
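For a single lesion class with soft predictions, the loss above reduces to the following sketch (a minimal reading of the formula; the epsilon value is our own choice):

import torch

def tversky_loss(pred: torch.Tensor, target: torch.Tensor,
                 alpha: float = 0.7, beta: float = 0.3,
                 eps: float = 1e-6) -> torch.Tensor:
    # pred holds the predicted lesion probabilities p_ic, target the binary
    # labels g_ic; both are flattened over all N voxels of the volume.
    p = pred.reshape(-1)
    g = target.reshape(-1)
    tp = (p * g).sum()           # true positives
    fn = ((1.0 - p) * g).sum()   # false negatives, weighted by alpha
    fp = (p * (1.0 - g)).sum()   # false positives, weighted by beta
    index = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return 1.0 - index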
As an improvement of the present invention, in step 4.1, the multi-scale fusion comprises a multi-scale convolution module and a hybrid downsampling module, giving the model feature extraction and fusion capabilities at more scales.
As an improvement of the invention, the 3 convolution branches contained in the multi-scale convolution module have receptive fields of 1×1×1, 3×3×3 and 5×5×5, respectively; superposing the branch outputs after convolution gives the module feature extraction and fusion capabilities at three different scales with a small amount of computation.
As an improvement of the present invention, in step 4.1, the hybrid downsampling module includes two downsampling branches. One branch downsamples by maximum pooling with a sampling stride of 2, which retains the strongest feature within each stride window; the other filters the image with a 3×3×3 convolution of stride 2, which is equivalent to downsampling with a stride of 2 but retains the detail features within the stride window. After the outputs of the two branches are fused by a convolution with stride 1 and kernel size 1×1×1, the network retains both coarse and fine information during downsampling, so the two kinds of information compensate each other and information loss during downsampling is avoided.
As an improvement of the present invention, in step 4.1, split attention means splitting the spatial attention and channel attention of the original attention mechanism. The original mechanism uses the two together, which is computationally expensive and does not exploit their respective advantages in a targeted way. Since shallow features are biased toward spatial information and deep features toward semantic information, while spatial attention emphasizes spatial locations and channel attention emphasizes semantics, the two kinds of attention are split and applied at different stages of the encoding part. This makes the attention more targeted, improves the network's attention to the target, and keeps the computational cost low.
As an improvement of the present invention, in step 4.1, 2 consecutive convolution operations with residual connection are adopted in the decoding submodules of the brain white matter high signal segmentation model based on multi-scale fusion and split attention, avoiding gradient vanishing and model overfitting during training.
As a preferred improvement of the present invention, in step 4.2, the loss function adopts the Tversky loss, different from the loss functions used in most conventional methods, to optimize model training; it alleviates the excessive false negatives caused by sample imbalance and thereby improves the model's sensitivity to lesions.
Compared with the prior art, the invention has the following beneficial effects:
(1) The brain white matter high signal segmentation method based on multi-scale fusion and split attention can effectively improve WMH segmentation accuracy and, in particular, identifies and segments tiny lesions well. The invention also provides a reference for the segmentation of other medical images and has positive significance for promoting computer-aided diagnosis of brain diseases.
(2) The model adds a multi-scale convolution module in the encoding stage to widen the network, giving it multi-scale feature extraction capability; a hybrid downsampling module is also added in the encoding stage, so that the network retains both coarse-scale and fine-scale information during downsampling, preventing the loss of detail information.
(3) The model adopts a split attention mechanism in the encoding stage, which not only reduces the computational cost but also improves the network's attention to the target, raising WMH segmentation accuracy.
(4) The model adopts convolution operations with residual connection in the decoding stage, avoiding gradient vanishing and model overfitting during training.
(5) The model uses the Tversky loss function during training, effectively alleviating the excessive false negatives caused by sample imbalance and improving the model's sensitivity to lesions.
Drawings
FIG. 1 is a schematic diagram of the overall structure of the white matter high signal segmentation model integrating multi-scale feature extraction and split attention;
FIG. 2 is a schematic structural diagram of a multi-scale convolution module according to the present invention;
FIG. 3 is a schematic view of a spatial attention module according to the present invention;
FIG. 4 is a schematic view of a channel attention module of the present invention;
FIG. 5 is a schematic diagram of a hybrid downsampling module according to the present invention;
FIG. 6 is a white matter high signal FLAIR image of the invention;
FIG. 7 is a manually labeled segmentation label (gold standard) corresponding to FIG. 6 according to the present invention;
FIG. 8 is a diagram illustrating the automatic segmentation of the model according to FIG. 6.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and are not intended to limit it.
The invention provides a brain white matter high signal segmentation method based on multi-scale fusion and split attention, which specifically comprises the following steps:
step 1, acquiring a white matter high signal FLAIR image data set;
specifically, the white matter high signal FLAIR image dataset comes from the public dataset of the MICCAI 2017 WMH Segmentation Challenge, which comprises brain FLAIR images of 60 patients with white matter high signal and label data of the WMH regions annotated by experts. The FLAIR data were acquired on three different scanners at three hospitals, each hospital providing 20 cases, 60 in total. The acquired FLAIR data had been preprocessed by the provider with bias field correction. Both the FLAIR images and the label images are in NIfTI format. The data sizes from the three hospitals are 132×256×83, 240×240×48 and 256×232×48, respectively.
Step 2, dividing the acquired data set into a training set, a verification set and a test set;
specifically, the 60 samples were divided at a ratio of 8:1:1 into a training set (48 samples), a validation set (6 samples) and a test set (6 samples).
Step 3, preprocessing the acquired sample data set;
firstly, the acquired FLAIR image data and label data are resized to a uniform 192×192×16; secondly, the 48 training samples are augmented, each sample undergoing a horizontal flip, 4 rotations at different angles and 3 affine transformations, generating 336 new samples; finally, the 384 augmented training samples and the test and validation sets are normalized.
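As a purely illustrative sketch of the stated augmentation scheme (the flip axis and rotation angles are assumptions, and the three affine transforms are omitted for brevity; the patent does not specify any of them):

import numpy as np
from scipy.ndimage import rotate

def augment_sample(vol: np.ndarray, label: np.ndarray):
    # One horizontal flip plus four rotations at different angles per sample;
    # labels are rotated with nearest-neighbour interpolation to stay binary.
    out = [(np.flip(vol, axis=1).copy(), np.flip(label, axis=1).copy())]
    for angle in (-10, -5, 5, 10):  # assumed angles
        out.append((
            rotate(vol, angle, axes=(0, 1), reshape=False, order=1),
            rotate(label, angle, axes=(0, 1), reshape=False, order=0),
        ))
    return out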
Step 4, constructing and training the brain white matter high signal segmentation model based on multi-scale fusion and attention splitting;
s4.1, constructing a brain white matter high signal segmentation model based on multi-scale fusion and split attention as shown in figure 1;
the model includes an encoding portion and a decoding portion.
The coding part consists of three feature extraction submodules with the same structure and a bottleneck layer.
The feature extraction submodule of the 1st coding layer comprises 2 multi-scale convolution modules (Fig. 2), 1 spatial attention module (Fig. 3) and 1 hybrid downsampling module, each with 16 convolution channels;
the feature extraction submodule of the 2nd coding layer comprises 2 multi-scale convolution modules (Fig. 2), 1 spatial attention module (Fig. 3) and 1 hybrid downsampling module, each with 32 convolution channels;
the feature extraction submodule of the 3rd coding layer comprises 2 multi-scale convolution modules (Fig. 2), 1 channel attention module (Fig. 4) and 1 hybrid downsampling module, each with 64 convolution channels;
the bottleneck layer comprises 2 multi-scale convolution modules (Fig. 2) and 1 channel attention module (Fig. 4), each with 128 convolution channels.
the decoding part consists of three decoding sub-modules with the same structure and 1 pixel classification layer.
The 1st decoding layer contains 1 deconvolution with 64 channels and kernel size 3×3×3, and 2 3×3×3 convolutions of 64 channels with residual connection; the deconvolution result is spliced with the output of the channel attention module of the 3rd coding layer.
The 2nd decoding layer contains 1 deconvolution with 32 channels and kernel size 3×3×3, and 2 3×3×3 convolutions of 32 channels with residual connection; the deconvolution result is spliced with the output of the spatial attention module of the 2nd coding layer.
The 3rd decoding layer contains 1 deconvolution with 16 channels and kernel size 3×3×3, and 2 3×3×3 convolutions of 16 channels with residual connection; the deconvolution result is spliced with the output of the spatial attention module of the 1st coding layer. The output of the 2 convolutions with residual connection is then passed to one 3×3×3 convolution with 1 channel, and pixel-level classification is performed by a softmax activation function; the output is the final segmentation result.
And S4.2, inputting the image data of the preprocessed training set and verification set into the constructed model for training.
Specifically, firstly, carrying out model hyper-parameter setting before training, including defining a loss function as a Tversery loss function; setting the learning rate to 0.0001; the optimization mode adopts an Adam random gradient descent optimizer; the number of iterations (training round) epoch is set to 400. The termination condition is used to stop training when the loss function values on the validation set no longer decrease over 10 iteration cycles.
Secondly, the preprocessed training and validation image data are input into the constructed white matter high signal segmentation model based on multi-scale fusion and split attention for training. During training, the training images are fed to the model for multiple rounds; the result of each round is verified on the validation set; when training meets the termination condition, the optimal weight parameters of the model on the training set are obtained, training stops, and the trained segmentation model and its parameters are saved.
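A minimal training-loop sketch under the stated settings (Adam, learning rate 1e-4, at most 400 epochs, early stopping after 10 stagnant validation epochs); model, train_loader and val_loader are assumed to exist, and tversky_loss is the sketch given earlier:

import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
best_val, patience, wait = float("inf"), 10, 0
for epoch in range(400):
    model.train()
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = tversky_loss(model(x), y)
        loss.backward()
        optimizer.step()
    model.eval()
    with torch.no_grad():
        val_loss = sum(tversky_loss(model(x), y).item()
                       for x, y in val_loader) / len(val_loader)
    if val_loss < best_val:
        best_val, wait = val_loss, 0
        torch.save(model.state_dict(), "best_model.pt")  # keep the best weights
    else:
        wait += 1
        if wait >= patience:  # validation loss stalled for 10 epochs
            break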
Step 5, inputting each image in the test set into a trained brain white matter high signal segmentation model based on multi-scale fusion and split attention for testing;
s5.1, testing and displaying the trained model on a test set
A binary segmentation image is obtained for each sample in the test set. FIG. 6 is an axial slice of a FLAIR image from the test set; FIG. 7 is its corresponding manually annotated segmentation label (gold standard); FIG. 8 is the segmentation result of the proposed model. Comparing FIG. 7 with FIG. 8 shows that the brain white matter high signal segmentation method based on multi-scale fusion and split attention segments small lesions well, and the overall segmentation is close to the gold standard.
S5.2, evaluating the test result
In this embodiment, three common evaluation indexes are used to evaluate the test results: DSC (Dice Similarity Coefficient), Recall and Precision. Recall reflects the proportion of actual white matter lesion voxels that are correctly segmented, measuring the completeness of lesion segmentation. Precision reflects the proportion of segmented voxels that truly belong to white matter lesions, measuring the exactness of the segmentation. DSC evaluates the overall segmentation performance; its value reflects the similarity between the segmentation result and the true label, and the larger the DSC, the closer the segmentation result is to the true label. The specific formulas are:
$$\mathrm{Recall} = \frac{x_{TP}}{x_{TP}+x_{FN}}, \qquad \mathrm{Precision} = \frac{x_{TP}}{x_{TP}+x_{FP}}, \qquad \mathrm{DSC} = \frac{2\,x_{TP}}{2\,x_{TP}+x_{FP}+x_{FN}},$$

where $x_{TP}$ denotes the number of voxels whose real label is the white matter lesion class and which are segmented as lesion; $x_{FP}$ the number of voxels whose real label is the non-lesion class but which are segmented as lesion; and $x_{FN}$ the number of voxels whose real label is the lesion class but which are segmented as non-lesion.
Evaluating the segmentation results of the model on the test set according to these formulas gives Recall = 0.84, Precision = 0.77 and DSC = 0.79.
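The three indexes can be computed from binary volumes with a short NumPy sketch (illustrative only):

import numpy as np

def evaluate_wmh(pred: np.ndarray, label: np.ndarray):
    # pred and label are binary (0/1) segmentation volumes.
    tp = np.sum((pred == 1) & (label == 1))  # x_TP
    fp = np.sum((pred == 1) & (label == 0))  # x_FP
    fn = np.sum((pred == 0) & (label == 1))  # x_FN
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    dsc = 2 * tp / (2 * tp + fp + fn)
    return recall, precision, dsc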
S5.3 comparison with the prior art method
The white matter high signal 3D U-Net segmentation method based on multi-scale fusion and split attention proposed by the invention is compared with several existing mainstream WMH segmentation methods; the comparison results are shown in Table 1.
TABLE 1: comparison with mainstream WMH segmentation methods (the table is reproduced as an image in the original publication)
The datasets used by all methods in Table 1 come from the public dataset of the MICCAI 2017 WMH Segmentation Challenge. As the table shows, the WMH segmentation method provided by the invention achieves the best values on every evaluation index. Compared with the best values among existing methods, the proposed model improves precision by 3%, DSC by 1% and recall by 1%, demonstrating that the proposed network achieves excellent results.
To improve the segmentation precision of small lesion areas and of WMH overall, the invention makes the following improvements on the basis of the 3D U-Net model:
(1) expanding the width of the 3D U-Net model by using a multi-scale convolution module, so that the model has a multi-scale feature extraction receptive field, and the feature extraction capability is enhanced;
(2) the original maximum pooling is replaced with hybrid downsampling, so that the network retains both coarse and fine information during downsampling and the loss of detail information is prevented;
(3) a split attention mechanism is adopted in the coding stage, so that the respective advantages of spatial attention and channel attention are exploited in a targeted manner and the network's attention to the target is improved;
(4) convolution operations with residual connection are adopted in the decoding stage, avoiding gradient vanishing and model overfitting during training;
(5) the Tversky loss function is used for optimization and training, controlling the balance between false negatives and false positives and improving the model's sensitivity to lesion areas.
The above description covers only preferred embodiments of the present invention, and the scope of the invention is not limited thereto; any equivalent replacement or modification made by a person skilled in the art within the technical scope disclosed by the invention, according to its technical solutions and inventive concept, shall fall within the protection scope of the present invention.

Claims (9)

1. A brain white matter high signal segmentation method based on multi-scale fusion and split attention is characterized by comprising the following steps:
step 1, acquiring a white matter high signal FLAIR image data set;
step 2, dividing the acquired white matter high signal FLAIR image data set into a training set, a verification set and a test set;
step 3, preprocessing the acquired white matter high signal FLAIR image data set;
step 4, constructing and training the brain white matter high signal segmentation model based on multi-scale fusion and attention splitting;
s4.1, constructing the brain white matter high signal segmentation model based on multi-scale fusion and split attention;
the brain white matter high signal segmentation model based on multi-scale fusion and split attention is built on the 3D U-Net framework and comprises an encoding part and a decoding part; the encoding part consists of three feature extraction submodules with the same structure and a bottleneck layer; each feature extraction submodule comprises 2 multi-scale convolution modules, 1 attention module and 1 hybrid downsampling module; the bottleneck layer comprises, in sequence, 1 multi-scale convolution module, 1 attention module and 1 multi-scale convolution module; the decoding part consists of three decoding submodules with the same structure and 1 pixel classification layer, where each decoding submodule comprises 1 deconvolution and 2 convolutions with residual connection; the model further comprises three skip connections, each of which sends the output of the attention module in a feature extraction submodule to be spliced with the deconvolution result in the corresponding decoding submodule; the split attention is embodied in that the attention modules in the 1st and 2nd coding layers are spatial attention modules, while the attention modules in the 3rd coding layer and the bottleneck layer are channel attention modules;
S4.2, inputting the preprocessed image data of the training set and the verification set into the constructed model for training; the training error is calculated with a loss function; during training, the training images are fed to the model for multiple rounds, and the result of each round is verified on the verification set; when training meets the termination condition, the optimal weight parameters of the model on the training set are obtained and training stops, yielding the trained segmentation model;
step 5, inputting each image in the test set into a trained brain white matter high signal segmentation model based on multi-scale fusion and split attention for testing;
S5.1, testing the trained model on the test set and displaying the results;
S5.2, evaluating the test results.
2. The method for multi-scale fusion and split attention based white matter high signal segmentation of brain according to claim 1, wherein in step 3, the pre-processing comprises: data enhancement, unified image size, and data normalization.
3. The method for multi-scale fusion and split attention based white matter high signal segmentation of brain according to claim 1, characterized in that in step 4.2, the termination condition of the training process is: stopping training when the loss function values on the verification set are not reduced within n iteration cycles; or stopping training when the upper limit of the iteration number is reached; and setting a learning rate, an optimization mode and iteration times in the model training process.
4. The method for multi-scale fusion and split attention based white matter high signal segmentation as claimed in claim 1, wherein in step 4.1, the multi-scale convolution module comprises 3 convolution branches and 1 residual branch; the 3 convolution branches respectively adopt 16×n convolutions of size 1×1×1, 16×n convolutions of size 3×3×3, and two consecutive 16×n convolutions of size 3×3×3, where n is the index of the coding layer and n ∈ {1,2,3,4}; the outputs of the 3 convolution branches are added voxel by voxel, then added to the output of the residual branch, and finally passed through a ReLU activation function.
5. The method for multi-scale fusion and split attention based white matter high signal segmentation of brain according to claim 1, characterized in that in step 4.1, in the hybrid down-sampling module:
performing maximum pooling on the input feature matrix while, in parallel, performing a convolution with kernel size 3×3×3 and stride 2; splicing the two resulting feature matrices; and fusing the features with a convolution of kernel size 1×1×1 and stride 1 to realize information compensation;
expressed by the following formula:

$$Y_{out} = C_{1\times1\times1}\big[\,\mathrm{Maxpool}(X_{in}) \oplus C_{3\times3\times3}(X_{in})\,\big]$$

where $X_{in}$ denotes the input features; $Y_{out}$ the output of the hybrid downsampling module; $C_{1\times1\times1}$ a convolution with kernel size 1×1×1 and stride 1; $C_{3\times3\times3}$ a convolution with kernel size 3×3×3 and stride 2; $\mathrm{Maxpool}$ the maximum pooling; and $\oplus$ the splicing (concatenation) operation.
6. The method for multi-scale fusion and split attention based white matter high signal segmentation of brain according to claim 1, characterized in that in step 4.1, in the channel attention module:
firstly, the input feature map is compressed along the spatial dimensions by global average pooling and global maximum pooling, respectively, giving two one-dimensional feature vectors; the two vectors are then fed into a multilayer perceptron with one hidden layer for encoding; the encoded results are added at the voxel level and passed through a Sigmoid activation, and the output is the channel weight vector;
the channel attention is calculated as follows:

$$M_c(F) = \sigma\{x_{MLP}[x_{AvgPool}(F)] + x_{MLP}[x_{MaxPool}(F)]\} = \sigma\{W_1[W_0(F_{c,avg})] + W_1[W_0(F_{c,max})]\}$$

where $F$ denotes the input feature map; $\sigma$ the Sigmoid function; $x_{AvgPool}(\cdot)$ the average pooling function; $x_{MaxPool}(\cdot)$ the maximum pooling function; $F_{c,avg}$ and $F_{c,max}$ the feature maps after average pooling and maximum pooling, respectively; and $W_0$ and $W_1$ the parameters of the two MLP layers.
7. The method for multi-scale fusion and split attention based white matter high signal segmentation of brain according to claim 1, characterized in that in step 4.1, in the spatial attention module:
firstly, global average pooling and global maximum pooling are performed on the input feature matrix along the channel dimension, giving two spatial feature matrices; the two matrices are spliced, passed through a convolution with kernel size 7×7×7 and stride 1, and then through a Sigmoid activation function, generating the spatial attention feature map;
the spatial attention is calculated as follows:

$$M_s(F') = \sigma\{f^{7\times7\times7}[x_{AvgPool}(F');\, x_{MaxPool}(F')]\} = \sigma[f^{7\times7\times7}(F_{s,avg};\, F_{s,max})]$$

where $\sigma$ denotes the Sigmoid function; $F'$ the input feature map; $F_{s,avg}$ and $F_{s,max}$ the feature maps after average pooling and maximum pooling, respectively; and $f^{7\times7\times7}$ a convolution with kernel size 7×7×7.
8. The method for multi-scale fusion and split attention based white matter high signal segmentation of brain according to claim 1, characterized in that in step 4.2 the loss function adopts the Tversky loss, which can be expressed as:

$$L_c = \sum_c (1 - I_c), \qquad I_c = \frac{\sum_{i=1}^{N} p_{ic}\,g_{ic} + \epsilon}{\sum_{i=1}^{N} p_{ic}\,g_{ic} + \alpha \sum_{i=1}^{N} p_{i\bar{c}}\,g_{ic} + \beta \sum_{i=1}^{N} p_{ic}\,g_{i\bar{c}} + \epsilon},$$

where $c$ denotes the lesion class; $g_{ic} \in \{0,1\}$ and $p_{ic} \in \{0,1\}$ denote the real label and the prediction result, respectively; $g_{i\bar{c}}$ and $p_{i\bar{c}}$ denote background voxels in the real label and in the prediction result; $p_{i\bar{c}}\,g_{ic}$ corresponds to false negatives and $p_{ic}\,g_{i\bar{c}}$ to false positives in the segmentation result; $N$ denotes the total number of voxels in the image; $\epsilon$ is a freely selectable constant; $\alpha$ and $\beta$ are the penalty weights controlling false negatives and false positives, respectively, where increasing one weight increases the penalty on the corresponding error type; $\alpha$ is set to 0.7 and $\beta$ to 0.3.
9. The method for multi-scale fusion and split attention based white matter high signal segmentation of brain according to claim 1, characterized in that in step 4.1, the multi-scale fusion comprises a multi-scale convolution module and a hybrid downsampling module, and the split attention splits the spatial attention and channel attention of the original attention mechanism and uses them separately.
CN202111429055.0A 2021-11-29 2021-11-29 Brain white matter high signal segmentation method based on multiscale fusion and split attention Active CN114119637B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111429055.0A CN114119637B (en) 2021-11-29 2021-11-29 Brain white matter high signal segmentation method based on multiscale fusion and split attention

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111429055.0A CN114119637B (en) 2021-11-29 2021-11-29 Brain white matter high signal segmentation method based on multiscale fusion and split attention

Publications (2)

Publication Number Publication Date
CN114119637A (en) 2022-03-01
CN114119637B (en) 2024-05-31

Family

ID=80370826

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111429055.0A Active CN114119637B (en) 2021-11-29 2021-11-29 Brain white matter high signal segmentation method based on multiscale fusion and split attention

Country Status (1)

Country Link
CN (1) CN114119637B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114898110A (en) * 2022-04-25 2022-08-12 四川大学 Medical image segmentation method based on full-resolution representation network
CN115115628A (en) * 2022-08-29 2022-09-27 山东第一医科大学附属省立医院(山东省立医院) Lacunar cerebral infarction identification system based on three-dimensional refined residual error network
CN115310486A (en) * 2022-08-09 2022-11-08 重庆大学 Intelligent detection method for welding quality

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111091130A (en) * 2019-12-13 2020-05-01 南京邮电大学 Real-time image semantic segmentation method and system based on lightweight convolutional neural network
CN112163449A (en) * 2020-08-21 2021-01-01 同济大学 Lightweight multi-branch feature cross-layer fusion image semantic segmentation method
CN112446890A (en) * 2020-10-14 2021-03-05 浙江工业大学 Melanoma segmentation method based on void convolution and multi-scale fusion
DE102019123756A1 (en) * 2019-09-05 2021-03-11 Connaught Electronics Ltd. Neural network for performing semantic segmentation of an input image
CN113052856A (en) * 2021-03-12 2021-06-29 北京工业大学 Hippocampus three-dimensional semantic network segmentation method based on multi-scale feature multi-path attention fusion mechanism

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102019123756A1 (en) * 2019-09-05 2021-03-11 Connaught Electronics Ltd. Neural network for performing semantic segmentation of an input image
CN111091130A (en) * 2019-12-13 2020-05-01 南京邮电大学 Real-time image semantic segmentation method and system based on lightweight convolutional neural network
CN112163449A (en) * 2020-08-21 2021-01-01 同济大学 Lightweight multi-branch feature cross-layer fusion image semantic segmentation method
CN112446890A (en) * 2020-10-14 2021-03-05 浙江工业大学 Melanoma segmentation method based on void convolution and multi-scale fusion
CN113052856A (en) * 2021-03-12 2021-06-29 北京工业大学 Hippocampus three-dimensional semantic network segmentation method based on multi-scale feature multi-path attention fusion mechanism

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114898110A (en) * 2022-04-25 2022-08-12 四川大学 Medical image segmentation method based on full-resolution representation network
CN115310486A (en) * 2022-08-09 2022-11-08 重庆大学 Intelligent detection method for welding quality
CN115310486B (en) * 2022-08-09 2023-09-26 重庆大学 Intelligent welding quality detection method
CN115115628A (en) * 2022-08-29 2022-09-27 山东第一医科大学附属省立医院(山东省立医院) Lacunar cerebral infarction identification system based on three-dimensional refined residual error network
CN115115628B (en) * 2022-08-29 2022-11-22 山东第一医科大学附属省立医院(山东省立医院) Lacunar infarction identification system based on three-dimensional refined residual error network

Also Published As

Publication number Publication date
CN114119637B (en) 2024-05-31

Similar Documents

Publication Publication Date Title
CN109584254B (en) Heart left ventricle segmentation method based on deep full convolution neural network
CN114119637A (en) Brain white matter high signal segmentation method based on multi-scale fusion and split attention
Akram et al. Detection of neovascularization in retinal images using multivariate m-Mediods based classifier
CN112529839B (en) Method and system for extracting carotid vessel centerline in nuclear magnetic resonance image
CN115205300B (en) Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion
Coupé et al. LesionBrain: an online tool for white matter lesion segmentation
CN112348785B (en) Epileptic focus positioning method and system
CN112465905A (en) Characteristic brain region positioning method of magnetic resonance imaging data based on deep learning
CN113205524B (en) Blood vessel image segmentation method, device and equipment based on U-Net
CN110444294B (en) Auxiliary analysis method and equipment for prostate cancer based on perception neural network
CN113689954A (en) Hypertension risk prediction method, device, equipment and medium
CN115147600A (en) GBM multi-mode MR image segmentation method based on classifier weight converter
CN113034507A (en) CCTA image-based coronary artery three-dimensional segmentation method
Al Jannat et al. Detection of multiple sclerosis using deep learning
Gulati et al. Comparative analysis of deep learning approaches for the diagnosis of diabetic retinopathy
CN114821176B (en) Viral encephalitis classification system for MR (magnetic resonance) images of children brain
CN116309615A (en) Multi-mode MRI brain tumor image segmentation method
Khani Medical image segmentation using machine learning
CN111951228B (en) Epileptogenic focus positioning system integrating gradient activation mapping and deep learning model
Pallawi et al. Study of Alzheimer’s disease brain impairment and methods for its early diagnosis: a comprehensive survey
KR102373992B1 (en) Method and apparatut for alzheimer's disease classification using texture features
Amin et al. Automated psoriasis detection using deep learning
UmaMaheswaran et al. Enhanced non-contrast computed tomography images for early acute stroke detection using machine learning approach
Naji et al. Skin diseases detection, classification, and segmentation
Sajjadi et al. Estimation of the Biological Age of the Human Brain Using Multitask Self-Supervised Learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant