CN108961274B - Automatic head and neck tumor segmentation method in MRI (magnetic resonance imaging) image - Google Patents
Automatic head and neck tumor segmentation method in MRI (magnetic resonance imaging) image
- Publication number: CN108961274B (application CN201810730473.5A)
- Authority: CN (China)
- Prior art keywords: image, size, mri, neural network, npc
- Legal status: Expired - Fee Related (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/11 — Region-based segmentation (G06T7/00 — image analysis; G06T7/10 — segmentation; edge detection)
- G06T2207/10088 — Magnetic resonance imaging [MRI] (G06T2207/10 — image acquisition modality; G06T2207/10072 — tomographic images)
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/30096 — Tumor; Lesion (G06T2207/30004 — biomedical image processing)
Abstract
The invention discloses an automatic head and neck tumor segmentation method for MRI images, comprising the following steps: training a U-net-based neural network model, where the model includes a contracting encoder for analyzing the input MRI image and an expanding decoder for generating a label map output, and skip connections in the U-net architecture combine the appearance feature representations of the shallow encoding layers with the high-level feature representations of the deep decoding layers; and segmenting the NPC tumor region in the MRI image under test using the trained neural network model. The method achieves fast, robust, and accurate automatic segmentation of NPC tumors in MRI images.
Description
Technical Field
The invention belongs to the technical field of medical images, and particularly relates to an automatic head and neck tumor segmentation method in an MRI image.
Background
Among head and neck tumors, nasopharyngeal carcinoma (NPC) is the most common type and causes high mortality; most patients have already missed the optimal treatment window by the time NPC is diagnosed. Accurate tumor delineation in Magnetic Resonance Imaging (MRI) images plays a crucial role in guiding radiation therapy.
Early diagnosis of NPC is therefore particularly important in clinical practice. NPC patients are usually diagnosed on the basis of manual segmentation and medical image analysis. Compared to other tumor types, such as brain and lung tumors, NPC tumors have a more complex anatomical structure and generally have intensity similar to that of surrounding tissues such as the brainstem, cochlea, parotid gland, and lymph nodes; in addition, tumors from different NPC patients often exhibit high shape variability. These properties make NPC tumor segmentation a particularly challenging task.
Since NPC regions in MRI images typically share visual properties with the nasal cavity region, general image segmentation techniques based on visual features may no longer be suitable for distinguishing NPC tumor margins in MRI images. Perhaps partly because of these segmentation challenges and the regional distribution of NPC cases, only a few studies exist in this direction. Existing algorithms typically extract a set of hand-crafted features for tumor segmentation; however, such methods may limit segmentation performance because of the large shape variation of NPC tumors and the similarity of their intensity values to those of adjacent tissues.
Accurate segmentation of NPC tumors, used to determine characteristics such as spread and volume, is therefore critical for diagnosis and subsequent treatment planning. However, because manual segmentation is labor intensive and different radiologists often disagree, its accuracy and robustness are limited.
Disclosure of Invention
In order to solve the above problems, the invention provides an automatic head and neck tumor segmentation method for MRI images that achieves fast, robust, and accurate automatic segmentation of NPC tumors in MRI images.
To achieve this object, the invention adopts the following technical scheme: a method for automatic head and neck tumor segmentation in MRI images, comprising the following steps:
S100, training a U-net-based neural network model: the neural network model comprises a contracting encoder for analyzing the input MRI image and an expanding decoder for generating a label map output; skip connections in the U-net architecture combine the appearance feature representations of the shallow encoding layers with the high-level feature representations of the deep decoding layers;
training the U-net-based neural network model comprises the following steps:
S101, performing data preprocessing and data augmentation on the image training set; data augmentation applies random nonlinear transformations to generate more training data, coping with the limited amount of labeled NPC data available for training and improving network performance;
S102, training the neural network on whole MRI images from the training set, using skip connections to combine hierarchical features to generate a label map; the skip connections simultaneously achieve good localization and use of context;
S103, training the U-net-based neural network model with the label maps and the augmented data;
S200, segmenting the NPC tumor region in an MRI image under test using the trained neural network model, comprising: data acquisition, image preprocessing, and NPC tumor region segmentation of the MRI image under test.
Further, the data acquisition comprises the following step: acquiring a T1-weighted MRI image, i.e. a T1-MRI image, with the scanner; the T1-MRI images have the same size from head to neck and the same voxel size.
Further, considering that NPC tumors occupy only a small area of the acquired image and that the position of the nasopharynx is relatively fixed, the image preprocessing comprises the following steps: selecting, in the axial view of each MRI slice of the T1-MRI image, a region of interest whose size matches the nasopharyngeal region; performing isotropic resampling to reach a set resolution; correcting the bias field in the MRI images; and normalizing the intensity of the T1-MRI images by subtracting the mean of the T1 sequence and dividing by its standard deviation.
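The normalization and region-of-interest selection above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names and the fixed ROI center are assumptions, and bias-field correction and isotropic resampling (typically done with a dedicated medical-imaging toolkit) are omitted.

```python
import numpy as np

def normalize_t1(volume):
    # Z-score normalization as described above: subtract the mean of the
    # T1 sequence and divide by its standard deviation.
    volume = np.asarray(volume, dtype=np.float64)
    return (volume - volume.mean()) / volume.std()

def crop_axial_roi(axial_slice, center_row, center_col, size=128):
    # Crop a fixed-size region of interest around the (relatively fixed)
    # nasopharyngeal position in an axial slice.
    half = size // 2
    return axial_slice[center_row - half:center_row + half,
                       center_col - half:center_col + half]
```

Because the nasopharynx sits at a roughly fixed anatomical position, a fixed crop center per scanner protocol is usually sufficient for this kind of ROI selection.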
Further, the T1-MRI images were acquired with a Philips Achieva 3.0 T scanner; the acquired images have the same size of 232 × 320 × 103 mm³ from head to neck and the same voxel size of 0.6061 × 0.6061 × 0.8 mm³; in the axial view of each MRI slice, a 128 × 128 mm nasopharyngeal region is selected as the region of interest; isotropic resampling is performed to reach a resolution of 1.0 × 1.0 × 1.0 mm³.
Furthermore, because NPC tumors have no definite shape and different patients usually present large variations in tumor morphology, the random nonlinear transformation uses image deformation; the data augmentation comprises the following step: obtaining labeled MRI training NPC data of different shapes through image deformation.
The image deformation divides the rows and columns of the MRI image into segments, producing boxes of equal size over the MRI image; the vertices on the box boundaries define the range of the deformation, and all vertices on the box boundaries serve as source control points, from which the target control-point positions are obtained; applying the deformation function to each vertex of the grid yields labeled MRI training NPC data of different shapes. This generates sufficiently diverse training data with distinctly different shapes, so that data augmentation of the MRI images copes with the limited amount of labeled NPC data available for training and improves network performance.
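A rough sketch of this box-based deformation is shown below, under stated assumptions: the box size, maximum control-point shift, and the use of bilinear interpolation to densify the displacement field are illustrative choices, not values from the patent.

```python
import numpy as np
from scipy.ndimage import map_coordinates, zoom

def grid_deform(image, box_size=32, max_shift=4.0, rng=None):
    # Approximate the box-based deformation described above: perturb a
    # coarse grid of control points (the box vertices) with random target
    # offsets, interpolate a dense displacement field, and resample.
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    gh, gw = h // box_size + 1, w // box_size + 1
    # Random target offsets for each source control point.
    dy = rng.uniform(-max_shift, max_shift, (gh, gw))
    dx = rng.uniform(-max_shift, max_shift, (gh, gw))
    # Upsample the coarse offsets to a per-pixel displacement field.
    dy = zoom(dy, (h / gh, w / gw), order=1)
    dx = zoom(dx, (h / gh, w / gw), order=1)
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Bilinear resampling of the image at the displaced coordinates.
    return map_coordinates(image, [rows + dy, cols + dx],
                           order=1, mode="reflect")
```

The same displacement field would be applied to the label map (with nearest-neighbor interpolation) so that image and annotation deform together.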
Further, segmenting the NPC tumor comprises the following steps: each pixel of the label map output by the neural network model is labeled 1 for tumor regions and 0 for normal regions; the skip connections used in the U-net architecture combine high-level features from the expanding decoder layers with appearance features from the contracting encoder layers; the NPC tumor region image is obtained by segmenting the NPC tumor region in the image using this combination of hierarchical features.
Further, the U-net based neural network model comprises 28 convolutional layers;
the encoder path comprises 5 convolution blocks; each convolution block contains 2 convolution layers with 3 × 3 filters and stride 1 in each dimension, each followed by a ReLU activation function; a dropout layer with rate 0.5 is placed after the last layer of the fourth and fifth blocks of the encoder path; the number of feature maps in the encoder increases from 1 to 1024; at the end of each convolution block except the last, a down-convolution layer with a 2 × 2 filter and stride 2 is placed, so that the feature map size output by successive convolution blocks decreases from 128 × 128 to 8 × 8;
the decoder path contains 4 up-convolution blocks, each starting with an up-convolution layer with stride 2 in each dimension and filter size 3 × 3, which doubles the feature map size while halving the number of feature maps, increasing the feature map size in the decoder from 8 × 8 back to 128 × 128; each up-convolution block includes 2 convolution layers, the first of which reduces the number of concatenated feature maps; the feature maps from the encoder path are copied and concatenated with the feature maps of the decoder path.
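The contracting-path dimensions described above can be traced with a few lines of arithmetic. Note the starting channel count of 64 is an assumption, chosen because it is consistent with the stated growth from 1 to 1024 feature maps over five blocks of channel doubling.

```python
def unet_shape_trace(input_size=128, in_channels=1, blocks=5, base_channels=64):
    # Trace spatial size and channel count along the contracting path:
    # 3x3 convs with stride 1 and padding 1 keep the spatial size, while the
    # 2x2 stride-2 down-convolution after every block but the last halves it.
    sizes, channels = [input_size], [in_channels]
    size, ch = input_size, base_channels
    for b in range(blocks):
        channels.append(ch)   # after the block's two 3x3 convolutions
        if b < blocks - 1:
            size //= 2        # 2x2, stride-2 down-convolution
        sizes.append(size)
        ch *= 2               # channel count doubles block to block
    return sizes, channels
```

Running the trace reproduces the figures in the text: spatial sizes 128 → 64 → 32 → 16 → 8 and channel counts 1 → 64 → 128 → 256 → 512 → 1024; the decoder reverses the spatial sequence.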
Further, 1 × 1 zero padding is applied in each convolutional layer of the U-net-based neural network model, so that the output patch size of each convolutional layer is:
I_output = (I_input − F + 2P) / S + 1,
where I_input and I_output are the input and output patch sizes of the convolutional layer, F is the filter size, P is the padding size, and S is the stride;
this output patch size calculation preserves the same feature map size within each block of the encoder and decoder paths.
Further, a 1 × 1 convolutional layer is applied in the U-net-based neural network model to reduce the number of feature maps to the number required for the label map, and a sigmoid function is applied so that the output lies in the range 0 to 1; each pixel of the label map is labeled 1 for tumor regions and 0 for normal regions; the skip connections used in the U-net architecture combine high-level features from the expanding decoder convolutional layers with appearance features from the contracting encoder convolutional layers; the NPC tumor region in the image is segmented using this combination of hierarchical features.
Further, during training, binary cross-entropy is used as the cost function, and the network is trained with stochastic gradient descent optimization to minimize the cost function with respect to the network parameters.
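The binary cross-entropy cost between the sigmoid output and the binary label map can be written directly; this is a standard numpy sketch of the loss, not the patent's training code, and the clipping epsilon is an added numerical-stability assumption.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    # Mean binary cross-entropy between the ground-truth label map
    # (tumor = 1, normal = 0) and the sigmoid output in (0, 1).
    # Clipping avoids log(0) for saturated predictions.
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0 - eps)
    y_true = np.asarray(y_true, dtype=np.float64)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1.0 - y_true) * np.log(1.0 - y_pred)))
```

An uninformative prediction of 0.5 everywhere gives a loss of ln 2 ≈ 0.693, which is the usual reference point when monitoring this cost during stochastic gradient descent.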
The beneficial effects of this technical scheme are as follows:
The proposed method uses a deep neural architecture to automatically extract feature information from the training data, better capturing the relationship between the MRI intensity image and the corresponding label map.
The invention trains the neural network on whole MRI images as input rather than on image patches, and combines the appearance feature representations of the shallow encoding layers with the high-level feature representations of the deep decoding layers through a skip-connection strategy; combining hierarchical features in this way yields better segmentation performance and enables fast, robust, and accurate automatic segmentation of NPC tumors in MRI images.
Drawings
FIG. 1 is a schematic flow chart of an automatic head and neck tumor segmentation method in an MRI image according to the present invention;
FIG. 2 is a schematic diagram illustrating an automatic head and neck tumor segmentation method in an MRI image according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating an image segmentation result according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described with reference to the accompanying drawings.
In this embodiment, referring to Fig. 1 and Fig. 2, the invention provides an automatic head and neck tumor segmentation method for MRI images, comprising the following steps:
S100, training a U-net-based neural network model: the neural network model comprises a contracting encoder for analyzing the input MRI image and an expanding decoder for generating a label map output; skip connections in the U-net architecture combine the appearance feature representations of the shallow encoding layers with the high-level feature representations of the deep decoding layers;
training the U-net-based neural network model comprises the following steps:
S101, performing data preprocessing and data augmentation on the image training set; data augmentation applies random nonlinear transformations to generate more training data, coping with the limited amount of labeled NPC data available for training and improving network performance;
S102, training the neural network on whole MRI images from the training set, using skip connections to combine hierarchical features to generate a label map; the skip connections simultaneously achieve good localization and use of context;
S103, training the U-net-based neural network model with the label maps and the augmented data;
S200, segmenting the NPC tumor region in an MRI image under test using the trained neural network model, comprising: data acquisition, image preprocessing, and NPC tumor region segmentation of the MRI image under test.
As an optimization of the above embodiment, the data acquisition comprises the following step: acquiring a T1-weighted MRI image, i.e. a T1-MRI image, with the scanner; the T1-MRI images have the same size from head to neck and the same voxel size.
Considering that NPC tumors occupy only a small area of the acquired image and that the position of the nasopharynx is relatively fixed, the image preprocessing comprises the following steps: selecting, in the axial view of each MRI slice of the T1-MRI image, a region of interest whose size matches the nasopharyngeal region; performing isotropic resampling to reach a set resolution; correcting the bias field in the MRI images; and normalizing the intensity of the T1-MRI images by subtracting the mean of the T1 sequence and dividing by its standard deviation.
The T1-MRI images were acquired with a Philips Achieva 3.0 T scanner; the acquired images have the same size of 232 × 320 × 103 mm³ from head to neck and the same voxel size of 0.6061 × 0.6061 × 0.8 mm³; in the axial view of each MRI slice, a 128 × 128 mm nasopharyngeal region is selected as the region of interest; isotropic resampling is performed to reach a resolution of 1.0 × 1.0 × 1.0 mm³.
Because NPC tumors have no definite shape and different patients usually present large variations in tumor morphology, the random nonlinear transformation uses image deformation; the data augmentation comprises the following step: obtaining labeled MRI training NPC data of different shapes through image deformation.
The image deformation divides the rows and columns of the MRI image into segments, producing boxes of equal size over the MRI image; the vertices on the box boundaries define the range of the deformation, and all vertices on the box boundaries serve as source control points, from which the target control-point positions are obtained; applying the deformation function to each vertex of the grid yields labeled MRI training NPC data of different shapes. This generates sufficiently diverse training data with distinctly different shapes, so that data augmentation of the MRI images copes with the limited amount of labeled NPC data available for training and improves network performance.
Segmenting the NPC tumor comprises the following steps: each pixel of the label map output by the neural network model is labeled 1 for tumor regions and 0 for normal regions; the skip connections used in the U-net architecture combine high-level features from the expanding decoder layers with appearance features from the contracting encoder layers; the NPC tumor region image is obtained by segmenting the NPC tumor region in the image using this combination of hierarchical features.
As an optimization of the above embodiment, the U-net-based neural network model comprises 28 convolutional layers;
the encoder path comprises 5 convolution blocks; each convolution block contains 2 convolution layers with 3 × 3 filters and stride 1 in each dimension, each followed by a ReLU activation function; a dropout layer with rate 0.5 is placed after the last layer of the fourth and fifth blocks of the encoder path; the number of feature maps in the encoder increases from 1 to 1024; at the end of each convolution block except the last, a down-convolution layer with a 2 × 2 filter and stride 2 is placed, so that the feature map size output by successive convolution blocks decreases from 128 × 128 to 8 × 8;
the decoder path contains 4 up-convolution blocks, each starting with an up-convolution layer with stride 2 in each dimension and filter size 3 × 3, which doubles the feature map size while halving the number of feature maps, increasing the feature map size in the decoder from 8 × 8 back to 128 × 128; each up-convolution block includes 2 convolution layers, the first of which reduces the number of concatenated feature maps; the feature maps from the encoder path are copied and concatenated with the feature maps of the decoder path.
1 × 1 zero padding is applied in each convolutional layer of the U-net-based neural network model, so that the output patch size of each convolutional layer is:
I_output = (I_input − F + 2P) / S + 1,
where I_input and I_output are the input and output patch sizes of the convolutional layer, F is the filter size, P is the padding size, and S is the stride;
this output patch size calculation preserves the same feature map size within each block of the encoder and decoder paths.
A 1 × 1 convolutional layer is applied in the U-net-based neural network model to reduce the number of feature maps to the number required for the label map, and a sigmoid function is applied so that the output lies in the range 0 to 1; each pixel of the label map is labeled 1 for tumor regions and 0 for normal regions; the skip connections used in the U-net architecture combine high-level features from the expanding decoder convolutional layers with appearance features from the contracting encoder convolutional layers; the NPC tumor region in the image is segmented using this combination of hierarchical features.
During training, binary cross-entropy is used as the cost function, and the network is trained with stochastic gradient descent optimization to minimize the cost function with respect to the network parameters.
The proposed method was verified by testing:
in order to visually evaluate the segmentation performance of the method of the present invention, some example segmentation results are given, as shown in fig. 3. The first row shows the MRI intensity image of the NPC subject, the second row corresponds to the segmentation results of the method of the invention, and the third row shows the manual segmentation results of the radiologist. As can be seen, even without any post-processing, our segmentation results are very close to reality, which indicates that our method can accurately segment NPC tumors in MRI images.
The method was evaluated using DSC, ASSD, PM, and CR values and compared against a dictionary learning (DL) baseline based on sparse representation. In that baseline, for each target voxel a patch around it is extracted as the target patch; the same position is then located in each training sample and a neighborhood centered on that voxel is defined; a patch of the same size as the target patch is extracted from each voxel in the neighborhood to form a patch library; because the patch library is large, a compact dictionary is obtained through dictionary learning. Once the dictionary is obtained, the label of the target voxel is found by solving the corresponding sparse-representation classification problem. The comparison results are shown in Table 1.
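Of the reported metrics, the Dice similarity coefficient (DSC) has a standard definition that can be sketched directly; this is a generic illustration of the metric, not the evaluation code used for Table 1, and the convention that two empty masks score 1.0 is an assumption.

```python
import numpy as np

def dice_coefficient(mask_a, mask_b):
    # Dice similarity coefficient between two binary masks:
    # DSC = 2 * |A intersect B| / (|A| + |B|).
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention: two empty masks count as a perfect match
    return 2.0 * np.logical_and(a, b).sum() / total
```

A DSC of 1.0 means the automatic and manual segmentations coincide exactly; values near 0 mean almost no overlap, which is why a higher DSC in Table 1 indicates better agreement with the radiologist's delineation.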
TABLE 1 comparison of parameter value evaluations
Table 1 shows that the proposed method achieves the highest DSC and the lowest ASSD, indicating that it outperforms the other three methods. The DL-based method achieves the worst segmentation performance among the four, even though it reaches the highest CR; this is because deep learning approaches do not rely on hand-crafted features but automatically learn a hierarchy of complex features from the training data. Compared to the CNN-based approach, the average DSC of the proposed method increases by about 1.67% while the average ASSD decreases by about 0.12 mm; in addition, the PM and CR values increase by 3.36% and 2.82%, respectively. Compared to the FCN-based approach, which achieves the highest PM and performs better than the CNN-based approach, the average DSC of the proposed method increases by about 1.17% while the average ASSD decreases by about 0.0048 mm; in addition, the CR value increases by 1.44%.
The experimental results prove the superiority of the deep neural network and the benefit of adopting the jump connection strategy in the deep neural network.
The foregoing shows and describes the general principles and main features of the present invention and its advantages. It will be understood by those skilled in the art that the invention is not limited to the embodiments described above, which merely illustrate its principles; various changes and modifications may be made without departing from the spirit and scope of the invention, and such changes and modifications fall within the scope of the claimed invention. The scope of the invention is defined by the appended claims and their equivalents.
Claims (8)
1. A method for automatic head and neck tumor segmentation in MRI images, characterized by comprising the following steps:
S100, training a U-net-based neural network model: the neural network model comprises a contracting encoder for analyzing the input MRI image and an expanding decoder for generating a label map output; skip connections in the U-net architecture combine the appearance feature representations of the shallow encoding layers with the high-level feature representations of the deep decoding layers;
training the U-net-based neural network model comprises the following steps:
S101, performing data preprocessing and data augmentation on the image training set, the data augmentation applying random nonlinear transformations;
S102, training the neural network on whole MRI images from the training set, using skip connections to combine hierarchical features to generate a label map;
S103, training the U-net-based neural network model with the label maps and the augmented data;
S200, segmenting the NPC tumor region in an MRI image under test using the trained neural network model, comprising: data acquisition, image preprocessing, and NPC tumor region segmentation of the MRI image under test;
the segmentation of the NPC tumor comprises the following steps: each pixel of the label map output by the neural network model is labeled 1 for tumor regions and 0 for normal regions; the skip connections used in the U-net architecture combine high-level features from the expanding decoder layers with appearance features from the contracting encoder layers; the NPC tumor region image is obtained by segmenting the NPC tumor region in the image using the combination of hierarchical features;
the data augmentation comprises the following step: obtaining labeled MRI training NPC data of different shapes through image deformation; the image deformation divides the rows and columns of the MRI image into segments, producing boxes of equal size over the MRI image; the vertices on the box boundaries define the range of the deformation, and all vertices on the box boundaries serve as source control points, from which the target control-point positions are obtained; applying the deformation function to each vertex of the grid yields labeled MRI training NPC data of different shapes.
2. The method for automatic head and neck tumor segmentation in MRI images according to claim 1, wherein the data acquisition comprises the following step: acquiring a T1-weighted MRI image, i.e. a T1-MRI image, with the scanner; the T1-MRI images have the same size from head to neck and the same voxel size.
3. The method for automatic head and neck tumor segmentation in MRI images according to claim 2, wherein the image preprocessing comprises the following steps: selecting, in the axial view of each MRI slice of the T1-MRI image, a region of interest whose size matches the nasopharyngeal region; performing isotropic resampling to reach a set resolution; correcting the bias field in the MRI images; and normalizing the intensity of the T1-MRI images by subtracting the mean of the T1 sequence and dividing by its standard deviation.
4. The method for automatic head and neck tumor segmentation in MRI images according to claim 3, wherein the T1-MRI images are acquired with a Philips Achieva 3.0 T scanner; the acquired images have the same size of 232 × 320 × 103 mm³ from head to neck and the same voxel size of 0.6061 × 0.6061 × 0.8 mm³; in the axial view of each MRI slice, a 128 × 128 mm nasopharyngeal region is selected as the region of interest; isotropic resampling is performed to reach a resolution of 1.0 × 1.0 × 1.0 mm³.
5. The method for automatic head and neck tumor segmentation in MRI images as claimed in any one of claims 1-4, wherein said U-net based neural network model comprises 28 convolutional layers;
the path of the encoder comprises 5 convolution blocks; each convolution block comprises 2 convolution layers with 3 × 3 filters and a stride of 1 in each dimension, each followed by a ReLU activation function; a dropout layer with a rate of 0.5 is arranged after the last layer of the fourth and fifth blocks of the encoder path; the number of feature maps in the encoder increases from 1 to 1024; at the end of each convolution block except the last, a down-convolution layer with a 2 × 2 filter and a stride of 2 is arranged, so that the size of the feature map output by the convolution blocks is reduced from 128 × 128 to 8 × 8;
the path of the decoder contains 4 up-convolution blocks, each starting with an up-convolution layer with a stride of 2 in each dimension and a 3 × 3 filter, which doubles the size of the feature maps in the decoder but halves their number, increasing the feature map size in the decoder from 8 × 8 to 128 × 128; each up-convolution block further includes 2 convolution layers, the first of which reduces the number of concatenated feature maps; the feature maps from the encoder path are copied and concatenated with the feature maps of the decoder path.
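The encoder sizes and channel counts stated in claim 5 can be verified with simple arithmetic: five blocks, with a stride-2 downsampling after all but the last (128 → 8), while the feature count doubles per block up to 1024. A pure-Python check (the base channel count of 64 is inferred from the stated 1024 at depth 5):

```python
def unet_shapes(input_size=128, base_channels=64, blocks=5):
    sizes, channels = [input_size], [base_channels]
    for _ in range(blocks - 1):
        sizes.append(sizes[-1] // 2)        # 2x2 stride-2 layer halves the map
        channels.append(channels[-1] * 2)   # each deeper block doubles features
    return sizes, channels

sizes, channels = unet_shapes()
print(sizes)     # [128, 64, 32, 16, 8]
print(channels)  # [64, 128, 256, 512, 1024]
```

The decoder simply traverses these lists in reverse: each up-convolution doubles the map size and halves the channel count.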
6. The method of claim 5, wherein 1 x 1 zero padding is applied in each convolutional layer of the U-net based neural network model, so that the output patch size of each convolutional layer is:
I_output = (I_input - F + 2P) / S + 1,

wherein I_input and I_output are the patch sizes of the input and output of the convolutional layer, F is the filter size, P is the padding size, and S is the stride size;
the calculation of the output patch size described above is such that the same size of feature map is preserved in each block of the encoder path and the decoder path.
7. The method of automatic head and neck tumor segmentation in MRI images as claimed in claim 6, characterized in that a 1 × 1 convolutional layer is applied in the U-net based neural network model to reduce the number of feature maps to the number required by the label map, and a sigmoid function is applied so that the output lies in the range of 0 to 1; in the label map, each pixel of a tumor area is labeled 1 and each pixel of a normal area is labeled 0; the skip connections used in the U-net architecture combine the high-level features from the expanding decoder convolutional layers with the appearance features from the contracting encoder convolutional layers; the NPC tumor area in the image is segmented by this combination of hierarchical features.
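The behaviour of the final layer in claim 7 can be sketched in numpy: the sigmoid maps the 1 × 1 convolution outputs into (0, 1), and thresholding at 0.5 yields the binary label map (1 = tumor, 0 = normal). The logit values below are illustrative.

```python
import numpy as np

def sigmoid(x):
    """Squash real-valued network outputs into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-x))

logits = np.array([[-2.0, 0.5],
                   [3.0, -0.1]])        # toy 2x2 output of the 1x1 conv layer
probs = sigmoid(logits)                 # per-pixel tumor probabilities
label_map = (probs > 0.5).astype(np.uint8)
print(label_map)  # [[0 1]
                  #  [1 0]]
```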
8. The method of claim 7, wherein in the training process, binary cross entropy is used as the cost function, and the network is trained by stochastic gradient descent optimization to minimize the cost function with respect to its parameters.
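A minimal sketch of the training objective in claim 8: binary cross entropy between the sigmoid outputs and the binary tumor mask, minimized by gradient descent. The toy linear model, learning rate, and iteration count are illustrative stand-ins for the full U-net.

```python
import numpy as np

def bce(y_true, y_pred, eps=1e-7):
    """Binary cross entropy between labels and sigmoid outputs."""
    p = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 4))             # 32 voxels with 4 features each
y = (x[:, 0] > 0).astype(np.float64)     # toy binary tumor labels
w = np.zeros(4)                          # parameters of a toy linear model

for _ in range(200):                     # gradient descent on the cost
    p = 1.0 / (1.0 + np.exp(-(x @ w)))
    w -= 0.1 * (x.T @ (p - y)) / len(y)  # gradient of BCE w.r.t. w

final = bce(y, 1.0 / (1.0 + np.exp(-(x @ w))))
print(final < np.log(2))                 # True: below the w = 0 starting loss
```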
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810730473.5A CN108961274B (en) | 2018-07-05 | 2018-07-05 | Automatic head and neck tumor segmentation method in MRI (magnetic resonance imaging) image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108961274A CN108961274A (en) | 2018-12-07 |
CN108961274B true CN108961274B (en) | 2021-03-02 |
Family
ID=64485908
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810730473.5A Expired - Fee Related CN108961274B (en) | 2018-07-05 | 2018-07-05 | Automatic head and neck tumor segmentation method in MRI (magnetic resonance imaging) image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108961274B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11969239B2 (en) * | 2019-03-01 | 2024-04-30 | Siemens Healthineers Ag | Tumor tissue characterization using multi-parametric magnetic resonance imaging |
CN109919954B (en) * | 2019-03-08 | 2021-06-15 | 广州视源电子科技股份有限公司 | Target object identification method and device |
CN109961446B (en) * | 2019-03-27 | 2021-06-01 | 深圳视见医疗科技有限公司 | CT/MR three-dimensional image segmentation processing method, device, equipment and medium |
CN110009623B (en) | 2019-04-10 | 2021-05-11 | 腾讯医疗健康(深圳)有限公司 | Image recognition model training and image recognition method, device and system |
CN110992338B (en) * | 2019-11-28 | 2022-04-01 | 华中科技大学 | Primary stove transfer auxiliary diagnosis system |
CN111784792A (en) * | 2020-06-30 | 2020-10-16 | 四川大学 | Rapid magnetic resonance reconstruction system based on double-domain convolution neural network and training method and application thereof |
CN113034461A (en) * | 2021-03-22 | 2021-06-25 | 中国科学院上海营养与健康研究所 | Pancreas tumor region image segmentation method and device and computer readable storage medium |
CN113192014B (en) * | 2021-04-16 | 2024-01-30 | 深圳市第二人民医院(深圳市转化医学研究院) | Training method and device for improving ventricle segmentation model, electronic equipment and medium |
CN114155215B (en) * | 2021-11-24 | 2023-11-10 | 中山大学肿瘤防治中心(中山大学附属肿瘤医院、中山大学肿瘤研究所) | Nasopharyngeal carcinoma recognition and tumor segmentation method and system based on MR image |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1644869A4 (en) * | 2003-07-15 | 2011-01-26 | Medical Metrx Solutions Inc | Using statistical process control (spc) to demonstrate solution convergence in a technician guided segmentation system |
CN102999917B (en) * | 2012-12-19 | 2016-08-03 | 中国科学院自动化研究所 | Cervical cancer image automatic segmentation method based on T2-MRI and DW-MRI |
CN104851101A (en) * | 2015-05-25 | 2015-08-19 | 电子科技大学 | Brain tumor automatic segmentation method based on deep learning |
CN104933711B (en) * | 2015-06-10 | 2017-09-29 | 南通大学 | A kind of automatic fast partition method of cancer pathology image |
US10453200B2 (en) * | 2016-11-02 | 2019-10-22 | General Electric Company | Automated segmentation using deep learned priors |
CN107220980B (en) * | 2017-05-25 | 2019-12-03 | 重庆师范大学 | A kind of MRI image brain tumor automatic division method based on full convolutional network |
Non-Patent Citations (2)
Title |
---|
Automatic Brain Tumor Detection and Segmentation Using U-Net Based Fully Convolutional Networks; Hao Dong et al.; 2017-06-22; pp. 506-517 * |
Fully Convolutional Networks for Semantic Segmentation; Jonathan Long et al.; 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2015-10-15; pp. 3431-3440 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20210302 |