CN115578404B - Liver tumor image enhancement and segmentation method based on deep learning - Google Patents
- Publication number
- CN115578404B CN115578404B CN202211417364.0A CN202211417364A CN115578404B CN 115578404 B CN115578404 B CN 115578404B CN 202211417364 A CN202211417364 A CN 202211417364A CN 115578404 B CN115578404 B CN 115578404B
- Authority
- CN
- China
- Prior art keywords
- liver
- liver tumor
- segmentation model
- model
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30056—Liver; Hepatic
Abstract
The invention discloses a deep-learning-based method for enhancing and segmenting liver tumor images. A liver tumor image preprocessing model is designed to improve image quality and expand the data, producing a large number of high-quality liver tumor images. A loss function adapted to the unbalanced foreground-to-background ratio of the segmentation target is designed, improving feature learning for small-proportion targets during training. In addition, a 2.5D liver tumor segmentation network model with three-dimensional spatial information extraction capability is designed to learn both the two-dimensional features of the liver tumor and the three-dimensional spatial features of the image, and to segment an accurate tumor region. The method improves feature learning for images with relatively scarce data and small segmentation targets, and effectively improves liver tumor segmentation performance.
Description
Technical Field
The invention relates to the field of medical image segmentation, and in particular to a deep-learning-based method for enhancing and segmenting liver tumor images.
Background
Liver tumor is a disease with high incidence and a high probability of malignant progression, so early prediction and clinical diagnosis of liver tumor are very important. The traditional liver-puncture-based clinical diagnosis of liver cancer, although highly accurate, is difficult to perform, burdens the patient physically, and requires postoperative recovery. Current computer-aided diagnosis technology based on deep learning can detect and diagnose liver tumor early, non-invasively, and with little manual involvement. It can overcome the limitation that a doctor's subjective judgment is constrained by experience and knowledge, and can help the doctor notice subtle changes that are easily missed, making diagnosis more accurate and scientific.
Although computer-aided diagnosis technology has great advantages, deep-learning-based segmentation models depend on large data sets to support model training; medical images are often difficult to obtain for privacy reasons, liver tumor regions occupy a small proportion of CT images, and the data sets suffer from class imbalance. Therefore, the invention uses image expansion and image enhancement techniques to improve the quality and quantity of the data sets, then designs a loss function suited to the image characteristics of liver tumors and constructs an efficient and reliable liver tumor segmentation model.
Disclosure of Invention
The invention aims to provide a deep-learning-based liver tumor image enhancement and segmentation method in which image enhancement and expansion are fused into the segmentation framework and combined with a 2.5D segmentation method, improving feature learning for images with scarce data and small segmentation targets and effectively improving liver tumor segmentation performance.
In order to achieve the purpose, the invention provides the following scheme:
a method of liver tumor image enhancement and segmentation based on deep learning, the method comprising:
s1: batch pre-processing a computed tomography (CT) dataset; preprocessing the abdominal CT datasets in a given collection in batch and adjusting their Hounsfield unit (HU) values to obtain CT volumes with higher discrimination of the liver tumor region; the given datasets include, but are not limited to, LiTS and 3DIRCADb;
s2: designing an expansion model for the CT dataset based on a generative adversarial network (GAN); based on the preprocessed CT dataset, designing a GAN-based CT dataset expansion model to obtain a high-quality liver tumor CT dataset;
s3: randomly dividing the CT dataset without overlap; converting the high-quality (enhanced) liver tumor CT volumes from s2 into 2D slices, randomly grouping them, and dividing each enhanced CT dataset into a training set and a validation set with no intersection;
s4: designing a loss function for the liver tumor segmentation model, based on the proportion and distribution characteristics of liver tumors in abdominal CT datasets;
s5: designing a liver tumor segmentation model, based on the loss function designed in s4 and the characteristics of medical images;
s6: training the liver tumor segmentation model and performing liver tumor segmentation; taking the training set obtained in s3 as the input of the liver tumor segmentation model designed in s5, training the model, and segmenting liver tumors with the trained model.

Further, in s1 the CT dataset is batch-preprocessed as follows: an HU-value adjustment is applied to the existing 3D liver CT dataset, the HU values of all CT volumes are clipped to the range [-200, 250], and each 3D CT volume is divided into 2D slices of 512 × 512 resolution.
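A minimal sketch of the s1 preprocessing, assuming NumPy volumes in Hounsfield units (function names are illustrative, not from the patent):

```python
import numpy as np

def preprocess_ct(volume_hu, lo=-200.0, hi=250.0):
    """Clip a 3D CT volume (in Hounsfield units) to the [-200, 250]
    liver window used by the method and rescale to [0, 1]."""
    clipped = np.clip(volume_hu.astype(np.float32), lo, hi)
    return (clipped - lo) / (hi - lo)

def to_axial_slices(volume):
    """Split a (Z, 512, 512) volume into a list of 2D axial slices."""
    return [volume[z] for z in range(volume.shape[0])]
```

A real pipeline would read DICOM or NIfTI volumes and write the slices to disk; this sketch only shows the windowing arithmetic and the slicing step.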
Further, the GAN-based CT expansion model in s2 is designed as follows: the CT dataset preprocessed in s1 is used as one part of the training set for the expansion model, and abdominal magnetic resonance imaging (MRI) provided in a combined CT-MRI healthy abdominal organ segmentation dataset is used as the other part, so that modality migration can generate the required liver CT;
firstly, liver CT pathological-region images are generated with a cycle-consistent generative adversarial network (CycleGAN), which comprises two generators and two discriminators: the first generator G generates a liver tumor pathological region Gg, the second generator F generates a normal liver region Fb, the first discriminator DX distinguishes real from generated normal liver CT, and the second discriminator DY distinguishes real from generated liver tumor pathological-region images; the discriminators DX and DY give a high score to real CT and a low score to generated CT;
the objective function of the designed liver CT pathological-region image generation network comprises a dual adversarial loss and a cycle-consistency loss; for the first generator G and the second discriminator DY, the dual adversarial loss combines two generative adversarial networks, and formula (1) is:

L_dual(G, D_Y) = L_GAN1(G, D_Y, g, b) + λ1 · L_GAN2(G, D_Y, g, b)    (1)

where L_dual(G, D_Y) is the dual adversarial loss value, g denotes the real normal-region sub-image domain, b denotes the real pathological-region image domain, λ1 is a control parameter governing the relative importance of similarity and diversity between images, L_GAN1 is the objective function of the first generative adversarial network, and L_GAN2 is the objective function of the second generative adversarial network;
the objective function of the GAN-based CT expansion model is the same as that of the original generative adversarial network; it is defined by formula (2) as:

L = min_G max_D E_{x∼p_data(x)}[log D(x)] + E_{z∼p_g(z)}[log(1 − D(G(z)))]    (2)

where L is the objective function value of the CT expansion model, D(G(z)) is the probability value the discriminator computes for a generated CT, p_g is the distribution of generated CT, G(z) is a CT produced by the generator, p_data(x) is the distribution of real CT, and D(x) is the value obtained after a real CT is input into the discriminator;
for the second generator F and the first discriminator DX, the dual adversarial loss of formula (3) is obtained analogously:

L_dual(F, D_X) = L_GAN1(F, D_X, b, g) + λ1 · L_GAN2(F, D_X, b, g)    (3)

where L_dual(F, D_X) is the dual adversarial loss value, L_GAN1 is the objective function of the first generative adversarial network, and L_GAN2 is the objective function of the second generative adversarial network.
Further, in s3 the dataset is randomly divided without overlap as follows: the high-quality liver tumor CT is a 3D image; it is read along different directions and stored as 2D slices, which are divided into a training set and a validation set in a fixed proportion. Training-set slices are repeatedly cropped to a resolution of 256 × 256 pixels. Validation-set slices keep their original size: cross-sectional (axial) slices have a resolution of 512 × 512 pixels, while coronal and sagittal slices have a resolution of 512 × n pixels, where n is not fixed and varies with the Z-axis length of the 3D CT.
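The multi-direction reading and repeated cropping can be sketched as follows; the crop count and RNG seed are arbitrary illustrative choices:

```python
import numpy as np

def multi_plane_slices(volume):
    """Read a 3D CT volume of shape (Z, H, W) along three directions:
    axial (cross-sectional) slices of H x W, coronal slices of Z x W,
    and sagittal slices of Z x H."""
    axial    = [volume[z, :, :] for z in range(volume.shape[0])]
    coronal  = [volume[:, y, :] for y in range(volume.shape[1])]
    sagittal = [volume[:, :, x] for x in range(volume.shape[2])]
    return axial, coronal, sagittal

def random_crops(slice2d, size=256, n=4, rng=None):
    """Repeatedly crop size x size training patches from one slice."""
    rng = rng or np.random.default_rng(0)
    h, w = slice2d.shape
    crops = []
    for _ in range(n):
        top = rng.integers(0, h - size + 1)
        left = rng.integers(0, w - size + 1)
        crops.append(slice2d[top:top + size, left:left + size])
    return crops
```

Note the coronal and sagittal slices come out as Z × 512 arrays, matching the 512 × n resolution described above.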
Further, in s4 the loss function for the liver tumor segmentation model is designed as follows: the image characteristics of the abdominal CT dataset are analyzed, the loss function is designed according to the analysis, and the optimal values of its hyperparameters are determined experimentally. The designed loss function is given by formula (4):

L_seg = Σ_C w_C · (L_IoU^C)^γ    (4)

where L_seg is the loss function designed for the liver tumor segmentation model, C is the class (liver tumor region or non-tumor region) of a pixel, w_C is the weight coefficient of the class, γ is the focusing coefficient, and L_IoU^C is the intersection-over-union (IoU) loss, expressed by formula (5):

L_IoU^C = 1 − (Σ_{i=1}^{N} p_i^C · y_i^C) / (Σ_{i=1}^{N} (p_i^C + y_i^C − p_i^C · y_i^C))    (5)

where N is the number of image pixels, p_i^C is the probability with which the segmentation model predicts pixel i as class C, and y_i^C is the ground-truth indicator that pixel i belongs to class C.
Further, the liver tumor segmentation model designed in s5 consists of a liver segmentation model and a tumor segmentation model, both obtained by jointly training on CT slices in multiple directions; high-resolution and low-resolution information are combined in an encoder, skip connection, and decoder form, specifically:
each layer of the encoder processes the feature map with a deep residual block and applies two operations to its input. In the first, the input passes through two convolution groups of different scales whose outputs are concatenated; each group contains a convolution with a 1 × 1 kernel and a convolution with a 3 × 3 kernel. In the second, a channel attention value is computed from the previous output and multiplied with the output of the two convolution groups, and the resulting feature map is added to the input feature map. The channel attention computation comprises squeeze and excitation operations: the feature map is compressed to one value per channel by global average pooling, and a fully connected layer then predicts the importance of each channel to complete the excitation. Finally, global average pooling produces the output of each encoder layer;
the skip connection combines a channel attention mechanism and a spatial attention mechanism in parallel. The channel attention mechanism is computed the same way as in the encoder; the spatial attention mechanism applies global average pooling and global max pooling, reduces dimensionality with a convolution whose kernel is 7 × 7, and produces the attention map through a sigmoid function. The spatial and channel attention values extracted from each encoder layer's output are applied to the feature map of the corresponding decoder layer and used as that layer's input;
each decoder layer applies convolution at different scales, batch normalization, and rectified linear unit (ReLU) activation to its input in turn, obtains the output corresponding to each encoder layer through upsampling, and repeats this process to complete feature restoration.
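A toy NumPy sketch of the spatial attention branch described above (channel-wise average and max pooling, a 7 × 7 convolution, then a sigmoid); the single-filter kernel is an assumption for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(feat, kernel):
    """Spatial attention over a (C, H, W) feature map: channel-wise
    global average and max pooling give a (2, H, W) descriptor, a
    same-padded 7x7 correlation with one (2, 7, 7) filter reduces it
    to a single map, and a sigmoid yields per-pixel weights that
    rescale every channel of the input."""
    desc = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(desc, ((0, 0), (pad, pad), (pad, pad)))
    h, w = desc.shape[1:]
    conv = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            conv[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return feat * sigmoid(conv)[None]
```

A trained network would learn the kernel; the explicit loops stand in for a convolution layer purely for clarity.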
Furthermore, the liver segmentation model is trained with a training set mixing cross-sectional, coronal, and sagittal slices: the encoder extracts and learns liver features, the skip connections assign feature weights to the feature maps, and the decoder restores the features; after several rounds of training, a multi-plane liver segmentation model adapted to cross-sectional, coronal, and sagittal liver CT slices is fitted. For the tumor segmentation model, the result of the liver segmentation model is taken as input; the training set removes the background outside the liver using the liver ground-truth labels as reference, eliminating the influence of a complex background on the tumor feature learning of the model, and after several rounds of training a multi-plane tumor segmentation model adapted to tumor features is fitted.
Further, the training of the liver tumor segmentation model and the liver tumor segmentation in s6 proceed as follows: during training, model capability is optimized by computing the Dice similarity coefficient loss and applying back-propagation until the model is fitted; after fitting, the segmentation performance of the model is checked with the validation set.
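The Dice similarity coefficient monitored during training can be sketched as (a minimal NumPy version; 1 − DSC is the quantity minimized):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient between a binary prediction and
    the ground-truth mask: 2|P ∩ T| / (|P| + |T|)."""
    inter = (pred * target).sum()
    return float((2.0 * inter + eps) / (pred.sum() + target.sum() + eps))
```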
Further, the trained liver tumor segmentation model simultaneously segments and fuses CT slices in multiple directions, specifically:
respectively inputting the CT slices of the cross section, the coronal plane and the sagittal plane into a liver tumor segmentation model to obtain a segmentation probability map;
respectively multiplying the segmentation probability graphs of different planes by preset weights;
and adding the weighted segmentation probability maps pixel by pixel and converting the result into the final segmentation result.
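The three fusion steps above can be sketched as follows; the 0.5/0.25/0.25 plane weights in the usage are assumed values, not weights from the patent:

```python
import numpy as np

def fuse_planes(prob_maps, weights, threshold=0.5):
    """Weighted pixel-wise fusion of per-plane segmentation probability
    maps, thresholded to a binary mask."""
    fused = sum(w * p for w, p in zip(weights, prob_maps))
    return (fused >= threshold).astype(np.uint8)
```

Usage: `fuse_planes([axial_probs, coronal_probs, sagittal_probs], [0.5, 0.25, 0.25])`, where each probability map comes from running the model on slices of one plane and reassembling them.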
The invention has the beneficial effects that:
(1) The invention designs a liver tumor image preprocessing model that improves the quality of liver tumor images and expands the data, yielding a large number of high-quality liver tumor images.
(2) The invention designs a loss function adapted to the unbalanced foreground-to-background ratio of the segmentation target, improving feature learning for small-proportion targets during training.
(3) The invention designs a 2.5D liver tumor segmentation network model with three-dimensional spatial information extraction capability that learns the two-dimensional features of the liver tumor and the three-dimensional spatial features of the image and segments accurate tumor regions.
(4) The method improves feature learning for images with relatively scarce data and small segmentation targets, and effectively improves liver tumor segmentation performance.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without inventive exercise.
FIG. 1 is a flow chart of a method for enhancing and segmenting a liver tumor image according to the present invention;
FIG. 2 is a data comparison diagram of a CT image after being preprocessed and expanded according to the present invention;
FIG. 3 is a flow chart of image augmentation based on generation of a countermeasure network provided by the present invention;
FIG. 4 is a diagram of a liver tumor segmentation model structure according to the present invention;
FIG. 5 is a schematic diagram of a training set slice provided by the present invention;
fig. 6 is a flowchart of a liver tumor segmentation model provided by the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention relates to deep learning, computer vision, the GAN framework, the U-Net framework, and related technical fields, in particular to the design and training of deep neural networks for image enhancement and image segmentation and the fusion of multi-plane segmentation results. The invention aims to provide a deep-learning-based liver tumor image enhancement and segmentation method that fuses image enhancement and expansion into the segmentation framework and combines it with a 2.5D segmentation method, overcoming the limitations that scarce medical image data and hardware constraints place on segmentation model training.
In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, the present invention is described in detail with reference to the accompanying drawings and the detailed description thereof.
The invention adopts the flow chart shown in fig. 1, and the key steps of the method and the processing comprise:
s1: carrying out batch preprocessing on the CT in the data set;
s2: designing a CT expansion model based on a generative adversarial network;
s3: randomly dividing the data set without cross;
s4: designing a loss function for a liver tumor segmentation model;
s5: designing a liver tumor segmentation model;
s6: training a segmentation model and performing liver tumor segmentation.
Of the above 6 main steps, s1 and s3 are the foundation of the implementation, and s2 and s5 are the core. The steps are implemented as follows:
s1: carrying out batch preprocessing on the CT in the data set;
in the invention, the HU value of the CT is adjusted to be in a range of [ -200,250] so as to enable the shape outline of the liver to be clearer and be more beneficial to the subsequent training of a liver tumor segmentation network, and the comparison of the HU value before and after adjustment is shown in figure 2.
S2: designing a CT augmentation model based on generating a countermeasure network;
the CT dataset has been preprocessed in S1, but the existing CT dataset has a problem of few lesion samples, and needs to be expanded in order to balance the number of normal samples and the number of samples including nodules in the dataset. The preprocessed CT data set in S1 is used as a partial training set for S2 training of the augmented model based on generation of the confrontation network, and training of the model further requires abdominal MRI provided in a combined CT-MRI healthy abdominal organ segmentation (CHAOS) data set as another partial training set for modality migration to generate the required liver CT.
As shown in fig. 3, to generate liver tumor images, liver CT pathological-region images are first generated with a cycle-consistent generative adversarial network comprising two generators and two discriminators: the first generator G generates a liver tumor pathological region Gg, the second generator F generates a normal liver region Fb, the first discriminator DX distinguishes real from generated normal liver CT, and the second discriminator DY distinguishes real from generated liver tumor pathological-region images; the discriminators DX and DY give a high score to real CT and a low score to generated CT;
the objective function of the designed liver CT pathological-region image generation network comprises a dual adversarial loss and a cycle-consistency loss; for the first generator G and the second discriminator DY, the dual adversarial loss can be expressed as formula (1):

L_dual(G, D_Y) = L_GAN1(G, D_Y, g, b) + λ1 · L_GAN2(G, D_Y, g, b)    (1)

The dual adversarial loss utilizes two GANs, where L_dual(G, D_Y) is the dual adversarial loss value, g represents the real normal-region sub-image domain, b represents the real pathological-region image domain, λ1 is a control parameter governing the relative importance of similarity and diversity between images, L_GAN1 is the objective function of the first GAN, and L_GAN2 is the objective function of the second GAN;
the GAN-based CT expansion model shares the objective function of the original GAN; it is defined by formula (2) as:

L = min_G max_D E_{x∼p_data(x)}[log D(x)] + E_{z∼p_g(z)}[log(1 − D(G(z)))]    (2)

where L is the objective function of the CT expansion model, D(G(z)) is the probability value the discriminator computes for a generated CT, p_g is the distribution of generated CT, G(z) is a CT produced by the generator, p_data(x) is the distribution of real CT, and D(x) is the value obtained after a real CT is input into the discriminator.
For the second generator F and the first discriminator DX, the dual adversarial loss of formula (3) is obtained analogously:

L_dual(F, D_X) = L_GAN1(F, D_X, b, g) + λ1 · L_GAN2(F, D_X, b, g)    (3)

where L_dual(F, D_X) is the dual adversarial loss value, L_GAN1 is the objective function of the first GAN, and L_GAN2 is the objective function of the second GAN.
The dual adversarial loss combines the advantages of the KL divergence and the reverse KL divergence to yield a more effective objective function, so that the model reaches Nash equilibrium more easily, convergence is accelerated, mode collapse is avoided, and the quality and diversity of the generated liver tumor pathological-region images are ensured.
The cycle-consistency loss ensures a one-to-one mapping between images in the CT and MRI domains. In this way the proposed generative model learns liver tumor features more comprehensively, and tumor-free normal liver images can be mapped one-to-one into tumor-bearing pathological-region images. The cycle-consistency loss therefore enables the designed GAN-based expansion model to generate tumor-bearing liver pathological-region images from tumor-free normal liver sub-regions, expanding the liver CT dataset and alleviating its class imbalance.
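A toy sketch of the cycle-consistency constraint described here, with stand-in callables for the generators (the real model uses neural networks; the L1 form follows the standard CycleGAN formulation):

```python
import numpy as np

def cycle_consistency_loss(x, y, G, F):
    """L1 cycle loss: F(G(x)) should reconstruct x, and G(F(y))
    should reconstruct y. G maps domain X (e.g. normal-region images)
    to domain Y (pathological-region images); F maps back."""
    return float(np.mean(np.abs(F(G(x)) - x)) + np.mean(np.abs(G(F(y)) - y)))
```

With perfectly inverse generators the loss is zero; any failure to reconstruct the input raises it, which is what forces the one-to-one mapping.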
S3: randomly dividing the data set without cross;
the data sets used by the present invention include MICCAI 2017 LiTS data set, 3DIRCADb data set and driver 07 data set, where the number of 3 DCTs containing correctly labeled data (ground route) in the three data sets is 131, 20, respectively, and are all preprocessed in S1. The specific amounts are as follows.
101 volumes of the LiTS dataset are assigned to the training set and 30 to the validation set; the 3DIRCADb and Sliver07 datasets contribute a total of 40 volumes to the validation set. Meanwhile, the liver CT generated in s2 is labeled manually and inserted randomly into the training set, one generated slice for every 5 real slices. To speed up training while learning as many features as possible from limited data, the cross-sectional, coronal, and sagittal slices of the training set are repeatedly cropped into 256 × 256 sub-images; the slices of the validation set keep their original size. The detailed information of the training set and the validation set is as follows.
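The one-generated-per-five-real insertion can be sketched as (the helper name is hypothetical):

```python
def interleave_generated(real_slices, generated_slices):
    """Build the augmented training list by inserting one generated
    slice after every 5 real slices, until generated slices run out."""
    out, gen = [], iter(generated_slices)
    for i, s in enumerate(real_slices, 1):
        out.append(s)
        if i % 5 == 0:
            nxt = next(gen, None)
            if nxt is not None:
                out.append(nxt)
    return out
```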
S4: designing a loss function for a liver tumor segmentation model;
Because of inherent properties of medical images, such as the small proportion of the liver tumor area in abdominal CT, irregular shape, and blurred edges, the dataset classes used for liver tumor segmentation are unbalanced. When trained on class-imbalanced data, segmentation models based on traditional loss functions are biased toward predicting the majority classes. Designing a suitable loss function for the segmentation model can greatly alleviate this imbalance; the loss function is formula (4):
L_seg = Σ_C w_C · (L_IoU^C)^γ    (4)

where L_seg is the loss function designed for the liver tumor segmentation model, C is the class (liver tumor region or non-tumor region) to which a pixel belongs, w_C is the weight coefficient of that class, γ is the focusing coefficient, and L_IoU^C is the intersection-over-union (IoU) loss, expressed by formula (5):

L_IoU^C = 1 − (Σ_{i=1}^{N} p_i^C · y_i^C) / (Σ_{i=1}^{N} (p_i^C + y_i^C − p_i^C · y_i^C))    (5)

where N is the number of image pixels, p_i^C is the probability with which the segmentation model predicts pixel i as class C, and y_i^C is the ground-truth indicator that pixel i belongs to class C; the ratio of intersection to union measures the effect of the segmentation model. To balance the imbalance between foreground and background, and between easily segmented areas and blurred-edge areas, the weight coefficient w_C and the focusing coefficient γ are added on top of the Dice-style loss. When the true class of a pixel belongs to the foreground (the tumor region, which contains far fewer pixels), the higher weight w_C makes the model concentrate on learning the minority class; in addition, the exponential term assigns smaller gradient values to well-predicted pixels and larger gradient values to poorly predicted ones, forcing the model to focus on hard-to-classify edge regions.
After the basic form of the loss function is determined, it is further optimized: according to the characteristics of the medical images in the data set, and in combination with small-scale experiments, the hyperparameters in the loss function are adjusted to their optimal values. The specific process is as follows:
1. Randomly select 10% of the medical images from the original data set as test cases;
2. Determine initial values for the weight coefficient and the focusing parameter in the loss function according to the characteristics of the images in the data set;
3. Carry out small-scale experiments, observe the convergence speed and segmentation effect of the segmentation model under different values of the loss-function hyperparameters, and adjust the hyperparameters to their optimal values;
4. Construct the liver tumor segmentation model based on the loss function.
S5: designing a liver tumor segmentation model;
Because medical images have blurred boundaries and complex gradients, more high-resolution information is needed for accurate segmentation; meanwhile, the distribution of organs in the human body is relatively fixed, the distribution of segmentation targets follows certain rules, and the semantics are simple and clear, so low-resolution information is also needed to identify the target object. U-Net introduces a skip connection structure between the encoder and decoder, which combines low-resolution and high-resolution information well and is well suited to medical image segmentation; the liver tumor segmentation model is therefore designed on the basis of the encoder-skip connection-decoder structure. Each layer of the encoder comprises two convolution groups and a channel attention mechanism. Each convolution group consists of two convolutional layers: one built from a convolution with a 1 × 1 kernel, batch normalization, and a rectified linear unit (ReLU), and one built from a convolution with a 3 × 3 kernel, batch normalization, and a ReLU. The channel attention mechanism comprises two operations, squeeze and excitation: global features are obtained through global average pooling (GAP), the relationships among channel features are modeled through two bottleneck structures, and the weight of each channel is obtained through a sigmoid function. Denote the output of the first convolution group as O1, the output of the second convolution group as O2, the output of the channel attention mechanism as O3, and the input of the structure as I; the features change within the whole structure as follows:
1. The input I passes through the first convolution group to obtain the output O1;
2. O1 is input into the second convolution group to obtain the output O2;
3. O1 and O2 are concatenated along the channel dimension, and the output O3 is obtained through the channel attention mechanism;
4. The input feature map I is added to the output O3 to obtain the final output.
After each layer of the encoder completes the above operations, the feature map size is reduced through global average pooling and used as the input of the next encoder layer; this process is repeated throughout the encoding stage.
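The squeeze-and-excitation step used by each encoder layer can be illustrated numerically; the bottleneck weights `w1`/`w2` and their shapes are assumptions standing in for the two fully connected bottleneck layers:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation over a (C, H, W) feature map.

    Squeeze: global average pooling reduces each channel to one scalar.
    Excite:  a two-layer bottleneck (w1: C -> C//r, w2: C//r -> C) models
             channel relationships; a sigmoid yields per-channel weights.
    """
    squeezed = feat.mean(axis=(1, 2))            # GAP: (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)      # bottleneck + ReLU
    weights = sigmoid(w2 @ hidden)               # per-channel weights in (0, 1)
    return feat * weights[:, None, None]         # reweight each channel
```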
The purpose of the skip connection is to extract the effective part of the features in the encoder. Since the feature maps in the encoder undergo far fewer linear and nonlinear operations than those in the decoder, they retain more of the original features, such as the spatial position information of the features and the relationships among feature channels. A spatial attention mechanism and a channel attention mechanism are therefore used in the skip connection to extract spatial and channel information that assists the feature restoration operation of the decoder.
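The spatial-attention half of the skip connection can be sketched as follows; the explicit convolution loop and the kernel shape are simplifying assumptions (a real implementation would use a framework's conv2d):

```python
import numpy as np

def spatial_attention(feat, kernel):
    """Spatial attention over a (C, H, W) feature map (CBAM-style sketch).

    Average- and max-pooling across channels give two (H, W) maps; a single
    convolution (kernel assumed 7x7, as in the text) reduces them to one map,
    and a sigmoid yields per-pixel weights.
    """
    avg = feat.mean(axis=0)
    mx = feat.max(axis=0)
    stacked = np.stack([avg, mx])                 # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(stacked, ((0, 0), (pad, pad), (pad, pad)))
    h, w = avg.shape
    att = np.zeros((h, w))
    for i in range(h):                            # naive "same" convolution
        for j in range(w):
            att[i, j] = (kernel * padded[:, i:i + k, j:j + k]).sum()
    weights = 1.0 / (1.0 + np.exp(-att))          # sigmoid -> (0, 1)
    return feat * weights[None]                   # reweight each position
```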
The main purpose of the decoder is to restore the downscaled feature maps from the encoder to the original size, which does not require an overly complex structure, so each layer of the decoder has only two convolutional layers, with convolution kernels of 1 × 1 and 3 × 3 respectively. The input of each decoder layer is a feature map formed by fusing the output of the previous layer with the content of the skip connection. The number of decoder layers corresponds to that of the encoder, and the feature map size is progressively restored through upsampling between layers until the original size is reached.
A schematic diagram of a liver tumor segmentation model is shown in fig. 4.
S6: training a segmentation model and performing liver tumor segmentation.
The segmentation model designed in S5 is trained step by step using the training set divided in S3 together with the loss function designed in S4, yielding a liver segmentation model and a tumor segmentation model, which together serve as a unified two-step model. The resolution of input images during training is 256 × 256 pixels; as shown in fig. 5, the number of training epochs is limited to between 60 and 150 according to the size of the data set. In each epoch, 10000 pictures are randomly selected from the training set and input with a batch size of 32, so one epoch finishes after 313 iterations. In each iteration, the loss against the correctly annotated data, the model's segmentation result, and the Dice coefficient are calculated, and the model parameters are continuously updated through back propagation until the model converges. The loss is calculated according to the loss function designed in S4,
the calculation formula (6) of the Dice coefficient is:
$Dice = \dfrac{2\,|A \cap B|}{|A| + |B|} = \dfrac{2\,TP}{2\,TP + FP + FN}$ (6)

wherein $Dice$ denotes the Dice similarity coefficient, $A$ is the ground truth (GT), $B$ is the predicted segmentation result, $TP$ denotes true positives, $FP$ false positives, and $FN$ false negatives.
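Formula (6) can be checked directly on binary masks; this small helper is illustrative, not the patented code:

```python
import numpy as np

def dice_coefficient(gt, pred):
    """Dice similarity coefficient between two binary masks, per formula (6).

    Equivalent to 2*TP / (2*TP + FP + FN).
    """
    gt = gt.astype(bool)
    pred = pred.astype(bool)
    tp = np.logical_and(gt, pred).sum()    # true positives
    fp = np.logical_and(~gt, pred).sum()   # false positives
    fn = np.logical_and(gt, ~pred).sum()   # false negatives
    return 2.0 * tp / (2.0 * tp + fp + fn)
```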
The trained liver segmentation model and tumor segmentation model are combined into a unified two-step segmentation model. The segmentation process is shown in fig. 6, and the specific steps are as follows:
1. Input the cross-sectional, coronal, and sagittal slices into the liver segmentation model to obtain three liver segmentation results;
2. Fuse the three liver segmentation results by pixel-wise addition with equal weights to obtain the target liver region (where liver-region pixels have value 1 and non-liver pixels have value 0);
3. Multiply the obtained target liver region with the original liver CT to obtain a CT containing only the liver region;
4. Input the liver-only CT into the tumor segmentation model to obtain the final tumor segmentation result.
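The four steps above can be sketched end to end; `liver_model` and `tumor_model` are placeholder callables standing in for the trained networks, and the 0.5 thresholds are illustrative assumptions:

```python
import numpy as np

def two_step_segment(axial, coronal, sagittal, liver_model, tumor_model, ct):
    """Two-step liver-then-tumor segmentation, following steps 1-4 above.

    liver_model / tumor_model are assumed callables returning per-pixel
    probability maps already resampled to a common grid.
    """
    # 1. per-plane liver predictions
    probs = [liver_model(s) for s in (axial, coronal, sagittal)]
    # 2. equal-weight pixel-wise fusion, thresholded to a binary liver mask
    liver_mask = (sum(probs) / 3.0 > 0.5).astype(ct.dtype)
    # 3. suppress everything outside the liver
    liver_only_ct = ct * liver_mask
    # 4. tumor segmentation restricted to the liver region
    return (tumor_model(liver_only_ct) > 0.5) & (liver_mask > 0)
```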
Relevant experiments were carried out using the correctly annotated data in the validation set divided in S3, verifying that the trained segmentation model has good segmentation performance. The results are shown in Table 3, where DSC is the mean of the per-volume Dice coefficients and DG is the global Dice coefficient obtained by treating all CTs as a single whole.
The system applies research results from the frontier of computer vision: a liver tumor segmentation model is designed to segment liver tumors in medical images; a medical image expansion model is designed to expand the liver tumor images and thereby support the training of a large segmentation model; and the loss function, designed for the characteristics of medical images, improves the stability of model training. Through this whole set of procedures, a unified framework for liver tumor image enhancement and segmentation is realized.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the description of the method part.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.
Claims (5)
1. A method for enhancing and segmenting liver tumor images based on deep learning is characterized in that: the method comprises the following steps:
s1: carrying out batch preprocessing on the CT data set; preprocessing abdominal CT data sets in a given data set in batch, and adjusting HU values of the abdominal CT data sets to obtain CT with higher liver tumor region discrimination;
s2: designing an extended model based on the CT data set generating the countermeasure network; designing a CT data set expansion model based on a generated countermeasure network based on the preprocessed CT data set to obtain a high-quality liver tumor CT data set;
s3: randomly dividing a CT data set without cross; converting the high-quality liver tumor CT (computed tomography) in the S2, namely the enhanced CT into 2D slices, randomly classifying, and dividing each enhanced CT data set into a training set and a verification set without intersection;
s4: designing a loss function for a liver tumor segmentation model; designing a loss function for a liver tumor segmentation model aiming at the proportion and distribution characteristics of liver tumors in an abdominal CT data set;
s5: designing a liver tumor segmentation model; designing a liver tumor segmentation model based on the loss function designed in the S4 and combining the characteristics of the medical image;
s6: training a liver tumor segmentation model and performing liver tumor segmentation; taking the training set obtained in S3 as the input of the liver tumor segmentation model designed in S5, training the liver tumor segmentation model, and performing liver tumor segmentation through the trained model; further, in S1, the CT data set is subjected to batch preprocessing, specifically: performing an HU value adjustment operation on an existing liver 3D CT data set, adjusting the HU values of the CT in the data set to the range [-200, 250], and dividing the 3D CT into 2D slices with 512 × 512 resolution;
in S5, a liver tumor segmentation model is designed; the liver segmentation model and the tumor segmentation model are obtained by jointly training CT slices in multiple directions, and high-resolution and low-resolution information are combined in the form of an encoder, a skip connection, and a decoder, specifically:
a depth residual block processes the feature map in each layer of the encoder, performing two steps of operations on the input: in the first step, the input passes through two convolution groups of different scales and their outputs are concatenated, wherein each convolution group comprises a convolution with kernel size 1 × 1 and a convolution with kernel size 3 × 3; in the second step, the channel attention value is calculated from the output of the previous step and multiplied with the concatenated output of the two convolution groups, and the resulting feature map is then added to the input feature map to give the final output, wherein the calculation of the channel attention value comprises two operations, squeeze and excitation: the feature map dimensions are compressed to one dimension through global average pooling, and a fully connected layer then predicts the importance of each channel to complete the excitation; finally, the output of each encoder layer is obtained using global average pooling;
the skip connection connects a channel attention mechanism and a spatial attention mechanism in parallel, wherein the channel attention mechanism follows the same calculation as the channel attention value in the encoder, and the spatial attention mechanism processes the input through global average pooling and global maximum pooling, performs dimension reduction using a convolution with kernel size 7 × 7, and generates the spatial attention value through a sigmoid function; a spatial attention value and a channel attention value are extracted from the output of each encoder layer through the two attention mechanisms and applied to the feature map of each decoder layer as that layer's input;
each layer of the decoder sequentially performs convolutions of different scales, batch normalization, and rectified linear unit operations on its input, obtains its output through upsampling, and repeatedly completes the feature restoration;
in the liver segmentation model, training is carried out using a training set mixing cross-sectional, coronal, and sagittal slices: the encoder extracts and learns liver features, the skip connection assigns feature weights to the feature maps, and the decoder performs feature restoration, so that after several rounds of training a multi-plane liver segmentation model adapted to cross-sectional, coronal, and sagittal slices of liver CT is obtained; in the tumor segmentation model, the segmentation result of the liver segmentation model is input into the tumor segmentation model, and the training set removes the background outside the liver using the correct liver annotations as reference, eliminating the influence of a complex background environment on the tumor-feature learning ability of the tumor segmentation model during training, so that after several rounds of training a multi-plane tumor segmentation model adapted to the tumor features is obtained.
2. The method of claim 1, characterized in that in S2 a CT expansion model based on a generative adversarial network is designed, the specific method being: the CT data set is preprocessed in S1, and the preprocessed CT data set is used as one part of the training set for training the expansion model based on the generative adversarial network in S2; training the expansion model further requires the abdominal magnetic resonance images provided in the combined CT-MR healthy abdominal organ segmentation data set as the other part of the training set, for generating the liver CT required by modality migration;
firstly, a cycle-consistent generative adversarial network is used to generate liver CT pathological region images; it comprises two generators and two discriminators, wherein the first generator G is used for generating the liver tumor pathological region Gg, the second generator F is used for generating the normal liver region Fb, the first discriminator DX is used for distinguishing real normal liver CT from fake, and the second discriminator DY is used for distinguishing real liver tumor pathological region images from fake; the first discriminator DX and the second discriminator DY give high scores to real CT and low scores to generated CT;
the designed objective function of the liver CT pathological region image generation network model comprises a dual adversarial loss function and a cycle-consistency loss function; for the first generator G and the second discriminator DY, the dual adversarial loss function is expressed by formula (1):
$L_{dual} = L_{GAN_1} + \lambda_1 L_{GAN_2}$ (1)

wherein $L_{dual}$ denotes the dual adversarial loss value, $g$ represents the image domain of real normal regions, $b$ represents the image domain of real pathological regions, $\lambda_1$ is a control parameter governing the relative importance of the degree of similarity and diversity between images, $L_{GAN_1}$ represents the objective function of the first generative adversarial network, and $L_{GAN_2}$ represents the objective function of the second generative adversarial network;
the objective function of the CT expansion model based on the generative adversarial network is the same as that of the original generative adversarial network; it is defined by formula (2) as:
$L = \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{\hat{x} \sim p_g(\hat{x})}\left[\log\left(1 - D(\hat{x})\right)\right]$ (2)

wherein $L$ denotes the objective function value of the CT expansion model based on the generative adversarial network, $D(\hat{x})$ represents the probability value computed for a CT generated by the generator, $p_g(\hat{x})$ represents the distribution of generated CT, $\hat{x}$ represents a CT generated by the generator, $p_{data}(x)$ represents the distribution of real CT, and $D(x)$ represents the value obtained after a real CT is input into the discriminator;
for the second generator F and the first discriminator DX, the dual adversarial loss function, formula (3), is obtained as:

$L_{GAN_2} = \mathbb{E}_{g \sim p(g)}\left[\log D_X(g)\right] + \mathbb{E}_{b \sim p(b)}\left[\log\left(1 - D_X(F(b))\right)\right]$ (3)
3. The method of claim 1, characterized in that in S3 the data set is randomly divided without intersection, the specific method being: the high-quality liver tumor CT is a 3D image; it is read along different directions and stored as 2D slices, which are divided into a training set and a validation set in a fixed proportion; the training-set slices are repeatedly cropped so that each slice has a resolution of 256 × 256 pixels; the validation-set slices keep their original size, with a cross-sectional slice resolution of 512 × 512 pixels, and coronal and sagittal slice resolutions of 512 × n pixels, where n is not fixed and varies with the Z-axis length of the 3D CT.
4. The method of claim 1, characterized in that in S4 a loss function for the liver tumor segmentation model is designed, the specific method being: the image characteristics in the abdominal CT data set are analyzed, the loss function for the liver tumor segmentation model is designed according to the analysis result, and the optimal values of the hyperparameters in the loss function are determined in combination with experiments; the loss function designed for the liver tumor segmentation model, formula (4), is as follows:
$L_{seg} = \sum_{C} w_C \left( L_{IoU}(C) \right)^{\gamma}$ (4)

wherein $L_{seg}$ denotes the loss function designed for the liver tumor segmentation model, $C$ is the category (liver tumor region or non-tumor region) to which a pixel belongs, $w_C$ is the weight coefficient of the category to which the pixel belongs, and $\gamma$ denotes the focusing coefficient; $L_{IoU}$ is the intersection-over-union loss function, whose expression, formula (5), is:

$L_{IoU}(C) = 1 - \dfrac{\sum_{i=1}^{N} p_{i,C}\, g_{i,C}}{\sum_{i=1}^{N} \left( p_{i,C} + g_{i,C} - p_{i,C}\, g_{i,C} \right)}$ (5)

wherein $N$ denotes the number of image pixels, $p_{i,C}$ is the probability with which the segmentation model predicts pixel $i$ as class $C$, and $g_{i,C}$ is the ground-truth indicator that pixel $i$ belongs to class $C$.
5. The method of claim 1, characterized in that in S6 the liver tumor segmentation model is trained and liver tumor segmentation is performed, the specific method being: in the training process of the liver tumor segmentation model, the loss based on the Dice similarity coefficient is calculated and the model is optimized through back propagation until convergence; after convergence, the segmentation performance of the model is checked using the validation set; the trained liver tumor segmentation model segments CT slices in multiple directions simultaneously and fuses the results, specifically:
respectively inputting the CT slices of the cross section, the coronal plane and the sagittal plane into a liver tumor segmentation model to obtain a segmentation probability map;
respectively multiplying the segmentation probability graphs of different planes by preset weights;
and adding the weighted segmentation probability maps pixel by pixel and converting the weighted segmentation probability maps into a final segmentation result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211417364.0A CN115578404B (en) | 2022-11-14 | 2022-11-14 | Liver tumor image enhancement and segmentation method based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115578404A CN115578404A (en) | 2023-01-06 |
CN115578404B true CN115578404B (en) | 2023-03-31 |
Family
ID=84588836
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211417364.0A Active CN115578404B (en) | 2022-11-14 | 2022-11-14 | Liver tumor image enhancement and segmentation method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115578404B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115954106B (en) * | 2023-03-15 | 2023-05-12 | 吉林华瑞基因科技有限公司 | Tumor model optimizing system based on computer-aided simulation |
CN116228786B (en) * | 2023-05-10 | 2023-08-08 | 青岛市中心医院 | Prostate MRI image enhancement segmentation method, device, electronic equipment and storage medium |
CN116977466B (en) * | 2023-07-21 | 2024-06-25 | 北京大学第三医院(北京大学第三临床医学院) | Training method for enhancing CT image generation model and storage medium |
CN116894842B (en) * | 2023-09-08 | 2023-12-22 | 南方医科大学南方医院 | Tumor abdominal cavity implantation transfer focus image processing method and related device |
CN117351294B (en) * | 2023-12-06 | 2024-02-20 | 武汉大学 | Image detection method and device based on dual-function discriminator |
CN117974508B (en) * | 2024-03-28 | 2024-06-07 | 南昌航空大学 | Iris image restoration method for irregular occlusion based on generation countermeasure network |
CN117974832B (en) * | 2024-04-01 | 2024-06-07 | 南昌航空大学 | Multi-modal liver medical image expansion algorithm based on generation countermeasure network |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109829918A (en) * | 2019-01-02 | 2019-05-31 | 安徽工程大学 | A kind of liver image dividing method based on dense feature pyramid network |
CN112184748A (en) * | 2020-09-30 | 2021-01-05 | 陕西科技大学 | Deformable context coding network model and segmentation method for liver and liver tumor |
CN114494199A (en) * | 2022-01-27 | 2022-05-13 | 杭州职业技术学院 | Liver CT tumor segmentation and classification method based on deep learning |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110675406A (en) * | 2019-09-16 | 2020-01-10 | 南京信息工程大学 | CT image kidney segmentation algorithm based on residual double-attention depth network |
US11701066B2 (en) * | 2020-01-17 | 2023-07-18 | Ping An Technology (Shenzhen) Co., Ltd. | Device and method for detecting clinically important objects in medical images with distance-based decision stratification |
CN113435481A (en) * | 2021-06-09 | 2021-09-24 | 西安智诊智能科技有限公司 | Liver tumor image augmentation method based on generation countermeasure network |
CN113837366A (en) * | 2021-09-23 | 2021-12-24 | 中国计量大学 | Multi-style font generation method |
CN114548265B (en) * | 2022-02-21 | 2024-06-21 | 安徽农业大学 | Crop leaf disease image generation model training method, crop leaf disease identification method, electronic equipment and storage medium |
CN114972248A (en) * | 2022-05-24 | 2022-08-30 | 广州市华奕电子科技有限公司 | Attention mechanism-based improved U-net liver tumor segmentation method |
CN115240049A (en) * | 2022-06-08 | 2022-10-25 | 江苏师范大学 | Deep learning model based on attention mechanism |
2022-11-14: CN application CN202211417364.0A filed (patent CN115578404B, status: active)
Also Published As
Publication number | Publication date |
---|---|
CN115578404A (en) | 2023-01-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115578404B (en) | Liver tumor image enhancement and segmentation method based on deep learning | |
CN113077471B (en) | Medical image segmentation method based on U-shaped network | |
CN111402268B (en) | Liver in medical image and focus segmentation method thereof | |
US20210365717A1 (en) | Method and apparatus for segmenting a medical image, and storage medium | |
CN111798462B (en) | Automatic delineation method of nasopharyngeal carcinoma radiotherapy target area based on CT image | |
CN110930416B (en) | MRI image prostate segmentation method based on U-shaped network | |
CN110889852B (en) | Liver segmentation method based on residual error-attention deep neural network | |
CN109754361A (en) | The anisotropic hybrid network of 3D: the convolution feature from 2D image is transmitted to 3D anisotropy volume | |
CN111784671A (en) | Pathological image focus region detection method based on multi-scale deep learning | |
CN111932529B (en) | Image classification and segmentation method, device and system | |
CN111369574B (en) | Thoracic organ segmentation method and device | |
CN112529909A (en) | Tumor image brain region segmentation method and system based on image completion | |
CN111179237A (en) | Image segmentation method and device for liver and liver tumor | |
CN111179366A (en) | Low-dose image reconstruction method and system based on anatomical difference prior | |
CN111784704B (en) | MRI hip joint inflammation segmentation and classification automatic quantitative classification sequential method | |
CN114332572B (en) | Method for extracting breast lesion ultrasonic image multi-scale fusion characteristic parameters based on saliency map-guided hierarchical dense characteristic fusion network | |
CN114742802B (en) | Pancreas CT image segmentation method based on 3D transform mixed convolution neural network | |
CN112465754B (en) | 3D medical image segmentation method and device based on layered perception fusion and storage medium | |
CN116188452A (en) | Medical image interlayer interpolation and three-dimensional reconstruction method | |
CN114119515A (en) | Brain tumor detection method based on attention mechanism and MRI multi-mode fusion | |
CN116416428A (en) | Bone mass identification method and system based on three-dimensional convolutional neural network and CT image | |
CN116883341A (en) | Liver tumor CT image automatic segmentation method based on deep learning | |
CN117710760B (en) | Method for detecting chest X-ray focus by using residual noted neural network | |
Ferreira et al. | GAN-based generation of realistic 3D volumetric data: A systematic review and taxonomy | |
CN113888520A (en) | System and method for generating a bullseye chart |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||