CN115132275A - Method for predicting EGFR gene mutation state based on end-to-end three-dimensional convolutional neural network

Info

Publication number: CN115132275A
Application number: CN202210583718.2A
Authority: CN
Inventors: 赵世杰; 刘卓岩; 韩军伟
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2022-05-25
Filing date: 2022-05-25
Publication date: 2022-09-30
Anticipated expiration: 2042-05-25
Also published as: CN115132275B

Abstract

The invention relates to a method for predicting EGFR gene mutation status based on an end-to-end three-dimensional convolutional neural network. The method performs end-to-end global feature extraction on the lungs with a densely connected three-dimensional convolutional structure and, at the same time, captures local lung nodule features with a proposed multi-scale dilated asymmetric module. At the local level, the multi-scale dilated asymmetric module handles lung nodule features of different sizes, directions and angles, searches every detail of the input three-dimensional image, and densely connects these details from small to large. At the global level, a dense network combines the fine features of the target from different stages and scales to extract comprehensive features of the lungs; deeper lung information is extracted through feature transformation and channel fusion; finally, the prediction result of the model is obtained through a fully connected layer and an activation function layer.

Description

Method for predicting EGFR gene mutation state based on end-to-end three-dimensional convolutional neural network

Technical Field

The invention belongs to the field of computer vision and relates to a method for predicting EGFR gene mutation status based on an end-to-end three-dimensional convolutional neural network, in particular to a method that uses a neural network architecture to extract features from CT images and predict gene mutation status.

Background

Currently, the common methods for diagnosing EGFR mutation status are tissue biopsy and liquid biopsy, but both have limitations: because tumors are heterogeneous, a biopsy may suffer from sampling bias; the patient must be fit to undergo an invasive procedure; and a biopsy may increase the risk of cancer metastasis. In addition, tissue biopsy may fail because of low tissue quality and is relatively costly, while liquid biopsy tests peripheral blood rather than tumor tissue but may be hampered by a low or undetectable concentration of ctDNA.

With the development of deep learning, some neural network-based models have appeared, but most of them rely on accurate annotation of tumor boundaries by experienced doctors or radiologists, which is time-consuming and laborious and may introduce subjective errors into the annotations. Although some current methods relax the annotation requirements, a radiologist is still needed to coarsely localize the lung nodules. Most importantly, the extracted features come only from the interior of the nodule and the annotated tumor edges, while other important information, including the relative location of the tumor, the size of the tumor, and the interactions between different lung regions, is ignored.

The existing methods consider only information from the interior and the edge of the nodule, depend on expert annotations, and cannot exploit all the useful information in the lungs.

Disclosure of Invention

Technical problem to be solved

To overcome the shortcomings of the prior art, the invention provides a method for predicting EGFR gene mutation status based on an end-to-end three-dimensional convolutional neural network. The method combines a multi-scale dilated convolution module with an asymmetric three-dimensional convolution module to extract features from the complete lung CT of a lung adenocarcinoma patient, transforms and recombines these features in a high-dimensional space, and predicts the patient's EGFR gene mutation status from complete and effective lung information.

Technical scheme

A method for predicting EGFR gene mutation state based on an end-to-end three-dimensional convolutional neural network is characterized by comprising the following steps:

step 1: preprocessing the original three-dimensional lung CT image: clearing the content outside the lung region of the CT image and scaling the CT to a uniform size of 112 × 112 × 90 by spline interpolation;

step 1a: segmenting the lungs with a U-Net model and setting the region outside the lungs to zero;

step 1b: obtaining the preprocessing result by spline interpolation, namely an image of size 112 × 112 × 90 centered in the field of view, ensuring that every slice along the X, Y and Z axes intersects the lungs and that the detailed features of the lungs are preserved;

step 2: constructing the neural network structure: adopting a densely connected multilayer convolutional neural network to extract the overall features of the target and fuse multi-scale features;

step 2a: based on the Dense Block, building a four-layer densely connected convolutional neural network in which the output of each layer is concatenated along the channel dimension and used as the input of the next convolutional layer;

step 2b: introducing, in the connections between layers, a bottleneck module for inter-channel feature fusion and dimensionality reduction; specifically, batch normalization normalizes the feature distribution to mean 0 and variance 1, and a rectified linear unit followed by a 1 × 1 × 1 convolution then performs the dimensionality reduction;

step 2c: embedding a multi-scale, multi-dilation asymmetric dilated convolution module in the baseline model to capture fine local features and attend to lung nodules that appear in the lungs of lung cancer patients in different directions, sizes and angles;

step 3: training the neural network structure end to end: training with a cross-entropy loss function and using stochastic gradient descent (SGD) with a momentum of 0.9 as the optimizer of the model;

during training, the batch size is set to 6, the number of training iterations is set to 300, and all learning rates are set to 0.01; the EGFR mutant type is defined as 1 and the EGFR wild type as 0, so the closer the model output is to 1, the more likely the sample is judged to be EGFR mutant; the L2 weight decay coefficient is set to 0.0004 to prevent overfitting;

step 4: inputting three-dimensional lung CT image data into the trained end-to-end three-dimensional convolutional neural network, where the input is a CT image of the whole lungs and the output is the prediction result, namely the predicted EGFR gene mutation status.
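The optimizer settings in step 3 (learning rate 0.01, momentum 0.9, L2 weight decay 0.0004) can be illustrated with a single update on two scalar weights. This is a minimal sketch, assuming the common formulation in which the L2 penalty adds `weight_decay * w` to the gradient before the momentum update; the source does not state which variant its implementation uses, and the function name `sgd_momentum_step` is hypothetical.

```python
def sgd_momentum_step(w, grad, velocity, lr=0.01, momentum=0.9, weight_decay=0.0004):
    """One SGD update for a list of scalar weights.

    Assumed formulation (not specified in the source):
      g = grad + weight_decay * w      # L2 penalty folded into the gradient
      v = momentum * v - lr * g
      w = w + v
    """
    new_w, new_v = [], []
    for wi, gi, vi in zip(w, grad, velocity):
        g = gi + weight_decay * wi
        v = momentum * vi - lr * g
        new_v.append(v)
        new_w.append(wi + v)
    return new_w, new_v


# Illustrative values only; they are not taken from the patent.
weights = [1.0, -2.0]
velocity = [0.0, 0.0]
grads = [0.5, 0.0]
weights, velocity = sgd_momentum_step(weights, grads, velocity)
```

Note that the second weight moves slightly toward zero even though its loss gradient is zero, which is the shrinking effect that weight decay is meant to provide.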

Advantageous effects

The invention provides a method for predicting EGFR gene mutation status based on an end-to-end three-dimensional convolutional neural network. The method performs end-to-end global feature extraction on the lungs with a densely connected three-dimensional convolutional structure and, at the same time, captures local lung nodule features with a proposed multi-scale dilated asymmetric module. At the local level, the multi-scale dilated asymmetric module handles lung nodule features of different sizes, directions and angles, searches every detail of the input three-dimensional image, and densely connects these details from small to large. At the global level, a dense network combines the fine features of the target from different stages and scales to extract comprehensive features of the lungs; deeper lung information is extracted through feature transformation and channel fusion; finally, the prediction result of the model is obtained through a fully connected layer and an activation function layer.

In particular, the proposed method makes the following contributions. First, it is the first method in the field of EGFR mutation status prediction that requires no pre-annotation step, which greatly reduces the burden on doctors and avoids introducing manual errors; the model learns features directly from a complete three-dimensional CT image and predicts the EGFR mutation status of a lung adenocarcinoma patient end to end. Second, the invention is the first to use both intact lungs in three-dimensional space as input for predicting EGFR gene mutation status, and experiments demonstrate that EGFR mutation status is reflected not only in lung nodules but also in the patient's entire lungs. Third, the proposed model consists of densely connected modules built from three-dimensional asymmetric convolution and three-dimensional multi-dilation dense convolution. The asymmetric convolution enables the network to capture lung nodule information in different directions, and the three-dimensional multi-dilation blocks enlarge the receptive field of the model without losing resolution, so that multi-scale context information of the CT is captured and the prediction performance of the model is further improved.

The invention realizes end-to-end prediction of the EGFR gene mutation status of a lung adenocarcinoma patient from a three-dimensional CT image using a neural network, and has the following advantages in the field of EGFR mutation status prediction:

The invention provides a deep learning model, namely a three-dimensional densely connected asymmetric convolution and multi-dilation dense network, for noninvasive prediction of the EGFR mutation status of lung adenocarcinoma patients. It also proposes a new perspective for studying EGFR mutation status: information about EGFR status is present in both intact lungs, not just in the lung nodules. Because a deep learning model requires input data of consistent dimensions, the invention also provides a method for handling CT images of inconsistent dimensions.

First, compared with traditional EGFR mutation status prediction models, the proposed method does not require a professional radiologist to annotate nodule edges, or even to perform any coarse localization, and therefore applies to a wider range of scenarios. The proposed method also combines a global feature processor for the target with a local feature extractor for lung nodules and embeds both in a three-dimensional prediction network, so that the large-range search problem is reduced to a small-range feature extraction problem; this addresses the difficulty of accurately capturing local features when predicting on a three-dimensional target. The proposed three-dimensional model can be trained end to end. Compared with traditional two-dimensional slice prediction methods, the proposed method operates on the three-dimensional image and therefore does not lose information along the depth direction.

Drawings

FIG. 1 is a flow chart of the method of the present invention. The method aims at a noninvasive, rapid and accurate three-dimensional prediction model: a complete three-dimensional CT image of the target is input, and the proposed end-to-end convolutional neural network directly extracts features and predicts the EGFR gene mutation status.

FIG. 2 compares the method of the present invention with conventional methods. The proposed 3DDADD Net framework is an end-to-end prediction network without any branches that predicts the EGFR mutation status of lung adenocarcinoma patients directly from complete lung features. The method takes a three-dimensional CT image as input and abandons the two-dimensional lung nodule slice sequences or two-dimensional CT sequences used as input by conventional methods, which avoids time-consuming and laborious manual annotation and the loss of contextual information about lung features. On top of the densely connected baseline network, a local feature extractor is provided that captures nodules in different directions and in a multi-scale context, so that detailed features are handled well.

FIG. 3 is a diagram of the model framework of the present invention. The input to the model is the complete CT scan of a lung adenocarcinoma patient, and the input to each convolutional layer is the concatenation of the outputs of the preceding layers. Dense connectivity lets each layer reuse all the features learned by the layers before it, without learning them again. Each layer consists of a multi-dilation asymmetric convolution module, which improves the expressive power of standard convolution; the multi-dilation convolution sets different dilation factors within a single layer to model different resolutions and avoids the aliasing problem of dense connections. The three-dimensional asymmetric convolution widens the feature extraction path of the model and improves its robustness to certain transformations of lung nodules, such as flipping and rotation.

Detailed Description

The invention will now be further described with reference to the following examples and drawings:

The method for obtaining the EGFR gene mutation status of a lung adenocarcinoma patient from a three-dimensional lung CT image comprises the following steps:

Step 1: The original CT image is preprocessed. To make the model focus on the patient's lungs, the region outside the lungs is cleared and the CT is scaled to a uniform size of 112 × 112 × 90 by spline interpolation.

Step 1a: The CT is input into a U-Net model, which automatically segments the lungs, and the region outside the lungs is set to zero. This makes the model attend to the lungs during computation and reduces interference from outside the lung region.

Step 1b: Spline interpolation is used to obtain the final preprocessing result, namely an image of size 112 × 112 × 90 centered in the field of view. Every slice along the X, Y and Z axes intersects the lungs, so the detailed features of the lungs are preserved as far as possible and the available laboratory hardware is used as fully as possible.
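The masking in step 1a can be sketched on a toy volume. This is a minimal illustration, assuming the lung segmentation is already available as a binary mask with the same shape as the CT volume; the function name `apply_lung_mask` and the intensity values are hypothetical, and nested lists stand in for a real array type.

```python
def apply_lung_mask(volume, mask):
    """Zero every voxel whose mask value is 0 (outside the lungs).

    `volume` and `mask` are nested lists indexed [z][y][x]; the mask is
    assumed to come from a separate lung segmentation model.
    """
    return [
        [
            [v if m else 0 for v, m in zip(row_v, row_m)]
            for row_v, row_m in zip(slice_v, slice_m)
        ]
        for slice_v, slice_m in zip(volume, mask)
    ]


# Toy 1 x 2 x 3 volume with made-up intensities and a binary lung mask.
volume = [[[-900, -850, 40], [30, -870, 55]]]
mask = [[[1, 1, 0], [0, 1, 0]]]
masked = apply_lung_mask(volume, mask)
```

Only the voxels inside the mask keep their intensity, so later layers see no signal from the chest wall or the scanner table.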

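Step 2: Construct the neural network structure. A densely connected multilayer convolutional neural network is adopted to extract the overall features of the target and fuse multi-scale features.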

Step 2a: In general, the overall three-dimensional features reflect the global spatial structure of the target. Based on the Dense Block, a four-layer densely connected convolutional neural network is built in which the output of each layer is concatenated along the channel dimension and used as the input of the next convolutional layer. This strengthens feature propagation, makes better use of the features, and effectively alleviates the vanishing gradient problem.
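The effect of channel-wise concatenation on layer widths can be worked through with simple arithmetic. The sketch below assumes DenseNet-style connectivity, where each layer emits a fixed number of new channels (the growth rate) and sees everything produced before it; the source gives neither the growth rate nor the input width, so `in_channels=16` and `growth_rate=8` are illustrative values and `dense_block_channels` is a hypothetical helper.

```python
def dense_block_channels(in_channels, growth_rate, num_layers=4):
    """Return the input channel count seen by each layer of a dense block.

    Each layer emits `growth_rate` channels, and its output is concatenated
    with everything before it along the channel axis.
    """
    inputs = []
    channels = in_channels
    for _ in range(num_layers):
        inputs.append(channels)
        channels += growth_rate
    return inputs, channels


per_layer_inputs, block_output = dense_block_channels(in_channels=16, growth_rate=8)
```

With these values the four layers see 16, 24, 32 and 40 input channels and the block emits 48, which is why a dimensionality-reducing step between layers becomes useful.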

Step 2b: In the connections between layers, a bottleneck module for inter-channel feature fusion and dimensionality reduction is introduced, which effectively reduces the computational cost of the model. Specifically, batch normalization normalizes the feature distribution to mean 0 and variance 1, and a rectified linear unit followed by a 1 × 1 × 1 convolution then performs the dimensionality reduction.
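The three operations in the bottleneck can be traced by hand on a few numbers. This is a simplified sketch, assuming batch normalization without its learnable scale and shift and a pointwise (1 × 1 × 1) convolution evaluated at a single voxel, where it reduces to a linear map across channels; the helper names and all weights are hypothetical.

```python
import math


def batch_norm(values, eps=1e-5):
    """Normalize a list of activations to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def relu(values):
    """Rectified linear unit: negative activations become zero."""
    return [max(0.0, v) for v in values]


def pointwise_conv(channel_vector, weights):
    """1 x 1 x 1 convolution at one voxel: a linear map across channels.

    `weights` has one row per output channel, so fewer rows than input
    channels means dimensionality reduction.
    """
    return [sum(w * c for w, c in zip(row, channel_vector)) for row in weights]


# One channel observed over a batch of four voxels.
normalized = batch_norm([2.0, 4.0, 6.0, 8.0])
activated = relu(normalized)

# Four input channels at one voxel reduced to two output channels.
reduced = pointwise_conv(
    [1.0, 2.0, 3.0, 4.0],
    [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]],
)
```

Because the pointwise convolution has no spatial extent, its cost grows only with the product of input and output channel counts, which is what makes it a cheap way to shrink the concatenated features.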

Step 2c: A multi-scale, multi-dilation asymmetric dilated convolution module is constructed and embedded in the baseline model to capture fine local features and attend to lung nodules that appear in the lungs of lung cancer patients in different directions, sizes and angles. Densely connecting three-dimensional dilated convolutions with dilation factors of 1, 2 and 4 and combining them with the three-dimensional asymmetric convolution module improves the extraction of three-dimensional features at different scales, so that the lungs are searched exhaustively and nodules are located accurately. The module appears in every layer of the baseline network, which improves the network's handling of small features and the prediction performance of the model.
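How dilation factors of 1, 2 and 4 enlarge the receptive field without pooling can be checked with the standard formulas. The sketch below assumes a kernel size of 3 along each axis and stride 1, and it treats the three convolutions as a plain sequence, which is a simplification of the dense connections described above; both function names are hypothetical.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span along one axis covered by a dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def stacked_receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stride-1 dilated convolutions in sequence."""
    field = 1
    for dilation in dilations:
        field += dilation * (kernel_size - 1)
    return field


spans = [effective_kernel_size(3, d) for d in (1, 2, 4)]
field = stacked_receptive_field(3, (1, 2, 4))
```

The individual kernels span 3, 5 and 9 voxels, and the sequence reaches 15 voxels per axis while every layer still has only 3 weights per axis, so the feature map resolution is unchanged.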

Step 3: The data set comes from the Affiliated Hospital of Zunyi Medical University and contains 173 lung adenocarcinoma patient samples, of which 119 are EGFR mutant and 54 are EGFR wild type. Experiments are carried out with five-fold cross-validation in order to obtain as much useful information as possible from the limited data, reduce overfitting, and better evaluate the prediction performance and generalization ability of the model. During training, the batch size is set to 6, the number of training iterations is set to 300, and all learning rates are set to 0.01. The proposed method defines the EGFR mutant type as 1 and the EGFR wild type as 0, so the closer the model output is to 1, the more likely the sample is judged to be EGFR mutant. The neural network is trained with a cross-entropy loss function, using stochastic gradient descent (SGD) with a momentum of 0.9 as the optimizer of the model. The L2 weight decay coefficient is set to 0.0004 to prevent overfitting.
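A five-fold split of the 173 samples can be sketched as follows. The source does not say whether its folds are stratified by class; the sketch assumes they are, since the classes are imbalanced (119 versus 54), and the function name `stratified_folds` and the seed are hypothetical.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin over the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# 119 EGFR-mutant (1) and 54 wild-type (0) samples, as in the data set.
labels = [1] * 119 + [0] * 54
folds = stratified_folds(labels)
fold_sizes = [len(f) for f in folds]
```

In each round one fold serves as the test set and the other four as the training set, so every sample is tested exactly once.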

Step 4: Train the model according to the data set and the experimental parameters. Note that the invention is trained end to end throughout: the input is a CT image of the whole lungs, and the output is the prediction result.

The specific embodiment is as follows:

The present invention is further described below with reference to the drawings and examples; the invention includes, but is not limited to, the following examples.

The computer hardware environment for implementation is an Intel Xeon E5-2600 [email protected] 8-core CPU with 64 GB of memory and a GeForce GTX TITAN X GPU. The software environment is a Linux 16.04 64-bit operating system. The proposed method is implemented with Python 3.6.7 and TensorFlow 2.1.0.

Step 1: a data set is constructed.

Step 1a: Construct the prediction image data set. With the approval of the ethics committee of Guizhou Medical University, the proposed method retrospectively analyzed data from patients with lung adenocarcinoma diagnosed at the Affiliated Hospital of Zunyi Medical University (Guizhou, China) from October to November 2018.

The inclusion criteria followed when collecting patient data are: (1) the patient was confirmed by CT and histological examination to have a primary lung adenocarcinoma; (2) the patient has an EGFR gene mutation test result; (3) the tumor pathological specimen was obtained within 1 month; (4) the CT image data are complete. The exclusion criteria are: (1) the patient received anti-tumor treatment (radiotherapy, chemotherapy or both) before surgery; (2) the time interval between surgery and CT imaging is more than 1 month; (3) the tumor boundary is difficult to identify on the CT images (e.g., tumors located in the hilum of the lung or lesions combined with atelectasis); (4) the lung tumor diameter is less than 1 cm or the CT images contain artifacts; (5) poor CT image quality affects segmentation and feature extraction.

After screening, 173 patients met the requirements: 75 males and 98 females, aged 31 to 79 years (mean 58 ± 10 years). Of these patients, 57 were smokers and 116 were non-smokers. The clinical stage distribution was 80 cases in stage I, 9 in stage II, 21 in stage III and 63 in stage IV. To allow comparison with previous deep learning methods that predict from ROI images, nodule annotations were also collected on the CT images of each patient by two radiologists with 5 and 10 years of diagnostic experience, respectively. Disagreements during annotation were resolved through discussion between the two doctors, and 50 cases were randomly selected for a second round of annotation to evaluate inter-observer consistency.

Step 1b: In the preprocessing stage of the CT images, in order to use the limited device memory effectively and make the model focus on the lungs, the proposed method removes the blank regions along the three axes, keeps only the part of the volume that contains the lungs, and then resizes it to 112 × 112 × 90 by spline interpolation.
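The cropping and the resize factors in this step can be sketched without any imaging library. The example assumes the volume is indexed [z][y][x] with 90 slices along z, and the original scan shape (300, 280, 320) is made up for illustration; `lung_bounding_box` and `zoom_factors` are hypothetical helpers. The resampling itself would be done by a spline interpolation routine such as `scipy.ndimage.zoom`, which is not called here.

```python
def lung_bounding_box(mask):
    """Return (start, stop) index pairs per axis enclosing all nonzero voxels.

    `mask` is a nested list indexed [z][y][x]; it must contain at least
    one nonzero voxel.
    """
    coords = [
        (z, y, x)
        for z, plane in enumerate(mask)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value
    ]
    return [(min(c[a] for c in coords), max(c[a] for c in coords) + 1) for a in range(3)]


def zoom_factors(shape, target=(90, 112, 112)):
    """Per-axis scale factors that map `shape` onto the target size."""
    return tuple(t / s for t, s in zip(target, shape))


# Toy 3 x 3 x 4 mask with a small "lung" region away from the borders.
mask = [
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]],
]
box = lung_bounding_box(mask)
cropped_shape = tuple(stop - start for start, stop in box)
factors = zoom_factors((300, 280, 320))
```

Cropping before resizing means the fixed 112 × 112 × 90 grid is spent on lung tissue rather than on empty margins.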

Step 2: Construct the network structure.

Step 2a: Construct a three-dimensional densely connected baseline network. The network uses a DenseNet to map an input three-dimensional CT image to a vector representing the predicted EGFR mutation status. The first convolutional layer consists of a convolution kernel of size 7 × 7 × 7 with a dilation factor of 7 and a stride of 1, followed by an ELU activation function, and converts the 90 × 112 × 112 input feature map to a size of 80 × 112 × 112; from the second convolutional layer onward, densely connected multi-scale multi-dilation asymmetric convolution modules are used, with the output of each module serving as the input of the next layer; finally, the output passes through an average pooling layer followed by a fully connected layer to give the gene prediction result.

Step 2b: Construct the multi-dilation asymmetric convolution module for capturing local features. The feature map first passes through a three-dimensional asymmetric convolution layer whose convolution kernels take four forms: 3 × 3 × 3, 1 × 1 × 3, 3 × 1 × 1 and 1 × 3 × 1. The four kernels are applied to the feature map in parallel, and the sum of the four results is output. The output is then normalized by a BatchNorm3d layer and passed through an ELU layer, which evens out its distribution and suppresses overfitting. Next comes the multi-dilation convolution layer, that is, three densely connected convolution layers with dilation factors of 1, 2 and 4, each with a 3 × 3 × 3 kernel, again followed by a BatchNorm3d layer and an ELU layer. Finally, the result enters a transition layer that halves the number of channels of the output feature map, and a max pooling layer reduces the size of the feature map.
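What the sum of the four asymmetric branches computes can be checked numerically at a single voxel. Because convolution is linear, adding the four branch outputs equals applying one 3 × 3 × 3 kernel in which the three line-shaped kernels have been added along its central axes. The sketch assumes centred kernels with aligned outputs and a single input and output channel; the source does not say that the branches are ever merged in practice, and all names and random weights are hypothetical.

```python
import itertools
import random


def conv_at(volume, kernel, z, y, x):
    """Correlate `kernel` (dict of offset -> weight) with `volume` at one voxel."""
    return sum(w * volume[z + dz][y + dy][x + dx] for (dz, dy, dx), w in kernel.items())


def make_kernel(shape, rng):
    """Random centred kernel of odd `shape`, keyed by (dz, dy, dx) offsets."""
    ranges = [range(-(s // 2), s // 2 + 1) for s in shape]
    return {offset: rng.uniform(-1, 1) for offset in itertools.product(*ranges)}


def fuse(kernels):
    """Add kernels offset by offset; missing offsets count as zero."""
    fused = {}
    for kernel in kernels:
        for offset, w in kernel.items():
            fused[offset] = fused.get(offset, 0.0) + w
    return fused


rng = random.Random(0)
branches = [make_kernel(s, rng) for s in [(3, 3, 3), (1, 1, 3), (3, 1, 1), (1, 3, 1)]]
volume = [[[rng.random() for _ in range(3)] for _ in range(3)] for _ in range(3)]

# Evaluate at the centre voxel, where every kernel fits inside the volume.
branch_sum = sum(conv_at(volume, k, 1, 1, 1) for k in branches)
fused_out = conv_at(volume, fuse(branches), 1, 1, 1)
weights_per_channel_pair = sum(len(k) for k in branches)
```

The four branches hold 27 + 3 + 3 + 3 = 36 weights per channel pair, and the extra nine weights all reinforce the three central axes of the cube, which is one way to read the claim that the module emphasizes structures in different directions.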

Step 3: Train the three-dimensional end-to-end EGFR gene mutation status prediction model. The network uses the lung adenocarcinoma CT data set provided by Zunyi Medical University, as described in step 1. The network parameters are optimized with the SGD optimizer; the initial learning rate is set to 0.01 and is reduced to 10% of its value every 150 training epochs.
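The learning rate schedule in this step can be written as a short function. This sketch assumes epochs are counted from 0 and that each decay multiplies the current rate by 0.1; with 300 epochs in total only one decay occurs, so the result is the same whether the 10% is taken relative to the current or the initial rate. The function name `learning_rate` is hypothetical.

```python
def learning_rate(epoch, initial_lr=0.01, decay=0.1, step=150):
    """Step decay: multiply the rate by `decay` after every `step` epochs."""
    return initial_lr * decay ** (epoch // step)


schedule = {epoch: learning_rate(epoch) for epoch in (0, 149, 150, 299)}
```

Epochs 0 to 149 therefore train at 0.01 and epochs 150 to 299 at 0.001.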

Step 3a: Given a three-dimensional input CT image, the first feature extraction layer encodes the input image into a feature map of size 80 × 112 × 112.

Step 3b: The feature map obtained in step 3a is passed through the rest of the model to obtain a feature vector of size 1 × 2, which is the prediction result of the model.

Step 3c: Using cross-entropy as the loss function, the tissue biopsy result of the lung adenocarcinoma patient given by the database and the EGFR gene mutation status predicted by the model are input into the loss function. The neural network is trained with the backpropagation algorithm.
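The cross-entropy loss on a 1 × 2 output can be computed by hand. The sketch assumes the two outputs are raw scores that are turned into probabilities by a softmax, and that index 0 stands for wild type and index 1 for mutant, matching the labels 0 and 1; the source does not state the index order or the activation explicitly, so both are assumptions, and the scores are made up.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities; subtract the max for stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[target_index])


# Assumed index convention: 0 = wild type, 1 = mutant.
loss_confident = cross_entropy([0.0, 3.0], target_index=1)
loss_uncertain = cross_entropy([0.0, 0.0], target_index=1)
```

An undecided output costs ln 2 (about 0.693), and the loss falls toward zero as the score of the biopsy-confirmed class pulls ahead, which is the signal that backpropagation pushes through the network.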

Step 3d: Decide whether to stop training. If training continues, return to step 3a; if training stops, proceed to step 4.

Step 4: The trained three-dimensional prediction network is used as a predictor to predict the EGFR gene mutation status of the targets in the test set.

Step 4a: A three-dimensional CT image is input into the EGFR gene mutation status prediction network trained in step 3, which outputs a feature vector of size 1 × 2; this is the prediction result of the model, and its two entries represent the probabilities of the mutant type and the wild type, respectively.

Step 5: Training and testing were performed five times using the 5-fold cross-validation method described in "A Summary of Cross-Validation Methods in Model Selection", published in 2013 by Fan Yongdong et al. On the Zunyi Medical University data set of 173 lung adenocarcinoma CT cases, the proposed method achieved an AUC of 0.79 and an ACC of 0.77.
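The two reported metrics can be computed from predicted scores as follows. This is a generic sketch of accuracy and ROC AUC, not the evaluation code behind the figures above; the 0.5 threshold is an assumption, and the labels and scores are toy values invented for illustration.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def auc(labels, scores):
    """Probability that a random positive scores above a random negative.

    Ties count as half; this is the Mann-Whitney form of the ROC AUC.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy scores (made up for illustration, not results from the patent).
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.2]
toy_acc = accuracy(labels, scores)
toy_auc = auc(labels, scores)
```

Accuracy depends on the chosen threshold, whereas AUC depends only on how the scores rank mutant cases against wild-type cases, which makes it the more informative of the two on an imbalanced data set.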

Claims

1. A method for predicting EGFR gene mutation state based on an end-to-end three-dimensional convolutional neural network is characterized by comprising the following steps:

step 2: constructing the neural network structure: adopting a densely connected multilayer convolutional neural network to extract the overall features of the target and fuse multi-scale features;

step 2a: based on the Dense Block, building a four-layer densely connected convolutional neural network in which the output of each layer is concatenated along the channel dimension and used as the input of the next convolutional layer;

step 3: training the neural network structure end to end on the three-dimensional lung CT image data set, using a cross-entropy loss function and stochastic gradient descent (SGD) with a momentum of 0.9 as the optimizer of the model;