CN115294151A - Automatic detection method for lung CT regions of interest based on a multi-task convolution model - Google Patents

Automatic detection method for lung CT regions of interest based on a multi-task convolution model

Info

Publication number
CN115294151A
CN115294151A
Authority
CN
China
Prior art keywords
roi
dimensional
decoder
automatic detection
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210773167.6A
Other languages
Chinese (zh)
Inventor
赵寅杰
傅小龙
潘小勇
徐志勇
林扬
申宇嘉
傅圆圆
沈红斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN202210773167.6A
Publication of CN115294151A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/70Labelling scene content, e.g. deriving syntactic or semantic representations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20116Active contour; Active surface; Snakes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30061Lung

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Apparatus For Radiation Diagnosis (AREA)
  • Image Processing (AREA)

Abstract

An automatic detection method for lung CT regions of interest (ROIs) based on a multi-task convolution model parses a CT scan into a tensor-form image, then applies rough lung segmentation, resampling, and pixel-value normalization to obtain a three-dimensional CT voxel matrix containing only the lung region. The three-dimensional image is sliced, and the resulting two-dimensional cross-sections are fed into a two-dimensional convolution model to obtain a set of rough ROI contours. Three-dimensional image blocks are then cropped around each rough contour, and a three-dimensional classifier assigns each ROI a probability of being a true ROI. After screening, the blocks containing true ROIs are fed into a three-dimensional convolution model obtained by multi-task training to produce precise ROI contours, which are rendered in different colors according to each ROI's probability of being a true ROI. The invention can rapidly process large batches of CT images and improves ROI detection efficiency.

Description

Automatic detection method for lung CT regions of interest based on a multi-task convolution model
Technical Field
The invention relates to a technology in the field of image processing, and in particular to an automatic detection method for regions of interest in lung CT (Computed Tomography) images based on a multi-task convolution model.
Background
With the development of medical technology, an ever-growing number of lung CT images must be processed promptly, and detecting the region of interest (ROI) is the first step of lung CT image processing. In digital image processing, an ROI is a specific image region that a technician cares about. The traditional approach of manually checking CT images layer by layer to determine whether an ROI exists places a heavy burden on physicians, who must cross-reference multiple images when locating an ROI, making the process time-consuming and labor-intensive. Automatically locating ROIs and delineating their contours with a convolutional network reduces CT-image processing time while maintaining accuracy, makes batch processing of CT images feasible, frees physicians from repetitive work, and improves the overall operating efficiency of a hospital.
Current algorithms for automatic ROI detection in lung CT images lack universality: each can only process CT images of a particular slice thickness and generalizes poorly to CT images of other thicknesses or from different scanners. Moreover, most algorithms train the convolutional network with only an ROI mask or anchor box as the semantic label, so the model attends only to visual features derived from voxel values in the CT image, ignores clinical biological features, and cannot delineate the ROI contour accurately.
Disclosure of Invention
To address these shortcomings of the prior art, the invention provides an automatic detection method for lung CT regions of interest based on a multi-task convolution model. Different models are trained for different slice thicknesses, which effectively handles the fact that CT images of different thicknesses carry different amounts of information. In addition, the three-dimensional semantic segmentation model is trained with both the ROI mask and a benign/malignant semantic label, combining the imaging and biological characteristics of the ROI so that its contour can be delineated accurately. The ROI contour is produced by a convolution model with an encoder-decoder structure, enabling rapid processing of large batches of CT images and improving ROI detection efficiency.
The invention is realized by the following technical scheme:
the invention relates to a lung CT interested area automatic detection method based on a multitask convolution model, which comprises the steps of obtaining an image in a tensor form by analyzing CT scanning, and carrying out rough lung segmentation, resampling and pixel value normalization processing on the image to obtain a three-dimensional CT image voxel matrix only containing a lung area; then after the three-dimensional image is sliced, the cross section slices are input into a two-dimensional semantic segmentation network based on a coder decoder structure, and a plurality of ROI rough outlines are obtained; and intercepting the three-dimensional image block by taking the rough ROI profile as a center, respectively giving the probability that each ROI is a real ROI through a three-dimensional classifier, inputting the obtained three-dimensional image block containing the real ROI into a three-dimensional convolution model obtained by multi-task training after screening to obtain an accurate ROI profile, and rendering the profile into different colors according to the probability that the ROI is the real ROI.
The three-dimensional image is obtained as follows: for an original CT scan, i.e. a scan in DICOM (Digital Imaging and Communications in Medicine) format, the two-dimensional slice matrices in the individual DICOM files are combined into a three-dimensional image according to the position coordinates of the slices, and the image is then preprocessed. Specifically: a rough anchor box of the lung region is found with digital image processing techniques and the non-lung portion of the image is cropped away; the pixel pitch of the image is resampled to (0.7 mm, 0.7 mm, 1.5 mm) along the (x, y, z) axes; and the pixel values are normalized with the lung window [-1024, 400] and the mediastinal window [-160, 240] to obtain two channels corresponding to the lung and mediastinal windows respectively.
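For illustration, a minimal Python sketch of the assembly step is given below; it assumes the pydicom and numpy libraries, and the function name and directory layout are illustrative rather than details fixed by the text.

```python
# Minimal sketch: assemble a DICOM series into a 3-D volume in Hounsfield units.
import glob
import numpy as np
import pydicom

def load_ct_volume(dicom_dir: str) -> np.ndarray:
    slices = [pydicom.dcmread(p) for p in glob.glob(f"{dicom_dir}/*.dcm")]
    # Sort slices by the z position coordinate stored in each file.
    slices.sort(key=lambda s: float(s.ImagePositionPatient[2]))
    # Convert stored pixel values to Hounsfield units via the rescale tags.
    return np.stack(
        [s.pixel_array * float(s.RescaleSlope) + float(s.RescaleIntercept)
         for s in slices]
    ).astype(np.float32)  # shape: (z, y, x)
```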
The two-dimensional semantic segmentation network is a neural network based on U2-Net. The network as a whole has an encoder-decoder structure, and each unit of the encoder and decoder is itself an encoding-decoding structure. The encoder progressively enlarges the receptive field of the convolution layers through repeated downsampling, extracting high-dimensional semantic features from the feature map and obtaining the rough position of the nodule in the current input; the decoder progressively upsamples the feature map back to the input resolution, extracting low-dimensional visual features and producing the mask of the nodule in the input image.
The outputs of each encoder and decoder layer, {e_i | i ∈ [1, N]} and {d_j | j ∈ [1, N]}, have different resolutions; that is, through multiple levels of downsampling and upsampling, the encoder-decoder structure lets the model extract ROI features at different resolutions, which effectively handles the inconsistent ROI sizes across cases. Here N is the number of encoder and decoder layers.
Short (skip) connections are placed between corresponding encoder and decoder layers; that is, the input to the j-th decoder layer is Concat(e_j, Up(d_{j-1})), where Concat denotes concatenating feature maps along the channel dimension and Up denotes upsampling a feature map to twice its resolution by interpolation. The decoder therefore simultaneously processes the high-dimensional semantic features output by the previous decoder layer and the low-dimensional visual features output by the encoder at the same level, which improves the precision of the ROI contour while preserving ROI position accuracy.
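A minimal PyTorch sketch of one such decoder layer follows; the module layout, channel arguments, and interpolation mode are assumptions, not details fixed by the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoderLayer(nn.Module):
    """Computes conv(Concat(e_j, Up(d_{j-1}))) for one decoder level."""
    def __init__(self, enc_ch: int, dec_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(enc_ch + dec_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, e_j: torch.Tensor, d_prev: torch.Tensor) -> torch.Tensor:
        up = F.interpolate(d_prev, scale_factor=2, mode="bilinear",
                           align_corners=False)        # Up(d_{j-1})
        return self.conv(torch.cat([e_j, up], dim=1))  # Concat along channels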
The probability that an ROI is a true ROI is obtained as follows: a three-dimensional image block of size 96 × 96 × 48 is cropped around each rough ROI contour output by the two-dimensional semantic segmentation model; the block is then fed into a three-dimensional classifier based on a residual network, which outputs the probability that the ROI is a true ROI.
The three-dimensional classifier based on a residual network is a three-dimensional variant of the residual network. Its residual structure is y = F(x, {W_i}) + x, where the subscript i refers to the i-th layer of the network, x is the input of the current layer, y is the output of the current layer, {W_i} are the parameters of the i-th layer, and F(x, {W_i}) is the residual function, e.g. F = W_2 · σ(W_1 · x) with σ a nonlinear activation. Residual connections partially alleviate the degradation that occurs when training deeper networks. The bottleneck layer maps the feature map to a higher-dimensional space for processing and then compresses it back to a lower-dimensional space, retaining the richer semantic expressiveness of high-dimensional features while reducing the number of trainable parameters and the difficulty of training.
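A minimal PyTorch sketch of a three-dimensional bottleneck residual block consistent with this description follows (using the "widen, process, compress" reading above); the expansion factor and exact layer layout are assumptions.

```python
import torch
import torch.nn as nn

class Bottleneck3D(nn.Module):
    """y = F(x, {W_i}) + x, with F mapping to a wider space and back."""
    def __init__(self, channels: int, expansion: int = 4):
        super().__init__()
        mid = channels * expansion  # widen, process, then compress back
        self.residual = nn.Sequential(
            nn.Conv3d(channels, mid, kernel_size=1, bias=False),
            nn.BatchNorm3d(mid), nn.ReLU(inplace=True),
            nn.Conv3d(mid, mid, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm3d(mid), nn.ReLU(inplace=True),
            nn.Conv3d(mid, channels, kernel_size=1, bias=False),
            nn.BatchNorm3d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.relu(self.residual(x) + x)  # residual (skip) connection
```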
The screening is as follows: any ROI whose probability falls below a certain threshold is judged a false positive, and its contour is not written to the final mask.
The precise ROI contour is obtained by cropping a three-dimensional image block of size 128 × 128 × 64 centered on each screened ROI and feeding it into a three-dimensional semantic segmentation network based on an encoder-decoder structure; the contour of each ROI is then written to an RT STRUCT file, which renders each contour in a different color according to the probability output by the classifier.
The three-dimensional semantic segmentation network is a three-dimensional variant of the two-dimensional semantic segmentation network: its two-dimensional operations are simply replaced with the corresponding three-dimensional ones, i.e. the two-dimensional convolution, pooling, and normalization layers become three-dimensional convolution, pooling, and normalization layers respectively.
The multi-task training is as follows: when training the three-dimensional semantic segmentation network, an auxiliary task of judging whether the ROI is benign or malignant is added alongside the main task of outputting the ROI mask. Specifically, in addition to the encoder that extracts features from the input CT image and the decoder that outputs the ROI mask, the network is given a classifier that judges whether the current ROI is benign or malignant. The classifier takes the output of every encoder layer as input and, through appropriate downsampling and convolution operations, produces a probability that the current ROI is benign. During training, the encoder parameters are updated simultaneously by the gradients back-propagated from both the decoder and the classifier.
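A minimal sketch of one such joint update follows; it assumes the encoder returns the per-layer feature maps that both heads consume, and the auxiliary-task weight cls_weight is an assumption the text does not specify.

```python
def multitask_step(encoder, decoder, classifier, seg_loss_fn, cls_loss_fn,
                   optimizer, image, mask_gt, benign_gt, cls_weight=0.5):
    """One training step: the shared encoder receives gradients from both heads."""
    features = encoder(image)            # per-layer encoder feature maps
    mask_pred = decoder(features)        # main task: ROI mask
    benign_pred = classifier(features)   # auxiliary task: benign probability
    loss = (seg_loss_fn(mask_pred, mask_gt)
            + cls_weight * cls_loss_fn(benign_pred, benign_gt))
    optimizer.zero_grad()
    loss.backward()                      # gradients from decoder AND classifier
    optimizer.step()
    return float(loss)
```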
the Loss function of the three-dimensional classifier and the classification part in the multi-task training is Focal local, and specifically comprises the following steps:
Figure BDA0003725014360000031
wherein: y is the sum of the average power of the power supply,
Figure BDA0003725014360000032
respectively indicating a real label and a predicted value of the ROI benign and malignant, wherein the hyper-parameter alpha is used for controlling the weight of the positive and negative samples, and when alpha is more than 0.5, the positive samples have greater contribution to the model parameters; the weight of the difficult and easy samples is controlled by the loss function through the hyper-parameter gamma, when gamma is larger than 1, the weight of the difficult samples is larger when gradient back propagation is carried out, namely, the model focuses more on the difficult samples.
The loss function for the two- and three-dimensional semantic segmentation networks is Focal Loss + Dice Loss, specifically:

L_seg = L_focal + L_Dice

L_Dice = 1 - (2 · Σ_i p_i · p̂_i) / (Σ_i p_i + Σ_i p̂_i)

where p and p̂ denote the true label and the predicted value of the ROI mask respectively, and the subscript i denotes the i-th pixel of the mask. Focal Loss is a pixel-wise loss, computed and summed over every pixel; a loss of this form has well-behaved gradients and makes stable training easy. The Dice Loss depends only on the foreground region of the mask, matches the final evaluation metric, and alleviates class imbalance, but when the foreground region is too small the optimization process becomes unstable. Combining the two losses yields a better Dice coefficient for the model's output ROI mask while keeping training stable.
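A sketch of the combined loss, reusing the focal_loss sketch above; the smoothing term eps is an assumption added to keep an empty mask from dividing by zero.

```python
import torch

def dice_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Soft Dice loss between a probability mask and a 0/1 ground-truth mask."""
    eps = 1e-7  # smoothing; guards against an empty foreground
    inter = (pred * target).sum()
    return 1.0 - 2.0 * inter / (pred.sum() + target.sum() + eps)

def segmentation_loss(pred, target, alpha=0.75, gamma=2.0):
    """Focal Loss + Dice Loss, as used for both segmentation networks."""
    return focal_loss(pred, target, alpha, gamma) + dice_loss(pred, target)
```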
The training data consist of two in-house hospital datasets. The first contains 989 samples with a CT slice thickness of 5 mm; the second contains 172 samples with a CT slice thickness below 1.5 mm. Two sets of models are obtained by training on the two datasets separately. In practical use, when the slice thickness of the input CT image exceeds 3 mm, the models trained on the first dataset are used; otherwise the models trained on the second dataset are used.
Technical effects
The method uses both the ROI mask and the benign/malignant label as ground truth when training the three-dimensional semantic segmentation model, so the model learns the visual features of the CT image through the ROI mask and the biological features through the benign/malignant label. In addition, the three-step pipeline of rough contour delineation, false-positive suppression, and precise contour delineation, together with the corresponding models, balances the efficiency of locating ROIs in a three-dimensional CT image with the accuracy of contour delineation while reducing false positives. Finally, the invention trains different models for CT images of different slice thicknesses, improving the accuracy of ROI contour delineation across slice thicknesses.
Drawings
FIG. 1 is a flow chart of an embodiment;
FIG. 2 is a schematic view of an ROI in a CT scan two-dimensional slice;
FIG. 3 is a schematic diagram of a process for roughly segmenting a lung region;
FIG. 4 is a three-dimensional semantic segmentation network architecture used in the present invention;
FIG. 5 is a two-dimensional comparison of the original image (left), the true ROI mask (middle), and the ROI mask output by the invention (right);
FIG. 6 is a three-dimensional comparison of the true ROI mask (left) and the ROI mask output by the invention (right).
Detailed Description
As shown in FIG. 1, this embodiment relates to an automatic lung CT ROI detection process based on a multi-task convolution model. Given the lung CT scan in DICOM format shown in FIG. 2, a semantic segmentation model with a two-dimensional encoder-decoder structure scans the whole lung region and finds rough ROI contours; a classifier with a three-dimensional residual structure judges each ROI and assigns it a probability of being a true ROI; finally, the ROIs whose probability exceeds a certain threshold are precisely delineated by a semantic segmentation model with a three-dimensional encoder-decoder structure.
The detection method comprises the following specific steps:
the first step, the CT image in DICOM format is analyzed, and a plurality of two-dimensional gray level images are combined into a three-dimensional matrix according to coordinates.
In the second step, the lung region is found with digital image processing techniques. Specifically: the grayscale image is binarized by thresholding; connected components are extracted and the two largest are kept, which constitute the lung region; the lung region is then cropped out to remove the extrapulmonary portion, as shown in FIG. 3.
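A minimal sketch of this step using numpy and scikit-image follows; the HU threshold of -320 and the border-clearing heuristic are assumptions, not values fixed by the text.

```python
import numpy as np
from skimage.measure import label, regionprops
from skimage.segmentation import clear_border

def rough_lung_mask(volume_hu: np.ndarray, threshold: float = -320.0) -> np.ndarray:
    """Threshold-binarize, then keep the two largest connected components."""
    binary = volume_hu < threshold          # lung/air voxels have very low HU
    labeled = clear_border(label(binary))   # drop air touching the image border
    regions = sorted(regionprops(labeled), key=lambda r: r.area, reverse=True)
    mask = np.zeros(volume_hu.shape, dtype=bool)
    for region in regions[:2]:              # the two largest: the lungs
        mask |= labeled == region.label
    return mask
```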
In the third step, the pixel pitch of the image is resampled to (0.7 mm, 0.7 mm, 1.5 mm) along the (x, y, z) axes, and the pixel values of the three-dimensional image are normalized with the lung window [-1024, 400] and the mediastinal window [-160, 240] respectively; concretely, the lower window limit is subtracted from each pixel value and the result is divided by the window width.
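A sketch of the two-window normalization; clipping values outside each window to [0, 1] is an assumption beyond the subtract-and-divide operation the text describes.

```python
import numpy as np

def window_normalize(volume_hu: np.ndarray) -> np.ndarray:
    """Two channels: lung window [-1024, 400], mediastinal window [-160, 240]."""
    def apply_window(v, low, high):
        return np.clip((v - low) / (high - low), 0.0, 1.0)  # (value - lower) / width
    lung = apply_window(volume_hu, -1024.0, 400.0)
    mediastinal = apply_window(volume_hu, -160.0, 240.0)
    return np.stack([lung, mediastinal], axis=0)  # shape: (2, z, y, x)
```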
In the fourth step, the three-dimensional image is sliced and resampled into a number of images of size 7 × 512 × 256, where the channel count 7 corresponds to seven consecutive two-dimensional slices.
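A sketch of building the seven-slice stacks; the in-plane resize to 512 × 256 is omitted, and edge replication at the top and bottom of the volume is an assumption.

```python
import numpy as np

def make_slice_stacks(volume: np.ndarray, n_ctx: int = 7) -> np.ndarray:
    """One 7-channel stack of consecutive slices per axial position."""
    pad = n_ctx // 2
    padded = np.pad(volume, ((pad, pad), (0, 0), (0, 0)), mode="edge")
    return np.stack([padded[i:i + n_ctx] for i in range(volume.shape[0])])
```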
In the fifth step, the sliced images are fed into the two-dimensional semantic segmentation network with an encoder-decoder structure to obtain rough ROI contours.
In the sixth step, a three-dimensional image block of size 96 × 96 × 48 is cropped around each ROI obtained in the fifth step and fed into the three-dimensional classifier with a residual structure, which assigns each ROI a probability P of being a true ROI.
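A sketch of cropping a fixed-size block around an ROI center; zero-padding where the block extends past the volume boundary is an assumption.

```python
import numpy as np

def crop_block(volume: np.ndarray, center, size=(48, 96, 96)) -> np.ndarray:
    """Crop a size-shaped block centered at (z, y, x), zero-padded at borders."""
    block = np.zeros(size, dtype=volume.dtype)
    src, dst = [], []
    for c, s, dim in zip(center, size, volume.shape):
        lo, hi = c - s // 2, c - s // 2 + s
        src.append(slice(max(lo, 0), min(hi, dim)))          # clipped source range
        dst.append(slice(max(lo, 0) - lo, s - (hi - min(hi, dim))))
    block[tuple(dst)] = volume[tuple(src)]
    return block
```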
In the seventh step, a three-dimensional image block of size 128 × 128 × 64 is cropped around each ROI whose probability from the sixth step exceeds a certain threshold and fed into the three-dimensional semantic segmentation network with an encoder-decoder structure, which produces the precise contour of each ROI. The structure of the three-dimensional semantic segmentation network is shown in FIG. 4.
In the eighth step, the precise contours from the seventh step are written to an RT STRUCT output file in DICOM format, and each contour is given a color according to its probability P from the sixth step: contours with 0.7 < P ≤ 0.8 are green, those with 0.8 < P ≤ 0.9 are yellow, and those with P > 0.9 are red. Two- and three-dimensional comparisons between the model's output ROI masks and the physician-annotated masks are shown in FIGS. 5 and 6 respectively.
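The color assignment reduces to a simple threshold ladder; the RGB triples in this sketch are illustrative assumptions.

```python
def contour_color(p: float):
    """Map the classifier probability P to a display color per the text."""
    if p > 0.9:
        return (255, 0, 0)    # red
    if p > 0.8:
        return (255, 255, 0)  # yellow
    if p > 0.7:
        return (0, 255, 0)    # green
    return None               # P <= 0.7: screened out as a false positive
```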
Compared with the prior art, the method achieves an average Dice of 0.7026 on a test set of 119 thick-slice CT images (slice thickness > 3 mm) and 0.7046 on a test set of 21 thin-slice CT images (slice thickness ≤ 3 mm); on a single NVIDIA GeForce RTX 3090 Ti GPU, the average processing time is under one minute per thick-slice CT image and under two minutes per thin-slice CT image.
Those skilled in the art may modify the foregoing specific embodiments in various ways without departing from the principle and spirit of the invention. The scope of protection is defined by the appended claims and is not limited by the specific embodiments described above; every implementation within that scope is covered by the invention.

Claims (9)

1. An automatic detection method for lung CT regions of interest based on a multi-task convolution model, characterized in that a tensor-form image is obtained by parsing a CT scan and subjected to rough lung segmentation, resampling, and pixel-value normalization to obtain a three-dimensional CT voxel matrix containing only the lung region; the three-dimensional image is then sliced and the cross-sections are fed into a two-dimensional semantic segmentation network based on an encoder-decoder structure to obtain a plurality of rough ROI contours; three-dimensional image blocks are cropped centered on the rough ROI contours, a three-dimensional classifier assigns each ROI a probability of being a true ROI, and after screening, the three-dimensional image blocks containing true ROIs are fed into a three-dimensional convolution model obtained by multi-task training to obtain precise ROI contours, which are rendered in different colors according to the probability that each ROI is a true ROI.
2. The automatic detection method for lung CT regions of interest based on a multi-task convolution model according to claim 1, characterized in that the two-dimensional semantic segmentation network is a neural network based on U2-Net whose overall structure is an encoder-decoder, each unit of the encoder and decoder being itself an encoding-decoding structure, wherein the encoder progressively enlarges the receptive field of the convolution layers through repeated downsampling, extracting high-dimensional semantic features from the feature map and obtaining the rough position of the nodule in the current input; and the decoder progressively upsamples the feature map back to the input resolution, extracting low-dimensional visual features and producing the mask of the nodule in the input image.
3. The automatic detection method for lung CT regions of interest based on a multi-task convolution model according to claim 2, characterized in that the outputs of each encoder and decoder layer, {e_i | i ∈ [1, N]} and {d_j | j ∈ [1, N]}, have different resolutions, i.e. the encoder-decoder structure employs multi-layer downsampling and upsampling, where N is the number of encoder and decoder layers;
short (skip) connections are placed between corresponding encoder and decoder layers, i.e. the input to the j-th decoder layer is Concat(e_j, Up(d_{j-1})), where Concat denotes concatenating feature maps along the channel dimension and Up denotes upsampling a feature map to twice its resolution by interpolation.
4. The automatic detection method for lung CT regions of interest based on a multi-task convolution model according to claim 1, characterized in that the three-dimensional classifier is a three-dimensional variant of the residual network, whose residual structure is specifically y = F(x, {W_i}) + x, where the subscript i refers to the i-th layer of the network, x is the input of the current layer, y is the output of the current layer, {W_i} are the parameters of the i-th layer, and F(x, {W_i}) is the residual function, e.g. F = W_2 · σ(W_1 · x) with σ a nonlinear activation; the bottleneck layer maps the feature map to a higher-dimensional space for processing and then compresses it back to a lower-dimensional space.
5. The automatic detection method for lung CT regions of interest based on a multi-task convolution model according to claim 1, characterized in that the precise ROI contour is obtained by cropping a three-dimensional image block of size 128 × 128 × 64 centered on each screened ROI and feeding it into a three-dimensional semantic segmentation network based on an encoder-decoder structure; the contour of each ROI is then written to an RT STRUCT file, which renders each contour in a different color according to the probability output by the classifier.
6. The automatic detection method for lung CT regions of interest based on a multi-task convolution model according to claim 5, characterized in that the three-dimensional semantic segmentation network is a three-dimensional variant of the two-dimensional semantic segmentation network, with only its two-dimensional operations replaced by the corresponding three-dimensional operations, i.e. the two-dimensional convolution, pooling, and normalization layers become three-dimensional convolution, pooling, and normalization layers respectively.
7. The automatic detection method for lung CT regions of interest based on a multi-task convolution model according to claim 6, characterized in that the multi-task training means: when training the three-dimensional semantic segmentation network, an auxiliary task of judging whether the ROI is benign or malignant is added alongside the main task of outputting the ROI mask; specifically, in addition to the encoder that extracts features from the input CT image and the decoder that outputs the ROI mask, the network is given a classifier that judges whether the current ROI is benign or malignant; the classifier takes the output of every encoder layer as input and, through appropriate downsampling and convolution operations, produces a probability that the current ROI is benign; during training, the encoder parameters are updated simultaneously by the gradients back-propagated from both the decoder and the classifier.
8. The automatic detection method for lung CT regions of interest based on a multi-task convolution model according to claim 7, characterized in that the loss function of the three-dimensional classifier and of the classification part in the multi-task training is the Focal Loss, specifically:
L_focal = -α · (1-ŷ)^γ · y · log(ŷ) - (1-α) · ŷ^γ · (1-y) · log(1-ŷ)
where y and ŷ denote the true label and the predicted value of the ROI's benign/malignant status respectively; the hyper-parameter α controls the weighting of positive and negative samples, and when α > 0.5 positive samples contribute more to the model parameters; the loss function controls the weighting of hard and easy samples through the hyper-parameter γ, and when γ > 1 hard samples receive larger weight during gradient back-propagation, i.e. the model focuses more on hard samples.
9. The automatic detection method for lung CT regions of interest based on a multi-task convolution model according to claim 5 or 6, characterized in that the loss function of the two- and three-dimensional semantic segmentation networks is Focal Loss + Dice Loss, specifically:
L_seg = L_focal + L_Dice
L_Dice = 1 - (2 · Σ_i p_i · p̂_i) / (Σ_i p_i + Σ_i p̂_i)
where p and p̂ denote the true label and the predicted value of the ROI mask respectively, and the subscript i denotes the i-th pixel of the mask; Focal Loss is a pixel-wise loss, computed and summed over every pixel; the Dice Loss depends only on the foreground region of the mask, matches the final evaluation metric, and simultaneously alleviates class imbalance, but when the foreground region is too small the model optimization process becomes unstable.
CN202210773167.6A 2022-07-01 2022-07-01 Automatic detection method for lung CT regions of interest based on a multi-task convolution model Pending CN115294151A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210773167.6A CN115294151A (en) Automatic detection method for lung CT regions of interest based on a multi-task convolution model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210773167.6A CN115294151A (en) Automatic detection method for lung CT regions of interest based on a multi-task convolution model

Publications (1)

Publication Number Publication Date
CN115294151A true CN115294151A (en) 2022-11-04

Family

ID=83822823

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210773167.6A Pending CN115294151A (en) Automatic detection method for lung CT regions of interest based on a multi-task convolution model

Country Status (1)

Country Link
CN (1) CN115294151A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117351215A (en) * 2023-12-06 2024-01-05 上海交通大学宁波人工智能研究院 Artificial shoulder joint prosthesis design system and method
CN117351215B (en) * 2023-12-06 2024-02-23 上海交通大学宁波人工智能研究院 Artificial shoulder joint prosthesis design system and method

Similar Documents

Publication Publication Date Title
CN109580630B (en) Visual inspection method for defects of mechanical parts
CN111027547A (en) Automatic detection method for multi-scale polymorphic target in two-dimensional image
CN108288271A (en) Image detecting system and method based on three-dimensional residual error network
CN114092439A (en) Multi-organ instance segmentation method and system
CN109840483B (en) Landslide crack detection and identification method and device
CN111462120A (en) Defect detection method, device, medium and equipment based on semantic segmentation model
CN112734755A (en) Lung lobe segmentation method based on 3D full convolution neural network and multitask learning
CN112132166A (en) Intelligent analysis method, system and device for digital cytopathology image
CN111402254A (en) CT image pulmonary nodule high-performance automatic detection method and device
CN112700461B (en) System for pulmonary nodule detection and characterization class identification
CN113223005B (en) Thyroid nodule automatic segmentation and grading intelligent system
CN111709929A (en) Lung canceration region segmentation and classification detection system
CN113139977B (en) Mouth cavity curve image wisdom tooth segmentation method based on YOLO and U-Net
CN112420170B (en) Method for improving image classification accuracy of computer aided diagnosis system
CN115909006A (en) Mammary tissue image classification method and system based on convolution Transformer
CN115546605A (en) Training method and device based on image labeling and segmentation model
CN115170518A (en) Cell detection method and system based on deep learning and machine vision
CN115294151A (en) 2022-11-04 Automatic detection method for lung CT regions of interest based on a multi-task convolution model
CN114581474A (en) Automatic clinical target area delineation method based on cervical cancer CT image
CN116664590B (en) Automatic segmentation method and device based on dynamic contrast enhancement magnetic resonance image
CN116883341A (en) Liver tumor CT image automatic segmentation method based on deep learning
CN115423806B (en) Breast mass detection method based on multi-scale cross-path feature fusion
CN116563691A (en) Road disease detection method based on TransUnet model
CN115018780B (en) Thyroid nodule segmentation method integrating global reasoning and MLP architecture
CN113763343B (en) Deep learning-based Alzheimer's disease detection method and computer-readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination