CN117291935A - Head and neck tumor focus area image segmentation method and computer readable medium - Google Patents
Info
- Publication number
- CN117291935A CN117291935A CN202311383312.0A CN202311383312A CN117291935A CN 117291935 A CN117291935 A CN 117291935A CN 202311383312 A CN202311383312 A CN 202311383312A CN 117291935 A CN117291935 A CN 117291935A
- Authority
- CN
- China
- Prior art keywords
- group
- kth
- preprocessed images
- fusion
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10104—Positron emission tomography [PET]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30016—Brain
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30096—Tumor; Lesion
Abstract
The invention provides a head and neck tumor lesion area image segmentation method and a computer readable medium. The method comprises: acquiring multiple groups of original head and neck tumor PET-CT images, preprocessing them in sequence to obtain each group of preprocessed images, and annotating the corresponding ground-truth classification labels; constructing a lesion image segmentation network, performing lesion segmentation prediction on each group of preprocessed images to obtain a head and neck tumor prediction probability map for each group, constructing a cross-entropy Dice weighted loss function from the ground-truth classification label of the head and neck tumor lesion at each pixel of each group of preprocessed images, and training with stochastic gradient descent to obtain a trained lesion image segmentation network; and performing prediction segmentation and probability-threshold screening on head and neck tumor PET-CT images acquired in real time with the trained network, obtaining the pixel range of the head and neck tumor lesion area in real time. The invention exploits the complementary information among multiple modalities and improves the accuracy of pixel-level segmentation prediction of head and neck tumor lesions.
Description
Technical Field
The invention belongs to the technical field of medical image processing, and particularly relates to a head and neck tumor lesion area image segmentation method and a computer readable medium.
Background
Head and neck tumors are the fifth most common cancer type worldwide. At present, PET-CT combines positron emission tomography (PET) with X-ray computed tomography (CT): PET provides molecular information on the function and metabolism of a lesion, while CT provides its precise anatomical localization, assisting the diagnosis of head and neck tumors. In clinical practice, doctors mainly evaluate the tumor by observing the position, shape, size, and boundary of a lesion in order to formulate a treatment plan. Head and neck tumor lesion segmentation can assist doctors in clinical diagnosis and is therefore an important problem in medical image analysis. Accurate segmentation of head and neck tumor lesions in PET-CT images remains unsolved, mainly for the following reasons:
PET and CT images provide different biological and anatomical information; how to effectively fuse these two different modalities remains a challenging problem;
head and neck tumors have great heterogeneity in shape, size and intensity due to varying degrees of lesions, and there may be a blurred boundary or overlap region between the tumor body and surrounding tissue, which increases the complexity of the segmentation.
Some lesion boundaries are blurred, and annotations of different lesion areas by different radiologists may be inconsistent. In addition, manual annotation relies on the expertise and experience of imaging specialists, is time-consuming and laborious, and may introduce subjective inter-observer differences.
Therefore, establishing an accurate automatic segmentation method for head and neck tumor lesions is of great significance for clinical diagnosis and disease management. In recent years, deep learning-based methods have shown good performance on various computer vision problems (e.g., image classification, object detection, and semantic segmentation), and many have been applied to head and neck tumor lesion segmentation with impressive results.
Convolutional neural networks represented by U-Net have a classical symmetric encoder-decoder structure and preserve high-resolution detail through skip connections. U-Net has achieved significant performance in medical image segmentation tasks thanks to the strong feature extraction capability and inductive bias it exhibits on smaller datasets. Multi-modal medical images have the advantage of providing different kinds of information, making judgment of the lesion more accurate; however, simply concatenating the modalities as input to a U-shaped network cannot fully exploit the complementary information between them.
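As background for the U-shaped architecture described above, the following sketch (not part of the patent) traces how feature-map resolutions halve through the encoder and mirror back up through the decoder, which is what lets each skip connection concatenate tensors of matching spatial size. The input size 256 and the stage count 4 are arbitrary illustrative choices.

```python
# Illustrative sketch (not the patented network): trace feature-map
# spatial sizes through a k_stages-deep U-shaped encoder-decoder in
# which every stage halves resolution on the way down and doubles it
# on the way up, so each skip connection joins tensors of equal size.

def unet_resolutions(input_size: int, k_stages: int):
    """Return (encoder_sizes, decoder_sizes) for a 2x down/up U-Net."""
    encoder = [input_size // (2 ** k) for k in range(k_stages)]
    decoder = list(reversed(encoder))  # the decoder mirrors the encoder
    return encoder, decoder

enc, dec = unet_resolutions(256, 4)
assert enc == [256, 128, 64, 32]
# the skip connection at encoder stage k feeds the decoder stage that
# has been upsampled back to the same spatial size
assert dec == [32, 64, 128, 256]
```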
Disclosure of Invention
In order to solve the above technical problems, the invention provides a head and neck tumor lesion area image segmentation method and a computer readable medium.
The technical solution of the invention is a head and neck tumor lesion area image segmentation method, characterized in that:
a lesion image segmentation network is constructed; each group of preprocessed images is input into the network for prediction so as to construct a cross-entropy Dice weighted loss function, and the network is optimized and trained by stochastic gradient descent to obtain a trained lesion image segmentation network;
prediction segmentation through the trained lesion image segmentation network yields a head and neck tumor prediction probability map for a real-time head and neck tumor PET-CT image, and the pixel range of the head and neck tumor lesion area in that image is then obtained in combination with a probability-threshold decision.
The technical scheme of the method specifically comprises the following steps:
step 1: acquire multiple groups of original head and neck tumor PET-CT images; sequentially apply cropping, normalization, concatenation, and data augmentation to each group of original head and neck tumor PET-CT images to obtain each group of preprocessed images; and annotate the ground-truth classification label of the head and neck tumor lesion for each pixel of each group of preprocessed images;
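The patent names the preprocessing operations of step 1 but not their parameters. As a hedged illustration only, the sketch below assumes per-image min-max normalization to [0, 1] and channel-wise stacking of the PET and CT modalities; nested Python lists stand in for image arrays, and `minmax_normalize` / `stack_channels` are hypothetical helper names, not the patent's.

```python
# Minimal sketch of the step-1 preprocessing, under the assumption that
# "normalization" is per-image min-max scaling to [0, 1] and
# "concatenation" stacks PET and CT as two channels of one sample.
# The patent does not fix these details.

def minmax_normalize(img):
    """Scale one 2-D image (list of rows) to [0, 1]."""
    flat = [v for row in img for v in row]
    lo, hi = min(flat), max(flat)
    rng = (hi - lo) or 1.0                 # guard against constant images
    return [[(v - lo) / rng for v in row] for row in img]

def stack_channels(pet, ct):
    """Pair the two modalities into one 2-channel sample."""
    return [minmax_normalize(pet), minmax_normalize(ct)]

sample = stack_channels([[0, 5], [10, 20]], [[-100, 0], [100, 300]])
assert sample[0] == [[0.0, 0.25], [0.5, 1.0]]   # normalized PET channel
```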
Step 2: construct a lesion image segmentation network; input each group of preprocessed images into the network for lesion segmentation prediction to obtain a head and neck tumor prediction probability map for each group; construct a cross-entropy Dice weighted loss function from the ground-truth classification labels of the head and neck tumor lesion at each pixel of each group of preprocessed images; and train with stochastic gradient descent to obtain the trained lesion image segmentation network;
step 3: perform prediction segmentation on a head and neck tumor PET-CT image acquired in real time through the trained lesion image segmentation network to obtain its head and neck tumor prediction probability map, and obtain the pixel range of the head and neck tumor lesion area of that image in combination with a probability-threshold decision.
preferably, the lesion image segmentation network of step 2 comprises:
a single-modality encoding network, a fusion encoding network, a single-modality decoding network, and a fusion decoding network;
the single-modality encoding network performs feature extraction on each group of preprocessed images to obtain intermediate and final feature representations of each group, outputs the intermediate feature representations to the fusion encoding network, and outputs the final feature representations to the single-modality decoding network;
the fusion encoding network performs feature fusion on the intermediate feature representations of each group of preprocessed images to obtain the fusion encoding features of each group, and outputs them to the fusion decoding network;
the single-modality decoding network decodes the final feature representations of each group of preprocessed images to obtain the decoded feature representations of each group, and outputs them to the fusion decoding network;
the fusion decoding network performs fusion decoding on the decoded feature representations of each group of preprocessed images, finally obtaining a predicted lesion-region segmentation image for each preprocessed image;
the single-modality encoding network comprises:
the 1st coding module, the 2nd coding module, …, the Kth coding module, and a bottleneck module, cascaded in sequence;
the kth coding module consists of a multi-layer convolution module followed by a downsampling layer, k ∈ [1, K];
the bottleneck module consists of a multi-layer convolution module;
the fusion encoding network comprises:
the 1st fusion coding module, the 2nd fusion coding module, …, the Kth fusion coding module, and a bottleneck module, cascaded in sequence;
the kth fusion coding module consists of a multi-layer convolution module, an attention module, and a downsampling layer cascaded in sequence, k ∈ [1, K];
the bottleneck module consists of a multi-layer convolution module;
the single-modality decoding network comprises:
the 1st decoding module, the 2nd decoding module, …, the Kth decoding module, cascaded in sequence;
the kth decoding module consists of an upsampling layer, a concatenation layer, and a multi-layer convolution module cascaded in sequence, k ∈ [1, K];
the fusion decoding network comprises:
the 1st fusion decoding module, the 2nd fusion decoding module, …, the Kth fusion decoding module, cascaded in sequence;
the kth fusion decoding module consists of an upsampling layer, a concatenation layer, and an inverted bottleneck convolution module cascaded in sequence, k ∈ [1, K];
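The decoder's inverted bottleneck convolution module is named but not specified in the patent. Assuming the common expand-depthwise-project form (as in MobileNetV2/ConvNeXt-style blocks) with a hypothetical expansion ratio r = 4 and kernel size 3, a quick parameter count illustrates why such a block is attractive: it processes a 4× wider intermediate representation with fewer parameters than a plain 3×3 convolution at the same width. These numbers are illustrative only.

```python
# Rough parameter counts (ignoring biases/norms) for an assumed
# inverted bottleneck block: 1x1 expand -> kxk depthwise -> 1x1 project.
# Expansion ratio r and kernel size k are assumptions, not patent values.

def inverted_bottleneck_params(c: int, k: int = 3, r: int = 4) -> int:
    expand = c * (r * c)          # 1x1 pointwise expansion, c -> r*c
    depthwise = (r * c) * k * k   # kxk depthwise conv on r*c channels
    project = (r * c) * c         # 1x1 projection back, r*c -> c
    return expand + depthwise + project

def plain_conv_params(c: int, k: int = 3) -> int:
    return c * c * k * k          # standard kxk convolution, c -> c

# at 64 channels the inverted bottleneck is smaller than a plain 3x3 conv
assert inverted_bottleneck_params(64) == 35072
assert plain_conv_params(64) == 36864
```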
The 1st coding module passes each modality of each group of preprocessed images through its multi-layer convolution module to obtain the stage-1 features of each modality, and outputs them to the 1st downsampling layer, the 1st fusion coding module, and the 1st decoding module;
the 1st downsampling layer downsamples the stage-1 features of each modality to obtain the stage-1 downsampled features of each modality, which are output to the 2nd coding module;
for k ∈ [2, K-1], the kth coding module extracts features from the stage-(k-1) downsampled features of each modality through its multi-layer convolution module to obtain the stage-k features of each modality, and outputs them to the kth downsampling layer, the kth fusion coding module, and the kth decoding module;
the kth downsampling layer downsamples the stage-k features of each modality to obtain the stage-k downsampled features of each modality, which are output to the (k+1)th coding module;
the Kth coding module extracts features from the stage-(K-1) downsampled features of each modality through its multi-layer convolution module to obtain the stage-K features of each modality, and outputs them to the Kth downsampling layer, the Kth fusion coding module, and the Kth decoding module;
the Kth downsampling layer downsamples the stage-K features of each modality to obtain the stage-K downsampled features of each modality, which are output to the bottleneck module;
the bottleneck module extracts features from the stage-K downsampled features of each modality through its multi-layer convolution module to obtain the bottleneck-layer features of each modality, and outputs them to the Kth decoding module;
the 1st fusion coding module concatenates all modalities of each group of preprocessed images along the channel dimension, obtains the stage-1 features of each group through its multi-layer convolution module, inputs these features together with the stage-1 features of each modality into the 1st attention module to obtain the stage-1 fusion features of each group, outputs the stage-1 fusion features to the 1st downsampling layer to obtain the stage-1 downsampled fusion features, and outputs them to the 2nd fusion coding module;
for k ∈ [2, K-1], the kth fusion coding module obtains the stage-k features of each group through its multi-layer convolution module, inputs these features together with the stage-k features of each modality into the kth attention module to obtain the stage-k fusion features of each group, outputs the stage-k fusion features to the kth downsampling layer to obtain the stage-k downsampled fusion features, and outputs them to the (k+1)th fusion coding module;
the Kth fusion coding module obtains the stage-K features of each group through its multi-layer convolution module, inputs these features together with the stage-K features of each modality into the Kth attention module to obtain the stage-K fusion features of each group, outputs the stage-K fusion features to the Kth downsampling layer to obtain the stage-K downsampled fusion features, and outputs them to the bottleneck module;
the bottleneck module extracts features from the stage-K downsampled fusion features through its multi-layer convolution module to obtain the bottleneck-layer fusion features of each group of preprocessed images, and outputs them to the Kth fusion decoding module;
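The attention modules that fuse per-modality features with the fused-branch features are not disclosed in detail. Purely as an illustration of weight-based feature fusion, the sketch below computes softmax weights per position over the candidate branches and blends them; a real attention module would learn its weights rather than derive them from the feature values themselves, so treat every name here as hypothetical.

```python
# Illustration only: fuse several same-shaped feature vectors (one per
# branch/modality) with per-position softmax weights. This is NOT the
# patent's attention module, merely an example of attention-style fusion.
import math

def softmax(xs):
    m = max(xs)                              # subtract max for stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def fuse(features):
    """features: list of equal-length feature vectors, one per branch."""
    fused = []
    for pos in range(len(features[0])):
        vals = [f[pos] for f in features]
        w = softmax(vals)                    # attention-style weights
        fused.append(sum(wi * vi for wi, vi in zip(w, vals)))
    return fused

# identical branches fuse to the same value
assert fuse([[0.5], [0.5]]) == [0.5]
```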
the Kth decoding module upsamples the bottleneck-layer features of each modality to obtain the stage-K upsampled features of each modality, concatenates them with the stage-K features of each modality along the channel dimension, obtains the stage-K upsampled concatenation features of each modality through its multi-layer convolution module, and outputs them to the (K-1)th decoding module and the (K-1)th fusion decoding module;
for k ∈ [2, K-1], the kth decoding module upsamples the stage-(k+1) upsampled concatenation features of each modality to obtain the stage-k upsampled features of each modality, concatenates them with the stage-k features of each modality along the channel dimension, obtains the stage-k upsampled concatenation features of each modality through its multi-layer convolution module, and outputs them to the (k-1)th decoding module and the (k-1)th fusion decoding module;
the 1st decoding module upsamples the stage-2 upsampled concatenation features of each modality to obtain the stage-1 upsampled features of each modality, concatenates them with the stage-1 features of each modality along the channel dimension, obtains the stage-1 upsampled concatenation features of each modality through its multi-layer convolution module, and outputs them to the 1st fusion decoding module;
the Kth fusion decoding module upsamples the bottleneck-layer fusion features of each group of preprocessed images to obtain the stage-K upsampled fusion features of each group, concatenates them with the stage-K upsampled concatenation features of each modality along the channel dimension, obtains the stage-K upsampled concatenation fusion features of each group through the Kth inverted bottleneck convolution module, and outputs them to the (K-1)th fusion decoding module;
for k ∈ [2, K-1], the kth fusion decoding module upsamples the stage-(k+1) upsampled concatenation fusion features of each group to obtain the stage-k upsampled fusion features, concatenates them with the stage-k upsampled concatenation features of each modality along the channel dimension, obtains the stage-k upsampled concatenation fusion features of each group through the kth inverted bottleneck convolution module, and outputs them to the (k-1)th fusion decoding module;
the 1st fusion decoding module upsamples the stage-2 upsampled concatenation fusion features of each group to obtain the stage-1 upsampled fusion features, concatenates them with the stage-1 upsampled concatenation features of each modality along the channel dimension, obtains the stage-1 upsampled concatenation fusion features of each group through the 1st inverted bottleneck convolution module, and convolves them to obtain the head and neck tumor prediction probability map of each group of preprocessed images;
The cross-entropy Dice weighted loss function in step 2 is defined as follows:
Loss = α × L_ce + β × L_dice
where Loss denotes the cross-entropy Dice weighted loss function, L_ce the cross-entropy loss function, L_dice the Dice loss function, α the cross-entropy loss weight, and β the Dice loss weight;
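A toy check of the weighted combination Loss = α × L_ce + β × L_dice. The patent does not fix α and β; equal weights of 0.5 are assumed here only for illustration.

```python
# Weighted combination of the two loss terms. alpha = beta = 0.5 is an
# assumed default; the patent leaves the weights as chosen parameters.

def weighted_loss(l_ce: float, l_dice: float,
                  alpha: float = 0.5, beta: float = 0.5) -> float:
    return alpha * l_ce + beta * l_dice

assert abs(weighted_loss(0.4, 0.2) - 0.3) < 1e-12
```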
the cross-entropy loss function is defined as follows:
L_ce = −(1 / (NUM × M × N)) × Σᵢ Σₓ Σᵧ [ g_i,(x,y) × log p_i,(x,y) + (1 − g_i,(x,y)) × log(1 − p_i,(x,y)) ]
where NUM denotes the number of preprocessed images, M the number of rows and N the number of columns of the ith preprocessed image; g_i,(x,y) denotes the ground-truth classification label of the head and neck tumor lesion for the pixel in row x, column y of the ith group of preprocessed images (g_i,(x,y) = 0 for a normal-region pixel, g_i,(x,y) = 1 for a lesion-region pixel); and p_i,(x,y) ∈ [0, 1] denotes the predicted probability, computed by the segmentation method, that the pixel in row x, column y of the head and neck tumor prediction probability map of the ith group of preprocessed images belongs to the lesion-region class;
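In pure Python, the per-pixel binary cross-entropy described above (g the 0/1 ground-truth label, p the predicted lesion probability) can be sketched as follows; averaging over all pixels and clamping p for numerical safety are implementation assumptions, since the patent's formula image is not reproduced here.

```python
import math

# Binary cross-entropy over one small "image", matching the symbols in
# the text: g is the 0/1 ground-truth label per pixel, p the predicted
# lesion probability. The mean over all pixels and the eps clamp are
# standard implementation choices, assumed here.

def cross_entropy(labels, probs, eps=1e-7):
    total, n = 0.0, 0
    for g_row, p_row in zip(labels, probs):
        for g, p in zip(g_row, p_row):
            p = min(max(p, eps), 1 - eps)   # clamp to avoid log(0)
            total += -(g * math.log(p) + (1 - g) * math.log(1 - p))
            n += 1
    return total / n

loss = cross_entropy([[1, 0]], [[0.9, 0.1]])
# both pixels contribute -log(0.9), so the mean equals -log(0.9)
assert abs(loss + math.log(0.9)) < 1e-9
```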
the Dice loss function is defined as follows:
L_dice = 1 − (2 × Σᵢ Σₓ Σᵧ g_i,(x,y) × p_i,(x,y) + δ) / (Σᵢ Σₓ Σᵧ g_i,(x,y) + Σᵢ Σₓ Σᵧ p_i,(x,y) + δ)
where δ ∈ [0, 1] is an adjustable coefficient that keeps the gradient well defined during backpropagation;
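The standard soft-Dice form with smoothing term δ matches the adjustable coefficient described above (δ keeps the ratio, and hence the gradient, defined even when both prediction and label are empty). This is a sketch of the standard formulation, not necessarily the patent's exact expression.

```python
# Soft Dice loss with smoothing coefficient delta over one small image.
# This is the standard soft-Dice form; the patent's exact formula is an
# image in the original document and is assumed to match.

def dice_loss(labels, probs, delta=1.0):
    inter = sum(g * p for g_row, p_row in zip(labels, probs)
                      for g, p in zip(g_row, p_row))
    g_sum = sum(g for row in labels for g in row)
    p_sum = sum(p for row in probs for p in row)
    return 1.0 - (2.0 * inter + delta) / (g_sum + p_sum + delta)

# perfect overlap on a tiny mask gives zero loss
assert dice_loss([[1, 0]], [[1.0, 0.0]]) == 0.0
```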
preferably, in step 3 the pixel range of the head and neck tumor lesion area of the real-time head and neck tumor PET-CT image is obtained in combination with a probability threshold, specifically as follows:
pixels of the head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image whose predicted probability of belonging to the lesion-region class is greater than the probability threshold are screened out as lesion-region pixels, thereby obtaining the pixel range of the head and neck tumor lesion area of the real-time head and neck tumor PET-CT image.
The present invention also provides a computer readable medium storing a computer program to be executed by an electronic device; when run on the electronic device, the computer program performs the steps of the head and neck tumor lesion area image segmentation method.
The invention improves the accuracy of segmentation prediction of the head and neck tumor lesion pixel region. By performing feature fusion jointly in the encoding and decoding stages, a good segmentation effect is achieved even for head and neck tumor lesion regions of different sizes and when the difference between the lesion region and the normal region is not obvious, helping doctors determine the lesion region more accurately.
Drawings
Fig. 1: flow chart of the method of an embodiment of the invention.
Fig. 2: neural network structure diagram of an embodiment of the invention.
Detailed Description
The following clearly and completely describes the embodiments of the present invention with reference to the accompanying drawings. Apparently, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
In particular, the method of the technical solution of the present invention may be implemented by those skilled in the art as an automatic operation flow using computer software technology; a system or apparatus implementing the method, such as a computer readable storage medium storing the corresponding computer program of the technical solution, and a computer device including and running the corresponding computer program, shall also fall within the protection scope of the present invention.
The following describes a technical scheme of an embodiment of the present invention with reference to fig. 1-2, which is a head and neck tumor focus area image segmentation method, specifically as follows:
FIG. 1 is a flow chart of the method of the present invention;
step 1: acquiring a plurality of groups of original head and neck tumor PET-CT images, sequentially performing cropping, normalization and data enhancement on each group of original head and neck tumor PET-CT images to obtain each group of preprocessed images, and marking the true head and neck tumor lesion classification label of each pixel of each group of preprocessed images;
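As a minimal sketch of step 1, the preprocessing chain can be written as a center crop, a z-score normalization, and a flip-based data enhancement. The crop size (144, 144) matches the M = N = 144 used later in the loss definitions; the horizontal flip, the crop strategy and the normalization choice are illustrative assumptions, since the patent does not fix these details.

```python
import numpy as np

def preprocess_pair(pet, ct, crop_size=(144, 144), flip=False):
    """Sketch of step 1: center-crop, z-score normalize, and optionally
    flip a PET/CT slice pair (all choices here are assumptions)."""
    def center_crop(img, size):
        h, w = img.shape
        th, tw = size
        top, left = (h - th) // 2, (w - tw) // 2
        return img[top:top + th, left:left + tw]

    def z_score(img):
        # normalize to zero mean, unit variance (one common choice)
        return (img - img.mean()) / (img.std() + 1e-8)

    pet, ct = center_crop(pet, crop_size), center_crop(ct, crop_size)
    pet, ct = z_score(pet), z_score(ct)
    if flip:  # horizontal flip stands in for the data enhancement step
        pet, ct = pet[:, ::-1], ct[:, ::-1]
    return pet, ct

pet = np.random.rand(160, 160).astype(np.float32)
ct = np.random.rand(160, 160).astype(np.float32)
p, c = preprocess_pair(pet, ct, flip=True)
print(p.shape, c.shape)  # (144, 144) (144, 144)
```

In practice the same crop and normalization would be applied to every slice of a group, while the enhancement is applied only during training.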
step 2: constructing a focus image segmentation network, inputting each group of preprocessed images into the focus image segmentation network for lesion segmentation prediction to obtain the head and neck tumor prediction probability map of each group of preprocessed images, constructing a weighted cross entropy and Dice loss function by combining the true head and neck tumor lesion classification labels of the pixels of each group of preprocessed images, and obtaining a trained focus image segmentation network through optimization training with a stochastic gradient descent algorithm;
As shown in fig. 2, a neural network structure diagram of the present invention;
the lesion image segmentation network in step 2 comprises a single-mode encoding network, a fusion encoding network, a single-mode decoding network and a fusion decoding network; wherein the single-mode encoding network comprises:
the 1st coding module, the 2nd coding module, ..., the Kth coding module, and a bottleneck module;
the 1st coding module, the 2nd coding module, ..., the Kth coding module and the bottleneck module are sequentially cascaded;
the kth coding module is formed by sequentially cascading a multi-layer convolution module and a downsampling layer, k ∈ [1, K];
the bottleneck module is composed of a multi-layer convolution module;
the fusion encoding network comprises:
the 1st fusion coding module, the 2nd fusion coding module, ..., the Kth fusion coding module, and a bottleneck module;
the 1st fusion coding module, the 2nd fusion coding module, ..., the Kth fusion coding module and the bottleneck module are sequentially cascaded;
the kth fusion coding module is formed by sequentially cascading a multi-layer convolution module, an attention module and a downsampling layer, k ∈ [1, K];
the bottleneck module is composed of a multi-layer convolution module;
the single-mode decoding network comprises:
the 1st decoding module, the 2nd decoding module, ..., the Kth decoding module;
the 1st decoding module, the 2nd decoding module, ..., the Kth decoding module are sequentially cascaded;
the kth decoding module is formed by sequentially cascading an up-sampling layer, a splicing layer and a multi-layer convolution module, k ∈ [1, K];
the fusion decoding network comprises:
the 1st fusion decoding module, the 2nd fusion decoding module, ..., the Kth fusion decoding module;
the 1st fusion decoding module, the 2nd fusion decoding module, ..., the Kth fusion decoding module are sequentially cascaded;
the kth fusion decoding module is formed by sequentially cascading an up-sampling layer, a splicing layer and an inverted bottleneck convolution module, k ∈ [1, K];
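The cascade of K coding modules above can be traced as a simple shape-propagation sketch: each stage keeps the spatial size through its convolutions and then halves height and width in the downsampling layer, with the bottleneck after stage K. The starting channel count and the channel-doubling scheme are illustrative assumptions, not values given in the patent.

```python
def encoder_shapes(h, w, channels, num_stages):
    """Hypothetical sketch: trace feature-map sizes (stage, C, H, W)
    through K cascaded coding modules plus the bottleneck module."""
    shapes = []
    c = channels
    for k in range(1, num_stages + 1):
        shapes.append((k, c, h, w))   # k-th stage features, before downsampling
        h, w = h // 2, w // 2         # k-th downsampling layer halves H and W
        c *= 2                        # assumed channel-doubling per stage
    shapes.append(("bottleneck", c, h, w))
    return shapes

# e.g. K = 4 stages on a 144x144 preprocessed image, 32 starting channels
for s in encoder_shapes(144, 144, 32, 4):
    print(s)
```

The single-mode decoding network mirrors this cascade in reverse, with each up-sampling layer restoring the spatial size of the corresponding encoder stage so that channel-dimension splicing is possible.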
the 1st coding module is used for passing each mode of each group of preprocessed images through a multi-layer convolution module to obtain the 1st-stage features of each mode of each group of preprocessed images, and outputting the 1st-stage features of each mode of each group of preprocessed images to the 1st downsampling layer, the 1st fusion coding module and the 1st decoding module respectively;
after the 1st-stage features of each mode of each group of preprocessed images are output to the 1st downsampling layer, the 1st-stage downsampling features of each mode of each group of preprocessed images are obtained and output to the 2nd coding module;
if k ∈ [2, K-1], the kth coding module performs feature extraction on the (k-1)th-stage downsampling features of each mode of each group of preprocessed images through a multi-layer convolution module to obtain the kth-stage features of each mode of each group of preprocessed images, and outputs the kth-stage features of each mode of each group of preprocessed images to the kth downsampling layer, the kth fusion coding module and the kth decoding module respectively;
after the kth-stage features of each mode of each group of preprocessed images are output to the kth downsampling layer, the kth-stage downsampling features of each mode of each group of preprocessed images are obtained and output to the (k+1)th coding module;
the Kth coding module performs feature extraction on the (K-1)th-stage downsampling features of each mode of each group of preprocessed images through a multi-layer convolution module to obtain the Kth-stage features of each mode of each group of preprocessed images, and outputs the Kth-stage features of each mode of each group of preprocessed images to the Kth downsampling layer, the Kth fusion coding module and the Kth decoding module respectively;
after the Kth-stage features of each mode of each group of preprocessed images are output to the Kth downsampling layer, the Kth-stage downsampling features of each mode of each group of preprocessed images are obtained and output to the bottleneck module;
the bottleneck module performs feature extraction on the Kth-stage downsampling features of each mode of each group of preprocessed images through a multi-layer convolution module to obtain the bottleneck layer features of each mode of each group of preprocessed images, and outputs the bottleneck layer features of each mode of each group of preprocessed images to the Kth decoding module;
the 1st fusion coding module is used for splicing the modes of each group of preprocessed images in the channel direction and obtaining the 1st-stage features of each group of preprocessed images through a multi-layer convolution module, inputting the 1st-stage features of each group of preprocessed images and the 1st-stage features of each mode of each group of preprocessed images into the 1st attention module to obtain the 1st-stage fusion features of each group of preprocessed images, outputting the 1st-stage fusion features of each group of preprocessed images to the 1st downsampling layer to obtain the 1st-stage downsampling fusion features of each group of preprocessed images, and outputting the 1st-stage downsampling fusion features of each group of preprocessed images to the 2nd fusion coding module;
if k ∈ [2, K-1], the kth fusion coding module is used for obtaining the kth-stage features of each group of preprocessed images through a multi-layer convolution module, inputting the kth-stage features of each group of preprocessed images and the kth-stage features of each mode of each group of preprocessed images into the kth attention module to obtain the kth-stage fusion features of each group of preprocessed images, outputting the kth-stage fusion features of each group of preprocessed images to the kth downsampling layer to obtain the kth-stage downsampling fusion features of each group of preprocessed images, and outputting the kth-stage downsampling fusion features of each group of preprocessed images to the (k+1)th fusion coding module;
the Kth fusion coding module is used for obtaining the Kth-stage features of each group of preprocessed images through a multi-layer convolution module, inputting the Kth-stage features of each group of preprocessed images and the Kth-stage features of each mode of each group of preprocessed images into the Kth attention module to obtain the Kth-stage fusion features of each group of preprocessed images, outputting the Kth-stage fusion features of each group of preprocessed images to the Kth downsampling layer to obtain the Kth-stage downsampling fusion features of each group of preprocessed images, and outputting the Kth-stage downsampling fusion features of each group of preprocessed images to the bottleneck module;
the bottleneck module performs feature extraction on the Kth-stage downsampling fusion features of each group of preprocessed images through a multi-layer convolution module to obtain the bottleneck layer fusion features of each group of preprocessed images, and outputs the bottleneck layer fusion features of each group of preprocessed images to the Kth fusion decoding module;
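The patent states only that each fusion coding module feeds the fused stage features and the per-modality stage features into an attention module; it does not specify the attention mechanism. The following is a purely hypothetical sketch under the assumption of a simple channel-attention scheme: each modality is scored by global average pooling, the scores are softmax-normalized across modalities, and the reweighted modality features are added to the fused features.

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_with_attention(fused_feat, modal_feats):
    """Hypothetical k-th attention module (all details are assumptions).
    fused_feat: (C, H, W); modal_feats: list of (C, H, W) arrays."""
    stack = np.stack(modal_feats)            # (num_modalities, C, H, W)
    scores = stack.mean(axis=(2, 3))         # per-modality, per-channel pooled scores
    weights = softmax(scores, axis=0)[..., None, None]
    return fused_feat + (weights * stack).sum(axis=0)

pet_feat = np.random.rand(8, 16, 16)   # toy PET-branch stage features
ct_feat = np.random.rand(8, 16, 16)    # toy CT-branch stage features
fused = np.random.rand(8, 16, 16)      # toy fused stage features
out = fuse_with_attention(fused, [pet_feat, ct_feat])
print(out.shape)  # (8, 16, 16)
```

Whatever the actual mechanism, the key constraint from the text is that the output keeps the fused feature's shape so it can enter the kth downsampling layer.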
the Kth decoding module up-samples the bottleneck layer features of each mode of each group of preprocessed images to obtain the Kth-stage up-sampling features of each mode of each group of preprocessed images, performs channel-dimension splicing of the Kth-stage up-sampling features of each mode of each group of preprocessed images with the Kth-stage features of each mode of each group of preprocessed images, obtains the Kth-stage up-sampling splicing features of each mode of each group of preprocessed images through a multi-layer convolution module, and outputs the Kth-stage up-sampling splicing features of each mode of each group of preprocessed images to the (K-1)th decoding module and the (K-1)th fusion decoding module respectively;
if k ∈ [2, K-1], the kth decoding module up-samples the (k+1)th-stage up-sampling splicing features of each mode of each group of preprocessed images to obtain the kth-stage up-sampling features of each mode of each group of preprocessed images, performs channel-dimension splicing of the kth-stage up-sampling features of each mode of each group of preprocessed images with the kth-stage features of each mode of each group of preprocessed images, obtains the kth-stage up-sampling splicing features of each mode of each group of preprocessed images through a multi-layer convolution module, and outputs the kth-stage up-sampling splicing features of each mode of each group of preprocessed images to the (k-1)th decoding module and the (k-1)th fusion decoding module respectively;
the 1st decoding module up-samples the 2nd-stage up-sampling splicing features of each mode of each group of preprocessed images to obtain the 1st-stage up-sampling features of each mode of each group of preprocessed images, performs channel-dimension splicing of the 1st-stage up-sampling features of each mode of each group of preprocessed images with the 1st-stage features of each mode of each group of preprocessed images, obtains the 1st-stage up-sampling splicing features of each mode of each group of preprocessed images through a multi-layer convolution module, and outputs the 1st-stage up-sampling splicing features of each mode of each group of preprocessed images to the 1st fusion decoding module;
the Kth fusion decoding module up-samples the bottleneck layer fusion features of each group of preprocessed images to obtain the Kth-stage up-sampling fusion features of each group of preprocessed images, performs channel-dimension splicing of the Kth-stage up-sampling fusion features of each group of preprocessed images with the Kth-stage up-sampling splicing features of each mode of each group of preprocessed images, obtains the Kth-stage up-sampling splicing fusion features of each group of preprocessed images through the Kth inverted bottleneck convolution module, and outputs the Kth-stage up-sampling splicing fusion features of each group of preprocessed images to the (K-1)th fusion decoding module;
if k ∈ [2, K-1], the kth fusion decoding module up-samples the (k+1)th-stage up-sampling splicing fusion features of each group of preprocessed images to obtain the kth-stage up-sampling fusion features of each group of preprocessed images, performs channel-dimension splicing of the kth-stage up-sampling fusion features of each group of preprocessed images with the kth-stage up-sampling splicing features of each mode of each group of preprocessed images, obtains the kth-stage up-sampling splicing fusion features of each group of preprocessed images through the kth inverted bottleneck convolution module, and outputs the kth-stage up-sampling splicing fusion features of each group of preprocessed images to the (k-1)th fusion decoding module;
the 1st fusion decoding module up-samples the 2nd-stage up-sampling splicing fusion features of each group of preprocessed images to obtain the 1st-stage up-sampling fusion features of each group of preprocessed images, performs channel-dimension splicing of the 1st-stage up-sampling fusion features of each group of preprocessed images with the 1st-stage up-sampling splicing features of each mode of each group of preprocessed images, obtains the 1st-stage up-sampling splicing fusion features of each group of preprocessed images through the 1st inverted bottleneck convolution module, and convolves the 1st-stage up-sampling splicing fusion features of each group of preprocessed images to obtain the head and neck tumor prediction probability map of each group of preprocessed images;
the weighted cross entropy and Dice loss function in step 2 is defined as follows:
Loss = α × L_ce + β × L_dice
where Loss represents the weighted loss function, L_ce represents the cross entropy loss function, L_dice represents the Dice loss function, and α = 0.5, β = 0.5;
the cross entropy loss function is defined as follows:
L_ce = −(1 / (NUM × M × N)) × Σ_{i=1..NUM} Σ_{x=1..M} Σ_{y=1..N} [ g_i,(x,y) × ln p_i,(x,y) + (1 − g_i,(x,y)) × ln(1 − p_i,(x,y)) ]
where NUM = 450, M = 144, N = 144; g_i,(x,y) represents the true head and neck tumor lesion classification label of the pixel at row x, column y of the i-th group of preprocessed images: if g_i,(x,y) = 0 the pixel is a normal-region pixel, and if g_i,(x,y) = 1 it is a lesion-region pixel; p_i,(x,y) ∈ [0, 1] represents the predicted probability, computed by the segmentation method, that the pixel at row x, column y of the head and neck tumor prediction probability map of the i-th group of preprocessed images belongs to the lesion region;
the Dice loss function is defined as follows:
L_dice = 1 − (2 × Σ_i Σ_x Σ_y g_i,(x,y) × p_i,(x,y) + δ) / (Σ_i Σ_x Σ_y g_i,(x,y) + Σ_i Σ_x Σ_y p_i,(x,y) + δ)
where δ ∈ [0, 1] and δ = 0.00005;
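The weighted loss of step 2, with the embodiment's values α = β = 0.5 and δ = 0.00005, can be sketched directly in NumPy. The clipping constant `eps` is an implementation detail added here to keep the logarithm finite; everything else follows the definitions above.

```python
import numpy as np

def weighted_ce_dice_loss(p, g, alpha=0.5, beta=0.5, delta=5e-5):
    """Sketch of Loss = alpha * L_ce + beta * L_dice.
    p: predicted lesion probabilities p_{i,(x,y)} in [0, 1], shape (NUM, M, N).
    g: true labels g_{i,(x,y)} in {0, 1}, same shape."""
    eps = 1e-7  # numerical guard for log (an assumption, not from the patent)
    p = np.clip(p, eps, 1.0 - eps)
    # cross entropy averaged over all NUM * M * N pixels
    l_ce = -np.mean(g * np.log(p) + (1.0 - g) * np.log(1.0 - p))
    # Dice loss with smoothing coefficient delta
    l_dice = 1.0 - (2.0 * np.sum(g * p) + delta) / (np.sum(g) + np.sum(p) + delta)
    return alpha * l_ce + beta * l_dice

g = np.zeros((2, 4, 4)); g[:, 1:3, 1:3] = 1.0   # toy ground-truth lesion masks
perfect = weighted_ce_dice_loss(g.copy(), g)     # prediction equal to the label
print(float(perfect))
```

A prediction equal to the label drives both terms to (nearly) zero, while an inverted prediction drives the cross entropy term toward −ln(eps), which is what makes the weighted sum usable as a training signal.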
step 3: performing prediction segmentation on the head and neck tumor PET-CT image acquired in real time through the trained focus image segmentation network to obtain the head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image, and obtaining the pixel range of the head and neck tumor lesion region of the real-time head and neck tumor PET-CT image by probability threshold judgment;
the pixel range of the head and neck tumor lesion region of the real-time head and neck tumor PET-CT image in step 3 is obtained by probability threshold judgment as follows:
pixels of the head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image whose predicted probability of belonging to the head and neck tumor lesion region is greater than the probability threshold are screened out as lesion-region pixels, thereby obtaining the pixel range of the head and neck tumor lesion region of the real-time head and neck tumor PET-CT image;
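The screening in step 3 is a per-pixel threshold test on the prediction probability map. A minimal sketch, assuming a threshold of 0.5 (the patent leaves the actual threshold value unspecified):

```python
import numpy as np

def lesion_pixel_range(prob_map, threshold=0.5):
    """Step 3 sketch: keep pixels whose predicted lesion probability
    exceeds the probability threshold. Returns a boolean lesion mask
    and the (row, col) coordinates of the lesion-region pixels."""
    mask = prob_map > threshold
    coords = np.argwhere(mask)   # row-major list of lesion pixel positions
    return mask, coords

prob = np.array([[0.1, 0.9],
                 [0.6, 0.2]])
mask, coords = lesion_pixel_range(prob, threshold=0.5)
print(mask)
print(coords.tolist())  # [[0, 1], [1, 0]]
```

The coordinate list is exactly the "pixel range" of the lesion region for the given image.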
the segmentation results of the methods obtained by taking the dess coefficient as a measurement index are as follows:
Ours | UNet | Att-UNet | UNETR | SwinUNet | |
tumor major region Dice | 0.7931 | 0.7868 | 0.7639 | 0.7967 | 0.7847 |
Lymph node area Dice | 0.7384 | 0.7072 | 0.6910 | 0.6932 | 0.6891 |
Average Dice | 0.7657 | 0.7474 | 0.7275 | 0.7450 | 0.7369 |
The Dice coefficient is defined as follows:
Dice = 2 × |X ∩ Y| / (|X| + |Y|)
where X denotes the set of predicted lesion-region pixels and Y denotes the set of true lesion-region pixels.
From the experimental results, when predicting the head and neck tumor lesion region of head and neck tumor PET-CT images, the average Dice coefficient of the present method is significantly higher than those of the other methods. The Dice coefficient of the tumor main region is higher than those of UNet, Att-UNet and SwinUNet, and only slightly lower than that of UNETR (by 0.0036); the Dice coefficient of the lymph node region is significantly higher than those of UNet (by 0.0312), Att-UNet (by 0.0474), UNETR (by 0.0452) and SwinUNet (by 0.0493). Because the method performs feature fusion in both the encoding stage and the decoding stage, the network's ability to recognize the lymph node region in head and neck tumor PET-CT images is effectively improved, which in turn improves the segmentation of the whole head and neck tumor lesion region.
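The Dice coefficient reported in the table above compares a predicted binary lesion mask with the ground-truth mask; a small sketch:

```python
import numpy as np

def dice_coefficient(pred_mask, true_mask):
    """Dice = 2|X ∩ Y| / (|X| + |Y|), where X is the set of predicted
    lesion-region pixels and Y the set of true lesion-region pixels."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    denom = pred.sum() + true.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return 2.0 * np.logical_and(pred, true).sum() / denom

a = np.array([[1, 1, 0],
              [0, 1, 0]])   # predicted lesion mask
b = np.array([[1, 0, 0],
              [0, 1, 1]])   # ground-truth lesion mask
print(dice_coefficient(a, b))  # 2*2 / (3+3) = 0.666...
```

The per-region scores in the table would be obtained by computing this coefficient separately on the tumor main region masks and on the lymph node region masks.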
Particular embodiments of the present invention also provide a computer readable medium.
The computer readable medium is a server workstation;
the server workstation stores a computer program to be executed by an electronic device; when run on the electronic device, the computer program causes the electronic device to execute the steps of the head and neck tumor lesion area image segmentation method of the embodiments of the invention.
It should be understood that the parts of the specification not described in detail herein belong to the prior art.
It should be understood that the foregoing description of the preferred embodiments is not intended to limit the scope of patent protection of the invention, which is defined by the appended claims; those skilled in the art may make substitutions or modifications without departing from the scope of the claims, and such substitutions and modifications shall all fall within the protection scope of the invention.
Claims (8)
1. A head and neck tumor focus area image segmentation method is characterized in that:
constructing a focus image segmentation network, inputting each group of preprocessed images into the focus image segmentation network for prediction, constructing a weighted cross entropy and Dice loss function, and performing optimization training with a stochastic gradient descent algorithm to obtain a trained focus image segmentation network;
and carrying out prediction segmentation through a trained focus image segmentation network to obtain a head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image, and further combining with probability threshold judgment to obtain the head and neck tumor focus region pixel range of the real-time head and neck tumor PET-CT image.
2. The method for segmenting an image of a focal region of a head and neck tumor according to claim 1, comprising the steps of:
Step 1: acquiring a plurality of groups of original head and neck tumor PET-CT images, sequentially performing cropping, normalization and data enhancement on each group of original head and neck tumor PET-CT images to obtain each group of preprocessed images, and marking the true head and neck tumor lesion classification label of each pixel of each group of preprocessed images;
step 2: constructing a focus image segmentation network, inputting each group of preprocessed images into the focus image segmentation network for lesion segmentation prediction to obtain the head and neck tumor prediction probability map of each group of preprocessed images, constructing a weighted cross entropy and Dice loss function by combining the true head and neck tumor lesion classification labels of the pixels of each group of preprocessed images, and obtaining a trained focus image segmentation network through optimization training with a stochastic gradient descent algorithm;
step 3: and carrying out prediction segmentation on the head and neck tumor PET-CT image acquired in real time through a trained focus image segmentation network to obtain a head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image, and judging by combining a probability threshold value to obtain the head and neck tumor focus region pixel range of the real-time head and neck tumor PET-CT image.
3. The head and neck tumor lesion image segmentation method according to claim 2, wherein:
The lesion image segmentation network in step 2 comprises:
a single-mode encoding network, a fusion encoding network, a single-mode decoding network, and a fusion decoding network;
the single-mode coding network performs feature extraction processing on each group of preprocessed images to obtain intermediate feature representation and final feature representation of each group of preprocessed images, outputs the intermediate feature representation of each group of preprocessed images to the fusion coding network, and outputs the final feature representation of each group of preprocessed images to the single-mode decoding network;
the fusion coding network performs feature fusion processing on the intermediate feature representation of each group of preprocessed images to obtain fusion coding features of each group of preprocessed images, and outputs the fusion coding features of each group of preprocessed images to the fusion decoding network;
the single-mode decoding network is used for decoding the final characteristic representation of each group of preprocessed images to obtain the decoding characteristic representation of each group of preprocessed images, and outputting the decoding characteristic representation of each group of preprocessed images to the fusion decoding network;
and the fusion decoding network performs fusion decoding processing on the decoding characteristic representation of each group of preprocessed images to finally obtain the predicted focus region segmentation image of each preprocessed image.
4. The method for segmenting an image of a focal region of a head and neck tumor according to claim 3, wherein:
the single-mode encoding network comprises:
the 1st coding module, the 2nd coding module, ..., the Kth coding module, and a bottleneck module;
the 1st coding module, the 2nd coding module, ..., the Kth coding module and the bottleneck module are sequentially cascaded;
the kth coding module is formed by sequentially cascading a multi-layer convolution module and a downsampling layer, k ∈ [1, K];
the bottleneck module is composed of a multi-layer convolution module;
the fusion encoding network comprises:
the 1st fusion coding module, the 2nd fusion coding module, ..., the Kth fusion coding module, and a bottleneck module;
the 1st fusion coding module, the 2nd fusion coding module, ..., the Kth fusion coding module and the bottleneck module are sequentially cascaded;
the kth fusion coding module is formed by sequentially cascading a multi-layer convolution module, an attention module and a downsampling layer, k ∈ [1, K];
the bottleneck module is composed of a multi-layer convolution module;
the single-mode decoding network comprises:
the 1st decoding module, the 2nd decoding module, ..., the Kth decoding module;
the 1st decoding module, the 2nd decoding module, ..., the Kth decoding module are sequentially cascaded;
the kth decoding module is formed by sequentially cascading an up-sampling layer, a splicing layer and a multi-layer convolution module, k ∈ [1, K];
the fusion decoding network comprises:
the 1st fusion decoding module, the 2nd fusion decoding module, ..., the Kth fusion decoding module;
the 1st fusion decoding module, the 2nd fusion decoding module, ..., the Kth fusion decoding module are sequentially cascaded;
the kth fusion decoding module is formed by sequentially cascading an up-sampling layer, a splicing layer and an inverted bottleneck convolution module, k ∈ [1, K].
5. The method for segmenting an image of a focal region of a head and neck tumor according to claim 4, wherein:
the 1 st coding module is used for respectively enabling each mode of each group of preprocessed images to pass through the multi-layer convolution module to obtain 1 st phase characteristics of each mode of each group of preprocessed images, and respectively outputting the 1 st phase characteristics of each mode of each group of preprocessed images to the 1 st downsampling layer, the 1 st fusion coding module and the 1 st decoding module;
after the 1 st stage feature of each mode of each group of preprocessed images is output to the 1 st downsampling layer, the 1 st stage downsampling feature of each mode of each group of preprocessed images is obtained, and the 1 st stage downsampling feature of each mode of each group of preprocessed images is output to the 2 nd coding module;
If k is [2,K-1], the kth coding module performs feature extraction on the downsampling features of the kth-1 stage of each mode of each group of preprocessed images through a multi-layer convolution module to obtain the kth stage features of each mode of each group of preprocessed images, and outputs the kth stage features of each mode of each group of preprocessed images to the kth downsampling layer, the kth fusion coding module and the kth decoding module respectively;
after the characteristics of the kth stage of each mode of each group of preprocessed images are output to the kth downsampling layer, downsampling characteristics of the kth stage of each mode of each group of preprocessed images are obtained, and the downsampling characteristics of the kth stage of each mode of each group of preprocessed images are output to the kth+1th coding module;
the Kth coding module extracts the down-sampling characteristics of each mode Kth stage of each group of preprocessed images through a multi-layer convolution module to obtain each mode Kth stage characteristic of each group of preprocessed images, and outputs each mode Kth stage characteristic of each group of preprocessed images to the Kth down-sampling layer, the Kth fusion coding module and the Kth decoding module respectively;
after the characteristics of the K stage of each mode of each group of preprocessed images are output to the K downsampling layer, the downsampling characteristics of the K stage of each mode of each group of preprocessed images are obtained, and the downsampling characteristics of the K stage of each mode of each group of preprocessed images are output to the bottleneck module;
The bottleneck module is used for extracting the downsampling characteristics of each mode K stage of each group of preprocessed images through the multi-layer convolution module to obtain each mode bottleneck layer characteristic of each group of preprocessed images, and outputting each mode bottleneck layer characteristic of each group of preprocessed images to the K decoding module;
the 1 st fusion coding module is used for splicing all modes of each group of preprocessed images in the channel direction, obtaining 1 st stage characteristics of each group of preprocessed images through the multi-layer convolution module, inputting the 1 st stage characteristics of each group of preprocessed images and the 1 st stage characteristics of all modes of each group of preprocessed images into the 1 st attention module, obtaining 1 st stage fusion characteristics of each group of preprocessed images, outputting the 1 st stage fusion characteristics of each group of preprocessed images to the 1 st downsampling layer, obtaining 1 st stage downsampling fusion characteristics of each group of preprocessed images, and outputting the 1 st stage downsampling fusion characteristics of each group of preprocessed images to the 2 nd fusion coding network;
if k epsilon [2,K-1], the kth fusion coding module is used for obtaining the kth phase characteristic of each group of preprocessed images through a multi-layer convolution module, inputting the kth phase characteristic of each group of preprocessed images and the kth phase characteristic of each mode of each group of preprocessed images into the kth attention module to obtain the kth phase fusion characteristic of each group of preprocessed images, outputting the kth phase fusion characteristic of each group of preprocessed images to a kth downsampling layer to obtain the downsampling fusion characteristic of the kth phase of each group of preprocessed images, and outputting the downsampling fusion characteristic of the kth phase of each group of preprocessed images to the kth+1th fusion coding network;
The Kth fusion coding module is used for obtaining the Kth phase characteristics of each group of preprocessed images through the multi-layer convolution module, inputting the Kth phase characteristics of each group of preprocessed images and the Kth phase characteristics of each mode of each group of preprocessed images to the Kth attention module to obtain the Kth phase fusion characteristics of each group of preprocessed images, outputting the Kth phase fusion characteristics of each group of preprocessed images to the Kth downsampling layer to obtain the downsampling fusion characteristics of the Kth phase of each group of preprocessed images, and outputting the downsampling fusion characteristics of the Kth phase of each group of preprocessed images to the bottleneck module;
the bottleneck module is used for extracting the downsampling fusion characteristics of the K-th stage of each group of preprocessed images through the multi-layer convolution module to obtain bottleneck layer fusion characteristics of each group of preprocessed images, and outputting the bottleneck layer fusion characteristics of each group of preprocessed images to the K-th fusion decoding module;
the Kth decoding module is used for upsampling the bottleneck-layer features of each modality of each group of preprocessed images to obtain the Kth-stage upsampled features of each modality of each group of preprocessed images, performing channel-dimension concatenation of the Kth-stage upsampled features of each modality of each group of preprocessed images with the Kth-stage features of each modality of each group of preprocessed images, obtaining the Kth-stage upsampled concatenated features of each modality of each group of preprocessed images through the multi-layer convolution module, and outputting the Kth-stage upsampled concatenated features of each modality of each group of preprocessed images to the (K-1)th decoding module and the Kth fusion decoding module respectively;
if k ∈ [2, K-1], the kth decoding module upsamples the (k+1)th-stage upsampled concatenated features of each modality of each group of preprocessed images to obtain the kth-stage upsampled features of each modality of each group of preprocessed images, performs channel-dimension concatenation of the kth-stage upsampled features of each modality of each group of preprocessed images with the kth-stage features of each modality of each group of preprocessed images, obtains the kth-stage upsampled concatenated features of each modality of each group of preprocessed images through the multi-layer convolution module, and outputs the kth-stage upsampled concatenated features of each modality of each group of preprocessed images to the (k-1)th decoding module and the kth fusion decoding module respectively;
the 1st decoding module upsamples the 2nd-stage upsampled concatenated features of each modality of each group of preprocessed images to obtain the 1st-stage upsampled features of each modality of each group of preprocessed images, performs channel-dimension concatenation of the 1st-stage upsampled features of each modality of each group of preprocessed images with the 1st-stage features of each modality of each group of preprocessed images, obtains the 1st-stage upsampled concatenated features of each modality of each group of preprocessed images through the multi-layer convolution module, and outputs the 1st-stage upsampled concatenated features of each modality of each group of preprocessed images to the 1st fusion decoding module;
the Kth fusion decoding module upsamples the bottleneck-layer fusion features of each group of preprocessed images to obtain the Kth-stage upsampled fusion features of each group of preprocessed images, performs channel-dimension concatenation of the Kth-stage upsampled fusion features of each group of preprocessed images with the Kth-stage upsampled concatenated features of each modality of each group of preprocessed images, obtains the Kth-stage upsampled concatenated fusion features of each group of preprocessed images through the Kth inverted bottleneck convolution module, and outputs the Kth-stage upsampled concatenated fusion features of each group of preprocessed images to the (K-1)th fusion decoding module;
if k ∈ [2, K-1], the kth fusion decoding module upsamples the (k+1)th-stage upsampled concatenated fusion features of each group of preprocessed images to obtain the kth-stage upsampled fusion features of each group of preprocessed images, performs channel-dimension concatenation of the kth-stage upsampled fusion features of each group of preprocessed images with the kth-stage upsampled concatenated features of each modality of each group of preprocessed images, obtains the kth-stage upsampled concatenated fusion features of each group of preprocessed images through the kth inverted bottleneck convolution module, and outputs the kth-stage upsampled concatenated fusion features of each group of preprocessed images to the (k-1)th fusion decoding module;
the 1st fusion decoding module upsamples the 2nd-stage upsampled concatenated fusion features of each group of preprocessed images to obtain the 1st-stage upsampled fusion features of each group of preprocessed images, performs channel-dimension concatenation of the 1st-stage upsampled fusion features of each group of preprocessed images with the 1st-stage upsampled concatenated features of each modality of each group of preprocessed images, obtains the 1st-stage upsampled concatenated fusion features of each group of preprocessed images through the 1st inverted bottleneck convolution module, and convolves the 1st-stage upsampled concatenated fusion features of each group of preprocessed images to obtain the head and neck tumor prediction probability map of each group of preprocessed images.
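As an illustrative sketch only (not the patented implementation), one fusion encoding stage can be pictured as: the per-modality features and the fused-path feature enter an attention module, and the fused result is downsampled before being passed to the next stage. The `attention_fuse` weighting below is a hypothetical stand-in for the patent's attention module, and the 2x2 max pooling stands in for the downsampling layer:

```python
import numpy as np

def attention_fuse(fused, modal_feats):
    """Hypothetical attention fusion: weight each modality's feature map
    by a softmax over its global average response, then add the weighted
    sum to the fused path. Stands in for the kth attention module."""
    scores = np.array([m.mean() for m in modal_feats])
    w = np.exp(scores - scores.max())
    w = w / w.sum()
    return fused + sum(wi * m for wi, m in zip(w, modal_feats))

def downsample(x):
    """2x2 max pooling over the spatial dims of a (C, H, W) tensor,
    standing in for the kth downsampling layer (H, W assumed even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

# one encoder stage: fuse the CT and PET features into the fused path,
# then downsample the fusion result for the next stage
fused_k = np.zeros((8, 16, 16))           # fused-path feature at stage k
ct_k, pet_k = np.ones((8, 16, 16)), np.ones((8, 16, 16))
down_k = downsample(attention_fuse(fused_k, [ct_k, pet_k]))
```

After this stage, `down_k` has half the spatial resolution and is handed to the (k+1)th fusion coding module.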
6. The method for segmenting an image of a focal region of a head and neck tumor according to claim 5, wherein:
the cross entropy and Dice weighted loss function in step 2 is defined as follows:
Loss = α × L_ce + β × L_dice

where Loss represents the cross entropy and Dice weighted loss function, L_ce represents the cross entropy loss function, L_dice represents the Dice loss function, α represents the cross entropy loss weight, and β represents the Dice loss weight;
the cross entropy loss function is defined as follows:
L_ce = -(1 / (NUM × M × N)) × Σ_{i=1}^{NUM} Σ_{x=1}^{M} Σ_{y=1}^{N} [g_{i,(x,y)} × log(p_{i,(x,y)}) + (1 - g_{i,(x,y)}) × log(1 - p_{i,(x,y)})]

where NUM represents the number of groups of preprocessed images, M represents the number of rows of the i-th group of preprocessed images, N represents the number of columns of the i-th group of preprocessed images, g_{i,(x,y)} represents the true head and neck tumor focus classification label of the pixel in row x, column y of the i-th group of preprocessed images: if g_{i,(x,y)} = 0 the pixel is a normal-region pixel, and if g_{i,(x,y)} = 1 the pixel is a focus-region pixel; p_{i,(x,y)} ∈ [0, 1], where p_{i,(x,y)} represents the predicted probability, computed by the segmentation method, that the pixel in row x, column y of the head and neck tumor prediction probability map of the i-th group of preprocessed images belongs to the head and neck tumor focus region;
the Dice loss function is defined as follows:
L_dice = 1 - (2 × Σ_{i=1}^{NUM} Σ_{x=1}^{M} Σ_{y=1}^{N} g_{i,(x,y)} × p_{i,(x,y)} + δ) / (Σ_{i=1}^{NUM} Σ_{x=1}^{M} Σ_{y=1}^{N} g_{i,(x,y)} + Σ_{i=1}^{NUM} Σ_{x=1}^{M} Σ_{y=1}^{N} p_{i,(x,y)} + δ)

where δ ∈ [0, 1], and δ represents an adjustable coefficient for gradient propagation (a smoothing term that keeps the loss and its gradient well defined when both masks are empty).
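The weighted loss of claim 6 can be sketched in NumPy as follows. This is a minimal illustration assuming binary labels and the standard forms of the two losses; the `alpha`, `beta`, and `delta` defaults are placeholders, not values fixed by the patent:

```python
import numpy as np

def cross_entropy_loss(g, p, eps=1e-7):
    """Binary cross entropy averaged over all groups and pixels.
    g, p: arrays of shape (NUM, M, N); g holds labels in {0, 1},
    p holds predicted probabilities in [0, 1]."""
    p = np.clip(p, eps, 1 - eps)  # avoid log(0)
    return -np.mean(g * np.log(p) + (1 - g) * np.log(1 - p))

def dice_loss(g, p, delta=1.0):
    """Dice loss with smoothing coefficient delta, which keeps the
    ratio (and its gradient) defined when both masks are empty."""
    inter = (g * p).sum()
    return 1.0 - (2.0 * inter + delta) / (g.sum() + p.sum() + delta)

def weighted_loss(g, p, alpha=0.5, beta=0.5, delta=1.0):
    """Loss = alpha * L_ce + beta * L_dice."""
    return alpha * cross_entropy_loss(g, p) + beta * dice_loss(g, p, delta)
```

For a perfect prediction (g equal to p with hard labels), the Dice term is exactly zero and the cross entropy term is negligibly small, so the weighted loss approaches zero.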
7. The method for segmenting the image of the focal region of the head and neck tumor according to claim 6, wherein:
in step 3, the head and neck tumor focus region pixel range of the real-time head and neck tumor PET-CT image is obtained by judging against the probability threshold, as follows:
pixels in the head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image whose predicted probability of belonging to the head and neck tumor focus region is greater than the probability threshold are screened out as focus-region pixels, thereby obtaining the head and neck tumor focus region pixel range of the real-time head and neck tumor PET-CT image.
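The screening step above amounts to a simple threshold comparison over the probability map; a minimal sketch (the threshold value is assumed here, not specified by the claim):

```python
import numpy as np

def lesion_pixels(prob_map, threshold=0.5):
    """Return (row, col) coordinates of pixels whose predicted focus-region
    probability exceeds the threshold, in row-major scan order."""
    rows, cols = np.where(prob_map > threshold)
    return list(zip(rows.tolist(), cols.tolist()))
```

The returned coordinate list is the pixel range of the predicted focus region for one probability map.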
8. A computer readable medium, characterized in that it stores a computer program for execution by an electronic device, and when the computer program runs on the electronic device, the electronic device is caused to perform the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311383312.0A CN117291935A (en) | 2023-10-23 | 2023-10-23 | Head and neck tumor focus area image segmentation method and computer readable medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117291935A true CN117291935A (en) | 2023-12-26 |
Family
ID=89248105
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311383312.0A Pending CN117291935A (en) | 2023-10-23 | 2023-10-23 | Head and neck tumor focus area image segmentation method and computer readable medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117291935A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118096773A (en) * | 2024-04-29 | 2024-05-28 | Dongguan People's Hospital | Intratumoral and peritumoral habitat analysis method, device, equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111784671B (en) | Pathological image focus region detection method based on multi-scale deep learning | |
CN111369565B (en) | Digital pathological image segmentation and classification method based on graph convolution network | |
CN111429473B (en) | Chest film lung field segmentation model establishment and segmentation method based on multi-scale feature fusion | |
CN115661144B (en) | Adaptive medical image segmentation method based on deformable U-Net | |
CN111951288B (en) | Skin cancer lesion segmentation method based on deep learning | |
CN112258488A (en) | Medical image focus segmentation method | |
CN112446892A (en) | Cell nucleus segmentation method based on attention learning | |
CN112767417B (en) | Multi-modal image segmentation method based on cascaded U-Net network | |
CN111091575B (en) | Medical image segmentation method based on reinforcement learning method | |
CN113539402B (en) | Multi-mode image automatic sketching model migration method | |
CN114219943A (en) | CT image organ-at-risk segmentation system based on deep learning | |
CN112862830A (en) | Multi-modal image segmentation method, system, terminal and readable storage medium | |
CN114266794B (en) | Pathological section image cancer region segmentation system based on full convolution neural network | |
CN117291935A (en) | Head and neck tumor focus area image segmentation method and computer readable medium | |
CN115375711A (en) | Image segmentation method of global context attention network based on multi-scale fusion | |
Shan et al. | SCA-Net: A spatial and channel attention network for medical image segmentation | |
CN115546466A (en) | Weak supervision image target positioning method based on multi-scale significant feature fusion | |
CN113538363A (en) | Lung medical image segmentation method and device based on improved U-Net | |
CN117437423A (en) | Weak supervision medical image segmentation method and device based on SAM collaborative learning and cross-layer feature aggregation enhancement | |
CN110992309B (en) | Fundus image segmentation method based on deep information transfer network | |
CN116883341A (en) | Liver tumor CT image automatic segmentation method based on deep learning | |
Nie et al. | Semantic-guided encoder feature learning for blurry boundary delineation | |
US20230115927A1 (en) | Systems and methods for plaque identification, plaque composition analysis, and plaque stability detection | |
US20230162353A1 (en) | Multistream fusion encoder for prostate lesion segmentation and classification | |
Ru et al. | A dermoscopic image segmentation algorithm based on U-shaped architecture |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||