CN117291935A - Head and neck tumor focus area image segmentation method and computer readable medium - Google Patents

Head and neck tumor focus area image segmentation method and computer readable medium

Info

Publication number
CN117291935A
Authority
CN
China
Prior art keywords
group
kth
preprocessed images
fusion
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311383312.0A
Other languages
Chinese (zh)
Inventor
蔡贤涛
李欢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202311383312.0A priority Critical patent/CN117291935A/en
Publication of CN117291935A publication Critical patent/CN117291935A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10104Positron emission tomography [PET]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30016Brain
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30096Tumor; Lesion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a head and neck tumor focus area image segmentation method and a computer readable medium. The method comprises: acquiring a plurality of groups of original head and neck tumor PET-CT images, preprocessing them in sequence to obtain each group of preprocessed images, and marking the corresponding real classification labels; constructing a focus image segmentation network, performing focus segmentation prediction on each group of preprocessed images to obtain a head and neck tumor prediction probability map of each group of preprocessed images, constructing a cross-entropy Dice weighted loss function from the real classification label of the head and neck tumor focus of each pixel of each group of preprocessed images, and obtaining a trained focus image segmentation network through optimization training with a stochastic gradient descent algorithm; and performing prediction segmentation and probability threshold judgment on head and neck tumor PET-CT images acquired in real time through the trained focus image segmentation network to obtain the pixel range of the head and neck tumor focus region in real time. The invention exploits the complementary information among multiple modalities and improves the accuracy of segmentation prediction for the head and neck tumor focus pixel region.

Description

Head and neck tumor focus area image segmentation method and computer readable medium
Technical Field
The invention belongs to the technical field of medical image processing, and particularly relates to a head and neck tumor focus area image segmentation method and a computer readable medium.
Background
Head and neck tumors are the fifth most common cancer type worldwide. At present, PET-CT combines positron emission tomography (PET) and X-ray computed tomography (CT): PET provides molecular information such as the function and metabolism of a focus, while CT provides accurate anatomical localization of the focus, thereby assisting the diagnosis of head and neck tumors. In the clinic, doctors mainly evaluate the tumor condition by observing the position, shape, size and boundary of a focus, and then formulate a treatment scheme. Head and neck tumor focus segmentation can assist doctors in completing clinical diagnosis and is therefore an important problem in medical image analysis. The problem of accurately segmenting head and neck tumor focuses in PET-CT images remains unsolved, mainly for the following reasons:
PET and CT images provide different biological and anatomical information; however, how to effectively fuse these two different modalities remains a challenging problem;
head and neck tumors show great heterogeneity in shape, size and intensity owing to different degrees of disease, and there may be blurred boundaries or overlapping regions between the tumor body and the surrounding tissue, which increases the complexity of segmentation;
some focus boundaries are blurred, and the annotations of different regions of a focus made by different radiologists may be inconsistent. In addition, manual annotation relies on the expertise and experience of the imaging specialist, is time-consuming and laborious, and may introduce inter-observer subjectivity.
Therefore, the establishment of an accurate automatic segmentation method for head and neck tumor lesions has great significance for clinical diagnosis and disease management. In recent years, deep learning-based methods have exhibited good performance in solving various computer vision problems (e.g., image classification, object detection, and semantic segmentation). Many methods have been applied to head and neck tumor lesion segmentation and have achieved impressive segmentation effects.
Convolutional neural networks represented by U-Net have a classical symmetric encoder-decoder structure and preserve high-resolution details through skip connections. U-Net has achieved remarkable performance in medical image segmentation tasks thanks to its strong feature extraction ability and the inductive bias it exhibits on smaller datasets. Multi-modal medical images have the advantage of providing different, complementary information, making the judgment of the focus more accurate; however, if the multi-modal images are simply concatenated and then fed into a U-shaped network as input, the complementary information among the modalities cannot be fully exploited.
Disclosure of Invention
In order to solve the technical problems, the invention provides a head and neck tumor focus area image segmentation method and a computer readable medium.
The technical scheme of the method is a head and neck tumor focus area image segmentation method, which is characterized in that:
constructing a focus image segmentation network, inputting each group of preprocessed images into the focus image segmentation network for prediction, constructing a cross-entropy Dice weighted loss function, and performing optimization training with a stochastic gradient descent algorithm to obtain a trained focus image segmentation network;
and carrying out prediction segmentation through a trained focus image segmentation network to obtain a head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image, and further combining with probability threshold judgment to obtain the head and neck tumor focus region pixel range of the real-time head and neck tumor PET-CT image.
The technical scheme of the method specifically comprises the following steps:
step 1: acquiring a plurality of groups of original head and neck tumor PET-CT images, sequentially performing cropping, normalization, concatenation and data augmentation on each group of original head and neck tumor PET-CT images to obtain each group of preprocessed images, and marking the real classification label of the head and neck tumor focus for each pixel of each group of preprocessed images;
Step 2: constructing a focus image segmentation network, inputting each group of preprocessed images into the focus image segmentation network for focus segmentation prediction to obtain a head and neck tumor prediction probability map of each group of preprocessed images, constructing a cross-entropy Dice weighted loss function by combining the real classification label of the head and neck tumor focus of each pixel of each group of preprocessed images, and obtaining a trained focus image segmentation network through optimization training with a stochastic gradient descent algorithm;
step 3: predictive segmentation is carried out on the PET-CT image of the head and neck tumor acquired in real time through a focus image segmentation network after training, a head and neck tumor predictive probability map of the PET-CT image of the head and neck tumor is obtained, and a head and neck tumor focus area pixel range of the PET-CT image of the head and neck tumor is obtained by combining probability threshold judgment;
preferably, the lesion image segmentation network of step 2 comprises:
a single-mode encoding network, a fusion encoding network, a single-mode decoding network, and a fusion decoding network;
the single-mode coding network performs feature extraction processing on each group of preprocessed images to obtain intermediate feature representation and final feature representation of each group of preprocessed images, outputs the intermediate feature representation of each group of preprocessed images to the fusion coding network, and outputs the final feature representation of each group of preprocessed images to the single-mode decoding network;
The fusion coding network performs feature fusion processing on the intermediate feature representation of each group of preprocessed images to obtain fusion coding features of each group of preprocessed images, and outputs the fusion coding features of each group of preprocessed images to the fusion decoding network;
the single-mode decoding network is used for decoding the final characteristic representation of each group of preprocessed images to obtain the decoding characteristic representation of each group of preprocessed images, and outputting the decoding characteristic representation of each group of preprocessed images to the fusion decoding network;
the fusion decoding network performs fusion decoding processing on the decoding characteristic representation of each group of preprocessed images to finally obtain a predicted focus region segmentation image of each preprocessed image;
the single-mode encoding network comprises:
a 1st coding module, a 2nd coding module, ..., a Kth coding module, and a bottleneck module;
the 1st coding module, the 2nd coding module, ..., the Kth coding module and the bottleneck module are sequentially cascaded;
the kth coding module is formed by sequentially cascading a multi-layer convolution module and a downsampling layer, where k ∈ [1, K];
the bottleneck module is composed of a plurality of layers of convolution modules;
The fusion encoding network comprises:
a 1st fusion coding module, a 2nd fusion coding module, ..., a Kth fusion coding module, and a bottleneck module;
the 1st fusion coding module, the 2nd fusion coding module, ..., the Kth fusion coding module and the bottleneck module are sequentially cascaded;
the kth fusion coding module is formed by sequentially cascading a multi-layer convolution module, an attention module and a downsampling layer, where k ∈ [1, K];
the bottleneck module is composed of a plurality of layers of convolution modules;
the single-mode decoding network comprises:
a 1st decoding module, a 2nd decoding module, ..., a Kth decoding module;
the 1st decoding module, the 2nd decoding module, ..., and the Kth decoding module are sequentially cascaded;
the kth decoding module is formed by sequentially cascading an up-sampling layer, a splicing layer and a multi-layer convolution module, where k ∈ [1, K];
the converged decoding network includes:
a 1st fusion decoding module, a 2nd fusion decoding module, ..., a Kth fusion decoding module;
the 1st fusion decoding module, the 2nd fusion decoding module, ..., and the Kth fusion decoding module are sequentially cascaded;
the kth fusion decoding module is formed by sequentially cascading an up-sampling layer, a splicing layer and an inverted bottleneck convolution module, where k ∈ [1, K];
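By way of illustration only, a minimal PyTorch-style sketch of the three kinds of building blocks named above (multi-layer convolution module, attention module of the fusion coding modules, and inverted bottleneck convolution module) could look as follows; the layer counts, channel widths and the exact form of the attention are not fixed by this description and are assumptions here.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Multi-layer convolution module: two 3x3 conv + BN + ReLU layers (layer count assumed)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class FusionAttention(nn.Module):
    """Attention module of the kth fusion coding module: re-weights the concatenation of the
    fusion-branch feature and the per-modality features (channel-attention form assumed)."""
    def __init__(self, fused_ch, modal_ch, n_modalities=2, reduction=8):
        super().__init__()
        total = fused_ch + n_modalities * modal_ch
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(total, total // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(total // reduction, total, 1), nn.Sigmoid(),
        )
        self.proj = nn.Conv2d(total, fused_ch, 1)   # project back to the fusion-branch width
    def forward(self, fused_feat, modal_feats):
        x = torch.cat([fused_feat, *modal_feats], dim=1)
        return self.proj(x * self.gate(x))

class InvertedBottleneck(nn.Module):
    """Inverted bottleneck convolution module: expand channels, depthwise conv, project back."""
    def __init__(self, ch, expansion=4):
        super().__init__()
        hidden = ch * expansion
        self.block = nn.Sequential(
            nn.Conv2d(ch, hidden, 1), nn.BatchNorm2d(hidden), nn.GELU(),
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden),
            nn.Conv2d(hidden, ch, 1), nn.BatchNorm2d(ch),
        )
    def forward(self, x):
        return x + self.block(x)
```

The downsampling layers that follow these blocks can be max pooling or strided convolutions; the description leaves that choice open.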
The 1 st coding module is used for respectively enabling each mode of each group of preprocessed images to pass through the multi-layer convolution module to obtain 1 st phase characteristics of each mode of each group of preprocessed images, and respectively outputting the 1 st phase characteristics of each mode of each group of preprocessed images to the 1 st downsampling layer, the 1 st fusion coding module and the 1 st decoding module;
after the 1 st stage feature of each mode of each group of preprocessed images is output to the 1 st downsampling layer, the 1 st stage downsampling feature of each mode of each group of preprocessed images is obtained, and the 1 st stage downsampling feature of each mode of each group of preprocessed images is output to the 2 nd coding module;
if k ∈ [2, K-1], the kth coding module performs feature extraction on the (k-1)-th stage downsampling features of each mode of each group of preprocessed images through a multi-layer convolution module to obtain the kth stage features of each mode of each group of preprocessed images, and outputs the kth stage features of each mode of each group of preprocessed images to the kth downsampling layer, the kth fusion coding module and the kth decoding module respectively;
after the characteristics of the kth stage of each mode of each group of preprocessed images are output to the kth downsampling layer, downsampling characteristics of the kth stage of each mode of each group of preprocessed images are obtained, and the downsampling characteristics of the kth stage of each mode of each group of preprocessed images are output to the kth+1th coding module;
The Kth coding module extracts the down-sampling characteristics of each mode Kth stage of each group of preprocessed images through a multi-layer convolution module to obtain each mode Kth stage characteristic of each group of preprocessed images, and outputs each mode Kth stage characteristic of each group of preprocessed images to the Kth down-sampling layer, the Kth fusion coding module and the Kth decoding module respectively;
after the characteristics of the K stage of each mode of each group of preprocessed images are output to the K downsampling layer, the downsampling characteristics of the K stage of each mode of each group of preprocessed images are obtained, and the downsampling characteristics of the K stage of each mode of each group of preprocessed images are output to the bottleneck module;
the bottleneck module is used for extracting the downsampling characteristics of each mode K stage of each group of preprocessed images through the multi-layer convolution module to obtain each mode bottleneck layer characteristic of each group of preprocessed images, and outputting each mode bottleneck layer characteristic of each group of preprocessed images to the K decoding module;
the 1 st fusion coding module is used for splicing all modes of each group of preprocessed images in the channel direction, obtaining 1 st stage characteristics of each group of preprocessed images through the multi-layer convolution module, inputting the 1 st stage characteristics of each group of preprocessed images and the 1 st stage characteristics of all modes of each group of preprocessed images into the 1 st attention module, obtaining 1 st stage fusion characteristics of each group of preprocessed images, outputting the 1 st stage fusion characteristics of each group of preprocessed images to the 1 st downsampling layer, obtaining 1 st stage downsampling fusion characteristics of each group of preprocessed images, and outputting the 1 st stage downsampling fusion characteristics of each group of preprocessed images to the 2 nd fusion coding network;
If k ∈ [2, K-1], the kth fusion coding module is used for obtaining the kth stage characteristics of each group of preprocessed images through a multi-layer convolution module, inputting the kth stage characteristics of each group of preprocessed images and the kth stage characteristics of each mode of each group of preprocessed images into the kth attention module to obtain the kth stage fusion characteristics of each group of preprocessed images, outputting the kth stage fusion characteristics of each group of preprocessed images to the kth downsampling layer to obtain the kth stage downsampling fusion characteristics of each group of preprocessed images, and outputting the kth stage downsampling fusion characteristics of each group of preprocessed images to the (k+1)-th fusion coding module;
the Kth fusion coding module is used for obtaining the Kth phase characteristics of each group of preprocessed images through the multi-layer convolution module, inputting the Kth phase characteristics of each group of preprocessed images and the Kth phase characteristics of each mode of each group of preprocessed images to the Kth attention module to obtain the Kth phase fusion characteristics of each group of preprocessed images, outputting the Kth phase fusion characteristics of each group of preprocessed images to the Kth downsampling layer to obtain the downsampling fusion characteristics of the Kth phase of each group of preprocessed images, and outputting the downsampling fusion characteristics of the Kth phase of each group of preprocessed images to the bottleneck module;
The bottleneck module is used for extracting the downsampling fusion characteristics of the K-th stage of each group of preprocessed images through the multi-layer convolution module to obtain bottleneck layer fusion characteristics of each group of preprocessed images, and outputting the bottleneck layer fusion characteristics of each group of preprocessed images to the K-th fusion decoding module;
the Kth decoding module is used for upsampling the bottleneck layer characteristics of each mode of each group of preprocessed images to obtain upsampling characteristics of each mode of each group of preprocessed images in a Kth stage, performing channel dimension splicing on the upsampling characteristics of each mode of each group of preprocessed images in the Kth stage and the characteristics of each mode of each group of preprocessed images in the Kth stage, obtaining upsampling splicing characteristics of each mode of each group of preprocessed images in the Kth stage through the multi-layer convolution module, and respectively outputting the upsampling splicing characteristics of each mode of each group of preprocessed images in the Kth stage to the Kth-1 decoding module and the Kth-1 fusion decoding module;
if k ∈ [2, K-1], the kth decoding module upsamples the (k+1)-th stage up-sampling splicing features of each mode of each group of preprocessed images to obtain the kth stage up-sampling features of each mode of each group of preprocessed images, performs channel-dimension splicing on the kth stage up-sampling features of each mode of each group of preprocessed images and the kth stage features of each mode of each group of preprocessed images, obtains the kth stage up-sampling splicing features of each mode of each group of preprocessed images through a multi-layer convolution module, and outputs the kth stage up-sampling splicing features of each mode of each group of preprocessed images to the (k-1)-th decoding module and the (k-1)-th fusion decoding module respectively;
The 1 st decoding module performs up-sampling on the up-sampling splicing characteristics of the 2 nd stage of each mode of each group of preprocessed image to obtain up-sampling characteristics of the 1 st stage of each mode of each group of preprocessed image, performs channel dimension splicing on the up-sampling characteristics of the 1 st stage of each mode of each group of preprocessed image and the 1 st stage characteristics of each mode of each group of preprocessed image, obtains the up-sampling splicing characteristics of the 1 st stage of each mode of each group of preprocessed image through the multi-layer convolution module, and outputs the up-sampling splicing characteristics of the 1 st stage of each mode of each group of preprocessed image to the 1 st fusion decoding module;
the Kth fusion decoding module performs up-sampling on bottleneck layer fusion characteristics of each group of preprocessed images to obtain up-sampling fusion characteristics of a Kth stage of each group of preprocessed images, performs channel dimension splicing on the up-sampling fusion characteristics of the Kth stage of each group of preprocessed images and up-sampling splicing characteristics of each mode of each Kth stage of each group of preprocessed images, and then obtains up-sampling splicing fusion characteristics of the Kth stage of each group of preprocessed images through the Kth inversion bottleneck convolution module, and outputs the up-sampling splicing fusion characteristics of the Kth stage of each group of preprocessed images to the Kth-1 fusion decoding module;
If k ∈ [2, K-1], the kth fusion decoding module upsamples the (k+1)-th stage up-sampling splicing fusion features of each group of preprocessed images to obtain the kth stage up-sampling fusion features of each group of preprocessed images, performs channel-dimension splicing on the kth stage up-sampling fusion features of each group of preprocessed images and the kth stage up-sampling splicing features of each mode of each group of preprocessed images, obtains the kth stage up-sampling splicing fusion features of each group of preprocessed images through the kth inverted bottleneck convolution module, and outputs the kth stage up-sampling splicing fusion features of each group of preprocessed images to the (k-1)-th fusion decoding module;
the 1 st fusion decoding module performs up-sampling on the up-sampling splicing fusion characteristics of the 2 nd stage of each group of preprocessed images to obtain up-sampling fusion characteristics of the 1 st stage of each mode of each group of preprocessed images, performs channel dimension splicing on the up-sampling fusion characteristics of the 1 st stage of each mode of each group of preprocessed images and the up-sampling splicing characteristics of the 1 st stage of each mode of each group of preprocessed images, obtains the up-sampling splicing fusion characteristics of the 1 st stage of each group of preprocessed images through the 1 st inversion bottleneck convolution module, and convolves the up-sampling splicing fusion characteristics of the 1 st stage of each group of preprocessed images to obtain a head and neck tumor prediction probability map of each group of preprocessed images;
The cross-entropy Dice weighted loss function in step 2 is defined as follows:
Loss = α×L_ce + β×L_dice
where Loss represents the cross-entropy Dice weighted loss function, L_ce represents the cross-entropy loss function, L_dice represents the Dice loss function, α represents the cross-entropy loss weight, and β represents the Dice loss weight;
the cross-entropy loss function, written here in the standard binary cross-entropy form implied by the definitions below, is:
L_ce = -(1/(NUM×M×N)) × Σ_i Σ_x Σ_y [ g_{i,(x,y)}×log(p_{i,(x,y)}) + (1-g_{i,(x,y)})×log(1-p_{i,(x,y)}) ]
wherein NUM represents the number of preprocessed images, M represents the number of rows of the i-th preprocessed image, N represents the number of columns of the i-th preprocessed image, and g_{i,(x,y)} represents the real classification label of the head and neck tumor focus at the pixel in row x and column y of the i-th group of preprocessed images: if g_{i,(x,y)} = 0 the pixel is a normal-region pixel, and if g_{i,(x,y)} = 1 it is a focus-region pixel; p_{i,(x,y)} ∈ [0,1] represents the prediction probability, computed by the segmentation method, that the pixel in row x and column y of the head and neck tumor prediction probability map of the i-th group of preprocessed images belongs to the head and neck tumor focus region class;
the Dice loss function, written here in the standard smoothed form implied by the definitions above, is:
L_dice = 1 - (2×Σ_i Σ_x Σ_y p_{i,(x,y)}×g_{i,(x,y)} + δ) / (Σ_i Σ_x Σ_y p_{i,(x,y)} + Σ_i Σ_x Σ_y g_{i,(x,y)} + δ)
wherein δ ∈ [0,1] is an adjustable smoothing coefficient that keeps gradient propagation well-defined;
preferably, in step 3, the pixel range of the head and neck tumor focus region of the real-time head and neck tumor PET-CT image is obtained by combining probability threshold judgment, specifically as follows:
pixels of the head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image whose prediction probability of belonging to the head and neck tumor focus region class is larger than the probability threshold are screened out as focus region pixels, thereby obtaining the head and neck tumor focus region pixel range of the real-time head and neck tumor PET-CT image.
The present invention also provides a computer readable medium storing a computer program for execution by an electronic device, which when run on the electronic device performs the steps of the head and neck tumor lesion area image segmentation method.
The invention improves the accuracy of segmentation prediction for the head and neck tumor focus pixel region. By performing feature fusion jointly in the encoding and decoding stages, a good segmentation effect is achieved even when head and neck tumor lesion regions differ greatly in size and the difference between the lesion region and the normal region is not obvious, which can help doctors delineate the focus region more accurately.
Drawings
Fig. 1: the method of the embodiment of the invention is a flow chart.
Fig. 2: the neural network structure diagram of the embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In particular, the method according to the technical solution of the present invention may be implemented by those skilled in the art using computer software technology to implement an automatic operation flow, and a system apparatus for implementing the method, such as a computer readable storage medium storing a corresponding computer program according to the technical solution of the present invention, and a computer device including the operation of the corresponding computer program, should also fall within the protection scope of the present invention.
The following describes a technical scheme of an embodiment of the present invention with reference to fig. 1-2, which is a head and neck tumor focus area image segmentation method, specifically as follows:
FIG. 1 is a flow chart of the method of the present invention;
step 1: acquiring a plurality of groups of original head and neck tumor PET-CT images, sequentially performing cropping, normalization and data augmentation on each group of original head and neck tumor PET-CT images to obtain each group of preprocessed images, and marking the real classification label of the head and neck tumor focus for each pixel of each group of preprocessed images;
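As a concrete illustration of this step, the sketch below preprocesses one registered PET/CT slice pair; the 144×144 size matches the M = N = 144 used later in the embodiment, while the CT windowing values, the z-score normalization of PET and the flip-based augmentation are assumptions not fixed by the description.

```python
import numpy as np

def preprocess_pair(pet_slice, ct_slice, crop=144, rng=None, augment=True):
    """Center-crop, normalize per modality, concatenate PET and CT along the channel
    direction, and apply simple flip augmentation (normalization/augmentation assumed)."""
    def center_crop(img, size):
        h, w = img.shape
        top, left = (h - size) // 2, (w - size) // 2
        return img[top:top + size, left:left + size]

    pet = center_crop(pet_slice.astype(np.float32), crop)
    ct = center_crop(ct_slice.astype(np.float32), crop)

    pet = (pet - pet.mean()) / (pet.std() + 1e-6)     # z-score normalization of PET (assumed)
    ct = np.clip(ct, -200.0, 200.0) / 200.0           # CT windowing and scaling (assumed)

    x = np.stack([pet, ct], axis=0)                   # channel-direction concatenation

    if augment:
        rng = rng or np.random.default_rng()
        if rng.random() < 0.5:
            x = x[:, :, ::-1].copy()                  # random horizontal flip
        if rng.random() < 0.5:
            x = x[:, ::-1, :].copy()                  # random vertical flip
    return x                                          # shape (2, 144, 144)
```

The ground-truth focus labels must of course receive the same crop and flips so that every pixel stays aligned with the preprocessed images.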
step 2: constructing a focus image segmentation network, inputting each group of preprocessed images into the focus image segmentation network for focus segmentation prediction to obtain a head and neck tumor prediction probability map of each group of preprocessed images, constructing a cross-entropy Dice weighted loss function by combining the real classification label of the head and neck tumor focus of each pixel of each group of preprocessed images, and obtaining a trained focus image segmentation network through optimization training with a stochastic gradient descent algorithm;
As shown in fig. 2, a neural network structure diagram of the present invention;
the lesion image segmentation network in step 2 comprises a single-mode encoding network, a fusion encoding network, a single-mode decoding network and a fusion decoding network;
the single-mode encoding network comprises:
a 1st coding module, a 2nd coding module, ..., a Kth coding module, and a bottleneck module;
the 1st coding module, the 2nd coding module, ..., the Kth coding module and the bottleneck module are sequentially cascaded;
the kth coding module is formed by sequentially cascading a multi-layer convolution module and a downsampling layer, where k ∈ [1, K];
the bottleneck module is composed of a plurality of layers of convolution modules;
the fusion encoding network comprises:
a 1st fusion coding module, a 2nd fusion coding module, ..., a Kth fusion coding module, and a bottleneck module;
the 1st fusion coding module, the 2nd fusion coding module, ..., the Kth fusion coding module and the bottleneck module are sequentially cascaded;
the kth fusion coding module is formed by sequentially cascading a multi-layer convolution module, an attention module and a downsampling layer, where k ∈ [1, K];
the bottleneck module is composed of a plurality of layers of convolution modules;
the single-mode decoding network comprises:
a 1st decoding module, a 2nd decoding module, ..., a Kth decoding module;
the 1st decoding module, the 2nd decoding module, ..., and the Kth decoding module are sequentially cascaded;
the kth decoding module is formed by sequentially cascading an up-sampling layer, a splicing layer and a multi-layer convolution module, where k ∈ [1, K];
the converged decoding network includes:
a 1st fusion decoding module, a 2nd fusion decoding module, ..., a Kth fusion decoding module;
the 1st fusion decoding module, the 2nd fusion decoding module, ..., and the Kth fusion decoding module are sequentially cascaded;
the kth fusion decoding module is formed by sequentially cascading an up-sampling layer, a splicing layer and an inverted bottleneck convolution module, where k ∈ [1, K];
the 1 st coding module is used for respectively enabling each mode of each group of preprocessed images to pass through the multi-layer convolution module to obtain 1 st phase characteristics of each mode of each group of preprocessed images, and respectively outputting the 1 st phase characteristics of each mode of each group of preprocessed images to the 1 st downsampling layer, the 1 st fusion coding module and the 1 st decoding module;
after the 1 st stage feature of each mode of each group of preprocessed images is output to the 1 st downsampling layer, the 1 st stage downsampling feature of each mode of each group of preprocessed images is obtained, and the 1 st stage downsampling feature of each mode of each group of preprocessed images is output to the 2 nd coding module;
If k ∈ [2, K-1], the kth coding module performs feature extraction on the (k-1)-th stage downsampling features of each mode of each group of preprocessed images through a multi-layer convolution module to obtain the kth stage features of each mode of each group of preprocessed images, and outputs the kth stage features of each mode of each group of preprocessed images to the kth downsampling layer, the kth fusion coding module and the kth decoding module respectively;
after the characteristics of the kth stage of each mode of each group of preprocessed images are output to the kth downsampling layer, downsampling characteristics of the kth stage of each mode of each group of preprocessed images are obtained, and the downsampling characteristics of the kth stage of each mode of each group of preprocessed images are output to the kth+1th coding module;
the Kth coding module extracts the down-sampling characteristics of each mode Kth stage of each group of preprocessed images through a multi-layer convolution module to obtain each mode Kth stage characteristic of each group of preprocessed images, and outputs each mode Kth stage characteristic of each group of preprocessed images to the Kth down-sampling layer, the Kth fusion coding module and the Kth decoding module respectively;
after the characteristics of the K stage of each mode of each group of preprocessed images are output to the K downsampling layer, the downsampling characteristics of the K stage of each mode of each group of preprocessed images are obtained, and the downsampling characteristics of the K stage of each mode of each group of preprocessed images are output to the bottleneck module;
The bottleneck module is used for extracting the downsampling characteristics of each mode K stage of each group of preprocessed images through the multi-layer convolution module to obtain each mode bottleneck layer characteristic of each group of preprocessed images, and outputting each mode bottleneck layer characteristic of each group of preprocessed images to the K decoding module;
the 1 st fusion coding module is used for splicing all modes of each group of preprocessed images in the channel direction, obtaining 1 st stage characteristics of each group of preprocessed images through the multi-layer convolution module, inputting the 1 st stage characteristics of each group of preprocessed images and the 1 st stage characteristics of all modes of each group of preprocessed images into the 1 st attention module, obtaining 1 st stage fusion characteristics of each group of preprocessed images, outputting the 1 st stage fusion characteristics of each group of preprocessed images to the 1 st downsampling layer, obtaining 1 st stage downsampling fusion characteristics of each group of preprocessed images, and outputting the 1 st stage downsampling fusion characteristics of each group of preprocessed images to the 2 nd fusion coding network;
if k ∈ [2, K-1], the kth fusion coding module is used for obtaining the kth stage characteristics of each group of preprocessed images through a multi-layer convolution module, inputting the kth stage characteristics of each group of preprocessed images and the kth stage characteristics of each mode of each group of preprocessed images into the kth attention module to obtain the kth stage fusion characteristics of each group of preprocessed images, outputting the kth stage fusion characteristics of each group of preprocessed images to the kth downsampling layer to obtain the kth stage downsampling fusion characteristics of each group of preprocessed images, and outputting the kth stage downsampling fusion characteristics of each group of preprocessed images to the (k+1)-th fusion coding module;
The Kth fusion coding module is used for obtaining the Kth phase characteristics of each group of preprocessed images through the multi-layer convolution module, inputting the Kth phase characteristics of each group of preprocessed images and the Kth phase characteristics of each mode of each group of preprocessed images to the Kth attention module to obtain the Kth phase fusion characteristics of each group of preprocessed images, outputting the Kth phase fusion characteristics of each group of preprocessed images to the Kth downsampling layer to obtain the downsampling fusion characteristics of the Kth phase of each group of preprocessed images, and outputting the downsampling fusion characteristics of the Kth phase of each group of preprocessed images to the bottleneck module;
the bottleneck module is used for extracting the downsampling fusion characteristics of the K-th stage of each group of preprocessed images through the multi-layer convolution module to obtain bottleneck layer fusion characteristics of each group of preprocessed images, and outputting the bottleneck layer fusion characteristics of each group of preprocessed images to the K-th fusion decoding module;
the Kth decoding module is used for upsampling the bottleneck layer characteristics of each mode of each group of preprocessed images to obtain upsampling characteristics of each mode of each group of preprocessed images in a Kth stage, performing channel dimension splicing on the upsampling characteristics of each mode of each group of preprocessed images in the Kth stage and the characteristics of each mode of each group of preprocessed images in the Kth stage, obtaining upsampling splicing characteristics of each mode of each group of preprocessed images in the Kth stage through the multi-layer convolution module, and respectively outputting the upsampling splicing characteristics of each mode of each group of preprocessed images in the Kth stage to the Kth-1 decoding module and the Kth-1 fusion decoding module;
If k ∈ [2, K-1], the kth decoding module upsamples the (k+1)-th stage up-sampling splicing features of each mode of each group of preprocessed images to obtain the kth stage up-sampling features of each mode of each group of preprocessed images, performs channel-dimension splicing on the kth stage up-sampling features of each mode of each group of preprocessed images and the kth stage features of each mode of each group of preprocessed images, obtains the kth stage up-sampling splicing features of each mode of each group of preprocessed images through a multi-layer convolution module, and outputs the kth stage up-sampling splicing features of each mode of each group of preprocessed images to the (k-1)-th decoding module and the (k-1)-th fusion decoding module respectively;
the 1 st decoding module performs up-sampling on the up-sampling splicing characteristics of the 2 nd stage of each mode of each group of preprocessed image to obtain up-sampling characteristics of the 1 st stage of each mode of each group of preprocessed image, performs channel dimension splicing on the up-sampling characteristics of the 1 st stage of each mode of each group of preprocessed image and the 1 st stage characteristics of each mode of each group of preprocessed image, obtains the up-sampling splicing characteristics of the 1 st stage of each mode of each group of preprocessed image through the multi-layer convolution module, and outputs the up-sampling splicing characteristics of the 1 st stage of each mode of each group of preprocessed image to the 1 st fusion decoding module;
The Kth fusion decoding module performs up-sampling on bottleneck layer fusion characteristics of each group of preprocessed images to obtain up-sampling fusion characteristics of a Kth stage of each group of preprocessed images, performs channel dimension splicing on the up-sampling fusion characteristics of the Kth stage of each group of preprocessed images and up-sampling splicing characteristics of each mode of each Kth stage of each group of preprocessed images, and then obtains up-sampling splicing fusion characteristics of the Kth stage of each group of preprocessed images through the Kth inversion bottleneck convolution module, and outputs the up-sampling splicing fusion characteristics of the Kth stage of each group of preprocessed images to the Kth-1 fusion decoding module;
if k ∈ [2, K-1], the kth fusion decoding module upsamples the (k+1)-th stage up-sampling splicing fusion features of each group of preprocessed images to obtain the kth stage up-sampling fusion features of each group of preprocessed images, performs channel-dimension splicing on the kth stage up-sampling fusion features of each group of preprocessed images and the kth stage up-sampling splicing features of each mode of each group of preprocessed images, obtains the kth stage up-sampling splicing fusion features of each group of preprocessed images through the kth inverted bottleneck convolution module, and outputs the kth stage up-sampling splicing fusion features of each group of preprocessed images to the (k-1)-th fusion decoding module;
The 1 st fusion decoding module performs up-sampling on the up-sampling splicing fusion characteristics of the 2 nd stage of each group of preprocessed images to obtain up-sampling fusion characteristics of the 1 st stage of each mode of each group of preprocessed images, performs channel dimension splicing on the up-sampling fusion characteristics of the 1 st stage of each mode of each group of preprocessed images and the up-sampling splicing characteristics of the 1 st stage of each mode of each group of preprocessed images, obtains the up-sampling splicing fusion characteristics of the 1 st stage of each group of preprocessed images through the 1 st inversion bottleneck convolution module, and convolves the up-sampling splicing fusion characteristics of the 1 st stage of each group of preprocessed images to obtain a head and neck tumor prediction probability map of each group of preprocessed images;
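Putting the data flow of fig. 2 together, the following much-simplified sketch (K = 2 stages, two modalities, arbitrary channel widths, plain convolutions standing in for the attention and inverted bottleneck modules) only illustrates where encoder features, decoder features and fusion features meet; it is not the exact network of the embodiment.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    # minimal stand-in for the multi-layer convolution module
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

class TwoStageFusionNet(nn.Module):
    """Simplified two-stage (K = 2), two-modality sketch of the lesion segmentation network."""
    def __init__(self, ch=(32, 64, 128)):
        super().__init__()
        c1, c2, c3 = ch
        # single-modality encoders (same structure, separate weights for PET and CT)
        self.enc1 = nn.ModuleList([conv_block(1, c1) for _ in range(2)])
        self.enc2 = nn.ModuleList([conv_block(c1, c2) for _ in range(2)])
        self.bott = nn.ModuleList([conv_block(c2, c3) for _ in range(2)])
        # fusion encoder: conv on the concatenated input, then a 1x1 merge standing in for attention
        self.fenc1, self.fatt1 = conv_block(2, c1), nn.Conv2d(3 * c1, c1, 1)
        self.fenc2, self.fatt2 = conv_block(c1, c2), nn.Conv2d(3 * c2, c2, 1)
        self.fbott = conv_block(c2, c3)
        # single-modality decoders
        self.dec2 = nn.ModuleList([conv_block(c3 + c2, c2) for _ in range(2)])
        self.dec1 = nn.ModuleList([conv_block(c2 + c1, c1) for _ in range(2)])
        # fusion decoder (inverted bottleneck replaced by a plain conv block in this sketch)
        self.fdec2 = conv_block(c3 + 2 * c2, c2)
        self.fdec1 = conv_block(c2 + 2 * c1, c1)
        self.head = nn.Conv2d(c1, 1, 1)

    def forward(self, pet, ct):
        up = lambda t: F.interpolate(t, scale_factor=2, mode="bilinear", align_corners=False)
        xs = [pet, ct]
        f1 = [self.enc1[m](xs[m]) for m in range(2)]                    # stage-1 modality features
        f2 = [self.enc2[m](F.max_pool2d(f1[m], 2)) for m in range(2)]   # stage-2 modality features
        b = [self.bott[m](F.max_pool2d(f2[m], 2)) for m in range(2)]    # modality bottlenecks
        # fusion encoding branch, fed by the stacked input and the modality features
        g1 = self.fatt1(torch.cat([self.fenc1(torch.cat(xs, 1)), *f1], 1))
        g2 = self.fatt2(torch.cat([self.fenc2(F.max_pool2d(g1, 2)), *f2], 1))
        gb = self.fbott(F.max_pool2d(g2, 2))
        # single-modality decoding branch with skip connections
        d2 = [self.dec2[m](torch.cat([up(b[m]), f2[m]], 1)) for m in range(2)]
        d1 = [self.dec1[m](torch.cat([up(d2[m]), f1[m]], 1)) for m in range(2)]
        # fusion decoding branch, fed by the fusion bottleneck and both modality decoders
        e2 = self.fdec2(torch.cat([up(gb), *d2], 1))
        e1 = self.fdec1(torch.cat([up(e2), *d1], 1))
        return torch.sigmoid(self.head(e1))   # head and neck tumor prediction probability map

# usage sketch: prob = TwoStageFusionNet()(pet, ct), with pet and ct of shape (B, 1, 144, 144)
```

For a pair of 144×144 PET and CT slices the sketch produces a 144×144 probability map, matching the prediction probability map used in the loss below.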
the cross-entropy Dice weighted loss function in step 2 is defined as follows:
Loss = α×L_ce + β×L_dice
where Loss represents the cross-entropy Dice weighted loss function, L_ce represents the cross-entropy loss function, L_dice represents the Dice loss function, and α = 0.5, β = 0.5;
the cross-entropy loss function, written here in the standard binary cross-entropy form implied by the definitions below, is:
L_ce = -(1/(NUM×M×N)) × Σ_i Σ_x Σ_y [ g_{i,(x,y)}×log(p_{i,(x,y)}) + (1-g_{i,(x,y)})×log(1-p_{i,(x,y)}) ]
wherein NUM = 450, M = 144, N = 144, and g_{i,(x,y)} represents the real classification label of the head and neck tumor focus at the pixel in row x and column y of the i-th group of preprocessed images: if g_{i,(x,y)} = 0 the pixel is a normal-region pixel, and if g_{i,(x,y)} = 1 it is a focus-region pixel; p_{i,(x,y)} ∈ [0,1] represents the prediction probability, computed by the segmentation method, that the pixel in row x and column y of the head and neck tumor prediction probability map of the i-th group of preprocessed images belongs to the head and neck tumor focus region class;
the Dice loss function, written here in the standard smoothed form implied by the definitions above, is:
L_dice = 1 - (2×Σ_i Σ_x Σ_y p_{i,(x,y)}×g_{i,(x,y)} + δ) / (Σ_i Σ_x Σ_y p_{i,(x,y)} + Σ_i Σ_x Σ_y g_{i,(x,y)} + δ)
wherein δ ∈ [0,1], δ = 0.00005;
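A sketch of the weighted loss and of one stochastic gradient descent training step, using the embodiment values α = β = 0.5 and δ = 0.00005, is given below; the binary cross-entropy and smoothed Dice forms follow the definitions above, and the learning rate and momentum are assumptions.

```python
import torch
import torch.nn.functional as F

def weighted_ce_dice_loss(prob, target, alpha=0.5, beta=0.5, delta=5e-5):
    """Cross-entropy Dice weighted loss: Loss = alpha * L_ce + beta * L_dice.
    prob:   predicted probability map in [0, 1], shape (B, 1, H, W)
    target: real classification label per pixel (0 = normal region, 1 = focus region)
    delta:  smoothing coefficient of the Dice term (0.00005 in the embodiment)."""
    l_ce = F.binary_cross_entropy(prob, target.float())
    inter = (prob * target).sum()
    l_dice = 1.0 - (2.0 * inter + delta) / (prob.sum() + target.sum() + delta)
    return alpha * l_ce + beta * l_dice

def train_step(model, optimizer, pet, ct, mask):
    """One optimization step of the segmentation network (model signature assumed)."""
    optimizer.zero_grad()
    prob = model(pet, ct)                 # head and neck tumor prediction probability map
    loss = weighted_ce_dice_loss(prob, mask)
    loss.backward()                       # back-propagation of the weighted loss
    optimizer.step()
    return loss.item()

# optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)  # values assumed
```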
step 3: predictive segmentation is carried out on the PET-CT image of the head and neck tumor acquired in real time through a focus image segmentation network after training, a head and neck tumor predictive probability map of the PET-CT image of the head and neck tumor is obtained, and a head and neck tumor focus area pixel range of the PET-CT image of the head and neck tumor is obtained by combining probability threshold judgment;
in step 3, the pixel range of the head and neck tumor focus region of the real-time head and neck tumor PET-CT image is obtained by combining probability threshold judgment, specifically as follows:
pixels of the head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image whose prediction probability of belonging to the head and neck tumor focus region class is larger than the probability threshold are screened out as focus region pixels, thereby obtaining the head and neck tumor focus region pixel range of the real-time head and neck tumor PET-CT image;
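The probability threshold judgment amounts to a per-pixel comparison, as in the sketch below; the threshold value itself (0.5 here) is an assumption, since the description does not fix it.

```python
import numpy as np

def focus_pixel_range(prob_map, threshold=0.5):
    """Screen out pixels whose predicted focus probability exceeds the threshold and
    return both the binary focus mask and the coordinates of the focus-region pixels."""
    mask = prob_map > threshold                     # focus-region pixels
    ys, xs = np.nonzero(mask)
    return mask, list(zip(ys.tolist(), xs.tolist()))
```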
the segmentation results of the compared methods, using the Dice coefficient as the measurement index, are as follows:

                               Ours      UNet      Att-UNet   UNETR     SwinUNet
Primary tumor region Dice      0.7931    0.7868    0.7639     0.7967    0.7847
Lymph node region Dice         0.7384    0.7072    0.6910     0.6932    0.6891
Average Dice                   0.7657    0.7474    0.7275     0.7450    0.7369
The Dice coefficient is defined in the standard way as Dice = 2|X∩Y| / (|X|+|Y|), where X is the set of predicted focus region pixels and Y is the set of ground-truth focus region pixels.
From the experimental results, the average Dice coefficient of the method is clearly higher than that of the other methods for head and neck tumor focus region prediction on head and neck tumor PET-CT images: the Dice coefficient of the primary tumor region is higher than that of UNet, Att-UNet and SwinUNet, and slightly lower than that of UNETR (by 0.0036); the Dice coefficient of the lymph node region is significantly higher than that of UNet (by 0.0312), Att-UNet (by 0.0474), UNETR (by 0.0452) and SwinUNet (by 0.0493). Because the method performs feature fusion in both the encoding stage and the decoding stage, the ability of the network to recognize the lymph node region in head and neck tumor PET-CT images is effectively improved, which in turn improves the segmentation of the whole head and neck tumor focus region.
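For reference, the Dice coefficient used as the measurement index above can be computed per region as in the minimal sketch below (one binary mask per class, e.g. primary tumor or lymph node).

```python
import numpy as np

def dice_coefficient(pred_mask, gt_mask, eps=1e-8):
    """Dice = 2|X ∩ Y| / (|X| + |Y|) between a predicted and a ground-truth binary mask."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)
```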
Particular embodiments of the present invention also provide a computer readable medium.
The computer readable medium is a server workstation;
the server workstation stores a computer program executed by an electronic device, and when the computer program runs on the electronic device, the electronic device is caused to execute the steps of the head and neck tumor focus area image segmentation method of the embodiments of the invention.
It should be understood that parts of the specification not specifically set forth herein are all prior art.
It should be understood that the foregoing description of the preferred embodiments is relatively detailed and is not to be construed as limiting the scope of patent protection of the invention, which is defined by the appended claims; those skilled in the art can make substitutions or modifications without departing from the scope of the claims, and such substitutions and modifications all fall within the protection scope of the invention.

Claims (8)

1. A head and neck tumor focus area image segmentation method is characterized in that:
constructing a focus image segmentation network, inputting each group of preprocessed images into the focus image segmentation network for prediction, constructing a cross-entropy Dice weighted loss function, and performing optimization training with a stochastic gradient descent algorithm to obtain a trained focus image segmentation network;
and carrying out prediction segmentation through a trained focus image segmentation network to obtain a head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image, and further combining with probability threshold judgment to obtain the head and neck tumor focus region pixel range of the real-time head and neck tumor PET-CT image.
2. The method for segmenting an image of a focal region of a head and neck tumor according to claim 1, comprising the steps of:
Step 1: acquiring a plurality of groups of original head and neck tumor PET-CT images, sequentially performing cropping, normalization and data augmentation on each group of original head and neck tumor PET-CT images to obtain each group of preprocessed images, and marking the real classification label of the head and neck tumor focus for each pixel of each group of preprocessed images;
step 2: constructing a focus image segmentation network, inputting each group of preprocessed images into the focus image segmentation network for focus segmentation prediction to obtain a head and neck tumor prediction probability map of each group of preprocessed images, constructing a cross-entropy Dice weighted loss function by combining the real classification label of the head and neck tumor focus of each pixel of each group of preprocessed images, and obtaining a trained focus image segmentation network through optimization training with a stochastic gradient descent algorithm;
step 3: and carrying out prediction segmentation on the head and neck tumor PET-CT image acquired in real time through a trained focus image segmentation network to obtain a head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image, and judging by combining a probability threshold value to obtain the head and neck tumor focus region pixel range of the real-time head and neck tumor PET-CT image.
3. The head and neck tumor lesion image segmentation method according to claim 2, wherein:
The lesion image segmentation network in step 2 comprises:
a single-mode encoding network, a fusion encoding network, a single-mode decoding network, and a fusion decoding network;
the single-mode coding network performs feature extraction processing on each group of preprocessed images to obtain intermediate feature representation and final feature representation of each group of preprocessed images, outputs the intermediate feature representation of each group of preprocessed images to the fusion coding network, and outputs the final feature representation of each group of preprocessed images to the single-mode decoding network;
the fusion coding network performs feature fusion processing on the intermediate feature representation of each group of preprocessed images to obtain fusion coding features of each group of preprocessed images, and outputs the fusion coding features of each group of preprocessed images to the fusion decoding network;
the single-mode decoding network is used for decoding the final characteristic representation of each group of preprocessed images to obtain the decoding characteristic representation of each group of preprocessed images, and outputting the decoding characteristic representation of each group of preprocessed images to the fusion decoding network;
and the fusion decoding network performs fusion decoding processing on the decoding characteristic representation of each group of preprocessed images to finally obtain the predicted focus region segmentation image of each preprocessed image.
4. The method for segmenting an image of a focal region of a head and neck tumor according to claim 3, wherein:
the single-mode encoding network comprises:
a 1st coding module, a 2nd coding module, ..., a Kth coding module, and a bottleneck module;
the 1st coding module, the 2nd coding module, ..., the Kth coding module and the bottleneck module are sequentially cascaded;
the kth coding module is formed by sequentially cascading a multi-layer convolution module and a downsampling layer, where k ∈ [1, K];
the bottleneck module is composed of a plurality of layers of convolution modules;
the fusion encoding network comprises:
a 1st fusion coding module, a 2nd fusion coding module, ..., a Kth fusion coding module, and a bottleneck module;
the 1st fusion coding module, the 2nd fusion coding module, ..., the Kth fusion coding module and the bottleneck module are sequentially cascaded;
the kth fusion coding module is formed by sequentially cascading a multi-layer convolution module, an attention module and a downsampling layer, where k ∈ [1, K];
the bottleneck module is composed of a plurality of layers of convolution modules;
the single-mode decoding network comprises:
a 1st decoding module, a 2nd decoding module, ..., a Kth decoding module;
the 1st decoding module, the 2nd decoding module, ..., and the Kth decoding module are sequentially cascaded;
the kth decoding module is formed by sequentially cascading an up-sampling layer, a splicing layer and a multi-layer convolution module, where k ∈ [1, K];
the converged decoding network includes:
a 1st fusion decoding module, a 2nd fusion decoding module, ..., a Kth fusion decoding module;
the 1st fusion decoding module, the 2nd fusion decoding module, ..., and the Kth fusion decoding module are sequentially cascaded;
the kth fusion decoding module is formed by sequentially cascading an up-sampling layer, a splicing layer and an inverted bottleneck convolution module, where k ∈ [1, K].
5. The method for segmenting an image of a focal region of a head and neck tumor according to claim 4, wherein:
the 1st coding module passes each mode of each group of preprocessed images through the multi-layer convolution module to obtain the 1st stage features of each mode of each group of preprocessed images, and outputs the 1st stage features of each mode of each group of preprocessed images to the 1st downsampling layer, the 1st fusion coding module and the 1st decoding module respectively;
after the 1st stage features of each mode of each group of preprocessed images are passed through the 1st downsampling layer, the 1st stage downsampled features of each mode of each group of preprocessed images are obtained and output to the 2nd coding module;
if k ∈ [2, K-1], the k-th coding module performs feature extraction on the (k-1)-th stage downsampled features of each mode of each group of preprocessed images through the multi-layer convolution module to obtain the k-th stage features of each mode of each group of preprocessed images, and outputs the k-th stage features of each mode of each group of preprocessed images to the k-th downsampling layer, the k-th fusion coding module and the k-th decoding module respectively;
after the k-th stage features of each mode of each group of preprocessed images are passed through the k-th downsampling layer, the k-th stage downsampled features of each mode of each group of preprocessed images are obtained and output to the (k+1)-th coding module;
the K-th coding module performs feature extraction on the (K-1)-th stage downsampled features of each mode of each group of preprocessed images through the multi-layer convolution module to obtain the K-th stage features of each mode of each group of preprocessed images, and outputs the K-th stage features of each mode of each group of preprocessed images to the K-th downsampling layer, the K-th fusion coding module and the K-th decoding module respectively;
after the K-th stage features of each mode of each group of preprocessed images are passed through the K-th downsampling layer, the K-th stage downsampled features of each mode of each group of preprocessed images are obtained and output to the bottleneck module;
the bottleneck module performs feature extraction on the K-th stage downsampled features of each mode of each group of preprocessed images through the multi-layer convolution module to obtain the bottleneck layer features of each mode of each group of preprocessed images, and outputs the bottleneck layer features of each mode of each group of preprocessed images to the K-th decoding module;
the 1st fusion coding module splices all modes of each group of preprocessed images in the channel direction and obtains the 1st stage features of each group of preprocessed images through the multi-layer convolution module; the 1st stage features of each group of preprocessed images and the 1st stage features of all modes of each group of preprocessed images are input to the 1st attention module to obtain the 1st stage fusion features of each group of preprocessed images, which are output to the 1st downsampling layer to obtain the 1st stage downsampled fusion features of each group of preprocessed images, and the 1st stage downsampled fusion features of each group of preprocessed images are output to the 2nd fusion coding module;
if k ∈ [2, K-1], the k-th fusion coding module obtains the k-th stage features of each group of preprocessed images through the multi-layer convolution module, inputs the k-th stage features of each group of preprocessed images and the k-th stage features of each mode of each group of preprocessed images to the k-th attention module to obtain the k-th stage fusion features of each group of preprocessed images, outputs the k-th stage fusion features of each group of preprocessed images to the k-th downsampling layer to obtain the k-th stage downsampled fusion features of each group of preprocessed images, and outputs the k-th stage downsampled fusion features of each group of preprocessed images to the (k+1)-th fusion coding module;
the K-th fusion coding module obtains the K-th stage features of each group of preprocessed images through the multi-layer convolution module, inputs the K-th stage features of each group of preprocessed images and the K-th stage features of each mode of each group of preprocessed images to the K-th attention module to obtain the K-th stage fusion features of each group of preprocessed images, outputs the K-th stage fusion features of each group of preprocessed images to the K-th downsampling layer to obtain the K-th stage downsampled fusion features of each group of preprocessed images, and outputs the K-th stage downsampled fusion features of each group of preprocessed images to the bottleneck module;
the bottleneck module performs feature extraction on the K-th stage downsampled fusion features of each group of preprocessed images through the multi-layer convolution module to obtain the bottleneck layer fusion features of each group of preprocessed images, and outputs the bottleneck layer fusion features of each group of preprocessed images to the K-th fusion decoding module;
the K-th decoding module upsamples the bottleneck layer features of each mode of each group of preprocessed images to obtain the K-th stage upsampled features of each mode of each group of preprocessed images, splices the K-th stage upsampled features of each mode of each group of preprocessed images with the K-th stage features of each mode of each group of preprocessed images along the channel dimension, obtains the K-th stage upsampled splicing features of each mode of each group of preprocessed images through the multi-layer convolution module, and outputs the K-th stage upsampled splicing features of each mode of each group of preprocessed images to the (K-1)-th decoding module and the (K-1)-th fusion decoding module respectively;
if k ∈ [2, K-1], the k-th decoding module upsamples the (k+1)-th stage upsampled splicing features of each mode of each group of preprocessed images to obtain the k-th stage upsampled features of each mode of each group of preprocessed images, splices the k-th stage upsampled features of each mode of each group of preprocessed images with the k-th stage features of each mode of each group of preprocessed images along the channel dimension, obtains the k-th stage upsampled splicing features of each mode of each group of preprocessed images through the multi-layer convolution module, and outputs the k-th stage upsampled splicing features of each mode of each group of preprocessed images to the (k-1)-th decoding module and the (k-1)-th fusion decoding module respectively;
the 1st decoding module upsamples the 2nd stage upsampled splicing features of each mode of each group of preprocessed images to obtain the 1st stage upsampled features of each mode of each group of preprocessed images, splices the 1st stage upsampled features of each mode of each group of preprocessed images with the 1st stage features of each mode of each group of preprocessed images along the channel dimension, obtains the 1st stage upsampled splicing features of each mode of each group of preprocessed images through the multi-layer convolution module, and outputs the 1st stage upsampled splicing features of each mode of each group of preprocessed images to the 1st fusion decoding module;
the K-th fusion decoding module upsamples the bottleneck layer fusion features of each group of preprocessed images to obtain the K-th stage upsampled fusion features of each group of preprocessed images, splices the K-th stage upsampled fusion features of each group of preprocessed images with the K-th stage upsampled splicing features of each mode of each group of preprocessed images along the channel dimension, obtains the K-th stage upsampled splicing fusion features of each group of preprocessed images through the K-th inverted bottleneck convolution module, and outputs the K-th stage upsampled splicing fusion features of each group of preprocessed images to the (K-1)-th fusion decoding module;
if k ∈ [2, K-1], the k-th fusion decoding module upsamples the (k+1)-th stage upsampled splicing fusion features of each group of preprocessed images to obtain the k-th stage upsampled fusion features of each group of preprocessed images, splices the k-th stage upsampled fusion features of each group of preprocessed images with the k-th stage upsampled splicing features of each mode of each group of preprocessed images along the channel dimension, obtains the k-th stage upsampled splicing fusion features of each group of preprocessed images through the k-th inverted bottleneck convolution module, and outputs the k-th stage upsampled splicing fusion features of each group of preprocessed images to the (k-1)-th fusion decoding module;
the 1st fusion decoding module upsamples the 2nd stage upsampled splicing fusion features of each group of preprocessed images to obtain the 1st stage upsampled fusion features of each group of preprocessed images, splices the 1st stage upsampled fusion features of each group of preprocessed images with the 1st stage upsampled splicing features of each mode of each group of preprocessed images along the channel dimension, obtains the 1st stage upsampled splicing fusion features of each group of preprocessed images through the 1st inverted bottleneck convolution module, and convolves the 1st stage upsampled splicing fusion features of each group of preprocessed images to obtain the head and neck tumor prediction probability map of each group of preprocessed images.
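As one concrete reading of the fusion coding step described above, the sketch below shows a k-th fusion coding module (k ≥ 2): it convolves the previous stage's downsampled fusion feature, applies an attention module to the fused feature together with the per-mode k-th stage features, and then downsamples. The simple channel-attention stand-in and all layer parameters are assumptions.

```python
import torch
import torch.nn as nn

class FusionEncoderBlock(nn.Module):
    """Sketch of a k-th fusion coding module (k >= 2): convolutions on the previous stage's
    downsampled fusion feature, an attention module over the fused and per-mode stage
    features, then downsampling. The attention design here is only a stand-in."""
    def __init__(self, in_ch, out_ch, modality_ch, n_modalities):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
        self.attn = nn.Sequential(          # channel attention over fused + per-mode features
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(out_ch + modality_ch * n_modalities, out_ch, 1),
            nn.Sigmoid())
        self.down = nn.MaxPool2d(2)

    def forward(self, prev_fused_down, modality_stage_feats):
        # prev_fused_down: (k-1)-th stage downsampled fusion feature of the group
        # modality_stage_feats: list of k-th stage features, one tensor per mode
        fused = self.convs(prev_fused_down)                         # k-th stage feature of the group
        context = torch.cat([fused] + list(modality_stage_feats), dim=1)
        fused = fused * self.attn(context)                          # k-th stage fusion feature
        return fused, self.down(fused)                              # kept for the decoder / next stage
```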
6. The method for segmenting an image of a focal region of a head and neck tumor according to claim 5, wherein:
the cross-entropy Dice weighted loss function in step 2 is defined as follows:

Loss = α × L_ce + β × L_dice

where Loss denotes the cross-entropy Dice weighted loss function, L_ce denotes the cross-entropy loss function, L_dice denotes the Dice loss function, α denotes the cross-entropy loss weight, and β denotes the Dice loss weight;
the cross entropy loss function is defined as follows:
where NUM denotes the number of groups of preprocessed images, M denotes the number of rows of the i-th group of preprocessed images, N denotes the number of columns of the i-th group of preprocessed images, g_i,(x,y) denotes the true head and neck tumor focus classification label of the pixel at row x, column y of the i-th group of preprocessed images, with g_i,(x,y) = 0 for a normal-region pixel and g_i,(x,y) = 1 for a focus-region pixel, and p_i,(x,y) ∈ [0,1] denotes the predicted probability, computed by the segmentation method, that the pixel at row x, column y of the head and neck tumor prediction probability map of the i-th group of preprocessed images belongs to the head and neck tumor focus region;
the Dice loss function is defined as follows:
where δ ∈ [0,1], and δ denotes an adjustable coefficient for gradient propagation.
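The exact expressions for L_ce and L_dice appear as formula images in the original filing and are not reproduced in this text; the sketch below, assuming PyTorch, implements standard forms consistent with the variable definitions above, with the normalization, the default weights, and the placement of the coefficient δ all being assumptions.

```python
import torch

def cross_entropy_dice_loss(p, g, alpha=0.5, beta=0.5, delta=1.0, eps=1e-7):
    """p, g: tensors of shape [NUM, M, N]; p in [0, 1] is the predicted focus-region
    probability, g in {0, 1} is the true label. Returns Loss = alpha * L_ce + beta * L_dice."""
    p = p.clamp(eps, 1.0 - eps)
    # binary cross entropy, averaged over all pixels of all images (assumed normalization)
    l_ce = -(g * torch.log(p) + (1.0 - g) * torch.log(1.0 - p)).mean()
    # Dice loss with coefficient delta added for stable gradient propagation (assumed placement)
    inter = (p * g).sum(dim=(1, 2))
    union = p.sum(dim=(1, 2)) + g.sum(dim=(1, 2))
    l_dice = 1.0 - ((2.0 * inter + delta) / (union + delta)).mean()
    return alpha * l_ce + beta * l_dice
```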
7. The method for segmenting the image of the focal region of the head and neck tumor according to claim 6, wherein:
in step 3, the head and neck tumor focus area pixel range of the real-time head and neck tumor PET-CT image is determined in combination with the probability threshold as follows:
pixels of the head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image whose predicted probability of belonging to the head and neck tumor focus region is greater than the probability threshold are screened out as focus-region pixels, thereby obtaining the head and neck tumor focus area pixel range of the real-time head and neck tumor PET-CT image.
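A minimal example of this thresholding step, with the probability threshold value itself assumed (the claims do not fix it):

```python
import torch

def focus_region_mask(prob_map: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """prob_map: [M, N] head and neck tumor prediction probability map.
    Pixels whose predicted focus-region probability exceeds the threshold are kept."""
    return (prob_map > threshold).to(torch.uint8)
```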
8. A computer readable medium, characterized in that:
the computer readable medium stores a computer program for execution by an electronic device, and when the computer program runs on the electronic device, the electronic device is caused to perform the steps of the head and neck tumor focus area image segmentation method according to any one of claims 1-7.
CN202311383312.0A 2023-10-23 2023-10-23 Head and neck tumor focus area image segmentation method and computer readable medium Pending CN117291935A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311383312.0A CN117291935A (en) 2023-10-23 2023-10-23 Head and neck tumor focus area image segmentation method and computer readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311383312.0A CN117291935A (en) 2023-10-23 2023-10-23 Head and neck tumor focus area image segmentation method and computer readable medium

Publications (1)

Publication Number Publication Date
CN117291935A true CN117291935A (en) 2023-12-26

Family

ID=89248105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311383312.0A Pending CN117291935A (en) 2023-10-23 2023-10-23 Head and neck tumor focus area image segmentation method and computer readable medium

Country Status (1)

Country Link
CN (1) CN117291935A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118096773A (en) * 2024-04-29 2024-05-28 东莞市人民医院 Intratumoral and oncological Zhou Shengjing analysis method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111784671B (en) Pathological image focus region detection method based on multi-scale deep learning
CN111369565B (en) Digital pathological image segmentation and classification method based on graph convolution network
CN111429473B (en) Chest film lung field segmentation model establishment and segmentation method based on multi-scale feature fusion
CN115661144B (en) Adaptive medical image segmentation method based on deformable U-Net
CN111951288B (en) Skin cancer lesion segmentation method based on deep learning
CN112258488A (en) Medical image focus segmentation method
CN112446892A (en) Cell nucleus segmentation method based on attention learning
CN112767417B (en) Multi-modal image segmentation method based on cascaded U-Net network
CN111091575B (en) Medical image segmentation method based on reinforcement learning method
CN113539402B (en) Multi-mode image automatic sketching model migration method
CN114219943A (en) CT image organ-at-risk segmentation system based on deep learning
CN112862830A (en) Multi-modal image segmentation method, system, terminal and readable storage medium
CN114266794B (en) Pathological section image cancer region segmentation system based on full convolution neural network
CN117291935A (en) Head and neck tumor focus area image segmentation method and computer readable medium
CN115375711A (en) Image segmentation method of global context attention network based on multi-scale fusion
Shan et al. SCA-Net: A spatial and channel attention network for medical image segmentation
CN115546466A (en) Weak supervision image target positioning method based on multi-scale significant feature fusion
CN113538363A (en) Lung medical image segmentation method and device based on improved U-Net
CN117437423A (en) Weak supervision medical image segmentation method and device based on SAM collaborative learning and cross-layer feature aggregation enhancement
CN110992309B (en) Fundus image segmentation method based on deep information transfer network
CN116883341A (en) Liver tumor CT image automatic segmentation method based on deep learning
Nie et al. Semantic-guided encoder feature learning for blurry boundary delineation
US20230115927A1 (en) Systems and methods for plaque identification, plaque composition analysis, and plaque stability detection
US20230162353A1 (en) Multistream fusion encoder for prostate lesion segmentation and classification
Ru et al. A dermoscopic image segmentation algorithm based on U-shaped architecture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination