CN117291935A - Head and neck tumor focus area image segmentation method and computer readable medium - Google Patents
Info
- Publication number
- CN117291935A CN117291935A CN202311383312.0A CN202311383312A CN117291935A CN 117291935 A CN117291935 A CN 117291935A CN 202311383312 A CN202311383312 A CN 202311383312A CN 117291935 A CN117291935 A CN 117291935A
- Authority
- CN
- China
- Prior art keywords
- group
- kth
- preprocessed images
- fusion
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10104—Positron emission tomography [PET]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30016—Brain
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30096—Tumor; Lesion
Abstract
The invention provides a head and neck tumor lesion area image segmentation method and a computer readable medium. The method comprises: acquiring multiple groups of original head and neck tumor PET-CT images, preprocessing them in sequence to obtain each group of preprocessed images, and annotating the corresponding ground-truth classification labels; constructing a lesion image segmentation network, performing lesion segmentation prediction on each group of preprocessed images to obtain a head and neck tumor prediction probability map for each group, constructing a cross-entropy Dice weighted loss function from the ground-truth classification label of the head and neck tumor lesion at each pixel of each group of preprocessed images, and training with stochastic gradient descent to obtain a trained lesion image segmentation network; and performing prediction segmentation and probability-threshold screening on head and neck tumor PET-CT images acquired in real time with the trained network, obtaining the pixel range of the head and neck tumor lesion area in real time. The invention exploits the complementary information among multiple modalities and improves the accuracy of pixel-level segmentation prediction of head and neck tumor lesions.
Description
Technical Field
The invention belongs to the technical field of medical image processing, and particularly relates to a head and neck tumor lesion area image segmentation method and a computer readable medium.
Background
Head and neck tumors are the fifth most common cancer type worldwide. At present, PET-CT combines positron emission tomography (PET) with X-ray computed tomography (CT): PET provides molecular information on the function and metabolism of a lesion, while CT provides its precise anatomical localization, assisting the diagnosis of head and neck tumors. In clinical practice, doctors mainly evaluate the tumor by observing the position, shape, size, and boundary of a lesion in order to formulate a treatment plan. Head and neck tumor lesion segmentation can assist doctors in clinical diagnosis and is therefore an important problem in medical image analysis. Accurate segmentation of head and neck tumor lesions in PET-CT images remains unsolved, mainly for the following reasons:
PET and CT images provide different biological and anatomical information; how to effectively fuse these two different modalities remains a challenging problem;
head and neck tumors have great heterogeneity in shape, size and intensity due to varying degrees of lesions, and there may be a blurred boundary or overlap region between the tumor body and surrounding tissue, which increases the complexity of the segmentation.
Some lesion boundaries are blurred, and annotations of different lesion areas by different radiologists may be inconsistent. In addition, manual annotation relies on the expertise and experience of imaging specialists, is time-consuming and laborious, and may introduce subjective inter-observer differences.
Therefore, establishing an accurate automatic segmentation method for head and neck tumor lesions is of great significance for clinical diagnosis and disease management. In recent years, deep learning-based methods have shown good performance on various computer vision problems (e.g., image classification, object detection, and semantic segmentation), and many have been applied to head and neck tumor lesion segmentation with impressive results.
Convolutional neural networks represented by U-Net have a classical symmetric encoder-decoder structure and preserve high-resolution detail through skip connections. U-Net has achieved significant performance in medical image segmentation tasks thanks to the strong feature extraction capability and inductive bias it exhibits on smaller datasets. Multi-modal medical images have the advantage of providing different kinds of information, making judgment of the lesion more accurate; however, simply concatenating the modalities as input to a U-shaped network cannot fully exploit the complementary information between them.
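As background for the U-shaped architecture described above, the following sketch (not part of the patent) traces how feature-map resolutions halve through the encoder and mirror back up through the decoder, which is what lets each skip connection concatenate tensors of matching spatial size. The input size 256 and the stage count 4 are arbitrary illustrative choices.

```python
# Illustrative sketch (not the patented network): trace feature-map
# spatial sizes through a k_stages-deep U-shaped encoder-decoder in
# which every stage halves resolution on the way down and doubles it
# on the way up, so each skip connection joins tensors of equal size.

def unet_resolutions(input_size: int, k_stages: int):
    """Return (encoder_sizes, decoder_sizes) for a 2x down/up U-Net."""
    encoder = [input_size // (2 ** k) for k in range(k_stages)]
    decoder = list(reversed(encoder))  # the decoder mirrors the encoder
    return encoder, decoder

enc, dec = unet_resolutions(256, 4)
assert enc == [256, 128, 64, 32]
# the skip connection at encoder stage k feeds the decoder stage that
# has been upsampled back to the same spatial size
assert dec == [32, 64, 128, 256]
```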
Disclosure of Invention
In order to solve the above technical problems, the invention provides a head and neck tumor lesion area image segmentation method and a computer readable medium.
The technical solution of the invention is a head and neck tumor lesion area image segmentation method, characterized in that:
a lesion image segmentation network is constructed; each group of preprocessed images is input into the network for prediction so as to construct a cross-entropy Dice weighted loss function, and the network is optimized and trained by stochastic gradient descent to obtain a trained lesion image segmentation network;
prediction segmentation through the trained lesion image segmentation network yields a head and neck tumor prediction probability map for a real-time head and neck tumor PET-CT image, and the pixel range of the head and neck tumor lesion area in that image is then obtained in combination with a probability-threshold decision.
The technical scheme of the method specifically comprises the following steps:
step 1: acquire multiple groups of original head and neck tumor PET-CT images; sequentially apply cropping, normalization, concatenation, and data augmentation to each group of original head and neck tumor PET-CT images to obtain each group of preprocessed images; and annotate the ground-truth classification label of the head and neck tumor lesion for each pixel of each group of preprocessed images;
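The patent names the preprocessing operations of step 1 but not their parameters. As a hedged illustration only, the sketch below assumes per-image min-max normalization to [0, 1] and channel-wise stacking of the PET and CT modalities; nested Python lists stand in for image arrays, and `minmax_normalize` / `stack_channels` are hypothetical helper names, not the patent's.

```python
# Minimal sketch of the step-1 preprocessing, under the assumption that
# "normalization" is per-image min-max scaling to [0, 1] and
# "concatenation" stacks PET and CT as two channels of one sample.
# The patent does not fix these details.

def minmax_normalize(img):
    """Scale one 2-D image (list of rows) to [0, 1]."""
    flat = [v for row in img for v in row]
    lo, hi = min(flat), max(flat)
    rng = (hi - lo) or 1.0                 # guard against constant images
    return [[(v - lo) / rng for v in row] for row in img]

def stack_channels(pet, ct):
    """Pair the two modalities into one 2-channel sample."""
    return [minmax_normalize(pet), minmax_normalize(ct)]

sample = stack_channels([[0, 5], [10, 20]], [[-100, 0], [100, 300]])
assert sample[0] == [[0.0, 0.25], [0.5, 1.0]]   # normalized PET channel
```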
Step 2: construct a lesion image segmentation network; input each group of preprocessed images into the network for lesion segmentation prediction to obtain a head and neck tumor prediction probability map for each group; construct a cross-entropy Dice weighted loss function from the ground-truth classification labels of the head and neck tumor lesion at each pixel of each group of preprocessed images; and train with stochastic gradient descent to obtain the trained lesion image segmentation network;
step 3: perform prediction segmentation on a head and neck tumor PET-CT image acquired in real time through the trained lesion image segmentation network to obtain its head and neck tumor prediction probability map, and obtain the pixel range of the head and neck tumor lesion area of that image in combination with a probability-threshold decision.
preferably, the lesion image segmentation network of step 2 comprises:
a single-modality encoding network, a fusion encoding network, a single-modality decoding network, and a fusion decoding network;
the single-modality encoding network performs feature extraction on each group of preprocessed images to obtain intermediate and final feature representations of each group, outputs the intermediate feature representations to the fusion encoding network, and outputs the final feature representations to the single-modality decoding network;
the fusion encoding network performs feature fusion on the intermediate feature representations of each group of preprocessed images to obtain the fusion encoding features of each group, and outputs them to the fusion decoding network;
the single-modality decoding network decodes the final feature representations of each group of preprocessed images to obtain the decoded feature representations of each group, and outputs them to the fusion decoding network;
the fusion decoding network performs fusion decoding on the decoded feature representations of each group of preprocessed images, finally obtaining a predicted lesion-region segmentation image for each preprocessed image;
the single-modality encoding network comprises:
the 1st coding module, the 2nd coding module, …, the Kth coding module, and a bottleneck module, cascaded in sequence;
the kth coding module consists of a multi-layer convolution module followed by a downsampling layer, k ∈ [1, K];
the bottleneck module consists of a multi-layer convolution module;
the fusion encoding network comprises:
the 1st fusion coding module, the 2nd fusion coding module, …, the Kth fusion coding module, and a bottleneck module, cascaded in sequence;
the kth fusion coding module consists of a multi-layer convolution module, an attention module, and a downsampling layer cascaded in sequence, k ∈ [1, K];
the bottleneck module consists of a multi-layer convolution module;
the single-modality decoding network comprises:
the 1st decoding module, the 2nd decoding module, …, the Kth decoding module, cascaded in sequence;
the kth decoding module consists of an upsampling layer, a concatenation layer, and a multi-layer convolution module cascaded in sequence, k ∈ [1, K];
the fusion decoding network comprises:
the 1st fusion decoding module, the 2nd fusion decoding module, …, the Kth fusion decoding module, cascaded in sequence;
the kth fusion decoding module consists of an upsampling layer, a concatenation layer, and an inverted bottleneck convolution module cascaded in sequence, k ∈ [1, K];
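The decoder's inverted bottleneck convolution module is named but not specified in the patent. Assuming the common expand-depthwise-project form (as in MobileNetV2/ConvNeXt-style blocks) with a hypothetical expansion ratio r = 4 and kernel size 3, a quick parameter count illustrates why such a block is attractive: it processes a 4× wider intermediate representation with fewer parameters than a plain 3×3 convolution at the same width. These numbers are illustrative only.

```python
# Rough parameter counts (ignoring biases/norms) for an assumed
# inverted bottleneck block: 1x1 expand -> kxk depthwise -> 1x1 project.
# Expansion ratio r and kernel size k are assumptions, not patent values.

def inverted_bottleneck_params(c: int, k: int = 3, r: int = 4) -> int:
    expand = c * (r * c)          # 1x1 pointwise expansion, c -> r*c
    depthwise = (r * c) * k * k   # kxk depthwise conv on r*c channels
    project = (r * c) * c         # 1x1 projection back, r*c -> c
    return expand + depthwise + project

def plain_conv_params(c: int, k: int = 3) -> int:
    return c * c * k * k          # standard kxk convolution, c -> c

# at 64 channels the inverted bottleneck is smaller than a plain 3x3 conv
assert inverted_bottleneck_params(64) == 35072
assert plain_conv_params(64) == 36864
```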
The 1st coding module passes each modality of each group of preprocessed images through its multi-layer convolution module to obtain the stage-1 features of each modality, and outputs them to the 1st downsampling layer, the 1st fusion coding module, and the 1st decoding module;
the 1st downsampling layer downsamples the stage-1 features of each modality to obtain the stage-1 downsampled features of each modality, which are output to the 2nd coding module;
for k ∈ [2, K-1], the kth coding module extracts features from the stage-(k-1) downsampled features of each modality through its multi-layer convolution module to obtain the stage-k features of each modality, and outputs them to the kth downsampling layer, the kth fusion coding module, and the kth decoding module;
the kth downsampling layer downsamples the stage-k features of each modality to obtain the stage-k downsampled features of each modality, which are output to the (k+1)th coding module;
the Kth coding module extracts features from the stage-(K-1) downsampled features of each modality through its multi-layer convolution module to obtain the stage-K features of each modality, and outputs them to the Kth downsampling layer, the Kth fusion coding module, and the Kth decoding module;
the Kth downsampling layer downsamples the stage-K features of each modality to obtain the stage-K downsampled features of each modality, which are output to the bottleneck module;
the bottleneck module extracts features from the stage-K downsampled features of each modality through its multi-layer convolution module to obtain the bottleneck-layer features of each modality, and outputs them to the Kth decoding module;
the 1st fusion coding module concatenates all modalities of each group of preprocessed images along the channel dimension, obtains the stage-1 features of each group through its multi-layer convolution module, inputs these features together with the stage-1 features of each modality into the 1st attention module to obtain the stage-1 fusion features of each group, outputs the stage-1 fusion features to the 1st downsampling layer to obtain the stage-1 downsampled fusion features, and outputs them to the 2nd fusion coding module;
for k ∈ [2, K-1], the kth fusion coding module obtains the stage-k features of each group through its multi-layer convolution module, inputs these features together with the stage-k features of each modality into the kth attention module to obtain the stage-k fusion features of each group, outputs the stage-k fusion features to the kth downsampling layer to obtain the stage-k downsampled fusion features, and outputs them to the (k+1)th fusion coding module;
the Kth fusion coding module obtains the stage-K features of each group through its multi-layer convolution module, inputs these features together with the stage-K features of each modality into the Kth attention module to obtain the stage-K fusion features of each group, outputs the stage-K fusion features to the Kth downsampling layer to obtain the stage-K downsampled fusion features, and outputs them to the bottleneck module;
the bottleneck module extracts features from the stage-K downsampled fusion features through its multi-layer convolution module to obtain the bottleneck-layer fusion features of each group of preprocessed images, and outputs them to the Kth fusion decoding module;
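The attention modules that fuse per-modality features with the fused-branch features are not disclosed in detail. Purely as an illustration of weight-based feature fusion, the sketch below computes softmax weights per position over the candidate branches and blends them; a real attention module would learn its weights rather than derive them from the feature values themselves, so treat every name here as hypothetical.

```python
# Illustration only: fuse several same-shaped feature vectors (one per
# branch/modality) with per-position softmax weights. This is NOT the
# patent's attention module, merely an example of attention-style fusion.
import math

def softmax(xs):
    m = max(xs)                              # subtract max for stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def fuse(features):
    """features: list of equal-length feature vectors, one per branch."""
    fused = []
    for pos in range(len(features[0])):
        vals = [f[pos] for f in features]
        w = softmax(vals)                    # attention-style weights
        fused.append(sum(wi * vi for wi, vi in zip(w, vals)))
    return fused

# identical branches fuse to the same value
assert fuse([[0.5], [0.5]]) == [0.5]
```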
the Kth decoding module upsamples the bottleneck-layer features of each modality to obtain the stage-K upsampled features of each modality, concatenates them with the stage-K features of each modality along the channel dimension, obtains the stage-K upsampled concatenation features of each modality through its multi-layer convolution module, and outputs them to the (K-1)th decoding module and the (K-1)th fusion decoding module;
for k ∈ [2, K-1], the kth decoding module upsamples the stage-(k+1) upsampled concatenation features of each modality to obtain the stage-k upsampled features of each modality, concatenates them with the stage-k features of each modality along the channel dimension, obtains the stage-k upsampled concatenation features of each modality through its multi-layer convolution module, and outputs them to the (k-1)th decoding module and the (k-1)th fusion decoding module;
the 1st decoding module upsamples the stage-2 upsampled concatenation features of each modality to obtain the stage-1 upsampled features of each modality, concatenates them with the stage-1 features of each modality along the channel dimension, obtains the stage-1 upsampled concatenation features of each modality through its multi-layer convolution module, and outputs them to the 1st fusion decoding module;
the Kth fusion decoding module upsamples the bottleneck-layer fusion features of each group of preprocessed images to obtain the stage-K upsampled fusion features of each group, concatenates them with the stage-K upsampled concatenation features of each modality along the channel dimension, obtains the stage-K upsampled concatenation fusion features of each group through the Kth inverted bottleneck convolution module, and outputs them to the (K-1)th fusion decoding module;
for k ∈ [2, K-1], the kth fusion decoding module upsamples the stage-(k+1) upsampled concatenation fusion features of each group to obtain the stage-k upsampled fusion features, concatenates them with the stage-k upsampled concatenation features of each modality along the channel dimension, obtains the stage-k upsampled concatenation fusion features of each group through the kth inverted bottleneck convolution module, and outputs them to the (k-1)th fusion decoding module;
the 1st fusion decoding module upsamples the stage-2 upsampled concatenation fusion features of each group to obtain the stage-1 upsampled fusion features, concatenates them with the stage-1 upsampled concatenation features of each modality along the channel dimension, obtains the stage-1 upsampled concatenation fusion features of each group through the 1st inverted bottleneck convolution module, and convolves them to obtain the head and neck tumor prediction probability map of each group of preprocessed images;
The cross-entropy Dice weighted loss function in step 2 is defined as follows:
Loss = α × L_ce + β × L_dice
where Loss denotes the cross-entropy Dice weighted loss function, L_ce the cross-entropy loss function, L_dice the Dice loss function, α the cross-entropy loss weight, and β the Dice loss weight;
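A toy check of the weighted combination Loss = α × L_ce + β × L_dice. The patent does not fix α and β; equal weights of 0.5 are assumed here only for illustration.

```python
# Weighted combination of the two loss terms. alpha = beta = 0.5 is an
# assumed default; the patent leaves the weights as chosen parameters.

def weighted_loss(l_ce: float, l_dice: float,
                  alpha: float = 0.5, beta: float = 0.5) -> float:
    return alpha * l_ce + beta * l_dice

assert abs(weighted_loss(0.4, 0.2) - 0.3) < 1e-12
```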
the cross-entropy loss function is defined as follows:
L_ce = −(1 / (NUM × M × N)) × Σᵢ Σₓ Σᵧ [ g_i,(x,y) × log p_i,(x,y) + (1 − g_i,(x,y)) × log(1 − p_i,(x,y)) ]
where NUM denotes the number of preprocessed images, M the number of rows and N the number of columns of the ith preprocessed image; g_i,(x,y) denotes the ground-truth classification label of the head and neck tumor lesion for the pixel in row x, column y of the ith group of preprocessed images (g_i,(x,y) = 0 for a normal-region pixel, g_i,(x,y) = 1 for a lesion-region pixel); and p_i,(x,y) ∈ [0, 1] denotes the predicted probability, computed by the segmentation method, that the pixel in row x, column y of the head and neck tumor prediction probability map of the ith group of preprocessed images belongs to the lesion-region class;
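In pure Python, the per-pixel binary cross-entropy described above (g the 0/1 ground-truth label, p the predicted lesion probability) can be sketched as follows; averaging over all pixels and clamping p for numerical safety are implementation assumptions, since the patent's formula image is not reproduced here.

```python
import math

# Binary cross-entropy over one small "image", matching the symbols in
# the text: g is the 0/1 ground-truth label per pixel, p the predicted
# lesion probability. The mean over all pixels and the eps clamp are
# standard implementation choices, assumed here.

def cross_entropy(labels, probs, eps=1e-7):
    total, n = 0.0, 0
    for g_row, p_row in zip(labels, probs):
        for g, p in zip(g_row, p_row):
            p = min(max(p, eps), 1 - eps)   # clamp to avoid log(0)
            total += -(g * math.log(p) + (1 - g) * math.log(1 - p))
            n += 1
    return total / n

loss = cross_entropy([[1, 0]], [[0.9, 0.1]])
# both pixels contribute -log(0.9), so the mean equals -log(0.9)
assert abs(loss + math.log(0.9)) < 1e-9
```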
the Dice loss function is defined as follows:
L_dice = 1 − (2 × Σᵢ Σₓ Σᵧ g_i,(x,y) × p_i,(x,y) + δ) / (Σᵢ Σₓ Σᵧ g_i,(x,y) + Σᵢ Σₓ Σᵧ p_i,(x,y) + δ)
where δ ∈ [0, 1] is an adjustable coefficient that keeps the gradient well defined during backpropagation;
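The standard soft-Dice form with smoothing term δ matches the adjustable coefficient described above (δ keeps the ratio, and hence the gradient, defined even when both prediction and label are empty). This is a sketch of the standard formulation, not necessarily the patent's exact expression.

```python
# Soft Dice loss with smoothing coefficient delta over one small image.
# This is the standard soft-Dice form; the patent's exact formula is an
# image in the original document and is assumed to match.

def dice_loss(labels, probs, delta=1.0):
    inter = sum(g * p for g_row, p_row in zip(labels, probs)
                      for g, p in zip(g_row, p_row))
    g_sum = sum(g for row in labels for g in row)
    p_sum = sum(p for row in probs for p in row)
    return 1.0 - (2.0 * inter + delta) / (g_sum + p_sum + delta)

# perfect overlap on a tiny mask gives zero loss
assert dice_loss([[1, 0]], [[1.0, 0.0]]) == 0.0
```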
preferably, in step 3 the pixel range of the head and neck tumor lesion area of the real-time head and neck tumor PET-CT image is obtained in combination with a probability threshold, specifically as follows:
pixels of the head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image whose predicted probability of belonging to the lesion-region class is greater than the probability threshold are screened out as lesion-region pixels, thereby obtaining the pixel range of the head and neck tumor lesion area of the real-time head and neck tumor PET-CT image.
The present invention also provides a computer readable medium storing a computer program to be executed by an electronic device; when run on the electronic device, the computer program performs the steps of the head and neck tumor lesion area image segmentation method.
The invention improves the accuracy of segmentation prediction of the head and neck tumor lesion pixel region. By performing feature fusion jointly in the encoding and decoding stages, a good segmentation effect is achieved even for head and neck tumor lesion regions of different sizes and when the difference between the lesion region and the normal region is not obvious, helping doctors determine the lesion region more accurately.
Drawings
Fig. 1: flow chart of the method of an embodiment of the invention.
Fig. 2: neural network structure diagram of an embodiment of the invention.
Detailed Description
The following clearly and completely describes the embodiments of the present invention with reference to the accompanying drawings. Apparently, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
In particular, the method of the technical solution of the present invention may be implemented by those skilled in the art as an automatic operation flow using computer software technology; a system or apparatus implementing the method, such as a computer readable storage medium storing the corresponding computer program of the technical solution, and a computer device including and running the corresponding computer program, shall also fall within the protection scope of the present invention.
The following describes a technical scheme of an embodiment of the present invention with reference to fig. 1-2, which is a head and neck tumor focus area image segmentation method, specifically as follows:
FIG. 1 is a flow chart of the method of the present invention;
step 1: acquiring a plurality of groups of original head and neck tumor PET-CT images, sequentially performing cropping, normalization and data enhancement on each group of original head and neck tumor PET-CT images to obtain each group of preprocessed images, and marking the true head and neck tumor lesion classification label of each pixel of each group of preprocessed images;
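As a minimal sketch of step 1, the preprocessing chain can be written as a center crop, a z-score normalization, and a flip-based data enhancement. The crop size (144, 144) matches the M = N = 144 used later in the loss definitions; the horizontal flip, the crop strategy and the normalization choice are illustrative assumptions, since the patent does not fix these details.

```python
import numpy as np

def preprocess_pair(pet, ct, crop_size=(144, 144), flip=False):
    """Sketch of step 1: center-crop, z-score normalize, and optionally
    flip a PET/CT slice pair (all choices here are assumptions)."""
    def center_crop(img, size):
        h, w = img.shape
        th, tw = size
        top, left = (h - th) // 2, (w - tw) // 2
        return img[top:top + th, left:left + tw]

    def z_score(img):
        # normalize to zero mean, unit variance (one common choice)
        return (img - img.mean()) / (img.std() + 1e-8)

    pet, ct = center_crop(pet, crop_size), center_crop(ct, crop_size)
    pet, ct = z_score(pet), z_score(ct)
    if flip:  # horizontal flip stands in for the data enhancement step
        pet, ct = pet[:, ::-1], ct[:, ::-1]
    return pet, ct

pet = np.random.rand(160, 160).astype(np.float32)
ct = np.random.rand(160, 160).astype(np.float32)
p, c = preprocess_pair(pet, ct, flip=True)
print(p.shape, c.shape)  # (144, 144) (144, 144)
```

In practice the same crop and normalization would be applied to every slice of a group, while the enhancement is applied only during training.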
step 2: constructing a focus image segmentation network, inputting each group of preprocessed images into the focus image segmentation network for lesion segmentation prediction to obtain the head and neck tumor prediction probability map of each group of preprocessed images, constructing a weighted cross entropy and Dice loss function by combining the true head and neck tumor lesion classification labels of the pixels of each group of preprocessed images, and obtaining a trained focus image segmentation network through optimization training with a stochastic gradient descent algorithm;
As shown in fig. 2, a neural network structure diagram of the present invention;
the lesion image segmentation network in step 2 comprises a single-mode encoding network, a fusion encoding network, a single-mode decoding network and a fusion decoding network; wherein the single-mode encoding network comprises:
the 1st coding module, the 2nd coding module, ..., the Kth coding module, and a bottleneck module;
the 1st coding module, the 2nd coding module, ..., the Kth coding module and the bottleneck module are sequentially cascaded;
the kth coding module is formed by sequentially cascading a multi-layer convolution module and a downsampling layer, k ∈ [1, K];
the bottleneck module is composed of a multi-layer convolution module;
the fusion encoding network comprises:
the 1st fusion coding module, the 2nd fusion coding module, ..., the Kth fusion coding module, and a bottleneck module;
the 1st fusion coding module, the 2nd fusion coding module, ..., the Kth fusion coding module and the bottleneck module are sequentially cascaded;
the kth fusion coding module is formed by sequentially cascading a multi-layer convolution module, an attention module and a downsampling layer, k ∈ [1, K];
the bottleneck module is composed of a multi-layer convolution module;
the single-mode decoding network comprises:
the 1st decoding module, the 2nd decoding module, ..., the Kth decoding module;
the 1st decoding module, the 2nd decoding module, ..., the Kth decoding module are sequentially cascaded;
the kth decoding module is formed by sequentially cascading an up-sampling layer, a splicing layer and a multi-layer convolution module, k ∈ [1, K];
the fusion decoding network comprises:
the 1st fusion decoding module, the 2nd fusion decoding module, ..., the Kth fusion decoding module;
the 1st fusion decoding module, the 2nd fusion decoding module, ..., the Kth fusion decoding module are sequentially cascaded;
the kth fusion decoding module is formed by sequentially cascading an up-sampling layer, a splicing layer and an inverted bottleneck convolution module, k ∈ [1, K];
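The cascade of K coding modules above can be traced as a simple shape-propagation sketch: each stage keeps the spatial size through its convolutions and then halves height and width in the downsampling layer, with the bottleneck after stage K. The starting channel count and the channel-doubling scheme are illustrative assumptions, not values given in the patent.

```python
def encoder_shapes(h, w, channels, num_stages):
    """Hypothetical sketch: trace feature-map sizes (stage, C, H, W)
    through K cascaded coding modules plus the bottleneck module."""
    shapes = []
    c = channels
    for k in range(1, num_stages + 1):
        shapes.append((k, c, h, w))   # k-th stage features, before downsampling
        h, w = h // 2, w // 2         # k-th downsampling layer halves H and W
        c *= 2                        # assumed channel-doubling per stage
    shapes.append(("bottleneck", c, h, w))
    return shapes

# e.g. K = 4 stages on a 144x144 preprocessed image, 32 starting channels
for s in encoder_shapes(144, 144, 32, 4):
    print(s)
```

The single-mode decoding network mirrors this cascade in reverse, with each up-sampling layer restoring the spatial size of the corresponding encoder stage so that channel-dimension splicing is possible.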
the 1st coding module is used for passing each mode of each group of preprocessed images through a multi-layer convolution module to obtain the 1st-stage features of each mode of each group of preprocessed images, and outputting the 1st-stage features of each mode of each group of preprocessed images to the 1st downsampling layer, the 1st fusion coding module and the 1st decoding module respectively;
after the 1st-stage features of each mode of each group of preprocessed images are output to the 1st downsampling layer, the 1st-stage downsampling features of each mode of each group of preprocessed images are obtained and output to the 2nd coding module;
if k ∈ [2, K-1], the kth coding module performs feature extraction on the (k-1)th-stage downsampling features of each mode of each group of preprocessed images through a multi-layer convolution module to obtain the kth-stage features of each mode of each group of preprocessed images, and outputs the kth-stage features of each mode of each group of preprocessed images to the kth downsampling layer, the kth fusion coding module and the kth decoding module respectively;
after the kth-stage features of each mode of each group of preprocessed images are output to the kth downsampling layer, the kth-stage downsampling features of each mode of each group of preprocessed images are obtained and output to the (k+1)th coding module;
the Kth coding module performs feature extraction on the (K-1)th-stage downsampling features of each mode of each group of preprocessed images through a multi-layer convolution module to obtain the Kth-stage features of each mode of each group of preprocessed images, and outputs the Kth-stage features of each mode of each group of preprocessed images to the Kth downsampling layer, the Kth fusion coding module and the Kth decoding module respectively;
after the Kth-stage features of each mode of each group of preprocessed images are output to the Kth downsampling layer, the Kth-stage downsampling features of each mode of each group of preprocessed images are obtained and output to the bottleneck module;
the bottleneck module performs feature extraction on the Kth-stage downsampling features of each mode of each group of preprocessed images through a multi-layer convolution module to obtain the bottleneck layer features of each mode of each group of preprocessed images, and outputs the bottleneck layer features of each mode of each group of preprocessed images to the Kth decoding module;
the 1st fusion coding module is used for splicing the modes of each group of preprocessed images in the channel direction and obtaining the 1st-stage features of each group of preprocessed images through a multi-layer convolution module, inputting the 1st-stage features of each group of preprocessed images and the 1st-stage features of each mode of each group of preprocessed images into the 1st attention module to obtain the 1st-stage fusion features of each group of preprocessed images, outputting the 1st-stage fusion features of each group of preprocessed images to the 1st downsampling layer to obtain the 1st-stage downsampling fusion features of each group of preprocessed images, and outputting the 1st-stage downsampling fusion features of each group of preprocessed images to the 2nd fusion coding module;
if k ∈ [2, K-1], the kth fusion coding module is used for obtaining the kth-stage features of each group of preprocessed images through a multi-layer convolution module, inputting the kth-stage features of each group of preprocessed images and the kth-stage features of each mode of each group of preprocessed images into the kth attention module to obtain the kth-stage fusion features of each group of preprocessed images, outputting the kth-stage fusion features of each group of preprocessed images to the kth downsampling layer to obtain the kth-stage downsampling fusion features of each group of preprocessed images, and outputting the kth-stage downsampling fusion features of each group of preprocessed images to the (k+1)th fusion coding module;
the Kth fusion coding module is used for obtaining the Kth-stage features of each group of preprocessed images through a multi-layer convolution module, inputting the Kth-stage features of each group of preprocessed images and the Kth-stage features of each mode of each group of preprocessed images into the Kth attention module to obtain the Kth-stage fusion features of each group of preprocessed images, outputting the Kth-stage fusion features of each group of preprocessed images to the Kth downsampling layer to obtain the Kth-stage downsampling fusion features of each group of preprocessed images, and outputting the Kth-stage downsampling fusion features of each group of preprocessed images to the bottleneck module;
the bottleneck module performs feature extraction on the Kth-stage downsampling fusion features of each group of preprocessed images through a multi-layer convolution module to obtain the bottleneck layer fusion features of each group of preprocessed images, and outputs the bottleneck layer fusion features of each group of preprocessed images to the Kth fusion decoding module;
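The patent states only that each fusion coding module feeds the fused stage features and the per-modality stage features into an attention module; it does not specify the attention mechanism. The following is a purely hypothetical sketch under the assumption of a simple channel-attention scheme: each modality is scored by global average pooling, the scores are softmax-normalized across modalities, and the reweighted modality features are added to the fused features.

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_with_attention(fused_feat, modal_feats):
    """Hypothetical k-th attention module (all details are assumptions).
    fused_feat: (C, H, W); modal_feats: list of (C, H, W) arrays."""
    stack = np.stack(modal_feats)            # (num_modalities, C, H, W)
    scores = stack.mean(axis=(2, 3))         # per-modality, per-channel pooled scores
    weights = softmax(scores, axis=0)[..., None, None]
    return fused_feat + (weights * stack).sum(axis=0)

pet_feat = np.random.rand(8, 16, 16)   # toy PET-branch stage features
ct_feat = np.random.rand(8, 16, 16)    # toy CT-branch stage features
fused = np.random.rand(8, 16, 16)      # toy fused stage features
out = fuse_with_attention(fused, [pet_feat, ct_feat])
print(out.shape)  # (8, 16, 16)
```

Whatever the actual mechanism, the key constraint from the text is that the output keeps the fused feature's shape so it can enter the kth downsampling layer.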
the Kth decoding module up-samples the bottleneck layer features of each mode of each group of preprocessed images to obtain the Kth-stage up-sampling features of each mode of each group of preprocessed images, performs channel-dimension splicing of the Kth-stage up-sampling features of each mode of each group of preprocessed images with the Kth-stage features of each mode of each group of preprocessed images, obtains the Kth-stage up-sampling splicing features of each mode of each group of preprocessed images through a multi-layer convolution module, and outputs the Kth-stage up-sampling splicing features of each mode of each group of preprocessed images to the (K-1)th decoding module and the (K-1)th fusion decoding module respectively;
if k ∈ [2, K-1], the kth decoding module up-samples the (k+1)th-stage up-sampling splicing features of each mode of each group of preprocessed images to obtain the kth-stage up-sampling features of each mode of each group of preprocessed images, performs channel-dimension splicing of the kth-stage up-sampling features of each mode of each group of preprocessed images with the kth-stage features of each mode of each group of preprocessed images, obtains the kth-stage up-sampling splicing features of each mode of each group of preprocessed images through a multi-layer convolution module, and outputs the kth-stage up-sampling splicing features of each mode of each group of preprocessed images to the (k-1)th decoding module and the (k-1)th fusion decoding module respectively;
the 1st decoding module up-samples the 2nd-stage up-sampling splicing features of each mode of each group of preprocessed images to obtain the 1st-stage up-sampling features of each mode of each group of preprocessed images, performs channel-dimension splicing of the 1st-stage up-sampling features of each mode of each group of preprocessed images with the 1st-stage features of each mode of each group of preprocessed images, obtains the 1st-stage up-sampling splicing features of each mode of each group of preprocessed images through a multi-layer convolution module, and outputs the 1st-stage up-sampling splicing features of each mode of each group of preprocessed images to the 1st fusion decoding module;
the Kth fusion decoding module up-samples the bottleneck layer fusion features of each group of preprocessed images to obtain the Kth-stage up-sampling fusion features of each group of preprocessed images, performs channel-dimension splicing of the Kth-stage up-sampling fusion features of each group of preprocessed images with the Kth-stage up-sampling splicing features of each mode of each group of preprocessed images, obtains the Kth-stage up-sampling splicing fusion features of each group of preprocessed images through the Kth inverted bottleneck convolution module, and outputs the Kth-stage up-sampling splicing fusion features of each group of preprocessed images to the (K-1)th fusion decoding module;
if k ∈ [2, K-1], the kth fusion decoding module up-samples the (k+1)th-stage up-sampling splicing fusion features of each group of preprocessed images to obtain the kth-stage up-sampling fusion features of each group of preprocessed images, performs channel-dimension splicing of the kth-stage up-sampling fusion features of each group of preprocessed images with the kth-stage up-sampling splicing features of each mode of each group of preprocessed images, obtains the kth-stage up-sampling splicing fusion features of each group of preprocessed images through the kth inverted bottleneck convolution module, and outputs the kth-stage up-sampling splicing fusion features of each group of preprocessed images to the (k-1)th fusion decoding module;
the 1st fusion decoding module up-samples the 2nd-stage up-sampling splicing fusion features of each group of preprocessed images to obtain the 1st-stage up-sampling fusion features of each group of preprocessed images, performs channel-dimension splicing of the 1st-stage up-sampling fusion features of each group of preprocessed images with the 1st-stage up-sampling splicing features of each mode of each group of preprocessed images, obtains the 1st-stage up-sampling splicing fusion features of each group of preprocessed images through the 1st inverted bottleneck convolution module, and convolves the 1st-stage up-sampling splicing fusion features of each group of preprocessed images to obtain the head and neck tumor prediction probability map of each group of preprocessed images;
the weighted cross entropy and Dice loss function in step 2 is defined as follows:
Loss = α × L_ce + β × L_dice
where Loss represents the weighted loss function, L_ce represents the cross entropy loss function, L_dice represents the Dice loss function, and α = 0.5, β = 0.5;
the cross entropy loss function is defined as follows:
L_ce = −(1 / (NUM × M × N)) × Σ_{i=1..NUM} Σ_{x=1..M} Σ_{y=1..N} [ g_i,(x,y) × ln p_i,(x,y) + (1 − g_i,(x,y)) × ln(1 − p_i,(x,y)) ]
where NUM = 450, M = 144, N = 144; g_i,(x,y) represents the true head and neck tumor lesion classification label of the pixel at row x, column y of the i-th group of preprocessed images: if g_i,(x,y) = 0 the pixel is a normal-region pixel, and if g_i,(x,y) = 1 it is a lesion-region pixel; p_i,(x,y) ∈ [0, 1] represents the predicted probability, computed by the segmentation method, that the pixel at row x, column y of the head and neck tumor prediction probability map of the i-th group of preprocessed images belongs to the lesion region;
the Dice loss function is defined as follows:
L_dice = 1 − (2 × Σ_i Σ_x Σ_y g_i,(x,y) × p_i,(x,y) + δ) / (Σ_i Σ_x Σ_y g_i,(x,y) + Σ_i Σ_x Σ_y p_i,(x,y) + δ)
where δ ∈ [0, 1] and δ = 0.00005;
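The weighted loss of step 2, with the embodiment's values α = β = 0.5 and δ = 0.00005, can be sketched directly in NumPy. The clipping constant `eps` is an implementation detail added here to keep the logarithm finite; everything else follows the definitions above.

```python
import numpy as np

def weighted_ce_dice_loss(p, g, alpha=0.5, beta=0.5, delta=5e-5):
    """Sketch of Loss = alpha * L_ce + beta * L_dice.
    p: predicted lesion probabilities p_{i,(x,y)} in [0, 1], shape (NUM, M, N).
    g: true labels g_{i,(x,y)} in {0, 1}, same shape."""
    eps = 1e-7  # numerical guard for log (an assumption, not from the patent)
    p = np.clip(p, eps, 1.0 - eps)
    # cross entropy averaged over all NUM * M * N pixels
    l_ce = -np.mean(g * np.log(p) + (1.0 - g) * np.log(1.0 - p))
    # Dice loss with smoothing coefficient delta
    l_dice = 1.0 - (2.0 * np.sum(g * p) + delta) / (np.sum(g) + np.sum(p) + delta)
    return alpha * l_ce + beta * l_dice

g = np.zeros((2, 4, 4)); g[:, 1:3, 1:3] = 1.0   # toy ground-truth lesion masks
perfect = weighted_ce_dice_loss(g.copy(), g)     # prediction equal to the label
print(float(perfect))
```

A prediction equal to the label drives both terms to (nearly) zero, while an inverted prediction drives the cross entropy term toward −ln(eps), which is what makes the weighted sum usable as a training signal.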
step 3: performing prediction segmentation on the head and neck tumor PET-CT image acquired in real time through the trained focus image segmentation network to obtain the head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image, and obtaining the pixel range of the head and neck tumor lesion region of the real-time head and neck tumor PET-CT image by probability threshold judgment;
the pixel range of the head and neck tumor lesion region of the real-time head and neck tumor PET-CT image in step 3 is obtained by probability threshold judgment as follows:
pixels of the head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image whose predicted probability of belonging to the head and neck tumor lesion region is greater than the probability threshold are screened out as lesion-region pixels, thereby obtaining the pixel range of the head and neck tumor lesion region of the real-time head and neck tumor PET-CT image;
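The screening in step 3 is a per-pixel threshold test on the prediction probability map. A minimal sketch, assuming a threshold of 0.5 (the patent leaves the actual threshold value unspecified):

```python
import numpy as np

def lesion_pixel_range(prob_map, threshold=0.5):
    """Step 3 sketch: keep pixels whose predicted lesion probability
    exceeds the probability threshold. Returns a boolean lesion mask
    and the (row, col) coordinates of the lesion-region pixels."""
    mask = prob_map > threshold
    coords = np.argwhere(mask)   # row-major list of lesion pixel positions
    return mask, coords

prob = np.array([[0.1, 0.9],
                 [0.6, 0.2]])
mask, coords = lesion_pixel_range(prob, threshold=0.5)
print(mask)
print(coords.tolist())  # [[0, 1], [1, 0]]
```

The coordinate list is exactly the "pixel range" of the lesion region for the given image.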
the segmentation results of the methods obtained by taking the dess coefficient as a measurement index are as follows:
Ours | UNet | Att-UNet | UNETR | SwinUNet | |
tumor major region Dice | 0.7931 | 0.7868 | 0.7639 | 0.7967 | 0.7847 |
Lymph node area Dice | 0.7384 | 0.7072 | 0.6910 | 0.6932 | 0.6891 |
Average Dice | 0.7657 | 0.7474 | 0.7275 | 0.7450 | 0.7369 |
The Dice coefficient is defined as follows:
Dice = 2 × |X ∩ Y| / (|X| + |Y|)
where X denotes the set of predicted lesion-region pixels and Y denotes the set of true lesion-region pixels.
From the experimental results, when predicting the head and neck tumor lesion region of head and neck tumor PET-CT images, the average Dice coefficient of the present method is significantly higher than those of the other methods. The Dice coefficient of the tumor main region is higher than those of UNet, Att-UNet and SwinUNet, and only slightly lower than that of UNETR (by 0.0036); the Dice coefficient of the lymph node region is significantly higher than those of UNet (by 0.0312), Att-UNet (by 0.0474), UNETR (by 0.0452) and SwinUNet (by 0.0493). Because the method performs feature fusion in both the encoding stage and the decoding stage, the network's ability to recognize the lymph node region in head and neck tumor PET-CT images is effectively improved, which in turn improves the segmentation of the whole head and neck tumor lesion region.
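The Dice coefficient reported in the table above compares a predicted binary lesion mask with the ground-truth mask; a small sketch:

```python
import numpy as np

def dice_coefficient(pred_mask, true_mask):
    """Dice = 2|X ∩ Y| / (|X| + |Y|), where X is the set of predicted
    lesion-region pixels and Y the set of true lesion-region pixels."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    denom = pred.sum() + true.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return 2.0 * np.logical_and(pred, true).sum() / denom

a = np.array([[1, 1, 0],
              [0, 1, 0]])   # predicted lesion mask
b = np.array([[1, 0, 0],
              [0, 1, 1]])   # ground-truth lesion mask
print(dice_coefficient(a, b))  # 2*2 / (3+3) = 0.666...
```

The per-region scores in the table would be obtained by computing this coefficient separately on the tumor main region masks and on the lymph node region masks.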
Particular embodiments of the present invention also provide a computer readable medium.
The computer readable medium is a server workstation;
the server workstation stores a computer program to be executed by an electronic device; when run on the electronic device, the computer program causes the electronic device to execute the steps of the head and neck tumor lesion area image segmentation method of the embodiments of the invention.
It should be understood that the parts of the specification not described in detail herein belong to the prior art.
It should be understood that the foregoing description of the preferred embodiments is not intended to limit the scope of patent protection of the invention, which is defined by the appended claims; those skilled in the art may make substitutions or modifications without departing from the scope of the claims, and such substitutions and modifications shall all fall within the protection scope of the invention.
Claims (8)
1. A head and neck tumor focus area image segmentation method is characterized in that:
constructing a focus image segmentation network, inputting each group of preprocessed images into the focus image segmentation network for prediction, constructing a weighted cross entropy and Dice loss function, and performing optimization training with a stochastic gradient descent algorithm to obtain a trained focus image segmentation network;
and carrying out prediction segmentation through a trained focus image segmentation network to obtain a head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image, and further combining with probability threshold judgment to obtain the head and neck tumor focus region pixel range of the real-time head and neck tumor PET-CT image.
2. The method for segmenting an image of a focal region of a head and neck tumor according to claim 1, comprising the steps of:
Step 1: acquiring a plurality of groups of original head and neck tumor PET-CT images, sequentially performing cropping, normalization and data enhancement on each group of original head and neck tumor PET-CT images to obtain each group of preprocessed images, and marking the true head and neck tumor lesion classification label of each pixel of each group of preprocessed images;
step 2: constructing a focus image segmentation network, inputting each group of preprocessed images into the focus image segmentation network for lesion segmentation prediction to obtain the head and neck tumor prediction probability map of each group of preprocessed images, constructing a weighted cross entropy and Dice loss function by combining the true head and neck tumor lesion classification labels of the pixels of each group of preprocessed images, and obtaining a trained focus image segmentation network through optimization training with a stochastic gradient descent algorithm;
step 3: and carrying out prediction segmentation on the head and neck tumor PET-CT image acquired in real time through a trained focus image segmentation network to obtain a head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image, and judging by combining a probability threshold value to obtain the head and neck tumor focus region pixel range of the real-time head and neck tumor PET-CT image.
3. The head and neck tumor lesion image segmentation method according to claim 2, wherein:
The lesion image segmentation network in step 2 comprises:
a single-mode encoding network, a fusion encoding network, a single-mode decoding network, and a fusion decoding network;
the single-mode coding network performs feature extraction processing on each group of preprocessed images to obtain intermediate feature representation and final feature representation of each group of preprocessed images, outputs the intermediate feature representation of each group of preprocessed images to the fusion coding network, and outputs the final feature representation of each group of preprocessed images to the single-mode decoding network;
the fusion coding network performs feature fusion processing on the intermediate feature representation of each group of preprocessed images to obtain fusion coding features of each group of preprocessed images, and outputs the fusion coding features of each group of preprocessed images to the fusion decoding network;
the single-mode decoding network is used for decoding the final characteristic representation of each group of preprocessed images to obtain the decoding characteristic representation of each group of preprocessed images, and outputting the decoding characteristic representation of each group of preprocessed images to the fusion decoding network;
and the fusion decoding network performs fusion decoding processing on the decoding characteristic representation of each group of preprocessed images to finally obtain the predicted focus region segmentation image of each preprocessed image.
4. The method for segmenting an image of a focal region of a head and neck tumor according to claim 3, wherein:
the single-mode encoding network comprises:
the 1st coding module, the 2nd coding module, ..., the Kth coding module, and a bottleneck module;
the 1st coding module, the 2nd coding module, ..., the Kth coding module and the bottleneck module are sequentially cascaded;
the kth coding module is formed by sequentially cascading a multi-layer convolution module and a downsampling layer, k ∈ [1, K];
the bottleneck module is composed of a multi-layer convolution module;
the fusion encoding network comprises:
the 1st fusion coding module, the 2nd fusion coding module, ..., the Kth fusion coding module, and a bottleneck module;
the 1st fusion coding module, the 2nd fusion coding module, ..., the Kth fusion coding module and the bottleneck module are sequentially cascaded;
the kth fusion coding module is formed by sequentially cascading a multi-layer convolution module, an attention module and a downsampling layer, k ∈ [1, K];
the bottleneck module is composed of a multi-layer convolution module;
the single-mode decoding network comprises:
the 1st decoding module, the 2nd decoding module, ..., the Kth decoding module;
the 1st decoding module, the 2nd decoding module, ..., the Kth decoding module are sequentially cascaded;
the kth decoding module is formed by sequentially cascading an up-sampling layer, a splicing layer and a multi-layer convolution module, k ∈ [1, K];
the fusion decoding network comprises:
the 1st fusion decoding module, the 2nd fusion decoding module, ..., the Kth fusion decoding module;
the 1st fusion decoding module, the 2nd fusion decoding module, ..., the Kth fusion decoding module are sequentially cascaded;
the kth fusion decoding module is formed by sequentially cascading an up-sampling layer, a splicing layer and an inverted bottleneck convolution module, k ∈ [1, K].
5. The method for segmenting an image of a focal region of a head and neck tumor according to claim 4, wherein:
the 1 st coding module is used for respectively enabling each mode of each group of preprocessed images to pass through the multi-layer convolution module to obtain 1 st phase characteristics of each mode of each group of preprocessed images, and respectively outputting the 1 st phase characteristics of each mode of each group of preprocessed images to the 1 st downsampling layer, the 1 st fusion coding module and the 1 st decoding module;
after the 1 st stage feature of each mode of each group of preprocessed images is output to the 1 st downsampling layer, the 1 st stage downsampling feature of each mode of each group of preprocessed images is obtained, and the 1 st stage downsampling feature of each mode of each group of preprocessed images is output to the 2 nd coding module;
If k is [2,K-1], the kth coding module performs feature extraction on the downsampling features of the kth-1 stage of each mode of each group of preprocessed images through a multi-layer convolution module to obtain the kth stage features of each mode of each group of preprocessed images, and outputs the kth stage features of each mode of each group of preprocessed images to the kth downsampling layer, the kth fusion coding module and the kth decoding module respectively;
after the characteristics of the kth stage of each mode of each group of preprocessed images are output to the kth downsampling layer, downsampling characteristics of the kth stage of each mode of each group of preprocessed images are obtained, and the downsampling characteristics of the kth stage of each mode of each group of preprocessed images are output to the kth+1th coding module;
the Kth coding module extracts the down-sampling characteristics of each mode Kth stage of each group of preprocessed images through a multi-layer convolution module to obtain each mode Kth stage characteristic of each group of preprocessed images, and outputs each mode Kth stage characteristic of each group of preprocessed images to the Kth down-sampling layer, the Kth fusion coding module and the Kth decoding module respectively;
after the characteristics of the K stage of each mode of each group of preprocessed images are output to the K downsampling layer, the downsampling characteristics of the K stage of each mode of each group of preprocessed images are obtained, and the downsampling characteristics of the K stage of each mode of each group of preprocessed images are output to the bottleneck module;
The bottleneck module is used for extracting the downsampling characteristics of each mode K stage of each group of preprocessed images through the multi-layer convolution module to obtain each mode bottleneck layer characteristic of each group of preprocessed images, and outputting each mode bottleneck layer characteristic of each group of preprocessed images to the K decoding module;
the 1 st fusion coding module is used for splicing all modes of each group of preprocessed images in the channel direction, obtaining 1 st stage characteristics of each group of preprocessed images through the multi-layer convolution module, inputting the 1 st stage characteristics of each group of preprocessed images and the 1 st stage characteristics of all modes of each group of preprocessed images into the 1 st attention module, obtaining 1 st stage fusion characteristics of each group of preprocessed images, outputting the 1 st stage fusion characteristics of each group of preprocessed images to the 1 st downsampling layer, obtaining 1 st stage downsampling fusion characteristics of each group of preprocessed images, and outputting the 1 st stage downsampling fusion characteristics of each group of preprocessed images to the 2 nd fusion coding network;
if k epsilon [2,K-1], the kth fusion coding module is used for obtaining the kth phase characteristic of each group of preprocessed images through a multi-layer convolution module, inputting the kth phase characteristic of each group of preprocessed images and the kth phase characteristic of each mode of each group of preprocessed images into the kth attention module to obtain the kth phase fusion characteristic of each group of preprocessed images, outputting the kth phase fusion characteristic of each group of preprocessed images to a kth downsampling layer to obtain the downsampling fusion characteristic of the kth phase of each group of preprocessed images, and outputting the downsampling fusion characteristic of the kth phase of each group of preprocessed images to the kth+1th fusion coding network;
The Kth fusion coding module is used for obtaining the Kth phase characteristics of each group of preprocessed images through the multi-layer convolution module, inputting the Kth phase characteristics of each group of preprocessed images and the Kth phase characteristics of each mode of each group of preprocessed images to the Kth attention module to obtain the Kth phase fusion characteristics of each group of preprocessed images, outputting the Kth phase fusion characteristics of each group of preprocessed images to the Kth downsampling layer to obtain the downsampling fusion characteristics of the Kth phase of each group of preprocessed images, and outputting the downsampling fusion characteristics of the Kth phase of each group of preprocessed images to the bottleneck module;
the bottleneck module is used for extracting the downsampling fusion characteristics of the K-th stage of each group of preprocessed images through the multi-layer convolution module to obtain bottleneck layer fusion characteristics of each group of preprocessed images, and outputting the bottleneck layer fusion characteristics of each group of preprocessed images to the K-th fusion decoding module;
the Kth decoding module is used for upsampling the bottleneck-layer features of each modality of each group of preprocessed images to obtain the Kth-stage upsampled features of each modality of each group of preprocessed images, performing channel-dimension concatenation of the Kth-stage upsampled features of each modality of each group of preprocessed images with the Kth-stage features of each modality of each group of preprocessed images, obtaining the Kth-stage upsampled concatenated features of each modality of each group of preprocessed images through the multi-layer convolution module, and outputting the Kth-stage upsampled concatenated features of each modality of each group of preprocessed images to the (K-1)th decoding module and the Kth fusion decoding module respectively;
if k ∈ [2, K-1], the kth decoding module upsamples the (k+1)th-stage upsampled concatenated features of each modality of each group of preprocessed images to obtain the kth-stage upsampled features of each modality of each group of preprocessed images, performs channel-dimension concatenation of the kth-stage upsampled features of each modality of each group of preprocessed images with the kth-stage features of each modality of each group of preprocessed images, obtains the kth-stage upsampled concatenated features of each modality of each group of preprocessed images through the multi-layer convolution module, and outputs the kth-stage upsampled concatenated features of each modality of each group of preprocessed images to the (k-1)th decoding module and the kth fusion decoding module respectively;
the 1st decoding module upsamples the 2nd-stage upsampled concatenated features of each modality of each group of preprocessed images to obtain the 1st-stage upsampled features of each modality of each group of preprocessed images, performs channel-dimension concatenation of the 1st-stage upsampled features of each modality of each group of preprocessed images with the 1st-stage features of each modality of each group of preprocessed images, obtains the 1st-stage upsampled concatenated features of each modality of each group of preprocessed images through the multi-layer convolution module, and outputs the 1st-stage upsampled concatenated features of each modality of each group of preprocessed images to the 1st fusion decoding module;
the Kth fusion decoding module upsamples the bottleneck-layer fusion features of each group of preprocessed images to obtain the Kth-stage upsampled fusion features of each group of preprocessed images, performs channel-dimension concatenation of the Kth-stage upsampled fusion features of each group of preprocessed images with the Kth-stage upsampled concatenated features of each modality of each group of preprocessed images, obtains the Kth-stage upsampled concatenated fusion features of each group of preprocessed images through the Kth inverted bottleneck convolution module, and outputs the Kth-stage upsampled concatenated fusion features of each group of preprocessed images to the (K-1)th fusion decoding module;
if k ∈ [2, K-1], the kth fusion decoding module upsamples the (k+1)th-stage upsampled concatenated fusion features of each group of preprocessed images to obtain the kth-stage upsampled fusion features of each group of preprocessed images, performs channel-dimension concatenation of the kth-stage upsampled fusion features of each group of preprocessed images with the kth-stage upsampled concatenated features of each modality of each group of preprocessed images, obtains the kth-stage upsampled concatenated fusion features of each group of preprocessed images through the kth inverted bottleneck convolution module, and outputs the kth-stage upsampled concatenated fusion features of each group of preprocessed images to the (k-1)th fusion decoding module;
the 1st fusion decoding module upsamples the 2nd-stage upsampled concatenated fusion features of each group of preprocessed images to obtain the 1st-stage upsampled fusion features of each group of preprocessed images, performs channel-dimension concatenation of the 1st-stage upsampled fusion features of each group of preprocessed images with the 1st-stage upsampled concatenated features of each modality of each group of preprocessed images, obtains the 1st-stage upsampled concatenated fusion features of each group of preprocessed images through the 1st inverted bottleneck convolution module, and convolves the 1st-stage upsampled concatenated fusion features of each group of preprocessed images to obtain the head and neck tumor prediction probability map of each group of preprocessed images.
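As an illustrative sketch only (not the patented implementation), one fusion encoding stage can be pictured as: the per-modality features and the fused-path feature enter an attention module, and the fused result is downsampled before being passed to the next stage. The `attention_fuse` weighting below is a hypothetical stand-in for the patent's attention module, and the 2x2 max pooling stands in for the downsampling layer:

```python
import numpy as np

def attention_fuse(fused, modal_feats):
    """Hypothetical attention fusion: weight each modality's feature map
    by a softmax over its global average response, then add the weighted
    sum to the fused path. Stands in for the kth attention module."""
    scores = np.array([m.mean() for m in modal_feats])
    w = np.exp(scores - scores.max())
    w = w / w.sum()
    return fused + sum(wi * m for wi, m in zip(w, modal_feats))

def downsample(x):
    """2x2 max pooling over the spatial dims of a (C, H, W) tensor,
    standing in for the kth downsampling layer (H, W assumed even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

# one encoder stage: fuse the CT and PET features into the fused path,
# then downsample the fusion result for the next stage
fused_k = np.zeros((8, 16, 16))           # fused-path feature at stage k
ct_k, pet_k = np.ones((8, 16, 16)), np.ones((8, 16, 16))
down_k = downsample(attention_fuse(fused_k, [ct_k, pet_k]))
```

After this stage, `down_k` has half the spatial resolution and is handed to the (k+1)th fusion coding module.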
6. The method for segmenting an image of a focal region of a head and neck tumor according to claim 5, wherein:
the cross entropy and Dice weighted loss function in step 2 is defined as follows:
Loss = α × L_ce + β × L_dice

where Loss represents the cross entropy and Dice weighted loss function, L_ce represents the cross entropy loss function, L_dice represents the Dice loss function, α represents the cross entropy loss weight, and β represents the Dice loss weight;
the cross entropy loss function is defined as follows:
L_ce = -(1 / (NUM × M × N)) × Σ_{i=1}^{NUM} Σ_{x=1}^{M} Σ_{y=1}^{N} [g_{i,(x,y)} × log(p_{i,(x,y)}) + (1 - g_{i,(x,y)}) × log(1 - p_{i,(x,y)})]

where NUM represents the number of groups of preprocessed images, M represents the number of rows of the i-th group of preprocessed images, N represents the number of columns of the i-th group of preprocessed images, g_{i,(x,y)} represents the true head and neck tumor focus classification label of the pixel in row x, column y of the i-th group of preprocessed images: if g_{i,(x,y)} = 0 the pixel is a normal-region pixel, and if g_{i,(x,y)} = 1 the pixel is a focus-region pixel; p_{i,(x,y)} ∈ [0, 1], where p_{i,(x,y)} represents the predicted probability, computed by the segmentation method, that the pixel in row x, column y of the head and neck tumor prediction probability map of the i-th group of preprocessed images belongs to the head and neck tumor focus region;
the Dice loss function is defined as follows:
L_dice = 1 - (2 × Σ_{i=1}^{NUM} Σ_{x=1}^{M} Σ_{y=1}^{N} g_{i,(x,y)} × p_{i,(x,y)} + δ) / (Σ_{i=1}^{NUM} Σ_{x=1}^{M} Σ_{y=1}^{N} g_{i,(x,y)} + Σ_{i=1}^{NUM} Σ_{x=1}^{M} Σ_{y=1}^{N} p_{i,(x,y)} + δ)

where δ ∈ [0, 1], and δ represents an adjustable coefficient for gradient propagation (a smoothing term that keeps the loss and its gradient well defined when both masks are empty).
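The weighted loss of claim 6 can be sketched in NumPy as follows. This is a minimal illustration assuming binary labels and the standard forms of the two losses; the `alpha`, `beta`, and `delta` defaults are placeholders, not values fixed by the patent:

```python
import numpy as np

def cross_entropy_loss(g, p, eps=1e-7):
    """Binary cross entropy averaged over all groups and pixels.
    g, p: arrays of shape (NUM, M, N); g holds labels in {0, 1},
    p holds predicted probabilities in [0, 1]."""
    p = np.clip(p, eps, 1 - eps)  # avoid log(0)
    return -np.mean(g * np.log(p) + (1 - g) * np.log(1 - p))

def dice_loss(g, p, delta=1.0):
    """Dice loss with smoothing coefficient delta, which keeps the
    ratio (and its gradient) defined when both masks are empty."""
    inter = (g * p).sum()
    return 1.0 - (2.0 * inter + delta) / (g.sum() + p.sum() + delta)

def weighted_loss(g, p, alpha=0.5, beta=0.5, delta=1.0):
    """Loss = alpha * L_ce + beta * L_dice."""
    return alpha * cross_entropy_loss(g, p) + beta * dice_loss(g, p, delta)
```

For a perfect prediction (g equal to p with hard labels), the Dice term is exactly zero and the cross entropy term is negligibly small, so the weighted loss approaches zero.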
7. The method for segmenting the image of the focal region of the head and neck tumor according to claim 6, wherein:
in step 3, the head and neck tumor focus region pixel range of the real-time head and neck tumor PET-CT image is obtained by judging against the probability threshold, as follows:
pixels in the head and neck tumor prediction probability map of the real-time head and neck tumor PET-CT image whose predicted probability of belonging to the head and neck tumor focus region is greater than the probability threshold are screened out as focus-region pixels, thereby obtaining the head and neck tumor focus region pixel range of the real-time head and neck tumor PET-CT image.
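The screening step above amounts to a simple threshold comparison over the probability map; a minimal sketch (the threshold value is assumed here, not specified by the claim):

```python
import numpy as np

def lesion_pixels(prob_map, threshold=0.5):
    """Return (row, col) coordinates of pixels whose predicted focus-region
    probability exceeds the threshold, in row-major scan order."""
    rows, cols = np.where(prob_map > threshold)
    return list(zip(rows.tolist(), cols.tolist()))
```

The returned coordinate list is the pixel range of the predicted focus region for one probability map.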
8. A computer readable medium, characterized in that it stores a computer program for execution by an electronic device, and when the computer program runs on the electronic device, the electronic device is caused to perform the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311383312.0A CN117291935A (en) | 2023-10-23 | 2023-10-23 | Head and neck tumor focus area image segmentation method and computer readable medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117291935A true CN117291935A (en) | 2023-12-26 |
Family
ID=89248105
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311383312.0A Pending CN117291935A (en) | 2023-10-23 | 2023-10-23 | Head and neck tumor focus area image segmentation method and computer readable medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117291935A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118096773A (en) * | 2024-04-29 | 2024-05-28 | Dongguan People's Hospital | Intratumoral and peritumoral habitat analysis method, device, equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111784671B (en) | Pathological image focus region detection method based on multi-scale deep learning | |
CN111369565B (en) | Digital pathological image segmentation and classification method based on graph convolution network | |
CN111429473B (en) | Chest film lung field segmentation model establishment and segmentation method based on multi-scale feature fusion | |
CN115661144B (en) | Adaptive medical image segmentation method based on deformable U-Net | |
CN111951288B (en) | Skin cancer lesion segmentation method based on deep learning | |
CN112258488A (en) | Medical image focus segmentation method | |
CN112446892A (en) | Cell nucleus segmentation method based on attention learning | |
CN112767417B (en) | Multi-modal image segmentation method based on cascaded U-Net network | |
CN111091575B (en) | Medical image segmentation method based on reinforcement learning method | |
CN113539402B (en) | Multi-mode image automatic sketching model migration method | |
CN114219943A (en) | CT image organ-at-risk segmentation system based on deep learning | |
CN112862830A (en) | Multi-modal image segmentation method, system, terminal and readable storage medium | |
CN114266794B (en) | Pathological section image cancer region segmentation system based on full convolution neural network | |
CN117291935A (en) | Head and neck tumor focus area image segmentation method and computer readable medium | |
CN115375711A (en) | Image segmentation method of global context attention network based on multi-scale fusion | |
Shan et al. | SCA-Net: A spatial and channel attention network for medical image segmentation | |
CN115546466A (en) | Weak supervision image target positioning method based on multi-scale significant feature fusion | |
CN113538363A (en) | Lung medical image segmentation method and device based on improved U-Net | |
CN117437423A (en) | Weak supervision medical image segmentation method and device based on SAM collaborative learning and cross-layer feature aggregation enhancement | |
CN110992309B (en) | Fundus image segmentation method based on deep information transfer network | |
CN116883341A (en) | Liver tumor CT image automatic segmentation method based on deep learning | |
Nie et al. | Semantic-guided encoder feature learning for blurry boundary delineation | |
US20230115927A1 (en) | Systems and methods for plaque identification, plaque composition analysis, and plaque stability detection | |
US20230162353A1 (en) | Multistream fusion encoder for prostate lesion segmentation and classification | |
Ru et al. | A dermoscopic image segmentation algorithm based on U-shaped architecture |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||