CN116485811A - Stomach pathological section gland segmentation method based on Swin-Unet model - Google Patents
- Publication number
- CN116485811A (application number CN202310314667.8A)
- Authority
- CN
- China
- Prior art keywords
- image
- module
- json
- file
- npz
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T7/11 — Region-based segmentation (image analysis; segmentation; edge detection)
- G06N3/045 — Combinations of networks
- G06N3/0455 — Auto-encoder networks; encoder-decoder networks
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/098 — Distributed learning, e.g. federated learning
- G06V10/26 — Segmentation of patterns in the image field
- G06V10/774 — Generating sets of training patterns
- G06V10/82 — Image or video recognition using neural networks
- G06T2207/20021 — Dividing image into blocks, subimages or windows
- G06T2207/20081 — Training; learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/30092 — Stomach; gastric
- Y02A90/10 — ICT supporting adaptation to climate change
Abstract
A gastric pathological section gland segmentation method based on the Swin-Unet model, belonging to the technical field of image processing. The method improves segmentation accuracy through a novel connection module between the encoder and decoder. Dilated (atrous) convolution is applied in the connection module, and the weights of the different dilation rates are adjusted through an attention mechanism, so that the receptive field of the model is controlled more finely. A post-processing module is designed so that the boundaries of the model's prediction results are clearer.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a stomach pathological section gland segmentation method based on a Swin-Unet model.
Background
Gastric pathological section segmentation is a medical image processing technique that aims to segment gastric pathological section images into different regions so that doctors can better diagnose and treat disease. Traditional segmentation methods require manual selection of features and parameters; they are inefficient and susceptible to subjective factors. In recent years, the development of deep learning has made convolutional-neural-network-based image segmentation a research hotspot. The Swin-Unet network is a widely used image segmentation network with a structure similar to an auto-encoder, capable of feature extraction and pixel-level classification. For gastric gland segmentation, Swin-Unet can effectively extract glandular features and segment the image for a specialist to examine when analyzing the condition or proposing a treatment plan. However, when fusing features between the encoder and decoder it uses only skip connections, and little attention is paid to the sharpness of the boundary, even though the definition of the boundary in the segmentation result is often what matters most to the doctor.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a method that produces predicted images with clearer boundaries than the original prediction results.
The technical solution adopted to solve this problem is as follows:
a stomach pathological section gland segmentation method based on a Swin-Unet model comprises the following steps:
a) Obtaining stomach pathology images and json annotation files corresponding to the stomach pathology images;
b) Preprocessing stomach pathology images to obtain npz files;
c) Dividing all npz files into a training set and a testing set;
d) Establishing an image segmentation network, inputting a training set into the image segmentation network, and outputting to obtain a gastric pathological section gland segmentation result diagram;
e) Training an image segmentation network to obtain an optimized image segmentation network;
f) Inputting the test set into the optimized image segmentation network, and outputting the final gastric pathological section gland segmentation result image output_final.
Further, in step a), a stomach pathology image is obtained from a stomach section summary classification in a pathology digital section cloud labeling platform.
Further, step b) comprises the steps of:
b-1) replacing the dataset module in the Swin-Unet model with a mydataset module, wherein the mydataset module comprises the Python files image.py, json.py and npz.py; image.py contains an image segmentation function, an image screening function, an annotation information visualization function and an image format conversion function; json.py contains an annotation information cleaning function, an annotation segmentation function, an annotation screening function and an annotation format conversion function; npz.py contains an npz generation function and a file name generation function;
b-2) inputting the acquired json annotation files into the annotation information cleaning function in the json.py file, renaming each file by removing "3dh" from its name, and screening out unnecessary annotation information to obtain formatted json annotation files;
b-3) inputting the acquired stomach pathology images into the image segmentation function in the image.py file, cutting each image into 512 × 512 tiles with the openslide module in Python, keeping the stride equal to the tile size, and saving the tiles in the image folder of the project's current directory; b-4) inputting the screened and formatted json annotation files into the annotation segmentation function in the json.py file, cutting them into 512 × 512 json annotation files with the stride equal to the tile size, and saving the cut annotation files in the annotation folder of the project's current directory;
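The tiling in steps b-3) and b-4) can be sketched as follows. The patent reads the pixel data with Python's openslide module; here only the tile-origin grid is computed (a pure-Python sketch — the function and variable names are illustrative, not from the patent).

```python
# Non-overlapping 512x512 tiles, stride equal to the tile size (steps b-3/b-4).
TILE = 512

def tile_origins(slide_w, slide_h, tile=TILE):
    """Top-left (x, y) of every full tile; partial edge tiles are dropped."""
    return [(x, y)
            for y in range(0, slide_h - tile + 1, tile)
            for x in range(0, slide_w - tile + 1, tile)]

# For each origin, the pixels would then be read with openslide, e.g.:
#   region = slide.read_region((x, y), 0, (TILE, TILE))   # level-0 read
```

With a step equal to the tile size, a 2048 × 1024 region yields 4 × 2 = 8 tiles; edge remainders smaller than 512 pixels are simply discarded in this sketch.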
b-5) traversing all images in the image folder with the image screening function in the image.py file, which calls the numpy, json and os modules in Python to select the images that have pixel coordinate information in their corresponding json annotation files, saving them in the select_image folder of the project's current directory; traversing all json annotation files in the annotation folder with the annotation screening function in the json.py file, which calls the numpy, json and os modules in Python to discard shapes whose point lists have fewer than 4 points, and saving each screened json annotation file in the select_json folder of the project's current directory; b-6) in the annotation information visualization function of the image.py file, calling the PIL, os, numpy and json modules in Python to draw the annotation points onto the screened images, saving the results in the visual_json folder of the project's current directory; b-7) in the image format conversion function of the image.py file, calling the PIL and os modules in Python to traverse all images in the select_image folder, convert them to png format, and save them in the tif_png folder of the project's current directory;
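The screening rule in step b-5) — drop shapes with fewer than 4 points and discard tiles left without any usable annotation — can be sketched with the standard library alone. The "shapes"/"points" key layout follows the common labelme JSON convention, which is an assumption here:

```python
# Step b-5) screening sketch: annotation dicts are assumed to follow the
# labelme layout {"shapes": [{"points": [[x, y], ...]}, ...]}.

def screen_annotation(ann):
    """Return the annotation with short shapes removed, or None if nothing survives."""
    shapes = [s for s in ann.get("shapes", []) if len(s.get("points", [])) >= 4]
    if not shapes:
        return None      # tile has no usable annotation -> its image is discarded too
    return {**ann, "shapes": shapes}
```

A `None` return signals that the corresponding image tile should not be copied into select_image.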
b-8) in the annotation format conversion function of the json.py file, calling the json and cv2 modules in Python to traverse all json annotation files in the select_json folder and convert each into a png image, setting the rgb value of background pixels to (0, 0, 0) and of all other pixels to (0, 0, 1), and saving the converted png files in the json_png folder of the project's current directory;
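Step b-8) rasterizes each annotation polygon into a label image whose background pixels are 0 and gland pixels are 1. The patent does this with the cv2 module; as a dependency-light stand-in, the same even-odd fill can be written as a minimal numpy scanline rasterizer:

```python
import numpy as np

def polygon_mask(points, h, w):
    """points: [(x, y), ...] polygon vertices; returns a uint8 mask of shape (h, w)."""
    mask = np.zeros((h, w), dtype=np.uint8)
    n = len(points)
    for row in range(h):
        xs = []
        for i in range(n):
            x0, y0 = points[i]
            x1, y1 = points[(i + 1) % n]
            if (y0 <= row < y1) or (y1 <= row < y0):   # edge crosses this scanline
                xs.append(x0 + (row - y0) * (x1 - x0) / (y1 - y0))
        xs.sort()
        for j in range(0, len(xs) - 1, 2):             # fill between crossing pairs
            mask[row, int(np.ceil(xs[j])):int(np.floor(xs[j + 1])) + 1] = 1
    return mask
```

In production, `cv2.fillPoly` does the same job far faster; this sketch only makes the even-odd rule explicit.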
b-9) passing the paths of the tif_png and json_png folders into the npz generation function of the npz.py file, which calls the numpy, cv2 and os modules in Python to pair each image with its json annotation file and generate the npz files used for model training, the first array of each npz file being named image and the second array label, and saving all npz files in the npz folder of the project's current directory.
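The pairing in step b-9) reduces to one `np.savez` call per tile, with the two arrays named exactly as the patent specifies ("image" and "label"). A minimal sketch with synthetic placeholder data (real code would load the pngs, e.g. with `cv2.imread`):

```python
import numpy as np, tempfile, os

def save_pair(out_dir, name, image, label):
    path = os.path.join(out_dir, name + ".npz")
    np.savez(path, image=image, label=label)   # array names match the patent
    return path

out_dir = tempfile.mkdtemp()
img = np.zeros((512, 512, 3), dtype=np.uint8)   # placeholder image tile
lab = np.zeros((512, 512), dtype=np.uint8)      # placeholder label mask
p = save_pair(out_dir, "tile_0_0", img, lab)

data = np.load(p)   # data["image"] and data["label"] restore the pair
```

Keeping the array names fixed is what lets the training-time dataset loader retrieve both tensors from a single file handle.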
Further, step c) comprises the steps of:
c-1) dividing all npz files into a training set and a test set at a ratio of 9:1;
c-2) storing the training set data in the train_npz folder under the project's current directory, and the test set data in the test_npz folder under the project's current directory;
c-3) passing the paths of the train_npz and test_npz folders into the file name generation function of the npz.py file, which calls the os module in Python to extract the names of all files in the train_npz folder and save them in the train.txt file of the project's current directory, and to extract the names of all files in the test_npz folder and save them in the test.txt file of the project's current directory.
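Steps c-1) to c-3) can be condensed into one helper: shuffle the npz file names, cut at 90 %, and write one name per line into train.txt / test.txt. File handling is simplified and the names are synthetic; the seeded shuffle is an illustrative choice, not from the patent:

```python
import os, random, tempfile

def split_and_list(names, out_dir, ratio=0.9, seed=0):
    """9:1 train/test split of file stems, written to <part>.txt (one per line)."""
    rng = random.Random(seed)
    names = sorted(names)
    rng.shuffle(names)
    k = int(len(names) * ratio)
    splits = {"train": names[:k], "test": names[k:]}
    for part, part_names in splits.items():
        with open(os.path.join(out_dir, part + ".txt"), "w") as f:
            f.write("\n".join(part_names))
    return splits

out = tempfile.mkdtemp()
splits = split_and_list([f"case_{i:03d}" for i in range(100)], out)
```

With 100 npz files this yields 90 training and 10 test names, and the two .txt files are what the dataset loader later reads.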
Further, step d) comprises the steps of:
d-1) the image segmentation network consists of an image downsampling unit, a bottleneck unit, an image upsampling unit, an attention connection unit and a post-processing unit;
d-2) the image downsampling unit sequentially comprises a first, a second and a third downsampling module; the first downsampling module consists of a Patch Partition layer followed by a Linear Embedding layer, and the second and third downsampling modules each consist of a Patch Merging layer; the first array (image) of an npz file of the training set is input into the first downsampling module, which reduces the image to H/4 × W/4 × C to form image downsample, where H is the image height, W the image width and C the channel dimension; downsample is input into a Swin Transformer block, outputting image downsample1;
d-3) the attention connection unit sequentially comprises a dilated convolution module and a multi-head attention mechanism; image downsample is input into the attention connection unit, outputting image downsample_1;
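The idea in step d-3) — several dilated-convolution branches whose contributions are re-weighted by an attention-style softmax, so the effective receptive field is tuned per branch — can be illustrated with a single-channel numpy sketch. This is not the patent's actual multi-head implementation; the branch scores would in practice be produced by the attention module rather than passed in:

```python
import numpy as np

def dilated_conv2d(x, kernel, d):
    """'Same'-padded 2-D convolution of x (H, W) with dilation rate d."""
    kh, kw = kernel.shape
    ph, pw = d * (kh // 2), d * (kw // 2)          # pad grows with dilation
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    h, w = x.shape
    out = np.zeros((h, w))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * xp[i * d:i * d + h, j * d:j * d + w]
    return out

def fuse_branches(x, kernel, dilations, scores):
    """Weight each dilation branch by softmax(scores) and sum the results."""
    w = np.exp(scores - np.max(scores))
    w /= w.sum()
    return sum(wi * dilated_conv2d(x, kernel, d) for wi, d in zip(w, dilations))
```

Larger dilation rates see a wider context at the same kernel size, and the softmax weights decide how much of each context scale enters the fused feature map.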
d-4) image downsample1 is input into the second downsampling module, which reduces it to H/8 × W/8 × 2C to form image downsample1′; downsample1′ is input into a Swin Transformer block, outputting image downsample2;
d-5) image downsample1′ is input into the attention connection unit, outputting image downsample_2;
d-6) image downsample2 is input into the third downsampling module, which reduces it to H/16 × W/16 × 4C to form image downsample2′; downsample2′ is input into a Swin Transformer block, outputting image downsample3;
d-7) image downsample2′ is input into the attention connection unit, outputting image downsample_3;
d-8) the bottleneck unit consists of two identical Swin Transformer blocks in sequence; image downsample3 is input into a Patch Merging layer, which reduces it to H/32 × W/32 × 8C to form image downsample3′; downsample3′ is input into the bottleneck unit, outputting image downsample4;
d-9) image downsample4 is input into a Patch Expanding layer, which expands it to H/16 × W/16 × 4C to form image upsample1′; upsample1′ is added to image downsample_3, the sum is input into a Swin Transformer block, and image upsample3 is output;
d-10) the image upsampling unit sequentially comprises a third, a second and a first upsampling module, each consisting of a Patch Expanding layer; image upsample3 is input into the third upsampling module, which expands it to H/8 × W/8 × 2C to form image upsample2′; upsample2′ is added to image downsample_2, the sum is input into a Swin Transformer block, and image upsample2 is output;
d-11) image upsample2 is input into the second upsampling module, which expands it to H/4 × W/4 × C to form image upsample3′; upsample3′ is added to image downsample_1, the sum is input into a Swin Transformer block, and image upsample1 is output;
d-12) image upsample1 is input into the first upsampling module, which expands it to H × W × C to form image upsample4′; upsample4′ is input into a Linear Projection layer, which changes it to H × W × CLASS to form the image output, where CLASS is the number of segmentation classes;
d-13) the post-processing unit consists of an image reading module and a conditional random field (CRF) module; the image reading module is built from the cv2 and numpy modules in Python, and the CRF module from the pydensecrf module in Python; the post-processing unit reads the image output and performs CRF inference on it to obtain the gastric pathological section gland segmentation result map.
Preferably, C in step d-2) has a value of 96.
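Step d-13) sharpens the predicted boundaries with a fully connected CRF (pydensecrf). As a dependency-free stand-in that illustrates the same intent — cleaning up ragged labels near boundaries — here is a 3×3 majority-vote filter over a binary label map. This is a deliberately simpler substitute, not the patent's CRF:

```python
import numpy as np

def majority_filter(labels):
    """labels: (H, W) array of 0/1; each pixel takes its 3x3-neighbourhood majority."""
    p = np.pad(labels.astype(int), 1, mode="edge")
    h, w = labels.shape
    votes = np.zeros((h, w), dtype=int)
    for dy in range(3):
        for dx in range(3):
            votes += p[dy:dy + h, dx:dx + w]   # count 1-votes in the 3x3 window
    return (votes >= 5).astype(labels.dtype)   # majority of the 9 cells
```

A dense CRF goes further: its pairwise potentials pull label transitions toward intensity edges in the underlying image, which is why the patent obtains sharper gland boundaries than plain smoothing would give.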
Further, step e) comprises the steps of:
e-1) calculating the total loss by the formula loss = a × CrossEntropyLoss + b × DiceLoss, where a and b are weights with a + b = 1, and CrossEntropyLoss and DiceLoss are the loss functions in the Swin-Unet model;
e-2) training the image segmentation network through total loss by using an SGD optimizer to obtain an optimized image segmentation network.
Preferably, a has a value of 0.4 and b has a value of 0.6.
Further, in step e-2), the momentum factor of the SGD optimizer is set to 0.9, the weight decay weight_decay to 0.0001, and the learning rate lr to the base learning rate 0.01. When training the image segmentation network, data is iterated with the DataLoader iterator in Python, whose parameters are set as follows: batch_size is specified, shuffle is set to True, num_workers to 0 and pin_memory to True. 180 epochs are trained; starting from the halfway point of training, one model weight checkpoint is saved every ten epochs; when training ends, the model parameters are saved to obtain the optimized image segmentation network.
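The combined loss of step e-1) with the preferred weights a = 0.4, b = 0.6 can be written out per pixel. A numpy sketch with `probs` as softmax outputs flattened to (N, C) and `target` holding class indices — the shapes and the foreground-class Dice are illustrative assumptions:

```python
import numpy as np

def cross_entropy(probs, target):
    """Mean negative log-likelihood of the target class per pixel."""
    return -np.mean(np.log(probs[np.arange(len(target)), target] + 1e-12))

def dice_loss(probs, target, cls=1, eps=1e-6):
    """Soft Dice loss for one class (here the gland class, an assumption)."""
    p = probs[:, cls]
    t = (target == cls).astype(float)
    return 1.0 - (2.0 * (p * t).sum() + eps) / (p.sum() + t.sum() + eps)

def total_loss(probs, target, a=0.4, b=0.6):
    assert abs(a + b - 1.0) < 1e-9          # the patent requires a + b = 1
    return a * cross_entropy(probs, target) + b * dice_loss(probs, target)
```

Cross entropy drives per-pixel correctness while Dice counteracts the class imbalance between gland and background; weighting Dice slightly higher (0.6) matches the patent's preferred values.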
The beneficial effects of the invention are as follows: the accuracy of model segmentation is improved; dilated convolution is applied in the connection module, and the weights of the different dilation rates are adjusted through the attention mechanism, so that the receptive field of the model is controlled more finely; and by designing the post-processing module, the boundaries of the model's prediction results are made clearer.
Detailed Description
The present invention will be further described below.
A stomach pathological section gland segmentation method based on a Swin-Unet model comprises the following steps:
a) Obtaining stomach pathology images and their corresponding json annotation files.
b) Preprocessing the stomach pathology image to obtain npz file.
c) All npz files are divided into training and test sets.
d) And establishing an image segmentation network, inputting the training set into the image segmentation network, and outputting to obtain a gastric pathological section gland segmentation result graph.
e) Training the image segmentation network to obtain an optimized image segmentation network.
f) Inputting the test set into the optimized image segmentation network, and outputting the final gastric pathological section gland segmentation result image output_final.
The skip connections between the encoder and decoder in the original Swin-Unet model are replaced by a new connection module (the attention connection module), which improves the segmentation accuracy of the model. The use of dilated convolution in the attention connection module enlarges the model's receptive capability. The attention mechanism helps the model focus on important features to improve accuracy, and when dilated convolution is used, the weights of the different dilation rates are adjusted through the attention mechanism so that the receptive field is controlled more finely. The attention connection module helps the decoder fuse shallow features better while upsampling and gradually restoring the feature-map size, preserving as much information from the original feature maps as possible. A post-processing module is also designed to further process the model's predictions, making the boundaries of the prediction results clearer and greatly improving their readability.
In one embodiment of the invention, in step a) gastric pathology images are obtained from gastric slice summary classifications in a pathology digital slice cloud labeling platform. Since the pathological section image has extremely high resolution (80000×70000) and the image format cannot be directly input into the model for training, further segmentation and format modification processing are required for the image.
In one embodiment of the invention, step b) comprises the steps of:
b-1) replacing the dataset module in the Swin-Unet model with a mydataset module, wherein the mydataset module comprises the Python files image.py, json.py and npz.py; image.py contains an image segmentation function, an image screening function, an annotation information visualization function and an image format conversion function; json.py contains an annotation information cleaning function, an annotation segmentation function, an annotation screening function and an annotation format conversion function; npz.py contains an npz generation function and a file name generation function.
b-2) inputting the obtained json annotation files into the annotation information cleaning function in the json.py file, renaming each file by removing "3dh" from its name, cleaning the json annotation files, and screening out unnecessary annotation information to obtain formatted json annotation files.
b-3) inputting the acquired stomach pathology images into the image segmentation function in the image.py file, cutting each image into 512 × 512 tiles with the openslide module in Python, keeping the stride equal to the tile size, and saving the tiles in the image folder of the project's current directory. b-4) inputting the screened and formatted json annotation files into the annotation segmentation function in the json.py file, cutting them into 512 × 512 json annotation files with the stride equal to the tile size, and saving the cut annotation files in the annotation folder of the project's current directory.
b-5) Because the cut images and annotations still contain some blank areas (without any information), the data in the image folder and the annotation folder needs further screening. Therefore, all images in the image folder are traversed with the image screening function in the image.py file, which calls the numpy, json and os modules in Python to select the images that have pixel coordinate information in their corresponding json annotation files and save them in the select_image folder of the project's current directory; all json annotation files in the annotation folder are traversed with the annotation screening function in the json.py file, which calls the numpy, json and os modules in Python to discard shapes whose point lists have fewer than 4 points, and each screened json annotation file is saved in the select_json folder of the project's current directory.
b-6) in the annotation information visualization function of the image.py file, the PIL, os, numpy and json modules in Python are called to draw the annotation points onto the screened images, and the results are saved in the visual_json folder of the project's current directory, making it convenient to inspect and evaluate the model's segmentation results.
b-7) in the image format conversion function of the image.py file, the PIL and os modules in Python are called to traverse all images in the select_image folder, convert them to png format, and save them in the tif_png folder of the project's current directory.
b-8) in the annotation format conversion function of the json.py file, the json and cv2 modules in Python are called to traverse all json annotation files in the select_json folder and convert each into a png image, setting the rgb value of background pixels to (0, 0, 0) and of all other pixels to (0, 0, 1), and the converted png files are saved in the json_png folder of the project's current directory.
b-9) the paths of the tif_png and json_png folders are passed into the npz generation function of the npz.py file, which calls the numpy, cv2 and os modules in Python to pair each image with its json annotation file and generate the npz files used for model training, the first array of each npz file being named image and the second array label; all npz files are saved in the npz folder of the project's current directory.
In one embodiment of the invention, step c) comprises the steps of:
c-1) all npz files are divided into a training set and a test set at a ratio of 9:1.
c-2) the training set data is stored in the train_npz folder under the project's current directory, and the test set data in the test_npz folder under the project's current directory.
c-3) the paths of the train_npz and test_npz folders are passed into the file name generation function of the npz.py file, which calls the os module in Python to extract the names of all files in the train_npz folder and save them in the train.txt file of the project's current directory, and to extract the names of all files in the test_npz folder and save them in the test.txt file of the project's current directory.
In one embodiment of the invention, step d) comprises the steps of:
d-1) the image segmentation network consists of an image downsampling unit, a bottleneck unit, an image upsampling unit, an attention connection unit and a post-processing unit.
d-2) the image downsampling unit sequentially comprises a first, a second and a third downsampling module; the first downsampling module consists of a Patch Partition layer followed by a Linear Embedding layer, and the second and third downsampling modules each consist of a Patch Merging layer; the first array (image) of an npz file of the training set is input into the first downsampling module, which reduces the image to H/4 × W/4 × C to form image downsample, where H is the image height, W the image width and C the channel dimension; downsample is input into a Swin Transformer block, outputting image downsample1. The Swin Transformer block consists, in sequence, of a LayerNorm layer, a multi-head attention module, a shortcut connection and an MLP with ReLU nonlinearity; its structure is prior art and is not repeated here. Further preferably, C has a value of 96.
d-3) The attention connection unit is composed, in order, of a dilated convolution module and a multi-head attention mechanism. The image downsample is input into the attention connection unit, and the image downsample_1 is output.
d-4) The image downsample1 is input into the second downsampling module, which reduces it to H/8×W/8×2C; the reduced image downsample1 is then input into a swin-transformer network, and the image downsample2 is output.
d-5) The image downsample1 is input into the attention connection unit, and the image downsample_2 is output.
d-6) The image downsample2 is input into the third downsampling module, which reduces it to H/16×W/16×4C; the reduced image downsample2 is then input into a swin-transformer network, and the image downsample3 is output.
d-7) The image downsample2 is input into the attention connection unit, and the image downsample_3 is output.
d-8) The bottleneck unit is composed, in order, of two identical swin-transformer networks. The image downsample3 is input into the Patch Merging layer, which reduces it to H/32×W/32×8C; the reduced image downsample3 is then input into the bottleneck unit, and the image downsample4 is output.
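For a 512 × 512 input tile with C = 96 (the preferred value above), the encoder sizes stated in steps d-2) to d-8) can be tabulated with a small dimension-bookkeeping sketch. The 2C/4C/8C channel widths follow the standard Swin-Unet design (each Patch Merging halves resolution and doubles channels); this is an assumption where the translated text leaves the multiplier implicit:

```python
def encoder_shapes(H=512, W=512, C=96):
    """Feature-map sizes implied by steps d-2) to d-8):
    Patch Partition + Linear Embedding gives H/4 x W/4 x C, then each
    Patch Merging halves the resolution and doubles the channels."""
    return {
        "downsample1": (H // 4, W // 4, C),        # after first module + swin
        "downsample2": (H // 8, W // 8, 2 * C),    # after second module + swin
        "downsample3": (H // 16, W // 16, 4 * C),  # after third module + swin
        "downsample4": (H // 32, W // 32, 8 * C),  # after merge + bottleneck
    }
```

With the 512 × 512 tiles produced in step b-3), the bottleneck thus operates on 16 × 16 × 768 features.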
d-9) The image downsample4 is input into the Patch Expanding layer, which expands it to H/16×W/16×4C; the expanded image and the image downsample_3 are added, the sum is input into a swin-transformer network, and the image upsample3 is output.
d-10) The image upsampling unit is composed, in order, of a third upsampling module, a second upsampling module and a first upsampling module, each of which consists of a Patch Expanding layer. The image upsample3 is input into the third upsampling module, which expands it to H/8×W/8×2C; the expanded image and the image downsample_2 are added, the sum is input into a swin-transformer network, and the image upsample2 is output.
d-11) The image upsample2 is input into the second upsampling module, which expands it to H/4×W/4×C; the expanded image and the image downsample_1 are added, the sum is input into a swin-transformer network, and the image upsample1 is output.
d-12) The image upsample1 is input into the first upsampling module, which expands it to H×W×C, forming the image upsample4. The image upsample4 is input into the Linear Projection layer, which changes it to H×W×CLASS, forming the image output, where CLASS is the number of segmentation classes.
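The decoder side of steps d-9) to d-12) mirrors the encoder bookkeeping: each Patch Expanding doubles the resolution and halves the channels, and the final Linear Projection maps to CLASS channels. CLASS = 2 (gland versus background) is an assumption for illustration; the embodiment leaves the class count open:

```python
def decoder_shapes(H=512, W=512, C=96, CLASS=2):
    """Sizes implied by steps d-9) to d-12); CLASS=2 is a hypothetical
    binary gland/background setting, not fixed by the patent text."""
    return {
        "upsample_16": (H // 16, W // 16, 4 * C),  # after first Patch Expanding
        "upsample_8": (H // 8, W // 8, 2 * C),     # after third upsampling module
        "upsample_4": (H // 4, W // 4, C),         # after second upsampling module
        "upsample_full": (H, W, C),                # after first upsampling module
        "output": (H, W, CLASS),                   # after Linear Projection
    }
```

Note that each skip addition in d-9) to d-11) is well defined because the expanded decoder feature and the corresponding attention-connection output share the same size at every scale.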
d-13) The prognosis processing unit is composed of an image reading module and a conditional random field (CRF) module. The image reading module is implemented with the cv2 module and the numpy module in python, and the CRF module with the pydensecrf module in python. After the image reading module reads the image output, the prognosis processing unit performs inference on it to obtain the gastric pathological section gland segmentation result diagram.
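The inference in step d-13) amounts to collapsing the H×W×CLASS score map to a per-pixel label map; a minimal numpy sketch is given below. The subsequent CRF refinement with pydensecrf is deliberately omitted here, so this shows only the argmax step, not the full prognosis processing unit:

```python
import numpy as np


def output_to_mask(output):
    """Collapse an H x W x CLASS score map (the image output) to an
    H x W label map by a per-pixel argmax over the class channel.
    CRF refinement would then be applied to this map."""
    return np.argmax(output, axis=-1).astype(np.uint8)
```

The resulting uint8 mask is what the cv2-based image reading module would hand to the pydensecrf stage.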
In one embodiment of the invention, step e) comprises the steps of:
e-1) The total loss is calculated by the formula Loss = a × CrossEntropyLoss + b × DiceLoss, where a and b are weight values with a + b = 1, and CrossEntropyLoss and DiceLoss are the loss functions in the swin-unet model. Further preferably, a has a value of 0.4 and b has a value of 0.6.
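A numpy sketch of the weighted combination in e-1), with a = 0.4 and b = 0.6, is shown below for the binary case. The actual model uses the CrossEntropyLoss and DiceLoss implementations shipped with swin-unet; these hand-written versions only illustrate how the two terms are blended:

```python
import numpy as np


def cross_entropy_loss(probs, labels, eps=1e-7):
    """Mean negative log-likelihood; probs is (N, CLASS), labels is (N,)."""
    picked = probs[np.arange(labels.size), labels]
    return float(-np.mean(np.log(picked + eps)))


def dice_loss(probs, labels, eps=1e-7):
    """1 - Dice coefficient for the foreground class (binary sketch)."""
    fg = probs[:, 1]
    tgt = (labels == 1).astype(float)
    inter = np.sum(fg * tgt)
    return float(1.0 - (2.0 * inter + eps) / (np.sum(fg) + np.sum(tgt) + eps))


def total_loss(probs, labels, a=0.4, b=0.6):
    """Loss = a * CrossEntropyLoss + b * DiceLoss, with a + b = 1."""
    return a * cross_entropy_loss(probs, labels) + b * dice_loss(probs, labels)
```

A perfect prediction drives both terms to zero, while a uniform prediction is penalized by both; the Dice term counteracts the class imbalance between gland and background pixels that plain cross-entropy handles poorly.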
e-2) The image segmentation network is trained with the total loss using an SGD optimizer to obtain the optimized image segmentation network. The momentum factor momentum of the SGD optimizer is set to 0.9, the weight decay weight_decay is set to 0.0001, and the learning rate lr is set to the base learning rate 0.01. When training the image segmentation network, the DataLoader iterator in python is used to iterate over the data, with its parameters set as follows: batch_size is set to 12, shuffle is set to True, num_workers is set to 0, and pin_memory is set to True. 180 epochs are trained; starting from the halfway point of training, the model weights are saved every ten epochs, and when training finishes the model parameters are saved, yielding the optimized image segmentation network. Finally, it should be noted that the foregoing description is only a preferred embodiment of the present invention, and the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, it is to be understood that those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their technical features. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
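The training schedule and hyperparameters stated in e-2) can be summarized in plain Python (a configuration sketch, not the torch training loop itself); in particular, "every ten epochs from the halfway point of 180 epochs" yields a concrete checkpoint list:

```python
def checkpoint_epochs(total_epochs=180, every=10):
    """Epochs at which model weights are saved: every `every` epochs,
    starting from the halfway point of training, as described in e-2)."""
    start = total_epochs // 2
    return [e for e in range(start, total_epochs + 1) if e % every == 0]


# Optimizer and loader settings stated in the embodiment.
SGD_CONFIG = {"lr": 0.01, "momentum": 0.9, "weight_decay": 0.0001}
LOADER_CONFIG = {"batch_size": 12, "shuffle": True,
                 "num_workers": 0, "pin_memory": True}
```

These dicts can be unpacked directly into `torch.optim.SGD(...)` and `torch.utils.data.DataLoader(...)` in an actual implementation.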
Claims (9)
1. A stomach pathological section gland segmentation method based on the Swin-Unet model, characterized by comprising the following steps:
a) Obtaining stomach pathology images and json annotation files corresponding to the stomach pathology images;
b) Preprocessing stomach pathology images to obtain npz files;
c) Dividing all npz files into a training set and a testing set;
d) Establishing an image segmentation network, inputting a training set into the image segmentation network, and outputting to obtain a gastric pathological section gland segmentation result diagram;
e) Training an image segmentation network to obtain an optimized image segmentation network;
f) And (3) inputting the test set into an optimized image segmentation network, and outputting to obtain a final stomach pathological section gland segmentation result image output_final.
2. The gastric pathological section gland segmentation method based on the Swin-Unet model according to claim 1, characterized in that: in step a), the stomach pathology images are obtained from the stomach section outline classification in a pathology digital section cloud labeling platform.
3. The gastric pathological section gland segmentation method based on the Swin-Unet model according to claim 1, wherein step b) comprises the following steps:
b-1) replacing the dataset module in the Swin-Unet model with a mydataset module, wherein the mydataset module comprises an image.py file, a json.py file and an npz.py file in Python; the image.py file comprises an image segmentation function, an image screening function, an annotation information visualization function and an image format conversion function; the json.py file comprises an annotation information cleaning function, an annotation segmentation function, an annotation screening function and an annotation format conversion function; and the npz.py file comprises an npz generating function and a file name generating function;
b-2) inputting the acquired json annotation file into the annotation information cleaning function in the json.py file, which renames the file by removing "3dh" from the file name and screens out unnecessary annotation information, obtaining a formatted json annotation file;
b-3) inputting the acquired stomach pathology image into the image segmentation function in the image.py file, which cuts the stomach pathology image into images of size 512 × 512 using the openslide module in Python, with the stride kept equal to the cut size, and stores the cut images in the image folder of the project's current directory; b-4) inputting the json annotation files obtained after screening out unnecessary annotation information and formatting into the annotation segmentation function in the json.py file, which cuts them into json annotation files of size 512 × 512, with the stride kept equal to the cut size, and stores the cut json annotation files in the annotation folder of the project's current directory;
b-5) traversing all images in the image folder with the image screening function in the image.py file, which, by calling the numpy module, the json module and the os module in python, screens images according to the pixel coordinate information in the json annotation file corresponding to each image and stores the retained images in the select_image folder of the project's current directory; traversing all json annotation files in the annotation folder with the annotation screening function in the json.py file, which, by calling the numpy module, the json module and the os module in python, screens out the json annotation files in which the length of a shape is smaller than 4 and stores each retained json annotation file in the select_json folder of the project's current directory; b-6) in the annotation information visualization function of the image.py file, visualizing the annotation information on the screened images according to the points in the annotation information by calling the PIL module, the os module, the numpy module and the json module in python, and storing the results in the visual_json folder of the project's current directory; b-7) in the image format conversion function of the image.py file, traversing all images in the select_image folder by calling the PIL module and the os module in python, converting them to png format, and storing them in the tif_png folder of the project's current directory;
b-8) in the annotation format conversion function in the json.py file, traversing all json annotation files in the select_json folder by calling the json module and the cv2 module in python and converting each into an image in png format, with the rgb value of the background pixels set to (0, 0, 0) and the rgb value of the other pixels set to (0, 0, 1), and storing the json annotation files converted to png format in the json_png folder of the project's current directory;
b-9) passing the path of the tif_png folder and the path of the json_png folder into the npz generating function of the npz.py file, which, by calling the numpy module, the cv2 module and the os module in python, pairs each image with its corresponding json annotation file and generates the npz files used for model training; the first array of each npz file is named image and the second array is named label, and all npz files are saved in the npz folder of the project's current directory.
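A minimal sketch of the preprocessing helpers described in steps b-2) to b-9) is given below. The openslide/cv2 image decoding is simplified away; the 512 tile size, the "3dh" removal, the 4-point screening threshold and the image/label array names follow the claim text, while the labelme-style "shapes"/"points" json layout is an assumption:

```python
import json
import os
import shutil

import numpy as np


def strip_3dh(folder):
    """b-2): rename annotation files, removing the substring '3dh'."""
    for name in os.listdir(folder):
        if "3dh" in name:
            os.replace(os.path.join(folder, name),
                       os.path.join(folder, name.replace("3dh", "")))


def tile_origins(width, height, tile=512):
    """b-3)/b-4): top-left corners of 512 x 512 tiles, stride = tile size."""
    return [(x, y)
            for y in range(0, height - tile + 1, tile)
            for x in range(0, width - tile + 1, tile)]


def screen_annotations(src_dir, dst_dir, min_points=4):
    """b-5): keep only json files in which every shape has >= min_points
    points; files containing a shorter shape are screened out.
    The 'shapes'/'points' keys are assumed labelme-style fields."""
    os.makedirs(dst_dir, exist_ok=True)
    for name in os.listdir(src_dir):
        if not name.endswith(".json"):
            continue
        with open(os.path.join(src_dir, name)) as fh:
            ann = json.load(fh)
        if all(len(s["points"]) >= min_points for s in ann.get("shapes", [])):
            shutil.copy(os.path.join(src_dir, name),
                        os.path.join(dst_dir, name))


def save_pair(image, label, npz_path):
    """b-9): bundle an image array and its label mask into one npz file,
    with the first array named 'image' and the second named 'label'."""
    np.savez(npz_path, image=image, label=label)
```

Loading such a file with `np.load` then exposes exactly the `image` and `label` arrays the Swin-Unet training pipeline expects.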
4. The gastric pathological section gland segmentation method based on the Swin-Unet model according to claim 3, wherein step c) comprises the following steps:
c-1) dividing all npz files into a training set and a test set at a ratio of 9:1;
c-2) storing the training set data in the train_npz folder under the current directory of the project, and storing the test set data in the test_npz folder under the current directory of the project;
c-3) inputting the path of the train_npz folder and the path of the test_npz folder into the file name generating function of the npz.py file, which, by calling the os module in python, extracts the file names of all files in the train_npz folder and stores them in the train.txt file in the current directory of the project, and extracts the file names of all files in the test_npz folder and stores them in the test.txt file in the current directory of the project.
5. The gastric pathological section gland segmentation method based on the Swin-Unet model according to claim 4, wherein step d) comprises the following steps:
d-1) the image segmentation network is composed of an image downsampling unit, a bottleneck unit, an image upsampling unit, an attention connection unit and a prognosis processing unit;
d-2) the image downsampling unit is composed, in order, of a first downsampling module, a second downsampling module and a third downsampling module; the first downsampling module consists of a Patch Partition layer followed by a Linear Embedding layer, and the second and third downsampling modules each consist of a Patch Merging layer; the first array (the digital image) of an npz file of the training set is input into the first downsampling module, which reduces the image to H/4×W/4×C, forming the image downsample, where H is the image height, W is the image width and C is the channel dimension; the image downsample is input into a swin-transformer network, and the image downsample1 is output;
d-3) the attention connection unit is composed, in order, of a dilated convolution module and a multi-head attention mechanism; the image downsample is input into the attention connection unit, and the image downsample_1 is output;
d-4) the image downsample1 is input into the second downsampling module, which reduces it to H/8×W/8×2C; the reduced image downsample1 is input into a swin-transformer network, and the image downsample2 is output;
d-5) the image downsample1 is input into the attention connection unit, and the image downsample_2 is output;
d-6) the image downsample2 is input into the third downsampling module, which reduces it to H/16×W/16×4C; the reduced image downsample2 is input into a swin-transformer network, and the image downsample3 is output;
d-7) the image downsample2 is input into the attention connection unit, and the image downsample_3 is output;
d-8) the bottleneck unit is composed, in order, of two identical swin-transformer networks; the image downsample3 is input into the Patch Merging layer, which reduces it to H/32×W/32×8C, and the reduced image downsample3 is input into the bottleneck unit, and the image downsample4 is output;
d-9) the image downsample4 is input into the Patch Expanding layer, which expands it to H/16×W/16×4C; the expanded image and the image downsample_3 are added, the sum is input into a swin-transformer network, and the image upsample3 is output;
d-10) the image upsampling unit is composed, in order, of a third upsampling module, a second upsampling module and a first upsampling module, each of which consists of a Patch Expanding layer; the image upsample3 is input into the third upsampling module, which expands it to H/8×W/8×2C; the expanded image and the image downsample_2 are added, the sum is input into a swin-transformer network, and the image upsample2 is output;
d-11) the image upsample2 is input into the second upsampling module, which expands it to H/4×W/4×C; the expanded image and the image downsample_1 are added, the sum is input into a swin-transformer network, and the image upsample1 is output;
d-12) the image upsample1 is input into the first upsampling module, which expands it to H×W×C, forming the image upsample4; the image upsample4 is input into the Linear Projection layer, which changes it to H×W×CLASS, forming the image output, where CLASS is the number of segmentation classes;
d-13) the prognosis processing unit is composed of an image reading module and a conditional random field (CRF) module; the image reading module is implemented with the cv2 module and the numpy module in python, and the CRF module with the pydensecrf module in python; after the image reading module reads the image output, the prognosis processing unit performs inference on it to obtain the gastric pathological section gland segmentation result diagram.
6. The gastric pathological section gland segmentation method based on the Swin-Unet model according to claim 1, wherein the method comprises the following steps: in step d-2) the value of C is 96.
7. The gastric pathological section gland segmentation method based on the Swin-Unet model according to claim 1, wherein step e) comprises the following steps:
e-1) calculating the total loss by the formula Loss = a × CrossEntropyLoss + b × DiceLoss, wherein a and b are weight values, a + b = 1, and CrossEntropyLoss and DiceLoss are loss functions in the swin-unet model;
e-2) training the image segmentation network through total loss by using an SGD optimizer to obtain an optimized image segmentation network.
8. The gastric pathological section gland segmentation method based on the Swin-Unet model according to claim 7, wherein the method comprises the following steps: a is 0.4 and b is 0.6.
9. The gastric pathological section gland segmentation method based on the Swin-Unet model according to claim 7, characterized in that: in step e-2), the momentum factor momentum of the SGD optimizer is set to 0.9, the weight decay weight_decay is set to 0.0001, and the learning rate lr is set to the base learning rate 0.01; when the image segmentation network is trained, the DataLoader iterator in python is used to iterate over the data, with batch_size set to 12, shuffle set to True, num_workers set to 0 and pin_memory set to True; 180 epochs are trained, and starting from the halfway point of training the model weights are saved every ten epochs; when training finishes, the model parameters are saved, so that the optimized image segmentation network is obtained.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310314667.8A CN116485811A (en) | 2023-03-29 | 2023-03-29 | Stomach pathological section gland segmentation method based on Swin-Unet model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116485811A true CN116485811A (en) | 2023-07-25 |
Family
ID=87211080
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310314667.8A Pending CN116485811A (en) | 2023-03-29 | 2023-03-29 | Stomach pathological section gland segmentation method based on Swin-Unet model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116485811A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117196972A (en) * | 2023-08-25 | 2023-12-08 | 山东浪潮科学研究院有限公司 | Improved transducer-based document artifact removal method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||