CN116485811A - Stomach pathological section gland segmentation method based on Swin-Unet model - Google Patents


Info

Publication number
CN116485811A
CN116485811A (application CN202310314667.8A)
Authority
CN
China
Prior art keywords
image
module
json
file
npz
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310314667.8A
Other languages
Chinese (zh)
Inventor
潘景山
周贺虎
李娜
葛菁
王迪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qilu University of Technology
Shandong Computer Science Center National Super Computing Center in Jinan
Original Assignee
Qilu University of Technology
Shandong Computer Science Center National Super Computing Center in Jinan
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qilu University of Technology, Shandong Computer Science Center National Super Computing Center in Jinan filed Critical Qilu University of Technology
Priority to CN202310314667.8A priority Critical patent/CN116485811A/en
Publication of CN116485811A publication Critical patent/CN116485811A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/098Distributed learning, e.g. federated learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30092Stomach; Gastric
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Image Processing (AREA)

Abstract

A gastric pathological section gland segmentation method based on the Swin-Unet model, belonging to the technical field of image processing. Segmentation accuracy is improved through a novel connection module between the encoder and the decoder. Dilated (atrous) convolution is applied in the connection module, and the weights of the different dilation rates are adjusted by an attention mechanism, so that the receptive field of the model is controlled more finely. By designing a post-processing module, the model produces clearer boundaries in its prediction results.

Description

Stomach pathological section gland segmentation method based on Swin-Unet model
Technical Field
The invention relates to the technical field of image processing, in particular to a stomach pathological section gland segmentation method based on a Swin-Unet model.
Background
Gastric pathological section segmentation is a medical image processing technique that aims to divide gastric pathological section images into different regions so that doctors can diagnose and treat diseases more effectively. Traditional segmentation methods require manual selection of features and parameters, are inefficient, and are susceptible to subjective factors. In recent years, the development of deep learning has made image segmentation based on convolutional neural networks a research hotspot. The Swin-Unet network is a widely used image segmentation network with an encoder-decoder structure similar to an autoencoder, capable of feature extraction and pixel-level classification. For gland segmentation of gastric pathological sections, the Swin-Unet network can effectively extract gland features and segment gland regions for specialists to review when analyzing a patient's condition or proposing a treatment plan. However, it uses only skip connections for feature fusion between the encoder and decoder and pays little attention to the sharpness of boundaries, whereas the clarity of segmentation boundaries is often exactly what doctors rely on.
Disclosure of Invention
In order to overcome the shortcomings of the above technology, the invention provides a method that generates predicted images whose boundaries are clearer than those of the original prediction results.
The technical solution adopted to overcome the above technical problem is as follows:
a stomach pathological section gland segmentation method based on a Swin-Unet model comprises the following steps:
a) Obtaining gastric pathology images and the json annotation files corresponding to the gastric pathology images;
b) Preprocessing the gastric pathology images to obtain npz files;
c) Dividing all npz files into a training set and a test set;
d) Establishing an image segmentation network, inputting the training set into the image segmentation network, and outputting a gland segmentation result map of the gastric pathological section;
e) Training the image segmentation network to obtain an optimized image segmentation network;
f) Inputting the test set into the optimized image segmentation network, and outputting the final gland segmentation result image output_final of the gastric pathological section.
Further, in step a), the gastric pathology images are obtained from the stomach-section category of a pathology digital slide cloud annotation platform.
Further, step b) comprises the steps of:
b-1) replacing the dataset module in the Swin-Unet model with a mydataset module, wherein the mydataset module comprises an image.py file, a json.py file and an npz.py file written in Python; the image.py file comprises an image segmentation function, an image screening function, an annotation information visualization function and an image format conversion function; the json.py file comprises an annotation information cleaning function, an annotation segmentation function, an annotation screening function and an annotation format conversion function; and the npz.py file comprises an npz generation function and a file name generation function;
b-2) inputting the acquired json annotation files into the annotation information cleaning function in the json.py file, renaming the files and removing "3dh" from the file names, so as to obtain json annotation files from which unnecessary annotation information has been screened out and whose format has been unified;
b-3) inputting the acquired gastric pathology images into the image segmentation function in the image.py file, cutting each gastric pathology image into 512 × 512 crops with the openslide module in Python, keeping the stride equal to the crop size, and saving the cropped images in the image folder of the current project directory; b-4) inputting the cleaned and formatted json annotation files into the annotation segmentation function in the json.py file, cutting the json annotation files into 512 × 512 json annotation files with the stride kept equal to the crop size, and saving the cut json annotation files in the annotation folder of the current project directory;
b-5) traversing all images in the image folder with the image screening function in the image.py file, screening out the corresponding images according to the pixel coordinate information in the json annotation file of each image by calling the numpy, json and os modules in Python, and saving them in the select_image folder of the current project directory; traversing all json annotation files in the annotation folder with the annotation screening function in the json.py file, screening out the json annotation files in which the length of a shape is smaller than 4 by calling the numpy, json and os modules in Python, and saving each json annotation file after screening in the select_json folder of the current project directory; b-6) in the annotation information visualization function of the image.py file, drawing the annotation information on the screened images according to the points in each annotation by calling the PIL, os, numpy and json modules in Python, and saving the results in the visual_json folder of the current project directory; b-7) in the image format conversion function of the image.py file, traversing all images in the select_image folder by calling the PIL and os modules in Python, converting them to png format, and saving them in the tif_png folder of the current project directory;
b-8) in the annotation format conversion function of the json.py file, traversing all json annotation files in the select_json folder by calling the json and cv2 modules in Python, converting each json annotation file into a png image whose background pixels have the RGB value (0, 0, 0) and whose remaining pixels have the RGB value (0, 0, 1), and saving the json annotation files converted to png format in the json_png folder of the current project directory;
b-9) passing the path of the tif_png folder and the path of the json_png folder to the npz generation function of the npz.py file, wherein the npz generation function pairs each image with its corresponding json annotation file by calling the numpy, cv2 and os modules in Python and combines them into npz files used for model training; the first array of each npz file is named image and the second array is named label, and all npz files are saved in the npz folder of the current project directory.
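As a small illustration of step b-9), the following Python sketch pairs each cropped image with its label mask and saves both arrays, named image and label, in one .npz file. The make_npz helper name and the assumption that image and mask crops share the same file names are illustrative, not part of the patented method.

```python
import os
import cv2
import numpy as np

def make_npz(tif_png_dir, json_png_dir, npz_dir):
    """Pair each cropped image with its label mask and save both arrays in one .npz file."""
    os.makedirs(npz_dir, exist_ok=True)
    for name in sorted(os.listdir(tif_png_dir)):
        img_path = os.path.join(tif_png_dir, name)
        lbl_path = os.path.join(json_png_dir, name)
        if not os.path.exists(lbl_path):
            continue  # skip crops without a matching annotation
        image = cv2.imread(img_path)                       # H x W x 3 crop
        label = cv2.imread(lbl_path).max(axis=2)           # collapse the RGB mask: 0 = background, 1 = gland
        out = os.path.join(npz_dir, os.path.splitext(name)[0] + ".npz")
        np.savez(out, image=image, label=label)            # arrays named "image" and "label"

# make_npz("tif_png", "json_png", "npz")
```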
Further, step c) comprises the steps of:
c-1) dividing all npz files into a training set and a test set at a ratio of 9:1;
c-2) saving the training set data in the train_npz folder under the current project directory, and saving the test set data in the test_npz folder under the current project directory;
c-3) passing the path of the train_npz folder and the path of the test_npz folder to the file name generation function of the npz.py file, wherein the file name generation function, by calling the os module in Python, extracts the file names of all files in the train_npz folder and saves them in the train.txt file of the current project directory, and extracts the file names of all files in the test_npz folder and saves them in the test.txt file of the current project directory.
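A brief sketch of steps c-1) to c-3), assuming all .npz crops sit in a single npz folder; the split_dataset helper name and the fixed random seed are illustrative assumptions.

```python
import os
import random
import shutil

def split_dataset(npz_dir="npz", train_dir="train_npz", test_dir="test_npz", ratio=0.9, seed=0):
    """Shuffle the npz files, copy 90% to train_npz and 10% to test_npz, and write the name lists."""
    files = sorted(os.listdir(npz_dir))
    random.Random(seed).shuffle(files)
    cut = int(len(files) * ratio)
    for folder, subset in ((train_dir, files[:cut]), (test_dir, files[cut:])):
        os.makedirs(folder, exist_ok=True)
        for name in subset:
            shutil.copy(os.path.join(npz_dir, name), os.path.join(folder, name))
    # write train.txt / test.txt with the file names (without extension), one per line
    for txt, folder in (("train.txt", train_dir), ("test.txt", test_dir)):
        with open(txt, "w") as f:
            for name in sorted(os.listdir(folder)):
                f.write(os.path.splitext(name)[0] + "\n")

# split_dataset()
```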
Further, step d) comprises the steps of:
d-1) the image segmentation network consists of an image downsampling unit, a bottleneck unit, an image upsampling unit, an attention connection unit and a post-processing unit;
d-2) the image downsampling unit consists of a first downsampling module, a second downsampling module and a third downsampling module in sequence, the first downsampling module consists of a Patch Partition layer and a Linear Embedding layer in sequence, the second downsampling module consists of a Patch Merging layer, and the third downsampling module consists of a Patch Merging layer; the first array (the image array) of an npz file of the training set is input into the first downsampling module and reduced to H/4 × W/4 × C to form the image downsample, where H is the image height, W is the image width and C is the embedding dimension; the image downsample is input into a swin-transformer network, which outputs the image downsample1;
d-3) the attention connection unit consists of a dilated convolution module and a multi-head attention mechanism in sequence; the image downsample is input into the attention connection unit, which outputs the image downsample_1;
d-4) the image downsample1 is input into the second downsampling module and reduced to H/8 × W/8 × 2C to form the image downsample₁; the image downsample₁ is input into a swin-transformer network, which outputs the image downsample2;
d-5) the image downsample₁ is input into the attention connection unit, which outputs the image downsample_2;
d-6) the image downsample2 is input into the third downsampling module and reduced to H/16 × W/16 × 4C to form the image downsample₂; the image downsample₂ is input into a swin-transformer network, which outputs the image downsample3;
d-7) the image downsample₂ is input into the attention connection unit, which outputs the image downsample_3;
d-8) the bottleneck unit consists of two identical swin-transformer networks in sequence; the image downsample3 is input into a Patch Merging layer and reduced to H/32 × W/32 × 8C to form the image downsample₃; the image downsample₃ is input into the bottleneck unit, which outputs the image downsample4;
d-9) the image downsample4 is input into a Patch Expanding layer and expanded to H/16 × W/16 × 4C to form the image upsample₁; the image upsample₁ is added to the image downsample_3, the sum is input into a swin-transformer network, and the image upsample3 is output;
d-10) the image upsampling unit consists of a third upsampling module, a second upsampling module and a first upsampling module in sequence, each consisting of a Patch Expanding layer; the image upsample3 is input into the third upsampling module and expanded to H/8 × W/8 × 2C to form the image upsample₂; the image upsample₂ is added to the image downsample_2, the sum is input into a swin-transformer network, and the image upsample2 is output;
d-11) the image upsample2 is input into the second upsampling module and expanded to H/4 × W/4 × C to form the image upsample₃; the image upsample₃ is added to the image downsample_1, the sum is input into a swin-transformer network, and the image upsample1 is output;
d-12) the image upsample1 is input into the first upsampling module and expanded to H × W × C to form the image upsample₄; the image upsample₄ is input into a Linear Projection layer, which changes the image upsample₄ to H × W × CLASS to form the image output, where CLASS is the number of segmentation classes;
d-13) the post-processing unit consists of an image reading module and a conditional random field (CRF) module, the image reading module is built from the cv2 and numpy modules in Python, and the CRF module is built from the pydensecrf module in Python; the image reading module of the post-processing unit reads the image output, and the CRF module then performs inference on it to obtain the gland segmentation result map of the gastric pathological section.
Preferably, C in step d-2) has a value of 96.
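As a concrete illustration of the post-processing in step d-13), the following sketch shows how the pydensecrf package is typically used to refine a two-class prediction. The pairwise parameters (sxy, srgb, compat) and the number of inference iterations are assumed example values, not values specified by the patent.

```python
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(rgb_image, prob_map, iters=5):
    """Refine a 2-class probability map with a dense CRF so gland boundaries follow image edges."""
    h, w = prob_map.shape[-2:]
    d = dcrf.DenseCRF2D(w, h, 2)                        # width, height, number of classes
    d.setUnaryEnergy(unary_from_softmax(prob_map))       # prob_map: (2, H, W), channels sum to 1
    d.addPairwiseGaussian(sxy=3, compat=3)               # smoothness term
    d.addPairwiseBilateral(sxy=80, srgb=13,
                           rgbim=np.ascontiguousarray(rgb_image), compat=10)  # appearance term
    q = d.inference(iters)
    return np.argmax(np.array(q), axis=0).reshape(h, w).astype(np.uint8)

# refined = crf_refine(rgb_crop, prob_map)  # prob_map produced by the segmentation network
```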
Further, step e) comprises the steps of:
e-1) calculating the total loss by the formula loss = a × CrossEntropyLoss + b × DiceLoss, wherein a and b are weights, a + b = 1, and CrossEntropyLoss and DiceLoss are the loss functions in the Swin-Unet model;
e-2) training the image segmentation network with the total loss using an SGD optimizer to obtain the optimized image segmentation network.
Preferably, a has a value of 0.4 and b has a value of 0.6.
Further, in step e-2), the momentum of the SGD optimizer is set to 0.9, the weight decay weight_decay is set to 0.0001, and the learning rate lr is set to the base learning rate 0.01; when the image segmentation network is trained, the DataLoader iterator in Python is used to iterate over the data, with its parameters set to batch_size = 12, shuffle = True, num_workers = 0 and pin_memory = True; 180 epochs are trained, and starting from the halfway point of training one model weight result is saved every ten epochs; when training ends, the model parameters are saved to obtain the optimized image segmentation network.
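A hedged PyTorch sketch of the training procedure in e-1) and e-2): the NpzDataset class, the Dice implementation and the stand-in model are illustrative assumptions (the patent's own dataset module and Swin-Unet network are described elsewhere in the text), while the hyperparameters (a = 0.4, b = 0.6, lr 0.01, momentum 0.9, weight decay 0.0001, batch size 12, 180 epochs) are the ones quoted above.

```python
import os
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset

class NpzDataset(Dataset):
    """Reads the image/label arrays produced in step b-9) using the name lists from step c-3)."""
    def __init__(self, npz_dir, list_file):
        with open(list_file) as f:
            self.names = [line.strip() for line in f if line.strip()]
        self.npz_dir = npz_dir
    def __len__(self):
        return len(self.names)
    def __getitem__(self, i):
        data = np.load(os.path.join(self.npz_dir, self.names[i] + ".npz"))
        image = torch.from_numpy(data["image"]).permute(2, 0, 1).float() / 255.0
        label = torch.from_numpy(data["label"]).long()
        return {"image": image, "label": label}

def dice_loss(logits, target, eps=1e-6):
    """Soft Dice loss on the gland channel."""
    prob = torch.softmax(logits, dim=1)[:, 1]            # (N, H, W) gland probability
    target = (target > 0).float()
    inter = (prob * target).sum(dim=(1, 2))
    union = prob.sum(dim=(1, 2)) + target.sum(dim=(1, 2))
    return 1.0 - ((2 * inter + eps) / (union + eps)).mean()

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Conv2d(3, 2, kernel_size=1).to(device)        # stand-in for the Swin-Unet network of step d)
a, b = 0.4, 0.6
ce_loss = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=0.0001)
loader = DataLoader(NpzDataset("train_npz", "train.txt"),
                    batch_size=12, shuffle=True, num_workers=0, pin_memory=True)

for epoch in range(180):
    for batch in loader:
        image, label = batch["image"].to(device), batch["label"].to(device)
        logits = model(image)
        loss = a * ce_loss(logits, label) + b * dice_loss(logits, label)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    if epoch + 1 >= 90 and (epoch + 1) % 10 == 0:        # checkpoint every ten epochs from the halfway point
        torch.save(model.state_dict(), f"epoch_{epoch + 1}.pth")
```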
The beneficial effects of the invention are as follows: the accuracy of model segmentation is improved; dilated convolution is applied in the connection module, and the weights of the different dilation rates are adjusted by the attention mechanism, so that the receptive field of the model is controlled more finely; and by designing the post-processing module, the model produces clearer boundaries in its prediction results.
Detailed Description
The present invention will be further described below.
A stomach pathological section gland segmentation method based on a Swin-Unet model comprises the following steps:
a) Obtaining gastric pathology images and the corresponding json annotation files.
b) Preprocessing the gastric pathology images to obtain npz files.
c) Dividing all npz files into a training set and a test set.
d) Establishing an image segmentation network, inputting the training set into the image segmentation network, and outputting a gland segmentation result map of the gastric pathological section.
e) Training the image segmentation network to obtain an optimized image segmentation network.
f) Inputting the test set into the optimized image segmentation network, and outputting the final gland segmentation result image output_final of the gastric pathological section.
The skip connections between the encoder and the decoder in the original Swin-Unet model are replaced with a new connection module (the attention connection module), which noticeably improves the segmentation accuracy of the model. Dilated convolution is used in the attention connection module to enlarge the model's receptive field. The attention mechanism helps the model focus on the more important features and thereby improves accuracy; when dilated convolution is used, the weights of the different dilation rates are adjusted by the attention mechanism, so that the receptive field of the model is controlled more finely. The attention connection module helps the decoder fuse shallow features better as the model upsamples and gradually restores the feature-map size, preserving as much information from the original feature maps as possible. A post-processing module for the prediction results is also designed, which further refines the model's predictions so that their boundaries are clearer, greatly improving the readability of the prediction results. A sketch of one possible attention connection module is given below.
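The patent text does not give code for the attention connection module; the following PyTorch sketch is only one plausible realization consistent with the description above (parallel dilated convolutions whose per-rate weights come from a multi-head attention step). The dilation rates (1, 2, 4), the global pooling of each branch, the head count and the residual addition are assumptions, not details taken from the patent.

```python
import torch
import torch.nn as nn

class AttentionConnection(nn.Module):
    """Parallel dilated convolutions whose branch weights are produced by multi-head attention (illustrative)."""
    def __init__(self, channels, dilations=(1, 2, 4), num_heads=4):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        ])
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x):                                     # x: (N, C, H, W) encoder feature map
        feats = [branch(x) for branch in self.branches]        # one feature map per dilation rate
        tokens = torch.stack([f.mean(dim=(2, 3)) for f in feats], dim=1)   # (N, B, C) branch descriptors
        attended, _ = self.attn(tokens, tokens, tokens)        # let the branches attend to each other
        weights = torch.softmax(attended.mean(dim=2), dim=1)   # (N, B): one weight per dilation rate
        out = sum(w.view(-1, 1, 1, 1) * f for w, f in zip(weights.unbind(dim=1), feats))
        return x + out                                         # residual: keep the original skip features

# skip = AttentionConnection(96)(encoder_feature)  # would replace a plain skip connection
```

Weighting whole branches rather than individual pixels keeps the module cheap while still letting attention decide how much each receptive-field size contributes at a given skip connection.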
In one embodiment of the invention, in step a) the gastric pathology images are obtained from the stomach-section category of a pathology digital slide cloud annotation platform. Since pathological section images have extremely high resolution (around 80000 × 70000 pixels) and their format cannot be fed into the model directly for training, the images require further cropping and format conversion.
In one embodiment of the invention, step b) comprises the steps of:
b-1) The dataset module in the Swin-Unet model is replaced with a mydataset module, wherein the mydataset module comprises an image.py file, a json.py file and an npz.py file written in Python; the image.py file comprises an image segmentation function, an image screening function, an annotation information visualization function and an image format conversion function; the json.py file comprises an annotation information cleaning function, an annotation segmentation function, an annotation screening function and an annotation format conversion function; and the npz.py file comprises an npz generation function and a file name generation function.
b-2) The obtained json annotation files are input into the annotation information cleaning function in the json.py file; the files are renamed and "3dh" is removed from the file names, and the json annotation files are cleaned to obtain json annotation files from which unnecessary annotation information has been screened out and whose format has been unified.
b-3) The acquired gastric pathology images are input into the image segmentation function in the image.py file, each gastric pathology image is cut into 512 × 512 crops with the openslide module in Python, the stride is kept equal to the crop size, and the cropped images are saved in the image folder of the current project directory (a cropping sketch is given below). b-4) The cleaned and formatted json annotation files are input into the annotation segmentation function in the json.py file, cut into 512 × 512 json annotation files with the stride kept equal to the crop size, and the cut json annotation files are saved in the annotation folder of the current project directory.
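Before continuing with step b-5), the whole-slide cropping of step b-3) can be sketched with the openslide package as follows; reading at level 0, the crop naming scheme and saving the crops directly as png are assumptions made only for illustration.

```python
import os
import openslide

def tile_slide(slide_path, out_dir="image", size=512):
    """Cut a whole-slide image into non-overlapping size x size crops (stride equal to the crop size)."""
    os.makedirs(out_dir, exist_ok=True)
    slide = openslide.OpenSlide(slide_path)
    width, height = slide.dimensions                 # level-0 resolution, e.g. around 80000 x 70000
    base = os.path.splitext(os.path.basename(slide_path))[0]
    for y in range(0, height - size + 1, size):
        for x in range(0, width - size + 1, size):
            crop = slide.read_region((x, y), 0, (size, size)).convert("RGB")
            crop.save(os.path.join(out_dir, f"{base}_{x}_{y}.png"))  # png here; the method converts formats later
    slide.close()

# tile_slide("slide.tif")
```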
b-5) Because the cropped images and annotations still contain some blank regions (containing no information), the data in the image folder and the annotation folder need to be screened further. All images in the image folder are therefore traversed with the image screening function in the image.py file; the numpy, json and os modules in Python are called to screen out the corresponding images according to the pixel coordinate information in each image's json annotation file, and these are saved in the select_image folder of the current project directory. All json annotation files in the annotation folder are traversed with the annotation screening function in the json.py file; the numpy, json and os modules in Python are called to screen out the json annotation files in which the length of a shape is smaller than 4, and each json annotation file after screening is saved in the select_json folder of the current project directory.
b-6) In the annotation information visualization function of the image.py file, the PIL, os, numpy and json modules in Python are called to draw the annotation information on the screened images according to the points in each annotation, and the results are saved in the visual_json folder of the current project directory, which makes it convenient to inspect and evaluate the segmentation effect of the model.
b-7) In the image format conversion function of the image.py file, all images in the select_image folder are traversed by calling the PIL and os modules in Python, converted to png format, and saved in the tif_png folder of the current project directory.
b-8) In the annotation format conversion function of the json.py file, all json annotation files in the select_json folder are traversed by calling the json and cv2 modules in Python and converted into png images whose background pixels have the RGB value (0, 0, 0) and whose remaining pixels have the RGB value (0, 0, 1); the json annotation files converted to png format are saved in the json_png folder of the current project directory, as sketched below.
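The annotation-to-mask conversion of step b-8) can be sketched as follows, assuming labelme-style json files whose shapes entries carry polygon points; the exact json schema is an assumption for illustration.

```python
import json
import cv2
import numpy as np

def json_to_mask(json_path, out_path, size=512):
    """Rasterize the annotated gland polygons into a png label image."""
    mask = np.zeros((size, size, 3), dtype=np.uint8)           # background pixels stay (0, 0, 0)
    with open(json_path) as f:
        annotation = json.load(f)
    for shape in annotation.get("shapes", []):
        points = np.array(shape["points"], dtype=np.int32)
        cv2.fillPoly(mask, [points], (0, 0, 1))                # gland pixels set to (0, 0, 1)
    cv2.imwrite(out_path, mask)

# json_to_mask("select_json/crop_0_0.json", "json_png/crop_0_0.png")
```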
b-9) The path of the tif_png folder and the path of the json_png folder are passed to the npz generation function of the npz.py file; the npz generation function pairs each image with its corresponding json annotation file by calling the numpy, cv2 and os modules in Python and combines them into npz files used for model training; the first array of each npz file is named image and the second array is named label, and all npz files are saved in the npz folder of the current project directory.
In one embodiment of the invention, step c) comprises the steps of:
c-1) All npz files are divided into a training set and a test set at a ratio of 9:1.
c-2) The training set data are saved in the train_npz folder under the current project directory, and the test set data are saved in the test_npz folder under the current project directory.
c-3) The path of the train_npz folder and the path of the test_npz folder are passed to the file name generation function of the npz.py file; by calling the os module in Python, the file name generation function extracts the file names of all files in the train_npz folder and saves them in the train.txt file of the current project directory, and extracts the file names of all files in the test_npz folder and saves them in the test.txt file of the current project directory.
In one embodiment of the invention, step d) comprises the steps of:
d-1) The image segmentation network consists of an image downsampling unit, a bottleneck unit, an image upsampling unit, an attention connection unit and a post-processing unit.
d-2) The image downsampling unit consists of a first downsampling module, a second downsampling module and a third downsampling module in sequence; the first downsampling module consists of a Patch Partition layer and a Linear Embedding layer in sequence, the second downsampling module consists of a Patch Merging layer, and the third downsampling module consists of a Patch Merging layer. The first array (the image array) of an npz file of the training set is input into the first downsampling module and reduced to H/4 × W/4 × C to form the image downsample, where H is the image height, W is the image width and C is the embedding dimension; the image downsample is input into a swin-transformer network, which outputs the image downsample1. The swin-transformer network is composed of a LayerNorm layer, a multi-head attention module, a shortcut (residual) connection and an MLP with ReLU nonlinearity in sequence; its structure is prior art and is not repeated here. Further preferably, C has a value of 96.
d-3) The attention connection unit consists of a dilated convolution module and a multi-head attention mechanism in sequence; the image downsample is input into the attention connection unit, which outputs the image downsample_1.
d-4) The image downsample1 is input into the second downsampling module and reduced to H/8 × W/8 × 2C to form the image downsample₁; the image downsample₁ is input into a swin-transformer network, which outputs the image downsample2.
d-5) The image downsample₁ is input into the attention connection unit, which outputs the image downsample_2.
d-6) The image downsample2 is input into the third downsampling module and reduced to H/16 × W/16 × 4C to form the image downsample₂; the image downsample₂ is input into a swin-transformer network, which outputs the image downsample3.
d-7) The image downsample₂ is input into the attention connection unit, which outputs the image downsample_3.
d-8) The bottleneck unit consists of two identical swin-transformer networks in sequence; the image downsample3 is input into a Patch Merging layer and reduced to H/32 × W/32 × 8C to form the image downsample₃; the image downsample₃ is input into the bottleneck unit, which outputs the image downsample4.
d-9) The image downsample4 is input into a Patch Expanding layer and expanded to H/16 × W/16 × 4C to form the image upsample₁; the image upsample₁ is added to the image downsample_3, the sum is input into a swin-transformer network, and the image upsample3 is output.
d-10) The image upsampling unit consists of a third upsampling module, a second upsampling module and a first upsampling module in sequence, each consisting of a Patch Expanding layer; the image upsample3 is input into the third upsampling module and expanded to H/8 × W/8 × 2C to form the image upsample₂; the image upsample₂ is added to the image downsample_2, the sum is input into a swin-transformer network, and the image upsample2 is output.
d-11) The image upsample2 is input into the second upsampling module and expanded to H/4 × W/4 × C to form the image upsample₃; the image upsample₃ is added to the image downsample_1, the sum is input into a swin-transformer network, and the image upsample1 is output.
d-12) The image upsample1 is input into the first upsampling module and expanded to H × W × C to form the image upsample₄; the image upsample₄ is input into a Linear Projection layer, which changes the image upsample₄ to H × W × CLASS to form the image output, where CLASS is the number of segmentation classes.
d-13) The post-processing unit consists of an image reading module and a conditional random field (CRF) module; the image reading module is built from the cv2 and numpy modules in Python, and the CRF module is built from the pydensecrf module in Python. The image reading module of the post-processing unit reads the image output, and the CRF module then performs inference on it to obtain the gland segmentation result map of the gastric pathological section.
In one embodiment of the invention, step e) comprises the steps of:
e-1) The total loss is calculated by the formula loss = a × CrossEntropyLoss + b × DiceLoss, where a and b are weights, a + b = 1, and CrossEntropyLoss and DiceLoss are the loss functions in the Swin-Unet model. Further preferably, a has a value of 0.4 and b has a value of 0.6.
e-2) The image segmentation network is trained with the total loss using an SGD optimizer to obtain the optimized image segmentation network. The momentum of the SGD optimizer is set to 0.9, the weight decay weight_decay is set to 0.0001, and the learning rate lr is set to the base learning rate 0.01. When the image segmentation network is trained, the DataLoader iterator in Python is used to iterate over the data, with its parameters set to batch_size = 12, shuffle = True, num_workers = 0 and pin_memory = True. 180 epochs are trained, and starting from the halfway point of training one model weight result is saved every ten epochs; when training ends, the model parameters are saved to obtain the optimized image segmentation network. Finally, it should be noted that the foregoing description is only a preferred embodiment of the present invention and the present invention is not limited thereto: although the present invention has been described in detail with reference to the foregoing embodiment, those skilled in the art may still modify the technical solution described in the foregoing embodiment or replace some of its technical features with equivalents. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (9)

1. A stomach pathological section gland segmentation method based on the Swin-Unet model, characterized by comprising the following steps:
a) obtaining gastric pathology images and the json annotation files corresponding to the gastric pathology images;
b) preprocessing the gastric pathology images to obtain npz files;
c) dividing all npz files into a training set and a test set;
d) establishing an image segmentation network, inputting the training set into the image segmentation network, and outputting a gland segmentation result map of the gastric pathological section;
e) training the image segmentation network to obtain an optimized image segmentation network;
f) inputting the test set into the optimized image segmentation network, and outputting the final gland segmentation result image output_final of the gastric pathological section.
2. The stomach pathological section gland segmentation method based on the Swin-Unet model according to claim 1, characterized in that: in step a), the gastric pathology images are obtained from the stomach-section category of a pathology digital slide cloud annotation platform.
3. The stomach pathological section gland segmentation method based on the Swin-Unet model according to claim 1, characterized in that step b) comprises the following steps:
b-1) replacing the dataset module in the Swin-Unet model with a mydataset module, wherein the mydataset module comprises an image.py file, a json.py file and an npz.py file written in Python; the image.py file comprises an image segmentation function, an image screening function, an annotation information visualization function and an image format conversion function; the json.py file comprises an annotation information cleaning function, an annotation segmentation function, an annotation screening function and an annotation format conversion function; and the npz.py file comprises an npz generation function and a file name generation function;
b-2) inputting the acquired json annotation files into the annotation information cleaning function in the json.py file, renaming the files and removing "3dh" from the file names, so as to obtain json annotation files from which unnecessary annotation information has been screened out and whose format has been unified;
b-3) inputting the acquired gastric pathology images into the image segmentation function in the image.py file, cutting each gastric pathology image into 512 × 512 crops with the openslide module in Python, keeping the stride equal to the crop size, and saving the cropped images in the image folder of the current project directory; b-4) inputting the cleaned and formatted json annotation files into the annotation segmentation function in the json.py file, cutting the json annotation files into 512 × 512 json annotation files with the stride kept equal to the crop size, and saving the cut json annotation files in the annotation folder of the current project directory;
b-5) traversing all images in the image folder with the image screening function in the image.py file, screening out the corresponding images according to the pixel coordinate information in the json annotation file of each image by calling the numpy, json and os modules in Python, and saving them in the select_image folder of the current project directory; traversing all json annotation files in the annotation folder with the annotation screening function in the json.py file, screening out the json annotation files in which the length of a shape is smaller than 4 by calling the numpy, json and os modules in Python, and saving each json annotation file after screening in the select_json folder of the current project directory; b-6) in the annotation information visualization function of the image.py file, drawing the annotation information on the screened images according to the points in each annotation by calling the PIL, os, numpy and json modules in Python, and saving the results in the visual_json folder of the current project directory; b-7) in the image format conversion function of the image.py file, traversing all images in the select_image folder by calling the PIL and os modules in Python, converting them to png format, and saving them in the tif_png folder of the current project directory;
b-8) in the annotation format conversion function of the json.py file, traversing all json annotation files in the select_json folder by calling the json and cv2 modules in Python, converting each json annotation file into a png image whose background pixels have the RGB value (0, 0, 0) and whose remaining pixels have the RGB value (0, 0, 1), and saving the json annotation files converted to png format in the json_png folder of the current project directory;
b-9) passing the path of the tif_png folder and the path of the json_png folder to the npz generation function of the npz.py file, wherein the npz generation function pairs each image with its corresponding json annotation file by calling the numpy, cv2 and os modules in Python and combines them into npz files used for model training; the first array of each npz file is named image and the second array is named label, and all npz files are saved in the npz folder of the current project directory.
4. The stomach pathological section gland segmentation method based on the Swin-Unet model according to claim 3, characterized in that step c) comprises the following steps:
c-1) dividing all npz files into a training set and a test set at a ratio of 9:1;
c-2) saving the training set data in the train_npz folder under the current project directory, and saving the test set data in the test_npz folder under the current project directory;
c-3) passing the path of the train_npz folder and the path of the test_npz folder to the file name generation function of the npz.py file, wherein the file name generation function, by calling the os module in Python, extracts the file names of all files in the train_npz folder and saves them in the train.txt file of the current project directory, and extracts the file names of all files in the test_npz folder and saves them in the test.txt file of the current project directory.
5. The stomach pathological section gland segmentation method based on the Swin-Unet model according to claim 4, characterized in that step d) comprises the following steps:
d-1) the image segmentation network consists of an image downsampling unit, a bottleneck unit, an image upsampling unit, an attention connection unit and a post-processing unit;
d-2) the image downsampling unit consists of a first downsampling module, a second downsampling module and a third downsampling module in sequence, the first downsampling module consists of a Patch Partition layer and a Linear Embedding layer in sequence, the second downsampling module consists of a Patch Merging layer, and the third downsampling module consists of a Patch Merging layer; the first array (the image array) of an npz file of the training set is input into the first downsampling module and reduced to H/4 × W/4 × C to form the image downsample, where H is the image height, W is the image width and C is the embedding dimension; the image downsample is input into a swin-transformer network, which outputs the image downsample1;
d-3) the attention connection unit consists of a dilated convolution module and a multi-head attention mechanism in sequence; the image downsample is input into the attention connection unit, which outputs the image downsample_1;
d-4) the image downsample1 is input into the second downsampling module and reduced to H/8 × W/8 × 2C to form the image downsample₁; the image downsample₁ is input into a swin-transformer network, which outputs the image downsample2;
d-5) the image downsample₁ is input into the attention connection unit, which outputs the image downsample_2;
d-6) the image downsample2 is input into the third downsampling module and reduced to H/16 × W/16 × 4C to form the image downsample₂; the image downsample₂ is input into a swin-transformer network, which outputs the image downsample3;
d-7) the image downsample₂ is input into the attention connection unit, which outputs the image downsample_3;
d-8) the bottleneck unit consists of two identical swin-transformer networks in sequence; the image downsample3 is input into a Patch Merging layer and reduced to H/32 × W/32 × 8C to form the image downsample₃; the image downsample₃ is input into the bottleneck unit, which outputs the image downsample4;
d-9) the image downsample4 is input into a Patch Expanding layer and expanded to H/16 × W/16 × 4C to form the image upsample₁; the image upsample₁ is added to the image downsample_3, the sum is input into a swin-transformer network, and the image upsample3 is output;
d-10) the image upsampling unit consists of a third upsampling module, a second upsampling module and a first upsampling module in sequence, each consisting of a Patch Expanding layer; the image upsample3 is input into the third upsampling module and expanded to H/8 × W/8 × 2C to form the image upsample₂; the image upsample₂ is added to the image downsample_2, the sum is input into a swin-transformer network, and the image upsample2 is output;
d-11) the image upsample2 is input into the second upsampling module and expanded to H/4 × W/4 × C to form the image upsample₃; the image upsample₃ is added to the image downsample_1, the sum is input into a swin-transformer network, and the image upsample1 is output;
d-12) the image upsample1 is input into the first upsampling module and expanded to H × W × C to form the image upsample₄; the image upsample₄ is input into a Linear Projection layer, which changes the image upsample₄ to H × W × CLASS to form the image output, where CLASS is the number of segmentation classes;
d-13) the post-processing unit consists of an image reading module and a conditional random field (CRF) module, the image reading module is built from the cv2 and numpy modules in Python, and the CRF module is built from the pydensecrf module in Python; the image reading module of the post-processing unit reads the image output, and the CRF module then performs inference on it to obtain the gland segmentation result map of the gastric pathological section.
6. The stomach pathological section gland segmentation method based on the Swin-Unet model according to claim 1, characterized in that: in step d-2), the value of C is 96.
7. The stomach pathological section gland segmentation method based on the Swin-Unet model according to claim 1, characterized in that step e) comprises the following steps:
e-1) calculating the total loss by the formula loss = a × CrossEntropyLoss + b × DiceLoss, wherein a and b are weights, a + b = 1, and CrossEntropyLoss and DiceLoss are the loss functions in the Swin-Unet model;
e-2) training the image segmentation network with the total loss using an SGD optimizer to obtain the optimized image segmentation network.
8. The stomach pathological section gland segmentation method based on the Swin-Unet model according to claim 7, characterized in that: a has a value of 0.4 and b has a value of 0.6.
9. The stomach pathological section gland segmentation method based on the Swin-Unet model according to claim 7, characterized in that: in step e-2), the momentum of the SGD optimizer is set to 0.9, the weight decay weight_decay is set to 0.0001, and the learning rate lr is set to the base learning rate 0.01; when the image segmentation network is trained, the DataLoader iterator in Python is used to iterate over the data, with its parameters set to batch_size = 12, shuffle = True, num_workers = 0 and pin_memory = True; 180 epochs are trained, and starting from the halfway point of training one model weight result is saved every ten epochs; when training ends, the model parameters are saved to obtain the optimized image segmentation network.
CN202310314667.8A 2023-03-29 2023-03-29 Stomach pathological section gland segmentation method based on Swin-Unet model Pending CN116485811A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310314667.8A CN116485811A (en) 2023-03-29 2023-03-29 Stomach pathological section gland segmentation method based on Swin-Unet model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310314667.8A CN116485811A (en) 2023-03-29 2023-03-29 Stomach pathological section gland segmentation method based on Swin-Unet model

Publications (1)

Publication Number Publication Date
CN116485811A true CN116485811A (en) 2023-07-25

Family

ID=87211080

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310314667.8A Pending CN116485811A (en) 2023-03-29 2023-03-29 Stomach pathological section gland segmentation method based on Swin-Unet model

Country Status (1)

Country Link
CN (1) CN116485811A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117196972A (en) * 2023-08-25 2023-12-08 山东浪潮科学研究院有限公司 Improved Transformer-based document artifact removal method


Similar Documents

Publication Publication Date Title
CN111598892B (en) Cell image segmentation method based on Res2-uneXt network structure
CN110782462B (en) Semantic segmentation method based on double-flow feature fusion
CN111784671B (en) Pathological image focus region detection method based on multi-scale deep learning
KR20200084434A (en) Machine Learning Method for Restoring Super-Resolution Image
CN110136062B (en) Super-resolution reconstruction method combining semantic segmentation
CN110728682A (en) Semantic segmentation method based on residual pyramid pooling neural network
Sood et al. An application of generative adversarial networks for super resolution medical imaging
CN116797787B (en) Remote sensing image semantic segmentation method based on cross-modal fusion and graph neural network
CN116485811A (en) Stomach pathological section gland segmentation method based on Swin-Unet model
CN113256494B (en) Text image super-resolution method
CN112927253A (en) Rock core FIB-SEM image segmentation method based on convolutional neural network
CN111914654A (en) Text layout analysis method, device, equipment and medium
CN114596318A (en) Breast cancer magnetic resonance imaging focus segmentation method based on Transformer
CN114266957A (en) Hyperspectral image super-resolution restoration method based on multi-degradation mode data augmentation
Wu et al. Multi-focus image fusion: Transformer and shallow feature attention matters
CN116977387B (en) Deformable medical image registration method based on deformation field fusion
Zuo et al. Gradient-guided single image super-resolution based on joint trilateral feature filtering
CN114708353B (en) Image reconstruction method and device, electronic equipment and storage medium
CN111080516A (en) Super-resolution image reconstruction method based on self-sampling enhancement
CN114331922B (en) Multi-scale self-calibration method and system for restoring turbulence degraded image by aerodynamic optical effect
Schirrmacher et al. SR 2: Super-resolution with structure-aware reconstruction
CN113205005B (en) Low-illumination low-resolution face image reconstruction method
CN114898096A (en) Segmentation and annotation method and system for figure image
CN113947102A (en) Backbone two-path image semantic segmentation method for scene understanding of mobile robot in complex environment
CN112464733A (en) High-resolution optical remote sensing image ground feature classification method based on bidirectional feature fusion

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination