CN115965602A - Abnormal cell detection method based on improved YOLOv7 and Swin-Unet - Google Patents


Publication number
CN115965602A
Authority
CN
China
Prior art keywords: cell, swin, yolov7, model, improved
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211726362.XA
Other languages
Chinese (zh)
Inventor
胡鹤轩
方晓杰
黄倩
杨天金
胡强
巫义锐
张晔
狄峰
胡震云
周晓军
沈勤
吕京澴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiuyisanluling Medical Technology Nanjing Co ltd
Hohai University HHU
Original Assignee
Jiuyisanluling Medical Technology Nanjing Co ltd
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiuyisanluling Medical Technology Nanjing Co ltd, Hohai University HHU filed Critical Jiuyisanluling Medical Technology Nanjing Co ltd
Priority to CN202211726362.XA priority Critical patent/CN115965602A/en
Publication of CN115965602A publication Critical patent/CN115965602A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses an abnormal cell detection method based on improved YOLOv7 and Swin-Unet, comprising the following steps: collecting pathological cell smear images and preparing an abnormal cell detection data set and a segmentation data set; constructing and training an improved YOLOv7 model; constructing a detection result screening module to classify the cells in the images output by the detection network; building and training a Swin-Unet model for segmenting overlapping cell cluster images: based on a Unet model, a Swin-Transformer module is introduced to sample the local and global relations of the cell image at multiple scales; and performing abnormal cell detection with the improved YOLOv7 and Swin-Unet models. The invention makes full use of context information during cell detection, effectively handles cell clusters that are difficult to detect, and can greatly improve precision and recall while maintaining the detection rate.

Description

Abnormal cell detection method based on improved YOLOv7 and Swin-Unet
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to an abnormal cell detection method based on improved YOLOv7 and Swin-Unet.
Background
Pathological cytology examination refers to taking cytology specimens, such as exfoliated cells in sputum or liquid-based cells, preparing pathological cytology slides through smear and pathological sectioning techniques, and then observing the cell types and their morphology under a microscope to diagnose disease. For example, screening for diseases such as breast cancer and cervical cancer is mostly performed by pathological cytology examination; cervical cancer is the most common gynecological malignant tumor worldwide and seriously threatens women's lives. According to a 2016 World Health Organization report, more than 500,000 new cervical cancer cases occur globally every year, of which China, a developing country, accounts for about 28 percent. Early treatment of this cancer is effective, inexpensive and comparatively easy, but the early disease has no obvious symptoms and is hard to find. Cytology (including the traditional Pap smear) is the main screening method in China for common female cancers such as cervical cancer, but the overall screening level is not high, mainly because experienced domestic cytopathologists and auxiliary personnel are scarce. Computer-aided detection of pathological cells is therefore very necessary and valuable.
Detection methods in the prior art are mainly based on deep learning and include target-detection-based methods and instance-segmentation-based methods. Chinese patent application CN202111048528.2, "An abnormal cell detection method based on an attention-inducing mechanism", uses the advanced target detection network RetinaNet to screen suspicious cells and then classifies them with a Mean-Teacher network carrying an attention-inducing mechanism. That method effectively suppresses false positives during detection and improves detection precision, but performs poorly as sample noise and the number of overlapping cell clusters increase. The main shortcomings are: (1) overlapping abnormal cells are difficult to detect; (2) when the sample contains non-cell material such as tissue fluid, detection precision drops significantly; (3) the introduced attention mechanism does not sufficiently combine multi-scale information, so detection performance still needs improvement.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing an abnormal cell detection method based on improved YOLOv7 and Swin-Unet, which detects hard-to-detect overlapping abnormal cells in stages: target detection performs well on large targets and instance segmentation offers high precision, so combining them effectively detects difficult cell samples and prevents missed and false detections. A state-of-the-art dynamic head attention mechanism is adopted in the detection head, fully fusing attention across the scale, spatial and task dimensions, which can greatly improve detection precision.
In order to solve the technical problems, the invention adopts the following technical scheme.
An abnormal cell detection method based on improved YOLOv7 and Swin-Unet, comprising the following steps:
step 1, collecting a cell smear image in pathological cytology examination, and making an abnormal cell detection data set and an abnormal cell segmentation data set;
step 2, constructing an improved YOLOv7 model and training it for detecting abnormal cells and overlapping cell clusters: based on the latest YOLOv7 model, the extraction of multi-dimensional feature information is improved; the improved YOLOv7 model structure comprises an abnormal cell detection data preprocessing module (Process), a backbone network (Backbone), a neck network (Neck) and a detection network (Head);
step 3, building a detection result screening module to classify the cells in the images output by the detection network, output the abnormal cell images, and pass the overlapping cell cluster images on as input to the segmentation model;
step 4, building a Swin-Unet model and training it for segmenting the overlapping cell cluster images: based on the Unet model most commonly used in the medical field, a Swin-Transformer module is introduced to sample the local and global relations of the cell image at multiple scales; the Swin-Unet model structure comprises an encoder (Encoder), a neck network (Neck) and a decoder (Decoder);
and step 5, carrying out abnormal cell detection with the improved YOLOv7 and Swin-Unet models.
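The staged routing of the steps above can be sketched as follows; `run_pipeline`, `detect`, `segment` and the label strings are illustrative names chosen here, not identifiers from the patent:

```python
# Minimal sketch of the staged detection pipeline, assuming each detection
# is a (label, box) pair. All names are illustrative, not from the patent.

def run_pipeline(image, detect, segment):
    """Route detections: single abnormal cells are emitted directly,
    overlapping cell clusters are refined by the segmentation model."""
    abnormal_regions = []
    for label, box in detect(image):          # step 2: improved YOLOv7
        if label == "abnormal_cell":          # step 3: screening module
            abnormal_regions.append(box)
        elif label == "overlap_cluster":      # step 4: Swin-Unet segmentation
            abnormal_regions.extend(segment(image, box))
    return abnormal_regions                   # step 5: merged final output
```

The screening module thus decides, per detection, whether the cheap detection path suffices or the high-precision segmentation path is needed.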
The step 1 process is as follows:
1-1, collecting cell smear images from pathological cytology examinations, including cervical cell images and breast cell images; performing sliding-window cutting on the original cell smear images with a crop size of 640 × 640 and a sliding-window overlap of 50% to obtain small-area cell images; labeling independent abnormal cells and overlapping cell clusters with rectangular boxes using the LabelImg tool; saving the labels as XML files; and producing the abnormal cell detection data set for training the improved YOLOv7 model;
and 1-2, screening out the cell images in the abnormal cell detection data set whose labels are overlapping cell clusters, subdividing the segmentation regions with the polygon labeling function of the LabelImg tool, labeling the regions and the abnormal cells, saving the labels, and producing the abnormal cell segmentation data set for training the Swin-Unet model.
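The sliding-window cutting of step 1-1 (640 × 640 crops, 50% overlap) can be sketched as below; the function name and the choice to leave any trailing strip narrower than one stride uncovered are assumptions, since the patent does not specify edge handling:

```python
import numpy as np

def sliding_window_crops(image, size=640, overlap=0.5):
    """Cut an H x W x C smear image into size x size tiles whose windows
    overlap by the given fraction. Trailing strips narrower than one
    stride are left uncovered (an assumed convention)."""
    stride = int(size * (1 - overlap))  # 50% overlap -> stride 320
    h, w = image.shape[:2]
    crops = []
    for top in range(0, max(h - size, 0) + 1, stride):
        for left in range(0, max(w - size, 0) + 1, stride):
            crops.append(((top, left), image[top:top + size, left:left + size]))
    return crops
```

Each crop is returned with its top-left offset so boxes predicted on the tile can be mapped back to slide coordinates.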
Specifically, in step 2, the building of the improved YOLOv7 model includes:
2-1, constructing the abnormal cell detection data preprocessing module, including: data enhancement by flipping and translating the cell images, and noise reduction of the cell images with Gaussian filtering, where the Gaussian kernel function is:

$$g(x,y)=\frac{1}{2\pi\sigma^{2}}e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}\tag{1}$$

where g(x, y) is the pixel value of the denoised cell image, x and y are the pixel coordinates, and σ is the Gaussian standard deviation, which determines the degree of smoothing of the cell image;
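As a minimal illustration of this preprocessing step, the Gaussian kernel described above can be sampled on a discrete grid and normalized; the function name and the normalization to unit sum are conventional choices, not taken from the patent:

```python
import numpy as np

def gaussian_kernel(ksize, sigma):
    """Discrete kernel sampled from the Gaussian
    g(x, y) = exp(-(x^2 + y^2) / (2 sigma^2)) / (2 pi sigma^2),
    then normalized to sum to 1 so image brightness is preserved."""
    ax = np.arange(ksize) - (ksize - 1) / 2.0   # centered coordinates
    x, y = np.meshgrid(ax, ax)
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    return g / g.sum()
```

Denoising then amounts to convolving each image channel with this kernel (in practice a library routine such as an OpenCV Gaussian blur would be used).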
2-2, constructing the backbone network of the improved YOLOv7 model: the feature map of the input cell image is first convolved by four CBS modules, where a CBS module comprises a Conv layer, a BN layer and a SiLU layer; stacked ELAN and MP modules then output three feature maps. The ELAN module comprises several CBS modules; its input and output feature sizes are unchanged, the number of channels is changed in the first two CBS modules, the subsequent input channels all stay consistent with the output channels, and the last CBS module produces the required number of output channels. The MP module concatenates the output vectors of a Maxpool branch and a CBS branch;
2-3, building the neck network of the improved YOLOv7 model: the three feature maps output by the backbone network are fused using a PAFPN structure;
2-4, building the head network of the improved YOLOv7 model: a dynamic head (DyHead) module is introduced for feature-map attention fusion; the dynamic head module comprises scale-aware attention, spatial-aware attention and task-aware attention, fitted together by stacking the attention functions; the self-attention is applied as:

$$W(F)=\pi_{C}\left(\pi_{S}\left(\pi_{L}(F)\cdot F\right)\cdot F\right)\cdot F\tag{2}$$

where F ∈ R^{L×S×C} is the input feature tensor, L is the number of feature scales, S = H × W is the reshaping of the height H and width W dimensions of the feature map, C is the number of channels of the feature map, and π_L(·), π_S(·), π_C(·) are the attention functions applied independently on the scale, spatial and task dimensions, corresponding to formulas (3), (4) and (5):
$$\pi_{L}(F)\cdot F=\sigma\!\left(f\!\left(\frac{1}{SC}\sum_{S,C}F\right)\right)\cdot F\tag{3}$$

$$\pi_{S}(F)\cdot F=\frac{1}{L}\sum_{l=1}^{L}\sum_{k=1}^{K}w_{l,k}\cdot F(l;p_{k}+\Delta p_{k};c)\cdot\Delta m_{k}\tag{4}$$

$$\pi_{C}(F)\cdot F=\max\left(\alpha^{1}(F)\cdot F_{c}+\beta^{1}(F),\;\alpha^{2}(F)\cdot F_{c}+\beta^{2}(F)\right)\tag{5}$$
where, in formula (3), f(·) is a linear function approximated by a 1 × 1 convolution, and σ(x) = max(0, min(1, (x + 1)/2)) is the hard-sigmoid activation function;
in formula (4), K is the number of sparse sampling positions, w_{l,k} is the weighting factor for level l and position k, p_k + Δp_k is the position shifted by the self-learned spatial offset Δp_k, and Δm_k is a self-learned scalar at position p_k;
in formula (5), [α^1, β^1, α^2, β^2]^T = θ(·) is a hyper-function that learns to control the activation thresholds, and F_c denotes the feature slice at the c-th channel.
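For intuition, the scale-aware term π_L(F)·F of formula (3) can be sketched for a feature tensor of shape (L, S, C), with a plain scalar weight and bias standing in for the 1 × 1 convolution f; this is an illustrative simplification of the dynamic head design, not the patent's implementation:

```python
import numpy as np

def hard_sigmoid(x):
    # sigma(x) = max(0, min(1, (x + 1) / 2)), the activation in formula (3)
    return np.clip((x + 1.0) / 2.0, 0.0, 1.0)

def scale_aware_attention(F, weight, bias):
    """Average F over the S and C dimensions, pass the pooled values
    through a linear function (stand-in for the 1x1 convolution f),
    squash with hard-sigmoid, and rescale each level of F. DyHead
    stacks this with the spatial- and task-aware terms of
    formulas (4) and (5), which are omitted here."""
    pooled = F.mean(axis=(1, 2))                 # (L,): average over S, C
    gate = hard_sigmoid(weight * pooled + bias)  # per-level gate in [0, 1]
    return gate[:, None, None] * F               # broadcast back over (L, S, C)
```

The gate re-weights whole pyramid levels, which is how the module emphasizes the scale most relevant to the current input.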
Specifically, in step 2, the improved YOLOv7 model is trained as follows:
the cell images from the abnormal cell detection data set, of size 640 × 640, are input with the batch size set to 16; 180 epochs are trained to obtain the best-performing improved YOLOv7 model; the overlapping cell cluster images in the detection results are extracted, and different IoU (intersection-over-union) thresholds are set. Here @x denotes the result with the IoU threshold set to x; mAP is the mean of the AP computed for each category, where a higher value means a better detection effect; Recall, i.e. the recall rate, is higher when fewer labeled cells are missed.
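The IoU metric thresholded at @x when scoring the detector is the standard intersection-over-union of two boxes; a minimal sketch for axis-aligned (x1, y1, x2, y2) boxes, with the helper name chosen here for illustration:

```python
def box_iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # intersection corners
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0
```

A prediction counts as a true positive at threshold x only if its IoU with a ground-truth box is at least x; AP, mAP and Recall are then accumulated from these matches.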
Specifically, the process of building the Swin-Unet model in step 4 includes:
4-1, constructing the encoder part of the Swin-Unet model: the minimum unit of the cell image is first converted from pixels to 4 × 4 patches by the Patch Partition layer, and a linear embedding module maintains the structure of the high-dimensional space; down-sampling then follows, where the initial stage still uses the convolution layers of Unet and the next two down-samplings are replaced by paired Swin-Transformer blocks, whose basic formula is:

$$\mathrm{Attention}(Q,K,V)=\mathrm{SoftMax}\!\left(\frac{QK^{T}}{\sqrt{d}}+B\right)V\tag{6}$$

where Q, K, V are respectively the query, key and value matrices in self-attention, d is the dimension, B is the learnable relative position bias, and Attention(Q, K, V) is the attention function within each patch window;
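The window attention of formula (6) reduces to a few array operations for a single head and window; this sketch assumes the projections, multi-head split and shifted-window masking are handled elsewhere:

```python
import numpy as np

def window_attention(Q, K, V, B):
    """SoftMax(Q K^T / sqrt(d) + B) V within one window, where B is the
    (learnable) relative position bias matrix. Computation only; the
    real block adds linear projections, multiple heads and the
    shifted-window scheme."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d) + B             # (n, n) attention logits
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```

Because attention is computed per window rather than over the whole image, the cost stays linear in image size, which is what makes the Swin design practical on large smear crops.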
4-2, building the neck layer of the Swin-Unet model: a group of paired Swin-Transformer blocks filters the down-sampled high-dimensional feature information;
4-3, constructing the decoder part of the Swin-Unet model, a network structure mirroring the encoder network: first two rounds of paired Swin-Transformer + patch expanding are performed; the last layer of features undergoes the original up-sampling plus two convolutions; each up-sampling module concatenates the feature map of the corresponding encoder stage to form a residual block; the output features then undergo one linear projection and are sent into a convolution network for classification to obtain the final output result.
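The patch merging used between encoder stages and the patch expanding used by the decoder can be sketched as pure rearrangements; the real Swin-Unet additionally applies linear layers to adjust channel counts, which this illustration omits:

```python
import numpy as np

def patch_merge(x):
    """Encoder down-sampling: concatenate each 2x2 patch neighbourhood
    along channels, halving H and W and quadrupling C (the linear
    reduction to 2C applied in practice is omitted)."""
    h, w, c = x.shape
    x = x.reshape(h // 2, 2, w // 2, 2, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(h // 2, w // 2, 4 * c)

def patch_expand(x):
    """Decoder up-sampling: spread channel groups back out into a
    2x2 spatial neighbourhood, the inverse rearrangement of merging."""
    h, w, c = x.shape
    x = x.reshape(h, w, 2, 2, c // 4).transpose(0, 2, 1, 3, 4)
    return x.reshape(2 * h, 2 * w, c // 4)
```

The two functions are exact inverses, which is why the decoder can mirror the encoder stage by stage and splice in the matching skip features.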
Specifically, the Swin-Unet model in step 4 is trained as follows:
the overlapping cell cluster images are taken as input, the batch size is set to 64, 80 epochs are trained, and different IoU thresholds are set (in the segmentation task, IoU is the intersection-over-union of the predicted mask and the ground-truth mask).
In step 5, the process of abnormal cell detection using the improved YOLOv7 and Swin-Unet models comprises the following steps:
5-1, obtaining a cell smear image from the pathological cytology examination, performing sliding-window cutting on the cell smear image, and inputting the cut cell images into the detection network in sequence;
5-2, extracting the features of the cell image with the backbone network of the improved YOLOv7 network, sending the feature maps of different scales into the neck network for feature fusion, then sending them into the detection head network and outputting the detection results;
5-3, screening the detection results with the detection result screening module: results judged to be abnormal cells are output directly, and results judged to be overlapping cell clusters are sent into the segmentation network;
and 5-4, down-sampling the overlapping cell images with the encoder of the Swin-Unet network, filtering them with the neck network, and up-sampling them in the decoder network using the residual structure; the output feature maps enter a convolution layer that classifies the segmented image regions, the regions judged to be abnormal cells are output, and together with the abnormal cells from step 3 they form the final output result.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The invention adopts a staged detection method combining target detection and instance segmentation. It makes full use of the high precision of the segmentation model to process overlapping cell clusters that are difficult to detect, while keeping the high performance of the original detection model and the easy labeling of its data for processing easily detected single abnormal cells. Without losing real-time detection performance, it effectively solves the problem of cell clusters that are hard to detect and prone to false detection, improves detection recall and precision, achieves a good balance between hardware cost and precision, better meets practical needs, and is feasible.
2. The invention introduces the state-of-the-art dynamic head module (DyHead), which fits scale-aware attention, spatial-aware attention and task-aware attention simultaneously by stacking attention functions, so that multi-dimensional cell context is fully considered during detection by the target detection network, matching the real diagnostic process of a pathologist, making the model more robust and improving detection accuracy.
3. The method introduces the Swin-Transformer attention module, combining local and global attention over the image through a shifted-window mechanism, which effectively enhances the segmentation performance of the segmentation network, improves the model's recognition accuracy on complex cell clusters, and improves the overall detection precision.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment of the present invention.
Fig. 2 is a structural diagram of an improved YOLOv7 model according to an embodiment of the present invention.
FIG. 3 is a diagram of a dynamic head module according to an embodiment of the present invention.
Fig. 4 is a schematic structural diagram of the Swin-Unet model according to an embodiment of the present invention.
Figure 5 is a Swin-Transformer block diagram according to one embodiment of the present invention.
FIG. 6 is a diagram of an overall algorithm implementation process according to an embodiment of the present invention.
Detailed Description
The invention relates to an abnormal cell detection method based on improved YOLOv7 and Swin-Unet, comprising the following steps: collecting cell slide images from cytopathology examinations and preparing an abnormal cell detection data set and an abnormal cell segmentation data set; building an improved YOLOv7 model whose robustness is enhanced by adding a dynamic attention head, to detect independent abnormal cells and overlapping cell clusters; performing abnormal cell detection with the best-trained model and screening the detection results; and inputting the overlapping cell clusters from the detection results into a Swin-Unet model for segmentation detection. The invention fully considers context information during cell detection, effectively handles cell clusters that are difficult to detect, and effectively improves precision and recall while maintaining the detection rate.
The present invention will be described in further detail with reference to the accompanying drawings.
FIG. 1 is a flow chart of a method according to an embodiment of the present invention. As shown in fig. 1, the method of this embodiment includes the following steps:
step 1, collecting a cell smear image in pathological cytology examination, and making an abnormal cell detection data set and an abnormal cell segmentation data set;
1-1, collecting cell smear images from pathological cytology examinations, including cervical cell images and breast cell images; performing sliding-window cutting on the original cell smear images with a crop size of 640 × 640 and a sliding-window overlap of 50% to obtain small-area cell images; labeling independent abnormal cells and overlapping cell clusters with rectangular boxes using the LabelImg tool; saving the labels as XML files; and producing the abnormal cell detection data set for training the improved YOLOv7 model;
1-2, screening out the cell images in the abnormal cell detection data set whose labels are overlapping cell clusters, subdividing the segmentation regions with the polygon labeling function of the LabelImg tool, labeling the regions and the abnormal cells, saving the labels, and producing the abnormal cell segmentation data set for training the Swin-Unet model;
and step 2, constructing an improved YOLOv7 model and training it for detecting abnormal cells and overlapping cell clusters. The invention is based on the latest YOLOv7 model and improves the extraction of multi-dimensional feature information. The improved YOLOv7 model structure is shown in FIG. 2; the overall structure comprises a data preprocessing module (Process), a backbone network (Backbone), a neck network (Neck) and a detection network (Head), and each network is built as follows:
2-1, constructing the abnormal cell detection data preprocessing module, including: data enhancement by flipping and translating the cell images, and noise reduction of the cell images with Gaussian filtering, where the Gaussian kernel function is:

$$g(x,y)=\frac{1}{2\pi\sigma^{2}}e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}\tag{1}$$

where g(x, y) is the pixel value of the denoised cell image, x and y are the pixel coordinates, and σ is the Gaussian standard deviation, which determines the degree of smoothing of the cell image;
2-2, constructing the backbone network of the improved YOLOv7 model: the feature map of the input cell image is first convolved by four CBS modules, where a CBS module comprises a Conv layer, a BN layer and a SiLU layer; stacked ELAN and MP modules then output three feature maps. The ELAN module comprises several CBS modules; its input and output feature sizes are unchanged, the number of channels is changed in the first two CBS modules, the subsequent input channels all stay consistent with the output channels, and the last CBS module produces the required number of output channels. The MP module concatenates the output vectors of a Maxpool branch and a CBS branch;
2-3, building the neck network of the improved YOLOv7 model: the three feature maps output by the backbone network are fused using a PAFPN (Path Aggregation Network with Feature Pyramid Networks) structure;
2-4, building the head network of the improved YOLOv7 model: a dynamic head (DyHead) module is introduced for feature-map attention fusion; the dynamic head module comprises scale-aware attention, spatial-aware attention and task-aware attention, fitted together by stacking the attention functions; the self-attention is applied as:

$$W(F)=\pi_{C}\left(\pi_{S}\left(\pi_{L}(F)\cdot F\right)\cdot F\right)\cdot F\tag{2}$$

where F ∈ R^{L×S×C} is the input feature tensor, L is the number of feature scales, S = H × W is the reshaping of the height H and width W dimensions of the feature map, C is the number of channels of the feature map, and π_L(·), π_S(·), π_C(·) are the attention functions applied independently on the scale, spatial and task dimensions, corresponding to formulas (3), (4) and (5):
$$\pi_{L}(F)\cdot F=\sigma\!\left(f\!\left(\frac{1}{SC}\sum_{S,C}F\right)\right)\cdot F\tag{3}$$

$$\pi_{S}(F)\cdot F=\frac{1}{L}\sum_{l=1}^{L}\sum_{k=1}^{K}w_{l,k}\cdot F(l;p_{k}+\Delta p_{k};c)\cdot\Delta m_{k}\tag{4}$$

$$\pi_{C}(F)\cdot F=\max\left(\alpha^{1}(F)\cdot F_{c}+\beta^{1}(F),\;\alpha^{2}(F)\cdot F_{c}+\beta^{2}(F)\right)\tag{5}$$
where, in formula (3), f(·) is a linear function approximated by a 1 × 1 convolution, and σ(x) = max(0, min(1, (x + 1)/2)) is the hard-sigmoid activation function;
in formula (4), K is the number of sparse sampling positions, w_{l,k} is the weighting factor for level l and position k, p_k + Δp_k is the position shifted by the self-learned spatial offset Δp_k, and Δm_k is a self-learned scalar at position p_k;
in formula (5), [α^1, β^1, α^2, β^2]^T = θ(·) is a hyper-function that learns to control the activation thresholds, and F_c denotes the feature slice at the c-th channel;
2-5, training the improved YOLOv7 model: the abnormal cell detection data set is input with images cut to 640 × 640 and the batch size set to 16; 180 epochs are trained to obtain the best-performing improved YOLOv7 model; the overlapping cell cluster samples in the detection results are extracted and different IoU thresholds are set (in the target detection task, IoU is the intersection-over-union of the predicted box and the ground-truth box); the training results are shown in Table 1.
TABLE 1
[Table 1: detection results of the improved YOLOv7 model at different IoU thresholds; presented only as an image in the original.]
Where @x denotes the result with the IoU threshold set to x, mAP denotes the mean of the AP (Average Precision) computed for each category, where a higher value indicates a better detection effect, and Recall, also called the recall rate, is higher when fewer labeled cells are missed.
Step 3, building a detection result screening module, classifying cells in the detection network output image, outputting the abnormal cell image, and inputting the overlapped cell cluster image as a segmentation model;
and step 4, building a Swin-Unet model and training it for segmenting the overlapping cell cluster images. The invention is based on the Unet model most commonly used in the medical field; a Swin-Transformer module is introduced to sample the local and global relations of the cell image at multiple scales. The structure of the Swin-Unet model is shown in FIG. 4; the overall structure comprises an encoder (Encoder), a neck network (Neck) and a decoder (Decoder), and each network is built as follows:
4-1, constructing the encoder part of the Swin-Unet model: the minimum unit of the cell image is first converted from pixels to 4 × 4 patches by the Patch Partition layer, and a linear embedding module maintains the structure of the high-dimensional space; down-sampling then follows, where the initial stage still uses the convolution layers of Unet and the next two down-samplings are replaced by paired Swin-Transformer blocks, whose structure is shown in FIG. 5 and whose basic formula is formula (6):

$$\mathrm{Attention}(Q,K,V)=\mathrm{SoftMax}\!\left(\frac{QK^{T}}{\sqrt{d}}+B\right)V\tag{6}$$

where Q, K, V are respectively the query, key and value matrices in self-attention, d is the dimension, B is the learnable relative position bias, and Attention(Q, K, V) is the attention function within each patch window;
4-2, building the neck layer of the Swin-Unet model: a group of paired Swin-Transformer blocks filters the down-sampled high-dimensional feature information;
4-3, constructing the decoder part of the Swin-Unet model, a network structure mirroring the encoder network: first two rounds of paired Swin-Transformer + patch expanding are performed; the last layer of features undergoes the original up-sampling plus two convolutions; each up-sampling module concatenates the feature map of the corresponding encoder stage to form a residual block; the output features then undergo one linear projection and are sent into a convolution network for classification to obtain the final output result;
4-4, training the Swin-Unet model: the overlapping cell clusters are input as the segmentation data set with the batch size set to 64; 80 epochs are trained and different IoU thresholds are set (in the instance segmentation task, IoU is the intersection-over-union of the predicted mask and the ground-truth mask); the training results are shown in Table 2.
TABLE 2
[Table 2: segmentation results of the Swin-Unet model at different IoU thresholds; presented only as an image in the original.]
The indices are the same as in Table 1.
Step 5, detecting abnormal cells using the improved YOLOv7 and Swin-Unet models; the algorithm flow is shown in FIG. 6, and the detailed process is as follows:
5-1, obtaining a cell smear image from the pathological cytology examination, performing sliding-window cutting on the cell smear image, and inputting the cut cell images into the detection network in sequence;
5-2, extracting the features of the cell image with the backbone network of the improved YOLOv7 network, sending the feature maps of different scales into the neck network for feature fusion, then sending them into the detection head network and outputting the detection results;
5-3, screening the detection results with the detection result screening module: results judged to be abnormal cells are output directly, and results judged to be overlapping cell clusters are sent into the segmentation network;
and 5-4, down-sampling the overlapping cell images with the encoder of the Swin-Unet network, filtering them with the neck network, and up-sampling them in the decoder network using the residual structure; the output feature maps enter a convolution layer that classifies the segmented image regions, the regions judged to be abnormal cells are output, and together with the abnormal cells from step 3 they form the final output result.

Claims (6)

1. An abnormal cell detection method based on improved YOLOv7 and Swin-Unet, which is characterized by comprising the following steps:
step 1, collecting a cell smear image in pathological cytology examination, and making an abnormal cell detection data set and an abnormal cell segmentation data set;
step 2, constructing an improved YOLOv7 model and training it for detecting abnormal cells and overlapping cell clusters: based on the latest YOLOv7 model, the extraction of multi-dimensional feature information is improved; the improved YOLOv7 model structure comprises an abnormal cell detection data preprocessing module (Process), a backbone network (Backbone), a neck network (Neck) and a detection network (Head);
step 3, building a detection result screening module to classify the cells in the images output by the detection network, output the abnormal cell images, and pass the overlapping cell cluster images on as input to the segmentation model;
step 4, building a Swin-Unet model and training it for segmenting the overlapping cell cluster images: based on the Unet model most commonly used in the medical field, a Swin-Transformer module is introduced to sample the local and global relations of the cell image at multiple scales; the Swin-Unet model structure comprises an encoder (Encoder), a neck network (Neck) and a decoder (Decoder);
step 5, carrying out abnormal cell detection by using improved YOLOv7 and Swin-Unet model;
the step 1 process is as follows:
1-1, collecting cell smear images in pathological cytology examination, including cervical cell images and mammary gland cell images, performing sliding window cutting on original cell smear images, wherein the cutting size is 640 multiplied by 640, the overlapping range of the sliding window is 50%, obtaining cell images of small areas, performing rectangular frame labeling on independent abnormal cells and overlapping cell groups by using a LabelImg tool, storing labels as XML files, and making an abnormal cell detection data set for training an improved YOLOv7 model;
1-2, screening out the cell images labeled as overlapped cell clusters in the abnormal cell detection data set, subdividing the segmentation regions with the polygon-labeling function of the LabelImg tool, labeling each region and each abnormal cell, saving the labels, and thereby producing an abnormal cell segmentation data set for training the Swin-Unet model.
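The sliding-window cropping of step 1-1 (640 × 640 crops with 50% overlap) can be sketched as follows. This is an illustrative implementation, not code from the patent; the edge-handling choice (shifting the last window back so every crop stays inside the image) is an assumption.

```python
def sliding_window_origins(width, height, win=640, overlap=0.5):
    """Top-left (x, y) origins for overlapping crops. Edge windows are
    shifted back so every crop stays fully inside the image (assumed)."""
    stride = int(win * (1 - overlap))  # 50% overlap -> stride of 320
    xs = list(range(0, max(width - win, 0) + 1, stride))
    ys = list(range(0, max(height - win, 0) + 1, stride))
    # cover the right and bottom borders if the stride does not land there
    if xs[-1] + win < width:
        xs.append(width - win)
    if ys[-1] + win < height:
        ys.append(height - win)
    return [(x, y) for y in ys for x in xs]

# a hypothetical 1600 x 1280 smear image yields a 4 x 3 grid of crops
origins = sliding_window_origins(1600, 1280)
```

Each `(x, y)` origin then defines one 640 × 640 crop `image[y:y+640, x:x+640]`.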
2. The method for detecting abnormal cells based on improved YOLOv7 and Swin-Unet as claimed in claim 1, wherein in step 2, the construction of the improved YOLOv7 model comprises:
2-1, constructing the abnormal cell detection data preprocessing module, which comprises: applying flipping and translation data augmentation to the cell images; and denoising the cell images with Gaussian filtering, where the Gaussian kernel function is:
g(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²)) (1)
where g(x, y) is the pixel value of the denoised cell image, x and y are the coordinates of the pixel point, and σ is the Gaussian standard deviation, which determines the degree of smoothing of the cell image;
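A minimal pure-Python construction of the discrete Gaussian kernel of formula (1), offered as a sketch rather than the patent's implementation; normalizing the sampled kernel to unit sum is standard filtering practice and an assumption here.

```python
import math

def gaussian_kernel(size=5, sigma=1.0):
    """Sample g(x, y) ∝ exp(-(x² + y²) / (2σ²)) on a size x size grid
    centered at the origin, then normalize so the weights sum to 1."""
    half = size // 2
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
               for x in range(-half, half + 1)]
              for y in range(-half, half + 1)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

k = gaussian_kernel(5, 1.0)
```

Convolving the cell image with `k` yields the denoised image; larger σ gives a flatter kernel and stronger smoothing.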
2-2, constructing the backbone network of the improved YOLOv7 model: the feature map of the input cell image is first convolved by 4 CBS modules, where a CBS module consists of a Conv layer, a BN layer, and a SiLU layer; ELAN and MP modules are then stacked to output three feature maps. The ELAN module is composed of several CBS modules; its input and output feature sizes remain unchanged, the channel count changes only in the first two CBS modules, the input channels of each subsequent module match its output channels, and the final CBS module produces the required number of output channels. The MP module concatenates the output vectors of a Maxpool branch and a CBS branch;
2-3, constructing the neck network of the improved YOLOv7 model: the three feature maps output by the backbone network are fused with a PAFPN structure;
2-4, constructing the head network of the improved YOLOv7 model: a dynamic head (Dyhead) module is introduced to perform attention fusion on the feature maps; the dynamic head module comprises scale-aware attention, spatial-aware attention, and task-aware attention, applied as stacked attention functions; the formula for applying self-attention is:
W(F) = π_C(π_S(π_L(F) · F) · F) · F (2)
where F ∈ R^(L×S×C) is the input feature tensor, L is the number of feature levels (scales), S = H × W is the result of reshaping the height H and width W dimensions of the feature map, and C is the number of channels of the feature map; π_L(·), π_S(·), π_C(·) are attention functions applied independently along the scale, spatial, and task dimensions, corresponding to formulas (3), (4), and (5):
π_L(F) · F = σ( f( (1 / (S·C)) Σ_{S,C} F ) ) · F (3)
π_S(F) · F = (1 / L) Σ_{l=1}^{L} Σ_{k=1}^{K} w_{l,k} · F(l; p_k + Δp_k; c) · Δm_k (4)
π_C(F) · F = max( α¹(F) · F_c + β¹(F), α²(F) · F_c + β²(F) ) (5)
where in formula (3), f(·) is a linear function approximated by a 1 × 1 convolution, and σ(x) = max(0, min(1, (x + 1) / 2)) is the hard-sigmoid activation function;
in formula (4), K is the number of sparse sampling positions, w_{l,k} is the attention weight for level l and position k, p_k + Δp_k is the position shifted by the self-learned spatial offset Δp_k, and Δm_k is a self-learned importance scalar at position p_k;
in formula (5), [α¹, α², β¹, β²]^T = θ(·) is a hyper-function that learns to control the activation thresholds, and F_c denotes the feature slice at the c-th channel.
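Two of the pieces above are simple enough to illustrate directly: the hard-sigmoid σ(x) used in formula (3) and the channel-wise max of formula (5). This is an illustrative sketch, not the patent's code; the α/β values are treated as plain scalars here, whereas in Dyhead they are produced by the learned hyper-function θ(·).

```python
def hard_sigmoid(x):
    """Hard-sigmoid from formula (3): sigma(x) = max(0, min(1, (x + 1) / 2))."""
    return max(0.0, min(1.0, (x + 1.0) / 2.0))

def task_aware(feature_channel, a1, b1, a2, b2):
    """Formula (5) for one channel slice F_c:
    pi_C(F)·F = max(a1*F_c + b1, a2*F_c + b2),
    with the alphas/betas as fixed scalars for illustration only."""
    return [max(a1 * f + b1, a2 * f + b2) for f in feature_channel]
```

With `a1=1, b1=0, a2=0, b2=0` the task-aware attention reduces to a ReLU on the channel, one of the activations this formulation can learn.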
3. The method for detecting abnormal cells based on improved YOLOv7 and Swin-Unet as claimed in claim 1, wherein in step 2, the process of training the improved YOLOv7 model is:
cell images of size 640 × 640 from the abnormal cell detection data set are input with the batch-size set to 16, and 180 epochs are trained to obtain the best-performing improved YOLOv7 model; the overlapped cell cluster images are extracted from the detection results, and different IoU (intersection-over-union) thresholds are set, where @x denotes the performance at an IoU threshold of x; mAP is the mean of the AP values computed for each class, with higher values indicating better detection; Recall is the recall ratio, with higher values indicating fewer missed annotated cells.
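The IoU threshold used throughout the evaluation is the standard intersection-over-union between two axis-aligned boxes; a minimal sketch (not from the patent) is:

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2, y1 < y2."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)   # overlap area
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0
```

A detection counts as a true positive at threshold x (e.g. mAP@0.5) when `box_iou(pred, gt) >= x` for some unmatched ground-truth box.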
4. The improved YOLOv7 and Swin-Unet based abnormal cell detection method as claimed in claim 1, wherein the process of constructing the Swin-Unet model in step 4 comprises:
4-1, constructing the encoder part of the Swin-Unet model: the Patch Partition layer first converts the minimal unit of the cell image from a pixel into a 4 × 4 patch, and a linear embedding module maps it into a high-dimensional space; the down-sampling process then follows, where the initial stage still uses the convolution layers of Unet and the next two down-sampling stages are replaced by pairs of Swin-Transformer blocks, whose basic formula is:
Attention(Q, K, V) = SoftMax(Q·Kᵀ / √d + B) · V (6)
where Q, K, V are the query, key, and value matrices of self-attention, d is the dimension of the query/key vectors, B is a learnable relative position bias, and Attention(Q, K, V) is the attention function computed within each patch window;
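Formula (6) for a single window can be sketched with NumPy as follows. This is an illustrative, non-learned version: the bias `B` is set to zeros here, whereas in Swin-Transformer it is a learned table indexed by relative position.

```python
import numpy as np

def window_attention(Q, K, V, B):
    """Attention(Q, K, V) = SoftMax(Q K^T / sqrt(d) + B) V, formula (6),
    for one window of n tokens; B is the relative position bias matrix."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d) + B              # (n, n) logits + bias
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
n, d = 4, 8                       # e.g. a 2 x 2 window, embedding dim 8
Q, K, V = rng.standard_normal((3, n, d))
B = np.zeros((n, n))              # zero bias, for illustration only
out = window_attention(Q, K, V, B)
```

Each output row is a convex combination of the rows of `V`, which is what makes attention a weighted feature aggregation within the window.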
4-2, constructing the neck layer of the Swin-Unet model: the down-sampled high-dimensional feature information is filtered with a pair of Swin-Transformer blocks;
4-3, constructing the decoder part of the Swin-Unet model, the network structure mirroring the encoder network: two stages of paired Swin-Transformer blocks plus patch expanding are applied first, and the last feature layer uses the original up-sampling plus 2 × convolution; each up-sampling module is concatenated with the feature map of the encoder at the same stage to form a residual block; the output features then undergo one linear projection and are fed into a convolution network for classification to obtain the final output result.
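The Patch Partition step of 4-1 (pixels regrouped into flattened 4 × 4 patches before linear embedding) can be sketched with NumPy reshapes. This is an illustrative sketch under the assumption of an (H, W, C) image whose sides are divisible by the patch size.

```python
import numpy as np

def patch_partition(image, p=4):
    """Split an (H, W, C) image into non-overlapping p x p patches and
    flatten each patch into one token vector of length p*p*C."""
    h, w, c = image.shape
    assert h % p == 0 and w % p == 0
    x = image.reshape(h // p, p, w // p, p, c)
    x = x.transpose(0, 2, 1, 3, 4)            # (H/p, W/p, p, p, C)
    return x.reshape((h // p) * (w // p), p * p * c)

img = np.arange(8 * 8 * 3).reshape(8, 8, 3).astype(float)
tokens = patch_partition(img)                 # 4 tokens of length 48
```

The linear embedding module then projects each 48-dimensional token to the model's hidden dimension; patch merging/expanding in the encoder and decoder perform analogous regroupings at coarser scales.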
5. The improved YOLOv7 and Swin-Unet based abnormal cell detection method as claimed in claim 1, wherein the Swin-Unet model training process in step 4 is:
with the overlapped cell cluster images as input, the batch-size is set to 64 and 80 epochs are trained, with different IoU thresholds (the intersection-over-union between the predicted Mask and the Ground Truth Mask in the segmentation task) being set.
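For the segmentation task the IoU is computed on masks rather than boxes; a minimal pure-Python sketch (not from the patent) over binary masks is:

```python
def mask_iou(pred, gt):
    """IoU between a predicted mask and a ground-truth mask, each a
    binary 2-D list (1 = cell pixel, 0 = background)."""
    inter = union = 0
    for row_p, row_g in zip(pred, gt):
        for p, g in zip(row_p, row_g):
            inter += p & g
            union += p | g
    return inter / union if union else 1.0

pred = [[1, 1, 0],
        [1, 0, 0]]
gt   = [[1, 0, 0],
        [1, 1, 0]]
```

Here `pred` and `gt` overlap on 2 pixels out of a 4-pixel union, giving an IoU of 0.5.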
6. The method for detecting abnormal cells based on improved YOLOv7 and Swin-Unet as claimed in claim 1, wherein in step 5, the process of abnormal cell detection using the improved YOLOv7 and Swin-Unet models comprises:
5-1, obtaining a cell smear image from the pathological cytology examination, performing sliding-window cropping on it, and feeding the cropped cell images into the detection network in sequence;
5-2, extracting features of the cell image with the backbone network of the improved YOLOv7 network, sending the feature maps of different scales into the neck network, sending the fused features into the detection head network, and outputting the detection results;
5-3, screening the detection results with the detection result screening module: cells judged to be abnormal are used directly as output, while those judged to be overlapped cell clusters are sent to the segmentation network;
5-4, down-sampling the overlapped cell images with the encoder of the Swin-Unet network, filtering them with the neck network, up-sampling in the decoder network using the residual structure, feeding the output feature map into a convolution layer to classify the segmented image regions, outputting the regions judged to be abnormal cells, and taking these together with the abnormal cells from step 3 as the final output result.
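The routing logic of steps 5-1 through 5-4 can be summarized in a short sketch. The function and label names below (`detect_cells`, `segment_cluster`, `"abnormal"`, `"cluster"`) are hypothetical placeholders for the two trained networks and their outputs, not identifiers from the patent.

```python
def run_pipeline(crops, detect_cells, segment_cluster):
    """Route each detection: abnormal cells go straight to the output,
    overlapped cell clusters are segmented first (steps 5-1 to 5-4).
    detect_cells and segment_cluster are placeholder callables standing
    in for the improved YOLOv7 and Swin-Unet models."""
    results = []
    for crop in crops:
        for label, region in detect_cells(crop):
            if label == "abnormal":
                results.append(region)          # step 5-3: direct output
            elif label == "cluster":
                results.extend(segment_cluster(region))  # step 5-4
    return results

# toy stand-ins for the two networks, for illustration only
fake_detect = lambda crop: [("abnormal", "cellA"), ("cluster", "blob")]
fake_segment = lambda region: ["cellB", "cellC"]
out = run_pipeline(["crop1"], fake_detect, fake_segment)
```

The final result merges directly-detected abnormal cells with the abnormal regions recovered from segmented clusters.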
CN202211726362.XA 2022-12-29 2022-12-29 Abnormal cell detection method based on improved YOLOv7 and Swin-Unet Pending CN115965602A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211726362.XA CN115965602A (en) 2022-12-29 2022-12-29 Abnormal cell detection method based on improved YOLOv7 and Swin-Unet


Publications (1)

Publication Number Publication Date
CN115965602A true CN115965602A (en) 2023-04-14

Family

ID=87363239

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211726362.XA Pending CN115965602A (en) 2022-12-29 2022-12-29 Abnormal cell detection method based on improved YOLOv7 and Swin-Unet

Country Status (1)

Country Link
CN (1) CN115965602A (en)


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116452574A (en) * 2023-04-28 2023-07-18 合肥工业大学 Gap detection method, system and storage medium based on improved YOLOv7
CN116630294A (en) * 2023-06-08 2023-08-22 南方医科大学南方医院 Whole blood sample detection method and device based on deep learning and storage medium
CN116630294B (en) * 2023-06-08 2023-12-05 南方医科大学南方医院 Whole blood sample detection method and device based on deep learning and storage medium
CN116844161A (en) * 2023-09-04 2023-10-03 深圳市大数据研究院 Cell detection classification method and system based on grouping prompt learning
CN116844161B (en) * 2023-09-04 2024-03-05 深圳市大数据研究院 Cell detection classification method and system based on grouping prompt learning
CN117314898A (en) * 2023-11-28 2023-12-29 中南大学 Multistage train rail edge part detection method
CN117314898B (en) * 2023-11-28 2024-03-01 中南大学 Multistage train rail edge part detection method

Similar Documents

Publication Publication Date Title
CN115965602A (en) Abnormal cell detection method based on improved YOLOv7 and Swin-Unet
CN112070772B (en) Blood leukocyte image segmentation method based on UNet++ and ResNet
Jiang et al. Deep learning for computational cytology: A survey
CN110472676A (en) Stomach morning cancerous tissue image classification system based on deep neural network
CN110942446A (en) Pulmonary nodule automatic detection method based on CT image
CN113378791B (en) Cervical cell classification method based on double-attention mechanism and multi-scale feature fusion
CN111951288A (en) Skin cancer lesion segmentation method based on deep learning
CN106845551A (en) A kind of histopathology image-recognizing method
CN114332572B (en) Method for extracting breast lesion ultrasonic image multi-scale fusion characteristic parameters based on saliency map-guided hierarchical dense characteristic fusion network
CN109815974A (en) A kind of cell pathology slide classification method, system, equipment, storage medium
CN112330616A (en) Automatic identification and counting method for cerebrospinal fluid cell image
CN113838009A (en) Abnormal cell detection false positive inhibition method based on semi-supervision mechanism
CN112233085A (en) Cervical cell image segmentation method based on pixel prediction enhancement
CN115471701A (en) Lung adenocarcinoma histology subtype classification method based on deep learning and transfer learning
CN115206495A (en) Renal cancer pathological image analysis method and system based on CoAtNet deep learning and intelligent microscopic device
Zhang et al. Research on application of classification model based on stack generalization in staging of cervical tissue pathological images
Jiang et al. A systematic review of deep learning-based cervical cytology screening: from cell identification to whole slide image analysis
CN114387596A (en) Automatic interpretation system for cytopathology smear
CN114140437A (en) Fundus hard exudate segmentation method based on deep learning
CN117036288A (en) Tumor subtype diagnosis method for full-slice pathological image
CN116778164A (en) Semantic segmentation method for improving deep V < 3+ > network based on multi-scale structure
CN115775226B (en) Medical image classification method based on transducer
CN115937188A (en) Cytopathology image abnormality detection method based on improved YOLOv5 and EfficientNet
CN113012167B (en) Combined segmentation method for cell nucleus and cytoplasm
CN113205484B (en) Mammary tissue classification and identification method based on transfer learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination