CN114972266A - Lymphoma ultrasonic image semantic segmentation method based on self-attention mechanism and stable learning - Google Patents
- Publication number
- CN114972266A (application CN202210604099.0A)
- Authority
- CN
- China
- Prior art keywords
- image
- module
- lymphoma
- self
- sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/60—Editing figures and text; Combining figures or text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/10—Image enhancement or restoration using non-spatial domain filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/42—Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/52—Scale-space analysis, e.g. wavelet analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10132—Ultrasound image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20092—Interactive image processing based on input by user
- G06T2207/20104—Interactive definition of region of interest [ROI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G06T2207/20132—Image cropping
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30096—Tumor; Lesion
Abstract
The invention relates to a lymphoma ultrasonic image semantic segmentation method based on a self-attention mechanism and stable learning. A data preprocessing module crops a region of interest from the annotated instrument-scanned images and adjusts the pixel spacing, producing a new data set for training the model. A self-attention mechanism realizes non-local interaction among the encoder's coding features, alleviating the information degradation caused by repeated sampling and yielding more accurate structural boundaries for the segmented target object. A stable learning method eliminates the dependence between environmental features and essential features by means of random Fourier features and sample weighting, reducing spurious correlations and improving segmentation accuracy. A counterfactual interpretation method explains the model's results at the instance level, improving the credibility of the model.
Description
Technical Field
The invention relates to the field of medical image processing, in particular to a lymphoma ultrasonic image semantic segmentation method based on a self-attention mechanism and stable learning.
Background
Lymphoma is a fatal cancer arising from abnormal mutation of immune-system cells. It has many different histological subtypes, whose diagnosis is usually based on sampling (biopsy). Lymphoma originates in lymph nodes and lymphoid tissue, can occur in any part of the body, and has varied clinical manifestations. The disease typically presents as painless lymph-node enlargement, and can also invade extranodal organs and damage them. Lymphomas are mainly classified into Hodgkin and non-Hodgkin lymphoma.
Lymphoma is diagnosed by combining the patient's clinical manifestations with physical, laboratory, imaging, and pathological examination results. These different methods play different roles in determining the stage and type of lymphoma, and imaging techniques are particularly important for staging and typing. Commonly used imaging methods include computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography combined with CT (PET-CT), ultrasound, endoscopy, and so on.
Because their imaging principles differ, different modalities produce images of differing diagnostic value for lymphoma. Compared with other imaging technologies, ultrasound examination is safe and free of radiation damage: it is completely non-destructive, non-invasive, and radiation-free. Because the ultrasound probe can be positioned freely, sections can be taken at multiple positions and angles during examination; lesions can be flexibly localized and measured, and the relation between a lesion and the surrounding tissue can be determined clearly. Ultrasound also provides real-time dynamic display: for the heart in particular, the beating at different phases and changes in Doppler blood flow can be observed, allowing hemodynamics to be assessed. Results are available immediately, and examinations can be repeated many times. Ultrasound is widely available (including at the bedside) and relatively low in cost. Ultrasound examination therefore plays an important role in the early diagnosis of lymphoma.
However, due to the inherent acoustic characteristics of ultrasonic imaging, the obtained images have high noise, low contrast, and poor imaging quality. The images need to be segmented, which is the key to ultrasound image analysis. Typically this segmentation is performed manually by the clinician, which reduces the objectivity of diagnosis and is labor-intensive; even experts' delineations differ slightly with their experience and skill. Therefore, for many medical image applications, correct segmentation of lesion regions by a model is the key to successful application. An automatic segmentation model that accurately captures the region of interest (ROI) in an image can provide a basis for clinicians' diagnosis or pathology studies. However, existing lymphoma image segmentation work mostly focuses on three-dimensional positron emission tomography (PET) and computed tomography (CT) images; lymphoma ultrasound images have rarely been segmented.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a lymphoma ultrasonic image semantic segmentation method based on a self-attention mechanism and stable learning, which can effectively segment a lymphoma ultrasonic image under the condition of a small amount of data.
A lymphoma ultrasonic image semantic segmentation method based on a self-attention mechanism and stable learning comprises the following steps:
acquiring a lymphoma ultrasonic image, and processing the image to be used as a training sample;
constructing a lymphoma ultrasonic image segmentation network: the lymphoma ultrasonic image segmentation network comprises a data preprocessing module, a feature extraction network module, a self-attention mechanism module, a stable learning module and a counterfactual interpretation module;
the data preprocessing module is used for cropping and annotating all lymphoma ultrasonic image samples, with the preprocessed samples serving as input data for the segmentation network;
the feature extraction network module is used for extracting the spatial information and global information of the image, fusing the extracted features, and capturing clearer object boundaries;
the self-attention mechanism module is used for realizing non-local interaction among the features and alleviating the information degradation caused by repeated sampling;
the stable learning module is used for eliminating the dependency between environmental features and essential features through random Fourier features and sample weighting, achieving generalization;
and the counterfactual interpretation module is used for giving instance-level interpretations of the model results and improving the credibility of the model.
Preferably, the operation steps of the data preprocessing module of the present invention include:
step S11, cropping the ultrasound images scanned by the ultrasound instrument to remove the extra content added by the scanning instrument;
step S12, uploading the cropped images to the PLAbel annotation system for free annotation; after annotation is finished, submitting the annotated images to an experienced medical expert for review and correction, and exporting the final annotation result;
step S13, reading the exported json file to generate the ground-truth mask image;
step S14, adjusting the pixel spacing of the annotated image and the original image to obtain images with a resolution of 256 × 256, then cropping the region-of-interest image and adjusting its resolution to 512 × 512.
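The resizing and cropping in step S14 can be sketched as follows. This is a minimal NumPy sketch: the function names `resize_nearest` and `preprocess` are illustrative, and nearest-neighbor interpolation stands in for whatever resampling the actual pipeline uses.

```python
import numpy as np

def resize_nearest(img: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbor resize of a 2-D (or 2-D + channel) image array."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

def preprocess(scan: np.ndarray, roi_box: tuple) -> tuple:
    """Produce the 256x256 full-image sample and the 512x512 ROI crop (step S14)."""
    y0, x0, y1, x1 = roi_box
    stage1 = resize_nearest(scan, 256, 256)                # coarse-localization input
    stage2 = resize_nearest(scan[y0:y1, x0:x1], 512, 512)  # boundary-refinement input
    return stage1, stage2
```

The two returned arrays correspond to the two data sets used in the two training stages described later.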
Preferably, the operation steps of the feature extraction network module of the present invention include:
step S21, down-sampling with the encoder structure to extract low-level features, so that accurate segmentation can be performed using the extracted spatial information and global information;
step S22, the decoder recovers spatial information step by step through up-sampling, fusing the features extracted during encoding to capture clearer object boundaries.
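One decoder stage of steps S21-S22 can be sketched as below. The function names, the nearest-neighbor upsampling, and the choice of channel concatenation as the fusion operation are simplifying assumptions, not the patent's exact network.

```python
import numpy as np

def upsample2x(x: np.ndarray) -> np.ndarray:
    """Nearest-neighbor 2x upsampling of an (h, w, c) feature map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def decoder_stage(deep_feat: np.ndarray, skip_feat: np.ndarray) -> np.ndarray:
    """Upsample the deeper features and fuse them with the encoder's skip features."""
    up = upsample2x(deep_feat)                       # recover spatial resolution
    return np.concatenate([up, skip_feat], axis=-1)  # fuse to sharpen object boundaries
```

Each such stage doubles the spatial resolution while mixing decoder features with same-resolution encoder features.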
Preferably, the operation steps of the self-attention mechanism module of the present invention include:
step S31, inputting the last layer of features encoded by the encoder into the TSA module; the TSA module applies different linear transformations to the input feature map together with the generated position-embedding vector, producing three vectors, query (Q), key (K), and value (V), which are used to compute attention;
step S32, multiplying Q by the transpose of K to obtain the similarity between the elements of Q and K, dividing by the square root of the dimension d_k of the vector Q to keep the softmax gradient well-behaved, and normalizing by softmax to obtain a contextual attention map A, which is multiplied by V to obtain the attention-weighted value; the formula is as follows:

A = softmax(QK^T / √d_k), Attention(Q, K, V) = A·V;

step S33, adding the return value obtained from the attention mechanism module element-wise to the last-layer features to obtain a fused feature map F as the input of the decoder.
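The computation in steps S31-S33 can be sketched as single-head scaled dot-product attention over the flattened last-layer features. The projection matrices `Wq`, `Wk`, `Wv` and the single-head formulation are simplifying assumptions about the TSA module.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def tsa_block(feat, pos, Wq, Wk, Wv):
    """feat, pos: (n, d) flattened features and position embedding; W*: (d, d) projections."""
    x = feat + pos
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    d_k = Q.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d_k))  # contextual attention map A
    out = A @ V                          # attention-weighted value
    return feat + out                    # step S33: element-wise fusion -> F
```

The residual addition in the last line is what feeds the fused map F to the decoder.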
Preferably, the operation steps of the stable learning module of the present invention include:
step S41, inputting the feature map into the stable learning module, mapping the input features from a low-dimensional to a high-dimensional space through random Fourier features, and eliminating the correlation among the features; the random Fourier feature formula is as follows:

h(x) = √2 cos(ωx + φ),

where h is the high-dimensional feature obtained by applying the random Fourier transform to the input low-dimensional feature x, ω is a random variable sampled from a standard normal distribution, ωx denotes multiplying x by ω for the transformation, and φ is a random variable sampled from a uniform distribution;
step S42, obtaining the optimal sample weighting w* by minimizing the covariance between pairs of weighted variables; the formula is as follows:

w* = argmin_{w ∈ Δ_n} Σ_{1≤i<j≤n} cov(w_i X_i, w_j X_j),

where w ranges over the sample-weight simplex Δ_n, n is the number of samples in the input batch, i.e. the number of incoming sample feature maps, X_i and X_j are feature maps of different samples in the sample space, and w_i and w_j are the weighting weights of samples X_i and X_j respectively;
step S43, since learning sample weights and features globally over all samples would incur a huge overhead in deep learning, the sample weights must be stored and reloaded; a learnable parameter α_i is used to update the global weights and features, with the update formulas:

X′_Gi = α_i X_Gi + (1 − α_i) X_L,
W′_Gi = α_i W_Gi + (1 − α_i) W_L,

where X_Gi and X_L are the global sample features and the current sample features, and W_Gi and W_L are the global sample weights and the current sample weights respectively;
step S44, multiplying the computed optimal weights by the sample loss values to obtain a new loss for training the model; the loss update formula is as follows:

loss = SoftDiceLoss(SR, GT).view(1, -1).mm(w*).view(1),

where SR is the prediction given by the model and GT is the ground truth.
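Steps S41-S42 can be sketched as below. The RFF mapping follows the formula in step S41; the decorrelation objective is a simplified reading in which covariances are computed between feature columns of the sample-weighted batch, and the function names and this simplification are assumptions rather than the patent's exact procedure.

```python
import numpy as np

def random_fourier_features(x: np.ndarray, n_out: int, rng) -> np.ndarray:
    """h(x) = sqrt(2) * cos(x @ omega + phi); omega ~ N(0, 1), phi ~ U[0, 2*pi]."""
    omega = rng.standard_normal((x.shape[1], n_out))
    phi = rng.uniform(0.0, 2.0 * np.pi, size=n_out)
    return np.sqrt(2.0) * np.cos(x @ omega + phi)

def decorrelation_objective(w: np.ndarray, X: np.ndarray) -> float:
    """Sum of squared pairwise covariances of the weighted features (minimized over w)."""
    Xw = w[:, None] * X                 # apply per-sample weights
    C = np.cov(Xw, rowvar=False)        # feature-feature covariance matrix
    return float(np.sum(np.triu(C, 1) ** 2))
```

In training, a weight vector minimizing `decorrelation_objective` would play the role of w* and rescale the per-sample losses as in step S44.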
Preferably, the operation steps of the counterfactual explanation module of the present invention include:
step S51, generating over-segmented fragments of the image with the quick-shift image segmentation algorithm;
step S52, finding an irreducible set of fragments whose masking reduces the model's IoU score the most, as shown below:

T(I\S) < T(I) (IoU reduced),

where I denotes the image with its generated fragments, S denotes a set of over-segmented fragments, and T denotes the segmentation model;
step S53, mapping the set of fragments back onto the original image to generate an instance-level interpretation of the model result.
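A minimal sketch of the masking search in steps S51-S53, using a greedy single-fragment scan in place of the full irreducible-set search; the greedy strategy, the `model` interface, and the function names are assumptions for illustration.

```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-union of two boolean masks."""
    union = np.logical_or(pred, gt).sum()
    return float(np.logical_and(pred, gt).sum() / union) if union else 1.0

def counterfactual_fragment(image, gt, segments, model):
    """Return the fragment id whose masking lowers the model's IoU most, with the drop."""
    base = iou(model(image), gt)
    best_id, best_drop = None, 0.0
    for seg_id in np.unique(segments):
        masked = image.copy()
        masked[segments == seg_id] = 0        # mask out one over-segmented fragment
        drop = base - iou(model(masked), gt)  # compare T(I\S) against T(I)
        if drop > best_drop:
            best_id, best_drop = seg_id, drop
    return best_id, best_drop
```

Mapping the returned fragment back onto the original image gives the instance-level explanation of step S53.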
For lymphoma ultrasound scan data, the method preprocesses the data into two data sets for training the model, so that the model achieves higher performance. The lymphoma ultrasound images are segmented automatically and semantically within a deep learning framework: the self-attention module and the stable learning module address the blurred boundaries of ultrasound images and the generalization problem, while the counterfactual interpretation module provides instance-level explanations of the model, improving both segmentation accuracy and model credibility. The automatic segmentation requires no human intervention, eliminates the interference of subjective factors, saves a large amount of time and labor, and improves the efficiency and precision of diagnosis.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic flow diagram of the data preprocessing module according to the present invention;
FIG. 3 is a sample structure diagram of a normative dataset according to the present invention;
FIG. 4 is a diagram of a neural network architecture according to the present invention;
FIG. 5 is a schematic diagram of a feature extraction network module architecture according to the present invention;
FIG. 6 is a schematic diagram of a self-attention mechanism module architecture of the present invention;
FIG. 7 is a block diagram of a stable learning module according to the present invention;
FIG. 8 is a diagram of a counterfactual explanation module architecture according to the present invention;
FIG. 9 is a schematic diagram of a mapping process in which the IoU changes from 0.89105 to 0.84559;
FIG. 10 is a schematic diagram of a mapping process in which the IoU changes from 0.95403 to 0.91483.
Detailed Description
A lymphoma ultrasonic image semantic segmentation method based on a self-attention mechanism and stable learning comprises the following steps:
acquiring a lymphoma ultrasonic image, and processing the image to be used as a training sample;
constructing a lymphoma ultrasonic image segmentation network: the lymphoma ultrasonic image segmentation network comprises a data preprocessing module, a feature extraction network module, a self-attention mechanism module, a stable learning module and a counterfactual interpretation module;
the data preprocessing module is used for cropping and annotating all lymphoma ultrasonic image samples, with the preprocessed samples serving as input data for the segmentation network;
the feature extraction network module is used for extracting the spatial information and global information of the image, fusing the extracted features, and capturing clearer object boundaries;
the self-attention mechanism module is used for realizing non-local interaction among the features and alleviating the information degradation caused by repeated sampling;
the stable learning module is used for eliminating the dependency between environmental features and essential features through random Fourier features and sample weighting, achieving generalization;
and the counterfactual interpretation module is used for giving instance-level interpretations of the model results and improving the credibility of the model.
As shown in FIG. 1, the flow framework of the invention mainly consists of data preprocessing, a deep neural network, and counterfactual interpretation. First, patients' lymphoma ultrasound images serve as the original data set, which is preprocessed to obtain the data set for neural-network training; the trained neural network then provides a segmentation result for an input picture; and the counterfactual interpretation method, combined with the deep neural network, gives an instance-level interpretation.
As shown in fig. 2, a preprocessing module based on a patient lymphoma ultrasound image is implemented, and the operation steps include:
step S11, cropping the ultrasound images scanned by the ultrasound instrument to remove the extra content added by the scanning instrument;
step S12, uploading the cropped images to the PLAbel annotation system for free annotation; after annotation is finished, submitting the annotated images to an experienced medical expert for review and correction, and exporting the final annotation result;
step S13, reading the exported json file to generate the ground-truth mask image;
step S14, adjusting the pixel spacing of the annotated image and the original image to obtain images with a resolution of 256 × 256, then cropping the region-of-interest image and adjusting its resolution to 512 × 512.
For the preprocessed data set, a training set and a test set are divided; the training set is used to train the neural network, and the test set is used to evaluate the accuracy of the model. Referring to FIG. 3, the preprocessed images can be seen: the stage1 images are used in the first training stage of the deep neural network for coarse localization of the lesion area, and the stage2 images are the region-of-interest images used in the second training stage for refining the target boundary of the segmented object.
As shown in fig. 4 and 5, a feature extraction network module based on patient lymphoma ultrasound images is implemented. The module extracts information through atrous (dilated) convolution in the encoder and recovers spatial information through up-sampling in the decoder to obtain the image segmentation result. The operation steps include:
step S21, down-sampling with the encoder structure to extract low-level features, so that accurate segmentation can be performed using the extracted spatial information and global information;
step S22, the decoder recovers spatial information step by step through up-sampling, fusing the features extracted during encoding to capture clearer object boundaries.
As shown in fig. 6, a self-attention mechanism module based on patient lymphoma ultrasound images is implemented. The module adds position embedding to the features of the lymphoma image and obtains an attention-weighted feature map through the self-attention mechanism. The operation steps include:
step S31, inputting the last layer of features encoded by the encoder into the TSA module; the TSA module applies different linear transformations to the input feature map together with the generated position-embedding vector, producing three vectors, query (Q), key (K), and value (V), which are used to compute attention;
step S32, multiplying Q by the transpose of K to obtain the similarity between the elements of Q and K, dividing by the square root of the dimension d_k of the vector Q to keep the softmax gradient well-behaved, and normalizing by softmax to obtain a contextual attention map A, which is multiplied by V to obtain the attention-weighted value, as shown below:

A = softmax(QK^T / √d_k), Attention(Q, K, V) = A·V;

step S33, adding the return value obtained from the attention mechanism module element-wise to the last-layer features to obtain a fused feature map F as the input of the decoder.
As shown in fig. 7, a stable learning module based on patient lymphoma ultrasound images is implemented. It obtains sample weights for updating the loss function by applying a random Fourier feature transform (RFF) to the sample features of the patient lymphoma ultrasound image and learning sample weighting for decorrelation (LSWD). The operation steps include:
step S41, inputting the feature map into the stable learning module, mapping the input features from a low-dimensional to a high-dimensional space through random Fourier features, and eliminating the correlations among features, where the random Fourier feature mapping is:

h(x) = √2 · cos(ωx + φ),

where h is the high-dimensional feature obtained by applying the random Fourier transform to the low-dimensional input feature x, ω is a random variable sampled from a standard normal distribution, ωx denotes multiplying x by the random variable ω, and φ is a random variable sampled from a uniform distribution;
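The random Fourier feature mapping of step S41 can be sketched in numpy as follows (the feature count and seed are illustrative assumptions, and φ is drawn from Uniform[0, 2π) as is conventional for RFF):

```python
import numpy as np

def random_fourier_features(x, n_features=128, seed=0):
    """h(x) = sqrt(2) * cos(omega * x + phi),
    with omega ~ N(0, 1) and phi ~ Uniform[0, 2*pi):
    maps each scalar feature into an n_features-dimensional space."""
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal(n_features)        # omega ~ N(0, 1)
    phi = rng.uniform(0.0, 2 * np.pi, n_features)  # phi ~ Uniform[0, 2*pi)
    return np.sqrt(2.0) * np.cos(np.outer(x, omega) + phi)

x = np.linspace(-1, 1, 5)
h = random_fourier_features(x)   # one 128-dim embedding per input value
```

Each input value becomes a bounded high-dimensional vector, which is what allows the subsequent weighting step to measure and suppress nonlinear dependencies between features.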
step S42, obtaining the optimal sample weights w* by minimizing the sum of covariances between pairs of weighted sample features:

w* = argmin_{w∈Δn} Σ_{1≤i<j≤n} cov(w_i X_i, w_j X_j),

where Δn denotes the set of admissible weight vectors (the n-dimensional simplex), n is the number of input batches, i.e. the number of incoming sample feature maps, X_i and X_j are feature maps of different samples in the sample space, and w_i and w_j are the weighting weights of samples X_i and X_j, respectively;
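As a toy stand-in for this weighting step (not the patented algorithm), one can learn simplex-constrained sample weights that shrink the off-diagonal weighted covariances between feature dimensions, using finite-difference projected gradient descent. All names, step counts, and the exact decorrelation objective here are our own illustrative assumptions:

```python
import numpy as np

def weighted_cov(X, w):
    """Weighted covariance matrix of the feature columns of X under sample weights w."""
    mu = (w[:, None] * X).sum(0) / w.sum()
    Xc = X - mu
    return (w[:, None] * Xc).T @ Xc / w.sum()

def decorrelation_loss(X, w):
    """Sum of squared off-diagonal weighted covariances (pairs i < j)."""
    C = weighted_cov(X, w)
    iu = np.triu_indices_from(C, k=1)
    return float((C[iu] ** 2).sum())

def learn_sample_weights(X, steps=150, lr=0.5, eps=1e-4):
    """Projected gradient descent on the simplex with finite-difference gradients."""
    n = X.shape[0]
    w = np.full(n, 1.0 / n)
    best_w, best_loss = w.copy(), decorrelation_loss(X, w)
    I = np.eye(n)
    for _ in range(steps):
        base = decorrelation_loss(X, w)
        g = np.array([(decorrelation_loss(X, w + eps * I[i]) - base) / eps
                      for i in range(n)])
        w = np.clip(w - lr * g, 1e-6, None)
        w /= w.sum()                          # project back onto the simplex
        loss = decorrelation_loss(X, w)
        if loss < best_loss:
            best_w, best_loss = w.copy(), loss
    return best_w

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 3))   # 8 samples, 3 features
w_star = learn_sample_weights(X)
```

The returned weights sum to one and give a decorrelation loss no worse than uniform weighting, which is the behaviour the patent's w* optimization aims for.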
step S43, learning sample weights and features globally over all samples would incur a huge overhead in deep learning, so the sample weights must be stored and reloaded; a learnable parameter α_i is used to perform the global weight and feature updates, with the update formulas:

X′_Gi = α_i X_Gi + (1 − α_i) X_L,
W′_Gi = α_i W_Gi + (1 − α_i) W_L,

where X_Gi and X_L are the global sample feature and current sample feature, and W_Gi and W_L are the global sample weight and current sample weight, respectively;
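The update in step S43 is a convex blend of stored global statistics with the current batch; a one-line numpy sketch (with a fixed α for illustration, though the patent makes it learnable) is:

```python
import numpy as np

def fuse_global(x_global, x_local, w_global, w_local, alpha):
    """X'_G = alpha * X_G + (1 - alpha) * X_L, and likewise for the weights:
    blend stored global features/weights with the current batch so the
    weights need not be relearned over the whole dataset each step."""
    return (alpha * x_global + (1 - alpha) * x_local,
            alpha * w_global + (1 - alpha) * w_local)

x_g, w_l = np.ones(4), np.full(4, 0.5)
x_g2, w_g2 = fuse_global(x_g, np.zeros(4), np.ones(4), w_l, alpha=0.9)
# with alpha=0.9 the global state moves 10% of the way toward the batch values
```

A large α keeps the global state stable across batches; a small α lets the current batch dominate.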
step S44, multiplying the computed optimal weights by the per-sample loss values to obtain a new loss for model training, where the loss update formula is:

loss = SoftDiceLoss(SR, GT).view(1, -1).mm(w*).view(1),

where SR and GT are the prediction given by the model and the ground truth, respectively.
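The PyTorch-style formula above is just a dot product of per-sample soft Dice losses with the learned weights; a numpy equivalent (with an illustrative smoothing term `eps` of our own choosing) is:

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-6):
    """Per-sample soft Dice loss: 1 - 2|P∩G| / (|P| + |G|)."""
    axes = tuple(range(1, pred.ndim))
    inter = (pred * target).sum(axes)
    denom = pred.sum(axes) + target.sum(axes)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)

def weighted_loss(pred, target, w):
    """Dot product of per-sample Dice losses with sample weights w* --
    the numpy analogue of SoftDiceLoss(SR, GT).view(1, -1).mm(w*).view(1)."""
    return float(soft_dice_loss(pred, target) @ w)

pred = np.array([[0.9, 0.1], [0.2, 0.8]])   # soft predictions SR
gt = np.array([[1.0, 0.0], [0.0, 1.0]])     # ground truth GT
w = np.array([0.5, 0.5])                    # learned sample weights
loss = weighted_loss(pred, gt, w)
```

Per-sample losses here are 0.1 and 0.2, so with uniform weights the combined loss is 0.15; skewing w would make the harder sample count more or less, which is exactly how the learned w* reweights training.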
As shown in fig. 8, a counterfactual interpretation module based on the patient lymphoma ultrasound image is implemented; the counterfactual interpretation module processes the patient lymphoma ultrasound image through the fast-shift image segmentation algorithm SETC and a counterfactual computation module to generate instance-level counterfactual explanations, and the operation steps include:
step S51, generating over-segmentation segments of the image with the fast-shift image segmentation algorithm;
step S52, finding an irreducible set of segments whose masking reduces the model's IoU score the most, that is:

T(I\S) < T(I) (IoU reduced),

where I represents the generated image segments, S represents an over-segmentation segment, and T represents the segmentation model;
step S53, mapping the selected set of segments back onto the original image to generate an instance-level explanation of the model result.
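Steps S51 to S53 can be sketched with a greedy masking loop. This is a toy stand-in, not the SETC pipeline: the thresholding "model", the 2×2 image, and the segment layout are all illustrative assumptions, and it ranks segments by how much masking each one alone drops the IoU.

```python
import numpy as np

def iou(pred, gt):
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def counterfactual_segments(image, gt, segments, model):
    """Mask one over-segmentation segment at a time and keep those whose
    removal lowers the model's IoU, i.e. T(I \\ S) < T(I), ranked by drop."""
    base = iou(model(image), gt)
    drops = []
    for seg_id in np.unique(segments):
        masked = image.copy()
        masked[segments == seg_id] = 0          # mask out segment S
        drop = base - iou(model(masked), gt)
        if drop > 0:
            drops.append((drop, int(seg_id)))
    return [s for _, s in sorted(drops, reverse=True)]

# toy "segmentation model": thresholding the intensity image
model = lambda img: img > 0.5
image = np.array([[0.9, 0.9], [0.1, 0.9]])
gt = np.array([[True, True], [False, True]])
segments = np.array([[0, 1], [2, 3]])           # one segment per pixel
ranked = counterfactual_segments(image, gt, segments, model)
```

Segments whose masking leaves the IoU unchanged (here, the background pixel) are excluded, matching the idea that the counterfactual explanation keeps only segments the model actually relies on.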
As shown in fig. 9 and 10, which present the counterfactual explanation results generated by the counterfactual module: fig. 9 is a schematic diagram of a case whose IoU changes from 0.89105 to 0.84559, and fig. 10 of a case whose IoU changes from 0.95403 to 0.91483. In fig. 9 and 10, a is the input original image; b is a visualization of the segments generated by the SETC algorithm; c is the counterfactual explanation generated by the counterfactual module; d is the counterfactual explanation mapped back onto the original image.
Experimental results show that the method makes efficient use of patient lymphoma ultrasound images and effectively improves segmentation accuracy for lymphoma patients; the model alleviates the boundary-blurring and generalization problems, and the counterfactual module provides instance-level explanations of the model, so that the resulting segmentations carry more credibility for doctor-assisted diagnosis.
Claims (6)
1. A lymphoma ultrasonic image semantic segmentation method based on a self-attention mechanism and stable learning is characterized by comprising the following steps:
acquiring a lymphoma ultrasonic image, and processing the image to be used as a training sample;
constructing a lymphoma ultrasonic image segmentation network: the lymphoma ultrasonic image segmentation network comprises a data preprocessing module, a feature extraction network module, a self-attention mechanism module, a stable learning module and a counterfactual interpretation module;
the data preprocessing module is used for cutting and labeling all lymphoma ultrasonic image samples and taking the preprocessed samples as input data of a segmentation network;
the feature extraction network module is used for extracting the spatial information and the global information of the image, fusing the extracted features, and capturing clearer object boundaries;
the self-attention mechanism module is used for realizing non-local interaction among features and alleviating the information degradation caused by repeated downsampling;
the stable learning module is used for eliminating the dependency relationship between the environmental characteristics and the essential characteristics in a random Fourier characteristic and sample weighting mode to realize generalization;
and the counterfactual interpretation module is used for carrying out example level interpretation on the model result and improving the credibility of the model.
2. The method for semantic segmentation of lymphoma ultrasound images based on self-attention mechanism and stable learning according to claim 1, wherein the operation steps of said data preprocessing module comprise:
step S11, cropping the ultrasound image scanned by the ultrasound instrument to remove the extra content added by the ultrasound scanning instrument;
step S12, uploading the cropped images to the PLAbel annotation system for annotation; after annotation is finished, submitting the images annotated in the system to an experienced medical expert for review and correction, and exporting the final annotation result;
step S13, reading the exported json file to generate the ground-truth mask image;
step S14, adjusting the pixel spacing between the annotation image and the original image to obtain an image with a resolution of 256 × 256, cropping out the region-of-interest image, and adjusting its resolution to 512 × 512.
3. The lymphoma ultrasound image semantic segmentation method based on self-attention mechanism and stable learning according to claim 2, wherein the feature extraction network module operating step comprises:
step S21, using the encoder structure to perform downsampling to extract low-level features, thereby performing accurate segmentation using the extracted spatial information and global information;
step S22, the decoder recovers spatial information step by step through upsampling, fusing the features extracted during encoding to capture clearer object boundaries.
4. The lymphoma ultrasound image semantic segmentation method based on self-attention mechanism and stable learning according to claim 3, wherein the self-attention mechanism module is operated by the following steps:
step S31, inputting the last layer of features output by the encoder into a TSA module; the TSA module applies different linear transformations to the input feature map and the generated position embedding vector to produce three vectors, query (Q), key (K) and value (V), which are used to compute attention;
step S32, multiplying Q by the transpose of K to obtain the similarity between elements of Q and K, dividing by the square root of the vector dimension d_k to keep the softmax gradient stable, and normalizing with softmax to obtain a contextual attention map A, which is multiplied by V to obtain the attention-weighted value:

Attention(Q, K, V) = softmax(QK^T / √d_k) · V;

step S33, adding the value returned by the attention mechanism module element-wise to the last-layer feature to obtain a fused feature map F used as the decoder input.
5. The lymphoma ultrasound image semantic segmentation method based on self-attention mechanism and stable learning according to claim 4, wherein the stable learning module is operated by the following steps:
step S41, inputting the feature map into the stable learning module, mapping the input features from a low-dimensional to a high-dimensional space through random Fourier features, and eliminating the correlations among features, where the random Fourier feature mapping is:

h(x) = √2 · cos(ωx + φ),

where h is the high-dimensional feature obtained by applying the random Fourier transform to the low-dimensional input feature x, ω is a random variable sampled from a standard normal distribution, ωx denotes multiplying x by the random variable ω, and φ is a random variable sampled from a uniform distribution;
step S42, obtaining the optimal sample weights w* by minimizing the sum of covariances between pairs of weighted sample features:

w* = argmin_{w∈Δn} Σ_{1≤i<j≤n} cov(w_i X_i, w_j X_j),

where Δn denotes the set of admissible weight vectors (the n-dimensional simplex), n is the number of input batches, i.e. the number of incoming sample feature maps, X_i and X_j are feature maps of different samples in the sample space, and w_i and w_j are the weighting weights of samples X_i and X_j, respectively;
step S43, learning sample weights and features globally over all samples would incur a huge overhead in deep learning, so the sample weights must be stored and reloaded; a learnable parameter α_i is used to perform the global weight and feature updates, with the update formulas:

X′_Gi = α_i X_Gi + (1 − α_i) X_L,
W′_Gi = α_i W_Gi + (1 − α_i) W_L,

where X_Gi and X_L are the global sample feature and current sample feature, and W_Gi and W_L are the global sample weight and current sample weight, respectively;
step S44, multiplying the computed optimal weights by the per-sample loss values to obtain a new loss for training the model, where the loss update formula is:

loss = SoftDiceLoss(SR, GT).view(1, -1).mm(w*).view(1),

where SR and GT are the prediction given by the model and the ground truth, respectively.
6. The method for semantic segmentation of lymphoma ultrasound images based on self-attention mechanism and stable learning according to claim 5, wherein the operation steps of said counterfactual interpretation module comprise:
step S51, generating over-segmentation segments of the image with the fast-shift image segmentation algorithm;
step S52, finding an irreducible set of segments whose masking reduces the model's IoU score the most, that is:

T(I\S) < T(I) (IoU reduced),

where I represents the generated image segments, S represents an over-segmentation segment, and T represents the segmentation model;
step S53, mapping the set of segments back onto the original image to generate an instance-level interpretation of the model result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210604099.0A CN114972266A (en) | 2022-05-31 | 2022-05-31 | Lymphoma ultrasonic image semantic segmentation method based on self-attention mechanism and stable learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210604099.0A CN114972266A (en) | 2022-05-31 | 2022-05-31 | Lymphoma ultrasonic image semantic segmentation method based on self-attention mechanism and stable learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114972266A true CN114972266A (en) | 2022-08-30 |
Family
ID=82957378
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210604099.0A Pending CN114972266A (en) | 2022-05-31 | 2022-05-31 | Lymphoma ultrasonic image semantic segmentation method based on self-attention mechanism and stable learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114972266A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116012586A (en) * | 2023-01-06 | 2023-04-25 | 阿里巴巴(中国)有限公司 | Image processing method, storage medium and computer terminal |
CN117351183A (en) * | 2023-10-09 | 2024-01-05 | 广州医科大学附属第一医院(广州呼吸中心) | Intelligent identification method and system for endometrial cancer lymph node metastasis |
CN117351183B (en) * | 2023-10-09 | 2024-06-04 | 广州医科大学附属第一医院(广州呼吸中心) | Intelligent identification method and system for endometrial cancer lymph node metastasis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||