CN113592797A - Mammary nodule risk grade prediction system based on multi-data fusion and deep learning - Google Patents
- Publication number
- CN113592797A (application number CN202110825224.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- nodule
- breast
- molybdenum target
- target image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4038—Image mosaicing, e.g. composing plane images from plane sub-images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/32—Indexing scheme for image data processing or generation, in general involving image mosaicing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10132—Ultrasound image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20021—Dividing image into blocks, subimages or windows
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20092—Interactive image processing based on input by user
- G06T2207/20104—Interactive definition of region of interest [ROI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30068—Mammography; Breast
Abstract
The present disclosure provides a breast nodule risk level prediction system based on multi-data fusion and deep learning. The system: acquires a breast ultrasound image and a molybdenum target image of a target to be predicted, together with the corresponding clinical data, gene detection data and pathological data; segments the breast ultrasound image and the molybdenum target image with a pre-trained segmentation model to obtain ultrasound and molybdenum target images of the breast nodule region; inputs the nodule ultrasound image and molybdenum target image into a pre-trained nodule type identification model to obtain the nodule type together with nodule ultrasound image features and molybdenum target image features; quantifies the clinical data, gene detection data and pathological data, reduces their dimensionality, and splices them with the nodule ultrasound image features and molybdenum target image features to obtain multi-source fused features; and inputs the multi-source fused features into a pre-trained multilayer perceptron model to obtain the probability of the nodule class to which the target nodule belongs.
Description
Technical Field
The disclosure belongs to the technical field of medical auxiliary diagnosis, and particularly relates to a breast nodule risk level prediction system based on multi-data fusion and deep learning.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Breast cancer is one of the leading causes of death among women, and no effective primary preventive measure exists; early detection and early diagnosis are therefore the key means of improving breast cancer prognosis and reducing mortality, and are also important for clinical treatment and the choice of surgery. The main means of breast lesion examination and diagnosis include breast ultrasound, molybdenum target mammography, magnetic resonance imaging, gene detection and pathological biopsy. Faced with massive medical data, physicians carry a very heavy workload; their interpretation of image characteristics is subjective and largely experience-dependent, and factors such as the imaging mechanism of the medical imaging equipment, the acquisition conditions and the display equipment readily lead to misdiagnosis or missed diagnosis. It is therefore necessary to employ computer technology, digital image processing and artificial intelligence to realize ultrasound image-aided diagnosis. By segmenting, reconstructing, registering and identifying medical images acquired in different modalities, valuable clinical diagnostic information is obtained, enabling physicians to observe lesion sites more directly and clearly and providing an auxiliary reference for the clinical diagnosis of lesion characteristics.
The inventor finds that existing auxiliary diagnosis methods are often based on modeling a single kind of data, yet each examination emphasizes different aspects of the lesion: for example, molybdenum target images are superior for diagnosing calcifications, while ultrasound images more readily screen tumor structure. A single data source therefore cannot provide comprehensive information about the lesion; its limitations are large and its prediction accuracy is low.
Disclosure of Invention
To solve these problems, the present scheme fuses the multi-source features of the two types of breast images with clinical data, gene detection data and pathological data, selects the principal features for classifying breast nodule malignancy by principal component analysis, and performs classification prediction with a multilayer perceptron. Main clinical medical indicators can thus be objectively quantified, the accuracy of assessing breast nodule malignancy risk is improved, and the method has good adaptability.
According to a first aspect of the embodiments of the present disclosure, there is provided a breast nodule risk level prediction system based on multi-data fusion and deep learning, comprising:
a data acquisition unit, for acquiring a breast ultrasound image and a molybdenum target image of a target to be predicted, together with the corresponding clinical data, gene detection data and pathological data;
the automatic segmentation unit is used for carrying out image segmentation on the mammary gland ultrasonic image and the molybdenum target image respectively by utilizing a pre-trained segmentation model to obtain an ultrasonic image and a molybdenum target image of a mammary gland nodule region;
the type identification and feature extraction unit is used for respectively inputting the nodule ultrasonic image and the molybdenum target image into a pre-trained nodule type identification model to obtain a nodule type, a nodule ultrasonic image feature and a molybdenum target image feature;
a data fusion unit, for quantifying the clinical data, gene detection data and pathological data of the target to be predicted, reducing their dimensionality, and splicing them with the nodule ultrasound image features and molybdenum target image features to obtain multi-source fused features;
and the prediction unit is used for inputting the multi-source fusion characteristics into a pre-trained multilayer perceptron model to obtain the probability of the nodule class to which the target nodule to be detected belongs.
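The prediction step can be sketched as a minimal multilayer perceptron in NumPy. The dimensions, random weights and two-layer topology below are illustrative placeholders (the patent does not specify the perceptron's layer sizes); only the softmax output over the three risk classes mirrors the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_predict(x, w1, b1, w2, b2):
    """Minimal two-layer perceptron: hidden ReLU layer, softmax output."""
    h = np.maximum(0.0, x @ w1 + b1)      # hidden layer
    logits = h @ w2 + b2
    e = np.exp(logits - logits.max())     # numerically stable softmax
    return e / e.sum()

# Hypothetical sizes: 64-d fused feature vector, 32 hidden units,
# 3 output classes (high / medium / low malignancy risk).
x = rng.normal(size=64)
w1, b1 = rng.normal(size=(64, 32)), np.zeros(32)
w2, b2 = rng.normal(size=(32, 3)), np.zeros(3)
probs = mlp_predict(x, w1, b1, w2, b2)    # probability per risk class
```

The output vector gives the probability of the nodule belonging to each of the three malignancy risk levels, matching the prediction unit's role.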
Further, the nodule class identification model is a deep convolutional network comprising convolutional layers, down-sampling layers and fully connected layers; it is first pre-trained on the ImageNet data set, then retrained on the constructed breast nodule ultrasound image and molybdenum target image data sets respectively, and finally outputs the class of the nodule.
Further, the nodule classes comprise a high, a medium and a low malignancy risk level, corresponding respectively to the three clinical breast cancer malignancy grades.
Further, quantifying the clinical data, gene detection data and pathological data of the target to be predicted and reducing their dimensionality specifically comprises: quantifying the data and normalizing them with the z-score method, then reducing the dimensionality of the normalized data by principal component analysis.
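A minimal sketch of this quantification step, assuming NumPy and a hypothetical 100 × 20 matrix of quantified text indicators; z-score normalization and a PCA implemented via SVD stand in for the methods named in the claim.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical quantified record matrix: 100 patients x 20 indicators
# (clinical + gene-detection + pathological data, already numeric).
X = rng.normal(loc=5.0, scale=2.0, size=(100, 20))

# z-score normalization: zero mean, unit variance per feature
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via SVD: project onto the top-k principal components
k = 5
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
X_reduced = Z @ Vt[:k].T                       # 100 x 5 reduced features
explained = (S[:k] ** 2).sum() / (S ** 2).sum()  # variance retained
```

The number of retained components k is an assumption; in practice it would be chosen from the explained-variance ratio.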
Further, the multilayer perceptron is trained with k-fold cross validation, specifically: the fused multi-source features are evenly divided into k non-overlapping groups; k-1 groups are selected as the training set for training the multilayer perceptron model to identify the malignancy risk level of breast nodules, and the remaining group serves as the test set for evaluating the trained model; the experiment is repeated k times until every group has served once as the test set. The weights and bias parameters of the model are saved at each round, the result is evaluated by the accuracy on the test set, and training of the multilayer perceptron model is complete when each round's accuracy deviates from the k-round average by no more than a preset range.
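The k-fold partitioning described above can be sketched as follows; the group count k = 5 and the sample count are illustrative assumptions, not values from the text.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Split sample indices into k non-overlapping, near-equal groups."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    return np.array_split(idx, k)

folds = k_fold_indices(100, 5)
for i, test_fold in enumerate(folds):
    # k-1 groups form the training set; the remaining group is the test set
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    # ...train the multilayer perceptron on train_idx,
    #    evaluate its accuracy on test_fold, save weights and biases...
```

Each index appears in exactly one test fold, so after k rounds every group has been tested once, as the claim requires.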
Further, the segmentation model comprises a breast ultrasound image segmentation model and a molybdenum target image segmentation model, both based on deep convolutional network structures: the breast ultrasound image segmentation model comprises 13 convolutional layers and 3 down-sampling layers, and the molybdenum target image segmentation model comprises 11 convolutional layers and 3 down-sampling layers.
According to a second aspect of the embodiments of the present disclosure, there is provided an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the functions performed by the breast nodule risk level prediction system based on multi-data fusion and deep learning.
According to a third aspect of the embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the functions performed by the breast nodule risk level prediction system based on multi-data fusion and deep learning.
Compared with the prior art, the beneficial effect of this disclosure is:
the scheme not only can automatically segment a focus region by means of a deep convolutional neural network model, overcomes the defect that the weak boundary problem (namely the breast nodule boundary with uneven gradient intensity information) cannot be solved based on an active contour and the like, but also integrates the multisource characteristics of two breast image characteristics, clinical data, gene detection data and pathological data, screens out the main characteristics of the malignancy degree of the classified breast nodules by means of principal component analysis, performs classification prediction by using a multilayer perceptron, can objectively quantify main clinical medical indexes, improves the accuracy rate of evaluating the malignancy risk of the breast nodules, and obtains high adaptability.
Advantages of additional aspects of the disclosure will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the disclosure.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure and are not to limit the disclosure.
Fig. 1 is a flowchart of breast nodule malignancy risk level prediction based on multi-modality imagery and multi-source data information according to a first embodiment of the disclosure;
fig. 2(a) is an ultrasound image of an original image of a lesion used in a first embodiment of the present disclosure;
fig. 2(b) is a molybdenum target image of a lesion original image used in a first embodiment of the present disclosure;
fig. 3(a) is a picture of a lesion region mask after the expert marks the ultrasound image in fig. 2(a) according to a first embodiment of the present disclosure;
fig. 3(b) is a picture of a lesion area mask obtained after the expert marks the molybdenum target image in fig. 2(b) according to a first embodiment of the present disclosure;
FIG. 4(a) is a schematic diagram of a B-Net and M-Net segmentation model network structure according to a first embodiment of the present disclosure;
fig. 4(b) is a schematic structural diagram of a Rec-Net nodule class identification model network according to a first embodiment of the present disclosure;
fig. 5(a) is a picture of an effect of automatically segmenting a lesion area in the ultrasound image of fig. 2(a) by using B-Net according to a first embodiment of the present disclosure.
Fig. 5(b) is a diagram illustrating an effect of automatically segmenting the focus area of the molybdenum target image in fig. 2(b) by using M-Net according to a first embodiment of the present disclosure.
Detailed Description
The present disclosure is further described with reference to the following drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit example embodiments according to the present disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise; and when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.
The first embodiment is as follows:
the purpose of this embodiment is a breast nodule risk level prediction system based on multidata fusion and deep learning.
A breast nodule risk level prediction system based on multi-data fusion and deep learning comprises:
a data acquisition unit, for acquiring a breast ultrasound image and a molybdenum target image of a target to be predicted, together with the corresponding clinical data, gene detection data and pathological data;
the automatic segmentation unit is used for carrying out image segmentation on the mammary gland ultrasonic image and the molybdenum target image respectively by utilizing a pre-trained segmentation model to obtain an ultrasonic image and a molybdenum target image of a mammary gland nodule region;
the type identification and feature extraction unit is used for respectively inputting the nodule ultrasonic image and the molybdenum target image into a pre-trained nodule type identification model to obtain a nodule type, a nodule ultrasonic image feature and a molybdenum target image feature;
a data fusion unit, for quantifying the clinical data, gene detection data and pathological data of the target to be predicted, reducing their dimensionality, and splicing them with the nodule ultrasound image features and molybdenum target image features to obtain multi-source fused features;
and the prediction unit is used for inputting the multi-source fusion characteristics into a pre-trained multilayer perceptron model to obtain the probability of the nodule class to which the target nodule to be detected belongs.
Specifically, for ease of understanding, the embodiments of the present disclosure are described in detail below with reference to the accompanying drawings:
as shown in fig. 1, a breast nodule risk level prediction system based on multi-data fusion and deep learning includes a data acquisition unit, an automatic segmentation unit, a type identification and feature extraction unit, a data fusion unit and a prediction unit, specifically:
(1) data acquisition unit
The data acquisition unit collects breast ultrasound image data and molybdenum target image data, together with the corresponding clinical data, gene detection data and pathological data, and establishes a database; specifically: lesion images (standard image formats or DICOM files) are collected, covering 5000 cases of high-, medium- and low-grade malignancy, with 8 ultrasound images and 4 molybdenum target images per patient. 1500 of these cases also have genetic testing, clinical and pathological information. Both image types are preprocessed: redundant information around the ultrasound and molybdenum target images is removed, the nodule regions are delineated by experts, and the images are resampled to 512 × 512;
(2) automatic segmentation unit
The automatic segmentation unit reads the ultrasound image and the molybdenum target image respectively and establishes two deep convolutional neural network models, denoted B-Net and M-Net, for fully automatically segmenting the nodule regions, called regions of interest (ROI), in the breast ultrasound and molybdenum target images; specifically:
1) Deep convolutional network structures B-Net and M-Net are designed to establish fully automatic segmentation models for the breast ultrasound image and the molybdenum target image, respectively. B-Net consists of 13 convolutional layers and 3 down-sampling layers; its convolution kernel sizes are 13 × 13 for the first three convolutional layers, 11 × 11 for the fourth and fifth, 5 × 5 for the sixth through eighth, and 3 × 3 for the remaining layers; the stride of the second convolutional layer is 2 and that of the rest is 1; the down-sampling layers are 3 × 3 with stride 2. M-Net consists of 11 convolutional layers and 3 down-sampling layers; its kernel sizes are 13 × 13 for the first convolutional layer, 11 × 11 for the second and third, 7 × 7 for the fourth, 5 × 5 for the fifth and sixth, and 3 × 3 for the remaining layers; the stride of the second convolutional layer is 2 and that of the rest is 1; the down-sampling layers are 3 × 3 with stride 2. Both networks are pre-trained on the ImageNet data set;
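Transcribing the layer specifications above into configuration lists makes them checkable in code. The spatial-size calculation below assumes 'same' padding (the text does not state the padding scheme), under which each stride-2 layer halves the feature map.

```python
# (kernel, stride) per conv layer, transcribed from the description.
b_net_convs = [(13, 1), (13, 2), (13, 1), (11, 1), (11, 1),
               (5, 1), (5, 1), (5, 1)] + [(3, 1)] * 5   # 13 conv layers
m_net_convs = [(13, 1), (11, 2), (11, 1), (7, 1),
               (5, 1), (5, 1)] + [(3, 1)] * 5           # 11 conv layers
pools = [(3, 2)] * 3                                    # 3 down-sampling layers

def output_size(size, layers):
    """Spatial size assuming 'same' padding: each stride-s layer divides by s."""
    for _kernel, stride in layers:
        size = -(-size // stride)   # ceiling division
    return size

# A 512 x 512 input passes one stride-2 conv and three stride-2 pools,
# so the feature map shrinks by a factor of 16.
final = output_size(512, b_net_convs + pools)
```

M-Net shares the same stride pattern, so its final feature map size is identical under this assumption.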
2) The deep convolutional networks B-Net and M-Net are initialized with the parameters retained from pre-training and then retrained with the breast ultrasound image and the molybdenum target image as input, respectively; nodule characteristics are learned automatically through the multi-layer convolution and pooling, and a segmentation probability map of the nodule region is output.
(3) Type identification and feature extraction unit
The type identification and feature extraction unit designs a deep convolutional neural network architecture for the recognition task and establishes classification models of high, medium and low malignancy risk for breast nodules, denoted Rec-Net, based on the nodule ultrasound images and molybdenum target images respectively, retaining the ultrasound image features and molybdenum target image features of each class of nodule; specifically:
1) designing a deep convolution network structure Rec-Net, and respectively establishing a classification recognition model of a breast nodule ultrasonic image and a molybdenum target image; the Rec-Net is a network structure consisting of 7 convolutional layers, 3 downsampling layers and 3 full-connection layers, and the number of neuron nodes of the three full-connection layers is 4096, 4096 and 1; the sizes of convolution kernels of each convolution layer are respectively as follows: the first two convolutional layers are 11 × 11, the third 7 × 7, the fourth 5 × 5, and the remaining convolutional layers are all 3 × 3; the step sizes are respectively: the second convolution layer and the fourth convolution layer are both 2, and the rest are all 1; the size of the down-sampling layers is 3 x 3, and the step size is 2;
2) Rec-Net is then pre-trained on the ImageNet data set and retrained with the breast nodule ultrasound images and molybdenum target images as input, outputting the class of each nodule, i.e., the high, medium or low malignancy risk grade (corresponding respectively to clinical breast cancer malignancy grades 1, 2 and 3). The ultrasound image features and molybdenum target image features of each nodule are retained;
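A sketch of how the per-nodule image features can be retained: the activation of the penultimate fully connected layer serves as the image feature vector. Random placeholder weights and scaled-down layer widths are used here so the sketch runs quickly (Rec-Net's actual fully connected sizes are 4096, 4096 and 1 per the text).

```python
import numpy as np

rng = np.random.default_rng(2)

def fc_forward(x, weights, biases):
    """Run a fully connected head; return (output, penultimate activation)."""
    feat = None
    for i, (w, b) in enumerate(zip(weights, biases)):
        x = x @ w + b
        if i < len(weights) - 1:
            x = np.maximum(0.0, x)      # ReLU on hidden layers
        if i == len(weights) - 2:
            feat = x                    # retained as the image feature
    return x, feat

# Scaled-down stand-ins for Rec-Net's 4096 / 4096 / 1 fully connected layers
dims = [512, 64, 64, 1]
ws = [rng.normal(scale=0.05, size=(a, b)) for a, b in zip(dims, dims[1:])]
bs = [np.zeros(b) for b in dims[1:]]
out, feature = fc_forward(rng.normal(size=dims[0]), ws, bs)
```

In the described system this feature vector, computed once for the ultrasound image and once for the molybdenum target image, is what the data fusion unit later splices with the text features.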
(4) data fusion unit
The data fusion unit is used for fusing the characteristics, clinical data, gene detection data and pathological data of the ultrasonic image and the molybdenum target image obtained by the deep convolutional neural network; the method specifically comprises the following steps:
1) The clinical data, gene detection data and pathological data of the breast nodules are sorted and quantified, and the features are normalized using the z-score method;
2) Feature screening and dimensionality reduction are performed on the quantified data features by principal component analysis, and the optimized feature combination is denoted the text feature;
3) The optimized text features are spliced with the ultrasound image features and molybdenum target image features retained by unit (3) to obtain the fused multi-source information features.
(5) Prediction unit
The prediction unit uses a multilayer perceptron, taking the fused multi-source information features as input, to classify and predict the malignancy degree of the breast nodules. Specifically:
1) A multilayer perceptron structure is designed that takes the fused multi-source information features as input and outputs, for each breast nodule, the probability of belonging to each malignancy risk level;
2) The multilayer perceptron model is trained with k-fold cross validation, and its parameters are fine-tuned. Specifically, the fused multi-source information features are evenly divided into k non-overlapping groups; k-1 groups are selected as the training set for training the multilayer perceptron model to identify the malignancy risk level of breast nodules, and the remaining group serves as the test set for evaluating the trained model; the experiment is repeated k times until every group has served once as the test set. The weights and bias parameters of the model are saved at each round, and the result is evaluated by the accuracy on the test set, calculated as AC = TN / (TN + FN), where AC denotes the accuracy, TN the number of correctly classified samples, and FN the number of misclassified samples. If each round's accuracy differs little from the k-round average, the set of weight and bias parameters with the highest accuracy is taken as the optimal parameters of the multilayer perceptron model, completing its training.
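The accuracy measure is simple enough to state in code, using the text's notation (TN = correctly classified samples, FN = misclassified samples):

```python
def accuracy(tn, fn):
    """AC = TN / (TN + FN), following the notation in the description:
    TN = correctly classified samples, FN = misclassified samples."""
    return tn / (tn + fn)

ac = accuracy(90, 10)   # e.g. 90 correct out of 100 test samples -> 0.9
```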
Finally, the multi-source data of a breast nodule whose malignant risk level is to be evaluated are input in sequence into the established deep convolutional neural networks and the multilayer perceptron model; the multi-source information features of the nodule are thereby obtained, analyzed, and evaluated, and the malignant risk level of the nodule is predicted.
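The end-to-end inference flow just described (segmentation, feature extraction, fusion, risk prediction) can be outlined as the following minimal sketch. Every function here is an identity or toy stand-in, not the trained B-Net/M-Net, CNN, or MLP from this disclosure; it only shows how the stages chain together.

```python
import numpy as np

def segment(image):
    """Stand-in for the segmentation models: crop to the nodule region (identity here)."""
    return image

def extract_features(region):
    """Stand-in for the CNN feature extractor: flatten and truncate to 128 values."""
    return region.ravel()[:128]

def fuse(text_feat, us_feat, mt_feat):
    """Concatenate text features with the two image feature vectors."""
    return np.concatenate([text_feat, us_feat, mt_feat])

def predict_risk(fused):
    """Stand-in for the trained MLP: softmax over three risk levels."""
    logits = fused[:3]  # placeholder projection to 3 classes
    e = np.exp(logits - logits.max())
    return e / e.sum()

us = np.ones((64, 64))    # breast ultrasound image (toy)
mt = np.ones((64, 64))    # molybdenum target image (toy)
text = np.zeros(10)       # reduced clinical/gene/case features (toy)

probs = predict_risk(fuse(text, extract_features(segment(us)),
                          extract_features(segment(mt))))
print(probs.shape)  # probability per malignant risk level
```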
Further, to demonstrate the accuracy of the scheme described in the present disclosure, the following validation experiments were performed:
FIGS. 2(a)-2(b) and FIGS. 3(a)-3(b) show the original lesion images and the corresponding lesion-region mask images used to train the B-Net and M-Net models in the experiment; FIGS. 4(a) and 4(b) show the network structures in an embodiment; FIGS. 5(a) and 5(b) show the effect of using B-Net and M-Net to automatically segment the lesion region.
In further embodiments, there is also provided:
an electronic device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, wherein the computer instructions, when executed by the processor, perform the functions performed by the system of the first embodiment. For brevity, details are not repeated here.
It should be understood that in this embodiment, the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory may include both read-only memory and random access memory, and may provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store device type information.
A computer readable storage medium storing computer instructions which, when executed by a processor, perform the functions performed by the system of the first embodiment.
The system in one embodiment may be implemented directly by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules may be located in RAM, flash memory, ROM, PROM, EPROM, registers, or other storage media well known in the art. The storage medium is located in the memory; the processor reads the information in the memory and, in combination with its hardware, completes the functions executed by the system. To avoid repetition, details are not described here.
Those of ordinary skill in the art will appreciate that the various illustrative units and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The breast nodule risk level prediction system based on multi-data fusion and deep learning provided by this embodiment can thus be realized and has broad application prospects.
The above description is only a preferred embodiment of the present disclosure and is not intended to limit the present disclosure, and various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
Claims (10)
1. A breast nodule risk level prediction system based on multidata fusion and deep learning is characterized by comprising:
the data acquisition unit is used for acquiring a mammary gland ultrasonic image and a molybdenum target image of a target to be predicted and corresponding clinical data, gene detection data and case data;
the automatic segmentation unit is used for carrying out image segmentation on the mammary gland ultrasonic image and the molybdenum target image respectively by utilizing a pre-trained segmentation model to obtain an ultrasonic image and a molybdenum target image of a mammary gland nodule region;
the type identification and feature extraction unit is used for respectively inputting the nodule ultrasonic image and the molybdenum target image into a pre-trained nodule type identification model to obtain a nodule type, a nodule ultrasonic image feature and a molybdenum target image feature;
the data fusion unit is used for quantifying clinical data, gene detection data and case data of a target to be predicted, reducing dimensions of the data, and splicing the data with ultrasonic image features of nodules and molybdenum target image features to obtain multi-source fusion features;
and the prediction unit is used for inputting the multi-source fusion characteristics into a pre-trained multilayer perceptron model to obtain the probability of the nodule class to which the target nodule to be detected belongs.
2. The breast nodule risk level prediction system based on multi-data fusion and deep learning as claimed in claim 1, wherein the nodule type recognition model is a deep convolutional network structure comprising convolutional layers, downsampling layers and fully connected layers; it is pre-trained with the ImageNet dataset, then retrained with the constructed breast nodule ultrasound image and molybdenum target image datasets respectively, and finally outputs the type of the nodule.
3. The system of claim 2, wherein the classes of nodules comprise a high malignancy risk class, a medium malignancy risk class, and a low malignancy risk class.
4. The breast nodule risk level prediction system based on multiple data fusion and deep learning as claimed in claim 1, wherein the quantification and data dimension reduction of clinical data, genetic testing data and case data of the target to be predicted are specifically: quantifying clinical data, gene detection data and case data of a target to be predicted and carrying out normalization processing by using a z-score method; and reducing the dimension of the data after the normalization processing by a principal component analysis method.
5. The breast nodule risk level prediction system based on multiple data fusion and deep learning as claimed in claim 1, wherein the multilayer perceptron is trained by a k-fold cross validation method, specifically: evenly dividing the fused multi-source information features into k non-overlapping groups, selecting k-1 groups as the training set for training the multilayer perceptron model to identify the malignant risk level of breast nodules, and using the remaining group as the test set for testing the trained multilayer perceptron model; repeating the experiment k times until each group of data has been used as the test set; and saving the weight and bias parameters of the multilayer perceptron model each time, evaluating the result by the accuracy on the test set, and finishing the training of the multilayer perceptron model when the difference between each accuracy and the mean of the k runs is within a preset range.
6. The breast nodule risk level prediction system based on multi-data fusion and deep learning as claimed in claim 5, wherein the accuracy is calculated as:
AC = TN / (TN + FN),
wherein AC represents the accuracy; TN represents the number of correctly classified samples; and FN represents the number of misclassified samples.
7. The breast nodule risk level prediction system based on multiple data fusion and deep learning as claimed in claim 1, wherein the segmentation models comprise a breast ultrasound image segmentation model and a molybdenum target image segmentation model, both based on a deep convolutional network structure; the breast ultrasound image segmentation model comprises 13 convolutional layers and 3 downsampling layers, and the molybdenum target image segmentation model comprises 11 convolutional layers and 3 downsampling layers.
8. The breast nodule risk level prediction system based on multiple data fusion and deep learning as claimed in claim 1, wherein the training of the segmentation model specifically comprises: the breast ultrasonic image segmentation model and the molybdenum target image segmentation model are pre-trained on the basis of the ImageNet data set, and then are trained through the constructed breast ultrasonic image and molybdenum target image data set respectively.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory for execution, wherein the processor implements the functions of the breast nodule risk level prediction system based on multiple data fusion and deep learning according to any one of claims 1-8.
10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the program when executed by a processor implements the functions of the breast nodule risk level prediction system based on multiple data fusion and deep learning according to any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110825224.6A CN113592797A (en) | 2021-07-21 | 2021-07-21 | Mammary nodule risk grade prediction system based on multi-data fusion and deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113592797A true CN113592797A (en) | 2021-11-02 |
Family
ID=78248768
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110825224.6A Pending CN113592797A (en) | 2021-07-21 | 2021-07-21 | Mammary nodule risk grade prediction system based on multi-data fusion and deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113592797A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108257135A (en) * | 2018-02-01 | 2018-07-06 | 浙江德尚韵兴图像科技有限公司 | The assistant diagnosis system of medical image features is understood based on deep learning method |
CN110060235A (en) * | 2019-03-27 | 2019-07-26 | 天津大学 | A kind of thyroid nodule ultrasonic image division method based on deep learning |
CN111369565A (en) * | 2020-03-09 | 2020-07-03 | 麦克奥迪(厦门)医疗诊断***有限公司 | Digital pathological image segmentation and classification method based on graph convolution network |
CN111739033A (en) * | 2020-06-22 | 2020-10-02 | 中山大学肿瘤防治中心(中山大学附属肿瘤医院、中山大学肿瘤研究所) | Method for establishing breast molybdenum target and MR image omics model based on machine learning |
CN112420170A (en) * | 2020-12-10 | 2021-02-26 | 北京理工大学 | Method for improving image classification accuracy of computer aided diagnosis system |
Non-Patent Citations (2)
Title |
---|
DONGDONG SUN 等: "Integrating genomic data and pathological images to effectively predict breast cancer clinical outcome", 《COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE》, pages 45 - 53 * |
孔小函 等: "基于卷积神经网络和多信息融合的三维乳腺超声分类方法", 《中国生物医学工程学报》, vol. 37, no. 4, pages 414 - 422 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116630680A (en) * | 2023-04-06 | 2023-08-22 | 南方医科大学南方医院 | Dual-mode image classification method and system combining X-ray photography and ultrasound |
CN116630680B (en) * | 2023-04-06 | 2024-02-06 | 南方医科大学南方医院 | Dual-mode image classification method and system combining X-ray photography and ultrasound |
CN116416235A (en) * | 2023-04-12 | 2023-07-11 | 北京建筑大学 | Feature region prediction method and device based on multi-mode ultrasonic data |
CN116416235B (en) * | 2023-04-12 | 2023-12-05 | 北京建筑大学 | Feature region prediction method and device based on multi-mode ultrasonic data |
CN116912236A (en) * | 2023-09-08 | 2023-10-20 | 首都医科大学附属北京妇产医院 | Method, system and storable medium for predicting fetal congenital heart disease risk based on artificial intelligence |
CN116912236B (en) * | 2023-09-08 | 2023-12-26 | 首都医科大学附属北京妇产医院 | Method, system and storable medium for predicting fetal congenital heart disease risk based on artificial intelligence |
CN117274185A (en) * | 2023-09-19 | 2023-12-22 | 阿里巴巴达摩院(杭州)科技有限公司 | Detection method, detection model product, electronic device, and computer storage medium |
CN117274185B (en) * | 2023-09-19 | 2024-05-07 | 阿里巴巴达摩院(杭州)科技有限公司 | Detection method, detection model product, electronic device, and computer storage medium |
CN117237351A (en) * | 2023-11-14 | 2023-12-15 | 腾讯科技(深圳)有限公司 | Ultrasonic image analysis method and related device |
CN117237351B (en) * | 2023-11-14 | 2024-04-26 | 腾讯科技(深圳)有限公司 | Ultrasonic image analysis method and related device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113592797A (en) | Mammary nodule risk grade prediction system based on multi-data fusion and deep learning | |
Das et al. | Computer-aided histopathological image analysis techniques for automated nuclear atypia scoring of breast cancer: a review | |
CN107680678B (en) | Thyroid ultrasound image nodule diagnosis system based on multi-scale convolution neural network | |
Pellicer-Valero et al. | Deep learning for fully automatic detection, segmentation, and Gleason grade estimation of prostate cancer in multiparametric magnetic resonance images | |
CN112292691A (en) | Methods and systems for improving cancer detection using deep learning | |
Hashemi et al. | Mass detection in lung CT images using region growing segmentation and decision making based on fuzzy inference system and artificial neural network | |
GB2604962A (en) | Lesion detection artificial intelligence pipeline computing system | |
Cicalese et al. | Kidney level lupus nephritis classification using uncertainty guided Bayesian convolutional neural networks | |
US11471096B2 (en) | Automatic computerized joint segmentation and inflammation quantification in MRI | |
Sahran et al. | Machine learning methods for breast cancer diagnostic | |
BenTaieb et al. | Deep learning models for digital pathology | |
CN116740435A (en) | Breast cancer ultrasonic image classifying method based on multi-mode deep learning image group science | |
Ortiz-Rodriguez et al. | Breast cancer detection by means of artificial neural networks | |
Yang et al. | Automatic prostate cancer detection on multi-parametric mri with hierarchical weakly supervised learning | |
Kaliyugarasan et al. | Pulmonary nodule classification in lung cancer from 3D thoracic CT scans using fastai and MONAI | |
Gajula et al. | An MRI brain tumour detection using logistic regression-based machine learning model | |
CN113889229A (en) | Construction method of medical image diagnosis standard based on human-computer combination | |
CN116630680B (en) | Dual-mode image classification method and system combining X-ray photography and ultrasound | |
Ganeshkumar et al. | Two-stage deep learning model for automate detection and classification of lung diseases | |
Bandaru et al. | A review on advanced methodologies to identify the breast cancer classification using the deep learning techniques | |
CN116416225A (en) | Pancreatic cancer diagnosis method, pancreatic cancer diagnosis system, pancreatic cancer medium and pancreatic cancer electronic device | |
Lai et al. | BrainSec: automated brain tissue segmentation pipeline for scalable neuropathological analysis | |
Rubin et al. | A Bayesian Network to assist mammography interpretation | |
Yücel et al. | Automated AI-based grading of neuroendocrine tumors using Ki-67 proliferation index: comparative evaluation and performance analysis | |
Parvatikar et al. | Prototypical models for classifying high-risk atypical breast lesions |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||