CN117954099A - Breast cancer risk prediction method, device, equipment and storage medium - Google Patents


Publication number
CN117954099A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410347514.8A
Other languages
Chinese (zh)
Inventor
刘建井
边海曼
戴东
徐文贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Medical University Cancer Institute and Hospital
Original Assignee
Tianjin Medical University Cancer Institute and Hospital


Abstract

The application provides a breast cancer risk prediction method, device, equipment and storage medium, and belongs to the technical field of breast cancer prediction. Clinical, PET and CT features are concatenated into a total feature matrix, a logistic regression model is established, the sample weights are determined from the model and a loss function, and a risk assessment value is obtained for each sample. Through PET/CT image processing and machine learning, the method establishes a mapping between breast cancer molecular typing and images, predicts the probability of each molecular type, and supports an individualized diagnosis and treatment plan. Compared with conventional biopsy, the prediction system reduces patient pain and medical cost, and is highly repeatable and easy to operate.

Description

Breast cancer risk prediction method, device, equipment and storage medium
Technical Field
The present application relates to the field of breast cancer prediction, and in particular, to a breast cancer risk prediction method, apparatus, device, and storage medium.
Background
Breast cancers of different molecular types differ significantly in clinical characteristics, biological behavior, choice of treatment regimen and prognosis, so determining the molecular typing of breast cancer is important for clinical practice. At present, however, typing relies mainly on pathological examination of tissue obtained by needle biopsy or surgery, which is invasive, time-consuming and prone to missed diagnosis.
Disclosure of Invention
The application aims to overcome the defects of the prior art by providing a breast cancer risk prediction method, device, equipment and storage medium.
The application provides a breast cancer risk prediction method, which comprises the following steps:
acquiring clinical characteristics and imaging images of a breast cancer patient, wherein the imaging images comprise PET images and CT images;
marking the breast cancer primary lesion ROI on the PET image and the CT image, preprocessing each image in turn, and extracting image histology (radiomics) features to obtain PET histology features and CT histology features;
representing the clinical characteristics, PET histology characteristics and CT histology characteristics as matrixes, and splicing to obtain a total characteristic matrix;
establishing a logistic regression model and a loss function according to the total feature matrix;
And determining the weight of each sample according to the logistic regression model and the loss function, multiplying and summing the weight and the characteristics of each sample to obtain a risk evaluation value of each sample, and using the risk evaluation value for breast cancer risk prediction.
Optionally, the preprocessing includes:
exponential, gradient, Laplacian of Gaussian, logarithmic, square root and wavelet filtering;
wherein the wavelet filtering applies a high-pass (H) or low-pass (L) filter along each of the 3 dimensions of the PET/CT image, giving the combinations LLH, LHL, HHL, LLL, HHH, LHH, HLL and HLH.
Optionally, the image histology feature extraction includes:
Three-dimensional and two-dimensional shape features are extracted from the original image of PET/CT;
extracting shape features, first-order features, gray level co-occurrence matrix, gray level run length matrix, gray level size area matrix and gray level dependency matrix from the preprocessed image and the original image of the PET/CT.
Optionally, determining weights of the samples according to the logistic regression model and the loss function, multiplying and summing the weights with features of the samples to obtain risk assessment values of each sample, including:
selecting the image histology features that differ significantly between groups using the Wilcoxon test;
calculating the pairwise correlation between the significantly different image histology features, determining the degree of redundancy between features from the correlation, and removing redundant high-dimensional features;
screening the remaining image histology features using LASSO regression;
and linearly combining the retained features with their corresponding feature weights to calculate the image histology score of each patient.
Optionally, LASSO regression is used to screen for the image histology feature combination with the best predictive performance, expressed as follows:
$$P(y=1\mid X)=\frac{1}{1+e^{-\sum_{i} w_i x_i}}$$
where $P(y=1\mid X)$ represents the probability that the sample is predicted to be class 1; $w_i$ is the weight of each sample, $x_i$ is the feature corresponding to each sample, and $e$ is the natural constant.
Optionally, the loss function expression is as follows:
where $y$ represents the true classification value of the sample and $\hat{y}$ represents the predicted value;
the loss function is minimized to obtain the optimal weight $w_i$ of each sample.
Optionally, the clinical features include: the patient's age, menstrual status, clinical stage, pathological type, ER, PR and HER-2 expression status, and molecular typing.
The application also provides a breast cancer risk prediction device, which comprises:
The acquisition module is used for acquiring clinical characteristics and imaging images of a breast cancer patient, wherein the imaging images comprise PET images and CT images;
the feature module is used for marking the ROI of the breast cancer primary focus of the PET image and the CT image, respectively and sequentially preprocessing the ROI, and extracting the image histology features to obtain PET histology features and CT histology features;
The splicing module is used for representing the clinical characteristics, the PET histology characteristics and the CT histology characteristics as matrixes and splicing the matrixes to obtain a total characteristic matrix;
the calculation module is used for establishing a logistic regression model and a loss function according to the total feature matrix;
And the prediction module is used for determining the weight of each sample according to the logistic regression model and the loss function, multiplying and summing the weight and the characteristics of each sample to obtain a risk evaluation value of each sample, and the risk evaluation value is used for breast cancer risk prediction.
The present application also provides a breast cancer risk prediction apparatus comprising:
a memory for storing a computer executable program of the breast cancer risk prediction method;
A processor for invoking the computer executable program to perform: acquiring clinical characteristics and imaging images of a breast cancer patient, wherein the imaging images comprise PET images and CT images; respectively and sequentially preprocessing the PET image and the CT image of the primary focus ROI of breast cancer, and extracting image histology characteristics to obtain PET histology characteristics and CT histology characteristics; representing the clinical characteristics, PET histology characteristics and CT histology characteristics as matrixes, and splicing to obtain a total characteristic matrix; establishing a logistic regression model and a loss function according to the total feature matrix; and determining the weight of each sample according to the logistic regression model and the loss function, multiplying and summing the weight and the characteristics of each sample to obtain a risk evaluation value of each sample, and using the risk evaluation value for breast cancer risk prediction.
The present application also provides a storage medium storing a computer executable program for being called by a processor to perform the steps of the breast cancer risk prediction method.
The beneficial effects of the application are as follows:
The application provides a breast cancer risk prediction method, which comprises the following steps: acquiring clinical characteristics and imaging images of a breast cancer patient, wherein the imaging images comprise PET images and CT images; respectively and sequentially preprocessing the PET image and the CT image of the primary focus ROI of breast cancer, and extracting image histology characteristics to obtain PET histology characteristics and CT histology characteristics; representing the clinical characteristics, PET histology characteristics and CT histology characteristics as matrixes, and splicing to obtain a total characteristic matrix; establishing a logistic regression model and a loss function according to the total feature matrix; and determining the weight of each sample according to the logistic regression model and the loss function, multiplying and summing the weight and the characteristics of each sample to obtain a risk evaluation value of each sample, and using the risk evaluation value for breast cancer risk prediction. According to the application, through PET/CT image processing and machine learning, a mapping relation between breast cancer molecular typing and images is established, and the probability of breast cancer molecular typing is predicted according to clinical characteristics of patients and PET/CT images, so that a more accurate individualized diagnosis and treatment scheme is formulated. Compared with the traditional biopsy method, the prediction system based on the statistical modeling has the advantages of reducing pain of patients, reducing medical cost, being high in repeatability and easy to operate and the like.
Drawings
FIG. 1 is a schematic diagram of a breast cancer risk prediction flow chart according to the present application;
FIG. 2 is a schematic diagram of a breast cancer risk prediction system according to the present application;
FIG. 3 is a schematic diagram of a breast cancer risk prediction image processing flow in the present application;
FIG. 4 is a schematic comparison of the PET/CT images of the breast cancer primary lesion in a triple-negative patient and a non-triple-negative patient;
FIG. 5 is a schematic representation of the correlation among the 10 image histology features selected from 18F-FDG PET/CT;
FIG. 6 is a schematic comparison of the ROC curves of the image histology model and the comprehensive image histology model combining clinical feature information for predicting triple-negative breast cancer molecular typing;
FIG. 7 is a schematic comparison of the calibration curves (A) and decision curves (B) of the image histology model and the comprehensive image histology model combining clinical feature information for predicting triple-negative breast cancer molecular typing.
Detailed Description
The present application is further described in conjunction with the drawings and detailed embodiments below to enable one skilled in the art to better understand and practice the application.
The following detailed description of embodiments is provided to illustrate the technical solution to be protected. The application may also be implemented in ways other than those described here; those skilled in the art may implement it by various technical means under the guidance of the inventive concept, so the application is not limited to the specific embodiments below.
Referring to fig. 1 to 3, a breast cancer risk prediction method includes the steps of:
S101, acquiring clinical characteristics and imaging images of a breast cancer patient, wherein the imaging images comprise PET images and CT images.
The acquisition of comprehensive clinical characteristics and imaging image data of breast cancer patients is a key step of diagnosis and treatment. The data comprises basic information of age, menstrual condition, clinical stage, pathological type and the like of patients, and also comprises data of important biomarkers such as ER, PR and HER-2 expression states, molecular typing and the like.
Meanwhile, the application acquires detailed imaging images of breast cancer through PET and CT imaging examination. The images provide important information about the tumor location, size, morphology and interrelation with surrounding tissue. By analyzing the images, the malignancy of the tumor, the growth rate, and the extent of spread are more accurately assessed.
PET images show the metabolic activity of the tumor, helping to identify differences between tumor tissue and normal tissue. By measuring metabolic parameters such as SUVmax, SUVmean, SUVpeak, MTV and TLG of the tumor, the growth speed and malignancy of the tumor are known.
Here, SUVmax is the maximum standardized uptake value, SUVmean is the mean standardized uptake value, SUVpeak is the peak standardized uptake value, MTV is the metabolic tumor volume, and TLG is the total lesion glycolysis.
The specific measurement method is as follows: two or more experienced nuclear medicine physicians read the images blindly on Xeleris workstations (GE Healthcare, Milwaukee, WI, USA), which allow free switching between the transverse, coronal and sagittal views. All images were processed with the PETVCAR software on the AW4.6 post-processing workstation. PETVCAR is an automated software system that uses an iterative adaptive algorithm to detect the threshold level; 42% of the SUVmax of the breast cancer primary lesion is taken as the threshold. The operator positions the mouse on the target lesion, and pressing the Insert key automatically marks a region of interest that encloses the whole lesion. If activity from outside the lesion cannot be kept out of the region of interest, the lesion is rejected before analysis. Once the region of interest is confirmed to be suitable, PETVCAR automatically calculates SUVmax, SUVmean, SUVpeak, MTV and TLG within it.
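The thresholding step above can be illustrated with a minimal sketch. It assumes a fixed 42% SUVmax cutoff applied to a flat list of voxel SUVs around one lesion, and the standard definition TLG = SUVmean × MTV; it is not the PETVCAR algorithm, which is described as iterative and adaptive. The function name, input layout and voxel volume are hypothetical, and SUVpeak is omitted because it needs the spatial neighborhood of each voxel.

```python
def pet_metabolic_parameters(suv_voxels, voxel_volume_cm3, threshold_fraction=0.42):
    """Compute SUVmax, SUVmean, MTV and TLG for one lesion.

    suv_voxels: flat list of SUVs in a box around the lesion (hypothetical input).
    voxel_volume_cm3: volume of a single voxel in cubic centimeters.
    """
    suv_max = max(suv_voxels)
    cutoff = threshold_fraction * suv_max            # 42% of SUVmax by default
    inside = [v for v in suv_voxels if v >= cutoff]  # voxels kept in the ROI
    suv_mean = sum(inside) / len(inside)
    mtv = len(inside) * voxel_volume_cm3             # metabolic tumor volume
    tlg = suv_mean * mtv                             # total lesion glycolysis
    return {"SUVmax": suv_max, "SUVmean": suv_mean, "MTV": mtv, "TLG": tlg}


# Toy values, not patient data: the cutoff is 4.2, so three voxels are kept.
params = pet_metabolic_parameters([10.0, 8.0, 5.0, 4.0, 1.0], voxel_volume_cm3=0.5)
print(params)
```

With real data the voxel list would come from the exported PET volume and the voxel volume from the image spacing.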
CT images provide more detailed anatomical information that helps determine the exact location, size, and edge characteristics of the tumor. By analyzing the CT image, it is evaluated whether or not the tumor invades the adjacent tissue, and whether or not there is metastasis of lymph nodes or other organs.
CT morphological features measured from the CT images include the location (left/right), number (single/multiple), size (maximum diameter), margin (blurred/smooth) and calcification of the breast cancer primary lesion, whether it invades the adjacent nipple or skin, and whether there is ipsilateral axillary lymph node metastasis, brain metastasis, etc.
S102, marking the breast cancer primary lesion ROI on the PET image and the CT image, preprocessing each image in turn, and extracting image histology (radiomics) features to obtain PET histology features and CT histology features.
The PET and CT images of the enrolled patients are exported in DICOM format, the standard format for medical images, which preserves the image data and facilitates subsequent processing and analysis.
Two nuclear medicine physicians, each with more than 5 years of PET/CT interpretation experience, segmented the breast cancer primary lesion region of interest (ROI) on each patient's PET and CT images (CT mainly in the soft-tissue window) using a machine-assisted manual segmentation module. Segmentation was performed in the axial, coronal and sagittal views to ensure full coverage of the tumor region, and the physicians reviewed each other's results.
When the two physicians disagree on a segmentation, they resolve it by discussion. If agreement still cannot be reached, a third nuclear medicine physician with more than 10 years of PET/CT reading experience checks and corrects the disputed ROI and makes the final judgment.
After manual segmentation, the labeled PET and CT images and the corresponding ROI files are saved in .nii (NIfTI) format, a standard neuroimaging format that is also suitable for subsequent image processing and analysis.
These images are then analyzed with the PyRadiomics module in Python 3.7.1, which provides image filtering and feature extraction. First, the PET and CT images are preprocessed with exponential, gradient, Laplacian of Gaussian, logarithmic, square root and wavelet filters. These filters emphasize image characteristics such as edges and texture.
Among these, wavelet filtering is particularly important. It applies a high-pass (H) or low-pass (L) filter along each of the three dimensions of the PET/CT image, decomposing the tumor region into sub-bands at multiple scales. This multi-scale analysis better captures intratumoral heterogeneity and provides more complete information for subsequent feature extraction and model construction. The wavelet filter combinations are LLH, LHL, HHL, LLL, HHH, LHH, HLL and HLH.
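The eight wavelet combinations follow from choosing a low-pass (L) or high-pass (H) filter independently along each of three axes, which gives 2³ = 8 sub-bands. A minimal sketch that enumerates them is shown below. The filter list uses names assumed to match PyRadiomics image-type naming and should be checked against the installed version; the helper function is hypothetical.

```python
from itertools import product

# Filters named in the text; the identifiers are an assumption about
# PyRadiomics image-type naming, not taken from the source document.
FILTERED_IMAGE_TYPES = (
    "Exponential", "Gradient", "LoG", "Logarithm", "SquareRoot", "Wavelet",
)


def wavelet_subbands(dimensions=3):
    """Return every L/H combination for the given number of image axes."""
    return ["".join(combo) for combo in product("LH", repeat=dimensions)]


subbands = wavelet_subbands()
print(len(subbands), subbands)
```

Each sub-band image, together with the other filtered images and the original image, would then be passed to feature extraction.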
For the comprehensive analysis of PET/CT images of breast cancer patients, the present application extracts a series of morphological and textural features from the raw and pre-processed images, including:
three-dimensional and two-dimensional shape features are extracted from the original image:
From the PET/CT original image, the application extracts the three-dimensional morphological characteristics of the tumor, such as volume, surface area, sphericity, relative compactness and the like. These features provide insight into the overall morphology of the tumor.
Meanwhile, the application also extracts two-dimensional shape features such as perimeter, area, circularity and the like from the CT image so as to supplement the description of tumor morphology.
Extracting other features from the preprocessed image and the original image:
morphological features: based on the edges, regions and size of the tumor, a series of morphological features were extracted.
First-order statistical features: histogram information describing the image, such as mean, median, standard deviation, etc.
Gray level co-occurrence matrix (GLCM) features: describe how often pairs of gray levels occur at a given spatial offset.
Gray level run length matrix (GLRLM) features: describe runs of consecutive voxels with the same gray level and their lengths.
Gray level size zone matrix (GLSZM) features: describe the sizes of connected zones of voxels that share the same gray level.
Gray level dependence matrix (GLDM) features: describe the dependence between the gray level of a voxel and those of its neighbors.
Next, stable image histology features are screened.
To ensure that the selected features have stability and predictive power, the present application takes the following screening steps:
Using the Wilcoxon test, the application screens out image histology features that differ significantly between the two groups (e.g., case and control groups), with the significance level set to p = 0.05.
By calculating the correlation coefficient (R) between every pair of features, the application identifies and removes redundant high-dimensional features. For each pair of features with R > 0.8, the feature with the smaller AUC is treated as redundant and removed.
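The redundancy rule can be sketched as follows, assuming Pearson correlation for R and a single-feature AUC computed by pairwise comparison of positive and negative samples. The greedy order (strongest feature first) is one possible way to apply the "drop the lower-AUC feature of a correlated pair" rule, not necessarily the procedure used in the application. The Wilcoxon screening step is omitted, and the feature names and values are invented for illustration.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def auc(values, labels):
    """Probability that a positive sample scores above a negative one (ties count 0.5)."""
    pos = [v for v, lab in zip(values, labels) if lab == 1]
    neg = [v for v, lab in zip(values, labels) if lab == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def drop_redundant(features, labels, r_threshold=0.8):
    """Keep features in descending AUC order, skipping any that correlate
    above the threshold with an already kept (stronger) feature."""
    strength = {name: auc(vals, labels) for name, vals in features.items()}
    kept = []
    for name in sorted(features, key=strength.get, reverse=True):
        if all(abs(pearson_r(features[name], features[k])) <= r_threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature table: feat_b is nearly a copy of feat_a with a lower AUC.
labels = [1, 1, 1, 0, 0, 0]
features = {
    "feat_a": [4.0, 5.0, 6.0, 1.0, 2.0, 3.0],
    "feat_b": [3.5, 5.0, 6.0, 1.0, 2.0, 4.0],
    "feat_c": [2.0, 9.0, 4.0, 3.0, 1.0, 5.0],
}
kept = drop_redundant(features, labels)
print(kept)
```

Here feat_b is removed because its correlation with feat_a exceeds 0.8 and its AUC is lower, while feat_c is weakly correlated with feat_a and is retained.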
S103, representing the clinical features, PET histology features and CT histology features as matrixes, and splicing to obtain a total feature matrix.
In the present application, the clinical feature is a matrix with i rows and j columns, denoted as a (i, j). These characteristics include the age, sex, family history, tumor marker levels, etc. of the patient, each row representing one sample and each column representing one clinical characteristic.
The CT image is characterized as a matrix having i rows and k columns, denoted as B (i, k). These features include morphological features of the tumor, density, edge information, etc., derived from in-depth analysis of the CT image. Each row represents a CT image of a sample and each column represents an image feature.
The PET metabolic parameters were a matrix with i rows and p columns, denoted C (i, p). These parameters include metabolic parameters of PET images such as SUVmax, SUVmean, which reflect the metabolic activity of the tumor. Each row represents a PET image of a sample and each column represents a metabolic parameter.
Further, the image histology score is a column vector of length i, denoted as D (i, 1), which stores the image histology score of each sample. The score is calculated from the stable features screened as described above and gives a quantitative assessment of tumor malignancy.
In the application, all feature matrixes are spliced according to a sample sequence to obtain a total feature matrix with i rows and n columns, which is marked as M (i, n). Where n=j+k+p+1 represents the total number of all features.
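A minimal sketch of the concatenation, using plain Python lists in place of matrices: each block has one row per sample, and the rows are joined side by side so that the total column count is n = j + k + p + 1. The helper name and the toy values are hypothetical.

```python
def concatenate_columns(*matrices):
    """Join matrices with the same number of rows side by side."""
    n_rows = len(matrices[0])
    if any(len(m) != n_rows for m in matrices):
        raise ValueError("all feature matrices must have one row per sample")
    return [
        [value for m in matrices for value in m[row]]
        for row in range(n_rows)
    ]


# Two samples (i = 2) with j = 2 clinical, k = 3 CT and p = 2 PET columns.
A = [[52, 1], [29, 0]]                  # clinical features A(i, j)
B = [[2.3, 1, 0], [1.0, 0, 0]]          # CT image features B(i, k)
C = [[15.16, 9.47], [6.47, 3.84]]       # PET metabolic parameters C(i, p)
D = [[0.71], [0.22]]                    # image histology score D(i, 1)

M = concatenate_columns(A, B, C, D)     # total feature matrix M(i, n)
print(len(M), len(M[0]))
```

The row order must be identical in all four blocks, since rows are matched by position rather than by a patient identifier.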
S104, establishing a logistic regression model and a loss function according to the total feature matrix.
When building the logistic regression model, the classification result of a sample is predicted by defining the probability $P(y=1\mid X)$ that the sample belongs to class 1. A multi-factor logistic regression model is used, and the weights $w_i$ are optimized iteratively to minimize the loss function and find the best predictive model.
The formula of the logistic regression model is as follows:
$$P(y=1\mid X)=\frac{1}{1+e^{-\sum_{i} w_i x_i}}$$
where $P(y=1\mid X)$ represents the probability that the sample is predicted to be of class 1; $w_i$ is the weight of each sample, $x_i$ is the feature corresponding to each sample, and $e$ is the natural constant.
In the application, the loss function is used for measuring the difference between the predicted value and the actual value, and the optimal model parameter is found by minimizing the loss function. The loss function formula is as follows:
where $y$ represents the true classification value of the sample and $\hat{y}$ represents the predicted value. By optimizing this loss function, the weights are adjusted step by step until the best model parameters are found.
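The iterative weight optimization can be sketched as below. The loss is assumed to be the binary cross-entropy that is conventionally paired with logistic regression, since the exact loss expression is not recoverable from this text; plain gradient descent, the learning rate, the step count, the constant intercept column and the toy data are likewise illustrative assumptions.

```python
from math import exp, log


def predict_proba(weights, x):
    """P(y=1|x) = 1 / (1 + e^(-sum_i w_i * x_i))."""
    z = sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + exp(-z))


def cross_entropy(weights, X, y):
    """Mean binary cross-entropy (assumed loss) between labels and predictions."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for x, t in zip(X, y):
        p = min(max(predict_proba(weights, x), eps), 1.0 - eps)
        total -= t * log(p) + (1 - t) * log(1.0 - p)
    return total / len(X)


def fit(X, y, learning_rate=0.5, steps=2000):
    """Plain gradient descent on the mean cross-entropy."""
    weights = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, t in zip(X, y):
            err = predict_proba(weights, x) - t
            for j, v in enumerate(x):
                grad[j] += err * v
        weights = [w - learning_rate * g / len(X) for w, g in zip(weights, grad)]
    return weights


# Toy data: first column is a constant 1.0 acting as an intercept (assumption).
X = [[1.0, 0.2], [1.0, 0.4], [1.0, 1.6], [1.0, 1.8]]
y = [0, 0, 1, 1]

initial_loss = cross_entropy([0.0, 0.0], X, y)
weights = fit(X, y)
final_loss = cross_entropy(weights, X, y)
print(initial_loss, final_loss, weights)
```

In practice a library solver, possibly with the L1 penalty implied by the LASSO step, would replace the hand-written loop.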
S105, determining the weight of each sample according to the logistic regression model and the loss function, multiplying and summing the weight and the characteristics of each sample to obtain a risk evaluation value of each sample, and using the risk evaluation value for breast cancer risk prediction.
The application multiplies and sums the characteristics of each sample with the corresponding weight to obtain a risk evaluation value R of each sample. The risk assessment value is calculated based on a logistic regression model, and influences of aspects such as clinical characteristics, CT image characteristics, PET metabolic parameters, image histology scores and the like are comprehensively considered.
Specifically, the risk assessment value R for each sample is calculated by the following formula:
$$R=\sum_{i} w_i x_i$$
where $w_i$ is the weight of each sample and $x_i$ is the corresponding feature value of the sample. The weights are obtained by training the logistic regression model and reflect the degree of influence of each feature on the prediction.
By calculating the risk assessment value R for each sample, the risk of breast cancer is more fully understood. The risk assessment value not only integrates clinical characteristics of a patient, but also considers various aspects such as imaging characteristics, metabolic parameters and the like, and is beneficial to improving the accuracy and reliability of prediction.
In practical applications, the degree of risk of breast cancer is estimated according to the magnitude of the risk assessment value R. For example, by setting a threshold, samples with R values higher than the threshold are classified as high risk, and samples lower than the threshold are classified as low risk. In this way, a personalized breast cancer risk prediction is provided for the patient, which is helpful for early detection and treatment of breast cancer.
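The scoring and thresholding described above can be sketched as a weighted sum followed by a cutoff. The weights, feature values and threshold below are invented for illustration. A threshold of 0 is used only because, if the predicted probability is the logistic function of R, R = 0 corresponds to a probability of 0.5; a clinically useful cutoff would have to be chosen from validation data.

```python
def risk_score(weights, features):
    """Risk assessment value R: sum of weight * feature over all features."""
    return sum(w * x for w, x in zip(weights, features))


def risk_group(score, threshold=0.0):
    """Classify a sample as high or low risk relative to a chosen threshold."""
    return "high risk" if score > threshold else "low risk"


# Hypothetical trained weights and one patient's feature row.
weights = [0.8, -0.5, 1.2]
patient = [1.0, 2.0, 0.5]

r = risk_score(weights, patient)   # 0.8 - 1.0 + 0.6
print(r, risk_group(r))
```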
The application also provides a breast cancer risk prediction device, which comprises:
The acquisition module is used for acquiring clinical characteristics and imaging images of a breast cancer patient, wherein the imaging images comprise PET images and CT images;
the feature module is used for marking the ROI of the breast cancer primary focus of the PET image and the CT image, respectively and sequentially preprocessing the ROI, and extracting the image histology features to obtain PET histology features and CT histology features;
The splicing module is used for representing the clinical characteristics, the PET histology characteristics and the CT histology characteristics as matrixes and splicing the matrixes to obtain a total characteristic matrix;
the calculation module is used for establishing a logistic regression model and a loss function according to the total feature matrix;
And the prediction module is used for determining the weight of each sample according to the logistic regression model and the loss function, multiplying and summing the weight and the characteristics of each sample to obtain a risk evaluation value of each sample, and the risk evaluation value is used for breast cancer risk prediction.
The present application also provides a breast cancer risk prediction apparatus comprising:
a memory for storing a computer executable program of the breast cancer risk prediction method;
A processor for invoking the computer executable program to perform: acquiring clinical characteristics and imaging images of a breast cancer patient, wherein the imaging images comprise PET images and CT images; respectively and sequentially preprocessing the PET image and the CT image of the primary focus ROI of breast cancer, and extracting image histology characteristics to obtain PET histology characteristics and CT histology characteristics; representing the clinical characteristics, PET histology characteristics and CT histology characteristics as matrixes, and splicing to obtain a total characteristic matrix; establishing a logistic regression model and a loss function according to the total feature matrix; and determining the weight of each sample according to the logistic regression model and the loss function, multiplying and summing the weight and the characteristics of each sample to obtain a risk evaluation value of each sample, and using the risk evaluation value for breast cancer risk prediction.
The present application also provides a storage medium storing a computer executable program for being called by a processor to perform the steps of the breast cancer risk prediction method.
Experimental description:
Clinical features and CT morphological feature comparison:
A total of 227 patients were enrolled, all women, aged 52.16 ± 11.04 years (range 24-78 years). Most had invasive ductal carcinoma (209 cases, 92.07% of the total). According to hormone receptor and HER-2 expression status, patients were divided into two groups: 82 cases of triple-negative breast cancer and 145 cases of non-triple-negative breast cancer (7 cases of Luminal A type, 109 cases of Luminal B type and 29 cases of HER-2 overexpression type).
The clinical and CT morphological characteristics of the two groups of patients are compared as shown in table 1 below.
TABLE 1 Comparison of clinical characteristics and CT morphological characteristics of triple-negative and non-triple-negative patients
As can be seen from the table above, the differences between triple-negative and non-triple-negative patients in primary tumor diameter, margin, ipsilateral lymph node metastasis, and invasion of the adjacent nipple or skin were statistically significant (p < 0.05). Triple-negative breast cancer lesions were larger, had irregular margins, and more often invaded adjacent tissue and metastasized to the sentinel lymph nodes, showing stronger invasiveness and metastatic tendency. The two groups did not differ significantly in age, menopausal status, tumor location, number, calcification, brain metastasis, pathological type or clinical stage (p > 0.05).
Comparison of PET metabolism parameters:
As shown in FIG. 4, the present application focuses on the value of PET metabolic parameters, including SUVmax, SUVmean, SUVpeak, MTV and TLG, in the diagnosis of triple negative breast cancer molecular typing.
As shown in Table 2 below, SUVmax, SUVmean, SUVpeak and TLG were higher in the triple-negative group than in the non-triple-negative group, and the differences were statistically significant (all P < 0.05); MTV did not differ significantly between the two groups (P > 0.05).
TABLE 2 Comparison of PET metabolic parameters of triple-negative and non-triple-negative patients
Note: PET is positron emission tomography; TN is triple-negative breast cancer; SUVmax is the maximum standardized uptake value; SUVmean is the mean standardized uptake value; SUVpeak is the peak standardized uptake value; MTV is the metabolic tumor volume; TLG is the total lesion glycolysis.
As shown in FIG. 4, A-D are respectively the whole-body MIP image and the CT image, PET image and PET/CT fusion image of the breast cancer primary lesion of a triple-negative breast cancer patient (female, 48 years old); the primary lesion is located in the upper outer quadrant of the right breast, with an irregular margin and a diameter of approximately 2.3 cm, SUVmax = 15.16, SUVmean = 9.47, SUVpeak = 9.21, MTV = 1.76 cm³ and TLG = 16.68 g. E-H are respectively the whole-body MIP image and the CT image, PET image and PET/CT fusion image of the breast cancer primary lesion of a non-triple-negative breast cancer patient (female, 29 years old); the primary lesion is located in the upper quadrant of the right breast, about 1.0 cm in diameter, SUVmax = 6.47, SUVmean = 3.84, SUVpeak = 2.97, MTV = 0.57 cm³ and TLG = 2.17 g.
Value of 18F-FDG PET/CT image histology analysis in the molecular typing diagnosis of triple-negative breast cancer:
With the continuous development of big-data artificial intelligence analysis technology, an image histology (radiomics) model constructed from medical images can also effectively predict the molecular typing of breast cancer. Based on PET images and CT images, the application establishes an image histology model from 18F-FDG PET/CT images through three-dimensional target region delineation, image feature extraction, and feature dimension reduction and screening.
As shown in FIG. 5, there are 4 image histology features based on CT images and 6 image histology features based on PET images.
As shown in FIG. 6, ROC curve analysis showed that the area under the ROC curve (AUC) was 0.83, the accuracy was 75.9%, the sensitivity was 74.5% and the specificity was 77.2%, indicating that the image histology model can effectively predict the molecular typing of triple-negative breast cancer.
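The four reported metrics can be computed from predicted scores and true labels as sketched below. The AUC is obtained as the fraction of positive/negative pairs that the score ranks correctly, and the other metrics come from a confusion matrix at a fixed threshold. The function names, the 0.5 threshold, and the labels and scores are hypothetical; they are not the application's data or its chosen cut-off.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Each correctly ordered pair counts 1, each tied pair counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def confusion_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity and specificity at a fixed score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

# Hypothetical labels (1 = triple-negative) and model scores, for illustration only.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
metrics = confusion_metrics(labels, scores)
print(auc, metrics)
```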
In addition, the 8 image features that the preceding statistical analysis showed to differ significantly between the triple-negative and non-triple-negative breast cancer groups, namely tumor diameter, margin, concurrent ipsilateral axillary lymph node metastasis, invasion of the adjacent nipple or skin, SUVmax, SUVmean, SUVpeak and TLG, were added to the image histology model to construct a comprehensive image histology model that incorporates clinical feature information.
As shown in FIG. 7, ROC curve analysis showed that the comprehensive image histology model had an AUC of 0.86, an accuracy of 77.2%, a sensitivity of 78.6% and a specificity of 75.9%. Compared with the image histology model alone, predictive performance was further improved, and the DeLong test showed that the difference between the ROC curves of the two models was statistically significant (z = -3.27, p < 0.01). The calibration curves and decision curves also show that the image histology model and the comprehensive image histology model constructed by the application fit well and have potential clinical application value.

Claims (10)

1. A method for predicting risk of breast cancer, comprising:
acquiring clinical characteristics and imaging images of a breast cancer patient, wherein the imaging images comprise PET images and CT images;
respectively and sequentially preprocessing the PET image and the CT image of the primary focus ROI of breast cancer, and extracting image histology characteristics to obtain PET histology characteristics and CT histology characteristics;
representing the clinical characteristics, PET histology characteristics and CT histology characteristics as matrixes, and splicing to obtain a total characteristic matrix;
establishing a logistic regression model and a loss function according to the total feature matrix;
and determining the weight of each sample according to the logistic regression model and the loss function, multiplying and summing the weight and the characteristics of each sample to obtain a risk evaluation value of each sample, and using the risk evaluation value for breast cancer risk prediction.
2. The breast cancer risk prediction method according to claim 1, wherein the pretreatment comprises:
exponential, gradient, Laplacian of Gaussian, logarithmic, square root, and wavelet filtering;
wherein the wavelet filtering consists of combinations of high-pass (H) and low-pass (L) filtering applied along the 3 dimensions of the PET/CT image, namely LLH, LHL, HHL, LLL, HHH, LHH, HLL and HLH.
3. The method of claim 1, wherein the image histology feature extraction comprises:
extracting three-dimensional and two-dimensional shape features from the original PET/CT image;
extracting shape features, first-order features, gray level co-occurrence matrix, gray level run length matrix, gray level size zone matrix and gray level dependence matrix features from the preprocessed image and the original image of the PET/CT.
4. The breast cancer risk prediction method according to claim 1, wherein determining weights of the respective samples according to the logistic regression model and the loss function, multiplying and summing the weights with features of the respective samples, and obtaining risk assessment values of each sample, comprises:
identifying image histology features with significant differences using the Wilcoxon test;
calculating the pairwise correlation between the image histology features with significant differences, determining the degree of redundancy between the features according to the correlation, and removing the high-dimensional feature redundancy;
screening, by LASSO regression, the image histology feature combinations from which the high-dimensional feature redundancy has been removed;
and linearly combining the features retained after screening with their corresponding feature weights to calculate the image histology score of each patient.
5. The method of claim 1, wherein the combination of image histology features with higher prediction efficacy is selected using LASSO regression, expressed as follows:
$$P(y=1\mid X)=\frac{1}{1+e^{-w^{T}X}}$$
where P(y = 1|X) represents the probability that the sample is predicted to be class 1; w is the weight of each sample, X is the feature corresponding to each sample, and e is the natural constant.
6. The method of claim 1, wherein the loss function expression is as follows:
where y represents the true classification value of the sample and ŷ represents the predicted value;
the loss function is minimized to obtain the optimal weight of each sample.
7. The method of claim 1, wherein the clinical features comprise: age, menstrual status, clinical stage, pathological type, ER, PR and HER-2 expression status, and molecular typing of the patient.
8. A breast cancer risk prediction apparatus, comprising:
an acquisition module, used for acquiring clinical characteristics and imaging images of a breast cancer patient, wherein the imaging images comprise PET images and CT images;
a feature module, used for marking the ROI of the breast cancer primary focus on the PET image and the CT image, respectively and sequentially preprocessing the ROI, and extracting the image histology features to obtain PET histology features and CT histology features;
a splicing module, used for representing the clinical characteristics, the PET histology characteristics and the CT histology characteristics as matrixes and splicing the matrixes to obtain a total characteristic matrix;
a calculation module, used for establishing a logistic regression model and a loss function according to the total feature matrix;
and a prediction module, used for determining the weight of each sample according to the logistic regression model and the loss function, multiplying and summing the weight and the characteristics of each sample to obtain a risk evaluation value of each sample, the risk evaluation value being used for breast cancer risk prediction.
9. A breast cancer risk prediction apparatus, comprising:
a memory for storing a computer-executable program for a breast cancer risk prediction method according to any one of claims 1 to 7;
a processor for invoking the computer executable program to perform: acquiring clinical characteristics and imaging images of a breast cancer patient, wherein the imaging images comprise PET images and CT images; respectively and sequentially preprocessing the PET image and the CT image of the primary focus ROI of breast cancer, and extracting image histology characteristics to obtain PET histology characteristics and CT histology characteristics; representing the clinical characteristics, PET histology characteristics and CT histology characteristics as matrixes, and splicing to obtain a total characteristic matrix; establishing a logistic regression model and a loss function according to the total feature matrix; and determining the weight of each sample according to the logistic regression model and the loss function, multiplying and summing the weight and the characteristics of each sample to obtain a risk evaluation value of each sample, and using the risk evaluation value for breast cancer risk prediction.
10. A storage medium storing a computer-executable program for being invoked by a processor to perform the steps of a breast cancer risk prediction method according to any one of claims 1-7.
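The steps recited in claims 1, 5 and 6 (splicing the feature blocks, fitting a logistic regression model by minimizing a loss, and forming a weighted sum of features as a risk value) can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation. The loss is assumed to be binary cross-entropy, since the claimed loss expression is not reproduced in the text. The bias term, the gradient-descent settings, all function names and the feature values are hypothetical. The weights are treated as one weight per feature column.

```python
import math

def predict_proba(weights, bias, features):
    """P(y=1|x) = 1 / (1 + e^-(w.x + b)); the bias b is an assumption of this sketch."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(weights, bias, X, y):
    """Mean binary cross-entropy (assumed loss; the claimed expression is not shown)."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for row, target in zip(X, y):
        p = min(max(predict_proba(weights, bias, row), eps), 1.0 - eps)
        total -= target * math.log(p) + (1 - target) * math.log(1.0 - p)
    return total / len(X)

def fit(X, y, lr=0.1, epochs=2000):
    """Minimize the loss by batch gradient descent (hypothetical settings)."""
    n_features = len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, target in zip(X, y):
            err = predict_proba(weights, bias, row) - target
            grad_b += err
            for j, x in enumerate(row):
                grad_w[j] += err * x
        bias -= lr * grad_b / len(X)
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
    return weights, bias

# Hypothetical, already standardized feature blocks (NOT data from the application).
clinical = [[1.2], [0.8], [-0.9], [-1.1]]
pet = [[1.0, 0.7], [0.9, 1.1], [-1.0, -0.6], [-0.8, -1.2]]
ct = [[0.5], [0.9], [-0.7], [-0.6]]
y = [1, 1, 0, 0]

# Splice the three blocks column-wise into one total feature matrix.
X = [c + p + t for c, p, t in zip(clinical, pet, ct)]

loss_before = cross_entropy([0.0] * len(X[0]), 0.0, X, y)
weights, bias = fit(X, y)
loss_after = cross_entropy(weights, bias, X, y)

# Risk value per sample: weighted sum of its features, mapped to a probability.
linear_scores = [bias + sum(w * x for w, x in zip(weights, row)) for row in X]
risk_values = [predict_proba(weights, bias, row) for row in X]
print(loss_before, loss_after, risk_values)
```

In practice the LASSO screening of claim 4 would add an L1 penalty on the weights, which this sketch omits for brevity.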
CN202410347514.8A 2024-03-26 2024-03-26 Breast cancer risk prediction method, device, equipment and storage medium Pending CN117954099A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410347514.8A CN117954099A (en) 2024-03-26 2024-03-26 Breast cancer risk prediction method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117954099A true CN117954099A (en) 2024-04-30

Family

ID=90798545

Country Status (1)

Country Link
CN (1) CN117954099A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112216395A (en) * 2020-09-11 2021-01-12 中山大学孙逸仙纪念医院 Axillary lymph node metastasis prediction model for breast cancer patient and construction method thereof
CN113208640A (en) * 2021-04-26 2021-08-06 复旦大学附属肿瘤医院 Method for predicting axillary lymph node metastasis based on PET (positron emission tomography) imaging omics special for mammary gland
CN114677378A (en) * 2022-05-31 2022-06-28 四川省医学科学院·四川省人民医院 Computer-aided diagnosis and treatment system based on ovarian tumor benign and malignant prediction model
WO2023017901A1 (en) * 2021-08-12 2023-02-16 서울대학교산학협력단 Breast cancer risk assessment system and method
CN117711615A (en) * 2023-11-09 2024-03-15 上海健康医学院 Lymph node metastasis state classification prediction method and device based on image histology
CN117727441A (en) * 2024-01-22 2024-03-19 天津市肿瘤医院(天津医科大学肿瘤医院) Method for predicting lung cancer immune curative effect based on clinical-fusion image computer model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination