CN113408603B - Coronary artery stenosis degree identification method based on multi-classifier fusion - Google Patents

Info

Publication number
CN113408603B
Authority
CN
China
Prior art keywords
image
classifier
coronary artery
sample
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110658791.7A
Other languages
Chinese (zh)
Other versions
CN113408603A (en
Inventor
谢国
王承兰
穆凌霞
李艳恺
梁莉莉
李思雨
杨婧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Futong Kangying Technology Co ltd
Original Assignee
Xi'an Huaqi Zhongxin Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an Huaqi Zhongxin Technology Development Co ltd filed Critical Xi'an Huaqi Zhongxin Technology Development Co ltd
Priority to CN202110658791.7A priority Critical patent/CN113408603B/en
Publication of CN113408603A publication Critical patent/CN113408603A/en
Application granted granted Critical
Publication of CN113408603B publication Critical patent/CN113408603B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • G06F18/256 Fusion techniques of classification results, e.g. of results related to same input data of results relating to different input data, e.g. multimodal recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/251 Fusion techniques of input or preprocessed data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/259 Fusion by voting
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

The application discloses a method for identifying the degree of coronary artery stenosis based on multi-classifier fusion. An image sample library is first constructed. The original CT image sequence extracted from cardiac CTA is preprocessed, and three main classes of features of interest are then extracted: texture features, gray-level features, and geometric features. The samples are divided into a training group and a test group, the correlation between each feature and the prediction result is calculated, and features with low correlation are eliminated. A multi-classifier fusion prediction model is then established, in which the results of three single classifiers are fused to predict the degree of coronary artery stenosis, with the weight of each single classifier in the fusion classifier determined by a weighting method. A sample is judged normal when the stenosis degree is lower than 50% and is judged a lesion sample when the stenosis degree is higher than 50%. The application thereby realizes automatic classification and pre-judgment of the stenosis degree and avoids the injury that invasive surgery brings to patients.

Description

Coronary artery stenosis degree identification method based on multi-classifier fusion
Technical Field
The application belongs to the technical field of bioengineering, and particularly relates to a method for identifying the degree of coronary artery stenosis based on multi-classifier fusion.
Background
In recent years, the incidence and mortality of cardiovascular and cerebrovascular diseases have ranked first among all diseases. The coronary arteries lie on the surface of the heart and supply blood to the myocardium, so once lesions occur, examination and treatment deserve particular attention. Coronary artery disease may be diagnosed by anatomical parameters (e.g., diameter stenosis) or by functional parameters associated with coronary myocardial ischemia. Clinically, diagnosing coronary heart disease and determining a treatment plan generally requires the patient to first undergo a cardiac CTA examination, from which the doctor gives a preliminary diagnosis based on the CT images. However, the CT images only support a preliminary, subjective judgment of the lesion degree, that is, whether the stenosis is mild, moderate, or severe. To determine the stenosis rate more precisely, the patient undergoes coronary angiography (DSA), which provides the diagnostic 'gold standard' on which the treatment plan is based. It is notable, however, that coronary angiography is an invasive procedure requiring contrast injection and pressure guidewire intervention, so patients are prone to intraoperative and postoperative adverse reactions and suffer greater trauma. For these reasons, noninvasive detection has become a current research hotspot: for patients, it avoids the pain and serious risks of interventional surgery, and for clinicians, it can greatly improve diagnostic efficiency and accuracy.
The development of radiomics makes it possible to mine hidden information from two-dimensional CT images, construct associations between image features and diseases, and thereby identify and predict lesions in vitro. Radiomics has already achieved important research results in the diagnosis and treatment of many diseases. Future medicine will develop in a more intelligent and convenient direction, and diagnosing diseases with such leading-edge technology will be a focus of future research.
Among diagnostic methods for lesions related to coronary artery stenosis, there is currently the clinical in-vivo detection method of coronary angiography with a pressure guidewire, and noninvasive diagnostic methods based on fluid mechanics and deep learning have appeared more recently. These methods still have problems that need further study, mainly in the following two respects:
The fluid mechanics simulation modeling process is complex: a patient-specific coronary artery model must be established for hemodynamic simulation, with a single model per data set. Blood flow in the human body is complex, and some factors may not yet be taken into account. For example, in transient simulation the pressure and velocity waveforms can be obtained from the literature, but a more accurate patient-specific simulation requires in-vitro or in-vivo measurement. The modeling process also needs to process and reconstruct images, which places high demands on the processor, increases the computational complexity, and correspondingly lengthens the computation time, which can reach several hours.
Deep learning algorithms are prone to overfitting: real medical data is generally difficult to obtain, so the task usually falls into the category of small-sample learning, whereas a deep neural network requires a large number of image samples for training because of its complex model structure. Such highly expressive algorithms focus on explaining the training data and tend to sacrifice the ability to explain future data, i.e., test data. They often need more data samples to avoid overfitting and to ensure good results on new data sets, which makes them unsuitable when image sample data is scarce.
Therefore, on the basis of machine learning, the application uses small-sample classifiers to suit the limited number of medical coronary CT images, thereby avoiding the tendency of existing deep learning algorithms to overfit. The coronary vessels are segmented from the CTA image for stenosis classification, so the stenosis degree is diagnosed directly without invasive examination. In addition, a diagnostic method based on multi-classifier fusion is provided that combines the advantages of multiple classifiers. Compared with a single classifier, the fusion classifier has higher classification accuracy and performance and can provide a more reliable basis for clinical diagnosis.
Disclosure of Invention
The application aims to provide a method for identifying the degree of coronary artery stenosis based on multi-classifier fusion, which realizes automatic classification and pre-judgment on the degree of the stenosis and avoids the injury brought by invasive surgery to a patient.
The technical scheme adopted by the application is that the method for identifying the degree of coronary artery stenosis based on multi-classifier fusion is implemented according to the following steps:
step 1, constructing an image sample library;
step 2, denoising and segmentation binarization processing is carried out on a CT original sequence image extracted from heart CTA, so as to obtain a coronary artery extraction image;
step 3, extracting features from the segmented image, namely three main classes of features of interest: texture features, gray-level features, and geometric features;
step 4, dividing the 500 samples into a training group and a test group at a ratio of 7:3 using a random index method; screening the three classes of radiomic features (texture, gray-level, and geometric) extracted in step 3 with the multi-class ReliefF feature weighting algorithm; performing ten-fold cross-validation on the candidate feature subsets; calculating the correlation between each feature and the prediction result; and eliminating the features with low correlation;
step 5, forming a feature set from the texture, gray-level, and geometric features selected in step 4, and establishing a multi-classifier fusion prediction model in which three classifiers that perform well in medical image classification, namely a Support Vector Machine (SVM), a Random Forest (RF), and an Extreme Learning Machine (ELM), are fused to predict the degree of coronary artery lesion; the weights of the three classifiers in the fusion classifier are determined by a weighting method, a sample is judged normal when the stenosis degree is lower than 50%, and it is judged a lesion sample when the stenosis degree is higher than 50%.
The present application is also characterized in that,
the step 1 is specifically as follows:
patient information and images of patients who underwent both cardiac CTA and coronary angiography (DSA) examinations within the last three years are collected from the hospital data system, so that each CT image can be matched to its coronary stenosis gold-standard data. After the patients' basic information in the images is hidden, 500 coronary CT images that meet the image-quality requirements are selected as input samples and labeled with their class labels.
The step 2 is specifically as follows:
step 2.1, all pixels in the neighborhood of a Gaussian noise point in the original CT image are sorted by gray value, and the gray value of the middle pixel is taken as the gray value of the noise point to denoise the image. The principle is expressed as:
$$g_{ij} = \operatorname{med}_{(i,j) \in A} \{ f_{ij} \}$$
where $i, j$ are the coordinates of the pixel, $g_{ij}$ is the gray value assigned to the noise point, $A$ is the neighborhood taken around the noise point, $\{f_{ij}\}$ is the data sequence of gray values in that neighborhood, and med denotes the median operation.
The image quality of the CT image can be improved through denoising, and meanwhile, the denoised image can more clearly reflect coronary artery structure information in the CT image, so that the segmentation operation in the step 2.2 is facilitated.
Step 2.2, let R represent the whole image. Segmentation is regarded as the process of dividing the whole denoised CT image R into c sub-regions that simultaneously satisfy the following conditions ① to ④:
① $\bigcup_{x=1}^{c} R_x = R$, where each $R_x$ is a connected sub-region;
② $R_x \cap R_y = \varnothing$ for any $x, y = 1, 2, 3, \ldots, c$ with $x \neq y$;
③ $P(R_x) = \text{True}$ for $x = 1, 2, 3, \ldots, c$;
④ $P(R_x \cup R_y) = \text{False}$ for $x \neq y$;
where $P(\cdot)$ is the similarity predicate that the pixels of a region must satisfy.
Coronary vessels forming continuous regions are then extracted from the original CT image with the region-growing segmentation algorithm to obtain the coronary artery extraction map.
The step 3 is specifically as follows:
step 3.1, extracting gray features in six aspects of mean value, variance, energy, entropy, kurtosis and skewness of the coronary artery extraction map in the step 2 by adopting a gray histogram method;
step 3.2, constructing a gray level co-occurrence matrix, selecting a sliding window of 5 multiplied by 5, calculating gray level characteristic values of each pixel point of the coronary artery extraction map in the step 2, and extracting texture characteristics of the image;
step 3.3, extracting geometric features of the coronary artery image with the Hu invariant moment method, based on the coronary artery extraction map obtained in step 2: the second-order and third-order central moments of the coronary artery image are first calculated and then normalized to obtain a group of invariant moments, which describes the geometric shape features of the coronary artery extraction map.
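Steps 3.1 and 3.2 can be illustrated with a minimal sketch. It is only an illustration under stated assumptions, not the application's implementation: the function names, the tiny 3 × 3 `window`, the single horizontal offset, the symmetric counting, and the choice of contrast as the texture statistic are all hypothetical. The source computes the co-occurrence matrix inside a sliding 5 × 5 window; the sketch handles one window and omits the sliding.

```python
import math
from collections import Counter

def histogram_features(pixels):
    """First-order statistics of a flat list of gray values, via its histogram."""
    n = len(pixels)
    prob = {g: count / n for g, count in Counter(pixels).items()}
    mean = sum(g * p for g, p in prob.items())
    variance = sum((g - mean) ** 2 * p for g, p in prob.items())
    std = math.sqrt(variance)

    def standardized_moment(order):
        if std == 0:  # constant region: report 0 instead of dividing by zero
            return 0.0
        return sum((g - mean) ** order * p for g, p in prob.items()) / std ** order

    return {
        "mean": mean,
        "variance": variance,
        "energy": sum(p * p for p in prob.values()),
        "entropy": -sum(p * math.log2(p) for p in prob.values()),
        "skewness": standardized_moment(3),
        "kurtosis": standardized_moment(4),
    }

def glcm(window, levels, offset=(0, 1)):
    """Normalized, symmetric gray-level co-occurrence matrix for one offset."""
    dr, dc = offset
    rows, cols = len(window), len(window[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                a, b = window[r][c], window[nr][nc]
                counts[a][b] += 1
                counts[b][a] += 1  # count each pixel pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def glcm_contrast(p):
    """Contrast: co-occurrence probabilities weighted by squared gray difference."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))

# Hypothetical window with gray values already quantized to 3 levels.
window = [
    [0, 0, 1],
    [0, 0, 1],
    [2, 2, 2],
]
gray = histogram_features([v for row in window for v in row])
texture_contrast = glcm_contrast(glcm(window, levels=3))
```

In practice the gray values would first be quantized to a small number of levels so that the co-occurrence matrix stays compact.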
The step 4 is specifically as follows:
step 4.1, using the ReliefF feature weighting algorithm, selecting the d most relevant features from all texture, gray-level, and geometric radiomic features extracted in step 3 to form d feature subsets, where the subsets contain 1 to d features in sequence;
step 4.2, performing ten-fold cross-validation: the sample set is divided into 10 subsets, one subset is selected as the test set each time and the remaining 9 subsets are used as the training set, this is repeated 10 times, and the average recognition accuracy over the 10 runs is taken as the result;
step 4.3, calculating the prediction error rate of each feature subset by the above process, and selecting the feature subset with the minimum prediction error rate as the input features of the multi-classifier fusion prediction model in step 5.
The ReliefF feature weighting algorithm in step 4.1 is specifically as follows:
A sample S is randomly drawn from the training sample set each time, and k nearest-neighbor samples $H_l$ and $M_l$ are found among the samples of the same class as S and among the samples of different classes, respectively. The weight of each of the texture, gray-level, and geometric features extracted in step 3 is then updated, and features whose weights are smaller than the set threshold are rejected. The feature weight is calculated as:
$$W(A) = W(A) - \sum_{l=1}^{k} \frac{\operatorname{diff}(A, S, H_l)}{mk} + \sum_{C \neq \operatorname{class}(S)} \frac{p(C)}{1 - p(\operatorname{class}(S))} \sum_{l=1}^{k} \frac{\operatorname{diff}(A, S, M_l(C))}{mk}$$
where m is the number of sampling iterations, k is the number of nearest-neighbor samples, $l = 1, \ldots, k$, $\operatorname{diff}(A, S, H_l)$ is the difference between sample S and sample $H_l$ on feature A, $M_l(C)$ is the l-th nearest neighbor of S in class C, C is a sample class, $p(C)$ is the ratio of the number of class-C samples to the total number of samples, and $p(\operatorname{class}(S))$ is the ratio of the number of samples in the class of S to the total number of samples.
The step 5 is specifically as follows:
step 5.1, the feature sample set screened in step 4 is first passed through three single classifiers, a Support Vector Machine (SVM), an Extreme Learning Machine (ELM), and a Random Forest (RF), to obtain each classifier's recognition result for the degree of coronary artery stenosis, i.e., the three class predictions that the classifiers produce for the sample to be identified; the weight of each single classifier in the final multi-classifier fusion prediction model is then calculated from its classification accuracy;
step 5.2, a weighted majority voting method is used to fuse the classification results of the three single classifiers (SVM, ELM, and RF). An output of +1 from a classifier denotes the normal class, i.e., a stenosis degree lower than 50%, and an output of -1 denotes the lesion class, i.e., a stenosis degree higher than 50%. The classification result of each classifier is multiplied by the corresponding weight obtained in step 5.1, and the three products are added to give the classification result of the multi-classifier fusion prediction model: the sample is judged normal when the sum is positive and is judged a lesion when the sum is negative.
In step 5.1, the weight occupied by each classifier is determined according to the classification accuracy, and the accuracy calculation formula of the classification model is as follows:
where a = 1, 2, 3 indexes the classifiers; n denotes the class (stenotic or non-stenotic); $e'_n$ is the accumulated number of times $E_n$ is classified into the normal class or the abnormal class; $y_a \in \{+1, -1\}$ is the label of the training sample; and the remaining symbols denote the classification results of the respective models;
the weight $w_a$ of each model is then calculated as:
wherein ,
In step 5.2, the result obtained by each model is multiplied by the corresponding weight and the products are added to obtain the final output:
$$F = \sum_{a=1}^{3} w_a \hat{y}_a$$
where $w_a$ is the weight of model a and $\hat{y}_a \in \{+1, -1\}$ is its classification result. When the output is positive, the classification result is the normal class, i.e., the stenosis degree is lower than 50%; when the output is negative, the classification result is the lesion class, i.e., the stenosis degree is higher than 50%.
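The weighted vote of step 5.2 can be sketched as follows. This is a hedged illustration only: the accuracy values are made up, and the weighting scheme (each accuracy divided by the sum of the three accuracies) is an assumption for the sake of the example; the application's own weight formula may differ. The handling of an exactly zero sum is also an assumption, since the text only covers positive and negative sums.

```python
def accuracy_weights(accuracies):
    """Normalize per-classifier accuracies so the weights sum to 1 (assumed scheme)."""
    total = sum(accuracies.values())
    return {name: acc / total for name, acc in accuracies.items()}

def fuse(votes, weights):
    """Weighted vote over {+1, -1} outputs: +1 = normal, -1 = lesion.

    A sum of exactly zero is mapped to -1 here; that tie rule is an
    assumption, not something the source specifies.
    """
    score = sum(weights[name] * vote for name, vote in votes.items())
    return (1 if score > 0 else -1), score

# Hypothetical validation accuracies of the three single classifiers.
weights = accuracy_weights({"SVM": 0.90, "ELM": 0.80, "RF": 0.70})
label, score = fuse({"SVM": -1, "ELM": +1, "RF": +1}, weights)
```

With nearly equal weights the fused result follows the simple majority, while a single classifier whose weight exceeds the sum of the other two can overrule them.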
The method for identifying the degree of coronary artery stenosis based on multi-classifier fusion has the following advantages. Because the CTA and DSA images and diagnostic reports of existing patients can be matched, a multi-classifier fusion prediction model can be built by machine learning directly from cardiac CTA results to predict the gold-standard degree of coronary artery stenosis and help determine a treatment plan. The method uses in-vitro classification prediction, which avoids the adverse reactions and trauma that invasive coronary angiography brings to the patient and removes the need for a separate coronary angiography procedure, so the applicability of coronary lesion diagnosis is improved. Meanwhile, fusing multiple classifiers combines the advantages of the individual classifiers, performs well in both prediction accuracy and prediction speed, and improves the diagnostic efficiency of clinicians.
Drawings
FIG. 1 is a schematic diagram of a framework based on a multi-classifier fusion prediction model of the present application;
FIG. 2 is a flow chart of a structure based on a multi-classifier fusion prediction model of the present application;
FIG. 3 is a schematic diagram of a region growing segmentation algorithm employed in the present application;
FIG. 4 is a flow chart of feature extraction in accordance with the present application;
FIG. 5 is a feature screening flow chart of the present application;
FIG. 6 is a flow chart of the implementation of the classifiers used in the present application;
FIG. 7 is a schematic diagram of the fusion classifier network structure of the present application.
Detailed Description
The application will be described in detail below with reference to the drawings and the detailed description.
With the aim of noninvasive identification of the degree of coronary stenosis, the application provides a machine learning method that automatically identifies the degree of disease with a fusion classifier, using two-dimensional CT images checked against the gold standard, thereby improving clinical diagnostic efficiency. As shown in the overall framework of FIG. 1, the method mainly comprises six basic modules: sample library construction, image preprocessing, feature extraction, feature screening, fusion classifier model construction, and experimental verification. It can be understood as two main stages, sample acquisition and modeling. In the sample acquisition stage, the various processing steps on the training samples are completed; in the modeling stage, a machine learning model is established and the classifier structure and parameter tuning are determined. Finally, the effectiveness of the proposed method can be verified and evaluated. It should be noted that the application is not limited to the present research context and is also applicable to the diagnosis of other diseases.
The application relates to a method for identifying the degree of coronary artery stenosis based on multi-classifier fusion, which is implemented by combining fig. 1 and fig. 2, and specifically comprises the following steps:
step 1, constructing an image sample library;
the step 1 is specifically as follows:
patient information and images of patients who underwent both cardiac CTA and coronary angiography (DSA) examinations within the last three years are collected from the hospital data system, so that each CT image can be matched to its coronary stenosis gold-standard data. After the patients' basic information in the images is hidden, 500 coronary CT images that meet the image-quality requirements are selected as input samples and labeled with their class labels.
Step 2, denoising and segmentation binarization processing is carried out on a CT original sequence image extracted from heart CTA, so as to obtain a coronary artery extraction image;
the step 2 is specifically as follows:
step 2.1, because the object of the application is a CT image and the noise mainly introduced in such medical images is Gaussian noise, median filtering is adopted to achieve a better denoising effect on the Gaussian noise. All pixels in the neighborhood of a Gaussian noise point in the original CT image are sorted by gray value, and the gray value of the middle pixel is taken as the gray value of the noise point to denoise the image. The principle is expressed as:
$$g_{ij} = \operatorname{med}_{(i,j) \in A} \{ f_{ij} \}$$
where $i, j$ are the coordinates of the pixel, $g_{ij}$ is the gray value assigned to the noise point, $A$ is the neighborhood taken around the noise point, $\{f_{ij}\}$ is the data sequence of gray values in that neighborhood, and med denotes the median operation.
The image quality of the CT image can be improved through denoising, and meanwhile, the denoised image can more clearly reflect coronary artery structure information in the CT image, so that the segmentation operation in the step 2.2 is facilitated.
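The median operation of step 2.1 can be sketched in a few lines. This is a minimal illustration, assuming a square neighborhood of hypothetical size `radius` and assuming that border pixels simply use the neighbors that fall inside the image; the source does not specify either choice.

```python
from statistics import median

def median_filter(image, radius=1):
    """Replace each pixel by the median of its (2*radius+1)^2 neighborhood.

    Border pixels use only the neighbors that lie inside the image
    (an assumed border rule).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [
                image[r][c]
                for r in range(max(0, i - radius), min(rows, i + radius + 1))
                for c in range(max(0, j - radius), min(cols, j + radius + 1))
            ]
            out[i][j] = median(window)
    return out

# Hypothetical patch: a single bright outlier in a flat region.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
denoised = median_filter(noisy)
```

The outlier at the center is replaced by the value of its surroundings, while the flat background is left unchanged.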
Step 2.2, because of the shape characteristics of blood vessels, the region to be extracted differs markedly from its surroundings, so the image is segmented with a region-growing algorithm, whose idea is to group pixels with similar characteristics into the same region. First, a seed point is selected in each region to be segmented as the starting point of region growth. Pixels around the seed point whose characteristics are the same as or similar to those of the seed are merged into the region containing the seed according to a predefined growth criterion. The newly merged pixels then serve as seeds and growth continues in the same way until the whole image has been traversed; when no further pixels satisfying the preset criterion can be merged into the seed region, the region-growing segmentation process ends.
The region-growing segmentation algorithm segments connected regions with the same characteristics well and provides good boundary information. Let R represent the whole image; segmentation is regarded as the process of dividing the whole denoised CT image R into c sub-regions that simultaneously satisfy the following conditions ① to ④:
① $\bigcup_{x=1}^{c} R_x = R$, where each $R_x$ is a connected sub-region;
② $R_x \cap R_y = \varnothing$ for any $x, y = 1, 2, 3, \ldots, c$ with $x \neq y$;
③ $P(R_x) = \text{True}$ for $x = 1, 2, 3, \ldots, c$;
④ $P(R_x \cup R_y) = \text{False}$ for $x \neq y$;
where $P(\cdot)$ is the similarity predicate that the pixels of a region must satisfy.
Region growing is the process of aggregating pixels or sub-regions into larger regions according to a predefined criterion. Coronary vessels forming continuous regions are extracted from the original CT image with the region-growing segmentation algorithm to obtain the coronary artery extraction map.
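The growth procedure described above can be sketched as a breadth-first traversal from a seed. The sketch rests on assumptions that the source does not fix: 4-connectivity, a single seed, and a growth criterion that accepts a pixel when its gray value is within a hypothetical `tolerance` of the seed's gray value. The small `ct` array is illustrative data, not patient data.

```python
from collections import deque

def region_grow(image, seed, tolerance):
    """Grow a 4-connected region from `seed` and return it as a binary mask.

    A neighbor joins the region when its gray value differs from the seed's
    gray value by at most `tolerance` (the assumed growth criterion).
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    mask = [[0] * cols for _ in range(rows)]
    mask[seed[0]][seed[1]] = 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not mask[nr][nc]:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    mask[nr][nc] = 1  # merged pixels become new growth points
                    queue.append((nr, nc))
    return mask

# Hypothetical patch: a bright, winding "vessel" on a dark background.
ct = [
    [200, 210,  20,  20],
    [ 30, 205,  25, 198],
    [ 20, 195, 190,  15],
    [ 10,  25, 200,  30],
]
vessel_mask = region_grow(ct, seed=(0, 0), tolerance=20)
```

The bright pixel at row 1, column 3 stays outside the mask even though its gray value is similar, because it is not connected to the seed region; this is the connectivity property required by condition ①.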
Step 3, extracting features from the segmented image: as shown in FIG. 3, three main classes of features of interest, namely texture features, gray-level features, and geometric features, are extracted according to the characteristics of medical images;
the step 3 is specifically as follows:
step 3.1, extracting gray features in six aspects of mean value, variance, energy, entropy, kurtosis and skewness of the coronary artery extraction map in the step 2 by adopting a gray histogram method;
step 3.2, constructing a gray level co-occurrence matrix, selecting a sliding window of 5 multiplied by 5, calculating gray level characteristic values of each pixel point of the coronary artery extraction map in the step 2, and extracting texture characteristics of the image;
step 3.3, extracting geometric features of the coronary artery image with the Hu invariant moment method, based on the coronary artery extraction map obtained in step 2. In statistics, moments reflect the distribution of a random variable; extended to the image domain, if the gray values of an image are regarded as a density distribution function, moments can be used to extract image features. The Hu invariant moment method characterizes the geometry of an image region: the second-order and third-order central moments of the coronary artery image are first calculated and then normalized to obtain a group of invariant moments, which describes the geometric shape features of the coronary artery extraction map.
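The moment computation of step 3.3 can be sketched as below. The sketch follows the standard definitions of central moments, normalized central moments, and the first four Hu invariants; the remaining three invariants are omitted for brevity. The function name and the L-shaped test mask are hypothetical, and the image is assumed to contain at least one nonzero pixel.

```python
def hu_moments(image):
    """Return the first four Hu invariants (phi1..phi4) of a 2-D gray image."""
    m00 = sum(map(sum, image))  # total mass; assumed to be nonzero
    xbar = sum(x * v for row in image for x, v in enumerate(row)) / m00
    ybar = sum(y * v for y, row in enumerate(image) for v in row) / m00

    def eta(p, q):
        # Central moment mu_pq, normalized for scale by m00^(1 + (p+q)/2).
        mu = sum((x - xbar) ** p * (y - ybar) ** q * v
                 for y, row in enumerate(image) for x, v in enumerate(row))
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    return [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,
        (n30 + n12) ** 2 + (n21 + n03) ** 2,
    ]

# Hypothetical binary mask of an L-shaped region and a translated copy.
shape = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
shifted = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
original_phi = hu_moments(shape)
shifted_phi = hu_moments(shifted)
```

The two lists agree up to floating-point error, which is the translation invariance that makes these moments usable as shape descriptors.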
Step 4, dividing the 500 samples into a training group and a test group at a ratio of 7:3 using a random index method; screening the three classes of radiomic features (texture, gray-level, and geometric) extracted in step 3 with the multi-class ReliefF feature weighting algorithm; performing ten-fold cross-validation on the candidate feature subsets; calculating the correlation between each feature and the prediction result; and eliminating the features with low correlation;
the step 4 is specifically as follows:
step 4.1, using the ReliefF feature weighting algorithm, selecting the d most relevant features from all texture, gray-level, and geometric radiomic features extracted in step 3 to form d feature subsets, where the subsets contain 1 to d features in sequence;
step 4.2, performing ten-fold cross-validation: the sample set is divided into 10 subsets, one subset is selected as the test set each time and the remaining 9 subsets are used as the training set, this is repeated 10 times, and the average recognition accuracy over the 10 runs is taken as the result;
step 4.3, calculating the prediction error rate of each feature subset by the above process, and selecting the feature subset with the minimum prediction error rate as the input features of the multi-classifier fusion prediction model in step 5.
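The 7:3 split and the ten-fold procedure of step 4.2 can be sketched with index lists alone. The function names, the fixed `seed`, and the `evaluate` callback (which stands in for training a classifier on one fold and returning its accuracy) are hypothetical; applying the folds to the training group rather than to all 500 samples is also an assumption.

```python
import random

def train_test_split_indices(n_samples, train_ratio=0.7, seed=0):
    """Shuffle sample indices and cut them into training and test groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(n_samples * train_ratio)
    return indices[:cut], indices[cut:]

def k_fold_indices(indices, k=10):
    """Yield (train, validation) index lists; each sample is validated once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

def cross_validated_accuracy(indices, evaluate, k=10):
    """Average the accuracy returned by `evaluate(train, validation)` over k folds."""
    scores = [evaluate(train, val) for train, val in k_fold_indices(indices, k)]
    return sum(scores) / len(scores)

train_idx, test_idx = train_test_split_indices(500)
```

Step 4.3 then amounts to calling `cross_validated_accuracy` once per candidate feature subset and keeping the subset with the lowest error rate, i.e., one minus the averaged accuracy.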
The ReliefF feature weighting algorithm in step 4.1 is specifically as follows:
As shown in the flowchart of FIG. 4, features are the basis of machine learning, but redundancy and correlation among features can instead reduce classification accuracy. In particular, because the application is a small-sample learning model, too many features not only increase model complexity but also reduce the model's generalization ability to some extent. The application therefore uses the ReliefF feature weighting algorithm to optimize and select the features extracted in step 3, assigning a different weight to each feature so that features whose weights are smaller than the set threshold can be removed.
A sample S is randomly drawn from the training sample set each time, and k nearest-neighbor samples $H_l$ and $M_l$ are found among the samples of the same class as S and among the samples of different classes, respectively. The weight of each of the texture, gray-level, and geometric features extracted in step 3 is then updated, and features whose weights are smaller than the set threshold are rejected. The feature weight is calculated as:
$$W(A) = W(A) - \sum_{l=1}^{k} \frac{\operatorname{diff}(A, S, H_l)}{mk} + \sum_{C \neq \operatorname{class}(S)} \frac{p(C)}{1 - p(\operatorname{class}(S))} \sum_{l=1}^{k} \frac{\operatorname{diff}(A, S, M_l(C))}{mk}$$
where m is the number of sampling iterations, k is the number of nearest-neighbor samples, $l = 1, \ldots, k$, $\operatorname{diff}(A, S, H_l)$ is the difference between sample S and sample $H_l$ on feature A, $M_l(C)$ is the l-th nearest neighbor of S in class C, C is a sample class, $p(C)$ is the ratio of the number of class-C samples to the total number of samples, and $p(\operatorname{class}(S))$ is the ratio of the number of samples in the class of S to the total number of samples.
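The weight update can be sketched for numeric features as follows. Several details are assumptions made for illustration and are not stated in the source: `diff` is taken as the absolute difference normalized by the feature's range, neighbors are ranked by the sum of these differences over all features, and the toy data set (one informative feature, one noisy feature) is invented.

```python
import random

def relieff(X, y, m, k, seed=0):
    """Return one ReliefF weight per feature for numeric data X with labels y."""
    n, d = len(X), len(X[0])
    # Range of each feature, used to scale differences into [0, 1].
    spans = [max(row[a] for row in X) - min(row[a] for row in X) or 1.0
             for a in range(d)]
    prior = {c: y.count(c) / n for c in set(y)}

    def diff(a, i, j):
        return abs(X[i][a] - X[j][a]) / spans[a]

    def nearest(i, cls):
        # k nearest samples of class `cls` to sample i (excluding i itself).
        candidates = [j for j in range(n) if j != i and y[j] == cls]
        candidates.sort(key=lambda j: sum(diff(a, i, j) for a in range(d)))
        return candidates[:k]

    weights = [0.0] * d
    rng = random.Random(seed)
    for _ in range(m):
        s = rng.randrange(n)
        hits = nearest(s, y[s])
        misses = {c: nearest(s, c) for c in prior if c != y[s]}
        for a in range(d):
            # Near hits that differ on feature a lower its weight ...
            weights[a] -= sum(diff(a, s, h) for h in hits) / (m * k)
            # ... near misses that differ on feature a raise it.
            for c, near in misses.items():
                scale = prior[c] / (1 - prior[y[s]])
                weights[a] += scale * sum(diff(a, s, j) for j in near) / (m * k)
    return weights

# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]
feature_weights = relieff(X, y, m=20, k=2)
```

On this toy data the informative feature receives the larger weight, so a threshold placed between the two weights would discard the noisy feature.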
Step 5, forming a feature set from the texture, gray-level, and geometric features selected in step 4, and establishing a multi-classifier fusion prediction model in which three classifiers that perform well in medical image classification, namely a Support Vector Machine (SVM), a Random Forest (RF), and an Extreme Learning Machine (ELM), are fused to predict the degree of coronary artery lesion; the weights of the three classifiers in the fusion classifier are determined by a weighting method so that the prediction effect is optimal, a sample is judged normal when the stenosis degree is lower than 50%, and it is judged a lesion sample when the stenosis degree is higher than 50%.
The step 5 is specifically as follows:
as shown in the topological structure of the classifier in fig. 7, a fusion classifier consisting of a Support Vector Machine (SVM), an Extreme Learning Machine (ELM) and a Random Forest (RF) is selected to classify a sample set, and the principle of each classifier is as follows:
As shown in fig. 6 (a), the basic principle of the Support Vector Machine (SVM) is to find an optimal hyperplane that separates samples of different classes; solving for this hyperplane corresponds to a convex quadratic programming problem, that is, finding an objective function and determining the constraint conditions. The SVM avoids the curse of dimensionality, is robust, and generalizes well. Its classification performance is affected by a number of factors, two of which are: 1) the error penalty parameter C; 2) the form of the kernel function and its parameter g. The error penalty parameter adjusts the balance between the confidence range and the empirical risk in the feature subspace so that the generalization capability of the learning machine is best. The radial basis function is nonlinear, has few parameters, and can map the original features to an infinite-dimensional space, so the application selects the radial basis function as the kernel function of the support vector machine.
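To make the role of the kernel parameter g concrete, the sketch below evaluates the radial basis function kernel in its usual form K(x, z) = exp(-g * ||x - z||^2) and builds the kernel (Gram) matrix an SVM solver would consume. The function names are illustrative, and the quadratic programming step and the penalty parameter C are deliberately left out.

```python
import math

def rbf_kernel(x, z, g):
    """Radial basis function kernel: exp(-g * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-g * sq_dist)

def gram_matrix(samples, g):
    """Pairwise kernel values for a list of feature vectors."""
    return [[rbf_kernel(x, z, g) for z in samples] for x in samples]
```

A larger g makes the kernel decay faster with distance, so each training sample influences a smaller neighborhood; in practice C and g are usually tuned together, for example by grid search with cross validation.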
As shown in fig. 6 (b), the basic structure of the extreme learning machine is a single-hidden-layer neural network. Compared with the traditional BP neural network, it has better generalization capability and a faster learning speed. In short, the network structure of the Extreme Learning Machine (ELM) model is the same as that of the single-hidden-layer feedforward neural network (SLFN), but the training stage no longer uses the gradient-based algorithm (backpropagation) common in traditional neural networks; instead, the input-layer weights and biases are set randomly, and the output-layer weights are calculated using generalized inverse matrix theory. Training of the ELM is complete once the weights and biases of all network nodes have been obtained, and predictions for test data can then be calculated using the output-layer weights just obtained. In the algorithm implementation, the inputs are the data set, the number of hidden-layer neurons and the activation function, and the output is the weight vector β; the hidden-layer outputs and the output-layer weights are calculated from the randomly generated input weights and hidden-layer biases.
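The ELM training procedure described above (random input weights and biases, then a closed-form solve for β) can be sketched as follows. Several points are assumptions made for the example: the sigmoid activation, the uniform range of the random weights, and the use of ridge-regularized normal equations, β = (HᵀH + λI)⁻¹Hᵀt, as a numerically convenient stand-in for the generalized inverse mentioned in the text.

```python
import math
import random

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

def hidden(X, W, b):
    """Sigmoid hidden-layer output matrix H for inputs X."""
    return [[1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(Wj, row)) + bj)))
             for Wj, bj in zip(W, b)] for row in X]

def elm_train(X, t, n_hidden, ridge=1e-6, seed=0):
    """Return (W, b, beta): random input weights/biases and solved output weights."""
    rng = random.Random(seed)
    d = len(X[0])
    W = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = hidden(X, W, b)
    # Normal equations with a small ridge term (assumed) for numerical stability.
    HtH = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    Htt = [[sum(h[i] * ti for h, ti in zip(H, t))] for i in range(n_hidden)]
    beta = [row[0] for row in solve(HtH, Htt)]
    return W, b, beta

def elm_predict(model, X):
    """Classify each row as +1 or -1 from the sign of the network output."""
    W, b, beta = model
    return [1 if sum(hj * bj for hj, bj in zip(h, beta)) >= 0 else -1
            for h in hidden(X, W, b)]
```

Only β is fitted; W and b stay at their random initial values, which is what removes the iterative backpropagation loop.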
As shown in fig. 6 (c), the inputs of the random forest are the training data set and the number of sample subsets, and the output is a final strong classifier. The random forest is widely applicable in machine learning and does not need a complex parameter tuning process. Normally only one tree can be constructed for one data set; using the idea of the bootstrap aggregating (bagging) algorithm, a plurality of mutually related data subsets can be drawn from the same data set to construct a plurality of subtrees, and the optimal classification is determined by voting on the classification results of the plurality of decision trees.
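The bagging-and-voting idea can be illustrated with a deliberately simplified sketch. It uses one-level decision trees (stumps) in place of full decision trees and omits the per-split random feature selection of a real random forest, so it shows only bootstrap sampling and majority voting; all function names are illustrative.

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Best single-feature threshold rule: (feature, threshold, label_low, label_high)."""
    best = None
    for a in range(len(X[0])):
        vals = sorted({row[a] for row in X})
        # Candidate thresholds: midpoints between distinct values, plus the maximum.
        thresholds = [(u + v) / 2 for u, v in zip(vals, vals[1:])] + [vals[-1]]
        for thr in thresholds:
            for lo, hi in ((-1, 1), (1, -1)):
                errors = sum((lo if row[a] <= thr else hi) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, a, thr, lo, hi)
    return best[1:]

def predict_stump(stump, row):
    a, thr, lo, hi = stump
    return lo if row[a] <= thr else hi

def fit_bagged(X, y, n_trees, seed=0):
    """Train n_trees stumps, each on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(X)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return trees

def predict_bagged(trees, row):
    """Majority vote over the individual tree predictions."""
    votes = Counter(predict_stump(t, row) for t in trees)
    return votes.most_common(1)[0][0]
```

Using an odd number of trees avoids ties between the two classes in the vote.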
Step 5.1, as shown in the fusion schematic diagram of fig. 7, the feature sample set screened in step 4 is first passed through the three single classifiers, namely the Support Vector Machine (SVM), the Extreme Learning Machine (ELM) and the Random Forest (RF), to obtain each classifier's recognition result for the degree of coronary artery stenosis, that is, the 3 class labels obtained when each classifier predicts the class of the sample to be recognized; the weight of each single classifier in the final multi-classifier fusion prediction model is calculated from its ability to classify correctly;
Step 5.2, a weighted majority voting method is adopted to fuse the classification results of the three single classifiers (SVM, ELM and RF): when the output of a classifier is +1, the classification result is the normal class, namely the stenosis degree is lower than 50%, and when the output of a classifier is -1, the classification result is the lesion class, namely the stenosis degree is higher than 50%; the classification result of each classifier is multiplied by the corresponding weight obtained in step 5.1, and the three products are added to obtain the classification result of the multi-classifier fusion prediction model, which is judged as the normal class when the sum is positive and as the lesion class when the sum is negative.
In step 5.1, the weight occupied by each classifier is determined according to the classification accuracy, and the accuracy calculation formula of the classification model is as follows:
wherein a = 1, 2, 3 indexes the three classifiers; n ranges over the two classes (stenotic, non-stenotic); e'_n is the accumulated count of E_n, the number of times samples are classified into the normal class or the abnormal class; y_a ∈ {+1, -1} is the label of the training sample; and the remaining symbols respectively represent the classification results of the models;
calculating the weight w of each model a The method comprises the following steps:
wherein ,
Step 5.2, the result obtained by each model is multiplied by its corresponding weight and the products are added to obtain the final output result:
When the output result is positive, the classification result is the normal class, namely the stenosis degree is lower than 50%, and when the output result is negative, the classification result is the lesion class, namely the stenosis degree is higher than 50%.
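Steps 5.1 and 5.2 can be sketched as below. The source text does not reproduce the exact accuracy and weight formulas, so the weights here are simply taken proportional to each classifier's accuracy and normalized to sum to 1; that rule, the function names, and the handling of an exactly zero sum (treated as the lesion class) are assumptions made for the example.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def fusion_weights(accuracies):
    """Assumed rule: weights proportional to accuracy, normalised to sum to 1."""
    total = sum(accuracies)
    return [a / total for a in accuracies]

def fuse(votes, weights):
    """Weighted vote over classifier outputs in {+1, -1}.

    Returns +1 (normal, stenosis below 50%) for a positive sum and
    -1 (lesion, stenosis above 50%) otherwise.
    """
    score = sum(w * v for w, v in zip(weights, votes))
    return 1 if score > 0 else -1
```

For example, with accuracies 0.9, 0.6 and 0.5 the weights are 0.45, 0.30 and 0.25, so two weaker classifiers voting -1 outweigh the strongest one voting +1; with accuracies 0.9, 0.3 and 0.3 the strongest classifier alone decides the result.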
The technical scheme adopted by the application comprises two main components: an image processing stage and a classifier modeling stage. First, a database is collected, the constructed image sample library is preprocessed, and a coronary artery segmentation map is extracted for the subsequent classification learning on the coronary arteries. A fusion classifier model is then constructed, its topological structure and result output mode are determined, the stenosis degree is identified and classified according to the defined training samples, and the classification results are fused. Finally, classification prediction is performed on the test set, and SPSS software can be used to analyze the accuracy, sensitivity, specificity, negative predictive value and positive predictive value of the predicted results. In this process, from the standpoint of noninvasively determining the degree of coronary stenosis, two label categories are defined: stenosis degree greater than 50% and stenosis degree less than 50%. Clinically, coronary heart disease can be diagnosed when the stenosis degree is greater than 50%, so cases classified as having a stenosis degree greater than 50% require attention and a treatment plan. The fusion classifier is one of the important components for degree identification and degree category labeling: its parameters are trained using the labeled training set, and it is then applied to the test set for identification and labeling.
In order to obtain higher classification accuracy, the method uses a feature selection algorithm to screen the features and so avoid the overfitting caused by too many features. For the single classifiers, three classifiers with good image classification performance are selected and combined. Because a single classifier involves tuning a number of hyperparameters, combining multiple classifiers lets them complement one another, which addresses the parameter tuning problem and can achieve a "one plus one is greater than two" effect on the accuracy of the classification results. The weighted fusion algorithm gives a larger weight to the classifier with better classification performance, so that the classification results are more credible.
The application uses SPSS software to analyze the accuracy, sensitivity, specificity, negative predictive value and positive predictive value of the prediction results, and uses the test set to test the classification effect of the model. In the designed technical scheme, the proportion of each classifier of step 5 in the fusion classifier is allocated according to the prediction capability of that classifier. The classification standard takes CAD-RADS, the latest international diagnostic standard for coronary artery stenosis, as its criterion: when the stenosis degree is less than 50%, observation and prevention are the main measures, and when the stenosis degree is greater than 50%, treatment such as medication and surgery is considered.
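The five evaluation measures named above all follow from the 2 × 2 confusion matrix. The sketch below computes them directly; it is an illustration rather than the SPSS procedure, and it assumes the lesion class (label -1) is treated as the positive finding.

```python
def diagnostic_metrics(y_true, y_pred, positive=-1):
    """Accuracy, sensitivity, specificity, PPV and NPV for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),   # lesions correctly detected
        "specificity": ratio(tn, tn + fp),   # normals correctly cleared
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
    }
```

Sensitivity and specificity describe the classifier itself, while the two predictive values also depend on how common lesions are in the test set.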
By studying the comparison between the CTA and DSA diagnosis results in existing patient data, the application can classify the stenosis degree directly and ultimately realize automatic prediction, from the coronary CT image, of the stenosis degree corresponding to DSA, that is, accurate determination of the gold standard for coronary lesions from the CTA image. A treatment plan can thus be given without invasive examination, which not only assists doctors in giving diagnosis results and improves working efficiency, but also greatly relieves the pain of patients, and therefore has important clinical significance.

Claims (8)

1. The method for identifying the degree of the coronary artery stenosis based on the multi-classifier fusion is characterized by comprising the following steps of:
step 1, constructing an image sample library;
step 2, denoising and segmentation binarization processing is carried out on the original CT sequence images extracted from cardiac CTA, so as to obtain a coronary artery extraction image;
step 3, extracting features of the segmented image, namely the three main kinds of features of interest: texture features, gray features and geometric features;
step 4, dividing 500 samples into a training group and a test group at a ratio of 7:3 by a random index method, screening the three kinds of radiomics features (texture, gray scale and geometry) extracted in the step 3 by a multi-class ReliefF feature weighting algorithm, carrying out ten-fold cross validation on random features, calculating the correlation between each feature and the prediction result, and eliminating the features with small correlation;
step 5, forming a feature set from the texture, gray-scale and geometric features selected in the step 4, establishing a multi-classifier fusion prediction model, and selecting three classifiers with good medical image classification performance, namely a Support Vector Machine (SVM), a Random Forest (RF) and an Extreme Learning Machine (ELM), to jointly predict the degree of coronary artery lesion; and determining the weights of the 3 classifiers in the fusion classifier by a weighting method, judging a sample as a normal sample when the stenosis degree is lower than 50%, and judging it as a lesion sample when the stenosis degree is higher than 50%.
2. The method for identifying the degree of coronary artery stenosis based on multi-classifier fusion according to claim 1, wherein the step 1 is specifically as follows:
patient information and images of patients who underwent both cardiac CTA and coronary angiography (DSA) examination in the most recent three years are collected from a hospital data system, so that the CT images can be matched with the coronary stenosis gold-standard data; after the basic information of the patients in the images is hidden, the coronary CT images of 500 patients that meet the image quality requirement are selected as input samples, and label categories are annotated.
3. The method for identifying the degree of coronary artery stenosis based on multi-classifier fusion according to claim 2, wherein the step 2 is specifically as follows:
step 2.1, arranging all pixel points in the neighborhood of a Gaussian noise point in the original CT image in order of gray value, and taking the gray value of the middle pixel as the gray value of the noise point to reduce the noise of the image, wherein the principle expression is as follows:
$$g_{ij} = \underset{A}{\mathrm{med}}\,\{f_{ij}\}$$
wherein i, j represent the coordinate values of the pixel point; g_ij represents the gray value of the noise point; A represents the neighborhood region taken around the noise point; {f_ij} is the data sequence; med denotes the median operation;
the image quality of the CT image can be improved through denoising, and meanwhile, the denoised image can more clearly reflect the coronary artery structure information in the CT image, so that the segmentation operation in the step 2.2 is facilitated;
step 2.2, letting R represent the whole image, segmentation is regarded as the process of dividing the whole denoised CT image R into c sub-regions, which should simultaneously satisfy the following conditions ① to ④:
① $\bigcup_{x=1}^{c} R_x = R$, where each $R_x$ is a connected sub-region;
② $R_x \cap R_y = \varnothing$ for any $x, y = 1, 2, 3, \ldots, c$ with $x \neq y$;
③ $P(R_x) = \mathrm{True}$ for $x = 1, 2, 3, \ldots, c$;
④ $P(R_x \cup R_y) = \mathrm{False}$ for $x \neq y$;
and extracting coronary vessels with continuous regions from the original CT image by a region growing segmentation algorithm to obtain the coronary artery extraction map.
4. The method for identifying the degree of coronary artery stenosis based on multi-classifier fusion according to claim 3, wherein the step 3 is specifically as follows:
step 3.1, extracting gray features of the coronary artery extraction map of the step 2 in six aspects, namely mean, variance, energy, entropy, kurtosis and skewness, by a gray histogram method;
step 3.2, constructing a gray level co-occurrence matrix with a 5 × 5 sliding window, calculating the gray level characteristic values of each pixel point of the coronary artery extraction map of the step 2, and extracting the texture features of the image;
step 3.3, extracting geometric features of the coronary artery image by the Hu invariant moment method based on the coronary artery extraction map obtained in the step 2: first calculating the second-order and third-order central moments of the coronary artery image, then carrying out normalization to obtain a group of invariant moments, which describes the geometric features of the shape of the coronary artery extraction image.
5. The method for identifying the degree of coronary artery stenosis based on multi-classifier fusion according to claim 4, wherein the step 4 is specifically as follows:
step 4.1, selecting, by the ReliefF feature weighting algorithm, the first d features with the largest correlation from all the texture, gray-scale and geometric radiomics features extracted in the step 3 to form d feature subsets, wherein the subsets contain 1 to d features in sequence;
step 4.2, performing ten-fold cross validation: dividing the sample set into 10 subsets, selecting one subset as the test set each time and taking the remaining 9 subsets as the training set, repeating 10 times, and finally taking the average recognition accuracy of the 10 runs as the result;
step 4.3, calculating the prediction error rate of each feature subset according to the above process, and selecting the feature subset with the minimum prediction error rate as the input features of the multi-classifier fusion prediction model in the step 5.
6. The method for identifying the degree of coronary stenosis based on multi-classifier fusion according to claim 5, wherein the ReliefF feature weighting algorithm in step 4.1 is specifically as follows:
randomly drawing a sample S from the training sample set each time, finding k nearest-neighbor samples among the samples of the same class as S (denoted H_l) and among the samples of each different class (denoted M_l), then updating the weight of each of the texture, gray-scale and geometric features extracted in the step 3, and rejecting the features whose weight is smaller than the set threshold, wherein the feature weight calculation formula is as follows:
$$W(A) = W(A) - \sum_{l=1}^{k} \frac{\mathrm{diff}(A, S, H_l)}{mk} + \sum_{C \neq \mathrm{class}(S)} \left[ \frac{p(C)}{1 - p(\mathrm{class}(S))} \sum_{l=1}^{k} \frac{\mathrm{diff}(A, S, M_l(C))}{mk} \right]$$
in the above formula, m is the number of sampling iterations, k is the number of nearest-neighbor samples, l = 1, 2, ..., k, diff(A, S, H_l) represents the difference between sample S and sample H_l on feature A, M_l(C) is the l-th nearest neighbor of S in class C, C is a sample class, p(C) is the ratio of the number of class-C samples to the total number of samples, and p(class(S)) is the ratio of the number of samples in the class of S to the total number of samples.
7. The method for identifying the degree of coronary artery stenosis based on multi-classifier fusion according to claim 5, wherein the step 5 is specifically as follows:
step 5.1, the feature sample set screened in the step 4 is first passed through the three single classifiers, namely the support vector machine SVM, the extreme learning machine ELM and the random forest RF, to obtain each classifier's recognition result for the degree of coronary artery stenosis, that is, the 3 class labels obtained when each classifier predicts the class of the sample to be recognized, and the weight of each single classifier in the final multi-classifier fusion prediction model is calculated from its ability to classify correctly;
step 5.2, a weighted majority voting method is adopted to fuse the classification results of the three single classifiers, namely the support vector machine SVM, the extreme learning machine ELM and the random forest RF: when the output of a classifier is +1, the classification result is the normal class, namely the stenosis degree is lower than 50%, and when the output of a classifier is -1, the classification result is the lesion class, namely the stenosis degree is higher than 50%; the classification result of each classifier is multiplied by the corresponding weight obtained in the step 5.1, and the three products are added to obtain the classification result of the multi-classifier fusion prediction model, which is judged as the normal class when the sum is positive and as the lesion class when the sum is negative.
8. The method for identifying the degree of coronary artery stenosis based on multi-classifier fusion according to claim 7, wherein the weight occupied by each classifier in the step 5.1 is determined according to the classification accuracy, and the accuracy calculation formula of the classification model is as follows:
wherein a = 1, 2, 3 indexes the three classifiers; n ranges over the two classes (stenotic, non-stenotic); e'_n is the accumulated count of E_n, the number of times samples are classified into the normal class or the abnormal class; y_a ∈ {+1, -1} is the label of the training sample; and the remaining symbols respectively represent the classification results of the models;
calculating the weight w of each model a The method comprises the following steps:
wherein ,
step 5.2, the result obtained by each model is multiplied by its corresponding weight and the products are added to obtain the final output result:
when the output result is positive, the classification result is the normal class, namely the stenosis degree is lower than 50%, and when the output result is negative, the classification result is the lesion class, namely the stenosis degree is higher than 50%.
CN202110658791.7A 2021-06-15 2021-06-15 Coronary artery stenosis degree identification method based on multi-classifier fusion Active CN113408603B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110658791.7A CN113408603B (en) 2021-06-15 2021-06-15 Coronary artery stenosis degree identification method based on multi-classifier fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110658791.7A CN113408603B (en) 2021-06-15 2021-06-15 Coronary artery stenosis degree identification method based on multi-classifier fusion

Publications (2)

Publication Number Publication Date
CN113408603A CN113408603A (en) 2021-09-17
CN113408603B true CN113408603B (en) 2023-10-31

Family

ID=77683777

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110658791.7A Active CN113408603B (en) 2021-06-15 2021-06-15 Coronary artery stenosis degree identification method based on multi-classifier fusion

Country Status (1)

Country Link
CN (1) CN113408603B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115602321A (en) * 2021-12-24 2023-01-13 郑州大学第三附属医院(河南省妇幼保健院)(Cn) Method and system for predicting risk of secondary displacement of PICC catheter of premature infant
CN114399635A (en) * 2022-03-25 2022-04-26 珞石(北京)科技有限公司 Image two-classification ensemble learning method based on feature definition and deep learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015054666A1 (en) * 2013-10-10 2015-04-16 Board Of Regents, The University Of Texas System Systems and methods for quantitative analysis of histopathology images using multi-classifier ensemble schemes

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103489005A (en) * 2013-09-30 2014-01-01 河海大学 High-resolution remote sensing image classifying method based on fusion of multiple classifiers
CN106886792A (en) * 2017-01-22 2017-06-23 北京工业大学 A kind of brain electricity emotion identification method that Multiple Classifiers Combination Model Based is built based on layering
CN108108762A (en) * 2017-12-22 2018-06-01 北京工业大学 A kind of random forest classification method based on core extreme learning machine and parallelization for the classification of coronary heart disease data
CN108805858A (en) * 2018-04-10 2018-11-13 燕山大学 Hepatopathy CT image computers assistant diagnosis system based on data mining and method
CN110490040A (en) * 2019-05-30 2019-11-22 浙江理工大学 A method of local vascular stenosis in identification DSA coronary artery images
WO2021081771A1 (en) * 2019-10-29 2021-05-06 未艾医疗技术(深圳)有限公司 Vrds ai medical image-based analysis method for heart coronary artery, and related devices
CN111667456A (en) * 2020-04-28 2020-09-15 北京理工大学 Method and device for detecting vascular stenosis in coronary artery X-ray sequence radiography
CN112184647A (en) * 2020-09-22 2021-01-05 清华大学深圳国际研究生院 Vascular lesion grading identification method for fundus image based on migration convolution network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
An Alzheimers disease related genes identification method based on multiple classifier integration;YuMiao等;《Computer Methods and Programs in Biomedicine》;20171031;全文 *
OCT影像下冠状动脉斑块智能分割与识别;张勃;《中国硕士学位论文全文数据库》;20190115;全文 *
基于标准数据集的分类器融合学习模型;吴疆等;《微型电脑应用》;20200420(第04期);全文 *

Also Published As

Publication number Publication date
CN113408603A (en) 2021-09-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 2023-06-13
Applicant after: Xi'an Huaqi Zhongxin Technology Development Co.,Ltd. (address: 710000 No. B49, Xinda Zhongchuang space, 26th Street, block C, No. 2 Trading Plaza, South China City, international port district, Xi'an, Shaanxi Province)
Applicant before: Xi'an University of Technology (address: 710048 Shaanxi province Xi'an Beilin District Jinhua Road No. 5)
GR01 Patent grant
TR01 Transfer of patent right
Effective date of registration: 2023-11-27
Patentee after: Beijing Futong Kangying Technology Co.,Ltd. (address: Room 101, 1st Floor, Building 2, No. 18 Keyuan Road, Daxing Economic Development Zone, Beijing, 102600)
Patentee before: Xi'an Huaqi Zhongxin Technology Development Co.,Ltd. (address: 710000 No. B49, Xinda Zhongchuang space, 26th Street, block C, No. 2 Trading Plaza, South China City, international port district, Xi'an, Shaanxi Province)