CN111243736B - Survival risk assessment method and system - Google Patents

Survival risk assessment method and system

Info

Publication number
CN111243736B
Authority
CN
China
Prior art keywords
risk
survival
variable set
screening
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911019274.4A
Other languages
Chinese (zh)
Other versions
CN111243736A (en
Inventor
李志臻
袁磊
张晨
孙佳星
王则远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Medicinovo Technology Co ltd
Third Affiliated Hospital Of Chinese People's Liberation Army Naval Medical University
Original Assignee
Beijing Medicinovo Technology Co ltd
Third Affiliated Hospital Of Chinese People's Liberation Army Naval Medical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Medicinovo Technology Co ltd, Third Affiliated Hospital Of Chinese People's Liberation Army Naval Medical University
Priority to CN201911019274.4A
Publication of CN111243736A
Application granted
Publication of CN111243736B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Biomedical Technology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An embodiment of the invention provides a survival risk assessment method and a survival risk assessment system. The method comprises: acquiring a screening risk variable set and a screening risk variable model coefficient, wherein the screening risk variable set is obtained from an initial risk variable set based on a COX proportional risk regression model and a gradual backward algorithm, and the screening risk variable model coefficient is obtained based on a survival risk prediction model; constructing a survival risk assessment scale based on the screening risk variable set and the screening risk variable model coefficient; and inputting a plurality of information data of the individual to be evaluated into the survival risk assessment scale to obtain a quantitative survival risk assessment value and an individual survival risk grade for that individual. By acquiring follow-up data from actual practice, the embodiment intelligently screens out the important risk factors through the risk regression model and stepwise regression, constructs the survival risk assessment scale automatically, and outputs the survival risk value and the corresponding risk level, so that the coverage is more comprehensive and the practicability is higher.

Description

Survival risk assessment method and system
Technical Field
The present invention relates to the field of risk assessment technologies, and in particular, to a survival risk assessment method and system.
Background
In the field of risk assessment, a risk assessment scale is a measuring tool used for quantifying risk, for example, in the assessment process of health status, various aspects of risk factors of an individual or a group need to be observed, and the observation results are assessed and interpreted in a quantitative manner, and the scale composite score represents the risk level of the individual or the group. Likewise, a survival risk assessment scale is used to quantify survival risk after receiving a certain treatment.
The traditional survival risk assessment scale is constructed roughly as follows: reviewing the literature, consulting experts, determining risk factors, revising the scale, conducting pre-test surveys (e.g., reliability tests, validity tests, consistency tests, sensitivity analysis, specificity analysis), expert assessment, iterative improvement, and revising the scale again. Evaluating the survival risk of individuals or groups with such a scale quantifies the degree of survival risk and facilitates timely risk monitoring and prevention; for patients with higher survival risk scores, a relatively safe and conservative treatment can be adopted in light of the patient's actual condition, so as to avoid unnecessary losses as far as possible.
Steps such as reviewing the literature and consulting experts in the traditional construction process consume a great deal of manpower, material resources and time; the procedure is cumbersome and is subject to subjective judgment errors. Moreover, the sample size of such studies is usually small, which limits the breadth and depth of the investigation.
Disclosure of Invention
The embodiment of the invention provides a survival risk assessment method and a survival risk assessment system, which address the defects in the prior art that constructing a risk assessment scale consumes a large amount of manpower and material resources, is cumbersome to carry out, and is subject to subjective judgment errors.
In a first aspect, an embodiment of the present invention provides a survival risk assessment method, including:
acquiring a screening risk variable set and a screening risk variable model coefficient; the screening risk variable set is obtained by an initial risk variable set based on a COX proportional risk regression model and a gradual backward algorithm; the screening risk variable model coefficient is obtained based on a survival risk prediction model;
constructing a survival risk assessment scale based on the screening risk variable set and the screening risk variable model coefficient;
and inputting a plurality of information data of the individual to be evaluated into the survival risk evaluation scale to obtain a survival risk evaluation quantitative value and an individual survival risk grade of the individual to be evaluated.
Preferably, the constructing a survival risk assessment scale based on the screening risk variable set and the screening risk variable model coefficient further includes:
inputting a plurality of evaluation samples into the survival risk prediction model to obtain a plurality of survival risk prediction values;
dividing the plurality of survival risk prediction values into a plurality of survival risk grades;
comparing a plurality of survival curves corresponding to the plurality of survival risk grades through a preset verification algorithm to obtain a risk difference value;
and if the risk difference value meets a preset difference threshold condition, the classification of the plurality of survival risk classes is considered to be correct.
Preferably, the acquiring the screening risk variable set and the screening risk variable model coefficient specifically includes:
acquiring an original risk variable set, and initializing the original risk variable set to obtain a preprocessed risk variable set;
constructing a survival risk assessment database based on the preprocessed risk variable set;
deleting, from the survival risk assessment database, a plurality of variables whose missing rate is larger than a preset optimal missing-rate threshold, and filling in the remaining missing values using a plurality of variables with a preset association degree, to obtain an optimized risk variable set;
screening the optimized risk variable set with a plurality of machine learning algorithms to obtain a plurality of variable sets;
taking the intersection of the plurality of variable sets to obtain the initial risk variable set;
acquiring the COX proportional risk regression model, training the COX proportional risk regression model on the initial risk variable set, and screening in combination with the gradual backward algorithm to obtain the screening risk variable set;
and further constructing the survival risk prediction model based on the screening risk variable set, and inputting the initial risk variable set into the survival risk prediction model to obtain the screening risk variable model coefficient.
Preferably, the obtaining an original risk variable set, initializing the original risk variable set to obtain a preprocessed risk variable set, specifically includes:
acquiring a plurality of objective risk information of the individual to be evaluated, constructing the original risk variable set, and setting target variables for the original risk variable set;
and cleaning data of the original risk variable set with the set target variable, and formatting to obtain the preprocessing risk variable set.
Preferably, deleting the plurality of variables whose missing rate is larger than the preset optimal missing-rate threshold in the survival risk assessment database, and filling in the remaining missing values using the plurality of variables with the preset association degree to obtain the optimized risk variable set, specifically includes:
setting a preset missing-rate range interval and a preset adjustment step length;
starting from the starting point of the preset missing-rate range interval, increasing by the preset adjustment step length until the end point of the interval is reached, so as to obtain a plurality of preset adjustment thresholds;
deleting, for each of the preset adjustment thresholds, the variables in the preprocessed risk variable set whose missing rate is larger than that threshold, to obtain a plurality of verification test sets;
verifying the verification test sets to obtain the preset optimal missing-rate threshold;
deleting variables in the preprocessed risk variable set according to the preset optimal missing-rate threshold;
and acquiring the plurality of variables with the preset association degree by a K nearest neighbor algorithm, and filling in the missing values of the remaining variables, to obtain the optimized risk variable set.
Preferably, the plurality of machine learning algorithms includes an XGBoost algorithm, a random forest algorithm, and a GBDT algorithm.
In a second aspect, an embodiment of the present invention provides a survival risk assessment system, including:
the acquisition module is used for acquiring the screening risk variable set and the screening risk variable model coefficient; the screening risk variable set is obtained by an initial risk variable set based on a COX proportional risk regression model and a gradual backward algorithm; the screening risk variable model coefficient is obtained based on a survival risk prediction model;
the processing module is used for constructing a survival risk assessment scale based on the screening risk variable set and the screening risk variable model coefficient;
the evaluation module is used for inputting a plurality of information data of the individual to be evaluated into the survival risk evaluation scale to obtain a survival risk evaluation quantitative value and an individual survival risk grade of the individual to be evaluated.
Preferably, the system further comprises a verification module, wherein the verification module is specifically used for:
inputting a plurality of evaluation samples into the survival risk prediction model to obtain a plurality of survival risk prediction values;
dividing the plurality of survival risk prediction values into a plurality of survival risk grades;
comparing a plurality of survival curves corresponding to the plurality of survival risk grades through a preset verification algorithm to obtain a risk difference value;
and if the risk difference value meets a preset difference threshold condition, the classification of the plurality of survival risk classes is considered to be correct.
In a third aspect, an embodiment of the present invention provides an electronic device, including:
a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any of the survival risk assessment methods when the program is executed.
In a fourth aspect, embodiments of the present invention provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any of the survival risk assessment methods.
According to the survival risk assessment method and system provided by the embodiment of the invention, follow-up data from actual practice are acquired, the important risk factors are intelligently screened out through the risk regression model and stepwise regression, the survival risk assessment scale is constructed automatically, and the survival risk value and the corresponding risk level are output, so that the coverage is more comprehensive and the practicability is stronger.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a survival risk assessment method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a survival risk assessment system according to an embodiment of the present invention;
FIG. 3 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
FIG. 1 is a flowchart of a survival risk assessment method according to an embodiment of the present invention. As shown in FIG. 1, the method includes:
S1, acquiring a screening risk variable set and a screening risk variable model coefficient; the screening risk variable set is obtained by an initial risk variable set based on a COX proportional risk regression model and a gradual backward algorithm; the screening risk variable model coefficient is obtained based on a survival risk prediction model;
S2, constructing a survival risk assessment scale based on the screening risk variable set and the screening risk variable model coefficient;
S3, inputting a plurality of information data of the individual to be evaluated into the survival risk evaluation scale to obtain a survival risk evaluation quantitative value and an individual survival risk grade of the individual to be evaluated.
Specifically, in step S1, a screening risk variable set is first obtained; it is obtained by fitting a COX proportional risk regression model to an initial risk variable set for screening, combined with a gradual backward algorithm for verification. The screening risk variable model coefficient corresponding to the screening risk variable set is also obtained; this coefficient results from further constructing a survival risk prediction model from the screening risk variable set and then inputting the initial risk variable set into that model.
In step S2, a complete survival risk assessment scale is constructed based on the screening risk variable set and the screening risk variable model coefficient obtained in step S1.
In step S3, a plurality of information data of the individual to be evaluated, i.e. the risk value to be evaluated, is input to the survival risk evaluation scale to obtain a survival risk evaluation quantized value and an individual survival risk level of the evaluated individual, for example, relevant information of a hospital patient is input to obtain the survival risk evaluation quantized value and the corresponding individual survival risk level of the patient.
According to the embodiment of the invention, follow-up risk variables from actual practice are acquired, the important risk factors are intelligently screened out through the risk regression model and stepwise regression, the survival risk assessment scale is constructed automatically, and the survival risk values and the corresponding risk grades are output, so that the coverage is more comprehensive and the practicability is stronger.
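As a hedged illustration of steps S2 and S3, the sketch below treats the scale as a weighted sum of the screened variables followed by a lookup of the risk grade. The variable names, coefficient values and grade cutoffs are hypothetical placeholders, not values from the embodiment, and the embodiment does not state how coefficients are converted into scale points.

```python
from bisect import bisect_right

# Hypothetical screened variables and model coefficients (assumptions).
COEFFICIENTS = {"age_over_60": 0.45, "tumor_stage": 0.80, "albumin_low": 0.30}
# Hypothetical cutoffs separating four risk grades (assumptions).
GRADE_CUTOFFS = [0.5, 1.0, 1.8]


def assess(individual):
    """Return (quantitative risk value, risk grade) for one individual."""
    # Quantitative value: coefficient-weighted sum of the individual's data.
    score = sum(beta * individual[name] for name, beta in COEFFICIENTS.items())
    # Grade 1 is the lowest risk; each cutoff reached raises the grade by one.
    grade = bisect_right(GRADE_CUTOFFS, score) + 1
    return score, grade


patient = {"age_over_60": 1, "tumor_stage": 2, "albumin_low": 0}
score, grade = assess(patient)
```

Keeping the coefficients and cutoffs in plain data structures means a refitted model only changes the tables, not the scoring logic.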
Based on the above embodiment, the constructing a survival risk assessment scale based on the screening risk variable set and the screening risk variable model coefficient further includes:
inputting a plurality of evaluation samples into the survival risk prediction model to obtain a plurality of survival risk prediction values;
dividing the plurality of survival risk prediction values into a plurality of survival risk grades;
comparing a plurality of survival curves corresponding to the plurality of survival risk grades through a preset verification algorithm to obtain a risk difference value;
and if the risk difference value meets a preset difference threshold condition, the classification of the plurality of survival risk classes is considered to be correct.
Specifically, all samples are substituted into the survival risk prediction model to obtain their survival risk prediction values, which are divided according to actual conditions into several groups, i.e. several risk levels; the embodiment of the invention uses 4 to 6 groups, i.e. 4 to 6 survival risk levels. The survival curves corresponding to the survival risk levels are then compared by a preset verification algorithm, here the log-rank test, to determine whether they differ significantly. If the survival curves of the risk levels differ significantly, the division of risk score levels is reasonable; if the difference is not significant, the survival risk levels need to be re-divided.
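The comparison of survival curves can be sketched with a two-group log-rank test. This is a minimal standard-library illustration, not the embodiment's implementation: it handles only two risk levels at a time (with 4 to 6 levels a multi-group or pairwise test would be needed), and the function name and sample data are assumptions.

```python
import math


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; events use 1 = death, 0 = censored.

    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    obs_minus_exp = 0.0  # observed minus expected deaths in group A
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk in A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk in B
        d_a = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))

    chi2 = obs_minus_exp ** 2 / variance if variance > 0 else 0.0
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical follow-up times (months) for a low-risk and a high-risk level.
low_risk = ([10, 12, 14, 16, 18, 20], [1, 1, 1, 1, 1, 1])
high_risk = ([1, 2, 3, 4, 5, 6], [1, 1, 1, 1, 1, 1])
chi2, p_value = logrank_two_groups(*low_risk, *high_risk)
```

A small p-value (for example below 0.05, a conventional choice that the text does not fix) would indicate that the two levels are separated well enough to keep.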
According to the embodiment of the invention, after the survival risk assessment scale is constructed, the sample is input into the assessment scale to obtain the corresponding assessment result, and the assessment result is effectively verified, so that the rationality and accuracy of construction of the survival risk assessment scale can be accurately judged, and the practicability is enhanced.
Based on any one of the foregoing embodiments, the acquiring the screening risk variable set and the screening risk variable model coefficient specifically includes:
acquiring an original risk variable set, and initializing the original risk variable set to obtain a preprocessed risk variable set;
constructing a survival risk assessment database based on the preprocessed risk variable set;
deleting, from the survival risk assessment database, a plurality of variables whose missing rate is larger than a preset optimal missing-rate threshold, and filling in the remaining missing values using a plurality of variables with a preset association degree, to obtain an optimized risk variable set;
screening the optimized risk variable set with a plurality of machine learning algorithms to obtain a plurality of variable sets;
taking the intersection of the plurality of variable sets to obtain the initial risk variable set;
acquiring the COX proportional risk regression model, training the COX proportional risk regression model on the initial risk variable set, and screening in combination with the gradual backward algorithm to obtain the screening risk variable set;
and further constructing the survival risk prediction model based on the screening risk variable set, and inputting the initial risk variable set into the survival risk prediction model to obtain the screening risk variable model coefficient.
Specifically, an original risk variable set is first obtained and subjected to a series of preprocessing steps to obtain a preprocessed risk variable set, from which a survival risk assessment database is constructed. Variables in the database whose missing rate is larger than a preset optimal missing-rate threshold are deleted, and the missing values of the remaining variables are filled in using variables with a certain preset degree of association, yielding an optimized risk variable set. Several machine learning algorithms are then used to screen the optimized risk variable set separately, from different dimensions, and the intersection of the resulting variable sets is taken; this intersection is regarded as the initial risk variable set and contains the most important selected variables.
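The intersection step can be illustrated as follows. The sketch assumes that each algorithm yields a feature-importance score per variable and that each algorithm's top-k variables form its variable set; the top-k rule, the variable names and the scores are all hypothetical, since the text does not say how each algorithm selects its set.

```python
def top_k(importances, k):
    """Return the names of the k variables with the highest importance."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return {name for name, _ in ranked[:k]}


# Hypothetical importance scores from the three screening algorithms.
xgboost_scores = {"age": 0.30, "stage": 0.25, "albumin": 0.20, "bmi": 0.05}
random_forest_scores = {"age": 0.28, "stage": 0.22, "bmi": 0.21, "albumin": 0.10}
gbdt_scores = {"stage": 0.35, "age": 0.25, "albumin": 0.15, "bmi": 0.02}

variable_sets = [
    top_k(scores, 3)
    for scores in (xgboost_scores, random_forest_scores, gbdt_scores)
]
# Only variables selected by every algorithm enter the initial risk variable set.
initial_risk_variables = set.intersection(*variable_sets)
```

Taking the intersection is a conservative choice: a variable survives only if all three algorithms rank it as important.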
A survival risk prediction model is then constructed, in which the target variables of the data set are the survival risk (0-1 variable) and the survival time (numerical variable), and the independent variables are the initial risk variable set.
The COX proportional risk regression model (the Cox proportional hazards model) is then obtained. It is a semi-parametric regression model that takes the survival outcome and the survival time as dependent variables; it can analyze the influence of several factors on survival time simultaneously, can analyze data with censored survival times, and does not require estimating the type of survival distribution of the data.
The COX proportional risk regression model takes the hazard function $h(t, X)$ as the dependent variable, where $X = (X_1, X_2, \dots, X_m)$. The basic form of the model is:
$$h(t, X) = h_0(t)\exp(\beta_1 X_1 + \beta_2 X_2 + \dots + \beta_m X_m) \tag{1}$$
In the above, $\beta_1, \beta_2, \dots, \beta_m$ are the partial regression coefficients of the independent variables, and $h_0(t)$ is the reference (baseline) risk, i.e. the value of $h(t, X)$ when $X = 0$. Because the COX regression model makes no assumption about $h_0(t)$, it has greater flexibility in dealing with problems; on the other hand, in many cases only the parameters $\beta$ need to be estimated (e.g., in factor analysis), and $\beta$ can still be estimated even when $h_0(t)$ is unknown. In other words, because the COX regression model contains the unspecified $h_0(t)$, it is not a fully parametric model, yet the parameters $\beta$ can still be estimated from equation (1); the COX regression model is therefore a semi-parametric model. Equation (1) can be converted into:
$$\ln[h(t, X)/h_0(t)] = \ln RR = \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_m X_m \tag{2}$$
The assumptions of the COX regression model are:
(1) Proportional risk assumption: the effect of each risk factor does not change with time, i.e. $h(t, X)/h_0(t)$ does not change over time. For this reason equation (1) is also known as a proportional risk model; this assumption is a precondition for building a COX regression model;
(2) Log-linear assumption: the covariates in the model should be linearly related to the logarithm of the risk ratio.
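To make concrete why the coefficients can be estimated without knowing the baseline risk, the sketch below computes the Cox partial log-likelihood, in which the baseline term cancels inside each risk set. It is a minimal illustration using Breslow's convention for tied times; the function name and the toy data are assumptions, and a real fit would maximise this quantity numerically over the coefficients.

```python
import math


def cox_partial_log_likelihood(beta, covariates, times, events):
    """Cox partial log-likelihood for coefficient vector beta.

    covariates: one list of covariate values per individual
    events:     1 = death observed, 0 = censored
    """
    # Linear predictor beta_1*X_1 + ... + beta_m*X_m for each individual.
    linear = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    total = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i != 1:
            continue  # censored individuals contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_sum = sum(
            math.exp(linear[j]) for j, t_j in enumerate(times) if t_j >= t_i
        )
        # The baseline risk appears in numerator and denominator and cancels.
        total += linear[i] - math.log(risk_sum)
    return total


# Toy data: a larger covariate value goes with earlier death.
covariates = [[2.0], [1.0], [0.0]]
times = [1, 2, 3]
events = [1, 1, 1]
ll_null = cox_partial_log_likelihood([0.0], covariates, times, events)
ll_fit = cox_partial_log_likelihood([1.0], covariates, times, events)
```

On this toy data the positive coefficient gives a higher partial log-likelihood than a coefficient of zero, which matches the pattern in the data.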
The data are then divided into a training set and a test set at a ratio of 8:2. Backward Stepwise, i.e. the gradual backward algorithm, is used as the variable-screening mode of the COX proportional risk regression model to further screen the important variables, and the final survival risk prediction model is constructed on the training set to predict the survival risk of individuals.
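The backward stepwise idea can be sketched generically. The text does not name the elimination criterion, so this sketch assumes a lower-is-better criterion (such as AIC from a refitted model) passed in as a callable; the `toy_criterion` function below is a hypothetical stand-in for refitting the regression model, not part of the embodiment.

```python
def backward_stepwise(variables, criterion):
    """Drop one variable at a time while the criterion keeps improving.

    criterion(list_of_variables) -> float, lower is better (assumption).
    """
    selected = list(variables)
    best = criterion(selected)
    while len(selected) > 1:
        # Score every model obtained by removing exactly one variable.
        candidates = [
            (criterion([v for v in selected if v != drop]), drop)
            for drop in selected
        ]
        candidate_score, drop = min(candidates)
        if candidate_score >= best:
            break  # no single removal improves the criterion
        selected.remove(drop)
        best = candidate_score
    return selected, best


# Hypothetical stand-in for refitting: a penalty of 2 per variable kept, plus
# the loss incurred when an informative variable is left out.
INFORMATIVE = {"age": 10.0, "stage": 8.0}


def toy_criterion(variables):
    penalty = 2.0 * len(variables)
    loss = sum(v for name, v in INFORMATIVE.items() if name not in variables)
    return penalty + loss


kept, score = backward_stepwise(["age", "stage", "noise1", "noise2"], toy_criterion)
```

In practice the criterion would be evaluated on the training portion of the 8:2 split, so that the test set remains untouched for evaluation.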
Next, the C_index (consistency index) and the Brier Score of the test set are calculated as indexes for evaluating the model, and are compared to determine whether the two rounds of important-variable screening improved the predictive power and the prediction accuracy of the model. The Brier Score can be regarded as a measure of the "calibration" of a set of probability predictions, also called a "cost function"; the predicted outcomes must be mutually exclusive, and their probabilities must sum to 1. The lower the Brier Score of a set of predictions, the better their calibration. Calibration here means re-examining the class probabilities predicted by the classification function: the Brier Score is computed, and its magnitude indicates whether the initial prediction is supported or contradicted. Here, the C_index is used to evaluate the predictive power of the model, with higher values indicating better predictive power; the Brier Score is used to evaluate prediction accuracy, with lower values indicating higher prediction accuracy.
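Both indexes can be computed directly. The sketch below is a simplified illustration: the C-index follows Harrell's pairwise definition, and the Brier score is the plain mean squared error between predicted death probabilities and observed 0/1 outcomes at a fixed time point, without any censoring adjustment. The function names and sample values are assumptions.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    A pair (i, j) is comparable when individual i died (event 1) before time j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier death has higher predicted risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable


def brier_score(predicted_probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    pairs = list(zip(predicted_probs, outcomes))
    return sum((p - y) ** 2 for p, y in pairs) / len(pairs)


# Hypothetical test-set values.
c_index = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.4, 0.1])
brier = brier_score([0.9, 0.2, 0.1], [1, 0, 0])
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, while a Brier score of 0 corresponds to perfectly confident and correct probabilities.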
After the screening variable set is obtained, a survival risk prediction model is further constructed, and the initial risk variable set is input into the survival risk prediction model, so that the screening risk variable model coefficient is obtained.
The embodiment of the invention constructs the survival risk prediction model by screening the variables for multiple times and introducing the COX regression model, saves the time of consulting documents and consulting specialists before constructing the scale by the traditional method to a certain extent, and effectively avoids the condition of manually summarizing missing risk factors.
Based on any one of the foregoing embodiments, the obtaining an original risk variable set, initializing the original risk variable set to obtain a preprocessed risk variable set, specifically includes:
acquiring a plurality of objective risk information of an individual to be evaluated, constructing the original risk variable set, and setting target variables for the original risk variable set;
and cleaning data of the original risk variable set with the set target variable, and formatting to obtain the preprocessing risk variable set.
Specifically, after the original risk variable set is acquired, the target variables are set, namely: survival risk (1 for death, 0 for survival) and survival time.
Data cleaning is then performed on the original variables, which comprise the patient's survival information, basic information, and clinical data such as medical history, disease diagnosis information, examination information and operation information. All original variables are cleaned and formatted; the data cleaning methods include outlier processing, missing value processing, data grouping, data transposition, one-hot encoding of multi-class variables, and the like. A preprocessed risk variable set is finally obtained.
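Of the cleaning steps listed, one-hot encoding of a multi-class variable is the easiest to show in isolation. The sketch below is a minimal illustration; the function name and the example category labels are hypothetical.

```python
def one_hot(values):
    """One-hot encode a multi-class variable.

    Returns the sorted category list and one 0/1 row per input value.
    """
    categories = sorted(set(values))
    rows = [[1 if value == c else 0 for c in categories] for value in values]
    return categories, rows


# Hypothetical multi-class variable: type of operation per patient.
categories, encoded = one_hot(["resection", "ablation", "resection", "transplant"])
```

Each category becomes its own 0/1 column, so the regression model does not impose an artificial ordering on the categories.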
The embodiment of the invention carries out preliminary pretreatment on the original variable set, carries out preliminary screening on the validity and accuracy of the variables, and is beneficial to improving the accuracy of the input variables of subsequent modeling.
Based on any one of the foregoing embodiments, deleting a plurality of variables with deletion rates greater than a preset optimal threshold in the survival risk assessment database, and obtaining the plurality of variables with a preset association degree for supplementing, to obtain an optimized risk variable set, including:
setting a preset deletion rate range interval and a preset adjustment step length;
starting from the starting point of the preset deletion rate range interval, increasing according to the preset adjustment step length until the ending point of the preset deletion rate range interval is reached, so as to obtain a plurality of preset adjustment thresholds;
Deleting the variables in the preprocessing risk variable set which are larger than the preset adjustment thresholds to obtain a plurality of verification risk test sets;
verifying the verification risk test sets to obtain the preset optimal deletion rate threshold;
deleting the variables in the pretreatment risk variable set according to the preset optimal deletion rate threshold;
and acquiring the plurality of variables with the preset association degree by adopting a K nearest neighbor algorithm, and supplementing the plurality of deleted variables to obtain the optimized risk variable set.
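The threshold search in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the range of 30% to 95% and the 5% step are example values, the table layout is hypothetical, and `toy_score` is only a stand-in for training a model on each verification risk test set and measuring its accuracy.

```python
def missing_rate(values):
    """Fraction of entries in a column that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def candidate_thresholds(start_pct=30, stop_pct=95, step_pct=5):
    """Thresholds from start to stop (inclusive), built from integer percentages."""
    return [p / 100 for p in range(start_pct, stop_pct + 1, step_pct)]


def drop_sparse_variables(table, threshold):
    """Delete every variable whose missing rate is greater than the threshold."""
    return {name: col for name, col in table.items() if missing_rate(col) <= threshold}


def best_threshold(table, score_fn, thresholds):
    """Return the (first) threshold whose reduced variable set scores highest."""
    return max(thresholds, key=lambda t: score_fn(drop_sparse_variables(table, t)))


def toy_score(reduced):
    # Hypothetical stand-in for "train a model and measure its accuracy":
    # rewards well-filled columns and penalises sparse ones.
    return sum(1 - 2 * missing_rate(col) for col in reduced.values())


# Hypothetical variable table: column name -> list of values (None = missing).
table = {
    "age": [63, 58, 71, 49, 66, 52, 60, 74],                    # missing rate 0.0
    "uric_acid": [310, None, 402, None, 355, 298, None, 340],   # missing rate 0.375
    "rare_marker": [None] * 7 + [1.2],                          # missing rate 0.875
}
thresholds = candidate_thresholds()
optimal = best_threshold(table, toy_score, thresholds)
kept = drop_sparse_variables(table, optimal)
```

In practice `score_fn` would wrap the real verification procedure; the sweep itself stays the same.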
Specifically, a preset deletion rate range is set, where the deletion rate is the proportion of missing values of a variable; 30%-95% is adopted here. A preset adjustment step length is also set, with 5% adopted as the adjustment unit. The variables with deletion rates larger than 30%, 35%, 40%, ..., 95% are deleted in turn to obtain a plurality of verification risk test sets. The accuracy of each verification risk test set is verified in turn, and the threshold that gives the highest accuracy is taken as the preset optimal deletion rate threshold. The variables in the pretreatment risk variable set are then deleted based on the preset optimal deletion rate threshold, and the missing values of the remaining variables are supplemented using a plurality of variables with the preset association degree. For example, the missing values are interpolated by the similar-case method, and the K Nearest Neighbor (KNN) classification algorithm is adopted to identify similar cases. KNN is a theoretically mature and relatively simple machine learning algorithm, and the algorithm idea is as follows:
If the majority of the k most similar (i.e., nearest-neighbor) samples of a given sample in the feature space belong to a certain class, then the sample also belongs to that class. In the K nearest neighbor algorithm, a training data set is given and the K instances nearest to a new input instance are found in the training data set. If the variable to be filled is a continuous variable, the substitute for the missing value is obtained as the weighted average over the K neighboring instances; if it is a categorical variable, the missing value is replaced by the value with the largest proportion among them. The steps in actual operation are as follows:
1) Selecting variables with missing values in the data set;
2) Taking uric acid with missing values as an example, uric acid is taken as the target variable (target) and the remaining variables as characteristic variables when KNN interpolation is carried out;
3) Splitting the data set into a data set tran in which uric acid is not missing and a data set test in which uric acid is missing;
4) Calculating the Euclidean distance between each sample in the test data set and each sample in the tran data set, and selecting the k similar samples closest to each test sample. The Euclidean distance between a case a in the test data set and a case i (i ∈ tran) in the tran data set is given by the standard formula $d(a, i) = \sqrt{\sum_{j=1}^{n} (x_{aj} - x_{ij})^2}$, where $x_{aj}$ and $x_{ij}$ denote the values of the j-th characteristic variable for cases a and i, and n is the number of characteristic variables;
5) Taking the reciprocal of the Euclidean distance as the weight, and obtaining the substitute for the missing value of each sample in the test data set as the weighted average of the uric acid values of its k similar cases in the tran data set;
6) By adjusting the value of k, finding the k value that gives the highest test-set accuracy of the machine learning algorithm after interpolation.
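The interpolation steps above can be sketched as follows for a continuous target such as uric acid. This is a minimal illustration, not the embodiment's implementation: the record layout and the feature values are hypothetical, the features are assumed to be already on comparable scales, and the handling of a zero distance is an added assumption.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_impute(train_features, train_targets, query, k):
    """Inverse-distance-weighted mean of the targets of the k nearest training cases."""
    neighbours = sorted(
        (euclidean(f, query), t) for f, t in zip(train_features, train_targets)
    )[:k]
    # Assumption: an exact match (distance 0) simply donates its own value,
    # which also avoids dividing by zero when taking the reciprocal.
    for dist, target in neighbours:
        if dist == 0:
            return target
    weights = [1 / dist for dist, _ in neighbours]
    return sum(w * t for w, (_, t) in zip(weights, neighbours)) / sum(weights)


# Hypothetical records: feature vector plus a uric acid value (None = missing).
records = [
    {"features": (1.0, 0.0), "uric_acid": 300.0},
    {"features": (0.0, 2.0), "uric_acid": 360.0},
    {"features": (10.0, 10.0), "uric_acid": 900.0},
    {"features": (0.0, 0.0), "uric_acid": None},
]

# Step 3: split into the non-missing set (tran) and the missing set (test).
tran = [r for r in records if r["uric_acid"] is not None]
test = [r for r in records if r["uric_acid"] is None]

# Steps 4 and 5: fill each missing value from its k nearest cases in tran.
for r in test:
    r["uric_acid"] = knn_impute(
        [t["features"] for t in tran],
        [t["uric_acid"] for t in tran],
        r["features"],
        k=2,
    )
```

Here the two nearest cases lie at distances 1 and 2, so their weights are 1 and 0.5 and the filled value is (300 + 0.5 × 360) / 1.5 = 320. Step 6 would repeat this for several values of k and keep the one that gives the best downstream accuracy.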
According to the embodiment of the invention, the risk variable is further processed by adopting the threshold deleting and K nearest neighbor algorithm, so that the variable with higher reliability is obtained, and the accuracy and the reliability of survival risk assessment are improved.
Based on any of the above embodiments, the plurality of machine learning algorithms include an XGBoost algorithm, a random forest algorithm, and a GBDT algorithm.
Specifically, a plurality of machine learning algorithms are adopted to screen important variables, and the embodiment of the invention adopts the following algorithms:
1) Extraction of important variables based on the XGBoost algorithm
XGBoost is an ensemble learning method that achieves regression by adding up a series of regression decision trees. XGBoost is an improvement on the Boosting algorithm based on GBDT, and uses regression trees as its internal decision trees. The basic idea of the XGBoost algorithm is: hundreds of tree models with low classification accuracy are combined into a model with high accuracy, so as to achieve the purpose of classification. Each subsequent tree takes the residual of the preceding tree as its regression target, and the XGBoost algorithm uses the gradient as an approximation of the residual. The specific flow is as follows:
Inputting a target variable and an independent variable respectively;
defining an objective function (loss + regularization term), where loss = error of the previous tree (gradient) and regularization term = complexity of the tree. Optimizing the objective function requires the prediction error to be as small as possible and the complexity of the trees to be as low as possible;
searching for the split points by a greedy method and constructing a decision tree. All the different tree structures are enumerated, and the scheme whose Gain value is the largest and exceeds a threshold is selected. If max(Gain) is less than the threshold, the branch is pruned and splitting terminates;
calculating the score of the leaf node, updating the decision tree sequence, and storing all constructed decision trees and the score thereof;
calculating the prediction result of each sample, namely the sum of the scores of each tree, and obtaining the probability that the sample belongs to each category;
and calculating an importance score of each variable, namely an average value of Gini coefficients, selecting important variables which have significant influence on the model, and reserving the important variables with the importance score larger than 0.
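The greedy split search in the flow above can be illustrated with the split gain commonly used for XGBoost-style trees, Gain = ½[G_L²/(H_L+λ) + G_R²/(H_R+λ) − (G_L+G_R)²/(H_L+H_R+λ)] − γ, where G and H are sums of first and second derivatives of the loss. The sketch below is not the embodiment's implementation: it scans a single feature, the parameter names `lam` and `gamma` and the toy data are assumptions, and `gamma` plays the role of the threshold below which the split is pruned.

```python
def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Gain of splitting a node into left/right children (gradient/hessian sums)."""
    def score(g, h):
        return g * g / (h + lam)

    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent) - gamma


def best_split(xs, grads, hess, lam=1.0, gamma=0.0):
    """Greedy scan over one feature; returns (gain, threshold), or None if pruned."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    g_total, h_total = sum(grads), sum(hess)
    g_left = h_left = 0.0
    best = None
    for pos in range(len(order) - 1):
        i, nxt = order[pos], order[pos + 1]
        g_left += grads[i]
        h_left += hess[i]
        if xs[i] == xs[nxt]:
            continue  # no split point between equal feature values
        gain = split_gain(g_left, h_left, g_total - g_left, h_total - h_left, lam, gamma)
        if best is None or gain > best[0]:
            best = (gain, (xs[i] + xs[nxt]) / 2)
    # Pruning: stop splitting when even the best candidate has no positive gain.
    if best is None or best[0] <= 0:
        return None
    return best


# Toy data: squared loss with an initial prediction of 0.5,
# so gradient = prediction - label and hessian = 1 for every sample.
xs = [1.0, 2.0, 3.0, 4.0]
labels = [0.0, 0.0, 1.0, 1.0]
grads = [0.5 - y for y in labels]
hess = [1.0] * len(xs)

chosen = best_split(xs, grads, hess)               # split between 2.0 and 3.0
pruned = best_split(xs, grads, hess, gamma=1.0)    # threshold too high: no split
```

With these toy values the best split lies at 2.5 with a gain of 1/3; raising `gamma` to 1.0 makes every candidate gain negative, so the node is left unsplit.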
2) Extraction of important variables based on random forest algorithm
Random forest is in fact a special bagging method that uses decision trees as the models in bagging. First, m training sets are generated by the bootstrap method, and a decision tree is then constructed for each training set. When a node looks for a feature on which to split, it does not search all the features for the one that maximizes an index (such as information gain); instead, a part of the features is randomly extracted, the optimal solution is found among the extracted features, and the node is split accordingly. Because of bagging, i.e., the ensemble idea, the random forest method is in effect equivalent to sampling both the samples and the features (if the training data are regarded as a matrix, as is common in practice, this is a process of sampling both rows and columns), so that overfitting can be avoided. Because of this randomness, the method is very effective in reducing the variance of the model, so that a random forest can obtain good generalization ability and resistance to overfitting without additional pruning. The specific flow is as follows:
Inputting a target variable and an independent variable respectively;
constructing decision trees and a decision tree forest. "Random" has two meanings: the samples are randomly selected, and the features are randomly selected. For each tree, training samples are randomly drawn with replacement, and features are then randomly drawn with replacement as the basis for branching the tree; a plurality of trees can be constructed in this way to form a decision tree forest;
the importance level of each feature is calculated. Taking the difference of error rates as the importance degree of the feature in the tree, wherein each feature appears in a plurality of trees, and taking the average value of the importance degree of the feature in the plurality of trees as the importance degree of the feature in the forest;
sequencing the importance degrees of all the features, removing partial features with low importance degrees in the forest to obtain a new feature set, so that one iteration is truly completed, and the trees in the forest are continuously optimized through continuous iteration;
comparing the predicted results of all samples with the true values, calculating the out-of-bag error rate of each forest, and selecting the forest with the minimum out-of-bag error rate as the final random forest model;
taking the importance degree of each feature in the forest as its importance score, and retaining the variables whose importance score is greater than 0 as important variables.
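Two pieces of the flow above lend themselves to a short sketch: drawing a bootstrap sample (which also determines the out-of-bag samples used for the out-of-bag error rate) and averaging per-tree importance values into a forest-level score. The per-tree importance numbers, the feature names and the fixed random seed below are hypothetical; training of the trees themselves is not shown.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement; the rest are out-of-bag."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


def forest_importance(per_tree_importance):
    """Average each feature's importance over the trees in which it appears."""
    totals, counts = {}, {}
    for tree in per_tree_importance:
        for feature, value in tree.items():
            totals[feature] = totals.get(feature, 0.0) + value
            counts[feature] = counts.get(feature, 0) + 1
    return {feature: totals[feature] / counts[feature] for feature in totals}


rng = random.Random(0)  # fixed seed so the sketch is reproducible
in_bag, out_of_bag = bootstrap_indices(10, rng)

# Hypothetical per-tree importance: difference of error rates for each feature.
per_tree = [
    {"age": 0.10, "uric_acid": 0.02},
    {"age": 0.06, "stage": -0.01},
    {"uric_acid": 0.04, "stage": -0.03},
]
scores = forest_importance(per_tree)
important = sorted(f for f, s in scores.items() if s > 0)  # keep scores above 0
```

A feature whose averaged score is not positive, such as `stage` in this toy input, is dropped, which mirrors the retention rule in the last step.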
3) Extraction of important variables based on GBDT algorithm
GBDT is also a member of the Boosting family of ensemble learning, but is quite different from the traditional AdaBoost. AdaBoost uses the error rate of the weak learner from the previous iteration to update the weights of the training set, and iterates round by round in this way. GBDT is also iterative and uses a forward stagewise algorithm, but the weak learner is restricted to the CART regression tree model, and each round of training is performed on the basis of the residual of the previous round. The specific flow is as follows:
inputting a target variable and an independent variable respectively;
defining an objective function. Assume that the strong learner obtained from the previous iteration is $f_{t-1}(x)$ and the loss function is $L(y, f_{t-1}(x))$. The goal of this round of iteration is to find a weak learner $h_t(x)$ of the CART regression tree model that minimizes the loss function of this round, $L(y, f_t(x)) = L(y, f_{t-1}(x) + h_t(x))$. That is, this round of iteration finds the decision tree that reduces the loss of the samples as much as possible;
initializing a weak learner;
a negative gradient, i.e. residual, is calculated for each sample. Taking the residual error as a new true value of the sample, and taking the original data as training data of the next tree to obtain a new regression tree;
and calculating a best fit value for the leaf area, and updating the strong learner.
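The GBDT flow above can be sketched for the squared loss, where the negative gradient is exactly the residual and the best fit value of a leaf is the mean residual in that leaf. The sketch makes several simplifying assumptions that are not from the embodiment: one feature, depth-one regression trees (stumps) in place of full CART trees, a shrinkage factor `learning_rate`, and toy data.

```python
def fit_stump(xs, targets):
    """Least-squares regression stump: (threshold, left_value, right_value)."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [t for x, t in zip(xs, targets) if x <= thr]
        right = [t for x, t in zip(xs, targets) if x > thr]
        lv, rv = sum(left) / len(left), sum(right) / len(right)  # leaf fit values
        sse = sum((t - lv) ** 2 for t in left) + sum((t - rv) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lv, rv)
    return best[1:]


def predict_stump(stump, x):
    thr, lv, rv = stump
    return lv if x <= thr else rv


def fit_gbdt(xs, ys, n_rounds=20, learning_rate=0.5):
    """Return (initial value, learning rate, list of stumps)."""
    init = sum(ys) / len(ys)  # initial weak learner: constant minimizing squared loss
    preds = [init] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of 1/2 * (y - f)^2 with respect to f is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)  # next tree is trained on the residuals
        stumps.append(stump)
        # Update the strong learner: f_t = f_{t-1} + learning_rate * h_t.
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return init, learning_rate, stumps


def predict_gbdt(model, x):
    init, learning_rate, stumps = model
    return init + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(predictions, ys):
    return sum((p - y) ** 2 for p, y in zip(predictions, ys)) / len(ys)


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 0.9, 3.0, 3.2, 2.9]
model = fit_gbdt(xs, ys)
initial_mse = mse([model[0]] * len(ys), ys)
final_mse = mse([predict_gbdt(model, x) for x in xs], ys)
```

On this toy data the training error falls from about 1.0 for the constant initial learner to well under 0.1 after the boosting rounds, since each new stump removes part of the remaining residual.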
According to the embodiment of the invention, three mature machine learning algorithms are selected to screen the optimized risk variable set, and the initial risk variable set is obtained by taking the intersection of their results, which further improves the reliability of the variables on the basis of preprocessing.
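Taking the intersection of the three screened variable sets can be sketched as follows. The importance scores and variable names are hypothetical; only the rule of keeping variables with an importance score greater than 0 and intersecting the results comes from the description above.

```python
def important_variables(scores):
    """Variables whose importance score is greater than 0."""
    return {name for name, score in scores.items() if score > 0}


# Hypothetical importance scores from the three algorithms.
xgboost_scores = {"age": 0.31, "uric_acid": 0.12, "stage": 0.20, "sex": 0.0}
random_forest_scores = {"age": 0.25, "uric_acid": 0.0, "stage": 0.18, "sex": 0.03}
gbdt_scores = {"age": 0.28, "uric_acid": 0.09, "stage": 0.22, "sex": 0.0}

variable_sets = [
    important_variables(s)
    for s in (xgboost_scores, random_forest_scores, gbdt_scores)
]
initial_risk_variables = set.intersection(*variable_sets)
```

Only variables that every algorithm rates as important survive, so a variable kept by one or two algorithms alone (such as `uric_acid` or `sex` here) is excluded.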
Fig. 2 is a structural diagram of a survival risk assessment system according to an embodiment of the present invention, as shown in fig. 2, including: an acquisition module 21, a processing module 22 and an evaluation module 23; wherein:
the acquisition module 21 is used for acquiring a screening risk variable set and a screening risk variable model coefficient; the screening risk variable set is obtained by an initial risk variable set based on a COX proportional risk regression model and a gradual backward algorithm; the screening risk variable model coefficient is obtained based on a survival risk prediction model; the processing module 22 is configured to construct a survival risk assessment scale based on the screening risk variable set and the screening risk variable model coefficient; the evaluation module 23 is configured to input a plurality of information data of the individual to be evaluated into the survival risk evaluation scale, to obtain a survival risk evaluation quantitative value and an individual survival risk level of the individual to be evaluated.
The system provided by the embodiment of the present invention is used for executing the corresponding method, and the specific implementation manner of the system is consistent with the implementation manner of the method, and the related algorithm flow is the same as the algorithm flow of the corresponding method, which is not repeated here.
According to the embodiment of the invention, follow-up data from practical applications are acquired, the important risk factors are intelligently screened out through the risk regression model and stepwise regression, the survival risk assessment scale is constructed automatically, and the survival risk value and the corresponding risk level are output; the content covered is more comprehensive and the practicability is higher.
Based on any of the above embodiments, the system further includes a verification module 24, where the verification module 24 is configured to input a plurality of evaluation samples into the survival risk prediction model to obtain a plurality of survival risk prediction values; dividing the plurality of survival risk prediction values into a plurality of survival risk grades; comparing a plurality of survival curves corresponding to the plurality of survival risk grades through a preset verification algorithm to obtain a risk difference value; and if the risk difference value meets a preset difference threshold condition, the classification of the plurality of survival risk classes is considered to be correct.
According to the embodiment of the invention, after the survival risk assessment scale is constructed, the sample is input into the assessment scale to obtain the corresponding assessment result, and the assessment result is effectively verified, so that the rationality and accuracy of construction of the survival risk assessment scale can be accurately judged, and the practicability is enhanced.
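The comparison of survival curves between risk grades can be sketched with a two-group log-rank test. This is only one possible choice: the text above does not name the preset verification algorithm here, so the use of the log-rank statistic, the threshold of 3.841 (the 5% critical value of the chi-square distribution with one degree of freedom) and the toy follow-up data are all assumptions.

```python
def logrank_chi2(times_a, events_a, times_b, events_b):
    """Two-group log-rank statistic (chi-square with 1 degree of freedom).

    times_*  : follow-up time of each individual
    events_* : 1 if death was observed at that time, 0 if censored
    Assumes at least one event time with two or more individuals at risk.
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # still at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # still at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return (observed_a - expected_a) ** 2 / variance


# Hypothetical follow-up data for two risk grades.
high_times, high_events = [2, 3, 4, 5, 6], [1, 1, 1, 1, 1]
low_times, low_events = [10, 12, 14, 16, 18], [0, 1, 0, 0, 1]

chi2 = logrank_chi2(high_times, high_events, low_times, low_events)
CRITICAL_5_PERCENT = 3.841  # assumed preset difference threshold
grading_accepted = chi2 > CRITICAL_5_PERCENT
```

With more than two risk grades, the same comparison could be repeated pairwise or replaced by a multi-group version of the test; that choice is left open here.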
Based on any of the above embodiments, the acquiring module 21 includes: a preprocessing sub-module 211, a first construction sub-module 212, an optimization sub-module 213, a first screening sub-module 214, a solution sub-module 215, a second screening sub-module 216, and a second construction sub-module 217; wherein:
the preprocessing sub-module 211 is configured to obtain an original risk variable set, and initialize the original risk variable set to obtain a preprocessed risk variable set; the first construction submodule 212 is used for constructing a survival risk assessment database based on the pretreatment risk variable set; the optimizing sub-module 213 is configured to delete a number of variables with a deletion rate greater than a preset optimal deletion rate threshold in the survival risk assessment database, and obtain the number of variables with a preset association degree for supplementing, so as to obtain an optimized risk variable set; the first screening sub-module 214 is configured to screen the optimized risk variable set by using a plurality of machine learning algorithms, to obtain a plurality of variable sets; the solving sub-module 215 solves the intersection sets for the plurality of variable sets to obtain the initial risk variable set; the second screening submodule 216 is configured to obtain the COX proportional-risk regression model, train the initial risk variable set based on the COX proportional-risk regression model, and screen the initial risk variable set in combination with the stepwise backward algorithm to obtain the screened risk variable set; the second construction submodule 217 further constructs the survival risk prediction model based on the screening risk variable set, and inputs the initial risk variable set into the survival risk prediction model to obtain the screening risk variable model coefficient.
The embodiment of the invention constructs the survival risk prediction model by screening the risk variables for multiple times and introducing the COX regression model, saves the time of consulting documents and consulting specialists before constructing the scale by the traditional method to a certain extent, and effectively avoids the condition of manually summarizing missing risk factors.
Based on any of the foregoing embodiments, the preprocessing submodule 211 is specifically configured to obtain a plurality of objective risk information of the individual to be evaluated, construct the original risk variable set, and set a target variable for the original risk variable set; and cleaning data of the original risk variable set with the set target variable, and formatting to obtain the preprocessing risk variable set.
The embodiment of the invention performs preliminary preprocessing on the original variable set and preliminarily screens the validity and accuracy of the variables, which helps improve the accuracy of the input data for subsequent modeling.
Based on any of the above embodiments, the optimizing sub-module 213 is specifically configured to set a preset deletion rate range interval and a preset adjustment step; starting from the starting point of the preset deletion rate range interval, increasing according to the preset adjustment step length until the ending point of the preset deletion rate range interval is reached, so as to obtain a plurality of preset adjustment thresholds; deleting the variables in the preprocessing risk variable set which are larger than the preset adjustment thresholds to obtain a plurality of verification risk test sets; verifying the verification risk test sets to obtain the preset optimal deletion rate threshold; deleting the variables in the pretreatment risk variable set according to the preset optimal deletion rate threshold; and acquiring the plurality of variables with the preset association degree by adopting a K nearest neighbor algorithm, and supplementing the plurality of deleted variables to obtain the optimized risk variable set.
According to the embodiment of the invention, the risk variable is further processed by adopting the threshold deleting and K nearest neighbor algorithm, so that the data variable with higher reliability is obtained, and the accuracy and the reliability of survival risk assessment are improved.
Based on any of the above embodiments, the plurality of machine learning algorithms include an XGBoost algorithm, a random forest algorithm, and a GBDT algorithm.
According to the embodiment of the invention, three mature machine learning algorithms are selected to screen the optimized risk variable set, and the initial risk variable set is obtained by taking the intersection of their results, which further improves the credibility of the data on the basis of preprocessing.
Fig. 3 illustrates a physical schematic diagram of an electronic device, as shown in fig. 3, where the electronic device may include: processor 310, communication interface (Communications Interface) 320, memory 330 and communication bus 340, wherein processor 310, communication interface 320, memory 330 accomplish communication with each other through communication bus 340. The processor 310 may call logic instructions in the memory 330 to perform the following method: acquiring a screening risk variable set and a screening risk variable model coefficient; the screening risk variable set is obtained by an initial risk variable set based on a COX proportional risk regression model and a gradual backward algorithm; the screening risk variable model coefficient is obtained based on a survival risk prediction model; constructing a survival risk assessment scale based on the screening risk variable set and the screening risk variable model coefficient; and inputting a plurality of information data of the individual to be evaluated into the survival risk evaluation scale to obtain a survival risk evaluation quantitative value and an individual survival risk grade of the individual to be evaluated.
Further, the logic instructions in the memory 330 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In another aspect, embodiments of the present invention further provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the survival risk assessment method provided in the above embodiments, for example, including: acquiring a screening risk variable set and a screening risk variable model coefficient; the screening risk variable set is obtained by an initial risk variable set based on a COX proportional risk regression model and a gradual backward algorithm; the screening risk variable model coefficient is obtained based on a survival risk prediction model; constructing a survival risk assessment scale based on the screening risk variable set and the screening risk variable model coefficient; and inputting a plurality of information data of the individual to be evaluated into the survival risk evaluation scale to obtain a survival risk evaluation quantitative value and an individual survival risk grade of the individual to be evaluated.
The apparatus embodiments described above are merely illustrative, wherein the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. A method for assessing survival risk, comprising:
acquiring a screening risk variable set and a screening risk variable model coefficient; the screening risk variable set is obtained by an initial risk variable set based on a COX proportional risk regression model and a gradual backward algorithm; the screening risk variable model coefficient is obtained based on a survival risk prediction model;
constructing a survival risk assessment scale based on the screening risk variable set and the screening risk variable model coefficient;
inputting a plurality of information data of the individual to be evaluated into the survival risk evaluation scale to obtain a survival risk evaluation quantitative value and an individual survival risk grade of the individual to be evaluated;
The acquiring the screening risk variable set and the screening risk variable model coefficient specifically comprises the following steps:
an original risk variable set is obtained, and the original risk variable set is initialized to obtain a preprocessed risk variable set;
constructing a survival risk assessment database based on the pretreatment risk variable set;
deleting a plurality of variables with the deletion rate larger than a preset optimal deletion rate threshold value in the survival risk assessment database, and acquiring the variables with the preset association degree for supplementing to obtain an optimal risk variable set;
screening the optimized risk variable sets by adopting a plurality of machine learning algorithms to obtain a plurality of variable sets;
solving intersections of the variable sets to obtain the initial risk variable set;
acquiring the COX proportional risk regression model, training the initial risk variable set based on the COX proportional risk regression model, and screening by combining the gradual backward algorithm to obtain the screening risk variable set;
based on the screening risk variable set, further constructing the survival risk prediction model, and inputting the initial risk variable set into the survival risk prediction model to obtain the screening risk variable model coefficient;
Deleting a plurality of variables with the deletion rate larger than a preset optimal threshold value in the survival risk assessment database, and acquiring the plurality of variables with the preset association degree for supplementing to obtain an optimal risk variable set, wherein the method specifically comprises the following steps of:
setting a preset deletion rate range interval and a preset adjustment step length;
starting from the starting point of the preset deletion rate range interval, increasing according to the preset adjustment step length until the ending point of the preset deletion rate range interval is reached, so as to obtain a plurality of preset adjustment thresholds;
deleting the variables in the preprocessing risk variable set which are larger than the preset adjustment thresholds to obtain a plurality of verification test sets;
verifying the verification test sets to obtain the preset optimal deletion rate threshold;
deleting the variables in the pretreatment risk variable set according to the preset optimal deletion rate threshold;
and acquiring the plurality of variables with the preset association degree by adopting a K nearest neighbor algorithm, and supplementing the plurality of deleted variables to obtain the optimized risk variable set.
2. The survival risk assessment method of claim 1, wherein the constructing a survival risk assessment scale based on the screening risk variable set and the screening risk variable model coefficients further comprises:
Inputting a plurality of evaluation samples into the survival risk prediction model to obtain a plurality of survival risk prediction values;
dividing the plurality of survival risk prediction values into a plurality of survival risk grades;
comparing a plurality of survival curves corresponding to the plurality of survival risk grades through a preset verification algorithm to obtain a risk difference value;
and if the risk difference value meets a preset difference threshold condition, the classification of the plurality of survival risk classes is considered to be correct.
3. The survival risk assessment method according to claim 1, wherein the obtaining an original risk variable set, initializing the original risk variable set to obtain a preprocessed risk variable set, specifically includes:
acquiring a plurality of objective risk information of the individual to be evaluated, constructing the original risk variable set, and setting target variables for the original risk variable set;
and cleaning data of the original risk variable set with the set target variable, and formatting to obtain the preprocessing risk variable set.
4. A survival risk assessment method according to any of claims 1 or 3, wherein the plurality of machine learning algorithms comprise an XGBoost algorithm, a random forest algorithm and a GBDT algorithm.
5. A survival risk assessment system, comprising:
the acquisition module is used for acquiring the screening risk variable set and the screening risk variable model coefficient; the screening risk variable set is obtained by an initial risk variable set based on a COX proportional risk regression model and a gradual backward algorithm; the screening risk variable model coefficient is obtained based on a survival risk prediction model;
the processing module is used for constructing a survival risk assessment scale based on the screening risk variable set and the screening risk variable model coefficient;
the evaluation module is used for inputting a plurality of information data of the individual to be evaluated into the survival risk evaluation scale to obtain a survival risk evaluation quantitative value and an individual survival risk grade of the individual to be evaluated;
the acquiring the screening risk variable set and the screening risk variable model coefficient specifically comprises the following steps:
an original risk variable set is obtained, and the original risk variable set is initialized to obtain a preprocessed risk variable set;
constructing a survival risk assessment database based on the pretreatment risk variable set;
deleting a plurality of variables with the deletion rate larger than a preset optimal deletion rate threshold value in the survival risk assessment database, and acquiring the variables with the preset association degree for supplementing to obtain an optimal risk variable set;
Screening the optimized risk variable sets by adopting a plurality of machine learning algorithms to obtain a plurality of variable sets;
solving intersections of the variable sets to obtain the initial risk variable set;
acquiring the COX proportional risk regression model, training the initial risk variable set based on the COX proportional risk regression model, and screening by combining the gradual backward algorithm to obtain the screening risk variable set;
based on the screening risk variable set, further constructing the survival risk prediction model, and inputting the initial risk variable set into the survival risk prediction model to obtain the screening risk variable model coefficient;
deleting a plurality of variables with the deletion rate larger than a preset optimal threshold value in the survival risk assessment database, and acquiring the plurality of variables with the preset association degree for supplementing to obtain an optimal risk variable set, wherein the method specifically comprises the following steps of:
setting a preset deletion rate range interval and a preset adjustment step length;
starting from the starting point of the preset deletion rate range interval, increasing according to the preset adjustment step length until the ending point of the preset deletion rate range interval is reached, so as to obtain a plurality of preset adjustment thresholds;
Deleting the variables in the preprocessing risk variable set which are larger than the preset adjustment thresholds to obtain a plurality of verification test sets;
verifying the verification test sets to obtain the preset optimal deletion rate threshold;
deleting the variables in the pretreatment risk variable set according to the preset optimal deletion rate threshold;
and acquiring the plurality of variables with the preset association degree by adopting a K nearest neighbor algorithm, and supplementing the plurality of deleted variables to obtain the optimized risk variable set.
6. The survival risk assessment system of claim 5, further comprising a verification module, the verification module being specifically configured to:
inputting a plurality of evaluation samples into the survival risk prediction model to obtain a plurality of survival risk prediction values;
dividing the plurality of survival risk prediction values into a plurality of survival risk grades;
comparing a plurality of survival curves corresponding to the plurality of survival risk grades through a preset verification algorithm to obtain a risk difference value;
and if the risk difference value meets a preset difference threshold condition, the classification of the plurality of survival risk classes is considered to be correct.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the survival risk assessment method according to any one of claims 1 to 4.
8. A non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the survival risk assessment method according to any one of claims 1 to 4.
CN201911019274.4A 2019-10-24 2019-10-24 Survival risk assessment method and system Active CN111243736B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911019274.4A CN111243736B (en) 2019-10-24 2019-10-24 Survival risk assessment method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911019274.4A CN111243736B (en) 2019-10-24 2019-10-24 Survival risk assessment method and system

Publications (2)

Publication Number Publication Date
CN111243736A CN111243736A (en) 2020-06-05
CN111243736B CN111243736B (en) 2023-09-01

Family

ID=70867869

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911019274.4A Active CN111243736B (en) 2019-10-24 2019-10-24 Survival risk assessment method and system

Country Status (1)

Country Link
CN (1) CN111243736B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111755126A (en) * 2020-06-28 2020-10-09 北京肿瘤医院(北京大学肿瘤医院) Immunotherapy curative effect prediction method and device
CN112017785B (en) * 2020-11-02 2021-02-05 平安科技(深圳)有限公司 Disease risk prediction system, method, device, equipment and medium
CN112614595A (en) * 2020-12-25 2021-04-06 联仁健康医疗大数据科技股份有限公司 Survival analysis model construction method and device, electronic terminal and storage medium
CN113177701A (en) * 2021-04-15 2021-07-27 国任财产保险股份有限公司 User credit assessment method and device
CN113284612B (en) * 2021-05-21 2024-04-16 大连海事大学 Survival analysis method based on XGBoost algorithm
CN115620902A (en) * 2021-07-15 2023-01-17 华为云计算技术有限公司 Method and device for predicting survival risk rate
CN114512240A (en) * 2022-02-08 2022-05-17 吾征智能技术(北京)有限公司 Gout prediction model system, equipment and storage medium
CN115471056B (en) * 2022-08-31 2023-05-23 鼎翰文化股份有限公司 Data transmission method and data transmission system
CN117094184B (en) * 2023-10-19 2024-01-26 上海数字治理研究院有限公司 Modeling method, system and medium of risk prediction model based on intranet platform

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106355208A (en) * 2016-08-31 2017-01-25 广州精点计算机科技有限公司 Data prediction analysis method based on COX model and random survival forest
CN107358047A (en) * 2017-07-13 2017-11-17 刘峰 Diabetic assesses and management system
CN107680680A (en) * 2017-09-07 2018-02-09 广州九九加健康管理有限公司 Cardiovascular and cerebrovascular disease method for prewarning risk and system based on accurate health control
CN109063418A (en) * 2018-07-19 2018-12-21 东软集团股份有限公司 Determination method, apparatus, equipment and the readable storage medium storing program for executing of disease forecasting classifier
CN109215781A (en) * 2018-09-14 2019-01-15 苏州贝斯派生物科技有限公司 A kind of construction method and building system of the Kawasaki disease risk evaluation model based on logistic algorithm
CN109243604A (en) * 2018-09-14 2019-01-18 苏州贝斯派生物科技有限公司 A kind of construction method and building system of the Kawasaki disease risk evaluation model based on neural network algorithm
CN109273093A (en) * 2018-09-14 2019-01-25 苏州贝斯派生物科技有限公司 A kind of construction method and building system of Kawasaki disease risk evaluation model

Also Published As

Publication number Publication date
CN111243736A (en) 2020-06-05

Similar Documents

Publication Publication Date Title
CN111243736B (en) Survival risk assessment method and system
US20220254493A1 (en) Chronic disease prediction system based on multi-task learning model
Kell et al. Evaluation of the prediction skill of stock assessment using hindcasting
CN113517066B (en) Depression assessment method and system based on candidate gene methylation sequencing and deep learning
CN110797101B (en) Medical data processing method, medical data processing device, readable storage medium and computer equipment
KR20220059120A (en) System for modeling automatically of machine learning with hyper-parameter optimization and method thereof
CN116959585B (en) Deep learning-based whole genome prediction method
Tiruneh et al. Feature selection for construction organizational competencies impacting performance
CN112464172A (en) Growth parameter active and passive remote sensing inversion method and device
CN115565669B (en) Cancer survival analysis method based on GAN and multitask learning
CN111524023A (en) Greenhouse adjusting method and system
Pan et al. Predicting times to event based on vine copula models
CN111598580A (en) XGboost algorithm-based block chain product detection method, system and device
CN116705310A (en) Data set construction method, device, equipment and medium for perioperative risk assessment
CN112116449A (en) Credit evaluation method, device, equipment and storage medium with good model interpretability
CN117373688B (en) Chronic disease data processing method, device, electronic equipment and storage medium
Spanou et al. Walleye (Sander vitreus, Mitchill 1818) age and sex classification using innovative supervised and unsupervised machine learning and soft computing methodologies
CN115831356B (en) Auxiliary prediction diagnosis method based on artificial intelligence algorithm
CN116070120B (en) Automatic identification method and system for multi-tag time sequence electrophysiological signals
CN113035363B (en) Probability density weighted genetic metabolic disease screening data mixed sampling method
CN117292796A (en) Construction method, device, equipment and storage medium of disease diagnosis standard library
Zhang On the Genetic Regulation of Bayesian Networks
CN118052371A (en) Method and device for analyzing operation condition of electric power marketing field device
CN117976185A (en) Breast cancer risk assessment method and system combining deep learning
CN117637167A (en) Method, device, equipment and medium for predicting risk value of postoperative complications of lung

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant