WO2022226843A1 - System for predicting acute pancreatitis-induced organ failure, and computer device - Google Patents

Info

Publication number
WO2022226843A1
Authority
WO
WIPO (PCT)
Prior art keywords
time
patient
feature
information
gate
Prior art date
Application number
PCT/CN2021/090728
Other languages
French (fr)
Chinese (zh)
Inventor
兰蓝
Original Assignee
四川大学华西医院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 四川大学华西医院 filed Critical 四川大学华西医院
Publication of WO2022226843A1 publication Critical patent/WO2022226843A1/en

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/044: Recurrent networks, e.g. Hopfield networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00: ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60: ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • the invention relates to the field of neural networks, in particular to a computer device and system for predicting organ failure in acute pancreatitis.
  • for patient n, the per-feature vectors hold, respectively: the values of each feature variable; the acquisition times of each feature variable; indicators of whether each feature variable was missing at collection; and the time intervals between collections of each feature variable. x′_n^1, x′_n^2, …, x′_n^D are, respectively, the vectors of each feature variable's value at the acquisition time point preceding a given acquisition time point;
  • the remaining symbols denote the mean vector of the features of patient n and the time interval between collection points; the subscript j is the row number of a patient's feature matrix, and the per-entry symbols give the specific time of a feature in a row and whether a feature in a row is missing;
  • x_j′ is the feature value at the time point preceding the j-th time point. One decay rate applies to x_j′ and another to h_{j-1}; both are written uniformly as Γ_j. W_Γ is the weight, b_Γ is the offset, and Δ_j is the interval-time information at time s_j;
  • i represents the input gate
  • f represents the forget gate
  • c represents the cell state
  • o represents the output gate
  • σ represents the sigmoid activation function
  • W represents the weight
  • b represents the offset
  • the subscripts of W and b indicate which gate or state each weight and offset belongs to
  • the subscript j represents the row number
  • ⊙ represents element-wise multiplication
  • x_j represents the feature values in row j.
  • the decay applied by Γ_j to h_{j-1}, the calculation of the time gate, and the influence of the time gate on the cell state and the hidden layer are given by formulas (15) to (21),
  • where the symbols denote: the intermediate quantity of the hidden-layer update; the hidden layer obtained after the time-gate weighted summation; the intermediate quantity of the cell-state update; c_j, the updated cell state; h_j, the updated hidden layer; the decay rate applied to h_{j-1}; k_j, the time gate; W_s, b_s and C_s, the parameters of the periodic function; L, the objective function of the model; N, the number of samples used in each iteration; T_n, the total number of measurements of a patient; y_nj, the outcome of a patient at a given time; and p_nj, the predicted probability of that outcome.
  • the hyperparameter information includes the number of neurons and the number of hidden layers.
  • the present invention also provides a computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the above-mentioned prediction model when the processor executes the program.
  • the present invention also provides a prediction system for acute pancreatitis-induced organ failure, comprising:
  • by introducing the Decay mechanism, the present invention improves on earlier methods of filling the missing values that arise from asynchronously sampled variables; the decay rate γ obtained through model training brings the interpolation closer to the real situation.
  • by introducing the time gate, time information is fully utilized, so that time-sensitive prediction tasks can achieve higher accuracy.
  • the present invention can include as much of the patient's electronic record information as possible, increasing the decision-making ability of the model.
  • Fig. 1 is a model conceptual diagram of the present invention
  • FIG. 2 is a technical detail diagram of the present invention.
  • this embodiment integrates the patient's medication information, laboratory inspection information, electronic medical record information, and radiological system inspection information after admission, and organizes them into structured data.
  • the time node information of each event is retained, namely {Variables, Time};
  • the time gate output is calculated according to the interval from the patient's admission to the time node of an event, and the output result of the time gate is used to speed up the model training process.
  • the number of neurons in the output layer is 2, and the softmax function is used as the activation function.
  • the loss function adopts the cross-entropy function. Please refer to Figure 2 for specific technical details.
  • the development language is Python 3.5, and the packages used include numpy and pytorch.
  • in formulas (1)-(6), the set symbol denotes real matrices with T_n rows and D columns, where R denotes the real numbers and n is the index of a patient. Because each patient's matrix may have a different number of rows, the row count is written T_n; the number of columns must be the same for all patients, so D carries no subscript.
  • X_n represents the feature matrix of patient n
  • S_n represents the matrix of acquisition times of each feature of patient n
  • M_n represents the indicator matrix of whether each feature of patient n is missing
  • Δ_n represents the matrix of time intervals between acquisitions of each feature of patient n
  • X′_n represents the matrix of each feature of patient n at the acquisition time point preceding a given acquisition time point.
  • for patient n, the per-feature vectors hold, respectively: the values of each feature variable; the acquisition times of each feature variable; indicators of whether each feature variable was missing at collection; and the time intervals between collections of each feature variable. x′_n^1, x′_n^2, …, x′_n^D are, respectively, the vectors of each feature variable's value at the acquisition time point preceding a given acquisition time point.
  • a vector of the means of the individual features of patient n.
  • the value of feature d at time j is obtained by weighting the mean of feature d and the feature's value at the previous time point by the decay rate; the per-feature decay rates are the components of the vector Γ_j.
  • the original feature matrix X_n is thereby transformed into one without missing values.
  • one decay rate is applied to x_j′ and another to h_{j-1}.
  • both are calculated as in formula (8), differing only in the weight W_Γ and the offset b_Γ. For simplicity, no distinction is made below and both are written uniformly as Γ_j.
  • in formulas (15)-(21), the symbols denote the intermediate quantity of the hidden-layer update, the hidden layer obtained after the time-gate weighted summation, and the intermediate quantity of the cell-state update; c_j is the updated cell state.
  • Γ_j, the decay rate at time s_j, is calculated as a function of the interval-time information Δ_j; W_Γ is the weight and b_Γ is the offset. The other quantities are as defined above.
  • formulas (10)-(14) define the LSTM network structure, and formulas (15)-(20) describe the decay applied to h_{j-1}, the calculation of the time gate, and the effect of the time gate on the cell state and the hidden layer.
  • Equation (21) is the loss function of the model, which is used for the error calculation of the forward process and the gradient calculation of the backward process in the model training.
  • the number of neurons, the number of hidden layers and other hyperparameter information are optimized by grid search method, and the parameter combination with the best performance on the validation set is selected as the final result of the model.
  • a computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the prediction model in Embodiment 1 when the processor executes the program.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

Disclosed in the present invention is a model for predicting acute pancreatitis-induced organ failure. The model comprises the following steps: S100, pre-processing patient information, recording events and time nodes as {Variables, Time}; S200, sorting the events in chronological order and filling in missing values by using a Decay mechanism; and S300, performing one-hot encoding on the data by using an Embedding mechanism, mapping the data into a real vector space, normalizing the data, and then inputting the data into a Phased LSTM model, wherein a time gate output is calculated according to the interval between the patient's admission to hospital and the time node of a given event, the model training process is accelerated by using the output of the time gate, the output layer has two neurons, and a softmax function is used as the activation function. The present invention can process heterogeneous multi-dimensional data and use time information flexibly; moreover, the model's judgment is closer to a description of the natural course of the disease in the real world.

Description

A computer device and system for prediction of organ failure in acute pancreatitis
Technical field
The invention relates to the field of neural networks, in particular to a computer device and system for predicting organ failure in acute pancreatitis.
Background
Acute pancreatitis is an inflammatory reaction involving autodigestion, edema, hemorrhage and even necrosis of pancreatic tissue; it arises when pancreatic enzymes are activated within the pancreas, which can have various causes. Clinically, it is characterized by acute epigastric pain, nausea, vomiting, fever, and elevated blood pancreatic enzymes. Its severity varies. Mild cases, which are the most common clinically, mainly involve pancreatic edema; the disease is often self-limiting with a good prognosis and is called mild acute pancreatitis. In a small number of severe cases the pancreas undergoes hemorrhage and necrosis, often followed by secondary infection, peritonitis and shock, with a high fatality rate; this is called severe acute pancreatitis. Clinical pathology commonly divides acute pancreatitis into two types: edematous and hemorrhagic-necrotic.
Severe acute pancreatitis may further lead to organ failure. Once organ failure occurs, treating the patient becomes very difficult. It is therefore particularly important to find a way to predict organ failure caused by acute pancreatitis in advance, so that early intervention or prevention is possible.
Chinese invention patent application CN202010827820.3, "A prognostic marker of acute pancreatitis, a prognostic prediction model of acute pancreatitis and its application", provides a method that gives a prognosis for acute pancreatitis by detecting markers in the patient's serum exosomes, making it possible to predict whether conditions such as organ failure will occur. However, detecting markers in serum exosomes requires laborious testing and test kits, so the cost is high and patients bear a heavy financial burden. In addition, because of the laborious testing and its cost, the method can be performed only when the doctor or patient considers it necessary; it cannot continuously monitor the patient's condition or predict in real time. As a result, when the patient's condition deteriorates and the likelihood of organ failure rises, the doctor may not notice in time.
Summary of the invention
In view of the defects of the prior art, the present invention provides a computer device and system for predicting organ failure in acute pancreatitis. The aim is to process heterogeneous multi-dimensional data and use time information flexibly, while bringing the judgment of the organ failure prediction model closer to a description of the natural course of the disease in the real world.
A prediction model for acute pancreatitis-induced organ failure, comprising the following steps:
S100. Pre-process patient information, recording events and time nodes as {Variables, Time};
S200. Sort the events in chronological order and fill in missing values using the Decay mechanism;
S300. One-hot encode the data using the Embedding mechanism, map it into a real vector space, normalize the data, and input it into the Phased LSTM model,
wherein the time gate output is calculated from the interval between the patient's admission and the time node of a given event, the output of the time gate is used to speed up model training, the output layer has 2 neurons, and the softmax function is used as the activation function.
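As a rough illustration of steps S100 and S200, the sketch below sorts {Variables, Time} events chronologically and pivots them into a value matrix and a mask matrix. The `Event` record, its `value` field, the variable names, and the convention that a mask entry of 1 means "observed" are all assumptions made for this example; the source does not specify them.

```python
from collections import namedtuple

# Hypothetical record type mirroring the {Variables, Time} pairs of step S100;
# the explicit "value" field is an assumption for illustration.
Event = namedtuple("Event", ["variable", "time", "value"])

def build_matrices(events, variables):
    """Sort events by time (step S200) and pivot them into one row per time
    point: X holds values (None where unobserved), M is 1 where observed."""
    ordered = sorted(events, key=lambda e: e.time)
    times = sorted({e.time for e in ordered})
    row_of = {t: j for j, t in enumerate(times)}
    col_of = {v: d for d, v in enumerate(variables)}
    X = [[None] * len(variables) for _ in times]
    M = [[0] * len(variables) for _ in times]
    for e in ordered:
        j, d = row_of[e.time], col_of[e.variable]
        X[j][d] = e.value
        M[j][d] = 1
    return times, X, M

# Made-up events, given as hours since admission.
events = [
    Event("lipase", 6.0, 410.0),
    Event("temperature", 0.0, 38.2),
    Event("lipase", 0.0, 520.0),
    Event("temperature", 12.0, 37.5),
]
times, X, M = build_matrices(events, ["temperature", "lipase"])
```

Each row of `X` corresponds to one time point and each column to one variable, so the `None` entries are exactly the positions a missing-value filling step would have to handle.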
Preferably, in step S200, the Decay mechanism is embodied in formulas (1) to (9). Its function is to introduce a decay rate
Figure PCTCN2021090728-appb-000001
that attenuates the observed value of each indicator at the previous time point in order to fill in the missing value at the current time point.
Figure PCTCN2021090728-appb-000002
Figure PCTCN2021090728-appb-000003
Figure PCTCN2021090728-appb-000004
Figure PCTCN2021090728-appb-000005
Figure PCTCN2021090728-appb-000006
where
Figure PCTCN2021090728-appb-000007
denotes a real matrix with T_n rows and D columns, R denotes the real numbers, and n is the index of a patient; because each patient's matrix may have a different number of rows, the row count is written T_n;
X_n is the feature matrix of patient n, S_n is the matrix of acquisition times of each feature of patient n, M_n is the indicator matrix of whether each feature of patient n is missing, Δ_n is the matrix of time intervals between acquisitions of each feature of patient n, and X′_n is the matrix of each feature of patient n at the acquisition time point preceding a given acquisition time point;
Figure PCTCN2021090728-appb-000008
are, respectively, the vectors of the individual feature variables of patient n;
Figure PCTCN2021090728-appb-000009
are, respectively, the vectors of acquisition times of each feature variable of patient n;
Figure PCTCN2021090728-appb-000010
are, respectively, the indicator vectors of whether each feature variable of patient n was missing at collection;
Figure PCTCN2021090728-appb-000011
are, respectively, the vectors of the time intervals between collections of each feature variable of patient n; x′_n^1, x′_n^2, …, x′_n^D are, respectively, the vectors of each feature variable of patient n at the acquisition time point preceding a given acquisition time point;
Figure PCTCN2021090728-appb-000012
Figure PCTCN2021090728-appb-000013
where
Figure PCTCN2021090728-appb-000014
denotes the mean vector of the features of patient n;
Figure PCTCN2021090728-appb-000015
denotes the time interval between collection points, the subscript j being the row number of a patient's feature matrix;
Figure PCTCN2021090728-appb-000016
denotes the specific time of a given feature in a given row; and
Figure PCTCN2021090728-appb-000017
indicates whether a given feature in a given row is missing;
$$\Gamma_j = \exp\{-\max(0,\; W_\Gamma \Delta_j + b_\Gamma)\} \tag{8}$$
Figure PCTCN2021090728-appb-000018
where
Figure PCTCN2021090728-appb-000019
is the decay rate applied to x_j′, x_j′ being the feature value at the time point preceding the j-th time point;
Figure PCTCN2021090728-appb-000020
is the decay rate applied to h_{j-1};
Figure PCTCN2021090728-appb-000021
and
Figure PCTCN2021090728-appb-000022
are written uniformly as Γ_j; W_Γ is the weight, b_Γ is the offset, and Δ_j is the interval-time information at time s_j;
Figure PCTCN2021090728-appb-000023
denotes the value of feature d at time j, obtained by weighting the mean of feature d
Figure PCTCN2021090728-appb-000024
and the feature value at the previous time point
Figure PCTCN2021090728-appb-000025
by the decay rate
Figure PCTCN2021090728-appb-000026
;
Figure PCTCN2021090728-appb-000027
is a component of the vector Γ_j;
Figure PCTCN2021090728-appb-000028
indicates whether the j-th measurement of the d-th variable is missing; and
Figure PCTCN2021090728-appb-000029
is the original value of feature d at time s_j.
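The decay-based filling described above can be sketched for a single feature as follows. Formula (8) is taken from the text; formula (9) survives only as an image, so the blend used in `impute` (decay-weighted mix of the previous value and the feature mean, applied only when the value is unobserved) is an assumption based on the verbal description. The scalar parameters `w` and `b` stand in for W_Γ and b_Γ and would be learned during training.

```python
import math

def decay_rate(delta, w, b):
    """Formula (8) for one feature: gamma = exp(-max(0, w * delta + b))."""
    return math.exp(-max(0.0, w * delta + b))

def impute(x, observed, x_prev, x_mean, delta, w, b):
    """Assumed form of formula (9): keep an observed value; otherwise blend
    the previous value and the feature mean, weighted by the decay rate."""
    if observed:
        return x
    g = decay_rate(delta, w, b)
    return g * x_prev + (1.0 - g) * x_mean

# Short gap since the last observation: the filled value stays near 520.
near = impute(None, False, x_prev=520.0, x_mean=300.0, delta=1.0, w=0.1, b=0.0)
# Long gap: the filled value drifts toward the feature mean of 300.
far = impute(None, False, x_prev=520.0, x_mean=300.0, delta=50.0, w=0.1, b=0.0)
```

Under this reading, the longer a feature has gone unobserved, the less the last measurement is trusted and the closer the filled value sits to the patient's mean for that feature.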
Preferably, in step S300, the network structure of the Phased LSTM model is defined by formulas (10) to (20):
$$i_j = \sigma(x_j W_{xi} + h_{j-1} W_{hi} + b_i) \tag{10}$$
$$f_j = \sigma(x_j W_{xf} + h_{j-1} W_{hf} + b_f) \tag{11}$$
$$c_j = f_j \odot c_{j-1} + i_j \odot \sigma(x_j W_{xc} + h_{j-1} W_{hc} + b_c) \tag{12}$$
$$o_j = \sigma(x_j W_{xo} + h_{j-1} W_{ho} + b_o) \tag{13}$$
$$h_j = o_j \odot \sigma(c_j) \tag{14}$$
where i is the input gate, f the forget gate, c the cell state, and o the output gate; σ is the sigmoid activation function; W denotes a weight and b an offset, whose subscripts indicate the gate or state they belong to; the subscript j is the row number; ⊙ denotes element-wise multiplication; and x_j is the feature values of row j.
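A minimal scalar sketch of one step of formulas (10)-(14) follows, with input and hidden size both equal to 1 so that weights are plain floats. It reads the first term of formula (12) as the forget gate f_j multiplying c_{j-1}, and the recurrent weight in formula (10) as W_hi; both readings are assumptions about the intended equations. The sketch keeps the sigmoid where the text writes σ, although many LSTM implementations use tanh for the candidate and output nonlinearities. The parameter names are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One scalar step of formulas (10)-(14); p maps names like 'W_xi' to floats."""
    i = sigmoid(x * p["W_xi"] + h_prev * p["W_hi"] + p["b_i"])  # (10) input gate
    f = sigmoid(x * p["W_xf"] + h_prev * p["W_hf"] + p["b_f"])  # (11) forget gate
    g = sigmoid(x * p["W_xc"] + h_prev * p["W_hc"] + p["b_c"])  # candidate term in (12)
    c = f * c_prev + i * g                                       # (12) cell state
    o = sigmoid(x * p["W_xo"] + h_prev * p["W_ho"] + p["b_o"])  # (13) output gate
    h = o * sigmoid(c)                                           # (14) hidden state
    return h, c

# All-zero parameters make every gate equal sigmoid(0) = 0.5.
params = {name: 0.0 for name in (
    "W_xi", "W_hi", "b_i", "W_xf", "W_hf", "b_f",
    "W_xc", "W_hc", "b_c", "W_xo", "W_ho", "b_o")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, p=params)
```

With every gate at 0.5, the new cell state is half the old state plus half the candidate, which makes the roles of the forget and input gates easy to see.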
Preferably, the decay applied by Γ_j to h_{j-1}, the calculation of the time gate, and the influence of the time gate on the cell state and the hidden layer are given by formulas (15) to (21):
Figure PCTCN2021090728-appb-000030
Figure PCTCN2021090728-appb-000031
Figure PCTCN2021090728-appb-000032
Figure PCTCN2021090728-appb-000033
Figure PCTCN2021090728-appb-000034
$$k_j = \sin(W_s S_j + b_s) + C_s \tag{20}$$
Figure PCTCN2021090728-appb-000035
where
Figure PCTCN2021090728-appb-000036
is the intermediate quantity of the hidden-layer update;
Figure PCTCN2021090728-appb-000037
is the hidden layer obtained after the time-gate weighted summation;
Figure PCTCN2021090728-appb-000038
is the intermediate quantity of the cell-state update; c_j is the updated cell state; h_j is the updated hidden layer;
Figure PCTCN2021090728-appb-000039
is the decay rate applied to h_{j-1}; k_j is the time gate; W_s, b_s and C_s are the parameters of the periodic function; L is the objective function of the model; N is the number of samples used in each iteration; T_n is the total number of measurements of a patient; y_nj is the outcome of a patient at a given time; and p_nj is the predicted probability of that outcome.
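The two pieces of this passage that can be read directly from the text are the time gate of formula (20) and the quantities entering the objective function. The sketch below evaluates formula (20) for a scalar time, and computes a mean binary cross-entropy over all patients and time points. The averaging and the binary form are assumptions, since formula (21) survives only as an image; how k_j enters formulas (15)-(19) is likewise not recoverable here and is not modelled.

```python
import math

def time_gate(s_j, w_s, b_s, c_s):
    """Formula (20) for a scalar time s_j: k_j = sin(w_s * s_j + b_s) + c_s."""
    return math.sin(w_s * s_j + b_s) + c_s

def cross_entropy(y, p, eps=1e-12):
    """Mean binary cross-entropy over all (patient, time) pairs.
    y and p are lists of per-patient lists: outcomes y_nj and probabilities p_nj."""
    total, count = 0.0, 0
    for y_n, p_n in zip(y, p):
        for y_nj, p_nj in zip(y_n, p_n):
            p_nj = min(max(p_nj, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= y_nj * math.log(p_nj) + (1 - y_nj) * math.log(1.0 - p_nj)
            count += 1
    return total / count

# Illustrative parameter values; in the model W_s, b_s and C_s are learned.
k = time_gate(s_j=0.0, w_s=1.0, b_s=math.pi / 2, c_s=1.0)
# Two patients with different numbers of measurements (T_1 = 2, T_2 = 1).
loss = cross_entropy([[1, 0], [0]], [[0.9, 0.2], [0.5]])
```

Patients may contribute different numbers of terms to the loss because T_n varies per patient, which is why the inner lists are allowed to differ in length.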
Preferably, the Phased LSTM model is trained by back-propagation using the Adam algorithm.
Preferably, the hyperparameters of the Phased LSTM model are tuned by grid search, and the parameter combination that performs best on the validation set is selected as the final model.
Preferably, the hyperparameters include the number of neurons and the number of hidden layers.
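Grid search over the two hyperparameters named in the text can be sketched as below. The candidate values and the `toy_validation_score` function are made up for illustration; in practice the scoring function would train the model with the given settings and return its performance on the validation set.

```python
from itertools import product

def grid_search(grid, evaluate):
    """Try every combination in `grid` and return the best (params, score).
    `evaluate` is a caller-supplied function returning a validation score
    where higher is better."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical search space over the hyperparameters named in the text.
grid = {"num_neurons": [32, 64, 128], "num_layers": [1, 2]}

def toy_validation_score(params):
    # Stand-in for training the model and scoring it on the validation set.
    return -abs(params["num_neurons"] - 64) - params["num_layers"]

best, score = grid_search(grid, toy_validation_score)
```

Because every combination is evaluated, the cost grows with the product of the candidate list lengths, so the grid is usually kept small.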
Preferably, the events include the patient's medication information after admission, laboratory examination information, electronic medical record information, and radiology system examination information.
The present invention also provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above prediction model when executing the program.
The present invention also provides a prediction system for acute pancreatitis-induced organ failure, comprising:
a server for storing patient information; and
the above computer device.
The model of the present invention requires no dedicated medical tests. It processes the data from the routine testing, treatment and medication of patients with acute pancreatitis, and the processed data can be used to predict the patient's risk of organ failure in a timely and accurate manner. The beneficial effects of the present invention include:
1. By introducing the Decay mechanism, the present invention improves on earlier methods of filling the missing values that arise from asynchronously sampled variables; the decay rate γ obtained through model training brings the interpolation closer to the real situation.
2. By introducing the time gate, time information is fully utilized, so that time-sensitive prediction tasks can achieve higher accuracy.
3. The present invention can include as much of the patient's electronic record information as possible, increasing the decision-making ability of the model.
Obviously, based on the above content and on common technical knowledge and customary means in the field, various other modifications, substitutions or changes can be made without departing from the basic technical idea of the present invention.
The above content of the present invention is described in further detail below through specific embodiments in the form of examples. This should not be understood as limiting the scope of the above subject matter to the following examples. All techniques implemented on the basis of the above content fall within the scope of the present invention.
Description of drawings
Fig. 1 is a conceptual diagram of the model of the present invention;
Fig. 2 is a diagram of the technical details of the present invention.
Detailed description
It should be noted that algorithms for steps such as data acquisition, transmission, storage and processing that are not specifically described in the embodiments, as well as hardware structures and circuit connections that are not specifically described, can all be implemented using what is already disclosed in the prior art.
Example 1: Prediction model of acute pancreatitis-induced organ failure
As shown in Figures 1 and 2, this embodiment integrates the patient's medication information after admission, laboratory examination information, electronic medical record information, radiology system examination information and so on, organizes it uniformly into structured data, and retains the time node information of each event, namely {Variables, Time};
The events are sorted in chronological order, and missing values are filled using the Decay mechanism. In the input layer of the network, categorical variables are one-hot encoded using the Embedding mechanism and then mapped into a real vector space of suitable dimension; numerical event values are fed in directly after normalization.
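The input-layer handling described above can be illustrated in a few lines: a categorical value is one-hot encoded and multiplied by an embedding table, which is equivalent to selecting one row of the table, and numeric values are rescaled. The table values, the category index and the choice of min-max scaling are assumptions for this sketch; the source says only that numeric values are normalized, and in a trained model the table entries would be learned.

```python
def one_hot(index, size):
    """Return a one-hot list with 1.0 at position `index`."""
    return [1.0 if k == index else 0.0 for k in range(size)]

def embed(index, table):
    """Multiply the one-hot vector by the embedding table (rows = categories).
    The result equals row `index` of the table."""
    v = one_hot(index, len(table))
    dim = len(table[0])
    return [sum(v[r] * table[r][c] for r in range(len(table))) for c in range(dim)]

def min_max(values):
    """Scale numeric values to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

# Hypothetical 3-category variable embedded in 2 dimensions.
table = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
vec = embed(1, table)
# Hypothetical temperature readings rescaled to the unit interval.
scaled = min_max([36.5, 38.2, 37.5])
```

Embedding a category this way gives it a dense vector of a chosen dimension instead of a sparse indicator, which keeps the input width manageable when a variable has many categories.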
The input layer is connected to the Phased LSTM layer, whose number of neurons is selected as a hyperparameter.
In the Phased LSTM layer, the time-gate output is computed from the interval between the patient's admission and the time node of a given event, and the time-gate output is used to accelerate model training. The output layer has 2 neurons and uses the softmax function as its activation function;
the two outputs represent the probabilities that the patient will or will not develop organ failure within the next 7 days. The loss function is the cross-entropy function; see Fig. 2 for the technical details.
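As a small illustration of the two-neuron softmax output, the sketch below converts two raw scores into the probabilities of developing and not developing organ failure. The function name and the ordering of the two outputs are assumptions; the source does not say which neuron corresponds to which outcome.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores from the two output neurons.
p_failure, p_no_failure = softmax([0.0, 0.0])
```

Because the two outputs always sum to 1, either one alone determines the predicted 7-day risk.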
In practice, the development language is Python 3.5, and the packages involved include NumPy and PyTorch. First, the patient's medication data, laboratory test data, temperature chart data and other information are organized into sequences of the form {Variables, Time},
and missing values in the raw data are filled using formulas (1)-(9),
Figure PCTCN2021090728-appb-000040
Figure PCTCN2021090728-appb-000041
Figure PCTCN2021090728-appb-000042
Figure PCTCN2021090728-appb-000043
Figure PCTCN2021090728-appb-000044
Figure PCTCN2021090728-appb-000045
Figure PCTCN2021090728-appb-000046
$$\Gamma_j = \exp\{-\max(0,\ W_\Gamma \Delta_j + b_\Gamma)\} \tag{8}$$
Figure PCTCN2021090728-appb-000047
In formulas (1)-(6),
Figure PCTCN2021090728-appb-000048
denotes a real matrix with $T_n$ rows and $D$ columns, where R is the symbol for the real numbers and $n$ is the index of a given patient. Because each patient's matrix may have a different number of rows, the row count is written $T_n$; the number of columns is the same for all patients, so $D$ carries no subscript. $X_n$ denotes the feature matrix of patient $n$, $S_n$ the matrix of acquisition times of each feature, $M_n$ the indicator matrix recording whether each feature is missing, $\Delta_n$ the matrix of time intervals between acquisitions of each feature, and $X'_n$ the matrix holding, for each acquisition time point, the feature values from the preceding acquisition time point.
Figure PCTCN2021090728-appb-000049
respectively denote the vectors of the individual feature variables of patient $n$;
Figure PCTCN2021090728-appb-000050
respectively denote the vectors of acquisition times of the individual feature variables of patient $n$;
Figure PCTCN2021090728-appb-000051
respectively denote the indicator vectors recording whether each feature variable of patient $n$ was missing at acquisition;
Figure PCTCN2021090728-appb-000052
respectively denote the vectors of acquisition time intervals of the individual feature variables of patient $n$; $x'^{1}_n, x'^{2}_n, \dots, x'^{D}_n$ respectively denote, for each feature variable of patient $n$, the vector of feature values at the acquisition time point preceding a given acquisition time point.
Figure PCTCN2021090728-appb-000053
denotes the vector of means of the individual features of patient $n$.
Formula (7) gives the specific calculation of the acquisition time interval
Figure PCTCN2021090728-appb-000054
where the subscript $j$ denotes the row number in a patient's feature matrix,
Figure PCTCN2021090728-appb-000055
denotes the specific time of a given feature in a given row, and
Figure PCTCN2021090728-appb-000056
denotes whether a given feature in a given row is missing.
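Formula (7) is available only as an image in this text, so the following sketch shows one common way to compute such intervals, the "time since last observation" convention used in decay-based imputation. It is an assumption, not a transcription of formula (7): the interval is 0 at the first row, equals the gap to the previous row if the feature was observed there, and otherwise accumulates. The mask convention (1 = observed, 0 = missing) is also assumed.

```python
def time_intervals(times, mask):
    """Time since the last observation for one feature column.

    times: ascending acquisition times s_1..s_T for the rows.
    mask:  1 if the feature was observed at that row, 0 if missing (assumed).
    """
    if not times:
        return []
    deltas = [0.0]  # no earlier observation exists at the first row
    for j in range(1, len(times)):
        gap = times[j] - times[j - 1]
        if mask[j - 1] == 1:
            deltas.append(gap)                  # previous row was observed
        else:
            deltas.append(gap + deltas[j - 1])  # keep accumulating the gap
    return deltas

# Feature observed at hours 0 and 4, missing at hours 1 and 3.
deltas = time_intervals([0.0, 1.0, 3.0, 4.0], [1, 0, 0, 1])
```

With this convention a long run of missing rows produces a growing interval, which in turn drives a smaller decay rate in formula (8).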
In formula (9),
Figure PCTCN2021090728-appb-000057
indicates that the value of feature $d$ at time $j$ is obtained from the mean vector
Figure PCTCN2021090728-appb-000058
and the feature value at the previous time point
Figure PCTCN2021090728-appb-000059
weighted by the decay rate
Figure PCTCN2021090728-appb-000060
Here
Figure PCTCN2021090728-appb-000061
is a component of the vector $\Gamma_j$. According to formula (9), the time information $S_n$, the missing-value indicator $M_n$ and the interval information $\Delta_n$ are used to transform the original feature matrix $X_n$ into one without missing values,
Figure PCTCN2021090728-appb-000062
There are two decay rates: one decays $x_{j'}$ and the other decays the hidden-layer vector $h_{j-1}$. To distinguish them,
Figure PCTCN2021090728-appb-000063
denotes the decay rate applied to $x_{j'}$ and
Figure PCTCN2021090728-appb-000064
denotes the decay rate applied to $h_{j-1}$. Both are computed as shown in formula (8), differing only in the weight $W_\Gamma$ and the offset $b_\Gamma$. For simplicity, the text below does not distinguish between
Figure PCTCN2021090728-appb-000065
and
Figure PCTCN2021090728-appb-000066
and writes both as $\Gamma_j$.
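The decay-based imputation can be sketched for a single feature column as follows. `decay_rate` follows formula (8) for scalar inputs. The blend in `impute_column` is an assumed reading of formula (9), which is only available as an image here: an observed value is kept, and a missing value is replaced by the last observed value and the feature mean, weighted by the decay rate. The mask convention (1 = observed) and the use of the mean before any observation exists are also assumptions.

```python
import math

def decay_rate(delta, weight, bias):
    """Formula (8) for a scalar interval: exp(-max(0, w * delta + b))."""
    return math.exp(-max(0.0, weight * delta + bias))

def impute_column(values, mask, deltas, mean, weight, bias):
    """Fill the missing entries of one feature column.

    values: raw values (None where missing).
    mask:   1 if observed, 0 if missing (assumed convention).
    deltas: time since the last observation at each row.
    mean:   mean of the feature, used as the long-run fallback.
    """
    filled = []
    last = mean  # fallback before the first observation (assumption)
    for x, m, delta in zip(values, mask, deltas):
        if m == 1:
            filled.append(x)
            last = x
        else:
            gamma = decay_rate(delta, weight, bias)
            filled.append(gamma * last + (1.0 - gamma) * mean)
    return filled

# With zero weight and bias the decay rate is 1, so gaps repeat the last value.
filled = impute_column(
    [10.0, None, None, 4.0], [1, 0, 0, 1], [0.0, 1.0, 3.0, 4.0],
    mean=6.0, weight=0.0, bias=0.0,
)
```

In the model the weight and bias are learned, so the network itself decides how quickly a stale measurement should fade toward the mean.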
The core forward computation of the Phased LSTM follows formulas (10)-(21),
$$i_j = \sigma(x_j W_{xi} + h_{j-1} W_{hi} + b_i) \tag{10}$$
$$f_j = \sigma(x_j W_{xf} + h_{j-1} W_{hf} + b_f) \tag{11}$$
$$c_j = f_j \odot c_{j-1} + i_j \odot \sigma(x_j W_{xc} + h_{j-1} W_{hc} + b_c) \tag{12}$$
$$o_j = \sigma(x_j W_{xo} + h_{j-1} W_{ho} + b_o) \tag{13}$$
$$h_j = o_j \odot \sigma(c_j) \tag{14}$$
Figure PCTCN2021090728-appb-000067
Figure PCTCN2021090728-appb-000068
Figure PCTCN2021090728-appb-000069
Figure PCTCN2021090728-appb-000070
Figure PCTCN2021090728-appb-000071
$$k_j = \sin(W_s S_j + b_s) + C_s \tag{20}$$
Figure PCTCN2021090728-appb-000072
Formulas (10)-(14) are the formulas of the LSTM model, where $i$ denotes the input gate, $f$ the forget gate, $c$ the cell state and $o$ the output gate; $\sigma$ denotes the sigmoid activation function, $W$ a weight and $b$ an offset, with subscripts indicating which gate each weight or offset belongs to. $\odot$ denotes the element-wise product, and the subscript $j$ denotes the row number. Note that in formulas (10)-(14), $x_j$ is equivalent to the vector composed of the features computed by formula (9),
Figure PCTCN2021090728-appb-000073
Each row number $j$ of the feature matrix has a corresponding time $s_j$, and $x_j$ denotes the patient's feature vector at time $s_j$ after imputation, with no missing values.
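A single step of formulas (10)-(14) can be sketched in plain Python with lists standing in for row vectors. Two points are interpretations rather than facts from the source: the first term of formula (12) is read as the forget gate $f_j$ applied to $c_{j-1}$, and every $\sigma$ is taken to be the sigmoid, as the text states (many LSTM implementations use tanh for the candidate and output transforms instead). The `params` layout and helper names are hypothetical.

```python
import math

def sigmoid(z):
    """Sigmoid that avoids overflow for large negative inputs."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def affine(x, h, w_x, w_h, b):
    """Compute x W_x + h W_h + b for row vectors x and h."""
    out = []
    for k in range(len(b)):
        total = b[k]
        total += sum(x[i] * w_x[i][k] for i in range(len(x)))
        total += sum(h[i] * w_h[i][k] for i in range(len(h)))
        out.append(total)
    return out

def lstm_step(x, h_prev, c_prev, params):
    """One step of formulas (10)-(14).

    params maps each of "i", "f", "c", "o" to a tuple (W_x, W_h, b).
    """
    act = {
        name: [sigmoid(z) for z in affine(x, h_prev, *params[name])]
        for name in ("i", "f", "c", "o")
    }
    # (12): keep part of the old cell state, add the gated candidate.
    c = [f * cp + i * g
         for f, cp, i, g in zip(act["f"], c_prev, act["i"], act["c"])]
    # (14): the output gate scales the squashed cell state.
    h = [o * sigmoid(ck) for o, ck in zip(act["o"], c)]
    return h, c

# Two input features, one hidden unit, all weights zero: every gate is 0.5.
zero = ([[0.0], [0.0]], [[0.0]], [0.0])
params = {name: zero for name in ("i", "f", "c", "o")}
h, c = lstm_step([1.0, 2.0], [0.0], [2.0], params)
```

With all gates at 0.5 the new cell state is 0.5 * 2.0 + 0.5 * 0.5 = 1.25, which makes the forget and input paths easy to check by hand.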
In formulas (15)-(21),
Figure PCTCN2021090728-appb-000074
denotes the intermediate quantity for the hidden-layer update,
Figure PCTCN2021090728-appb-000075
denotes the hidden layer obtained after the time-gate weighted sum,
Figure PCTCN2021090728-appb-000076
denotes the intermediate quantity for the cell-state update, and $c_j$ denotes the updated cell state. $\Gamma_j$ denotes the decay rate at time $s_j$, computed as a function of the interval information $\Delta_j$ at time $s_j$, where $W_\Gamma$ is the weight and $b_\Gamma$ the offset. The other quantities are as defined above. $k_j$ denotes the time gate, computed by the periodic function $k_j = \sin(W_s S_j + b_s) + C_s$; $W_s$, $b_s$ and $C_s$ are the parameters of the periodic function and are estimated by gradient descent during backpropagation. Formula (21) is the objective function of the model: $N$ is the number of samples used in each iteration, $T_n$ is the total number of measurements for a patient, $y_{nj}$ is the patient's outcome at a given time, and $p_{nj}$ is the predicted probability of that outcome at that time.
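The time gate of formula (20) and its effect on the state can be sketched as below. `time_gate` follows formula (20) for a scalar time stamp. `gated_blend` is an assumption: formulas (15)-(19) are only available as images, and the text describes the result as a time-gate weighted sum, so the sketch uses the usual Phased LSTM form in which the gate interpolates between the proposed state and the previous state.

```python
import math

def time_gate(s_j, w_s, b_s, c_s):
    """Formula (20) for a scalar time stamp: k_j = sin(w_s * s_j + b_s) + c_s."""
    return math.sin(w_s * s_j + b_s) + c_s

def gated_blend(k, proposed, previous):
    """Weighted sum k * proposed + (1 - k) * previous (assumed form)."""
    return [k * new + (1.0 - k) * old for new, old in zip(proposed, previous)]

# Hypothetical parameter values chosen so the gate is exactly half open.
k = time_gate(0.0, w_s=1.0, b_s=0.0, c_s=0.5)
c_new = gated_blend(k, proposed=[2.0], previous=[0.0])
```

A gate of 0 leaves the previous state untouched and a gate of 1 accepts the proposed update in full. Since $\sin(\cdot) + C_s$ is not confined to [0, 1], the learned parameters determine whether the blend stays a convex combination; the source does not state any constraint.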
Formulas (10)-(14) define the LSTM network structure, while formulas (15)-(20) describe the decay applied by
Figure PCTCN2021090728-appb-000077
to $h_{j-1}$, the computation of the time gate, and the effect of the time gate on the cell state and the hidden layer. Formula (21) is the loss function of the model, used during training for the error calculation in the forward pass and the gradient calculation in the backward pass.
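Formula (21) is only available as an image here, so the following is a sketch of a standard binary cross-entropy summed over each patient's $T_n$ time steps and averaged over the $N$ patients in a batch. The normalization by $N$, the clipping constant `eps`, and the convention that each probability is the predicted probability of the positive outcome are assumptions.

```python
import math

def sequence_cross_entropy(labels, probabilities, eps=1e-12):
    """Cross-entropy over variable-length patient sequences.

    labels[n][j]        is the outcome y_nj in {0, 1}.
    probabilities[n][j] is the predicted probability p_nj of y_nj = 1.
    """
    total = 0.0
    for y_n, p_n in zip(labels, probabilities):
        for y, p in zip(y_n, p_n):
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

# Two patients with different numbers of measurements (T_1 = 2, T_2 = 1).
loss = sequence_cross_entropy([[1, 0], [1]], [[0.5, 0.5], [0.5]])
```

Each of the three time steps contributes log 2 here, so the batch loss is 1.5 log 2.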
The Adam algorithm is used as the optimizer for the backpropagation solution.
Hyperparameters such as the number of neurons and the number of hidden layers are tuned by grid search, and the parameter combination that performs best on the validation set is selected for the final model.
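The grid search can be sketched as an exhaustive loop over candidate combinations. The candidate values and the `toy_validation_score` function are hypothetical stand-ins: in practice the scoring function would train the model with the given hyperparameters and return its performance on the validation set.

```python
from itertools import product

def grid_search(grid, evaluate):
    """Return (best_params, best_score), maximizing evaluate(params) over the grid."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_validation_score(params):
    """Stand-in for 'train, then score on the validation set' (hypothetical)."""
    return -abs(params["hidden_units"] - 64) - 10 * abs(params["num_layers"] - 2)

# Candidate values are illustrative, not taken from the source.
grid = {"hidden_units": [32, 64, 128], "num_layers": [1, 2, 3]}
best_params, best_score = grid_search(grid, toy_validation_score)
```

The cost grows with the product of the candidate list lengths, so the grid is usually kept small when each evaluation means training a recurrent network.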
Example 2: Prediction system for acute pancreatitis-induced organ failure
This embodiment provides a prediction system for acute pancreatitis-induced organ failure, comprising a server and a computer device connected through a data interface.
The server is used to store patient information;
The computer device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor; the processor implements the prediction model of Example 1 when executing the program.
As the above embodiments show, the model of the present invention requires no dedicated medical tests: it processes the routine testing, treatment and medication data of patients with acute pancreatitis, and the processed data can be used to predict a patient's risk of organ failure in a timely and accurate manner.

Claims (10)

  1. A prediction model for acute pancreatitis-induced organ failure, characterized by comprising the following steps:
    S100: preprocessing patient information, using {Variables, Time} to record events and their time nodes;
    S200: sorting the events in chronological order, and filling missing values using the Decay mechanism;
    S300: one-hot encoding the data using the Embedding mechanism, mapping it into a real vector space, normalizing the data and inputting it into the Phased LSTM model,
    wherein the time-gate output is computed from the interval between the patient's admission and the time node of a given event, the time-gate output is used to accelerate model training, the output layer has 2 neurons, and the softmax function is used as the activation function.
  2. The prediction model according to claim 1, characterized in that: in step S200, the Decay mechanism is embodied in formulas (1) to (9), and its function is to introduce a decay rate
    Figure PCTCN2021090728-appb-100001
    that decays the observed value of each indicator at the previous time point, thereby filling the missing value at the current time point,
    Figure PCTCN2021090728-appb-100002
    Figure PCTCN2021090728-appb-100003
    Figure PCTCN2021090728-appb-100004
    Figure PCTCN2021090728-appb-100005
    Figure PCTCN2021090728-appb-100006
    wherein
    Figure PCTCN2021090728-appb-100007
    denotes a real matrix with $T_n$ rows and $D$ columns, R is the symbol for the real numbers, and $n$ is the index of a given patient; because each patient's matrix may have a different number of rows, the row count is written $T_n$;
    $X_n$ denotes the feature matrix of patient $n$, $S_n$ the matrix of acquisition times of each feature of patient $n$, $M_n$ the indicator matrix recording whether each feature of patient $n$ is missing, $\Delta_n$ the matrix of time intervals between acquisitions of each feature of patient $n$, and $X'_n$ the matrix holding, for each acquisition time point, the feature values of patient $n$ from the preceding acquisition time point;
    Figure PCTCN2021090728-appb-100008
    respectively denote the vectors of the individual feature variables of patient $n$;
    Figure PCTCN2021090728-appb-100009
    respectively denote the vectors of acquisition times of the individual feature variables of patient $n$;
    Figure PCTCN2021090728-appb-100010
    respectively denote the indicator vectors recording whether each feature variable of patient $n$ was missing at acquisition;
    Figure PCTCN2021090728-appb-100011
    respectively denote the vectors of acquisition time intervals of the individual feature variables of patient $n$;
    Figure PCTCN2021090728-appb-100012
    respectively denote, for each feature variable of patient $n$, the vector of feature values at the acquisition time point preceding a given acquisition time point;
    Figure PCTCN2021090728-appb-100013
    Figure PCTCN2021090728-appb-100014
    wherein
    Figure PCTCN2021090728-appb-100015
    denotes the vector of means of the individual features of patient $n$;
    Figure PCTCN2021090728-appb-100016
    denotes the acquisition time interval, the subscript $j$ denotes the row number in a patient's feature matrix,
    Figure PCTCN2021090728-appb-100017
    denotes the specific time of a given feature in a given row, and
    Figure PCTCN2021090728-appb-100018
    denotes whether a given feature in a given row is missing;
    $$\Gamma_j = \exp\{-\max(0,\ W_\Gamma \Delta_j + b_\Gamma)\} \tag{8}$$
    Figure PCTCN2021090728-appb-100019
    wherein
    Figure PCTCN2021090728-appb-100020
    denotes the decay rate applied to $x_{j'}$, where $x_{j'}$ denotes the feature value at the time point preceding the $j$-th time point,
    Figure PCTCN2021090728-appb-100021
    denotes the decay rate applied to $h_{j-1}$, where $h_{j-1}$ denotes the hidden-layer state at the $(j-1)$-th time point,
    Figure PCTCN2021090728-appb-100022
    and
    Figure PCTCN2021090728-appb-100023
    are both written as $\Gamma_j$, $W_\Gamma$ is the weight, $b_\Gamma$ is the offset, and $\Delta_j$ is the interval information at time $s_j$;
    Figure PCTCN2021090728-appb-100024
    denotes the value of feature $d$ at time $s_j$, obtained from the mean of feature $d$
    Figure PCTCN2021090728-appb-100025
    and the feature value at the previous time point
    Figure PCTCN2021090728-appb-100026
    weighted by the decay rate
    Figure PCTCN2021090728-appb-100027
    where
    Figure PCTCN2021090728-appb-100028
    is a component of the vector $\Gamma_j$,
    Figure PCTCN2021090728-appb-100029
    indicates whether the $j$-th measurement of the $d$-th variable is missing, and
    Figure PCTCN2021090728-appb-100030
    is the original value of feature $d$ at time $s_j$.
  3. The prediction model according to claim 1, characterized in that: in step S300, the network structure of the Phased LSTM model is defined by formulas (10) to (20),
    $$i_j = \sigma(x_j W_{xi} + h_{j-1} W_{hi} + b_i) \tag{10}$$
    $$f_j = \sigma(x_j W_{xf} + h_{j-1} W_{hf} + b_f) \tag{11}$$
    $$c_j = f_j \odot c_{j-1} + i_j \odot \sigma(x_j W_{xc} + h_{j-1} W_{hc} + b_c) \tag{12}$$
    $$o_j = \sigma(x_j W_{xo} + h_{j-1} W_{ho} + b_o) \tag{13}$$
    $$h_j = o_j \odot \sigma(c_j) \tag{14}$$
    wherein $i$ denotes the input gate, $f$ the forget gate, $c$ the cell state and $o$ the output gate, $\sigma$ denotes the sigmoid activation function, $W$ denotes a weight and $b$ an offset, with subscripts indicating which gate each weight or offset belongs to, the subscript $j$ denotes the row number, $\odot$ denotes the element-wise product, and $x_j$ denotes the feature values of row $j$.
  4. The prediction model according to claim 3, characterized in that: the decay applied by $\Gamma_j$ to $h_{j-1}$, the computation of the time gate, and the effect of the time gate on the cell state and the hidden layer are given by formulas (15) to (21),
    Figure PCTCN2021090728-appb-100031
    Figure PCTCN2021090728-appb-100032
    Figure PCTCN2021090728-appb-100033
    Figure PCTCN2021090728-appb-100034
    Figure PCTCN2021090728-appb-100035
    $$k_j = \sin(W_s S_j + b_s) + C_s \tag{20}$$
    Figure PCTCN2021090728-appb-100036
    wherein
    Figure PCTCN2021090728-appb-100037
    denotes the intermediate quantity for the hidden-layer update,
    Figure PCTCN2021090728-appb-100038
    denotes the hidden layer obtained after the time-gate weighted sum,
    Figure PCTCN2021090728-appb-100039
    denotes the intermediate quantity for the cell-state update, $c_j$ denotes the updated cell state, $h_j$ is the updated hidden layer,
    Figure PCTCN2021090728-appb-100040
    denotes the decay rate applied to $h_{j-1}$, $k_j$ denotes the time gate, $W_s$, $b_s$ and $C_s$ are the parameters of the periodic function, $L$ is the objective function of the model, $N$ is the number of samples used in each iteration, $T_n$ is the total number of measurements for a patient, $y_{nj}$ is the patient's outcome at a given time, and $p_{nj}$ is the predicted probability of that outcome at that time.
  5. The prediction model according to claim 3 or 4, characterized in that: the Phased LSTM model is solved by backpropagation using the Adam algorithm.
  6. The prediction model according to claim 5, characterized in that: the hyperparameter information of the Phased LSTM model is tuned by grid search, and the parameter combination that performs best on the validation set is selected as the final result of the model.
  7. The prediction model according to claim 1, characterized in that: the hyperparameter information includes the number of neurons and the number of hidden layers.
  8. The prediction model according to claim 1, characterized in that: the events include the patient's post-admission medication information, laboratory test information, electronic medical record information and radiology system examination information.
  9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the prediction model according to any one of claims 1-8 when executing the program.
  10. A prediction system for acute pancreatitis-induced organ failure, comprising:
    a server for storing patient information; and
    the computer device according to claim 9.
PCT/CN2021/090728 2021-04-26 2021-04-28 System for predicting acute pancreatitis-induced organ failure, and computer device WO2022226843A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110454703.1 2021-04-26
CN202110454703.1A CN112967816B (en) 2021-04-26 2021-04-26 Acute pancreatitis organ failure prediction method, computer equipment and system

Publications (1)

Publication Number Publication Date
WO2022226843A1 true WO2022226843A1 (en) 2022-11-03

Family

ID=76281240

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/090728 WO2022226843A1 (en) 2021-04-26 2021-04-28 System for predicting acute pancreatitis-induced organ failure, and computer device

Country Status (2)

Country Link
CN (1) CN112967816B (en)
WO (1) WO2022226843A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113951845B (en) * 2021-12-01 2022-08-05 中国人民解放军总医院第一医学中心 Method and system for predicting severe blood loss and injury condition of wound
CN113903460A (en) * 2021-12-10 2022-01-07 中国医学科学院北京协和医院 System for predicting severe acute pancreatitis and application thereof

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111081377A (en) * 2020-01-16 2020-04-28 四川大学 Necrotic acute pancreatitis patient operation time prediction model
CN111243752A (en) * 2020-01-16 2020-06-05 四川大学华西医院 Prediction model for acute pancreatitis induced organ failure
KR102225278B1 (en) * 2020-01-31 2021-03-10 주식회사 스탠다임 Prediction Method for Disease, Gene or Protein related Query Entity and built Prediction System using the same

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SE535502C2 (en) * 2010-09-14 2012-08-28 Calmark Sweden Ab System and method for analyzing risk or occurrence of organ failure
KR20190115330A (en) * 2018-04-02 2019-10-11 주식회사 씨씨앤아이리서치 An application for predicting an acute exacerbation of chronic respiratory disease
KR102310490B1 (en) * 2018-04-27 2021-10-08 한국과학기술원 The design of GRU-based cell structure robust to missing value and noise of time-series data in recurrent neural network
CN109493933B (en) * 2018-08-08 2022-04-05 浙江大学 Attention mechanism-based adverse cardiovascular event prediction device
EP3620983B1 (en) * 2018-09-05 2023-10-25 Sartorius Stedim Data Analytics AB Computer-implemented method, computer program product and system for data analysis
GB201820004D0 (en) * 2018-12-07 2019-01-23 Univ Oxford Innovation Ltd Method and data processing apparatus for generating real-time alerts about a patient
CN109659033B (en) * 2018-12-18 2021-04-13 浙江大学 Chronic disease state of an illness change event prediction device based on recurrent neural network
CN110289096B (en) * 2019-06-28 2021-12-07 电子科技大学 ICU (intensive Care Unit) intra-hospital mortality prediction method based on deep learning
CN110881969A (en) * 2019-11-27 2020-03-17 太原理工大学 Stacking ensemble learning-based heart failure early warning method
CN111581339B (en) * 2020-04-09 2021-11-12 天津大学 Method for extracting gene events of biomedical literature based on tree-shaped LSTM
CN112420201B (en) * 2020-11-25 2022-09-30 哈尔滨工业大学 Deep cascading framework for ICU mortality prediction

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116298947A (en) * 2023-03-07 2023-06-23 中国铁塔股份有限公司黑龙江省分公司 Storage battery nuclear capacity monitoring device
CN116298947B (en) * 2023-03-07 2023-11-03 中国铁塔股份有限公司黑龙江省分公司 Storage battery nuclear capacity monitoring device

Also Published As

Publication number Publication date
CN112967816A (en) 2021-06-15
CN112967816B (en) 2023-08-15

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21938328

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21938328

Country of ref document: EP

Kind code of ref document: A1