CN113808693A

CN113808693A - Medicine recommendation method based on graph neural network and attention mechanism

Info

Publication number: CN113808693A
Application number: CN202111061579.9A
Authority: CN
Inventors: 万健; 岳魏琦; 张蕾; 洪高枫; 郑慧琳; 史斌彬
Original assignee: Zhejiang Lover Health Science and Technology Development Co Ltd
Current assignee: Zhejiang Lover Health Science and Technology Development Co Ltd; Zhejiang University of Science and Technology ZUST
Priority date: 2021-09-10
Filing date: 2021-09-10
Publication date: 2021-12-17

Abstract

The invention discloses a medicine recommendation method based on a graph neural network and an attention mechanism. The structured features of each patient's visits and medication information are taken as nodes, and a graph neural network captures the relationships among them to learn high-order features that contain medical ontology knowledge. In addition, an attention mechanism is used to better model the patient's historical medical records, and drug interaction knowledge is introduced, which improves both the accuracy and the safety of medicine recommendation.

Description

Medicine recommendation method based on graph neural network and attention mechanism

Technical Field

The invention belongs to the technical field of computer application, and relates to a medicine recommendation method based on a graph neural network and an attention mechanism.

Background

The development of modern medical technology has led to the widespread use of electronic medical records, which accumulate large amounts of clinical data such as vital signs, clinical summaries, disease diagnoses and prescribed drugs. Meanwhile, deep learning provides new technical means for mining and utilizing medical data and is currently a research hotspot. A combined medicine recommendation algorithm based on electronic medical records can assist a doctor in making a safe and effective prescription according to the changes in the patient's condition, the attributes of the medicines and the interactions among a large number of medicines, and therefore has important research value.

Early drug recommendation techniques were mostly rule-based: experts extracted medication rules from medical information such as the patient's diagnoses, disease classification, symptoms and test results. The drawbacks are that such rules are complex to maintain and difficult to update and extend. Deep-learning-based medicine recommendation embeds the patient's physical signs, diagnoses, past medications and similar information into a low-dimensional space and uses the embedded representations for recommendation, which improves recommendation accuracy. However, these methods still have many problems, including data sparsity, the inability to make effective use of the patient's historical medical records, and neglect of the medical ontology information implied by medical codes.

A graph neural network is a neural network that operates directly on a graph structure. It has the following characteristics: it ignores the input order of the nodes; during computation, the representation of a node is influenced by its neighboring nodes while the connectivity of the graph remains unchanged; and the graph-structured representation allows graph-based reasoning. Graph neural networks have therefore become a major research hotspot and are widely applied in fields such as social networks, recommendation systems, financial risk control, physical systems, molecular chemistry, life science, knowledge graphs and traffic prediction.

Graph neural networks that incorporate the graph attention mechanism have found widespread use in many areas. By learning the weights of a node's neighbors, such a network achieves a better weighted aggregation of the neighbors, filters out noisy neighbors, improves model performance and offers a degree of interpretability for the results.

Disclosure of Invention

In view of the shortcomings of the prior art, the invention aims to provide a medicine recommendation method that effectively alleviates the sparsity of medical data, makes effective use of the patient's historical medical record information and takes medication safety into account.

In order to achieve the above object, the present invention provides a drug recommendation method based on a graph neural network and an attention mechanism, comprising the following steps:

Step 1, acquiring historical electronic medical record data and performing structured processing:

acquiring the patient's historical visits and the medication information corresponding to each visit to construct an electronic medical record, wherein each visit comprises diagnosis data and surgical condition data; the patient's electronic medical record is denoted as p = [x_1, x_2, ..., x_{t-1}], where t is the current number of visits of the patient, and the i-th visit of the patient is denoted as x_i = [d_i, p_i, m_i], i = 1, 2, ..., t-1, where d_i represents the diagnosis data of the i-th visit, p_i represents the surgical condition data of the i-th visit, and m_i represents the medication data of the i-th visit.
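The structured record of step 1 can be pictured with a small data model. The following is only an illustrative sketch: the class names `Visit` and `PatientRecord` and the codes "D1", "P1", "M1" are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Visit:
    """One visit x_i = [d_i, p_i, m_i]."""
    diagnoses: List[str]    # d_i: diagnosis codes of the visit
    procedures: List[str]   # p_i: surgical condition (procedure) codes
    medications: List[str]  # m_i: medication codes


@dataclass
class PatientRecord:
    """Electronic medical record p = [x_1, ..., x_{t-1}]."""
    visits: List[Visit]

    def current_visit_number(self) -> int:
        # The record holds the t - 1 past visits, so the current visit is t.
        return len(self.visits) + 1


# Hypothetical example with two past visits.
record = PatientRecord(visits=[
    Visit(["D1", "D2"], ["P1"], ["M1", "M2"]),
    Visit(["D2"], [], ["M2"]),
])
```

With two stored visits, `record.current_visit_number()` gives t = 3, matching the indexing i = 1, ..., t-1 used above.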

Step 2, constructing three graph neural networks for learning the structural characteristics of the patient treatment condition and the medication information; the inputs of the three graph neural networks are respectively diagnosis data, operation condition data and medication data of a patient, and the corresponding outputs are respectively d^e、p^e、m^e；

The three graph neural networks adopt the same structure and specifically comprise nodes and edges; the nodes comprise leaf nodes and non-leaf nodes, the leaf nodes are input data, namely one of diagnosis data, operation condition data and medication data of a patient, and the non-leaf nodes are medical attribution classifications of the leaf nodes; the edge is the medical classification relation of two nodes;

Each non-leaf node is represented as the sum of the vector representations of itself and all its child nodes, calculated using the graph attention (GAT) mechanism:

wherein g_n denotes the n-th non-leaf node, K denotes the total number of attention heads, ReLU and LeakyReLU denote non-linear functions, and ch(n) denotes the n-th non-leaf node itself together with all its child nodes; the attention coefficient weights the current non-leaf node itself and all its child nodes under the k-th attention head; W^k represents the learnable parameters of the non-leaf nodes under the k-th attention head; e_* represents the vector representation of a node; a represents a learnable matrix and a^T is its transpose.

Each leaf node is represented as the sum of the vector representations of itself and all its ancestor nodes, again calculated using the graph attention (GAT) mechanism:

wherein c'_n denotes the n-th leaf node, and an(n) denotes the n-th leaf node itself together with all its ancestor nodes; the attention coefficient weights the current leaf node itself and all its ancestor nodes under the k-th attention head, and W'^k represents the learnable parameters of the leaf nodes under the k-th attention head.
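The attention-weighted aggregation over a node group (a leaf with its ancestors, or a non-leaf node with its children) can be sketched as follows. The exact formulas are not reproduced in this text, so the sketch assumes the standard single-head graph attention form: scores LeakyReLU(a^T [W e_n ; W e_j]), a softmax over the group, then ReLU of the weighted sum. The function names and the LeakyReLU slope of 0.2 are assumptions; the K heads mentioned above would repeat this computation with separate parameters.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x


def matvec(w: Matrix, v: Vector) -> Vector:
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def attend(node: Vector, group: List[Vector], w: Matrix, a: Vector) -> Vector:
    """Single-head attention of `node` over `group` (itself plus relatives)."""
    h_node = matvec(w, node)
    h_group = [matvec(w, e) for e in group]
    # Unnormalised score per group member: LeakyReLU(a^T [W e_n ; W e_j]).
    scores = [leaky_relu(sum(ai * x for ai, x in zip(a, h_node + h_j)))
              for h_j in h_group]
    # Softmax over the group yields the attention coefficients.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Attention-weighted sum of the transformed vectors, followed by ReLU.
    dim = len(h_node)
    summed = [sum(al * h[i] for al, h in zip(alphas, h_group)) for i in range(dim)]
    return [max(0.0, x) for x in summed]


# Hypothetical leaf with one ancestor, identity W and a zero attention vector.
leaf, parent = [1.0, 0.0], [0.0, 1.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
embedding = attend(leaf, [leaf, parent], identity, [0.0, 0.0, 0.0, 0.0])
```

With a zero attention vector the coefficients are uniform, so the leaf embedding is simply the average of the leaf and its ancestor; trained parameters would shift that balance.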

Step 3, constructing two GRU network models with attention mechanisms, whose inputs are respectively the outputs d^e and p^e of step 2 and whose corresponding outputs are respectively k^d and k^p, which carry historical information;

The two GRU network models with attention mechanisms adopt the same structure and respectively comprise two parallel GRU networks and attention mechanism modules connected with the outputs of the two parallel GRU networks;

the two parallel GRU networks use different activation functions to obtain the hidden-layer output information of the historical visits (i.e., diagnosis or surgical condition information), specifically as follows:

H＝GRU₁(r) (5)

W^h＝softmax(F_h(H)) (6)

H′＝GRU₂(r) (7)

W′^h＝tanh(F_h′(H′)) (8)

wherein H and H′ respectively represent the hidden-layer information output by the first and second GRU networks; W^h and W′^h respectively represent the attention weights obtained by the first and second GRU networks through the softmax and tanh activation functions; F_h and F′_h respectively represent the learnable linear transformation functions of the first and second GRU networks; and r represents d^e or p^e;

The attention mechanism module calculates k^d and k^p, which carry historical information, according to formula (9); k^d is the diagnosis information with historical information at different time scales, and k^p is the surgical condition information with historical information at different time scales.

Wherein t represents the total number of patient visits; W^h(i) and W′^h(i) respectively represent the attention weights obtained through the softmax and tanh activation functions for the i-th visit; ⊙ represents element-by-element multiplication; and k^* represents k^d or k^p;
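The way the two attention weights might combine the visit sequence can be sketched as below. Formula (9) is not reproduced in this text, so the combination used here (the sum over visits of W^h(i) times W′^h(i) ⊙ r_i) is an assumption, as are the function names and the treatment of F_h as a vector and F′_h as a matrix. The GRU itself is omitted; the hidden states H and H′ are passed in as given.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def softmax(scores: Vector) -> Vector:
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dual_attention(r: List[Vector], h: List[Vector], h_prime: List[Vector],
                   f_h: Vector, f_h_prime: Matrix) -> Vector:
    """Combine visit embeddings r_i using softmax and tanh attention weights."""
    # W^h: one softmax-normalised scalar per visit, as in formula (6).
    w = softmax([sum(a * b for a, b in zip(f_h, h_i)) for h_i in h])
    # W'^h: one tanh-activated weight vector per visit, as in formula (8).
    w_prime = [[math.tanh(sum(a * b for a, b in zip(row, hp))) for row in f_h_prime]
               for hp in h_prime]
    dim = len(r[0])
    # Assumed form of formula (9): sum_i W^h(i) * (W'^h(i) element-wise r_i).
    return [sum(w[i] * w_prime[i][j] * r[i][j] for i in range(len(r)))
            for j in range(dim)]


# Hypothetical single-visit example with hand-picked parameters.
k = dual_attention(
    r=[[2.0, 3.0]],
    h=[[1.0, 1.0]],
    h_prime=[[0.0, 1.0]],
    f_h=[0.5, 0.5],
    f_h_prime=[[1.0, 0.0], [0.0, 1.0]],
)
```

The softmax weight decides how much each visit contributes overall, while the tanh weight gates individual feature dimensions within a visit.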

Step 4, constructing two memory neural networks (MANN) with the same structure; wherein the key-value pairs stored in the first memory neural network are "diagnosis data fusion information of the i-th visit" - "graph neural network medication information", and the key-value pairs stored in the second memory neural network are "surgical condition fusion information of the i-th visit" - "graph neural network medication information";

The i-th visit is multiplied element-wise with the historical visits, and the weight of the i-th visit is calculated:

wherein the visit representation is the diagnosis information or the surgical condition information with historical information at the i-th visit;

the weight is applied to the historical medication vectors to obtain the medication of the i-th visit as follows:

the keys of the memory neural network are further obtained:

wherein the key is computed with a learnable weight, and each key has a corresponding value.

According to

Can obtain

And

step 5, constructing a drug interaction knowledge base

Drug interaction knowledge is introduced: the adjacency matrix A_C represents the co-occurrence relationship of medicines in the electronic medical records, and the adjacency matrix A_D represents the drug interaction relationship. A graph convolutional neural network is used to learn the drug co-occurrence relationship and the drug interaction relationship, and these relationships are combined with the key-value pairs of step 4 to generate the recommended medicine list.

5-1, combining the two results obtained in step 4 to obtain a query vector containing the historical medical record information, the i-th diagnosis information and the i-th surgical information,

wherein W_s represents the contrast weights of the diagnosis and surgical information.

5-2, constructing a medicine coexistence relation matrix and a medicine interaction relation matrix in the electronic medical record

A_*＝D^-1(A_*+I)D^-1 (14)

Wherein D represents the diagonal matrix of A_*, D^-1 is its inverse, I is the identity matrix, and A_* represents either the drug co-occurrence relationship matrix A_C of the electronic medical records or the drug interaction relationship matrix A_D.
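Formula (14) can be illustrated with a few lines of plain Python. This is a sketch under one assumption not stated explicitly above: D is taken to be the diagonal degree matrix of A_* + I (its row sums). The function name `normalize_adjacency` is hypothetical. Note that formula (14) applies D^-1 on both sides, whereas the commonly used graph convolutional network normalization applies D^-1/2 on both sides; the sketch follows the formula as written.

```python
from typing import List

Matrix = List[List[float]]


def normalize_adjacency(a: Matrix) -> Matrix:
    """Return D^-1 (A + I) D^-1, with D the degree matrix of A + I (assumed)."""
    n = len(a)
    # Add self-loops: A + I.
    a_hat = [[a[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Diagonal of D: row sums of A + I.
    degree = [sum(row) for row in a_hat]
    # Left and right multiplication by D^-1 scales entry (i, j) by 1/(d_i d_j).
    return [[a_hat[i][j] / (degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical two-drug graph in which the two drugs are connected.
a_norm = normalize_adjacency([[0, 1], [1, 0]])
```

Adding the identity keeps each drug's own embedding in the aggregation, and the degree scaling prevents highly connected drugs from dominating.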

5-3, learning the relationships between drugs using graph convolutional neural networks and incorporating the drug interaction and co-occurrence relationships into the embedded representation, resulting in a representation matrix Z_C of the drug co-occurrence graph and a representation matrix Z_D of the drug interaction graph:

Z_C＝A_Ctanh(A_Cm^e)W_C (15)

Z_D＝A_Dtanh(A_Dm^e)W_D (16)

Wherein W_C and W_D are respectively the parameter matrices of the drug co-occurrence graph and the drug interaction graph, and m^e is the set of key-value pairs obtained in step 4.
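Formulas (15) and (16) share one shape, Z = A tanh(A m^e) W, which the following sketch evaluates with plain nested lists. The function names and the toy matrices are hypothetical; a practical implementation would use a tensor library and learned parameters.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_embed(a_norm: Matrix, m_e: Matrix, w: Matrix) -> Matrix:
    """Evaluate Z = A tanh(A m^e) W for one relation graph."""
    # First propagation step followed by the tanh non-linearity.
    hidden = [[math.tanh(x) for x in row] for row in matmul(a_norm, m_e)]
    # Second propagation step and projection by the parameter matrix W.
    return matmul(matmul(a_norm, hidden), w)


# Hypothetical two-drug example: identity adjacency and identity parameters.
identity = [[1.0, 0.0], [0.0, 1.0]]
m_e = [[0.0, 1.0], [1.0, 0.0]]
z = gcn_embed(identity, m_e, identity)
```

The same function would be called once with the normalized A_C and W_C to obtain Z_C, and once with the normalized A_D and W_D to obtain Z_D.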

Based on the matrices Z_C and Z_D and the query vector, the attention λ_i is calculated:

Wherein W_CD represents the contrast weight between the drug co-occurrence relationship and the drug interaction relationship.

Finally, the recommended medicine list y_i is obtained:

Wherein W_y represents the weight coefficients used when calculating the recommended medicine list.

If the recommendation probability y_i of a drug in the medicine list is greater than the recommendation probability threshold ρ, that drug is recommended.
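The final thresholding step is simple enough to show directly. In this sketch the function name `recommend` and the drug codes are hypothetical; the default threshold of 0.5 is the preset value given in the detailed description.

```python
from typing import Dict, List


def recommend(probabilities: Dict[str, float], rho: float = 0.5) -> List[str]:
    """Return the drugs whose recommendation probability exceeds rho."""
    return sorted(drug for drug, p in probabilities.items() if p > rho)


# Hypothetical recommendation probabilities for three drugs.
selected = recommend({"M1": 0.9, "M2": 0.5, "M3": 0.2})
```

Because the comparison is strict, a drug whose probability equals the threshold is not recommended.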

Another object of the present invention is to provide a medicine recommendation device based on a graph neural network and an attention mechanism, comprising:

a data preprocessing module, used for performing structured processing on the patient's historical visits and the medication information corresponding to each visit to construct the corresponding electronic medical record data;

a graph neural network module, used for learning the structured features of the patient's visits and medication information; the inputs of the three graph neural networks are respectively the patient's diagnosis data, surgical condition data and medication data, and the corresponding outputs are respectively d^e, p^e and m^e;

a GRU network model module with an attention mechanism, used for extracting features from the diagnosis data and surgical condition data output by the graph neural network module and then combining them with the current visit to obtain diagnosis data and surgical condition data with historical information;

a memory neural network (MANN) module, used for constructing key-value pairs of the i-th visit diagnosis data fusion information output by the GRU network model module with the attention mechanism and the graph neural network medication information, and key-value pairs of the i-th visit surgical condition fusion information output by the GRU network model module with the attention mechanism and the graph neural network medication information;

and a drug interaction knowledge base module, used for learning the drug co-occurrence relationship and the drug interaction relationship with a graph convolutional neural network, combining these relationships with the drug embedded representation of the memory neural network (MANN) module, and generating a recommended medicine list.

A further object of the present invention is a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to carry out the above-mentioned method.

Yet another object of the present invention is a computing device comprising a memory having stored therein executable code and a processor that, when executing the executable code, implements the method described above.

The invention has the following advantages: the structured features of each patient's visits and medication information are taken as nodes, and a graph neural network captures the relationships among them to learn high-order features that contain medical ontology knowledge. In addition, an attention mechanism is used to better model the patient's historical medical records, and drug interaction knowledge is introduced, which improves both the accuracy and the safety of medicine recommendation.

Drawings

FIG. 1 shows the medicine recommendation process based on a graph neural network and an attention mechanism.

FIG. 2 is a tree diagram of the medical code hierarchy.

FIG. 3 compares the F1 values of different methods for different numbers of visits.

Detailed Description

The present invention is further analyzed with reference to the following specific examples.

The invention provides a medicine recommendation method based on a graph neural network and an attention mechanism, which comprises the following steps of:

the attention coefficient weights the current leaf node itself and all its ancestor nodes under the k-th attention head, and W'^k represents the learnable parameters of the leaf nodes under the k-th attention head.

Step 3, constructing two GRU network models with attention mechanisms, whose inputs are respectively the outputs d^e and p^e of step 2 and whose corresponding outputs are respectively k^d and k^p, which carry historical information;

The two GRU network models with attention mechanisms adopt the same structure and respectively comprise a GRU network and an attention mechanism module connected with the output of the GRU network;

the GRU model uses an attention mechanism to incorporate hidden layer output information of historical encounter situations (i.e. diagnostic or surgical condition information) into the current information representation, and the specific calculation method is as follows:

H＝GRU₁(r) (5)

W^h＝softmax(F_h(H)) (6)

H′＝GRU₂(r) (7)

W′^h＝tanh(F_h′(H′)) (8)

wherein H and H′ respectively represent the hidden-layer information output by the first and second GRU network models; W^h and W′^h respectively represent the attention weights obtained through the softmax and tanh activation functions; F_h and F′_h represent the learnable linear transformation functions of the first and second GRU network models; and r represents d^e or p^e;

Wherein t represents the total number of patient visits; W^h(i) and W′^h(i) represent the attention weights obtained through the softmax and tanh activation functions for a particular visit; ⊙ represents element-by-element multiplication; and k^* represents k^d or k^p;

Step 4, constructing two memory neural networks (MANN) with similar structures; wherein the key-value pairs stored in the first memory neural network are "diagnosis data fusion information of the i-th visit" - "graph neural network medication information", and the key-value pairs stored in the second memory neural network are "surgical condition fusion information of the i-th visit" - "graph neural network medication information";

wherein the visit representation is the diagnosis information or the surgical condition information with historical information at the i-th visit;

the weight is applied to the historical medication vectors to obtain the medication of the i-th visit as follows:

the keys of the memory neural network are further obtained:

wherein the key is computed with a learnable weight, and each key has a corresponding value.

Step 5, constructing a drug interaction knowledge base

Drug interaction knowledge is introduced: the adjacency matrix A_C represents the co-occurrence relationship of medicines in the electronic medical records, and the adjacency matrix A_D represents the drug interaction relationship. A graph convolutional neural network is used to learn the drug co-occurrence relationship and the drug interaction relationship, and these relationships are combined with the drug embedded representation obtained in step 4 to generate the recommended medicine list.

5-1 will step four

And

Wherein, W_sRepresenting contrast weights for diagnostic and surgical information.

A_*＝D^-1(A_*+I)D^-1 (14)

5-3 learning relationships between drugs using graph convolution neural networks, incorporating drug interactions and co-occurrence relationships into the embedded representation, resulting in a representation matrix Z of drug co-occurrences_CAnd a representation matrix Z of a drug interaction map_D：

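Z_C＝A_Ctanh(A_Cm^e)W_C (15)

Z_D＝A_Dtanh(A_Dm^e)W_D (16)

Wherein W_C and W_D are respectively the parameter matrices of the drug co-occurrence graph and the drug interaction graph, and m^e is the structured feature of the medication output in step 2.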

Based on the matrices Z_C and Z_D and the query vector, the attention λ_i is calculated:

Finally, the recommended medicine list y_i is obtained:

The obtained medicine list is a one-dimensional array whose entries have absolute values smaller than 1; each position corresponds to a medicine type and each value is the recommendation probability of that medicine. When the recommendation probability of a medicine in the list is greater than the preset threshold of 0.5, the medicine is recommended.

The experimental process comprises the following steps:

The experiment used electronic medical record data from the MIMIC-III (Medical Information Mart for Intensive Care) database, a freely available public intensive care dataset released by the Laboratory for Computational Physiology at the Massachusetts Institute of Technology. The invention uses the diagnosis, surgery and prescription data in the database and selects the medications that patients received within 24 hours of entering the ICU.

To measure the accuracy of the recommendation, the invention uses three metrics: the Jaccard similarity coefficient (Jaccard), i.e., the size of the intersection of the real and recommended drug sets divided by the size of their union; the average F1 value (F1), i.e., the harmonic mean of precision and recall; and the area under the precision-recall curve (PRAUC).
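The two set-based metrics can be written out for a single visit as follows. This is a minimal sketch with hypothetical function names and drug codes; the reported scores would be averages over visits and patients, and PRAUC is omitted because it needs the ranked probabilities rather than the selected set.

```python
from typing import Iterable


def jaccard(true_drugs: Iterable[str], recommended: Iterable[str]) -> float:
    """Intersection size divided by union size of the two drug sets."""
    truth, pred = set(true_drugs), set(recommended)
    union = truth | pred
    return len(truth & pred) / len(union) if union else 0.0


def f1_score(true_drugs: Iterable[str], recommended: Iterable[str]) -> float:
    """Harmonic mean of precision and recall for one visit."""
    truth, pred = set(true_drugs), set(recommended)
    hits = len(truth & pred)
    if hits == 0:
        return 0.0
    precision = hits / len(pred)
    recall = hits / len(truth)
    return 2 * precision * recall / (precision + recall)


# Hypothetical visit: three prescribed drugs, three recommended, two in common.
truth = ["M1", "M2", "M3"]
pred = ["M2", "M3", "M4"]
visit_jaccard = jaccard(truth, pred)
visit_f1 = f1_score(truth, pred)
```

Here the intersection has two drugs and the union four, so the Jaccard coefficient is 0.5, while precision and recall are both 2/3 and so is F1.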

To measure the safety of the recommended drugs, the drug-drug interaction (DDI) rate is used, i.e., the proportion of interacting drugs contained in the recommended drug combination.
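One common way to compute such a rate is to count interacting pairs among all pairs of recommended drugs. The sketch below assumes that pairwise definition, which the text above does not spell out; the function name `ddi_rate` and the drug codes are hypothetical.

```python
from itertools import combinations
from typing import Iterable, Tuple


def ddi_rate(recommended: Iterable[str],
             ddi_pairs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of recommended drug pairs that are known interacting pairs."""
    pairs = list(combinations(sorted(set(recommended)), 2))
    if not pairs:
        return 0.0
    # Store known interactions as unordered pairs.
    known = {frozenset(pair) for pair in ddi_pairs}
    interacting = sum(1 for pair in pairs if frozenset(pair) in known)
    return interacting / len(pairs)


# Hypothetical example: three recommended drugs, one known interaction.
rate = ddi_rate(["M1", "M2", "M3"], [("M2", "M1")])
```

Three recommended drugs form three pairs, of which one interacts, so the rate is 1/3; a lower value indicates a safer combination.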

The invention is compared with six existing methods. The Nearest method recommends the same drug combination as the previous visit, based on the similarity between the current and previous visits. The LR method is L2-regularized logistic regression that represents the input data with multi-hot vectors and handles the multi-label output by binary classification. The LEAP method uses a recurrent neural network to model label dependencies and a content-based attention mechanism to capture the mapping between label instances. The RETAIN method recommends drug combinations from sequential data with a two-level attention network model that selects the important clinical variables in past visits. The GAMENet method integrates historical medications and drug-drug interactions (DDI) through a memory module using a graph convolutional network. The PREMIER method learns a representation of the patient's history with an attention mechanism and handles drug interactions with a graph attention mechanism.

Table 1 model comparison experiment

Table 1 shows the performance of the various methods on the data set for the medicine recommendation task. The experimental results show that the model of the invention achieves the best results among all methods. In particular, the proposed method is 0.97%, 0.89% and 0.93% higher than the latest method (PREMIER) in Jaccard, PRAUC and F1 scores, respectively. The method also takes drug interactions into account and reaches the lowest DDI rate, 0.0705, among all comparable deep learning methods when the top 40 DDI types are considered. In addition, the average number of medicines recommended by the invention is 14.98, which among the compared deep learning methods is the closest to the average of 14.68 in the real medical records.

GRAD-mkg denotes the experimental results of the invention with the drug interaction knowledge base removed. Without the drug interaction knowledge base, the accuracy of the recommended drugs does not change greatly, but the DDI rate of the recommended drugs rises to 0.767. This shows that combining knowledge of the interactions among drugs with the query vector carrying historical visit information reduces the interaction rate of the recommended drugs and improves medication safety. Grad-tree denotes the model that learns the structured features of the patient's visits and medication information using only the graph neural network. When the ontology structure of the medical codes is not embedded, the accuracy of drug recommendation decreases markedly, which shows that the graph neural network used in the method can encode high-order structural features, enriches the embedded representation of the medical ontology, compensates to some extent for sparse training data and improves the accuracy of drug recommendation.

Because the number of visits differs from patient to patient, the influence of the number of past visits should be considered. The invention outperforms all other methods for every sequence length. As shown in FIG. 3, the invention has the highest F1 value in every category of visit count. In particular, for data with more visits its accuracy remains higher than that of the other methods, which shows that the method models long-term dependencies in patient medical records better.

TABLE 2 comparative experiments on DDI of different degrees

Further experiments were also performed on the effect of drug interaction knowledge. The top 40, 60, 80 and 100 DDI types were used in turn to study how the invention and the compared methods behave when different numbers of DDI types are considered. As shown in Table 2, although the Δ DDI rate rises from -18.48 to -0.26% as the number of DDI types considered increases from 40 to 100, GRAD is the only algorithm that achieves a reduction in DDI, and its Δ DDI rate remains below zero regardless of the number of DDI types. This indicates that, after drug interaction knowledge is introduced, the invention reduces the interaction rate of the recommended drugs and is therefore safer.

It can thus be seen that the invention has the following advantages: the proposed medicine recommendation algorithm based on a graph neural network and an attention mechanism takes the structured features of each patient's visits and medication information as nodes, captures the relationships among the nodes with a graph neural network, and learns high-order features that contain the medical classification relationships. In addition, an attention mechanism is used to better model the patient's historical medical records, and drug interaction knowledge is introduced, which improves both the accuracy and the safety of medicine recommendation.

Claims

1. A medicine recommendation method based on a graph neural network and an attention mechanism is characterized by comprising the following steps:

acquiring the patient's historical visits and the medication information corresponding to each visit to construct an electronic medical record, wherein each visit comprises diagnosis data and surgical condition data; the patient's electronic medical record is denoted as p = [x_1, x_2, ..., x_{t-1}], where t is the current number of visits of the patient, and the i-th visit of the patient is denoted as x_i = [d_i, p_i, m_i], i = 1, 2, ..., t-1, where d_i represents the diagnosis data of the i-th visit, p_i represents the surgical condition data of the i-th visit, and m_i represents the medication data of the i-th visit;

step 2, constructing three graph neural networks for learning the structured features of the patient's visits and medication information; the inputs of the three graph neural networks are respectively the patient's diagnosis data, surgical condition data and medication data, and the corresponding outputs are respectively d^e, p^e and m^e;

step 3, constructing two GRU network models with attention mechanisms, whose inputs are respectively the outputs d^e and p^e of step 2 and whose corresponding outputs are respectively k^d and k^p, which carry historical information;

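step 4, constructing two memory neural networks (MANN) with the same structure; wherein the key-value pairs stored in the first memory neural network are "diagnosis data fusion information of the i-th visit" - "graph neural network medication information", and the key-value pairs stored in the second memory neural network are "surgical condition fusion information of the i-th visit" - "graph neural network medication information";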

Step 5, constructing a drug interaction knowledge base

5-1 step 4

And

Wherein W_sContrast weights representing diagnostic and surgical information;

A_*＝D^-1(A_*+I)D^-1 (14)

Wherein D represents the diagonal matrix of A_*, D^-1 is its inverse, I is the identity matrix, and A_* represents either the drug co-occurrence relationship matrix A_C of the electronic medical records or the drug interaction relationship matrix A_D;

Z_C＝A_Ctanh(A_Cm^e)W_C (15)

Z_D＝A_Dtanh(A_Dm^e)W_D (16)

Wherein W_C and W_D are respectively the parameter matrices of the drug co-occurrence graph and the drug interaction graph, and m^e is the set of key-value pairs obtained in step 4;

based on the matrices Z_C and Z_D and the query vector, the attention λ_i is calculated:

Wherein W_CD represents the contrast weight between the drug co-occurrence relationship and the drug interaction relationship;

finally, the recommended medicine list y_i is obtained:

Wherein W_y represents the weight coefficients used when calculating the recommended medicine list;

if the recommendation probability y_i of a drug in the medicine list is greater than the recommendation probability threshold ρ, the corresponding drug is recommended.

2. The drug recommendation method based on graph neural networks and attention mechanism as claimed in claim 1, wherein in step 2, the three graph neural networks adopt the same structure and comprise nodes and edges; the nodes comprise leaf nodes and non-leaf nodes, the leaf nodes are input data, namely one of diagnosis data, operation condition data and medication data of a patient, and the non-leaf nodes are medical attribution classifications of the leaf nodes; the edge is the medical classification relation of two nodes;

wherein the attention coefficient weights the current non-leaf node itself and all its child nodes under the k-th attention head; W^k represents the learnable parameters of the non-leaf nodes under the k-th attention head; e_* represents the vector representation of a node; a represents a learnable matrix and a^T is its transpose;

3. The method for recommending drugs based on graph neural network and attention mechanism according to claim 1 or 2, wherein in step 3, the two GRU network models with attention mechanism are of the same structure, and each of the two GRU network models comprises two parallel GRU networks and an attention mechanism module connected with the outputs of the two parallel GRU networks;

the two parallel GRU models are specifically a first GRU network and a second GRU network, the first GRU network and the second GRU network respectively adopt different activation functions to acquire hidden information of historical visiting situations, and the method specifically comprises the following steps:

H＝GRU₁(r) (5)

W^h＝softmax(F_h(H)) (6)

H′＝GRU₂(r) (7)

W′^h＝tanh(F_h′(H′)) (8)

k^p is the surgical condition information with historical information at different time scales;

⊙ represents element-by-element multiplication; k^* represents k^d or k^p.

4. The medicine recommendation method based on a graph neural network and an attention mechanism as claimed in claim 1 or 2, characterized in that in step 4 the memory neural network multiplies the i-th visit element-wise with the historical visits and calculates the weight of the i-th visit

Wherein

Or

by weight

Affecting historical dose

The ith dose was as follows:

further obtaining keys of memory neural network

Wherein

Representing learning weight, key

Corresponding value is

According to

Can obtain

And

5. the medicine recommending device based on the graph neural network and the attention mechanism is characterized by comprising the following components:

a graph neural network module, used for learning the structured features of the patient's visits and medication information; the inputs of the three graph neural networks are respectively the patient's diagnosis data, surgical condition data and medication data, and the corresponding outputs are respectively d^e, p^e and m^e;

a GRU network model module with an attention mechanism, used for extracting features from the diagnosis data and surgical condition data output by the graph neural network module and then combining them with the current visit to obtain diagnosis data and surgical condition data with historical information;

6. A computer-readable storage medium, having stored thereon a computer program which, when executed on a computer, causes the computer to perform the method of any one of claims 1 to 5.

7. A computing device comprising a memory having stored therein executable code and a processor that, when executing the executable code, implements the method of any of claims 1-5.