CN114067905A - Drug-target interaction prediction method fusing multilayer drug structure information - Google Patents

Drug-target interaction prediction method fusing multilayer drug structure information

Publication number
CN114067905A
Authority
CN
China
Prior art keywords
drug
target
information
matrix
interaction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111313022.XA
Other languages
Chinese (zh)
Inventor
车超
张培良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University
Original Assignee
Dalian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University
Priority to CN202111313022.XA
Publication of CN114067905A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00: ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/30: Drug targeting using structural data; Docking or binding prediction
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00: ICT programming tools or database systems specially adapted for bioinformatics

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Chemical & Material Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention provides a drug-target interaction prediction method fusing multilayer drug structure information. First, the drug and target information in a pharmaco-omics database is preprocessed, and the drugs and targets with known interactions are extracted. Second, the SMILES representation of each drug is converted into a molecular graph structure, and drug feature information is extracted using a molecular complement graph convolutional neural network and a Transformer network. Third, the target sequence information is processed with a convolutional neural network to extract target feature information. Finally, the extracted drug and target feature information is fed into a classification model for training, the model is saved, and the relationship between a drug and a target is predicted. The method effectively extracts the feature information contained in the molecular structure of the drug, predicts drug-target relationships with higher accuracy, improves the efficiency and precision of verifying drug-target relationships, shortens the drug development cycle, and greatly reduces the cost of developing new drugs.

Description

Drug-target interaction prediction method fusing multilayer drug structure information
Technical Field
The invention relates to the technical fields of medical artificial intelligence and natural language processing, and in particular to a drug-target interaction prediction method fusing multilayer drug structure information.
Background
Developing a new drug is expensive and time-consuming. The total average development cost of a new drug is commonly estimated at $2 billion to $30 billion, with a total development time of 13 to 15 years. The field of drug research and development therefore urgently needs a fast and efficient development mode that improves efficiency and reduces cost. Many studies have shown that drug repositioning can effectively shorten the development cycle and greatly reduce the development cost of new drugs, and drug-target interactions play a key role in drug repositioning studies. The progress of the Human Genome Project has enabled the rapid accumulation of data on drug compounds, targets and their interactions, which provides the data basis for predicting drug-target interactions. However, a large number of interactions between drugs and targets have still not been discovered or verified.
At present, verification of drug-target interactions relies mainly on large-scale biological or chemical experiments. The verification process requires a large investment of manpower, materials and funds, and experimental verification is highly subject to chance, which raises the cost of verifying drug-target interactions. To reduce this cost, more and more computational methods are being used to predict drug-target interactions, but all of them have certain shortcomings. For example, similarity-based methods often ignore deep feature information between the drug and the target; deep learning methods can capture more of this feature information, but for complex omics data they struggle to learn the complex relationships among different entities and offer little practical guidance for drug repurposing. The structure graph of a drug molecule contains the atoms and chemical bonds that make up the drug, embodies the chemical properties and therapeutic effect of the drug, and has an important influence on drug-target interaction prediction. However, current methods based on graph neural networks focus only on the interaction relationships in the complex network and ignore the molecular structural properties of the drug itself.
Disclosure of Invention
To address the problems in the prior art, the application provides a deep learning model based on drug molecular structure information that automatically predicts the interaction between a drug and a target, thereby improving verification efficiency and reducing verification cost.
To achieve this purpose, the technical scheme of the application is a drug-target interaction prediction method fusing multilayer drug structure information, comprising the following steps:
step 1: preprocessing the drug and target information in a pharmaco-omics database, extracting the drugs and targets that interact, and constructing drug-target interaction data;
step 2: converting the SMILES representation of each drug into a molecular graph structure, and extracting drug feature information using a molecular complement graph convolutional neural network and a Transformer network;
step 3: building an embedded representation of the target sequence information, processing it with a convolutional neural network, and extracting target feature information;
step 4: feeding the extracted drug feature information and target feature information into a classification model for training, and then saving the model;
step 5: loading the model, inputting the information of the drug and target to be predicted, predicting the relationship between the drug and the target, and outputting the prediction result.
Further, step 1 specifically includes:
step 1.1: screening the drug information and target information in the pharmaco-omics database, and deleting the drugs and targets that have no interaction relationship;
step 1.2: integrating the drugs and targets that have an interaction relationship into triples of the form <drug number, target number, label>, with the label set to 1;
step 1.3: obtaining from the pharmaco-omics database the SMILES representation of each drug and the sequence of each target, and using them as the concrete representations of the drugs and targets, respectively;
step 1.4: randomly constructing unknown drug-target relationships as negative examples at a positive-to-negative ratio of 1:2, with the negative label set to 0.
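Steps 1.2 to 1.4 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `build_dataset`, the identifiers and the fixed random seed are assumptions introduced here. The sketch labels known interacting pairs 1 and samples twice as many unknown pairs as negatives with label 0.

```python
import random


def build_dataset(positive_pairs, drugs, targets, neg_ratio=2, seed=0):
    """Return (drug, target, label) triples with a 1:neg_ratio positive:negative ratio.

    positive_pairs: iterable of (drug_id, target_id) with a known interaction.
    drugs, targets: lists of all drug and target identifiers.
    All names are hypothetical; the source only fixes the 1:2 ratio.
    """
    rng = random.Random(seed)
    positives = set(positive_pairs)
    n_neg = neg_ratio * len(positives)

    # Every drug-target pair without a recorded interaction is a candidate negative.
    candidates = [(d, t) for d in drugs for t in targets if (d, t) not in positives]
    if n_neg > len(candidates):
        raise ValueError("not enough unknown pairs to sample the requested negatives")

    negatives = rng.sample(candidates, n_neg)
    data = [(d, t, 1) for d, t in sorted(positives)]
    data += [(d, t, 0) for d, t in negatives]
    rng.shuffle(data)
    return data


if __name__ == "__main__":
    drugs = ["D1", "D2", "D3", "D4"]
    targets = ["T1", "T2", "T3"]
    known = [("D1", "T1"), ("D2", "T3"), ("D4", "T2")]
    for row in build_dataset(known, drugs, targets):
        print(row)
```

Sampling from unknown pairs treats "not recorded" as "non-interacting", which is the assumption stated in step 1.4; some sampled negatives may in fact be undiscovered interactions.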
Further, step 2 specifically includes:
step 2.1: converting the SMILES representation of each drug into a graph by calling the RDKit library from Python, wherein the vertices and edges of the graph represent the atoms and chemical bonds of the drug, respectively; each drug molecule is represented by a feature matrix and an adjacency matrix, and each row of the feature matrix holds the attributes of one atom; each drug is represented as
$G_i = (X_i, A_i), \quad i = 1, \dots, N$
wherein $N$ is the number of drugs, $X_i \in \mathbb{R}^{D_i \times C}$ is the feature matrix of the $i$-th drug, $A_i \in \mathbb{R}^{D_i \times D_i}$ is the adjacency matrix of the $i$-th drug, $D_i$ is the number of atoms of the $i$-th drug, and $C$ is the number of feature channels per atom;
step 2.2: referring to Fig. 2, extracting the drug feature information using the molecular complement graph convolutional neural network and the Transformer network.
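The feature-matrix and adjacency-matrix representation of step 2.1 can be illustrated with a small sketch. The source obtains atoms and bonds from RDKit; to keep this example dependency-free, the atoms and bonds of ethanol (SMILES `CCO`, heavy atoms only) are written out by hand. The function name `molecule_to_matrices` and the one-hot element encoding over C, N and O are assumptions made for illustration, since the source does not list the atom attributes it uses.

```python
ELEMENTS = ["C", "N", "O"]  # assumed feature channels (C = 3); not specified in the source


def molecule_to_matrices(atoms, bonds):
    """Build the feature matrix X (D x C) and the adjacency matrix A (D x D).

    atoms: list of element symbols, one per atom (graph vertices).
    bonds: list of (i, j) atom-index pairs (graph edges).
    """
    d = len(atoms)
    # Each row of X holds the attributes of one atom (here a one-hot element code).
    x = [[1.0 if atom == el else 0.0 for el in ELEMENTS] for atom in atoms]
    a = [[0.0] * d for _ in range(d)]
    for i, j in bonds:
        a[i][j] = 1.0
        a[j][i] = 1.0  # chemical bonds are undirected
    return x, a


if __name__ == "__main__":
    # Ethanol, SMILES "CCO": atoms C0-C1-O2 (hydrogens omitted).
    X, A = molecule_to_matrices(["C", "C", "O"], [(0, 1), (1, 2)])
    print(X)
    print(A)
```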
Further, step 2.2 specifically includes:
step 2.2.1: the molecular complement graph convolutional neural network (MCGCN) takes the graph G obtained in step 2.1 as input; by adding a complement graph G' to the original drug molecular graph, the MCGCN ensures that the adjacency matrices and feature matrices of all drug molecules have the same size, wherein the original graph and the complement graph are independent of each other; after completion, the molecular graph of the drug is represented as:
$A_i^{new} = \begin{bmatrix} A_i & B_i \\ B_i^{\top} & A_i' \end{bmatrix}, \qquad X_i^{new} = \begin{bmatrix} X_i \\ X_i' \end{bmatrix} \qquad (1)$
wherein $A_i$, $X_i$ and $A_i'$, $X_i'$ are the adjacency and feature matrices of the original graph $G$ and the complement graph $G'$ of the $i$-th drug, $B_i$ is the connection matrix between the original graph $G$ and the complement graph $G'$ of the $i$-th drug (zero, because the two graphs are independent), and $A_i^{new}$ and $X_i^{new}$ are the adjacency matrix and the feature matrix after completion, respectively; through the completion operation all drug molecules are represented as graphs $G_{new}$ with the same number of nodes and the same size; the MCGCN comprises two hidden layers, and the drug is represented in each hidden layer using formula (2):
$H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} \Theta^{(l)}\right) \qquad (2)$
wherein $\tilde{A} = A^{new} + I$ is the adjacency matrix with self-connections added, $\tilde{D}$ is the diagonal degree matrix of $\tilde{A}$ with $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$, and $H^{(l)}$ and $\Theta^{(l)}$ are the convolved signal and the filter parameters of the $l$-th layer; the activation function $\sigma(\cdot)$ of each hidden layer is set to $\mathrm{ReLU}(\cdot) = \max(0, \cdot)$; max pooling is applied at the end of the MCGCN to reduce the dimensionality of the data;
step 2.2.2: transformer network uses output vector of each hidden layer in MCGCN
Figure BDA00033427676400000410
Taking an Encoder part in a Transformer network as an input to extract features; in a Transformer network, vector information from different hidden layers in an MCGCN is processed by different multi-head attention modules, for a first hidden layer in the MCGCN(Vector)
Figure BDA00033427676400000411
Processing by using a multi-head attention module with the head number of 6; for vectors from the second hidden layer of MCGCN
Figure BDA00033427676400000412
Processing by using a plurality of attention modules with the number of 4; extracting features in the multi-head attention module using equation (3):
MultiHeadj(Q,K,V)=Concat(head1,...,headi) (3)
splicing the feature vectors processed by the multi-head attention module by using a formula (4), sending the feature vectors into an original layer normalization part in a Transformer network, finally sending output vectors of the layer normalization into a full-connection feedforward neural network, and taking the output of the full-connection feedforward neural network as final feature vectors M of the medicinealldrug
AllMultiHead=Concat(MultiHead1,...,MultiHeadj) (4)。
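The completion step and one graph-convolution layer of step 2.2.1 can be illustrated with a small sketch. It assumes that the complement graph consists of isolated nodes with zero features and that formula (2) is the standard symmetric-normalised graph convolution with self-loops; the names `pad_graph`, `gcn_layer` and `max_pool`, and the toy weights, are hypothetical and not taken from the source.

```python
import math


def pad_graph(x, a, size):
    """Pad feature matrix x (D x C) and adjacency matrix a (D x D) to `size` nodes.

    The added nodes play the role of the complement graph: they have no edges
    to the original atoms and zero features (an assumption of this sketch).
    """
    d, c = len(x), len(x[0])
    if size < d:
        raise ValueError("size must be at least the number of atoms")
    x_new = [row[:] for row in x] + [[0.0] * c for _ in range(size - d)]
    a_new = [row[:] + [0.0] * (size - d) for row in a]
    a_new += [[0.0] * size for _ in range(size - d)]
    return x_new, a_new


def matmul(p, q):
    """Plain-Python matrix product of p (n x m) and q (m x k)."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]


def gcn_layer(h, a, theta):
    """ReLU(D^-1/2 (A + I) D^-1/2 H Theta), with D the degree matrix of A + I."""
    n = len(a)
    a_tilde = [[a[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_tilde]  # each degree is >= 1 because of the self-loop
    a_hat = [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    z = matmul(matmul(a_hat, h), theta)
    return [[max(0.0, v) for v in row] for row in z]


def max_pool(h):
    """Maximum over nodes for each feature channel."""
    return [max(col) for col in zip(*h)]


if __name__ == "__main__":
    x = [[1.0, 0.0], [0.0, 1.0]]          # two atoms, two feature channels
    a = [[0.0, 1.0], [1.0, 0.0]]          # one bond between them
    x_new, a_new = pad_graph(x, a, size=3)
    theta = [[1.0, 0.0], [0.0, 1.0]]      # toy filter parameters
    print(max_pool(gcn_layer(x_new, a_new, theta)))
```

Because the padded nodes carry zero features and no bias term is used, they contribute zeros to every layer and do not change the pooled result; this is one way to read the statement that the original graph and the complement graph are independent.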
Further, the step 3 specifically includes:
step 3.1: randomly initializing a lookup table covering all amino acids that appear in the target sequences, the lookup table having size 26 × 20; mapping the amino acids of each target sequence through the lookup table to construct the embedding matrix $M_{tar}$ of the target sequence; the length of the embedding matrix $M_{tar}$ is the maximum length among the target sequences, set to 2500, and its width equals the width of the lookup table; during model training the embedding vectors are continually optimized, so the entries of the lookup table change as the model is optimized;
step 3.2: referring to Fig. 3, extracting the feature information of the target sequence using a convolutional neural network, with the embedding matrix $M_{tar}$ obtained in step 3.1 as the input of the convolutional neural network; target sequences shorter than the length of the embedding matrix are automatically padded with empty tokens.
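The lookup-table embedding of step 3.1 can be sketched as below. The 26 × 20 table size and the length of 2500 come from the text; the mapping of the 26 rows to the uppercase letters A to Z, the use of all-zero rows for padding, the truncation of longer sequences and the function names are assumptions made for this illustration. In a trained model the table entries would be learnable parameters, not fixed random numbers.

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # assumed symbols behind the 26 table rows


def init_lookup(n_symbols=26, width=20, seed=0):
    """Randomly initialise an n_symbols x width lookup table."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(width)] for _ in range(n_symbols)]


def embed_sequence(seq, lookup, max_len=2500):
    """Map a target sequence to a max_len x width embedding matrix.

    Sequences shorter than max_len are padded with zero rows; longer ones
    are truncated (both choices are assumptions of this sketch).
    """
    width = len(lookup[0])
    rows = [list(lookup[ALPHABET.index(ch)]) for ch in seq[:max_len]]
    rows += [[0.0] * width for _ in range(max_len - len(rows))]
    return rows


if __name__ == "__main__":
    table = init_lookup()
    m_tar = embed_sequence("MKTAYIAKQR", table)
    print(len(m_tar), len(m_tar[0]))
```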
Further, the step 3.2 specifically includes:
step 3.2.1: feeding the embedding matrix $M_{tar}$ obtained in step 3.1 into convolutional layers with kernel sizes of 10, 15 and 20, respectively, and a stride of 1 to extract features, and passing the extracted feature vectors through the ELU activation function, which is defined as follows, where $\alpha > 0$ is a constant:
$\mathrm{ELU}(x) = \begin{cases} x, & x > 0 \\ \alpha\,(e^{x} - 1), & x \le 0 \end{cases}$
step 3.2.2: feeding the output of the ELU activation function into a global max pooling layer to extract the most important local features; after the global max pooling layer, the resulting vector has dimension 128;
step 3.2.3: concatenating the output vectors of the max pooling layers to obtain a vector of dimension 384, and feeding the concatenated vector into a fully connected neural network to obtain a vector of dimension 128, which is used as the final target feature vector $M_{alltar}$.
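The three convolution branches described above can be sketched in plain Python. The kernel sizes (10, 15, 20), the stride of 1, the ELU activation and the 128 values per branch come from the text; the random filter initialisation, the default α = 1 and the function names are assumptions, and the final fully connected layer is omitted. With stride 1 and no padding, a sequence of length L and a kernel of size k give L - k + 1 positions.

```python
import math
import random


def elu(x, alpha=1.0):
    """ELU activation: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def conv_branch(emb, kernels):
    """Stride-1 1-D convolution over sequence positions, ELU, global max pooling.

    emb: L x W embedding matrix; kernels: list of filters, each k x W.
    Returns one pooled value per filter.
    """
    length, width = len(emb), len(emb[0])
    pooled = []
    for kern in kernels:
        k = len(kern)
        best = -math.inf
        for start in range(length - k + 1):  # L - k + 1 positions at stride 1
            s = sum(kern[r][c] * emb[start + r][c]
                    for r in range(k) for c in range(width))
            best = max(best, elu(s))
        pooled.append(best)
    return pooled


def target_features(emb, kernel_sizes=(10, 15, 20), n_filters=128, seed=0):
    """Concatenate the pooled outputs of one branch per kernel size."""
    rng = random.Random(seed)
    width = len(emb[0])
    out = []
    for k in kernel_sizes:
        kernels = [[[rng.uniform(-0.1, 0.1) for _ in range(width)] for _ in range(k)]
                   for _ in range(n_filters)]
        out += conv_branch(emb, kernels)
    return out  # length len(kernel_sizes) * n_filters, i.e. 3 * 128 = 384 in the text


if __name__ == "__main__":
    rng = random.Random(1)
    toy = [[rng.uniform(-1.0, 1.0) for _ in range(20)] for _ in range(30)]
    # A small filter count keeps the pure-Python demo fast.
    print(len(target_features(toy, n_filters=4)))
```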
Further, the step 4 specifically includes:
step 4.1: concatenating the drug feature vector $M_{alldrug}$ obtained in step 2 and the target feature vector $M_{alltar}$ obtained in step 3 to obtain the final vector representation $M_{all}$ of the input data, and taking the label of the original drug-target relationship as the label of the final vector;
step 4.2: feeding the final vector representation $M_{all}$ obtained in step 4.1, together with its label, into a fully connected neural network and training the model; to obtain the best model performance, the model is optimized using a binary cross-entropy loss function with L2-norm regularization, and the best-performing model, model_best, is saved:
[Loss-function equations: provided as images in the original publication; not reproduced in this text.]
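As an illustration of the objective named in step 4.2, the sketch below computes a binary cross-entropy loss with an L2 penalty. The exact form used in the source is not recoverable from this text, so the sketch assumes the common form L = -(1/n) Σ [y log p + (1 - y) log(1 - p)] + λ Σ w², where `lam` (λ) is a hypothetical hyperparameter and `weights` stands for the flattened model parameters.

```python
import math


def bce_l2_loss(y_true, y_prob, weights, lam=1e-4, eps=1e-12):
    """Mean binary cross-entropy plus lam * sum of squared weights.

    y_true: labels in {0, 1}; y_prob: predicted interaction probabilities;
    weights: flat list of model parameters (assumed form, see lead-in).
    """
    n = len(y_true)
    bce = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        bce -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    bce /= n
    l2 = sum(w * w for w in weights)
    return bce + lam * l2


if __name__ == "__main__":
    print(bce_l2_loss([1, 0, 1], [0.9, 0.2, 0.6], [0.5, -0.3], lam=0.01))
```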
further, the step 5 specifically includes:
loading the model_best saved in step 4.2, inputting the drug-target information of the validation data into the model, determining whether each drug and target interact, and outputting the corresponding evaluation metrics.
By adopting the above technical scheme, the invention achieves the following technical effects: the invention uses a deep learning model and the drug and target information in a drug database, combines the structural features of drugs and targets, and automatically predicts drug-target interactions with the model. The method effectively extracts the feature information contained in the molecular structure of the drug, predicts drug-target relationships with higher accuracy and robustness, improves the efficiency and precision of verifying drug-target relationships, shortens the drug development cycle, greatly reduces the cost of developing new drugs, and provides an important basis for new drug development and drug repurposing.
Drawings
FIG. 1 is a flow chart of the drug-target interaction prediction method fusing multilayer drug structure information;
FIG. 2 is a flow chart of drug characteristic information extraction;
FIG. 3 is a flow chart of target feature information extraction.
Detailed Description
The embodiments of the present invention are implemented on the premise of the technical solution of the present invention, and detailed embodiments and specific operation procedures are given, but the scope of the present invention is not limited to the following embodiments.
Example 1
The present invention is described in detail below with reference to examples so that those skilled in the art can practice the invention with reference to the present specification.
In this embodiment, a Windows system is used as the development environment, PyCharm as the development platform, and Python as the development language; the drug-target interaction prediction method fusing multilayer drug structure information is used to predict drug-target interaction relationships and to predict potential therapeutic drugs for COVID-19.
In this embodiment, the drug-target interaction prediction method fusing multilayer drug structure information comprises the following steps:
step 1: given a target, namely the Delta target of COVID-19, finding the 54 drugs in the PubChem database that interact with the target, and setting their data label to 1;
step 2: randomly selecting from the PubChem database 108 drugs that do not interact with the Delta target to construct negative examples, and setting their data label to 0;
step 3: obtaining from the PubChem database the chemical structures of the drugs in steps 1 and 2 and the sequence of the Delta target;
step 4: converting the chemical structure of each drug into a molecular graph using the RDKit toolkit in Python, and saving it as a file in hkl format, one file per drug;
step 5: taking the hkl files of the drugs and the sequence of the Delta target as input and loading the saved model to obtain the evaluation metrics and the predicted score (Score) of the interaction between each drug and the Delta target, wherein the evaluation metrics include accuracy (ACC), the F1 value and AUC:
$\mathrm{ACC} = \dfrac{TP + TN}{TP + TN + FP + FN}$
$\mathrm{Precision} = \dfrac{TP}{TP + FP}$
$\mathrm{Recall} = \dfrac{TP}{TP + FN}$
$\mathrm{F1} = \dfrac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$
wherein TP (true positives) is the number of positive examples correctly predicted as positive; FP (false positives) is the number of negative examples wrongly predicted as positive; FN (false negatives) is the number of positive examples wrongly predicted as negative; and TN (true negatives) is the number of negative examples correctly predicted as negative. AUC is the area under the ROC curve;
step 6: sorting the prediction scores (Score) from step 5 in descending order to obtain the top 5 ranked drugs.
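The evaluation in step 5 and the ranking in step 6 can be sketched as follows. The metric definitions are the standard ones in terms of TP, FP, FN and TN; AUC is computed here with the pairwise-ranking formulation, which equals the area under the ROC curve. The function names and the drug identifiers in the demo are hypothetical placeholders, not results from the source.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"ACC": (tp + tn) / len(pairs), "Precision": precision,
            "Recall": recall, "F1": f1}


def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_k(score_by_drug, k=5):
    """Sort predicted scores in descending order and keep the first k drugs."""
    return sorted(score_by_drug.items(), key=lambda kv: kv[1], reverse=True)[:k]


if __name__ == "__main__":
    y_true = [1, 1, 0, 0, 1, 0]
    scores = [0.9, 0.4, 0.3, 0.6, 0.8, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print(classification_metrics(y_true, y_pred))
    print(auc(y_true, scores))
    print(top_k({f"drug_{i}": s for i, s in enumerate(scores)}, k=3))
```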
Following the above steps, the drug-target relationship prediction performance of the method is compared with the DeepDTA, DeepDTI, DeepConv-DTI and ML-DTI models. As can be seen from Table 1, the proposed method significantly outperforms the other methods in AUC, F1 value and prediction accuracy.
Table 1. Comparison of drug-target relationship prediction results for different models
[Table 1: provided as an image in the original publication; not reproduced in this text.]
The method of the invention is used to predict potential therapeutic drugs for COVID-19 with the Delta target. In the experimental results, four of the top five drugs, including Tramadol, have either been used clinically to treat COVID-19 or are reported in the literature to inhibit COVID-19, as shown in Table 2. Tramadol, Amitriptyline and Dextromethorphan all interact closely with the Delta target. Dexamethasone and Dextromethorphan are widely used in the clinical treatment of COVID-19 and have successfully alleviated its complications. Tramadol can protect COVID-19 patients from disease complications by increasing the levels of the antioxidant enzymes superoxide dismutase and glutathione peroxidase while reducing the effects of malondialdehyde. Studies have shown that the probability of cell infection is reduced by 90% after treatment with different concentrations of Amitriptyline, which also provides a basis for using Amitriptyline in COVID-19 therapy.
Table 2. The top five COVID-19-related therapeutic drugs recommended by the invention
[Table 2: provided as an image in the original publication; not reproduced in this text.]
The foregoing descriptions of specific exemplary embodiments of the present invention have been presented for purposes of illustration and description. It is not intended to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain certain principles of the invention and its practical application to enable one skilled in the art to make and use various exemplary embodiments of the invention and various alternatives and modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.

Claims (8)

1. A drug-target interaction prediction method fusing multilayer drug structure information, characterized by comprising the following steps:
step 1: preprocessing the drug and target information in a pharmaco-omics database, extracting the drugs and targets that interact, and constructing drug-target interaction data;
step 2: converting the SMILES representation of each drug into a molecular graph structure, and extracting drug feature information using a molecular complement graph convolutional neural network and a Transformer network;
step 3: building an embedded representation of the target sequence information, processing it with a convolutional neural network, and extracting target feature information;
step 4: feeding the extracted drug feature information and target feature information into a classification model for training, and then saving the model;
step 5: loading the model, inputting the information of the drug and target to be predicted, predicting the relationship between the drug and the target, and outputting the prediction result.
2. The drug-target interaction prediction method fusing multilayer drug structure information according to claim 1, wherein step 1 specifically comprises:
step 1.1: screening the drug information and target information in the pharmaco-omics database, and deleting the drugs and targets that have no interaction relationship;
step 1.2: integrating the drugs and targets that have an interaction relationship into triples of the form <drug number, target number, label>, with the label set to 1;
step 1.3: obtaining from the pharmaco-omics database the SMILES representation of each drug and the sequence of each target, and using them as the concrete representations of the drugs and targets, respectively;
step 1.4: randomly constructing unknown drug-target relationships as negative examples at a positive-to-negative ratio of 1:2, with the negative label set to 0.
3. The drug-target interaction prediction method fusing multilayer drug structure information according to claim 1, wherein step 2 specifically comprises:
step 2.1: converting the SMILES representation of each drug into a graph by calling the RDKit library from Python, wherein the vertices and edges of the graph represent the atoms and chemical bonds of the drug, respectively; each drug molecule is represented by a feature matrix and an adjacency matrix, and each row of the feature matrix holds the attributes of one atom; each drug is represented as
$G_i = (X_i, A_i), \quad i = 1, \dots, N$
wherein $N$ is the number of drugs, $X_i \in \mathbb{R}^{D_i \times C}$ is the feature matrix of the $i$-th drug, $A_i \in \mathbb{R}^{D_i \times D_i}$ is the adjacency matrix of the $i$-th drug, $D_i$ is the number of atoms of the $i$-th drug, and $C$ is the number of feature channels per atom;
step 2.2: extracting the drug feature information using the molecular complement graph convolutional neural network and the Transformer network.
4. The drug-target interaction prediction method fusing multilayer drug structure information according to claim 3, wherein step 2.2 specifically comprises:
step 2.2.1: the molecular complement graph convolutional neural network (MCGCN) takes the graph G obtained in step 2.1 as input; by adding a complement graph G' to the original drug molecular graph, the MCGCN ensures that the adjacency matrices and feature matrices of all drug molecules have the same size, wherein the original graph and the complement graph are independent of each other; after completion, the molecular graph of the drug is represented as:
$A_i^{new} = \begin{bmatrix} A_i & B_i \\ B_i^{\top} & A_i' \end{bmatrix}, \qquad X_i^{new} = \begin{bmatrix} X_i \\ X_i' \end{bmatrix} \qquad (1)$
wherein $A_i$, $X_i$ and $A_i'$, $X_i'$ are the adjacency and feature matrices of the original graph $G$ and the complement graph $G'$ of the $i$-th drug, $B_i$ is the connection matrix between the original graph $G$ and the complement graph $G'$ of the $i$-th drug (zero, because the two graphs are independent), and $A_i^{new}$ and $X_i^{new}$ are the adjacency matrix and the feature matrix after completion, respectively; through the completion operation all drug molecules are represented as graphs $G_{new}$ with the same number of nodes and the same size; the MCGCN comprises two hidden layers, and the drug is represented in each hidden layer using formula (2):
$H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} \Theta^{(l)}\right) \qquad (2)$
wherein $\tilde{A} = A^{new} + I$ is the adjacency matrix with self-connections added, $\tilde{D}$ is the diagonal degree matrix of $\tilde{A}$ with $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$, and $H^{(l)}$ and $\Theta^{(l)}$ are the convolved signal and the filter parameters of the $l$-th layer; the activation function $\sigma(\cdot)$ of each hidden layer is set to $\mathrm{ReLU}(\cdot) = \max(0, \cdot)$; max pooling is applied at the end of the MCGCN to reduce the dimensionality of the data;
step 2.2.2: transformer network uses output vector of each hidden layer in MCGCN
Figure FDA0003342767630000037
Taking an Encoder part in a Transformer network as an input to extract features; in a Transformer network, vector information from different hidden layers in an MCGCN is processed by different multi-head attention modules, and for a vector from a first hidden layer in the MCGCN
Figure FDA0003342767630000038
Processing by using a multi-head attention module with the head number of 6; for vectors from the second hidden layer of MCGCN
Figure FDA0003342767630000039
Processing by using a plurality of attention modules with the number of 4; extracting features in the multi-head attention module using equation (3):
MultiHeadj(Q,K,V)=Concat(head1,…,headi) (3)
splicing the feature vectors processed by the multi-head attention module by using a formula (4), sending the feature vectors into an original layer normalization part in a Transformer network, sending output vectors of the layer normalization into a full-connection feedforward neural network, and sending the output vectors before full connectionThe output of the feed neural network is used as the final characteristic vector M of the medicinealldrug
AllMultiHead=Concat(MultiHead1,…,MultiHeadj) (4)。
5. The drug-target interaction prediction method fusing multilayer drug structure information according to claim 1, wherein step 3 specifically comprises:
step 3.1: randomly initializing a lookup table covering all amino acids that appear in the target sequences; mapping the amino acids of each target sequence through the lookup table to construct the embedding matrix $M_{tar}$ of the target sequence; the length of the embedding matrix $M_{tar}$ is the maximum length among the target sequences, and its width equals the width of the lookup table;
step 3.2: extracting the feature information of the target sequence using a convolutional neural network, with the embedding matrix $M_{tar}$ obtained in step 3.1 as the input of the convolutional neural network; target sequences shorter than the length of the embedding matrix are automatically padded with empty tokens.
6. The method for predicting drug-target interaction fusing structural information of multilayer drugs according to claim 5, wherein the step 3.2 specifically comprises:
step 3.2.1: embedded matrix M obtained in step 3.1tarInputting convolution layers with convolution kernels of 10, 15 and 20 respectively and step length of 1 to extract features, and sending the extracted feature vectors to an ELU activation function for optimization, wherein the ELU activation function is defined as follows:
Figure FDA0003342767630000041
step 3.2.2: feeding the vectors output by the ELU activation function into a global max pooling layer to extract the most important local features;
step 3.2.3: concatenating the output vectors of the max pooling layers, inputting the concatenated vector into a fully-connected neural network, and taking its output as the final feature vector $M_{alltar}$ of the target.
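The convolution, activation and pooling steps of claim 6 can be sketched as below. The ELU figure is not reproduced in this text, so the sketch uses the standard definition, ELU(x) = x for x > 0 and alpha * (exp(x) - 1) otherwise, with alpha = 1 as an assumption. Reading 10, 15 and 20 as kernel sizes, the number of filters per size, and the random kernels are also illustrative assumptions, and the final fully-connected layer is omitted.

```python
import math
import random

def elu(x, alpha=1.0):
    """Standard ELU; alpha = 1.0 is an assumed default."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def conv1d(m, kernel):
    """Valid 1-D convolution along the sequence axis with stride 1, one filter."""
    k, width = len(kernel), len(m[0])
    return [
        sum(m[start + i][j] * kernel[i][j] for i in range(k) for j in range(width))
        for start in range(len(m) - k + 1)
    ]

def target_features(m_tar, kernel_sizes=(10, 15, 20), filters=2, seed=0):
    """Convolve, apply ELU, global-max-pool, and concatenate across kernel sizes."""
    rng = random.Random(seed)
    width = len(m_tar[0])
    pooled = []
    for k in kernel_sizes:
        for _ in range(filters):
            # Hypothetical random kernel standing in for learned weights.
            kernel = [[rng.uniform(-0.1, 0.1) for _ in range(width)] for _ in range(k)]
            activated = [elu(v) for v in conv1d(m_tar, kernel)]
            pooled.append(max(activated))  # global max pooling over positions
    return pooled

rng = random.Random(1)
m_tar = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(30)]  # 30 x 4
features = target_features(m_tar)
print(len(features))  # 6
```

Global max pooling reduces each filter's output to a single value, so the concatenated vector has one entry per filter regardless of sequence length (3 kernel sizes x 2 filters = 6 here).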
7. The drug-target interaction prediction method fusing multilayer drug structure information according to claim 1, wherein step 4 specifically comprises:
step 4.1: concatenating the drug feature vector $M_{alldrug}$ obtained in step 2 and the target feature vector $M_{alltar}$ obtained in step 3 to obtain the final vector representation $M_{all}$ of the input data, and taking the label of the original drug-target relation as the label of the final vector;
step 4.2: inputting the final vector representation $M_{all}$ obtained in step 4.1 and its label into a fully-connected neural network and training the model; optimizing the model with a binary cross-entropy loss regularized by the L2 norm, and saving the best-performing model as model_best:
Figure FDA0003342767630000051
Figure FDA0003342767630000052
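The two loss formulas are given only as figures and cannot be recovered from this text. As a hedged illustration of what "binary cross-entropy regularized by the L2 norm" commonly means, the sketch below adds a squared-L2 penalty on the weights to the mean binary cross-entropy. The function name `bce_l2_loss`, the coefficient `lam`, and the clipping constant `eps` are assumptions, not values from the patent.

```python
import math

def bce_l2_loss(probs, labels, weights, lam=1e-3, eps=1e-12):
    """Mean binary cross-entropy plus lam * sum of squared weights.

    probs   -- predicted interaction probabilities in [0, 1]
    labels  -- ground-truth labels, 1 for interaction and 0 otherwise
    weights -- flat list of model weights to regularize
    """
    n = len(probs)
    bce = -sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1.0 - p, eps))
        for p, y in zip(probs, labels)
    ) / n
    l2 = lam * sum(w * w for w in weights)
    return bce + l2

# Uninformative predictions with zero weights give ln(2).
loss = bce_l2_loss([0.5, 0.5], [1, 0], [0.0])
print(abs(loss - math.log(2)) < 1e-9)  # True
```

In a training loop, one plausible way to obtain model_best is to evaluate this loss (or a validation metric) after each epoch and keep the parameters with the best value.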
8. The drug-target interaction prediction method fusing multilayer drug structure information according to claim 7, wherein step 5 specifically comprises:
loading the model_best saved in step 4.2, inputting the drug-target information in the validation data into the model, determining whether the drug and the target interact, and outputting the corresponding evaluation metrics.
CN202111313022.XA 2021-11-08 2021-11-08 Drug-target interaction prediction method fusing multilayer drug structure information Pending CN114067905A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111313022.XA CN114067905A (en) 2021-11-08 2021-11-08 Drug-target interaction prediction method fusing multilayer drug structure information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111313022.XA CN114067905A (en) 2021-11-08 2021-11-08 Drug-target interaction prediction method fusing multilayer drug structure information

Publications (1)

Publication Number Publication Date
CN114067905A (en) 2022-02-18

Family

ID=80274150

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111313022.XA Pending CN114067905A (en) 2021-11-08 2021-11-08 Drug-target interaction prediction method fusing multilayer drug structure information

Country Status (1)

Country Link
CN (1) CN114067905A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115206421A (en) * 2022-07-19 2022-10-18 北京百度网讯科技有限公司 Drug repositioning method, and repositioning model training method and device
WO2024026929A1 (en) * 2022-08-03 2024-02-08 慧壹科技(上海)有限公司 Cleaning method and apparatus for drug-target interaction data
CN118016149A (en) * 2024-04-09 2024-05-10 太原理工大学 Spatial domain identification method for integrating space transcriptome multi-mode information

Similar Documents

Publication Publication Date Title
CN114067905A (en) Drug-target interaction prediction method fusing multilayer drug structure information
CN109920501B (en) Electronic medical record classification method and system based on convolutional neural network and active learning
CN108986908B (en) Method and device for processing inquiry data, computer equipment and storage medium
CN108062556B (en) Drug-disease relationship identification method, system and device
CN113936735A (en) Method for predicting binding affinity of drug molecules and target protein
CN112086195B (en) Admission risk prediction method based on self-adaptive ensemble learning model
Xu et al. Automated Scoring of Clinical Patient Notes using Advanced NLP and Pseudo Labeling
CN114420310A (en) Medicine ATCCode prediction method based on graph transformation network
CN112131399A (en) Old medicine new use analysis method and system based on knowledge graph
CN112652358A (en) Drug recommendation system, computer equipment and storage medium for regulating and controlling disease target based on three-channel deep learning
CN114743600A (en) Gate-controlled attention mechanism-based deep learning prediction method for target-ligand binding affinity
CN115376704A (en) Medicine-disease interaction prediction method fusing multi-neighborhood correlation information
CN116206775A (en) Multi-dimensional characteristic fusion medicine-target interaction prediction method
CN116612810A (en) Medicine target interaction prediction method based on interaction inference network
CN116013428A (en) Drug target general prediction method, device and medium based on self-supervision learning
Kulathunga et al. PatientCare: Patient assistive tool with automatic hand-written prescription reader
CN112837743B (en) Drug repositioning method based on machine learning
Sumathi et al. A review on deep learning-driven drug discovery: strategies, tools and applications
CN116153527A (en) Attention mechanism-based method and system for predicting side effects of combination of psychotropic drugs
CN116403731A (en) Missense mutation effect prediction method and system for clinical drug effect based on deep learning
CN116630062A (en) Medical insurance fraud detection method, system and storage medium
CN115966315A (en) Method, equipment and storage medium for predicting anti-aging medicine
Hu et al. Sequence translating model using deep neural block cascade network: Taking protein secondary structure prediction as an example
Bhattacharya et al. Improved search space shrinking for medical image retrieval using capsule architecture and decision fusion
CN113345535A (en) Drug target prediction method and system for keeping chemical property and function consistency of drug

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination