CN113160879B - Method for predicting drug repositioning through side effect based on network learning - Google Patents
- Publication number
- CN113160879B (application CN202110448039.XA)
- Authority
- CN
- China
- Prior art keywords
- medicine
- similarity
- side effects
- side effect
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
- G16B15/30—Drug targeting using structural data; Docking or binding prediction
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H70/00—ICT specially adapted for the handling or processing of medical references
- G16H70/40—ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
Landscapes
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Pharmacology & Pharmacy (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Medical Informatics (AREA)
- General Health & Medical Sciences (AREA)
- Medicinal Chemistry (AREA)
- Evolutionary Biology (AREA)
- Biotechnology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Crystallography & Structural Chemistry (AREA)
- Biophysics (AREA)
- Theoretical Computer Science (AREA)
- Toxicology (AREA)
- Epidemiology (AREA)
- Primary Health Care (AREA)
- Public Health (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The invention discloses a method for predicting drug repositioning through side effects based on network learning, comprising the following steps: 1) constructing a unique side-effect fingerprint for each drug as a 0/1 vector; 2) calculating the similarity between drugs using the Jaccard index; 3) randomly selecting, 10,000 times, a set of side effects from the full side-effect data set equal in number to the side effects of drug A, and testing whether the similarity between drug A and drug B exceeds the similarity obtained with the randomly selected side effects, retaining only the results in which the similarity between drug A and drug B is significantly higher than random; 4) predicting the potential indications of a drug from the MeSH information of its neighboring drugs in the network; 5) using the EASE score to calculate the degree of enrichment in the side-effect network and ranking the candidate indications of drug A. In the reported test, 61 drugs were selected for prediction and the indications of 41 of them were predicted accurately, showing a good prediction effect.
Description
Technical Field
The invention relates to the technical field of drug research and development, in particular to a method for predicting drug repositioning through side effects based on network learning.
Background
Drug development has long been inefficient. Compared with the large investment of money and time required to develop a novel drug, studying the untapped potential of known drugs can markedly shorten development time and reduce toxicity risk. However, past successes in drug repositioning have often depended on chance. Recent studies have proposed predicting new indications from drug-induced gene expression patterns or from the structure and function of compounds and proteins in order to reduce development cost. These methods tend to study the molecular mechanism of action (MOA) from the genotype side, but MOA-based preclinical results correlate poorly with actual efficacy later in the drug development process: it is estimated that only about thirty percent of drugs that are effective in cell assays are effective in animals, and only about five percent are effective in humans. This gap between MOA and physiological response may limit the practicality of such drug repositioning methods.
When a drug binds to an unintended target, normal metabolism and signaling pathways can be disturbed, producing side effects. Side effects can therefore be regarded as a valuable readout of a drug's action on the human body, offering a new approach and broad prospects for drug repositioning research. Only a few studies have addressed this. In 2010, a prediction model, "SIDER", covering 996 drugs and 4192 side effects was developed, and in 2011 a research group built a prediction model, "DRoSEF", based on the PharmGKB database and covering 145 diseases; the DRoSEF model is expected to perform better as its disease coverage is expanded. Because drugs with similar side effects also tend to share therapeutic properties, new indications can be predicted by studying complete side-effect catalogs: an overall network is constructed from the side-effect similarities of different drugs, and the range of uses of a drug is predicted from the distribution of functions of its neighboring drugs in the network. The method for predicting drug repositioning through side effects based on network learning addresses the problems described above.
Disclosure of Invention
(I) Technical problems to be solved
To address the technical problems in the prior art, the invention provides a method for predicting drug repositioning through side effects based on network learning. The method predicts new indications of drugs from a complete side-effect catalog and builds an overall network from the side-effect similarities of different drugs. It addresses the problems that drug development has long been inefficient, that developing a novel drug requires a large investment of money and time, and that past successes in drug repositioning have often depended on chance.
(II) Technical scheme
The technical scheme for solving the technical problems is as follows: a method for predicting drug repositioning through side effects based on network learning, comprising the steps of:
1) Constructing a unique side-effect fingerprint for each drug as a 0/1 vector, so that each of the 2183 drugs has a 6495-dimensional vector representing its side effects;
2) Calculating the similarity between every two drugs using the Jaccard index, with the following formula:
$$J(A, B) = \frac{c}{a + b - c}$$
wherein a and b are the numbers of side effects of drug A and drug B, respectively, and c is the number of side effects shared by drug A and drug B;
3) Randomly selecting, 10,000 times, a set of side effects from the full side-effect data set equal in number to the side effects of drug A, and testing whether the similarity between drug A and drug B exceeds the similarity obtained with the randomly selected side effects, wherein only the similarity results in which the similarity between drug A and drug B is significantly higher than that of the randomly selected side effects are retained;
4) Using a Z-score to measure this significant difference, as follows:
$$Z = \frac{J(A, B) - \mu_{\text{rand}}}{\sigma_{\text{rand}}}$$
wherein $\mu_{\text{rand}}$ and $\sigma_{\text{rand}}$ are the mean and standard deviation of the similarities obtained with the randomly selected side effects, and the Z-score threshold is set to be equal to or greater than 2.576;
5) Because MeSH contains hierarchically organized disease information, predicting the potential indications of a drug in the network from the MeSH information of its neighboring drugs;
6) Using the EASE score to calculate the degree of enrichment in the side-effect network and ranking the candidate indications of drug A;
7) Finally, the model is a large network formed from the similarity sub-networks between any two drugs; normalized discounted cumulative gain (NDCG), originally used in the information retrieval field to evaluate web search engine algorithms by measuring the usefulness of the documents in a result list, is used to calculate the ranking accuracy of the drug prediction results.
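Steps 1) and 2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `build_fingerprints` and `jaccard` and the three toy drugs are hypothetical, and the real data described in the text is a 2183-drug by 6495-side-effect matrix.

```python
from typing import Dict, List, Set, Tuple


def build_fingerprints(
    drug_side_effects: Dict[str, Set[str]],
) -> Tuple[List[str], Dict[str, List[int]]]:
    """Turn each drug's side-effect set into a 0/1 vector over a shared vocabulary."""
    vocabulary = sorted(set().union(*drug_side_effects.values()))
    fingerprints = {
        drug: [1 if effect in effects else 0 for effect in vocabulary]
        for drug, effects in drug_side_effects.items()
    }
    return vocabulary, fingerprints


def jaccard(fp_a: List[int], fp_b: List[int]) -> float:
    """Jaccard index c / (a + b - c) computed from two 0/1 fingerprints."""
    a = sum(fp_a)                                 # side effects of drug A
    b = sum(fp_b)                                 # side effects of drug B
    c = sum(x & y for x, y in zip(fp_a, fp_b))    # shared side effects
    union = a + b - c
    return c / union if union else 0.0


# Hypothetical toy data, invented for illustration only.
toy = {
    "drug_A": {"nausea", "headache", "rash"},
    "drug_B": {"nausea", "rash", "dizziness", "fatigue"},
    "drug_C": {"insomnia"},
}
vocabulary, fingerprints = build_fingerprints(toy)
sim_ab = jaccard(fingerprints["drug_A"], fingerprints["drug_B"])  # a=3, b=4, c=2
sim_ac = jaccard(fingerprints["drug_A"], fingerprints["drug_C"])  # no shared effects
print(sim_ab, sim_ac)
```

With these toy sets, drug A and drug B share two of five distinct side effects, so their similarity is 2/5, while drug A and drug C share none.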
(III) Beneficial effects
Compared with the prior art, the invention provides a method for predicting drug repositioning through side effects based on network learning, which has the following beneficial effects:
This method for predicting drug repositioning through side effects based on network learning is the first to rank drug side-effect similarity using normalized discounted cumulative gain (NDCG), a measure originally used in the information retrieval field to evaluate how web search engine algorithms prioritize search results. Among existing methods that make predictions from drug side effects, the side-effect data used here, covering 6495 side effects of 2183 drugs in total, are the most comprehensive and authoritative, and this broad coverage supports the robustness and generality of the constructed model. In a test on 98 drugs randomly selected from the SIDER model, 84.69% of the drugs passed threshold screening; 61 of these were selected for prediction, and the indications of 41 of them were predicted accurately. Of the 106 indication predictions for these 41 drugs, 50.94% are supported by FDA approval or by other clinical experiments and scientific literature. The method thus shows a good prediction effect and effectively predicts new drug indications from drug side effects.
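For readers unfamiliar with NDCG, the following sketch shows one common formulation, in which the gain at rank i is the relevance divided by log2(i + 1). The patent does not state which DCG variant or cutoff it uses, so the formula choice, the cutoff `k = 5`, the function names, and the binary relevance labels below are all assumptions made for illustration.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain: relevance at rank i is divided by log2(i + 1)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances: Sequence[float], k: int = 5) -> float:
    """DCG of the top-k list divided by the DCG of the ideal (sorted) top-k list."""
    top = list(relevances)[:k]
    ideal = sorted(relevances, reverse=True)[:k]
    best = dcg(ideal)
    return dcg(top) / best if best > 0 else 0.0


# Hypothetical ranked predictions for one drug: 1 = known indication, 0 = not known.
predicted_order = [0, 1, 0, 0, 1]
score = ndcg(predicted_order)
print(round(score, 3))
```

A score of 1.0 means every known indication is ranked ahead of every unknown one; the score falls as known indications slip down the list.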
Drawings
Fig. 1 is a schematic flow chart of a method for predicting drug repositioning through side effects based on network learning.
Detailed Description
The technical solutions of the embodiments of the present invention will be clearly and completely described below in conjunction with the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, a method for predicting drug repositioning through side effects based on network learning includes the following steps:
1) Constructing a unique side-effect fingerprint for each drug as a 0/1 vector, so that each of the 2183 drugs has a 6495-dimensional vector representing its side effects;
2) Calculating the similarity between every two drugs using the Jaccard index, with the following formula:
$$J(A, B) = \frac{c}{a + b - c}$$
wherein a and b are the numbers of side effects of drug A and drug B, respectively, and c is the number of side effects shared by drug A and drug B;
3) Randomly selecting, 10,000 times, a set of side effects from the full side-effect data set equal in number to the side effects of drug A, and testing whether the similarity between drug A and drug B exceeds the similarity obtained with the randomly selected side effects, wherein only the similarity results in which the similarity between drug A and drug B is significantly higher than that of the randomly selected side effects are retained;
4) Using a Z-score to measure this significant difference, as follows:
$$Z = \frac{J(A, B) - \mu_{\text{rand}}}{\sigma_{\text{rand}}}$$
wherein $\mu_{\text{rand}}$ and $\sigma_{\text{rand}}$ are the mean and standard deviation of the similarities obtained with the randomly selected side effects, and the Z-score threshold is set to be equal to or greater than 2.576;
5) Because MeSH contains hierarchically organized disease information, predicting the potential indications of a drug in the network from the MeSH information of its neighboring drugs;
6) Using the EASE score to calculate the degree of enrichment in the side-effect network and ranking the candidate indications of drug A;
7) Finally, the model is a large network formed from the similarity sub-networks between any two drugs; normalized discounted cumulative gain (NDCG), originally used in the information retrieval field to evaluate web search engine algorithms by measuring the usefulness of the documents in a result list, is used to calculate the ranking accuracy of the drug prediction results.
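The random-selection test in steps 3) and 4) can be sketched as follows. This is one plausible reading, not the patent's exact procedure: it assumes the null distribution is built by replacing drug A's side effects with random sets of the same size and comparing each to drug B, and that the Z-score is the observed similarity minus the null mean, divided by the null standard deviation. The function names, the 200-term universe, and the toy drugs are hypothetical.

```python
import random
import statistics
from typing import Sequence, Set


def jaccard_sets(x: Set[str], y: Set[str]) -> float:
    """Jaccard index of two side-effect sets."""
    union = len(x | y)
    return len(x & y) / union if union else 0.0


def similarity_z_score(
    effects_a: Set[str],
    effects_b: Set[str],
    all_effects: Sequence[str],
    n_draws: int = 10_000,
    seed: int = 0,
) -> float:
    """Z-score of the observed A-B similarity against random same-size draws."""
    rng = random.Random(seed)
    observed = jaccard_sets(effects_a, effects_b)
    # Null distribution: swap drug A for random sets with the same number of effects.
    null = [
        jaccard_sets(set(rng.sample(all_effects, len(effects_a))), effects_b)
        for _ in range(n_draws)
    ]
    mu = statistics.fmean(null)
    sigma = statistics.pstdev(null)
    if sigma == 0.0:
        return float("inf") if observed > mu else 0.0
    return (observed - mu) / sigma


# Hypothetical universe of 200 side-effect terms and three toy drugs.
universe = [f"se_{i}" for i in range(200)]
drug_a = set(universe[0:20])
drug_b = set(universe[10:35])      # overlaps drug A in 10 terms
drug_c = set(universe[100:125])    # no overlap with drug A

z_ab = similarity_z_score(drug_a, drug_b, universe)
z_ac = similarity_z_score(drug_a, drug_c, universe)
keep_ab = z_ab >= 2.576  # threshold stated in step 4
keep_ac = z_ac >= 2.576
print(keep_ab, keep_ac)
```

Only pairs that pass the threshold would become edges in the side-effect similarity network; here the strongly overlapping pair is kept and the non-overlapping pair is dropped.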
The drug side-effect dictionary uses the fifteenth edition of Meyler's Side Effects of Drugs; for side effects of drugs reported after 2006, drug side-effect reports from 2007-2012 and FDA-approved indication data are used.
The MedDRA vocabulary is used as the standard terminology and hierarchy, so that catalog data from different resources are integrated and semantic redundancy is avoided.
FDA-approved drug indication information is converted to MeSH headings, yielding 6495 clinical side effects for 2183 drugs and 994 fourth-level MeSH terms.
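The EASE-based ranking of MeSH indications among a drug's network neighbors might look like the sketch below. EASE is commonly described as a one-sided Fisher exact p-value made more conservative by removing one hit from the list before testing; the patent does not give its contingency-table layout, so the layout here, the function names, and all counts are assumptions for illustration only.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] with fixed margins."""
    row1, col1, total = a + b, a + c, a + b + c + d
    numerator = sum(
        comb(col1, x) * comb(total - col1, row1 - x)
        for x in range(a, min(row1, col1) + 1)
    )
    return numerator / comb(total, row1)


def ease_score(hits: int, n_neighbors: int, bg_hits: int, bg_total: int) -> float:
    """Fisher exact p-value after removing one neighbor hit (assumed table layout)."""
    a = max(hits - 1, 0)                            # penalized neighbor hits
    b = n_neighbors - hits                          # neighbors without the indication
    c = bg_hits - hits                              # non-neighbors with the indication
    d = (bg_total - n_neighbors) - (bg_hits - hits) # non-neighbors without it
    return fisher_one_sided(a, b, c, d)


# Hypothetical counts: drug A has 20 neighbors in a 2183-drug network.
neighbor_counts = {"hypertension": 5, "asthma": 2, "epilepsy": 1}
background_counts = {"hypertension": 40, "asthma": 60, "epilepsy": 35}
scores = {
    name: ease_score(neighbor_counts[name], 20, background_counts[name], 2183)
    for name in neighbor_counts
}
ranked = sorted(scores, key=scores.get)  # smallest score = strongest enrichment
print(ranked)
```

Removing one hit means an indication supported by a single neighbor can never look enriched, which is the conservative behavior that makes this score attractive for small neighborhoods.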
Experimental case: in a test on 98 drugs from the SIDER model, 84.69% of the drugs passed threshold screening; 61 were selected for prediction, and the indications of 41 of them were predicted accurately. Of the 106 top-five-ranked indication predictions for these 41 drugs, 50.94% are supported by FDA approval or by other clinical experiments and scientific literature, showing a good prediction effect and indicating that new drug indications can be predicted effectively from side effects, as shown in the following table:
Drug-indication pairs | Number | Percentage |
---|---|---|
FDA-approved | 37 | 34.91% |
Clinical | 10 | 9.43% |
Preclinical | 7 | 6.60% |
Unknown | 52 | 49.06% |
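The percentages in the table, and the 50.94% support figure quoted in the text, can be checked from the raw counts. The short calculation below uses only the numbers in the table; the variable names are illustrative.

```python
# Counts taken directly from the table above.
counts = {"FDA-approved": 37, "Clinical": 10, "Preclinical": 7, "Unknown": 52}

total = sum(counts.values())
percentages = {name: round(100 * n / total, 2) for name, n in counts.items()}

# "Supported" predictions are those with FDA, clinical, or preclinical evidence.
supported = counts["FDA-approved"] + counts["Clinical"] + counts["Preclinical"]
supported_pct = round(100 * supported / total, 2)

print(total, percentages, supported, supported_pct)
```

The three supported categories sum to 54 of the 106 predictions, which matches the 50.94% stated in the experimental case.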
The beneficial effects of the invention are as follows: this method for predicting drug repositioning through side effects based on network learning is the first to rank drug side-effect similarity using normalized discounted cumulative gain (NDCG), a measure originally used in the information retrieval field to evaluate how web search engine algorithms prioritize search results. Among existing methods that make predictions from drug side effects, the side-effect data used here, covering 6495 side effects of 2183 drugs in total, are the most comprehensive and authoritative, and this broad coverage supports the robustness and generality of the constructed model. In a test on 98 drugs randomly selected from the SIDER model, 84.69% of the drugs passed threshold screening; 61 of these were selected for prediction, and the indications of 41 of them were predicted accurately. Of the 106 indication predictions for these 41 drugs, 50.94% are supported by FDA approval or by other clinical experiments and scientific literature. The method thus shows a good prediction effect and effectively predicts new drug indications from drug side effects.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (1)
1. The method for predicting drug repositioning through side effects based on network learning is characterized by comprising the following steps:
1) Constructing a unique side-effect fingerprint for each drug as a 0/1 vector, so that each of the 2183 drugs has a 6495-dimensional vector representing its side effects;
2) Calculating the similarity between every two drugs using the Jaccard index, with the following formula:
$$J(A, B) = \frac{c}{a + b - c}$$
wherein a and b are the numbers of side effects of drug A and drug B, respectively, and c is the number of side effects shared by drug A and drug B;
3) Randomly selecting, 10,000 times, a set of side effects from the full side-effect data set equal in number to the side effects of drug A, and testing whether the similarity between drug A and drug B exceeds the similarity obtained with the randomly selected side effects, wherein only the similarity results in which the similarity between drug A and drug B is significantly higher than that of the randomly selected side effects are retained;
4) Using a Z-score to measure this significant difference, as follows:
$$Z = \frac{J(A, B) - \mu_{\text{rand}}}{\sigma_{\text{rand}}}$$
wherein $\mu_{\text{rand}}$ and $\sigma_{\text{rand}}$ are the mean and standard deviation of the similarities obtained with the randomly selected side effects, and the Z-score threshold is set to be equal to or greater than 2.576;
5) Because MeSH contains hierarchically organized disease information, predicting the potential indications of a drug in the network from the MeSH information of its neighboring drugs;
6) Using the EASE score to calculate the degree of enrichment in the side-effect network and ranking the candidate indications of drug A;
7) Finally, the model is a large network formed from the similarity sub-networks between any two drugs; normalized discounted cumulative gain (NDCG), originally used in the information retrieval field to evaluate web search engine algorithms by measuring the usefulness of the documents in a result list, is used to calculate the ranking accuracy of the drug prediction results.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110448039.XA CN113160879B (en) | 2021-04-25 | 2021-04-25 | Method for predicting drug repositioning through side effect based on network learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110448039.XA CN113160879B (en) | 2021-04-25 | 2021-04-25 | Method for predicting drug repositioning through side effect based on network learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113160879A (en) | 2021-07-23 |
CN113160879B (en) | 2023-11-28 |
Family
ID=76870224
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110448039.XA Active CN113160879B (en) | 2021-04-25 | 2021-04-25 | Method for predicting drug repositioning through side effect based on network learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113160879B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113626567A (en) * | 2021-07-28 | 2021-11-09 | 上海基绪康生物科技有限公司 | Method for mining information related to genes and diseases from biomedical literature |
CN113611363B (en) * | 2021-08-09 | 2023-11-28 | 上海基绪康生物科技有限公司 | Method for identifying cancer driving gene by using consensus prediction result |
CN113838583B (en) * | 2021-09-27 | 2023-10-24 | 中国人民解放军空军军医大学 | Intelligent medicine curative effect evaluation method based on machine learning and application thereof |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105653846A (en) * | 2015-12-25 | 2016-06-08 | 中南大学 | Integrated similarity measurement and bi-directional random walk based pharmaceutical relocation method |
CN111191014A (en) * | 2019-12-26 | 2020-05-22 | 上海科技发展有限公司 | Medicine relocation method, system, terminal and medium |
CN111951886A (en) * | 2019-05-17 | 2020-11-17 | 天津科技大学 | Drug relocation prediction method based on Bayesian inductive matrix completion |
CN112216396A (en) * | 2020-10-14 | 2021-01-12 | 复旦大学 | Method for predicting drug-side effect relationship based on graph neural network |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11037684B2 (en) * | 2014-11-14 | 2021-06-15 | International Business Machines Corporation | Generating drug repositioning hypotheses based on integrating multiple aspects of drug similarity and disease similarity |
GB2537925A (en) * | 2015-04-30 | 2016-11-02 | Fujitsu Ltd | A similarity-computation apparatus, a side effect determining apparatus and a system for calculating similarities between drugs and using the similarities |
US20200176128A1 (en) * | 2018-12-02 | 2020-06-04 | Ravipal Soin | Identifying Drug Side Effects |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105653846A (en) * | 2015-12-25 | 2016-06-08 | 中南大学 | Integrated similarity measurement and bi-directional random walk based pharmaceutical relocation method |
CN111951886A (en) * | 2019-05-17 | 2020-11-17 | 天津科技大学 | Drug relocation prediction method based on Bayesian inductive matrix completion |
CN111191014A (en) * | 2019-12-26 | 2020-05-22 | 上海科技发展有限公司 | Medicine relocation method, system, terminal and medium |
CN112216396A (en) * | 2020-10-14 | 2021-01-12 | 复旦大学 | Method for predicting drug-side effect relationship based on graph neural network |
Non-Patent Citations (4)
Title |
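---|
"NEUROCOMPUTING"; Zhang, Wen, et al.; NEUROCOMPUTING; Vol. 287; 154-162 * |
"drug relocation"; Wang, Jihong, et al.; "A Drug Target Interaction Prediction Based on LINE-RF Learning"; Vol. 15 (No. 7); 750-757 * |
"Research progress in predicting drug side effects based on cheminformatics methods"; Xue Bin, et al.; Computers and Applied Chemistry; Vol. 36 (No. 5); 487-490 * |
"Drug repositioning recommendation algorithm based on multi-similarity fusion"; Chen Peng, et al.; Computer Technology and Development; Vol. 31 (No. 1); 168-174 * |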
Also Published As
Publication number | Publication date |
---|---|
CN113160879A (en) | 2021-07-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||