CN113160879B - Method for predicting drug repositioning through side effect based on network learning - Google Patents

Method for predicting drug repositioning through side effect based on network learning

Info

Publication number
CN113160879B
Authority
CN
China
Prior art keywords
medicine
similarity
side effects
side effect
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110448039.XA
Other languages
Chinese (zh)
Other versions
CN113160879A (en)
Inventor
韦嘉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jixukang Biotechnology Co ltd
Original Assignee
Shanghai Jixukang Biotechnology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-04-25
Filing date
2021-04-25
Publication date
2023-11-28
Application filed by Shanghai Jixukang Biotechnology Co ltd
Priority to CN202110448039.XA
Publication of CN113160879A
Application granted
Publication of CN113160879B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/30 Drug targeting using structural data; Docking or binding prediction
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/40 ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Toxicology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a method for predicting drug repositioning through side effects based on network learning, comprising the following steps: 1) constructing a unique side-effect fingerprint for each drug as a 0/1 vector; 2) calculating the similarity between drugs using the Jaccard Index; 3) randomly selecting, 10,000 times, from the full set of side-effect data a set of side effects equal in number to those of drug A, determining whether the similarity between drug A and drug B is better than the similarity obtained with the random sets of the same size, and retaining only results in which the similarity between drug A and drug B is significantly better than random; 4) predicting the potential indications of a drug from the MeSH information of its neighboring drugs in the network; 5) using the EASE score to calculate the degree of enrichment in the side-effect network and ranking the candidate indications of drug A. In the test, 61 drugs were selected for prediction and the indications of 41 of them were accurately predicted, showing a good prediction effect.

Description

Method for predicting drug repositioning through side effect based on network learning
Technical Field
The invention relates to the technical field of drug research and development, in particular to a method for predicting drug repositioning through side effects based on network learning.
Background
Drug development has long been inefficient. Compared with developing a novel drug, which requires a large investment of money and time, exploring the untapped potential of known drugs can markedly shorten development time and reduce the risk of drug toxicity. However, past successes in drug repositioning have often depended on chance. Recent studies have proposed predicting new indications from drug-induced gene expression patterns or from the structure and function of compounds and proteins in order to reduce development costs. These methods tend to study the molecular mechanism of action (MOA) from the genotype perspective, but preclinical results based on MOA correlate poorly with actual efficacy during drug development: it is estimated that, of the drugs effective in cell-based assays, only about thirty percent are effective in animals and only about five percent are effective in humans. This gap between MOA and physiological response may limit the practical value of such drug repositioning methods.
When a drug binds an unintended target, it can disturb normal metabolism and signaling pathways and produce side effects. The side effects of a drug can therefore be regarded as valuable indicators of how the drug acts on the human body, offering a new line of thought and broad prospects for drug repositioning research. Only a few studies have addressed this aspect. In 2010, a prediction model named 'SIDER', covering 996 drugs and 4192 drug effects, was developed; in 2011, a research group built a prediction model named 'DRoSEF', based on the PharmGKB database and covering 145 diseases, which is expected to perform better as its disease coverage expands. Drugs with similar side effects also tend to share certain therapeutic properties. New indications of a drug can therefore be predicted from a complete side-effect catalog: an overall network is built from the side-effect similarities of different drugs, and the range of application of a drug is predicted from the distribution of functions of its neighboring drugs in the network. The method for predicting drug repositioning through side effects based on network learning is proposed to address the above problems.
Disclosure of Invention
(I) Technical problems to be solved
In view of the technical problems in the prior art, the invention provides a method for predicting drug repositioning through side effects based on network learning. The method predicts new indications of drugs from a complete side-effect catalog and builds an overall network from the side-effect similarities of different drugs. It addresses the problems that drug development has long been inefficient, that developing novel drugs requires a large investment of money and time, and that past successes in drug repositioning have often depended on chance.
(II) Technical solution
The technical solution for solving the above technical problems is as follows: a method for predicting drug repositioning through side effects based on network learning, comprising the following steps:
1) Construct a unique side-effect fingerprint for each drug as a 0/1 vector; that is, each of the 2183 drugs has a 6495-dimensional vector representing its side effects;
2) Calculate the similarity between every pair of drugs using the Jaccard Index, with the following formula:
$$J(A,B) = \frac{c}{a + b - c}$$
where a and b are the numbers of side effects of drugs A and B respectively, and c is the number of side effects shared by drugs A and B;
3) From the full set of side-effect data, randomly select, 10,000 times, a set of side effects equal in number to the side effects of drug A, and determine whether the similarity between drug A and drug B is better than the similarity obtained with the randomly selected sets of the same size; only similarity results in which the similarity between drug A and drug B is significantly better than that of the random selections are retained;
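4) A Z-score is used to measure the significance of this difference, as follows:
$$Z = \frac{J(A,B) - \mu_{\mathrm{rand}}}{\sigma_{\mathrm{rand}}}$$
where J(A,B) is the observed similarity of drugs A and B, and \mu_{\mathrm{rand}} and \sigma_{\mathrm{rand}} are the mean and standard deviation of the similarities obtained from the random selections; the Z-score threshold is set to be greater than or equal to 2.576;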
5) Because MeSH contains hierarchically organized disease information, the potential indications of a drug can be predicted from the MeSH information of its neighboring drugs in the network;
6) Use the EASE score to calculate the degree of enrichment in the side-effect network and rank the candidate indications of drug A;
7) Finally, the model is a large network composed of the similarity sub-networks between any two drugs. Normalized discounted cumulative gain (NDCG), originally used in the information retrieval field to evaluate web search engine algorithms by measuring the usefulness of documents in a result list, is used to calculate the ranking accuracy of the drug prediction results.
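Steps 1) and 2) above can be illustrated with a minimal Python sketch. The function names and the toy drugs and side effects are hypothetical and chosen only for illustration; the actual data set described here has 2183 drugs and 6495 side effects.

```python
def build_fingerprints(drug_side_effects, all_side_effects):
    """Return {drug: 0/1 list} with one position per known side effect."""
    index = {se: i for i, se in enumerate(all_side_effects)}
    fingerprints = {}
    for drug, effects in drug_side_effects.items():
        vec = [0] * len(all_side_effects)
        for se in effects:
            vec[index[se]] = 1
        fingerprints[drug] = vec
    return fingerprints


def jaccard(fp_a, fp_b):
    """Jaccard Index c / (a + b - c) of two 0/1 fingerprints."""
    a = sum(fp_a)                               # side effects of drug A
    b = sum(fp_b)                               # side effects of drug B
    c = sum(x & y for x, y in zip(fp_a, fp_b))  # shared side effects
    union = a + b - c
    return c / union if union else 0.0


# Toy data (hypothetical, not taken from the patent's data set).
side_effects = ["nausea", "headache", "rash", "dizziness", "insomnia"]
drugs = {
    "drug_A": {"nausea", "headache", "rash"},
    "drug_B": {"headache", "rash", "dizziness"},
}
fps = build_fingerprints(drugs, side_effects)
similarity = jaccard(fps["drug_A"], fps["drug_B"])
print(similarity)  # a = 3, b = 3, c = 2, so 2 / (3 + 3 - 2) = 0.5
```

Returning 0.0 for two empty fingerprints is a design choice of this sketch; the source does not say how drugs without recorded side effects are handled.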
(III) Beneficial effects
Compared with the prior art, the invention provides a method for predicting drug repositioning through side effects based on network learning, which has the following beneficial effects:
In this method for predicting drug repositioning through side effects based on network learning, normalized discounted cumulative gain (NDCG), a measure originally used in the information retrieval field to evaluate how web search engine algorithms prioritize search results, is applied for the first time to rank the similarity of drug side effects. Among existing methods for prediction based on drug side effects, the side-effect data used here are the most comprehensive and authoritative, covering 6495 side effects of 2183 drugs in total, and this broad coverage supports the robustness and generality of the constructed model. In a test on 98 drugs randomly selected from the SIDER model, 84.69% of the drugs passed threshold screening in the model. 61 of these drugs were selected for prediction, and the indications of 41 of them were accurately predicted. Of the 106 drug-indication predictions obtained for these 41 drugs, 50.94% are supported by FDA approval, other clinical trials, or the scientific literature. This shows a good prediction effect and indicates that new drug indications can be effectively predicted from drug side effects.
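For readers unfamiliar with NDCG, the following Python sketch shows one common formulation, with gain rel_i / log2(i + 1) at rank i. The source does not state which NDCG variant is used, so this formula, the function names, and the binary relevance labels are assumptions for illustration only.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: sum of rel_i / log2(i + 1), ranks from 1."""
    return sum(rel / math.log2(i + 1) for i, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical ranked predictions for one drug:
# 1 = the predicted indication is a known indication, 0 = it is not.
ranked = [1, 0, 1, 0, 0]
score = ndcg(ranked)
print(round(score, 4))
```

A score of 1.0 means every known indication is ranked ahead of every unknown one; pushing known indications further down the list lowers the score.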
Drawings
Fig. 1 is a schematic flow chart of a method for predicting drug repositioning through side effects based on network learning.
Detailed Description
The technical solutions of the embodiments of the present invention will be clearly and completely described below in conjunction with the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, a method for predicting drug repositioning through side effects based on network learning includes the following steps:
1) Construct a unique side-effect fingerprint for each drug as a 0/1 vector; that is, each of the 2183 drugs has a 6495-dimensional vector representing its side effects;
2) Calculate the similarity between every pair of drugs using the Jaccard Index, with the following formula:
$$J(A,B) = \frac{c}{a + b - c}$$
where a and b are the numbers of side effects of drugs A and B respectively, and c is the number of side effects shared by drugs A and B;
3) From the full set of side-effect data, randomly select, 10,000 times, a set of side effects equal in number to the side effects of drug A, and determine whether the similarity between drug A and drug B is better than the similarity obtained with the randomly selected sets of the same size; only similarity results in which the similarity between drug A and drug B is significantly better than that of the random selections are retained;
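4) A Z-score is used to measure the significance of this difference, as follows:
$$Z = \frac{J(A,B) - \mu_{\mathrm{rand}}}{\sigma_{\mathrm{rand}}}$$
where J(A,B) is the observed similarity of drugs A and B, and \mu_{\mathrm{rand}} and \sigma_{\mathrm{rand}} are the mean and standard deviation of the similarities obtained from the random selections; the Z-score threshold is set to be greater than or equal to 2.576;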
5) Because MeSH contains hierarchically organized disease information, the potential indications of a drug can be predicted from the MeSH information of its neighboring drugs in the network;
6) Use the EASE score to calculate the degree of enrichment in the side-effect network and rank the candidate indications of drug A;
7) Finally, the model is a large network composed of the similarity sub-networks between any two drugs. Normalized discounted cumulative gain (NDCG), originally used in the information retrieval field to evaluate web search engine algorithms by measuring the usefulness of documents in a result list, is used to calculate the ranking accuracy of the drug prediction results.
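The random-selection test of steps 3) and 4) might be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the null distribution is built by replacing drug A's side effects with random sets of the same size and comparing each to drug B, the Z-score is the usual (observed - mean) / standard deviation, and all names and toy data are hypothetical.

```python
import random
import statistics

Z_THRESHOLD = 2.576  # threshold stated in step 4)


def jaccard_sets(s, t):
    """Jaccard Index of two sets of side effects."""
    union = len(s | t)
    return len(s & t) / union if union else 0.0


def similarity_z_score(effects_a, effects_b, all_side_effects, n_iter=10_000, seed=0):
    """Z-score of the observed A-B similarity against random sets sized like A."""
    rng = random.Random(seed)
    observed = jaccard_sets(effects_a, effects_b)
    pool = sorted(all_side_effects)  # fixed order so the seed is reproducible
    null = []
    for _ in range(n_iter):
        random_a = set(rng.sample(pool, len(effects_a)))
        null.append(jaccard_sets(random_a, effects_b))
    mu = statistics.fmean(null)
    sigma = statistics.pstdev(null)
    if sigma == 0:
        return float("inf") if observed > mu else 0.0
    return (observed - mu) / sigma


# Toy universe of 200 side effects (hypothetical identifiers).
universe = {f"se{i}" for i in range(200)}
drug_a = {f"se{i}" for i in range(0, 20)}
drug_b = {f"se{i}" for i in range(5, 25)}     # shares 15 side effects with A
drug_c = {f"se{i}" for i in range(100, 120)}  # shares none with A

z_ab = similarity_z_score(drug_a, drug_b, universe)
z_ac = similarity_z_score(drug_a, drug_c, universe)
keep_ab = z_ab >= Z_THRESHOLD  # edge A-B is retained in the network
keep_ac = z_ac >= Z_THRESHOLD  # edge A-C is discarded
```

The value 2.576 is the two-sided 99% critical value of the standard normal distribution, so the threshold corresponds roughly to p < 0.01 if the random similarities are approximately normal.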
The drug side-effect dictionary is based on the fifteenth edition of Meyler's Side Effects of Drugs; for drugs introduced after 2006, side effects are taken from the 2007-2012 drug side-effect reports and FDA-approved indication data.
Meanwhile, the MedDRA vocabulary is used as the standard terminology and hierarchy to integrate catalog data from different sources and avoid semantic redundancy.
The FDA-approved drug indication information is converted to MeSH headings, yielding 6495 clinical side effects for 2183 drugs and 994 fourth-level MeSH terms.
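With indications coded as MeSH terms, steps 5) and 6) could look roughly like the sketch below. It assumes the EASE score is the conservative Fisher exact p-value popularized by the DAVID tool, in which one hit is removed from the list before the one-sided test; the source does not spell out the computation, and the exact contingency table, the function names, and the toy drugs are assumptions of this sketch.

```python
from math import comb


def fisher_upper_tail(a, b, c, d):
    """One-sided Fisher exact p-value P(X >= a) for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favorable = sum(
        comb(col1, x) * comb(total - col1, row1 - x) for x in range(a, upper + 1)
    )
    return favorable / comb(total, row1)


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """Fisher p-value after removing one hit from the list (assumed EASE form)."""
    if list_hits <= 0:
        return 1.0
    a = list_hits - 1                # neighbors with the term, minus one
    b = list_total - list_hits       # neighbors without the term
    c = pop_hits - list_hits         # other drugs with the term
    d = (pop_total - list_total) - c # other drugs without the term
    return fisher_upper_tail(a, b, c, d)


def rank_indications(neighbors, background):
    """Rank MeSH terms of the neighbors by ascending EASE score.

    Both arguments map drug -> set of MeSH terms; neighbors is a subset of background.
    """
    terms = set().union(*neighbors.values())
    scored = []
    for term in terms:
        k = sum(term in s for s in neighbors.values())
        big_k = sum(term in s for s in background.values())
        scored.append((ease_score(k, len(neighbors), big_k, len(background)), term))
    return sorted(scored)


# Toy network (hypothetical): 20 drugs, 5 of them neighbors of drug A.
background = {f"drug_{i}": {"Migraine"} for i in range(20)}
for i in (0, 1, 2, 3, 5):
    background[f"drug_{i}"] = {"Hypertension"}
for i in (4, 6, 7, 8, 9, 10):
    background[f"drug_{i}"] = {"Asthma"}
neighbors = {f"drug_{i}": background[f"drug_{i}"] for i in range(5)}

ranked = rank_indications(neighbors, background)
```

In this toy network "Hypertension" ranks first because four of the five neighbors carry it while it is rare elsewhere; a term seen in only one neighbor always scores 1.0, which is the conservative behavior the one-hit removal is meant to produce.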
Experimental case: in a test on 98 drugs randomly selected from the SIDER model, 84.69% of the drugs passed threshold screening in the model. 61 of these drugs were selected for prediction, and the indications of 41 of them were accurately predicted. Of the 106 top-five-ranked drug-indication predictions obtained for these 41 drugs, 50.94% are supported by FDA approval, other clinical trials, or the scientific literature, which shows a good prediction effect and indicates that new drug indications can be effectively predicted from drug side effects, as shown in the following table:
Drug-indication pairs    Number    Percentage
FDA-approved             37        34.91%
Clinical                 10        9.43%
Preclinical              7         6.60%
Unknown                  52        49.06%
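As a quick consistency check, the table rows can be tied back to the percentages quoted in the text. Reading 84.69% as 83 of 98 drugs is an inference from the stated figures, not a count given in the source.

```latex
\begin{align*}
37 + 10 + 7 + 52 &= 106, \\
\frac{37 + 10 + 7}{106} = \frac{54}{106} &\approx 50.94\%, \\
\frac{83}{98} &\approx 84.69\%.
\end{align*}
```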
The beneficial effects of the invention are as follows: in this method for predicting drug repositioning through side effects based on network learning, normalized discounted cumulative gain (NDCG), a measure originally used in the information retrieval field to evaluate how web search engine algorithms prioritize search results, is applied for the first time to rank the similarity of drug side effects. Among existing methods for prediction based on drug side effects, the side-effect data used here are the most comprehensive and authoritative, covering 6495 side effects of 2183 drugs in total, and this broad coverage supports the robustness and generality of the constructed model. In a test on 98 drugs randomly selected from the SIDER model, 84.69% of the drugs passed threshold screening in the model. 61 of these drugs were selected for prediction, and the indications of 41 of them were accurately predicted. Of the 106 drug-indication predictions obtained for these 41 drugs, 50.94% are supported by FDA approval, other clinical trials, or the scientific literature. This shows a good prediction effect and indicates that new drug indications can be effectively predicted from drug side effects.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (1)

1. A method for predicting drug repositioning through side effects based on network learning, characterized by comprising the following steps:
1) Construct a unique side-effect fingerprint for each drug as a 0/1 vector; that is, each of the 2183 drugs has a 6495-dimensional vector representing its side effects;
2) Calculate the similarity between every pair of drugs using the Jaccard Index, with the following formula:
$$J(A,B) = \frac{c}{a + b - c}$$
where a and b are the numbers of side effects of drugs A and B respectively, and c is the number of side effects shared by drugs A and B;
3) From the full set of side-effect data, randomly select, 10,000 times, a set of side effects equal in number to the side effects of drug A, and determine whether the similarity between drug A and drug B is better than the similarity obtained with the randomly selected sets of the same size; only similarity results in which the similarity between drug A and drug B is significantly better than that of the random selections are retained;
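4) A Z-score is used to measure the significance of this difference, as follows:
$$Z = \frac{J(A,B) - \mu_{\mathrm{rand}}}{\sigma_{\mathrm{rand}}}$$
where J(A,B) is the observed similarity of drugs A and B, and \mu_{\mathrm{rand}} and \sigma_{\mathrm{rand}} are the mean and standard deviation of the similarities obtained from the random selections; the Z-score threshold is set to be greater than or equal to 2.576;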
5) Because MeSH contains hierarchically organized disease information, the potential indications of a drug can be predicted from the MeSH information of its neighboring drugs in the network;
6) Use the EASE score to calculate the degree of enrichment in the side-effect network and rank the candidate indications of drug A;
7) Finally, the model is a large network composed of the similarity sub-networks between any two drugs. Normalized discounted cumulative gain (NDCG), originally used in the information retrieval field to evaluate web search engine algorithms by measuring the usefulness of documents in a result list, is used to calculate the ranking accuracy of the drug prediction results.
CN202110448039.XA 2021-04-25 2021-04-25 Method for predicting drug repositioning through side effect based on network learning Active CN113160879B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110448039.XA CN113160879B (en) 2021-04-25 2021-04-25 Method for predicting drug repositioning through side effect based on network learning

Publications (2)

Publication Number Publication Date
CN113160879A (en) 2021-07-23
CN113160879B (en) 2023-11-28

Family

ID=76870224

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant