CN104346513A - Chinese herbal medicinal ingredient and compound hepatotoxin evaluation system based on propelling decision-making tree - Google Patents

Publication number
CN104346513A
Authority
CN
China
Prior art keywords
compound
chinese medicine
traditional chinese
medicine ingredients
liver
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310344934.2A
Other languages
Chinese (zh)
Inventor
朱永亮
叶立
王新洲
叶祖光
金若敏
姚广涛
刘敬阁
钱向平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Neupharma Co Ltd
Original Assignee
Suzhou Neupharma Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Neupharma Co Ltd
Priority to CN201310344934.2A
Publication of CN104346513A
Legal status: Pending

Landscapes

  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a method for evaluating and predicting the hepatotoxicity of traditional Chinese medicine ingredients and compounds. The method comprises the following steps: first, determining the chemical structure of the traditional Chinese medicine ingredient or compound to be evaluated from a database of traditional Chinese medicine ingredients and compounds or with a structure-building tool; second, evaluating hepatotoxicity from that chemical structure with the system's built-in hepatotoxicity prediction model, which is based on a boosted decision tree algorithm; and finally obtaining an evaluation and prediction conclusion.

Description

Hepatotoxicity evaluation system for traditional Chinese medicine ingredients and compounds based on a boosted decision tree
Technical field
The present invention relates to a computer-aided method, and a corresponding system, for evaluating the hepatotoxicity of traditional Chinese medicine ingredients and compounds.
Background art
The liver is a vertebrate (including human) organ whose main function is metabolism; within the body it detoxifies, stores glycogen, and synthesizes secreted proteins. During drug development and use, drug-induced hepatotoxicity is one of the main reasons that new drug development fails or that a drug is withdrawn from the market. Major pharmaceutical companies therefore pay increasing attention to hepatotoxicity evaluation methods for the early stages of drug development. In China, traditional Chinese medicine is a main means of treating disease, but a growing number of traditional Chinese medicines have been reported and confirmed to be hepatotoxic, which seriously endangers the safe use of these medicines. A comprehensive evaluation of the hepatotoxicity of traditional Chinese medicine ingredients and compounds is therefore urgently needed in order to optimize clinical use.
Checking the hepatotoxicity of traditional Chinese medicine ingredients and compounds with traditional experimental techniques usually takes a long time and costs a great deal, so it is difficult to evaluate the very large number of such ingredients and compounds in this way. In recent years many research groups have tried to study hepatotoxicity using toxicogenomics and have achieved some results, but biochip technology still has theoretical and technical limitations, so the evaluation accuracy of this approach is not high. The rapid development of computational toxicology, and its wide application to the toxicity assessment of environmental compounds, indicates that rapid prediction of the hepatotoxicity of traditional Chinese medicine ingredients and compounds by computational toxicology methods is feasible. Computational toxicology applies quantitative structure-activity relationships (QSAR) to predict the biological activity of a compound directly from its molecular structure. This helps optimize the design of animal experiments, reduces heavy experimental work and high costs, and decreases the number of test animals; it is now widely used for compound toxicity prediction in drug design and has achieved good results. Our survey of existing compound hepatotoxicity prediction models found that nearly all of them use training sets made up of synthetic compounds (including drugs), which do not reflect the structural diversity of traditional Chinese medicine ingredients and so may not transfer well to predicting their hepatotoxicity. Because tree models are easy to understand and implement and are well suited to classifying non-numeric data, they have already been applied successfully to compound toxicity prediction. For example, Zhang Zhenshan et al. used the positive decision tree algorithm to build a decision forest model composed of 80 individual decision trees, which performed well in predicting the carcinogenicity of compounds; Cheng and Dixon used recursive partitioning and ensemble learning to build a decision-tree-based compound hepatotoxicity prediction model, which predicted an external data set very well. By contrast, the hepatotoxicity prediction models that Fourches et al. built with support vector machines had a prediction accuracy of 55.7-72.6% on external data sets, lower than the models built with decision trees.
Summary of the invention
In view of the above, the present invention uses hepatotoxicity data for marketed drugs and for traditional Chinese medicine ingredients and compounds reported in the literature as a training set, builds a hepatotoxicity prediction model for traditional Chinese medicine ingredients and compounds based on a boosted decision tree algorithm, and thereby provides a fast and relatively accurate method and system for evaluating the hepatotoxicity of traditional Chinese medicine ingredients and compounds.
To accomplish this, the present invention adopts the following technical solution:
A first aspect of the present invention provides a method for evaluating and predicting the hepatotoxicity of traditional Chinese medicine ingredients and compounds. The method comprises: step one, determining the chemical structure of the traditional Chinese medicine ingredient or compound to be evaluated by means of a structure database of traditional Chinese medicine ingredients and compounds or a structure-building tool; step two, on the basis of that chemical structure, evaluating its hepatotoxicity with the system's built-in hepatotoxicity prediction model for traditional Chinese medicine ingredients and compounds, which is based on a boosted decision tree algorithm, and finally drawing an evaluation and prediction conclusion.
The "hepatotoxicity prediction model for traditional Chinese medicine ingredients and compounds based on a boosted decision tree algorithm" of step two is obtained by statistical modeling of the molecular structures of traditional Chinese medicine ingredients and compounds with existing hepatotoxicity data against their hepatotoxicity attributes, using molecular descriptors, descriptor screening, and the boosted decision tree method. The detailed modeling process is as follows:
1. Data
The model is built on a training set of 348 traditional Chinese medicine ingredients or compounds: 286 compounds (221 hepatotoxic and 65 non-hepatotoxic) from the FDA Liver Toxicity Knowledge Base (LTKB) and 62 traditional Chinese medicine ingredients whose hepatotoxicity is known from literature reports. LTKB is a project of the U.S. FDA National Center for Toxicological Research on drug-induced liver injury. Its aim is to help drug developers understand the mechanisms of liver injury and to give drug developers, researchers, and administrators a reference for drug safety assessment. The project collects and curates many kinds of data, including hepatotoxicity mechanisms, mechanisms of drug action, targets, and side effects, then analyzes these data comprehensively with systems biology methods and assesses the liver injury characteristics of each individual drug. Importantly, in the evaluation stage itself, both conventional methods and high-throughput molecular methods are used to evaluate the liver injury caused by the selected drugs, which greatly strengthens the credibility of the data.
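As a quick arithmetic check on the training set described above, the sketch below tallies its class composition. The LTKB counts (221 and 65) and the total of 348 come from this section; the overall split of 256 hepatotoxic and 92 non-hepatotoxic is the figure given in the system description of this document. The split of the 62 literature-reported ingredients is derived from those numbers and is not stated directly, and the majority-class baseline is added only as context for reading accuracy figures.

```python
# Training-set composition. Counts come from the text; derived values are labeled.
LTKB_TOXIC, LTKB_NONTOXIC = 221, 65   # stated for the LTKB subset
TCM_TOTAL = 62                        # literature-reported TCM ingredients
ALL_TOXIC, ALL_NONTOXIC = 256, 92     # overall split stated in the system description

ltkb_total = LTKB_TOXIC + LTKB_NONTOXIC
total = ltkb_total + TCM_TOTAL

# Derived (not stated in the text): class split of the 62 TCM ingredients.
tcm_toxic = ALL_TOXIC - LTKB_TOXIC
tcm_nontoxic = ALL_NONTOXIC - LTKB_NONTOXIC

# Accuracy of always predicting the majority class, for context only.
majority_baseline = max(ALL_TOXIC, ALL_NONTOXIC) / total

print(ltkb_total, total, tcm_toxic, tcm_nontoxic, round(majority_baseline, 3))
```

The check confirms that the stated subtotals are mutually consistent; it says nothing about how the individual compounds were labeled.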
2. Model construction
2.1 Calculation of molecular descriptors. Once the 2D structures of these 348 traditional Chinese medicine ingredients or compounds had been collected, Mold2 (Molecular 2D Descriptors Generator Software) was used to calculate 2D descriptors for each of them. Mold2 was developed by the bioinformatics group of the U.S. National Center for Toxicological Research. It is a fast, free 2D molecular descriptor calculator that computes 777 2D descriptors from a compound's 2D structure and is suitable for descriptor calculation on compound sets of different sizes.
2.2 Screening of molecular descriptors. First, we removed every descriptor whose calculated value was constant across more than 90% of the traditional Chinese medicine ingredients or compounds. Then, for any pair of descriptors with a pairwise correlation coefficient above 0.9, we removed one of the two, so that no strong dependence remained between descriptors; multicollinearity among the descriptors was then removed as well. In the descriptor selection itself, resampling was used to assess repeatedly how well models built from descriptor sets of different sizes predicted the traditional Chinese medicine ingredient and compound data. Models were built with the random forest algorithm, and cross-validation was used during model building. Finally, the best descriptor set was chosen for the final model by reference to the predictive performance of the models built from descriptor sets of different sizes. The concrete steps are as follows:
1. use resampling to split the data into a training set and a test set;
2. build a prediction model on the training set using all descriptors, evaluate its predictions on the test set, and at the same time rank the variables used to build the model according to the prediction results;
3. choose different numbers of the most important descriptors, build a model for each on the training data with the random forest algorithm, evaluate the models by leave-one-out cross-validation, and use the best model to predict the test set;
4. repeat steps 1, 2, and 3, and statistically analyze the predictive performance of the models built from different numbers of the most important descriptors.
This procedure selected 35 optimal descriptors. Their codes in the Mold2 software are D026, D123, D144, D152, D173, D191, D253, D255, D299, D309, D374, D449, D456, D457, D460, D461, D464, D465, D471, D475, D476, D477, D485, D489, D521, D539, D565, D572, D580, D588, D674, D677, D747, D775, and D777, covering topological indices, information indices, Burden eigenvalues, constitutional parameters, and the like.
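The two pre-filters that open the screening step (dropping descriptors that are constant in more than 90% of compounds, then dropping one of each pair correlated above 0.9) can be sketched in plain Python as below. This is a minimal illustration, not the authors' code: the function names, the toy descriptor columns `D001` to `D004`, and the rule of keeping the earlier descriptor of a correlated pair are all assumptions, since the text does not say which member of a pair is discarded. The later random-forest ranking is not shown.

```python
from collections import Counter
from math import sqrt

def near_constant(column, threshold=0.9):
    """True if a single value accounts for more than `threshold` of the samples."""
    most_common_count = Counter(column).most_common(1)[0][1]
    return most_common_count / len(column) > threshold

def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either column has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def prefilter(descriptors, const_threshold=0.9, corr_threshold=0.9):
    """descriptors: dict of name -> list of values, one value per compound.
    Returns the names that survive both filters, in input order."""
    kept = []
    for name, column in descriptors.items():
        if near_constant(column, const_threshold):
            continue
        # Assumption: of a highly correlated pair, the earlier descriptor is kept.
        if any(abs(pearson(column, descriptors[k])) > corr_threshold for k in kept):
            continue
        kept.append(name)
    return kept

# Hypothetical toy data: 4 descriptors computed for 11 compounds.
toy = {
    "D001": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2],         # same value in 10 of 11 compounds
    "D002": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
    "D003": [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22],  # exactly 2 * D002
    "D004": [5, 1, 4, 2, 9, 3, 8, 1, 7, 2, 6],
}
print(prefilter(toy))
```

On real Mold2 output the same two functions would be applied to all 777 descriptor columns before any model-based ranking.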
2.3 Model construction and evaluation. The boosted decision tree algorithm is likewise an ensemble classifier. Each sample in the training set is first given a weight, and that weight is then adjusted according to whether the classifier predicted the sample correctly; the final prediction is determined jointly by the classifiers produced over the iterations. In building the model, the number of iterations was set to 10, and the modeling method was assessed by leave-one-out cross-validation. The resulting model reached an accuracy of 82%.
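The weight-and-reweight scheme described above can be illustrated with a small boosting sketch. The source does not give the details of its algorithm, so this example swaps in discrete AdaBoost over one-level decision trees (stumps) as a common representative of the family; the 10 rounds and the leave-one-out loop follow the text, while the function names and the toy data are hypothetical.

```python
from math import exp, log

def fit_stump(X, y, w):
    """Best single-feature threshold rule under sample weights w (labels are +1/-1)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] <= t else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def stump_predict(stump, row):
    _, j, t, sign = stump
    return sign if row[j] <= t else -sign

def adaboost_fit(X, y, rounds=10):
    n = len(X)
    w = [1.0 / n] * n                       # every sample starts with equal weight
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0) on perfect fits
        alpha = 0.5 * log((1 - err) / err)  # vote strength of this round's classifier
        model.append((alpha, stump))
        # Raise the weights of misclassified samples, lower the rest, renormalize.
        w = [wi * exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    """Weighted vote of all rounds' classifiers."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

def leave_one_out_accuracy(X, y, rounds=10):
    hits = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        model = adaboost_fit(X_train, y_train, rounds)
        hits += adaboost_predict(model, X[i]) == y[i]
    return hits / len(X)

# Hypothetical toy data: two descriptor values per compound, +1 = hepatotoxic.
X = [[0.1, 1.0], [0.2, 0.8], [0.3, 0.9], [0.4, 1.1],
     [0.6, 0.2], [0.7, 0.3], [0.8, 0.1], [0.9, 0.4]]
y = [-1, -1, -1, -1, 1, 1, 1, 1]
print(leave_one_out_accuracy(X, y))
```

Leave-one-out retrains the whole ensemble once per compound, which is affordable at the scale of a few hundred samples but grows quadratically with the training set.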
Twenty-two traditional Chinese medicine ingredients not included in the training set were used as an external test set. The prediction model reached an accuracy of 68% on this set, and its accuracy on the 16 hepatotoxic ingredients was 81%. The 22 ingredients are matrine, oxymatrine, triptolide, triptonide, Celastrol, wilforine, Rutaecarpine, rutaecarpin, evodin, synephrine, Dioscin, aristolochic acid A, hanfangchin B, pokeroot saponin A, enoxolone, andrographolide, toosendanin, loganin, Puerarin, Gardenoside, aesculin, and schizandrin. The hepatotoxicity of the 22 test ingredients in the normal human hepatocyte cell line HL7702 (cell bank of the Chinese Academy of Sciences, Shanghai) is shown in the table below.
Table 1. Hepatotoxicity predictions of the optimal model for the 22 external traditional Chinese medicine ingredients
Note: Y: hepatotoxic; N: non-hepatotoxic.
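The two reported percentages can be turned back into counts, which also shows how the model fared on the non-hepatotoxic ingredients. The sketch below assumes the percentages in the text are rounded to whole numbers; the resulting counts, and the figure for the 6 non-hepatotoxic ingredients, are derived here and are not stated in the source.

```python
# External test set sizes stated in the text.
N_TOTAL, N_TOXIC = 22, 16
N_NONTOXIC = N_TOTAL - N_TOXIC

def count_matching(n, reported_percent):
    """Every k in 0..n for which k/n rounds to the reported whole percent."""
    return [k for k in range(n + 1) if round(100 * k / n) == reported_percent]

correct_overall = count_matching(N_TOTAL, 68)  # candidates for "68% of 22"
correct_toxic = count_matching(N_TOXIC, 81)    # candidates for "81% of 16"
print(correct_overall, correct_toxic)

# Only meaningful if each percentage maps to exactly one count.
if len(correct_overall) == 1 and len(correct_toxic) == 1:
    correct_nontoxic = correct_overall[0] - correct_toxic[0]
    print(correct_nontoxic, N_NONTOXIC, round(100 * correct_nontoxic / N_NONTOXIC))
```

Under that rounding assumption, most of the external test errors fall on the non-hepatotoxic ingredients, which is worth keeping in mind when reading the overall accuracy.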
A second aspect of the present invention provides a system for evaluating and predicting the hepatotoxicity of traditional Chinese medicine ingredients and compounds. The system comprises at least: an input module, for entering the names and/or chemical structures of traditional Chinese medicine ingredients and compounds; an output module, for displaying information about the computation and for summarizing and outputting the prediction results; a storage module, for storing, reading, and managing computer program files, structure files of traditional Chinese medicine ingredients and compounds, activity data files, configuration files, temporary files, and history files; a structure database of traditional Chinese medicine ingredients and compounds, for storing and managing their various names and 2D structure data; a hepatotoxicity database of traditional Chinese medicine ingredients and compounds, for storing and managing ingredients and compounds of known hepatotoxicity, including their names, hepatotoxicity attributes, and experimental conditions, with data drawn from the 286 compounds (221 hepatotoxic and 65 non-hepatotoxic) in the U.S. Food and Drug Administration Liver Toxicity Knowledge Base (LTKB) and from 62 literature-reported traditional Chinese medicine ingredients of known hepatotoxicity, which together form the training set (348 ingredients or compounds in total, 256 hepatotoxic and 92 non-hepatotoxic); a toxicity prediction module, for evaluating and predicting the hepatotoxicity of the traditional Chinese medicine ingredients and compounds under test; and a data processing module, the core module of the system, which connects the modules above and manages the data flow in real time. The information in the above databases is updated from time to time, and new traditional Chinese medicine ingredients or compounds are added.
The method and system of the present invention can be used in the early stages of new drug research and development to evaluate and screen drugs under development for hepatotoxicity, saving considerable money and time.
The method and system of the present invention can also be used to evaluate the adverse reactions and contraindications of existing drugs. The results of evaluation and prediction by the method and system can serve as a reference for judging a drug's adverse reactions and contraindications.
Using the method and system of the present invention, the hepatotoxicity of compounds can be evaluated and predicted quickly, comprehensively, and accurately, providing reliable reference and guidance for new drug development and application.
Embodiment
This embodiment concerns the hepatotoxicity evaluation and prediction system of the present invention for traditional Chinese medicine ingredients and compounds.
A hepatotoxicity evaluation and prediction system of the present invention for traditional Chinese medicine ingredients and compounds comprises:
(1) an input module, for entering and/or retrieving the chemical structures of traditional Chinese medicine ingredients and compounds and for responding to the user's operations;
(2) an output module, for displaying information about the computation and outputting the prediction results;
(3) a storage module, for storing, reading, and managing computer program files, structure files of traditional Chinese medicine ingredients and compounds, activity data files, configuration files, temporary files, and history files;
(4) a structure database of traditional Chinese medicine ingredients and compounds, for storing and managing their various names and 2D structure data;
(5) a hepatotoxicity database of traditional Chinese medicine ingredients and compounds, for storing and managing compounds of known hepatotoxicity, including their names, hepatotoxicity attributes, and experimental conditions, with data drawn from the 286 compounds (221 hepatotoxic and 65 non-hepatotoxic) in the FDA Liver Toxicity Knowledge Base (LTKB) and from 62 literature-reported traditional Chinese medicine ingredients of known hepatotoxicity, which together form the training set (348 ingredients or compounds in total, 256 hepatotoxic and 92 non-hepatotoxic);
(6) a toxicity evaluation and prediction module, for evaluating and predicting the hepatotoxicity of the traditional Chinese medicine ingredients and compounds under test, this module being the hepatotoxicity prediction model for traditional Chinese medicine ingredients and compounds based on a boosted decision tree algorithm, implemented as a computer program;
(7) a data processing module, the core module of the system, which connects the modules and databases of the system and manages the data flow in real time.
Through the cooperation of the input module, the data processing module, and the databases, the system carries out compound entry, data retrieval, information processing, and output of evaluation results. The input module handles the entry and parsing of the traditional Chinese medicine ingredients and compounds to be predicted and responds to the user's operations. It also comprises: a name data table for traditional Chinese medicine ingredients and compounds, which stores their Chinese and English names, aliases, and database accession numbers, analyzes the input data in real time, and converts the names or chemical structures entered by the user into a sequence of accession numbers for later retrieval; and a compound name prompt submodule, which offers real-time suggestions while the user types a traditional medicine name, improving efficiency and reducing errors.
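One way to picture the name data table and the prompt submodule is a dictionary from normalized names to accession numbers plus a sorted name list for prefix lookup. The sketch below is an illustration under assumptions: the class `NameTable`, its methods, and the accession numbers `TCM0001` to `TCM0003` are hypothetical, and the source does not describe how the table is actually stored.

```python
from bisect import bisect_left

class NameTable:
    """Maps names and aliases of ingredients or compounds to accession numbers."""

    def __init__(self):
        self._by_name = {}   # normalized name -> accession number
        self._sorted = []    # sorted normalized names, used for prefix lookup

    @staticmethod
    def _norm(name):
        return name.strip().lower()

    def add(self, accession, *names):
        for name in names:
            key = self._norm(name)
            if key not in self._by_name:
                self._sorted.insert(bisect_left(self._sorted, key), key)
            self._by_name[key] = accession

    def resolve(self, names):
        """Convert user-entered names to an accession-number sequence.
        Unknown names map to None so the caller can report them."""
        return [self._by_name.get(self._norm(n)) for n in names]

    def suggest(self, prefix, limit=5):
        """Stored names that start with `prefix`, as a typing-time prompt."""
        key = self._norm(prefix)
        start = bisect_left(self._sorted, key)
        out = []
        for name in self._sorted[start:]:
            if not name.startswith(key) or len(out) == limit:
                break
            out.append(name)
        return out

table = NameTable()
table.add("TCM0001", "matrine", "苦参碱")   # accession numbers are made up
table.add("TCM0002", "oxymatrine")
table.add("TCM0003", "triptolide")
print(table.resolve(["Matrine", "triptolide", "unknown"]))
print(table.suggest("tri"))
```

Keeping the names sorted makes each prompt a binary search followed by a short scan, which stays responsive as the table grows.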
For the chemical structure of a traditional Chinese medicine ingredient or compound to be evaluated, the toxicity evaluation and prediction module uses its hepatotoxicity prediction model to calculate the relevant chemical descriptors and run the model prediction, and the final result is output through the data processing module.
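The flow just described (look up the structure, compute descriptors, run the model, hand the result to the output side) can be sketched as a small pipeline object. Everything here is a stand-in: `DataProcessingModule`, `Prediction`, the accession numbers, the structure strings, the two toy descriptors, and the toy decision rule are hypothetical and carry no chemical meaning; in the described system the descriptors would come from Mold2 and the decision from the boosted decision tree model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

@dataclass
class Prediction:
    accession: str
    hepatotoxic: bool

@dataclass
class DataProcessingModule:
    """Connects structure lookup, descriptor calculation, and the prediction model."""
    structures: Dict[str, str]                             # accession -> 2D structure
    compute_descriptors: Callable[[str], Sequence[float]]
    model: Callable[[Sequence[float]], bool]

    def evaluate(self, accessions: List[str]) -> List[Prediction]:
        results = []
        for acc in accessions:
            structure = self.structures[acc]               # structure database lookup
            features = self.compute_descriptors(structure)
            results.append(Prediction(acc, self.model(features)))
        return results

def toy_descriptors(structure: str) -> List[float]:
    # Placeholder for a real descriptor calculator: string length and count of "N".
    return [float(len(structure)), float(structure.count("N"))]

def toy_model(features: Sequence[float]) -> bool:
    # Placeholder rule with no chemical meaning; stands in for the trained model.
    return features[1] > 0

pipeline = DataProcessingModule(
    structures={"ACC1": "CCO", "ACC2": "CCN"},             # made-up entries
    compute_descriptors=toy_descriptors,
    model=toy_model,
)
for p in pipeline.evaluate(["ACC1", "ACC2"]):
    print(p.accession, "Y" if p.hepatotoxic else "N")
```

Passing the descriptor calculator and the model in as callables keeps the data processing module unchanged when the model is retrained or replaced.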
The hepatotoxicity prediction model for traditional Chinese medicine ingredients and compounds, built on the boosted decision tree algorithm, on which the toxicity evaluation and prediction module relies can be updated whenever the hepatotoxicity database of traditional Chinese medicine ingredients and compounds is updated.

Claims (2)

1. A method for evaluating the hepatotoxicity of traditional Chinese medicine ingredients and compounds based on a boosted decision tree algorithm, characterized in that:
(1) the basic hepatotoxicity data used to build the model contain both Western medicine compounds and traditional Chinese medicine ingredients;
(2) a resampling method combined with the random forest algorithm is used to select the optimal descriptor set for traditional Chinese medicine ingredients and compounds, which serves as their chemical characterization and is used to build the hepatotoxicity prediction model for traditional Chinese medicine ingredients and compounds, the descriptors covering topological indices, information indices, Burden eigenvalues, constitutional parameters, and the like;
(3) an improved boosted decision tree algorithm is used to mine the hepatotoxicity data of the training-set traditional Chinese medicine ingredients and compounds together with their corresponding chemical characterizations, yielding the hepatotoxicity prediction model for traditional Chinese medicine ingredients and compounds;
(4) a modular hepatotoxicity prediction program for traditional Chinese medicine ingredients and compounds, using the improved boosted decision tree algorithm, is developed in the PHP language.
2. A system for evaluating the hepatotoxicity of traditional Chinese medicine ingredients and compounds based on a boosted decision tree algorithm, characterized in that the prediction system comprises at least:
(1) an input module, for entering the names and/or chemical structures of traditional Chinese medicine ingredients and compounds;
(2) an output module, for displaying information about the computation and for summarizing and outputting the prediction results;
(3) a storage module, for storing, reading, and managing computer program files, structure files of traditional Chinese medicine ingredients and compounds, activity data files, configuration files, temporary files, and history files;
(4) a structure database of traditional Chinese medicine ingredients and compounds, for storing and managing their various names and 2D structure data;
(5) a hepatotoxicity database of traditional Chinese medicine ingredients and compounds, for storing and managing traditional Chinese medicine ingredients and compounds with hepatotoxicity data, including their names, hepatotoxicity attributes, and experimental conditions, with data drawn from the 286 compounds (221 hepatotoxic and 65 non-hepatotoxic) in the U.S. Food and Drug Administration Liver Toxicity Knowledge Base (LTKB) and from 62 literature-reported traditional Chinese medicine ingredients of known hepatotoxicity, which together form the training set (348 traditional Chinese medicine ingredients or compounds in total);
(6) a toxicity prediction module, for evaluating and predicting the hepatotoxicity of the traditional Chinese medicine ingredients and compounds under test;
(7) a data processing module, the core module of the system, which connects the modules above and manages the data flow in real time;
the information in the above databases being updated from time to time and new traditional Chinese medicine ingredients or compounds being added.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310344934.2A CN104346513A (en) 2013-08-09 2013-08-09 Chinese herbal medicinal ingredient and compound hepatotoxin evaluation system based on propelling decision-making tree

Publications (1)

Publication Number Publication Date
CN104346513A true CN104346513A (en) 2015-02-11

Family

ID=52502102

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310344934.2A Pending CN104346513A (en) 2013-08-09 2013-08-09 Chinese herbal medicinal ingredient and compound hepatotoxin evaluation system based on propelling decision-making tree

Country Status (1)

Country Link
CN (1) CN104346513A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106295887A (en) * 2016-08-12 2017-01-04 辽宁大学 Lasting seed bank Forecasting Methodology based on random forest
CN110600135A (en) * 2019-09-18 2019-12-20 东北大学 Breast cancer prediction system based on improved random forest algorithm
CN112820359A (en) * 2021-02-24 2021-05-18 北京中医药大学东直门医院 Liver injury prediction method, apparatus, device, medium, and program product
CN113628697A (en) * 2021-07-28 2021-11-09 上海基绪康生物科技有限公司 Random forest model training method for classification unbalance data optimization
CN114530212A (en) * 2022-01-11 2022-05-24 中国中医科学院中药研究所 Traditional Chinese medicine chemical component nephrotoxicity prediction and evaluation method
CN114530212B (en) * 2022-01-11 2024-05-31 中国中医科学院中药研究所 Method for predicting and evaluating nephrotoxicity of chemical components of traditional Chinese medicine

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150211