CN111540419A - Anti-senile dementia drug effectiveness prediction system based on deep learning - Google Patents

Info

Publication number
CN111540419A
Authority
CN
China
Prior art keywords
data
effectiveness
prediction
model
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010347311.0A
Other languages
Chinese (zh)
Inventor
邱卫东
鲁静文
王昊
唐鹏
郭捷
黄征
王强民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN202010347311.0A
Publication of CN111540419A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures

Landscapes

  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

An anti-senile dementia drug effectiveness prediction system based on deep learning, comprising: an anti-senile dementia traditional Chinese medicine effectiveness prediction module based on prescription attributes, and an anti-senile dementia drug effectiveness prediction module based on molecular compound characteristics, wherein: the anti-senile dementia traditional Chinese medicine effectiveness prediction module trains on the drug attributes in traditional Chinese medicine prescriptions and outputs a traditional Chinese medicine prescription prediction model; and the anti-senile dementia drug effectiveness prediction module trains on molecular compound characteristic values and outputs a molecular compound prediction model. The invention uses the drug attributes in traditional Chinese medicine prescriptions and the characteristics of molecular compounds as training data for deep learning models, and uses the trained models to predict drug effectiveness accurately.

Description

Anti-senile dementia drug effectiveness prediction system based on deep learning
Technical Field
The invention relates to applied artificial intelligence, and in particular to an anti-senile dementia drug effectiveness prediction system based on deep learning, which uses deep learning to discriminate and predict the effectiveness of the compound molecules contained in drugs.
Background
In traditional Chinese medicine research, the literature records many traditional Chinese medicine formulas for treating conditions similar to senile dementia, such as dementia and amnesia. Conventional traditional Chinese medicine research follows a repeated cycle of 'separation and extraction, activity detection, separation and extraction, activity detection, …'. This approach studies single components only; it lacks a systematic view, requires high investment, and is inefficient.
With the rise of data mining and machine learning, machine learning and even deep learning methods have come to support traditional Chinese medicine research. At present, machine learning research on traditional Chinese medicines focuses mainly on the relationship between medicine components and symptoms. Such methods analyze only the components, so both the analysis mode and the analysis data are limited to a single type.
Disclosure of Invention
To address the shortcomings of existing research on anti-senile dementia drugs, the invention combines artificial intelligence technology to provide an anti-senile dementia drug effectiveness prediction system based on deep learning, in which a deep learning model is trained on the drug attributes in traditional Chinese medicine prescriptions and on the characteristics of molecular compounds, and the model is then used to predict drug effectiveness accurately.
The invention is realized by the following technical scheme:
The invention comprises: an anti-senile dementia traditional Chinese medicine effectiveness prediction module based on prescription attributes, and an anti-senile dementia drug effectiveness prediction module based on molecular compound characteristics, wherein: the anti-senile dementia traditional Chinese medicine effectiveness prediction module trains on the drug attributes in traditional Chinese medicine prescriptions and outputs a traditional Chinese medicine prescription prediction model; and the anti-senile dementia drug effectiveness prediction module trains on molecular compound characteristic values and outputs a molecular compound prediction model.
The anti-senile dementia traditional Chinese medicine effectiveness prediction module comprises: a traditional Chinese medicine attribute calculation unit, an attribute-based deep learning training unit, and a traditional Chinese medicine prescription effectiveness prediction unit, wherein: the traditional Chinese medicine attribute calculation unit standardizes traditional Chinese medicine prescriptions into binary (0/1) arrays; in the deep learning training unit, a data division part splits the standardized prescription data into a training set and a test set, and a training part receives the training set and generates a deep learning prediction model, which is provided to the traditional Chinese medicine prescription effectiveness prediction unit; and the traditional Chinese medicine prescription effectiveness prediction unit converts an input prescription into standardized prescription data and passes it to the deep learning prediction model to obtain a prescription effectiveness prediction result.
The data standardization refers to: recording 22 attributes covering nature, flavor, and channel tropism according to the Chinese Pharmacopoeia (2015 edition), tallying these attributes for every medicinal material in a prescription, and converting the textual prescription into a binary (0/1) array for processing by the deep learning training unit.
The deep learning prediction model is arranged in the traditional Chinese medicine prescription effectiveness prediction unit, and its architecture comprises: a data reading submodule, a model training submodule, and a data prediction submodule, wherein: the data reading submodule reads the standardized prescription data generated by the traditional Chinese medicine attribute calculation unit, and passes the data to the model training submodule in the training stage and to the data prediction submodule in the prediction stage; the model training submodule divides the data set into a training set and a test set in the training stage, the training set being fed to the model input layer to train the machine learning model and the test set being used to evaluate the model's predictive performance; and the data prediction submodule takes the best-performing model from the model training submodule and uses it to predict the effectiveness of traditional Chinese medicine prescriptions whose effect is unknown.
The anti-senile dementia drug effectiveness prediction module comprises: a molecular compound feature extraction unit, a feature-based deep learning training unit, and a compound effectiveness prediction unit, wherein: the molecular compound feature extraction unit extracts characteristic values of molecular compounds from their molecular features; the deep learning training unit reads the standardized compound data generated by the molecular compound feature extraction unit, its data division part splits the data set into a training set and a test set, and its training part receives the training set and generates a deep learning prediction model, which is provided to the compound effectiveness prediction unit; and the compound effectiveness prediction unit receives standardized compound data and passes it to the deep learning prediction model to obtain a compound effectiveness prediction result.
The characteristic value extraction refers to: building a standardized data entry for each compound from the characteristic data of the compound molecule, such as balabanJ, BCUT_PEOE_0, GCUT_SMR_3, and PEOE_VSA-6.
The characteristic data of a compound molecule are calculated by MOE (Molecular Operating Environment, an integrated software package for molecular simulation and drug design) from compound molecule data obtained from BindingDB.
The deep learning prediction model is arranged in the compound effectiveness prediction unit, and its architecture comprises: a data preprocessing submodule, a model training submodule, and a compound effectiveness prediction submodule, wherein: the data preprocessing submodule applies dimensionality reduction and standardization to the molecular compound features extracted by the molecular compound feature extraction unit, and passes the data to the model training submodule in the training stage and to the compound effectiveness prediction submodule in the prediction stage; the model training submodule divides the reduced and standardized molecular compound characteristic values into a training set and a test set, models the training set with a deep learning method, and uses the test set to evaluate the model; and the compound effectiveness prediction submodule takes the best-performing model from the model training submodule and uses it to predict the effectiveness of molecular compounds whose effect is unknown.
Technical effects
The invention integrally solves the prediction problem of a traditional Chinese medicine prescription for resisting senile dementia based on the pharmaceutical attributes and the screening problem of a molecular compound effective on target proteins of the senile dementia based on the characteristics of the molecular compound.
Compared with the prior art, the invention performs modeling and learning from two aspects, namely drug attributes and the molecular compounds contained in drugs: (1) at the drug attribute level, deep learning is used to model the drug attributes in traditional Chinese medicine formulas; (2) at the molecular compound level, deep learning is used to model the characteristics of the compounds contained in drugs. In this way, the effectiveness of anti-senile dementia drugs can be predicted, and guidance can be offered for synthesizing new anti-senile dementia drugs, without additional manual supervision or hardware. The invention trains deep learning models on the drug attributes in traditional Chinese medicine prescriptions and on the characteristics of molecular compounds and uses the models to predict drug effectiveness; the accuracy reaches 94% for traditional Chinese medicine prescriptions and 88% for molecular compounds. The method is flexible and can be extended to drug effectiveness prediction for other diseases.
Drawings
FIG. 1 is a block diagram of the system of the present invention.
Detailed Description
As shown in FIG. 1, this embodiment relates to an anti-senile dementia drug effectiveness prediction system based on deep learning, which comprises: an anti-senile dementia traditional Chinese medicine effectiveness prediction module based on prescription attributes, and an anti-senile dementia drug effectiveness prediction module based on molecular compound characteristics.
This embodiment also relates to a method of predicting anti-senile dementia drug effectiveness with the above system, comprising the following steps:
Step one, data collection. A total of 224 qualifying prescriptions published from 1988 to the present are extracted from literature databases such as CNKI (China National Knowledge Infrastructure), VIP, and Wanfang; the syndrome type, treatment method, prescription, source, and effective rate of each are recorded in detail, and a database of anti-senile dementia traditional Chinese medicine prescriptions is established.
Step two, data processing by the traditional Chinese medicine attribute calculation unit, which specifically comprises:
2.1) All medicinal materials appearing in the prescriptions of the anti-senile dementia traditional Chinese medicine prescription database are numerically encoded. The encoding rule is as follows: according to the Chinese Pharmacopoeia (2015 edition), nature, flavor, and channel tropism are broken down into 'cold, hot, warm, cool, neutral', 'sour, bitter, sweet, pungent, salty', and 'heart, kidney, lung, liver, spleen, stomach, gallbladder, large intestine, bladder, triple energizer, and small intestine channels', for a total of 22 attributes. Each medicinal material is checked against these attributes in order: the attribute value is set to 1 if the material has the attribute and to 0 if it does not. Each medicinal material is thus encoded as a one-dimensional array of 22 values, each 0 or 1.
2.2) Each prescription in the database is encoded as a two-dimensional numerical matrix. The encoding rule is as follows: the medicinal materials in the prescription are converted in order into the 22-element arrays of step 2.1), which form the row vectors of the matrix.
A prescription contains at most 18 medicinal materials; prescriptions with fewer than 18 are zero-padded. Each prescription is thus encoded as a numerical matrix of 18 rows and 22 columns.
2.3) Each prescription in the database is labeled for effectiveness according to the effective rate recorded for it in the CNKI, VIP, and Wanfang literature databases.
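As an illustration only, the encoding of steps 2.1) and 2.2) could be sketched in Python as follows. The function names, the example prescription, and its attribute sets are hypothetical. The text counts 22 attributes but names only 21, so the sketch reserves an unnamed 22nd slot rather than guessing what it is.

```python
NATURES = ["cold", "hot", "warm", "cool", "neutral"]
FLAVORS = ["sour", "bitter", "sweet", "pungent", "salty"]
CHANNELS = ["heart", "kidney", "lung", "liver", "spleen", "stomach",
            "gallbladder", "large intestine", "bladder",
            "triple energizer", "small intestine"]
# Placeholder for the 22nd attribute, which the text counts but does not name.
ATTRIBUTES = NATURES + FLAVORS + CHANNELS + ["unnamed_attribute_22"]

MAX_HERBS = 18  # largest number of medicinal materials in one prescription


def encode_herb(herb_attributes):
    """Return a 22-element list of 0/1 flags for one medicinal material."""
    present = set(herb_attributes)
    return [1 if name in present else 0 for name in ATTRIBUTES]


def encode_prescription(herbs):
    """Return an 18 x 22 matrix: one row per herb, zero rows as padding."""
    if len(herbs) > MAX_HERBS:
        raise ValueError("prescription has more than 18 medicinal materials")
    rows = [encode_herb(attrs) for attrs in herbs]
    padding = [[0] * len(ATTRIBUTES) for _ in range(MAX_HERBS - len(rows))]
    return rows + padding


# Hypothetical two-herb prescription; the attribute sets are illustrative only.
example = [
    {"warm", "sweet", "spleen", "lung"},
    {"cold", "bitter", "heart", "liver"},
]
matrix = encode_prescription(example)
print(len(matrix), len(matrix[0]))  # 18 22
```

Rows 3 to 18 of `matrix` are all zeros here, which corresponds to the zero-padding described for prescriptions with fewer than 18 medicinal materials.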
Step three, model training by the deep learning training unit based on prescription attributes, which specifically comprises:
3.1) Training the regression models:
3.1.1) A multilayer perceptron is selected as regression model A, with the following parameters: the input data has a height of 18 and a width of 22. The network comprises 3 hidden layers of 128 nodes each, with a dropout of 0.5. The learning rate is 0.001, the optimizer is Adam, the loss function is mean squared error, and the activation function is ReLU. The network is trained for 300 epochs.
3.1.2) The 224 prescriptions processed in step two are divided into a training set of 174 and a test set of 50; the training set is used to train the multilayer perceptron described above and the test set to evaluate it, yielding traditional Chinese medicine prescription effectiveness prediction model A.
3.1.3) A convolutional neural network is selected as regression model B, with the following parameters: the input data has a height of 18, a width of 22, and a depth of 1. The network contains four convolutional layers (32, 64, 64, and 32 filters of size 3 × 3), a pooling layer (2 × 2 filters), and one fully connected layer (128 nodes). The dropout is 0.8. The learning rate is 0.001, the optimizer is Adam, the loss function is mean squared error, and the activation function is ReLU. The network is trained for 400 epochs.
3.1.4) The 224 prescriptions processed in step two are divided into a training set of 174 and a test set of 50; the training set is used to train the convolutional neural network described above and the test set to evaluate it, yielding traditional Chinese medicine prescription effectiveness prediction model B.
3.2) Training the classification models:
3.2.1) A multilayer perceptron is selected as classification model C, with the following parameters: the input data has a height of 18 and a width of 22. The network comprises 3 hidden layers of 32 nodes each, with a dropout of 0.8; the hidden layers use the ReLU activation function and the output layer uses softmax. The learning rate is 0.001, the optimizer is Adam, and the loss function is cross entropy. The network is trained for 80 epochs.
3.2.2) The 224 prescriptions processed in step two are divided into a training set of 174 and a test set of 50, and the training set is input into the multilayer perceptron for training, yielding traditional Chinese medicine prescription effectiveness prediction model C.
3.2.3) A convolutional neural network is selected as classification model D, with the following parameters: the input data has a height of 18, a width of 22, and a depth of 1. The network contains four convolutional layers (32, 64, 64, and 32 filters of size 1 × 1), pooling layers (1 × 1 filters), and one fully connected layer (32 nodes), with a dropout rate of 0.8. The hidden layers use the ReLU activation function and the output layer uses softmax. The learning rate is 0.001, the optimizer is Adam, and the loss function is cross entropy. The network is trained for 80 epochs.
3.2.4) The 224 prescriptions processed in step two are divided into a training set of 164 and a test set of 60, and the training set is input into the convolutional neural network for training, yielding traditional Chinese medicine prescription effectiveness prediction model D.
The performance of all models is shown in the following table:
Model      Accuracy   Loss
Model A    0.76       0.01
Model B    0.84       0.01
Model C    0.93       0.27
Model D    0.94       0.30
Step four, prediction of traditional Chinese medicine prescription effectiveness. A prescription is read, converted into the standard data structure of step two, and input into traditional Chinese medicine prescription effectiveness prediction models A, B, C, and D; the outputs of the four models are considered together to judge whether the prescription is effective.
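The text does not state how the four outputs are combined. One possible reading, shown here purely as an assumption, is a majority vote in which each regression output is first turned into a vote by a cut-off. The function name, the `threshold` value, and the handling of ties are all hypothetical.

```python
def combine_predictions(reg_a, reg_b, cls_c, cls_d, threshold=0.5):
    """Majority vote over four model outputs (an assumed combination rule).

    reg_a, reg_b: regression outputs, read here as a predicted effective rate.
    cls_c, cls_d: class labels from the classifiers, 1 = effective, 0 = not.
    threshold: hypothetical cut-off turning a regression output into a vote.
    """
    votes = [
        1 if reg_a >= threshold else 0,
        1 if reg_b >= threshold else 0,
        int(cls_c),
        int(cls_d),
    ]
    effective_votes = sum(votes)
    # A 2-2 tie is reported as undecided rather than forced either way.
    if effective_votes * 2 == len(votes):
        return "undecided"
    return "effective" if effective_votes * 2 > len(votes) else "ineffective"


print(combine_predictions(0.82, 0.77, 1, 0))  # effective
```

Other rules, such as weighting each model by its test accuracy, would fit the description equally well.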
The anti-senile dementia drug effectiveness prediction based on molecular compound characteristics comprises the following steps:
Step one, data collection. Structural data of the compounds contained in drugs for treating senile dementia, and structural data of active molecules, are collected from traditional Chinese medicine compound databases such as TCMSP and TCMID and from small-molecule drug databases such as BindingDB and DrugBank. The data set in this embodiment contains 2710 molecular compound entries in total.
Step two, molecular compound feature extraction, which specifically comprises:
2.1) Characteristic values are extracted for each molecular compound from its molecular features: 354 features in total, such as balabanJ, BCUT_PEOE_0, GCUT_SMR_3, and PEOE_VSA-6 (including the feature recording the effectiveness of the compound molecule against senile dementia targets). One feature entry of 354 values is established for each of the 2710 molecular compounds in the data set.
2.2) The 354 features of the 2710 molecular compounds are min-max normalized: for each feature, the minimum value of that feature over the 2710 compounds is subtracted from the compound's value, and the result is divided by the range of the feature. After normalization, every feature value of a molecular compound lies in the interval 0 to 1.
2.3) XGBoost is selected as the machine learning program for feature engineering, with the following main parameters: the input length is 354, the booster is tree-based (gbtree), the objective is multi:softmax, the maximum tree depth is 6, and gamma is 0.1. Training runs for 500 rounds. Using ten-fold cross-validation, 1500 entries are selected from the data set and input into the XGBoost model, which outputs an importance score for each of the 354 features; the features that rank in the top 50 by importance, and the number of times each does so, are tallied over the ten folds. Based on these statistics, the high-importance features are screened out of the 354, giving 47 selected features in total.
The 47 features are as follows (all features are calculated and interpreted using MOE software):
[Table of the 47 selected features; reproduced as an image (Figure BDA0002470598980000051) in the original publication]
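The normalization of step 2.2) and the tallying part of step 2.3) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the XGBoost training itself is not shown, the per-fold importance dictionaries are toy values, and `min_folds` is a hypothetical cut-off, since the text does not say how often a feature had to appear in the top 50 to be kept.

```python
from collections import Counter


def min_max_normalize(column):
    """Scale one feature column to [0, 1] as described in step 2.2."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant feature: assumed mapping to 0
    return [(v - lo) / (hi - lo) for v in column]


def select_features(fold_importances, top_k=50, min_folds=1):
    """Count how often each feature ranks in the top_k across folds.

    fold_importances: one dict per fold mapping feature name -> importance.
    min_folds: hypothetical cut-off; the text does not give the one it used.
    """
    counts = Counter()
    for importances in fold_importances:
        ranked = sorted(importances, key=importances.get, reverse=True)
        counts.update(ranked[:top_k])
    return sorted(name for name, n in counts.items() if n >= min_folds)


print(min_max_normalize([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]

# Toy importances for three folds and four hypothetical feature names.
folds = [
    {"f1": 0.5, "f2": 0.3, "f3": 0.1, "f4": 0.1},
    {"f1": 0.4, "f2": 0.1, "f3": 0.4, "f4": 0.1},
    {"f1": 0.6, "f2": 0.2, "f3": 0.1, "f4": 0.1},
]
print(select_features(folds, top_k=2, min_folds=2))  # ['f1', 'f2']
```

In the setting described above, `fold_importances` would hold ten dictionaries of 354 entries each and `top_k` would be 50.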
Step three, model training by the deep learning unit based on molecular compound characteristics, which specifically comprises:
3.1) A deep neural network (DNN) is selected as the classification model, with the following structure and parameters: the input length is 47. The network has 4 hidden layers, with 64 nodes in the first layer, 32 in the second, 32 in the third, and 16 in the fourth. The dropout ratio is 0.2 for the first and fourth layers and 0.5 for the second and third layers. The learning rate is 0.00005, the optimizer is Adam, the loss function is cross entropy, the hidden layers use the ReLU activation function, and the output layer uses softmax. The network is trained for 80 epochs.
3.2) The molecular compound data set processed in step two is divided into a training set and a test set at a ratio of 1:3, and the training set is input into the deep neural network for training, yielding the compound molecule effectiveness prediction model.
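To give a sense of the size of the network in step 3.1), the following sketch counts the weights and biases of its fully connected layers. The two-unit softmax output (effective versus ineffective) is an assumption, since the text gives the output activation but not the output size; the function name is hypothetical.

```python
# Layer widths from step 3.1; the two-unit softmax output (effective vs.
# ineffective) is an assumption, since the text does not give the output size.
layer_sizes = [47, 64, 32, 32, 16, 2]
dropout = [0.2, 0.5, 0.5, 0.2]  # one rate per hidden layer, in order


def dense_parameter_count(sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


for index, (n_in, n_out) in enumerate(zip(layer_sizes, layer_sizes[1:]), 1):
    print(f"layer {index}: {n_in} -> {n_out}, {n_in * n_out + n_out} parameters")
print("total:", dense_parameter_count(layer_sizes))  # total: 6770
```

Under that assumption the network has 6770 trainable parameters, against a data set of 2710 compounds.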
Step four, prediction of compound molecule effectiveness. A compound molecule is read, converted into the standard data structure of step two, and input into the compound molecule effectiveness prediction model, which outputs the effectiveness prediction result.
Evaluation of the model:
               Accuracy   Precision   Recall
Training set   0.90       0.73        1.00
Test set       0.88       0.74        1.00
Accuracy is the proportion of all samples whose prediction is correct.
Precision is the proportion of truly effective samples among the samples predicted to be effective.
Recall is the proportion of all truly effective samples that are predicted to be effective.
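These three definitions can be written out directly. The sketch below uses made-up labels (1 = effective, 0 = ineffective), not the patent's data, and the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels, 1 = effective."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Toy labels: every effective sample is found, one ineffective one is flagged.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0]
print(classification_metrics(y_true, y_pred))  # (0.875, 0.75, 1.0)
```

The toy case shows the same pattern as the table above: a recall of 1.00 alongside a lower precision means no effective sample is missed, while some ineffective samples are predicted to be effective.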
Through traditional Chinese medicine attribute calculation and molecular compound characteristic calculation, the invention converts drug attributes into numerical matrices. It selects 22 drug attributes of traditional Chinese medicine formulas as the features for predicting the effectiveness of senile dementia formulas and trains a neural network on these 22 feature values. It also calculates 354 characteristic values of molecular compounds, designs and implements a dimensionality reduction method, and screens out 47 characteristic values that are effective for senile dementia targets as the training features of the model.
In practical experiments, the system was run with Python commands in a Linux environment, and the following experimental data were obtained: the highest accuracy reaches 94% for traditional Chinese medicine prescription effectiveness prediction and 88% for molecular compound effectiveness prediction.
Compared with the prior art, the invention achieves an accuracy of 94% in predicting the effectiveness of traditional Chinese medicine prescriptions against senile dementia from traditional Chinese medicine attributes, and an accuracy of 88% in predicting the effectiveness of compounds against senile dementia targets from the screened molecular compound characteristics.
Those skilled in the art may modify the foregoing embodiments in many different ways without departing from the spirit and scope of the invention. The scope of the invention is defined by the appended claims, and all changes that come within the meaning and range of equivalency of the claims are intended to be embraced therein.

Claims (8)

1. An anti-senile dementia drug effectiveness prediction system based on deep learning, comprising: an anti-senile dementia traditional Chinese medicine effectiveness prediction module based on prescription attributes, and an anti-senile dementia drug effectiveness prediction module based on molecular compound characteristics, wherein: the anti-senile dementia traditional Chinese medicine effectiveness prediction module trains on the drug attributes in traditional Chinese medicine prescriptions and outputs a traditional Chinese medicine prescription prediction model; the anti-senile dementia drug effectiveness prediction module trains on molecular compound characteristic values and outputs a molecular compound prediction model;
the anti-senile dementia traditional Chinese medicine effectiveness prediction module comprises: a traditional Chinese medicine attribute calculation unit, an attribute-based deep learning training unit, and a traditional Chinese medicine prescription effectiveness prediction unit, wherein: the traditional Chinese medicine attribute calculation unit standardizes traditional Chinese medicine prescriptions into binary (0/1) arrays; in the deep learning training unit, a data division part splits the standardized prescription data into a training set and a test set, and a training part receives the training set and generates a deep learning prediction model, which is provided to the traditional Chinese medicine prescription effectiveness prediction unit; and the traditional Chinese medicine prescription effectiveness prediction unit converts an input prescription into standardized prescription data and passes it to the deep learning prediction model to obtain a prescription effectiveness prediction result;
the anti-senile dementia drug effectiveness prediction module comprises: a molecular compound feature extraction unit, a feature-based deep learning training unit, and a compound effectiveness prediction unit, wherein: the molecular compound feature extraction unit extracts characteristic values of molecular compounds from their molecular features; the deep learning training unit reads the standardized compound data generated by the molecular compound feature extraction unit, its data division part splits the data set into a training set and a test set, and its training part receives the training set and generates a deep learning prediction model, which is provided to the compound effectiveness prediction unit; and the compound effectiveness prediction unit receives standardized compound data and passes it to the deep learning prediction model to obtain a compound effectiveness prediction result.
2. The system for predicting the effectiveness of anti-senile dementia drugs according to claim 1, wherein the data normalization process is: according to the record of 'Chinese pharmacopoeia' 2015 edition, 22 attributes including sex, taste and channel tropism are recorded, attribute statistics is carried out on all the medicinal materials in the prescription, and the character prescription is converted into a binary character array for deep learning training unit processing.
3. The system for predicting the effectiveness of anti-senile dementia drugs according to claim 1, wherein the deep learning prediction model is disposed in a unit for predicting the effectiveness of Chinese herbal prescriptions, and the specific architecture of the model comprises: a data reading sub-module, a model training sub-module, and a data prediction sub-module, wherein: the data reading submodule reads the standardized prescription data generated by the traditional Chinese medicine attribute calculation unit, inputs the data into the model training submodule in the model training stage and inputs the data into the data prediction submodule in the prediction stage; the model training submodule divides a data set into a training set and test set data information in a model training stage, the training set is input to a model input layer and used for training a machine learning model, and the test set is used for evaluating the prediction effect of the model; and the data prediction submodule acquires a model with the optimal effect in the model training submodule to predict the effectiveness of the traditional Chinese medicine prescription with unknown effect.
4. The system for predicting the effectiveness of anti-senile dementia drugs according to claim 1, wherein the extracted characteristic values are obtained as follows: a standardized data entry for each compound is established from characteristic data of the compound molecule, such as balabanJ, BCUT_PEOE_0, GCUT_SMR_3, and PEOE_VSA-6.
5. The system for predicting the effectiveness of a deep learning-based anti-senile dementia drug according to claim 1, wherein the characteristic data of the compound molecule is calculated by MOE based on the compound molecule data from BindingDB.
6. The system for predicting the effectiveness of an anti-senile dementia drug based on deep learning of claim 1, wherein the deep learning prediction model is disposed in the compound effectiveness prediction unit, and the model has a specific structure comprising: a data preprocessing sub-module, a model training sub-module, and a compound effectiveness prediction sub-module, wherein: the data preprocessing submodule performs dimensionality reduction and standardization on the molecular compound features extracted by the molecular compound feature extraction unit, and inputs the data into the model training submodule in the model training stage and into the compound effectiveness prediction submodule in the prediction stage; the model training submodule divides the dimensionality-reduced and standardized molecular compound characteristic values into a training set and a test set, uses a deep learning method to model and learn from the training set, and uses the test set to evaluate the model; and the compound effectiveness prediction submodule obtains the best-performing model from the model training submodule to predict the effectiveness of molecular compounds with unknown effect.
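The standardization and data division performed by the submodules of claim 6 might look roughly like the following sketch. It assumes z-score standardization and a seeded random 80/20 split, neither of which is specified in the claim; dimensionality reduction and the deep learning model itself are omitted, and the function names are hypothetical.

```python
import random
import statistics


def standardize(rows):
    """Z-score each feature column; constant columns are left at 0.0."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # A zero standard deviation is replaced by 1.0 to avoid division by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def train_test_split(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle sample indices with a fixed seed and split them into train and test."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(rows) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(ids):
        return [rows[i] for i in ids], [labels[i] for i in ids]

    return pick(train_idx), pick(test_idx)
```

The sketch follows the order given in the claim (standardize, then split); fitting the standardization statistics on the training set alone is a common alternative that avoids leaking test-set information.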
7. A method for predicting the effectiveness of an anti-senile dementia drug based on deep learning according to any one of claims 1 to 6, comprising the steps of:
step one, data collection;
step two, the data processing of the traditional Chinese medicine attribute calculation unit specifically comprises the following steps:
2.1) carrying out numerical value coding on all medicinal materials related to the formula in the anti-senile dementia traditional Chinese medicine formula database;
2.2) coding the prescriptions in the database into a two-dimensional numerical matrix;
2.3) labeling each prescription in the database with an effectiveness label;
step three, training a model of the deep learning training unit based on the prescription attributes;
step four, predicting the effectiveness of the traditional Chinese medicine prescription.
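Sub-steps 2.1 to 2.3 of claim 7 can be sketched as follows. This is one possible reading, in which each herb receives an integer code and each prescription becomes one 0/1 row of a two-dimensional matrix; the herb names and function names are hypothetical placeholders.

```python
def build_herb_codes(prescriptions):
    """Step 2.1: assign every herb that appears in the database an integer code."""
    herbs = sorted({herb for prescription in prescriptions for herb in prescription})
    return {herb: code for code, herb in enumerate(herbs)}


def encode_prescriptions(prescriptions, codes):
    """Step 2.2: encode the prescriptions as a matrix with one row per prescription."""
    matrix = []
    for prescription in prescriptions:
        row = [0] * len(codes)
        for herb in prescription:
            row[codes[herb]] = 1  # mark the herb as present
        matrix.append(row)
    return matrix


def attach_labels(matrix, labels):
    """Step 2.3: pair each encoded prescription with its effectiveness label."""
    if len(matrix) != len(labels):
        raise ValueError("one label is required per prescription")
    return list(zip(matrix, labels))
```

For example, two prescriptions `[["herb_b", "herb_a"], ["herb_c"]]` yield the codes `herb_a: 0`, `herb_b: 1`, `herb_c: 2` and a 2-by-3 matrix.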
8. A method for predicting the effectiveness of a molecular-compound-based anti-senile dementia Chinese medicine using the system of any one of claims 1 to 6, comprising the following steps:
step one, data collection;
step two, extracting the characteristics of the molecular compound, which specifically comprises the following steps:
2.1) extracting characteristic numerical values of molecular compounds according to molecular characteristics;
2.2) carrying out numerical value normalization treatment on each characteristic of the molecular compound one by one;
2.3) selecting XGBoost as the machine learning program for feature engineering, selecting 1500 data entries from the data set with a ten-fold cross-validation method, inputting them into an XGBoost learning model, outputting 'feature importance' results for 354 features, and selecting the features with high importance;
step three, training a model of a deep learning unit based on the characteristics of the molecular compound;
step four, predicting the effectiveness of the compound molecule: reading the compound molecule, converting it into the standard data structure according to step two, inputting the standard data structure into the compound molecule effectiveness prediction model, and outputting an effectiveness prediction result from the model.
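The feature-processing sub-steps of step two in claim 8 can be sketched as follows, under stated assumptions: min-max scaling stands in for the unspecified normalization of step 2.2, the ten folds are built by simple striding, and the importance scores are taken as an input dictionary rather than computed by XGBoost. All function names are hypothetical.

```python
def min_max_normalize(column):
    """Step 2.2: scale one feature column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant feature carries no information
    return [(value - lo) / (hi - lo) for value in column]


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k-fold cross-validation."""
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for held_out, test in enumerate(folds):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


def top_features(importances, keep):
    """Step 2.3: keep the `keep` features with the highest importance scores."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:keep]
```

With 1500 entries and ten folds, each fold holds out 150 entries and trains on the remaining 1350. In a full pipeline the importance dictionary would be produced by the trained XGBoost model for the 354 features named in the claim.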
CN202010347311.0A 2020-04-28 2020-04-28 Anti-senile dementia drug effectiveness prediction system based on deep learning Pending CN111540419A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010347311.0A CN111540419A (en) 2020-04-28 2020-04-28 Anti-senile dementia drug effectiveness prediction system based on deep learning

Publications (1)

Publication Number Publication Date
CN111540419A true CN111540419A (en) 2020-08-14

Family

ID=71970189

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010347311.0A Pending CN111540419A (en) 2020-04-28 2020-04-28 Anti-senile dementia drug effectiveness prediction system based on deep learning

Country Status (1)

Country Link
CN (1) CN111540419A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112086145A (en) * 2020-09-02 2020-12-15 腾讯科技(深圳)有限公司 Compound activity prediction method and device, electronic equipment and storage medium
CN113066549A (en) * 2021-04-06 2021-07-02 青岛瑞斯凯尔生物科技有限公司 Clinical effectiveness evaluation method and system of medical instrument based on artificial intelligence
CN113990510A (en) * 2021-10-29 2022-01-28 山东师范大学 Acute cerebral infarction traditional Chinese medicine prescription treatment effect prediction system based on machine learning

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108985001A (en) * 2017-06-05 2018-12-11 欧阳德方 A kind of pharmaceutical preparation prediction technique
CN109033738A (en) * 2018-07-09 2018-12-18 湖南大学 A kind of pharmaceutical activity prediction technique based on deep learning
CN109741797A (en) * 2018-12-10 2019-05-10 中国药科大学 A method of small molecule compound water solubility grade is predicted using depth learning technology
US20190164632A1 (en) * 2017-09-25 2019-05-30 Syntekabio Co., Ltd. Drug indication and response prediction systems and method using ai deep learning based on convergence of different category data
CN109979541A (en) * 2019-03-20 2019-07-05 四川大学 Medicament molecule pharmacokinetic property and toxicity prediction method based on capsule network
WO2019144700A1 (en) * 2018-01-23 2019-08-01 上海市同济医院 Deep learning-based quick and precise high-throughput drug screening system
CN110942808A (en) * 2019-12-10 2020-03-31 山东大学 Prognosis prediction method and prediction system based on gene big data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘根等: "基于中医传承辅助平台对老年性痴呆防治方剂核心药物组合的筛选研究" *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112086145A (en) * 2020-09-02 2020-12-15 腾讯科技(深圳)有限公司 Compound activity prediction method and device, electronic equipment and storage medium
CN112086145B (en) * 2020-09-02 2024-04-16 腾讯科技(深圳)有限公司 Compound activity prediction method and device, electronic equipment and storage medium
CN113066549A (en) * 2021-04-06 2021-07-02 青岛瑞斯凯尔生物科技有限公司 Clinical effectiveness evaluation method and system of medical instrument based on artificial intelligence
CN113066549B (en) * 2021-04-06 2022-07-26 青岛瑞斯凯尔生物科技有限公司 Clinical effectiveness evaluation method and system of medical instrument based on artificial intelligence
CN113990510A (en) * 2021-10-29 2022-01-28 山东师范大学 Acute cerebral infarction traditional Chinese medicine prescription treatment effect prediction system based on machine learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination