CN113140265A - Method and system for predicting hydrophilicity and hydrophobicity of nanoparticles based on face recognition - Google Patents
- Publication number
- CN113140265A (application number CN202110399850.3A)
- Authority
- CN
- China
- Prior art keywords
- nano
- hydrophilicity
- nanoparticles
- hydrophobicity
- face recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/30—Prediction of properties of chemical compounds, compositions or mixtures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/10—Analysis or design of chemical reactions, syntheses or processes
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/70—Machine learning, data mining or chemometrics
Abstract
The invention discloses a method for predicting the hydrophilicity and hydrophobicity of nanoparticles based on face recognition. A convolutional neural network, a deep learning technique widely applied to face recognition, is used to recognize a nanostructure image; three-dimensional structural features of the nanoparticles are extracted from the image, passed through repeated convolution and pooling operations, and a predicted value of the nanoparticle hydrophilicity and hydrophobicity is finally output. The method was tested on the prediction of the hydrophilicity and hydrophobicity of 147 kinds of nanoparticles, giving R² values of 0.70 or higher in both five-fold cross validation and external validation. With this method, researchers can estimate the hydrophilicity and hydrophobicity of nanoparticles from a nanostructure image without complex calculation of nano-descriptors, which provides data support and theoretical guidance for rationally designing nanoparticles with the required biological effect.
Description
Technical Field
The invention relates to the research field of artificial intelligence assisted nano material property prediction and design, in particular to a method and a system for predicting the hydrophilicity and hydrophobicity of nano particles based on face recognition.
Background
The hydrophilicity and hydrophobicity of a nanomaterial reflect whether it is hydrophilic or lipophilic, and are generally expressed by the oil-water distribution coefficient of the nanomaterial, namely the Nano-logP value. When Nano-logP is positive, the nanomaterial is lipophilic, and the larger the value, the better its fat solubility; conversely, when Nano-logP is negative, the nanomaterial tends to be hydrophilic, and the larger the absolute value, the stronger its hydrophilicity. Research shows that the hydrophilicity and hydrophobicity of nanomaterials are highly correlated with many of their biological effects, such as cell uptake, in vivo distribution, protein adsorption, cytotoxicity, immune response and pharmacokinetics. Hydrophilicity and hydrophobicity are therefore an important precondition and index for evaluating the biological effects of nanomaterials, and obtaining them rapidly and accurately is of great significance for guiding the rational design of nanomaterials.
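The sign convention above can be sketched in a few lines. This sketch assumes that Nano-logP follows the conventional partition-coefficient form, the base-10 logarithm of the ratio of the oil-phase to the water-phase concentration; the source does not give the measurement protocol, and the function names and the "balanced" label for a value of exactly zero are hypothetical.

```python
import math


def nano_logp(conc_oil: float, conc_water: float) -> float:
    """Base-10 log of the oil/water concentration ratio (assumed definition)."""
    if conc_oil <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_oil / conc_water)


def describe(value: float) -> str:
    """Map the sign of a Nano-logP value to the tendency described in the text."""
    if value > 0:
        return "lipophilic"
    if value < 0:
        return "hydrophilic"
    return "balanced"  # assumption: the text does not discuss a value of zero
```

Under this reading, the two measured values quoted later in the description, 2.39 and -1.12, would be labeled lipophilic and hydrophilic respectively.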
The hydrophilicity and hydrophobicity of the nano material can be measured by an experimental method. However, the experimental method has the disadvantages of high cost, time and labor consumption, and dependence on the technical level of equipment and testing personnel. In the face of a great variety and a huge number of nano materials, the hydrophilicity and the hydrophobicity of the nano materials are difficult to be measured one by one through an experimental method. Therefore, a non-experimental method for rapidly predicting and evaluating the hydrophilicity and hydrophobicity of the nano material is urgently needed to meet the requirement of novel nano material screening design.
An artificial intelligence method represented by machine learning and deep learning is used for constructing a reliable model through existing data, and properties of unknown objects can be accurately predicted. In recent decades, with the appearance of big data and the rapid development of computer hardware, artificial intelligence has been widely applied in many fields such as face recognition, automatic driving, intelligent medical treatment, etc. In the field of chemical informatics, machine learning and deep learning have also been successfully used for compound property prediction. However, due to the complicated structure of the nano-material, it is difficult to obtain a nano-descriptor capable of characterizing the entire structure of the nano-material by calculation. In addition, compared with small molecule compounds, the current nano material physicochemical property and biological effect data set are extremely deficient. Therefore, machine learning and deep learning are greatly limited when applied to the prediction of the properties of nanomaterials.
Although the concept of artificial intelligence was proposed more than half a century ago, it has been applied to nanomaterial property prediction only in roughly the last ten years. During this period, many studies have used artificial intelligence methods to predict the physicochemical properties and biological effects of nanomaterials. However, as described above, applying machine learning and deep learning to nanomaterial property prediction currently has two main disadvantages. The first is that the data sets used contain too few nanomaterials, which makes it difficult to construct a prediction model with good generalization ability. As in the literature "Nature Nanotechnology,2011,6(3): 175-. In the document "Nanoscale, 2013,5(12): 5644-5653", the authors use a support vector machine to construct a machine learning model that predicts the cytotoxicity caused by metal oxides, and the data set used contains only 22 kinds of nanomaterials. The second is the lack of suitable nano-descriptors to characterize the overall structure of the nanomaterial. The nano-descriptors used in previous studies fall largely into experimental descriptors, small molecule ligand descriptors, and quantitative calculation descriptors. However, experimental descriptors are time-consuming and laborious to obtain and have poor reproducibility; small molecule ligand descriptors cannot capture information such as nanomaterial type, particle size, ligand distribution and density; and quantitative calculation descriptors consume large amounts of computing resources and are difficult to calculate for large-particle-size nanomaterials. For example, in the literature "Nanotoxicology, 2016,10(3): 374-. 
In the document ACS nano,2014,8(3):2439-2455, the authors construct a machine learning model by using the information of the protein corona adsorbed by the nano material obtained by the experiment to predict the cell intake of the nano material. In the document "Nanoscale, 2019,11(24): 11808-. Based on the current research situation, a prediction model based on a large number of nano-material data sets and without complex nano descriptor calculation is urgently needed to be constructed, and data support and theoretical guidance are provided for the screening design of nano-materials.
Disclosure of Invention
The invention mainly aims to overcome the defects of the prior art and to provide a method and a system for predicting the hydrophilicity and hydrophobicity of nanoparticles based on face recognition. In the modeling process, the principles of the OECD (Organisation for Economic Co-operation and Development) for constructing and using QSAR (quantitative structure-activity relationship) models are followed, so that the prediction capability and robustness of the model are verified internally and externally, and the model is interpreted.
The first purpose of the invention is to provide a method for predicting the hydrophilicity and hydrophobicity of nanoparticles based on face recognition.
The second purpose of the invention is to provide a nano-particle hydrophilicity and hydrophobicity prediction system based on face recognition.
The first purpose of the invention is realized by the following technical scheme:
a nanoparticle hydrophilicity and hydrophobicity prediction method based on face recognition comprises the following steps:
synthesizing nanoparticles using a nano combinatorial chemistry approach;
acquiring a hydrophilic and hydrophobic property experiment test value of the nano particles through a data collection device;
constructing an electronic file of the three-dimensional structure information of the nanoparticles according to the experimental test values;
importing the electronic file into image processing software to obtain a nano-structure image;
adopting a convolutional neural network for face recognition to construct an end-to-end deep learning model;
and extracting the nano-structure characteristics in the nano-structure image through an end-to-end deep learning model, and finally outputting a predicted value of the hydrophilicity and hydrophobicity of the nano-particles.
Further, the nanoparticle comprises a core material and a small molecule ligand.
Further, the core material comprises gold, platinum and palladium; the small molecule ligands comprise K species.
Further, the hydrophilic and hydrophobic property experiment test value of the nanoparticles is obtained through a data collection device, which is as follows: the method comprises the steps of collecting hydrophilic and hydrophobic experimental test values of a plurality of nanoparticles through a data collection device, wherein the experimental test values comprise nanoparticle particle size and ligand number.
Further, the electronic file for constructing the three-dimensional structure information of the nanoparticles according to the experimental test values is specifically as follows: and constructing a PDB format electronic file containing three-dimensional structure information of the nanoparticles by VINAS software according to the nanoparticle particle size and the ligand number of the experimental test value, wherein the three-dimensional structure information of the nanoparticles comprises the nanoparticle particle size, the atom type, the atom coordinate and the atom connection information.
Further, the electronic file is imported into image processing software to obtain a nano-structure image; the method comprises the following specific steps: and importing the obtained electronic file into VMD software to generate a nano-structure image for subsequent deep learning modeling.
Further, the deep learning model is specifically as follows: a LeNet convolutional neural network framework is adopted, comprising an input layer, four convolutional layers, four pooling layers, two fully-connected layers and an output layer; the four convolution-pooling stages use 32, 64, 128 and 128 convolution kernels respectively; the convolution kernel size and stride are uniformly 3 × 3 and 1 × 1, and the pooling window size and stride are uniformly 2 × 2 and 2 × 2; both fully-connected layers contain 512 neurons; overfitting is prevented by adding Dropout with a rate of 0.3; the learning rate is 0.001, the loss function is the mean squared error (MSE), the batch size is 32, and the number of epochs is 300.
Further, the deep learning model is verified by adopting five-fold cross validation and external validation.
The second purpose of the invention is realized by the following technical scheme:
a nanoparticle hydrophilicity and hydrophobicity prediction system based on face recognition comprises a nanostructure insertion module, a pre-trained model import module, a Nano-logP prediction module and a prediction result output module;
the nanostructure insertion module is used for inserting a nanostructure image to be predicted;
the pre-trained model import module imports a pre-trained deep learning model;
the Nano-logP prediction module calls the model to predict the Nano-logP of the nanostructure image;
and the prediction result output module is used for displaying the prediction result.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The model constructed by the invention can be used for predicting the hydrophilicity and hydrophobicity of various nanoparticles. The method is simple, fast and low-cost, and its graphical user interface is easy to operate, so it can be conveniently used by researchers without a computational chemistry background. The prediction method conforms to the QSAR model development and use principles specified by the OECD, so the prediction results can provide data support and theoretical guidance for the rational design of nanomaterials and are of significance for evaluating nano-biological effects.
2. The model data set comprises 147 kinds of nanoparticles, covering three core materials and 91 different small molecule surface ligands; it is the currently known nanoparticle hydrophilicity and hydrophobicity prediction model with the largest number and the richest variety of nanoparticles.
3. The model is constructed with a convolutional neural network, a method widely applied to face recognition, automatic driving and the like, and feature information is extracted automatically from the nanostructure image without complex nano-descriptor calculation.
4. The invention constructs and evaluates the model according to the OECD principles for the construction and use of QSAR models; the constructed model has good goodness of fit, stability and prediction capability, and can be used for nanomaterial hydrophilicity and hydrophobicity prediction and rational nanomaterial design.
5. The graphical user interface developed by the invention is simple and easy to operate, can be operated on a personal computer, and is beneficial to non-computational chemistry professional researchers.
Drawings
FIG. 1 is a flow chart of a method for predicting the hydrophilicity and hydrophobicity of nanoparticles based on face recognition according to the present invention;
FIG. 2 is a view of a convolutional neural network model framework in embodiment 1 of the present invention;
FIG. 3 is a fitting graph of the measured value and the predicted value of the hydrophilicity and hydrophobicity of the nanoparticles in the training set according to example 1 of the present invention;
FIG. 4 is a fitting graph of the measured value and the predicted value of the hydrophilicity and hydrophobicity of the nanoparticles in the test set according to example 1 of the present invention;
FIG. 5 is a comparison of the original images of gold nanoparticles Nos. 25 and 115 of example 1 of the present invention and the class activation maps obtained by the Grad-CAM method.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Example 1:
a method for predicting the hydrophilicity and hydrophobicity of nanoparticles based on face recognition is shown in FIG. 1 and comprises the following steps:
synthesizing nanoparticles using a nano combinatorial chemistry approach;
acquiring a hydrophilic and hydrophobic property experiment test value of the nano particles through a data collection device;
constructing an electronic file of the three-dimensional structure information of the nanoparticles according to the experimental test values;
importing the electronic file into image processing software to obtain a nano-structure image;
adopting a convolutional neural network for face recognition to construct an end-to-end deep learning model;
and extracting the nano-structure characteristics in the nano-structure image through an end-to-end deep learning model, and finally outputting a predicted value of the hydrophilicity and hydrophobicity of the nano-particles.
The method comprises the following specific steps:
Firstly, experimental test values of the hydrophilicity and hydrophobicity (Nano-logP) of 147 kinds of nanoparticles are collected. The nanoparticles are synthesized by a nano combinatorial chemistry method and cover three core materials (gold, platinum and palladium) and 91 kinds of small molecule ligands. Their physicochemical properties (particle size, ligand number, Nano-logP and the like) are strictly characterized and tested, which ensures the reliability of the experimental data. Based on the experimentally measured particle size, ligand number and similar information, an electronic file (PDB format) containing the three-dimensional structure information of each nanoparticle (such as particle size, atom type, atom coordinates and atom connectivity) is constructed using the VINAS (Virtual Nanostructure Simulation) software developed by the Hao Zhu laboratory at Rutgers University. The resulting PDB file is imported into VMD software to generate a nanostructure image for subsequent deep learning modeling. The image is rendered in VDW mode: spheres of different sizes and colors represent different atoms, and the sphere sizes reflect the proportions among the different atomic radii. After the nanostructure image is obtained, an end-to-end deep learning model is constructed using a convolutional neural network of the kind widely used for face recognition, which automatically extracts the nanostructure features in the image and finally outputs a predicted value of the nanoparticle hydrophilicity and hydrophobicity.
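To make the "atom type and atom coordinates" content of such a PDB file concrete, here is a minimal reading sketch. It relies only on the standard fixed-width PDB layout (x, y, z in columns 31-54, element symbol in columns 77-78). It is not the VINAS or VMD workflow itself, and `format_hetatm`, the residue name `LIG` and the three-atom demo data are hypothetical.

```python
from collections import Counter
from math import dist


def format_hetatm(serial, name, element, x, y, z):
    """Build one fixed-width PDB HETATM record (hypothetical helper for demo data)."""
    return (
        f"{'HETATM':<6}{serial:>5} {name:<4} {'LIG':>3} {'A'}{1:>4}{'':4}"
        f"{x:>8.3f}{y:>8.3f}{z:>8.3f}{1.0:>6.2f}{0.0:>6.2f}{'':10}{element:>2}"
    )


def parse_pdb_atoms(text):
    """Return (element, (x, y, z)) for every ATOM/HETATM record in a PDB string."""
    atoms = []
    for line in text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            element = line[76:78].strip().capitalize()  # columns 77-78
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((element, xyz))
    return atoms


def approximate_diameter(atoms):
    """Twice the largest atom-to-centroid distance, in the file's length unit."""
    n = len(atoms)
    centroid = tuple(sum(xyz[i] for _, xyz in atoms) / n for i in range(3))
    return 2 * max(dist(xyz, centroid) for _, xyz in atoms)


def element_counts(atoms):
    """Count how many atoms of each element the structure contains."""
    return Counter(element for element, _ in atoms)
```

A renderer such as VMD reads the same fields to decide where to draw each sphere and which element-specific color and radius to use.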
In the nanostructure image, the atoms were colored as follows: carbon (C), cyan; nitrogen (N), blue; oxygen (O), red; sulfur (S), yellow; chlorine (Cl), green; fluorine (F), lemon; gold (Au), ochre; platinum (Pt), grey; palladium (Pd), ice blue.
The model is built with Keras (version 2.2.5) using TensorFlow (version 1.14.0) as the backend. As shown in FIG. 2, it adopts a LeNet convolutional neural network framework comprising an input layer, four convolutional layers, four max-pooling layers, two fully-connected (dense) layers and an output layer. The four convolution-pooling stages use 32, 64, 128 and 128 convolution kernels respectively; the convolution kernel size and stride are uniformly 3 × 3 and 1 × 1, and the pooling window size and stride are uniformly 2 × 2 and 2 × 2. Both fully-connected layers contain 512 neurons. Overfitting is prevented by adding Dropout (rate 0.3). The learning rate is 0.001, the loss function is the mean squared error (MSE), the batch size is 32, and the number of epochs is 300.
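The layer arithmetic implied by this configuration can be traced with a short sketch. Two things here are assumptions, because the source does not state them: the 128 × 128 × 3 input size and unpadded ("valid") convolutions. The function names are hypothetical, and the sketch only counts shapes and weights; it does not build or train a network.

```python
FILTERS = (32, 64, 128, 128)  # kernels per convolution stage, from the text


def conv_out(size, kernel=3, stride=1):
    """Output length of an unpadded convolution along one axis."""
    return (size - kernel) // stride + 1


def pool_out(size, window=2, stride=2):
    """Output length of an unpadded max-pooling step along one axis."""
    return (size - window) // stride + 1


def trace_shapes(height, width, channels=3):
    """List (layer name, height, width, channels) through the four stages."""
    shapes = [("input", height, width, channels)]
    for i, filters in enumerate(FILTERS, start=1):
        height, width = conv_out(height), conv_out(width)
        shapes.append((f"conv{i}", height, width, filters))
        height, width = pool_out(height), pool_out(width)
        shapes.append((f"pool{i}", height, width, filters))
    return shapes


def count_parameters(height, width, channels=3, dense_units=512):
    """Trainable weights and biases: 4 conv layers, 2 dense layers, 1 output."""
    total, in_channels = 0, channels
    for filters in FILTERS:
        total += 3 * 3 * in_channels * filters + filters
        in_channels = filters
    _, h, w, c = trace_shapes(height, width, channels)[-1]
    flattened = h * w * c
    total += flattened * dense_units + dense_units    # first dense layer
    total += dense_units * dense_units + dense_units  # second dense layer
    total += dense_units + 1                          # single regression output
    return total
```

With the assumed input size, the last pooling stage produces a 6 × 6 × 128 feature volume, so most of the weights sit in the first fully-connected layer. That is one reason a Dropout term is a natural choice for a data set of only 147 samples.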
The nanomaterial data set is the largest data set on nanomaterial hydrophilicity and hydrophobicity. It comprises 147 kinds of nanomaterials, covers three core materials (gold, platinum and palladium) and 91 kinds of small molecule ligands, and is structurally diverse; the hydrophilicity and hydrophobicity values of the nanomaterials are widely distributed, from -2.68 to 2.72. The diversity of the nanostructures and the wide distribution of the predicted property help in constructing a robust prediction model.
The stability and generalization capability of the obtained model are tested by five-fold cross validation and external validation. The R² of the five-fold cross validation is 0.70 and the R² of the external validation is 0.77, which indicates that the model has good stability and external prediction capability. In addition, the loss curves of the training set and the test set show no overfitting. The results show that the constructed model can accurately predict the hydrophilicity and hydrophobicity of nanoparticles with different particle sizes and different surface ligand modifications. FIG. 3 is a fitting graph of the measured and predicted values of the hydrophilicity and hydrophobicity of the nanoparticles in the training set, where the training set is 80% of the whole data set, 119 kinds of nanoparticles in total; FIG. 4 is a fitting graph of the measured and predicted values of the hydrophilicity and hydrophobicity of the nanoparticles in the test set, where the test set is 20% of the whole data set, 28 kinds of nanoparticles in total;
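The validation metrics can be illustrated with a small sketch. It assumes R² means the coefficient of determination, 1 - SS_res / SS_tot (the source does not say which R² definition was used), and it assumes the five folds are drawn from the 119-sample training set with a shuffled split. The function names and the seed are hypothetical.

```python
import random


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (training indices, validation indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation
```

In each round the model would be trained on four folds and scored with `r_squared` on the fifth. External validation applies the same metric to the 28 held-out nanoparticles.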
For gold nanoparticle No. 25, with a particle size of 6.2 nm and 525 surface ligands, the experimentally determined Nano-logP is -1.12. A PDB file of the nanoparticle was downloaded from PubVINAS (http://www.pubvinas.com) and imported into VMD, and the corresponding nanostructure image was generated with the image settings described above. The image was inserted through the graphical user interface, and clicking the Predict button output a predicted value of -1.47. The experimental and predicted values agree reasonably well.
For gold nanoparticle No. 115, with a particle size of 7.3 nm and 869 surface ligands, the experimentally determined Nano-logP is 2.39. A PDB file of the nanoparticle was downloaded from PubVINAS (http://www.pubvinas.com) and imported into VMD, and the corresponding nanostructure image was generated with the image settings described above. The image was inserted through the graphical user interface, and clicking the Predict button output a predicted value of 2.37. The experimental and predicted values agree well; FIG. 5 is a comparison of the original images of gold nanoparticles Nos. 25 and 115 and the class activation maps obtained by the Grad-CAM method.
Model interpretation, namely, the visual analysis of the convolutional neural network is realized by a Grad-CAM (Gradient-weighted Class Activation Mapping) method, and the importance degree of different areas in the image to a prediction result can be obtained by the method and is represented by a heat map. The basic principle is to obtain a heat map by superposition of feature map weights, which are represented by back-propagation gradients.
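The weighting step of Grad-CAM can be written out as follows. This is a minimal pure-Python sketch of the general technique (global-average-pooled gradients used as per-feature-map weights, followed by a ReLU), not the implementation used in the invention. The function name and the nested-list representation of feature maps are assumptions for illustration.

```python
def grad_cam(feature_maps, gradients):
    """Combine K feature maps into one heat map.

    feature_maps[k][r][c] is the activation of map k at pixel (r, c) and
    gradients[k][r][c] is the gradient of the predicted value with respect
    to that activation, obtained by back-propagation.
    """
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    heatmap = [[0.0] * cols for _ in range(rows)]
    for fmap, grad in zip(feature_maps, gradients):
        # Weight of this map: its gradient averaged over all pixels.
        weight = sum(map(sum, grad)) / (rows * cols)
        for r in range(rows):
            for c in range(cols):
                heatmap[r][c] += weight * fmap[r][c]
    # ReLU keeps only regions that push the prediction upward.
    return [[max(0.0, value) for value in row] for row in heatmap]
```

For display, the heat map is typically rescaled to the range 0 to 1, resized to the input resolution and blended with the original image.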
Gold nanoparticles No. 25 and No. 115 were analyzed with the Grad-CAM method to obtain heat maps representing the degree of importance, and each heat map was then overlaid on the original image to obtain a class activation map. For the two gold nanoparticles, the regions from which the convolutional neural network mainly extracted features during learning differed, but in both cases they were concentrated in the outer layer of the nanoparticle, indicating that this region has a large influence on Nano-logP.
Example 2:
A nanoparticle hydrophilicity and hydrophobicity prediction system based on face recognition comprises a nanostructure insertion module, a pre-trained model import module, a Nano-logP prediction module and a prediction result output module;
the nanostructure insertion module is used for inserting a nanostructure image to be predicted;
the pre-trained model import module is used for importing a pre-trained deep learning model;
the Nano-logP prediction module is used for calling the model to predict the Nano-logP of the nanostructure image;
and the prediction result output module is used for displaying the prediction result.
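The interaction of the four modules can be illustrated with a small sketch. The class and method names below are hypothetical, and `placeholder_model` is only a stand-in that averages pixel values; it is not the pre-trained deep learning model of the invention.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Image = List[List[float]]  # hypothetical stand-in for a nanostructure image


def placeholder_model(image: Image) -> float:
    """Stand-in for the pre-trained model: returns the mean pixel value."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


@dataclass
class NanoLogPSystem:
    image: Optional[Image] = None
    model: Optional[Callable[[Image], float]] = None

    def insert_image(self, image: Image) -> None:
        """Nanostructure insertion module."""
        self.image = image

    def import_model(self, model: Callable[[Image], float]) -> None:
        """Pre-trained model import module."""
        self.model = model

    def predict(self) -> float:
        """Nano-logP prediction module: calls the imported model on the image."""
        if self.image is None or self.model is None:
            raise ValueError("insert an image and import a model first")
        return self.model(self.image)

    def show_result(self) -> str:
        """Prediction result output module."""
        return f"Predicted Nano-logP: {self.predict():.2f}"


system = NanoLogPSystem()
system.insert_image([[0.0, 1.0], [1.0, 0.0]])
system.import_model(placeholder_model)
message = system.show_result()
```

Keeping the model behind a plain callable means a trained network could be swapped in for the placeholder without changing the other three modules.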
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.
Claims (9)
1. A nanoparticle hydrophilicity and hydrophobicity prediction method based on face recognition is characterized by comprising the following steps:
synthesizing nanoparticles using a nano combinatorial chemistry approach;
acquiring experimental test values of the hydrophilicity and hydrophobicity of the nanoparticles through a data collection device;
constructing an electronic file of the three-dimensional structure information of the nanoparticles according to the experimental test values;
importing the electronic file into image processing software to obtain a nano-structure image;
adopting a convolutional neural network for face recognition to construct an end-to-end deep learning model;
and extracting nanostructure features from the nanostructure image through the end-to-end deep learning model, and finally outputting a predicted value of the hydrophilicity and hydrophobicity of the nanoparticles.
2. The method for predicting the hydrophilicity and hydrophobicity of nanoparticles based on face recognition according to claim 1, wherein the nanoparticles comprise a core material and small-molecule ligands.
3. The method for predicting the hydrophilicity and hydrophobicity of nanoparticles based on face recognition according to claim 2, wherein the core material comprises gold, platinum and palladium; the small-molecule ligands comprise K species.
4. The method for predicting the hydrophilicity and hydrophobicity of nanoparticles based on face recognition according to claim 1, wherein acquiring the experimental test values of the hydrophilicity and hydrophobicity of the nanoparticles through a data collection device specifically comprises: collecting, through the data collection device, experimental hydrophilicity and hydrophobicity test values of a plurality of nanoparticles, wherein the experimental test values comprise the nanoparticle particle size and the ligand number.
5. The method for predicting the hydrophilicity and hydrophobicity of nanoparticles based on face recognition according to claim 4, wherein constructing the electronic file of the three-dimensional structure information of the nanoparticles according to the experimental test values specifically comprises: constructing, with VINAS software and according to the nanoparticle particle size and the ligand number in the experimental test values, a PDB-format electronic file containing the three-dimensional structure information of the nanoparticles, wherein the three-dimensional structure information of the nanoparticles comprises the nanoparticle particle size, the atom types, the atom coordinates and the atom connectivity information.
6. The method for predicting the hydrophilicity and hydrophobicity of nanoparticles based on face recognition according to claim 1, wherein importing the electronic file into image processing software to obtain a nanostructure image specifically comprises: importing the obtained electronic file into VMD software to generate a nanostructure image for subsequent deep learning modeling.
7. The method for predicting the hydrophilicity and hydrophobicity of nanoparticles based on face recognition according to claim 1, wherein the deep learning model specifically adopts a LeNet convolutional neural network framework comprising an input layer, four convolutional layers, four pooling layers, two fully connected layers and an output layer; the four convolutional and pooling stages use 32, 64, 128 and 128 convolution kernels, respectively; the convolution kernel size and stride of every convolutional layer are 3 × 3 and 1 × 1, and the pooling kernel size and stride of every pooling layer are 2 × 2 and 2 × 2; both fully connected layers contain 512 neurons; overfitting is prevented by adding Dropout with a rate of 0.3; the learning rate is 0.001, the loss function is the mean squared error (MSE), the batch size is 32, and the number of epochs is 300.
8. The method for predicting the hydrophilicity and hydrophobicity of nanoparticles based on face recognition according to claim 7, wherein the deep learning model is validated by five-fold cross-validation and external validation.
9. A nanoparticle hydrophilicity and hydrophobicity prediction system based on face recognition, characterized by comprising a nanostructure insertion module, a pre-trained model import module, a Nano-logP prediction module and a prediction result output module;
the nanostructure insertion module is used for inserting a nanostructure image to be predicted;
the pre-trained model import module is used for importing a pre-trained deep learning model;
the Nano-logP prediction module is used for calling the model to predict the Nano-logP of the nanostructure image;
and the prediction result output module is used for displaying the prediction result.
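For orientation, the layer arrangement recited in claim 7 can be traced numerically. The sketch below follows the feature-map size through four stages of 3 × 3 convolution (stride 1) and 2 × 2 pooling (stride 2) with 32, 64, 128 and 128 kernels. The 128 × 128 input size and the absence of padding are assumptions made only for this illustration, since neither is stated in the claims, and the function names are hypothetical.

```python
def conv_out(size, kernel=3, stride=1):
    """Output width of an unpadded convolution (padding is an assumption)."""
    return (size - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output width of an unpadded pooling layer."""
    return (size - kernel) // stride + 1


def trace_shapes(input_size, filters=(32, 64, 128, 128)):
    """Return the (height, width, channels) shape after each conv + pool stage."""
    size = input_size
    shapes = []
    for n_filters in filters:
        size = pool_out(conv_out(size))
        shapes.append((size, size, n_filters))
    return shapes


# Hypothetical 128 x 128 input image; the claims do not specify the image size
shapes = trace_shapes(128)
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]

for stage, shape in enumerate(shapes, start=1):
    print(f"after stage {stage}: {shape}")
print(f"inputs to the first 512-neuron fully connected layer: {flattened}")
```

Under these assumptions the last pooling layer yields a 6 × 6 × 128 feature volume, which is flattened before the two 512-neuron fully connected layers and the single-value output layer.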
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110399850.3A CN113140265A (en) | 2021-04-14 | 2021-04-14 | Method and system for predicting hydrophilicity and hydrophobicity of nanoparticles based on face recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110399850.3A CN113140265A (en) | 2021-04-14 | 2021-04-14 | Method and system for predicting hydrophilicity and hydrophobicity of nanoparticles based on face recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113140265A (en) | 2021-07-20 |
Family
ID=76812603
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110399850.3A Pending CN113140265A (en) | 2021-04-14 | 2021-04-14 | Method and system for predicting hydrophilicity and hydrophobicity of nanoparticles based on face recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113140265A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070285843A1 (en) * | 2006-06-12 | 2007-12-13 | Tran Bao Q | NANO-electronics |
US20190304568A1 (en) * | 2018-03-30 | 2019-10-03 | Board Of Trustees Of Michigan State University | System and methods for machine learning for drug design and discovery |
Non-Patent Citations (1)
Title |
---|
Yan, X. L. et al.: "Prediction of Nano-Bio Interactions through Convolutional Neural Network Analysis of Nanostructure Images", ACS Sustainable Chemistry & Engineering, https://doi.org/10.1021/acssuschemeng.0c07453 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210720 |