CN109493922A - Method for predicting molecular structural parameters of chemicals - Google Patents

Method for predicting molecular structural parameters of chemicals

Publication number
CN109493922A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811378715.5A
Other languages
Chinese (zh)
Other versions
CN109493922B (en
Inventor
陈景文
肖子君
鄢世阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Cilico Environmental Technology Co Ltd
Original Assignee
Dalian Cilico Environmental Technology Co Ltd
Application filed by Dalian Cilico Environmental Technology Co Ltd
Priority to CN201811378715.5A
Publication of CN109493922A
Application granted
Publication of CN109493922B
Legal status: Active

Landscapes

  • Organic Low-Molecular-Weight Compounds And Preparation Thereof (AREA)

Abstract

The present invention provides a method for predicting the molecular structural parameters of chemicals. The method comprises optimizing the molecular structure of an organic compound and calculating the compound's parameters from the optimized structure. The method can be applied to many classes of organic compounds. The models for E, S, A, B, L and V are built on measured data for up to 3838 compounds, which gives them a very broad applicability domain, and they use linear regression, so the algorithm is transparent, simple and easy to interpret. Using the method to predict the partition coefficients of chemicals through the multi-parameter linear free energy relationship of organic compounds is simple, fast and inexpensive; it can provide data support for chemical regulation and is of great significance for the ecological risk assessment of chemicals.

Description

Method for predicting molecular structural parameters of chemicals
Technical field
The present invention relates to the field of testing strategies for ecological risk assessment and, in particular, to a method for predicting the molecular structural parameters of chemicals.
Background
The distribution, reaction rates, bioconcentration and toxic effects of organic chemicals in ecological and environmental systems all depend on their partitioning behavior. Determining equilibrium partition coefficients experimentally is time-consuming, costly and error-prone. When a compound is difficult to obtain, or when the number of compounds to be tested is large, relying on measurement alone to obtain molecular structural parameters becomes increasingly difficult. Reliable methods are therefore needed to predict the molecular structural parameters that govern the equilibrium partitioning of substances in the environment.
Multi-parameter linear free energy relationships have been shown to characterize the equilibrium partitioning of organic chemicals in a variety of environmental and technical partitioning systems, and to give accurate predictions. However, obtaining the molecular structural parameters used in these relationships (E is the excess molar refraction, L is the hexadecane-water partition coefficient, A is the hydrogen-bond acidity, B is the hydrogen-bond basicity, S is the polarity/dipole moment, and V is the McGowan characteristic molecular volume) depends on complex and varied experimental methods. More than 135 million chemicals are registered with the Chemical Abstracts Service (CAS), and the number grows by about 15,000 per day. Faced with so many organic chemicals, measuring their molecular structural parameters by experiment alone is clearly impractical. At present only about 4000 substances have experimental values for these parameters. There is therefore an urgent need for non-experimental techniques that obtain molecular structural parameter values efficiently and quickly, so that multi-parameter linear free energy relationships can be predicted and the needs of ecological risk assessment and management of organic chemicals can be met.
Summary of the invention
The object of the present invention is to provide a simple, fast and efficient method for predicting the molecular structural parameters of organic chemicals. The method predicts a compound's molecular structural parameters from its molecular structure, from which its partition coefficients can then be estimated, providing the basic data needed for chemical risk assessment and management.
To achieve this object, the present invention provides a method for predicting the molecular structural parameters of chemicals, the method comprising:
S1. Optimize the molecular structure of the organic compound in Gaussian at the B3LYP/6-31G(d) level; for atoms outside the applicable range of that basis set, add a pseudopotential using LANL2DZ; add the keywords pop=NBO and Volume; the optimized molecular structure should be stable, with no imaginary frequencies;
S2. From the optimized molecular structure, calculate the values of polarizability, E_HOMO-E_LUMO, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, E_HOMO-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume, wherein polarizability is the polarizability; E_HOMO-E_LUMO is the energy gap between the frontier molecular orbitals; I_LPcount is the number of lone pairs on all iodine atoms; Atom_num is the total number of atoms; nBT is the number of chemical bonds; Mor32p is 3D-MoRSE signal 32, weighted by polarizability; nHDon is the number of hydrogen-bond donor N and O atoms; ATSC2i is the centred Broto-Moreau autocorrelation of lag 2, weighted by ionization potential; F01[C-N] is the frequency of C-N at topological distance 1; F10[O-O] is the frequency of O-O at topological distance 10; dipole_moment is the dipole moment; CCR_energy is the core-core repulsion energy; E_HOMO-1 is the energy of the second-highest occupied molecular orbital; Rperim is the ring perimeter of the molecule; Mor05u is 3D-MoRSE signal 05, unweighted; Mor02m is 3D-MoRSE signal 02, weighted by mass; nRCO is the number of aliphatic ketone groups; H-046 counts hydrogen atoms attached to an sp3-hybridized carbon with no halogen atom on the adjacent carbon; SdO is the sum of the E-states of =O; NtN is the number of ≡N atoms; H_Qmax is the highest charge on a hydrogen atom; H_Qmean is the mean charge on the hydrogen atoms in the molecule; nRCONH2 is the number of aliphatic primary amides; N-067 is Al2-NH; O-057 is the oxygen atom in phenol, enol or carboxyl OH; SsNH2 is the sum of the E-states of -NH2; CATS2D_01_AN is the CATS2D hydrogen-bond acceptor-negative charge descriptor at lag 01; CATS2D_03_DD is the CATS2D hydrogen-bond donor-donor descriptor at lag 03; CATS2D_03_DA is the CATS2D hydrogen-bond donor-acceptor descriptor at lag 03; B04[O-O] is the presence or absence of O-O at topological distance 4; nArNHR is the number of aromatic amines; O_LPcount is the number of lone pairs on all oxygen atoms; N_Qcount is the number of nitrogen atoms; Mor12i is 3D-MoRSE signal 12, weighted by ionization potential; H-047 counts hydrogen atoms attached to sp2- and sp3-hybridized carbon atoms; O-056 is the oxygen atom of a hydroxyl group; P-116 is the number of R3-P=X groups; NddsN is the number of -N(=)= atoms; B01[C-N] is the presence or absence of C-N at topological distance 1; F02[C-N] is the frequency of C-N at topological distance 2; F_LPcount is the number of lone pairs on all fluorine atoms; Br_LPcount is the number of lone pairs on all bromine atoms; H_Qcount is the number of hydrogen atoms; NaasC is the number of aasC atoms; SssCH2 is the sum of the E-states of -CH2-; F01[O-Si] is the frequency of O-Si at topological distance 1; and molarVolume is the molar volume;
S3. Calculate the molecular structural parameter E of the organic compound according to formula (7), S according to formula (8), A according to formula (9), B according to formula (10), L according to formula (11) and V according to formula (12),
E = 0.61313 + 0.01169 × polarizability + 0.88701 × (E_HOMO - E_LUMO) + 0.12676 × I_LPcount - 0.29072 × Atom_num + 0.26076 × nBT - 0.34881 × Mor32p + 0.12675 × nHDon - 0.57231 × ATSC2i + 0.04305 × F01[C-N] + 0.14475 × F10[O-O]    formula (7),
S = 0.80280 + 0.05210 × dipole_moment + 0.00023 × CCR_energy + 1.96420 × E_HOMO-1 + 0.03975 × Rperim - 0.55400 × ATSC2i - 0.05361 × Mor05u + 0.01734 × Mor02m + 0.24280 × nRCO + 0.10889 × nHDon - 0.02352 × H-046 + 0.01438 × SdO + 0.43704 × NtN    formula (8),
A = -0.18760 + 0.41354 × H_Qmax + 0.83897 × H_Qmean + 0.20256 × nRCONH2 + 0.28056 × nHDon - 0.16539 × N-067 + 0.08320 × O-057 - 0.07177 × SsNH2 + 0.14845 × CATS2D_01_AN - 0.12936 × CATS2D_03_DD - 0.04406 × CATS2D_03_DA - 0.08829 × B04[O-O] - 0.21963 × nArNHR    formula (9),
B = -0.01310 + 0.08131 × O_LPcount + 0.13056 × N_Qcount - 0.09927 × Mor12i + 0.18232 × nRCO + 0.01458 × H-047 + 0.14627 × O-056 + 0.95757 × P-116 - 0.53368 × NddsN + 0.14104 × B01[C-N] + 0.03503 × F02[C-N]    formula (10),
L = 0.44713 + 0.03226 × polarizability - 0.16282 × F_LPcount + 0.07766 × Br_LPcount + 0.25237 × Atom_num - 0.35911 × H_Qcount + 0.48173 × nHDon - 0.08596 × NaasC + 0.06518 × SssCH2 - 0.43300 × F01[O-Si]    formula (11),
V = -0.00910 + 1.027 × (molarVolume / 100)    formula (12),
wherein the molecular parameter E is the excess molar refraction, L is the hexadecane-water partition coefficient, A is the hydrogen-bond acidity, B is the hydrogen-bond basicity, S is the polarity/dipole moment, and V is the McGowan characteristic molecular volume;
The organic compound may be an alkane, alkene, alkyne, alcohol, ether, phenol, ketone, aldehyde, ester, quinone, substituted biphenyl, aniline, halogenated hydrocarbon, nitroaromatic compound, alkylbenzene, azobenzene, organic acid, benzamide, phthalate, polybrominated diphenyl ether, polycyclic aromatic hydrocarbon, sulfonic acid, organophosphorus compound, organosulfur compound, organoiodine compound, organofluorine compound, heterocyclic compound or organosilicon compound.
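As an illustration of step S1, the following Python sketch assembles a Gaussian input file whose route line requests a B3LYP/6-31G(d) optimization with the pop=NBO and Volume keywords. It is a minimal sketch under stated assumptions, not the patent's own procedure: the function name `gaussian_input`, the water geometry and the inclusion of a `freq` job (one common way to check for imaginary frequencies) are all assumptions, and the LANL2DZ pseudopotential handling for heavy atoms is deliberately left out.

```python
def gaussian_input(title, charge, multiplicity, atoms):
    """Return the text of a Gaussian input file for a step-S1 style job.

    atoms is a list of (element, x, y, z) tuples with coordinates in angstroms.
    """
    # Route line: optimization, frequency check (assumed here), NBO population, volume.
    route = "# opt freq b3lyp/6-31g(d) pop=nbo volume"
    lines = [route, "", title, "", f"{charge} {multiplicity}"]
    lines += [f"{el:<2} {x:12.6f} {y:12.6f} {z:12.6f}" for el, x, y, z in atoms]
    lines.append("")  # blank line that terminates the geometry section
    return "\n".join(lines) + "\n"


# Hypothetical starting geometry for water, used only to exercise the function.
water_input = gaussian_input(
    "water, step S1 sketch",
    charge=0,
    multiplicity=1,
    atoms=[("O", 0.0, 0.0, 0.117), ("H", 0.0, 0.757, -0.469), ("H", 0.0, -0.757, -0.469)],
)
print(water_input)
```

For compounds containing atoms that need LANL2DZ, the route line and basis-set sections would have to be extended; that part depends on the Gaussian version in use and is not shown.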
The method for predicting the molecular structural parameters of chemicals provided by the present invention can be applied to many classes of organic compounds. The models are built on measured data for up to 3838 compounds for the molecular structural parameters E, S, A, B, L and V, which gives them a very broad applicability domain. E, S, A, B, L and V are modeled by linear regression, so the algorithm is transparent, simple and easy to interpret. The multi-parameter linear free energy relationships of organic compounds predicted from the resulting E, S, A, B, L and V values are accurate. Predicting these relationships with the method is simple, fast and inexpensive; it can provide data support for chemical regulation and is of great significance for the ecological risk assessment of chemicals.
Other features and advantages of the present invention are described in detail in the detailed description that follows.
Detailed description
Specific embodiments of the invention are described in detail below. It should be understood that the specific embodiments described here serve only to illustrate and explain the invention and are not intended to limit it.
The present invention provides a method for predicting the molecular structural parameters of chemicals, the method comprising:
S1. Optimize the molecular structure of the organic compound in Gaussian at the B3LYP/6-31G(d) level; for atoms outside the applicable range of that basis set, add a pseudopotential using LANL2DZ; add the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies;
S2. From the optimized molecular structure, calculate the values of polarizability, E_HOMO-E_LUMO, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, E_HOMO-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume, wherein polarizability is the polarizability; E_HOMO-E_LUMO is the energy gap between the frontier molecular orbitals; I_LPcount is the number of lone pairs on all iodine atoms; Atom_num is the total number of atoms; nBT is the number of chemical bonds; Mor32p is 3D-MoRSE signal 32, weighted by polarizability; nHDon is the number of hydrogen-bond donor N and O atoms; ATSC2i is the centred Broto-Moreau autocorrelation of lag 2, weighted by ionization potential; F01[C-N] is the frequency of C-N at topological distance 1; F10[O-O] is the frequency of O-O at topological distance 10; dipole_moment is the dipole moment; CCR_energy is the core-core repulsion energy; E_HOMO-1 is the energy of the second-highest occupied molecular orbital; Rperim is the ring perimeter of the molecule; Mor05u is 3D-MoRSE signal 05, unweighted; Mor02m is 3D-MoRSE signal 02, weighted by mass; nRCO is the number of aliphatic ketone groups; H-046 counts hydrogen atoms attached to an sp3-hybridized carbon with no halogen atom on the adjacent carbon; SdO is the sum of the E-states of =O; NtN is the number of ≡N atoms; H_Qmax is the highest charge on a hydrogen atom; H_Qmean is the mean charge on the hydrogen atoms in the molecule; nRCONH2 is the number of aliphatic primary amides; N-067 is Al2-NH; O-057 is the oxygen atom in phenol, enol or carboxyl OH; SsNH2 is the sum of the E-states of -NH2; CATS2D_01_AN is the CATS2D hydrogen-bond acceptor-negative charge descriptor at lag 01; CATS2D_03_DD is the CATS2D hydrogen-bond donor-donor descriptor at lag 03; CATS2D_03_DA is the CATS2D hydrogen-bond donor-acceptor descriptor at lag 03; B04[O-O] is the presence or absence of O-O at topological distance 4; nArNHR is the number of aromatic amines; O_LPcount is the number of lone pairs on all oxygen atoms; N_Qcount is the number of nitrogen atoms; Mor12i is 3D-MoRSE signal 12, weighted by ionization potential; H-047 counts hydrogen atoms attached to sp2- and sp3-hybridized carbon atoms; O-056 is the oxygen atom of a hydroxyl group; P-116 is the number of R3-P=X groups; NddsN is the number of -N(=)= atoms; B01[C-N] is the presence or absence of C-N at topological distance 1; F02[C-N] is the frequency of C-N at topological distance 2; F_LPcount is the number of lone pairs on all fluorine atoms; Br_LPcount is the number of lone pairs on all bromine atoms; H_Qcount is the number of hydrogen atoms; NaasC is the number of aasC atoms; SssCH2 is the sum of the E-states of -CH2-; F01[O-Si] is the frequency of O-Si at topological distance 1; and molarVolume is the molar volume;
S3. Calculate the molecular structural parameter E of the organic compound according to formula (13), S according to formula (14), A according to formula (15), B according to formula (16), L according to formula (17) and V according to formula (18),
E = 0.61313 + 0.01169 × polarizability + 0.88701 × (E_HOMO - E_LUMO) + 0.12676 × I_LPcount - 0.29072 × Atom_num + 0.26076 × nBT - 0.34881 × Mor32p + 0.12675 × nHDon - 0.57231 × ATSC2i + 0.04305 × F01[C-N] + 0.14475 × F10[O-O]    formula (13),
S = 0.80280 + 0.05210 × dipole_moment + 0.00023 × CCR_energy + 1.96420 × E_HOMO-1 + 0.03975 × Rperim - 0.55400 × ATSC2i - 0.05361 × Mor05u + 0.01734 × Mor02m + 0.24280 × nRCO + 0.10889 × nHDon - 0.02352 × H-046 + 0.01438 × SdO + 0.43704 × NtN    formula (14),
A = -0.18760 + 0.41354 × H_Qmax + 0.83897 × H_Qmean + 0.20256 × nRCONH2 + 0.28056 × nHDon - 0.16539 × N-067 + 0.08320 × O-057 - 0.07177 × SsNH2 + 0.14845 × CATS2D_01_AN - 0.12936 × CATS2D_03_DD - 0.04406 × CATS2D_03_DA - 0.08829 × B04[O-O] - 0.21963 × nArNHR    formula (15),
B = -0.01310 + 0.08131 × O_LPcount + 0.13056 × N_Qcount - 0.09927 × Mor12i + 0.18232 × nRCO + 0.01458 × H-047 + 0.14627 × O-056 + 0.95757 × P-116 - 0.53368 × NddsN + 0.14104 × B01[C-N] + 0.03503 × F02[C-N]    formula (16),
L = 0.44713 + 0.03226 × polarizability - 0.16282 × F_LPcount + 0.07766 × Br_LPcount + 0.25237 × Atom_num - 0.35911 × H_Qcount + 0.48173 × nHDon - 0.08596 × NaasC + 0.06518 × SssCH2 - 0.43300 × F01[O-Si]    formula (17),
V = -0.00910 + 1.027 × (molarVolume / 100)    formula (18),
wherein the molecular parameter E is the excess molar refraction, L is the hexadecane-water partition coefficient, A is the hydrogen-bond acidity, B is the hydrogen-bond basicity, S is the polarity/dipole moment, and V is the McGowan characteristic molecular volume;
The organic compound may be an alkane, alkene, alkyne, alcohol, ether, phenol, ketone, aldehyde, ester, quinone, substituted biphenyl, aniline, halogenated hydrocarbon, nitroaromatic compound, alkylbenzene, azobenzene, organic acid, benzamide, phthalate, polybrominated diphenyl ether, polycyclic aromatic hydrocarbon, sulfonic acid, organophosphorus compound, organosulfur compound, organoiodine compound, organofluorine compound, heterocyclic compound or organosilicon compound.
The method for predicting the molecular structural parameters of chemicals provided by the present invention can be applied to many classes of organic compounds. The models are built on measured data for up to 3838 compounds for the molecular structural parameters E, S, A, B, L and V, which gives them a very broad applicability domain. E, S, A, B, L and V are modeled by linear regression, so the algorithm is transparent, simple and easy to interpret. The E, S, A, B, L and V values obtained by prediction are then used to predict the multi-parameter linear free energy relationships of organic compounds. Predicting these relationships with the method is simple, fast and inexpensive; it can provide data support for chemical regulation and is of great significance for the ecological risk assessment of chemicals.
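Because every model in step S3 is an intercept plus a weighted sum of descriptors, the formulas can be evaluated with one small helper. The Python sketch below does this for the L and V models, with the coefficients copied from the formulas above. The helper names, the choice to treat a missing descriptor as zero, and the input values are assumptions made for illustration; the inputs are hypothetical and do not describe any real compound.

```python
def linear_model(intercept, coeffs, descriptors):
    """Intercept plus the coefficient-weighted sum of descriptor values.

    Descriptors missing from the input are treated as zero (an assumption of this sketch).
    """
    return intercept + sum(c * descriptors.get(name, 0.0) for name, c in coeffs.items())


# Coefficients of the L model, keyed by the descriptor names used in the text.
L_COEFFS = {
    "polarizability": 0.03226,
    "F_LPcount": -0.16282,
    "Br_LPcount": 0.07766,
    "Atom_num": 0.25237,
    "H_Qcount": -0.35911,
    "nHDon": 0.48173,
    "NaasC": -0.08596,
    "SssCH2": 0.06518,
    "F01[O-Si]": -0.43300,
}


def predict_l(descriptors):
    return linear_model(0.44713, L_COEFFS, descriptors)


def predict_v(molar_volume):
    # V depends on a single descriptor, the molar volume divided by 100.
    return -0.00910 + 1.027 * (molar_volume / 100)


# Hypothetical descriptor values, chosen only to show the arithmetic.
hypothetical = {"polarizability": 50.0, "Atom_num": 10, "H_Qcount": 6}
l_value = predict_l(hypothetical)
v_value = predict_v(100.0)
print(l_value, v_value)
```

Keeping the coefficients in a dictionary keyed by descriptor name makes it straightforward to check each entry against the printed formula.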
The present invention is described in more detail below by way of embodiments.
Embodiment 1
Given the compound 4-nitrochlorobenzene (CAS No. 100-00-5), predict its molecular structural parameter E. First optimize the molecular structure of the compound in Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms outside the applicable range of that basis set and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies. From the optimized structure, the values of polarizability, E_HOMO-E_LUMO, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N] and F10[O-O] calculated with the Dragon 6.0 software are 85.02, -0.180, 0, 14, 14, -0.160, 0, 0.249, 1 and 0, respectively. Formula (19) then gives a predicted value of 0.98, which agrees with the experimental value, so the prediction is good.
E = 0.61313 + 0.01169 × polarizability + 0.88701 × (E_HOMO - E_LUMO) + 0.12676 × I_LPcount - 0.29072 × Atom_num + 0.26076 × nBT - 0.34881 × Mor32p + 0.12675 × nHDon - 0.57231 × ATSC2i + 0.04305 × F01[C-N] + 0.14475 × F10[O-O]    formula (19).
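The arithmetic of Embodiment 1 can be reproduced with a few lines of Python. The sketch below plugs the descriptor values quoted for 4-nitrochlorobenzene into formula (19); the variable and function names are illustrative choices, not part of the patent.

```python
# Coefficients of formula (19), keyed by the descriptor names used in the text.
E_INTERCEPT = 0.61313
E_COEFFS = {
    "polarizability": 0.01169,
    "E_HOMO-E_LUMO": 0.88701,
    "I_LPcount": 0.12676,
    "Atom_num": -0.29072,
    "nBT": 0.26076,
    "Mor32p": -0.34881,
    "nHDon": 0.12675,
    "ATSC2i": -0.57231,
    "F01[C-N]": 0.04305,
    "F10[O-O]": 0.14475,
}

# Descriptor values quoted in Embodiment 1 for 4-nitrochlorobenzene.
nitrochlorobenzene = {
    "polarizability": 85.02,
    "E_HOMO-E_LUMO": -0.180,
    "I_LPcount": 0,
    "Atom_num": 14,
    "nBT": 14,
    "Mor32p": -0.160,
    "nHDon": 0,
    "ATSC2i": 0.249,
    "F01[C-N]": 1,
    "F10[O-O]": 0,
}


def predict_e(descriptors):
    """Evaluate formula (19): intercept plus coefficient-weighted descriptors."""
    return E_INTERCEPT + sum(c * descriptors[name] for name, c in E_COEFFS.items())


e_value = predict_e(nitrochlorobenzene)
print(round(e_value, 2))  # 0.98
```

The unrounded result is about 0.984, which rounds to the 0.98 reported in the embodiment.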
Embodiment 2
Given the compound 1,4-diisopropylbenzene (CAS No. 100-18-5), predict its S value. First optimize the molecular structure of the compound in Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms outside the applicable range of that basis set and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies. From the optimized structure, the values of dipole_moment, CCR_energy, E_HOMO-1, Rperim, ATSC2i, Mor05u, Mor02m, nRCO, nHDon, H-046, SdO and NtN calculated with the Dragon 6.0 software are 0.0191, 682.158, -0.24112, 6, 0.6, -4.027, 11.018, 0, 0, 14, 0 and 0, respectively. Formula (20) then gives a predicted value of 0.47; the experimental value is 0.474, so the prediction is good.
S = 0.80280 + 0.05210 × dipole_moment + 0.00023 × CCR_energy + 1.96420 × E_HOMO-1 + 0.03975 × Rperim - 0.55400 × ATSC2i - 0.05361 × Mor05u + 0.01734 × Mor02m + 0.24280 × nRCO + 0.10889 × nHDon - 0.02352 × H-046 + 0.01438 × SdO + 0.43704 × NtN    formula (20).
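The S value of Embodiment 2 can be checked the same way. The sketch below pairs each coefficient of formula (20) with the descriptor value quoted for 1,4-diisopropylbenzene; the tuple layout and names are illustrative assumptions.

```python
S_INTERCEPT = 0.80280

# (descriptor name, coefficient in formula (20), value quoted in Embodiment 2)
S_TERMS = [
    ("dipole_moment", 0.05210, 0.0191),
    ("CCR_energy", 0.00023, 682.158),
    ("E_HOMO-1", 1.96420, -0.24112),
    ("Rperim", 0.03975, 6),
    ("ATSC2i", -0.55400, 0.6),
    ("Mor05u", -0.05361, -4.027),
    ("Mor02m", 0.01734, 11.018),
    ("nRCO", 0.24280, 0),
    ("nHDon", 0.10889, 0),
    ("H-046", -0.02352, 14),
    ("SdO", 0.01438, 0),
    ("NtN", 0.43704, 0),
]

# Sum the individual contributions so each term can be inspected if needed.
contributions = {name: coeff * value for name, coeff, value in S_TERMS}
s_value = S_INTERCEPT + sum(contributions.values())
print(round(s_value, 2))  # 0.47
```

Listing the per-term contributions shows which descriptors dominate: here the E_HOMO-1, ATSC2i and H-046 terms lower S, while the ring perimeter and the two 3D-MoRSE terms raise it.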
Embodiment 3
Given the compound acetanisole (CAS No. 100-06-1), predict its A value. First optimize the molecular structure of the compound in Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms outside the applicable range of that basis set and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies. From the optimized structure, the values of H_Qmax, H_Qmean, nRCONH2, nHDon, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O] and nArNHR calculated with the Dragon 6.0 software are 0.179382, 0.1588766, 0, 0, 0, 0, 0, 0, 0, 0, 0 and 0, respectively. Formula (21) then gives a predicted value of 0.019; the experimental value is 0, so the prediction is good.
A = -0.18760 + 0.41354 × H_Qmax + 0.83897 × H_Qmean + 0.20256 × nRCONH2 + 0.28056 × nHDon - 0.16539 × N-067 + 0.08320 × O-057 - 0.07177 × SsNH2 + 0.14845 × CATS2D_01_AN - 0.12936 × CATS2D_03_DD - 0.04406 × CATS2D_03_DA - 0.08829 × B04[O-O] - 0.21963 × nArNHR    formula (21).
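For Embodiment 3 only the two hydrogen-charge descriptors are non-zero, so formula (21) reduces to three terms. The short Python sketch below evaluates the full formula with the quoted values; names are illustrative. With these inputs the result is just under 0.02, in line with the 0.019 quoted in the embodiment.

```python
A_INTERCEPT = -0.18760
A_COEFFS = {
    "H_Qmax": 0.41354,
    "H_Qmean": 0.83897,
    "nRCONH2": 0.20256,
    "nHDon": 0.28056,
    "N-067": -0.16539,
    "O-057": 0.08320,
    "SsNH2": -0.07177,
    "CATS2D_01_AN": 0.14845,
    "CATS2D_03_DD": -0.12936,
    "CATS2D_03_DA": -0.04406,
    "B04[O-O]": -0.08829,
    "nArNHR": -0.21963,
}

# Values quoted in Embodiment 3 for acetanisole: every descriptor is zero
# except the two hydrogen-charge terms.
acetanisole = dict.fromkeys(A_COEFFS, 0.0)
acetanisole["H_Qmax"] = 0.179382
acetanisole["H_Qmean"] = 0.1588766

a_value = A_INTERCEPT + sum(c * acetanisole[name] for name, c in A_COEFFS.items())
print(a_value)
```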
Embodiment 4
Given the compound propylbenzene (CAS No. 103-65-1), predict the logarithm of its partition coefficient in methanol/water. First optimize the molecular structure of the compound in Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms outside the applicable range of that basis set and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies. From the optimized structure, the Dragon 6.0 software is used to calculate the values of polarizability, E_HOMO-E_LUMO, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, E_HOMO-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume. According to formulas (22), (23), (24), (25) and (26), the molecular parameters E, S, A, B and V of propylbenzene are calculated as 0.626, 0.380, -0.016, 0.225 and 1.126, respectively. Formula (27), taken from the 2004 paper by Abraham M H et al., then gives a logarithmic partition coefficient of 3.42 for propylbenzene in methanol/water; the experimental value is 3.52, so the prediction is good.
E = 0.61313 + 0.01169 × polarizability + 0.88701 × (E_HOMO - E_LUMO) + 0.12676 × I_LPcount - 0.29072 × Atom_num + 0.26076 × nBT - 0.34881 × Mor32p + 0.12675 × nHDon - 0.57231 × ATSC2i + 0.04305 × F01[C-N] + 0.14475 × F10[O-O]    formula (22),
S = 0.80280 + 0.05210 × dipole_moment + 0.00023 × CCR_energy + 1.96420 × E_HOMO-1 + 0.03975 × Rperim - 0.55400 × ATSC2i - 0.05361 × Mor05u + 0.01734 × Mor02m + 0.24280 × nRCO + 0.10889 × nHDon - 0.02352 × H-046 + 0.01438 × SdO + 0.43704 × NtN    formula (23),
A = -0.18760 + 0.41354 × H_Qmax + 0.83897 × H_Qmean + 0.20256 × nRCONH2 + 0.28056 × nHDon - 0.16539 × N-067 + 0.08320 × O-057 - 0.07177 × SsNH2 + 0.14845 × CATS2D_01_AN - 0.12936 × CATS2D_03_DD - 0.04406 × CATS2D_03_DA - 0.08829 × B04[O-O] - 0.21963 × nArNHR    formula (24),
B = -0.01310 + 0.08131 × O_LPcount + 0.13056 × N_Qcount - 0.09927 × Mor12i + 0.18232 × nRCO + 0.01458 × H-047 + 0.14627 × O-056 + 0.95757 × P-116 - 0.53368 × NddsN + 0.14104 × B01[C-N] + 0.03503 × F02[C-N]    formula (25),
V = -0.00910 + 1.027 × (molarVolume / 100)    formula (26),
log K = 0.299 × E - 0.671 × S + 0.080 × A - 3.389 × B + 3.512 × V + 0.329    formula (27).
Embodiment 5
Given the compound bromobutane (CAS No. 109-65-9), predict the logarithm of its partition coefficient in ethanol/water. First optimize the molecular structure of the compound in Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms outside the applicable range of that basis set and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies. From the optimized structure, the Dragon 6.0 software is used to calculate the values of polarizability, E_HOMO-E_LUMO, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, E_HOMO-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume. According to formulas (22), (23), (24), (25) and (26), the molecular parameters E, S, A, B and V of bromobutane are calculated as 0.252, 0.279, 0.019, 0.052 and 0.799, respectively. Formula (28), taken from the 2004 paper by Abraham M H et al., then gives a logarithmic partition coefficient of 3.45 for bromobutane in ethanol/water; the experimental value is 3.52, so the prediction is good.
log K = 0.409 × E - 0.959 × S + 0.186 × A - 3.645 × B + 3.928 × V + 0.208    formula (28).
Embodiment 6
Given the compound dimethyl ether (CAS No. 115-10-6), predict the logarithm of its partition coefficient in pentanol/water. First optimize the molecular structure of the compound in Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms outside the applicable range of that basis set and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies. From the optimized structure, the Dragon 6.0 software is used to calculate the values of polarizability, E_HOMO-E_LUMO, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, E_HOMO-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume. According to formulas (22), (23), (24), (25) and (26), the molecular parameters E, S, A, B and V of propylbenzene are calculated as 0.252, 0.279, 0.019, 0.052 and 0.799, respectively. Formula (29), taken from the 2004 paper by Abraham M H et al., then gives a logarithmic partition coefficient of 3.45 for bromobutane in ethanol/water; the experimental value is 3.52, so the prediction is good.
log K = 0.521 × E - 1.294 × S + 0.208 × A - 3.908 × B + 4.208 × V + 0.08    formula (29).
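Formulas (27), (28) and (29) share one form: log K is a linear combination of E, S, A, B and V plus a constant, with coefficients specific to each solvent system. The Python sketch below stores the three coefficient sets by formula number and evaluates formula (27) with the E, S, A, B and V values quoted for propylbenzene in Embodiment 4. The function name and table layout are illustrative assumptions. With those rounded inputs the result is about 3.45, close to the 3.42 reported in that embodiment; the small gap may come from rounding of the quoted parameters.

```python
# (e, s, a, b, v, c) coefficients of formulas (27), (28) and (29), copied from the text.
LFER_COEFFS = {
    27: (0.299, -0.671, 0.080, -3.389, 3.512, 0.329),
    28: (0.409, -0.959, 0.186, -3.645, 3.928, 0.208),
    29: (0.521, -1.294, 0.208, -3.908, 4.208, 0.08),
}


def log_k(formula, E, S, A, B, V):
    """Evaluate log K = e*E + s*S + a*A + b*B + v*V + c for the chosen formula."""
    e, s, a, b, v, c = LFER_COEFFS[formula]
    return e * E + s * S + a * A + b * B + v * V + c


# Parameters quoted for propylbenzene in Embodiment 4, evaluated with formula (27).
propylbenzene_log_k = log_k(27, E=0.626, S=0.380, A=-0.016, B=0.225, V=1.126)
print(round(propylbenzene_log_k, 2))  # 3.45
```

Swapping the first argument selects another solvent system without changing the predicted E, S, A, B and V values, which is the practical appeal of the approach described in the text.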
The above embodiments show that the method for predicting the molecular structural parameters of chemicals provided by the present invention can be applied to many classes of organic compounds. The organic compound may be an alkane, alkene, alkyne, alcohol, ether, phenol, ketone, aldehyde, ester, quinone, substituted biphenyl, aniline, halogenated hydrocarbon, nitroaromatic compound, alkylbenzene, azobenzene, organic acid, benzamide, phthalate, polybrominated diphenyl ether, polycyclic aromatic hydrocarbon, sulfonic acid, organophosphorus compound, organosulfur compound, organoiodine compound, organofluorine compound, heterocyclic compound or organosilicon compound. The models are built on measured data for up to 3838 compounds for the molecular structural parameters E, S, A, B, L and V, which gives them a very broad applicability domain. E, S, A, B, L and V are modeled by linear regression, so the algorithm is transparent, simple and easy to interpret. The E, S, A, B, L and V values obtained by prediction are then used to predict the multi-parameter linear free energy relationships of organic compounds.
Predicting the multi-parameter linear free energy relationships of organic compounds with the method provided by the present invention is simple, fast and inexpensive; it can provide data support for chemical regulation and is of great significance for the ecological risk assessment of chemicals.
The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited to it. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the present invention.

Claims (1)

1. A method for predicting the molecular structural parameters of chemicals, characterized in that the method comprises:
S1. Optimize the molecular structure of the organic compound in Gaussian at the B3LYP/6-31G(d) level; for atoms outside the applicable range of that basis set, add a pseudopotential using LANL2DZ; add the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies;
S2. From the optimized molecular structure, calculate the values of polarizability, E_HOMO-E_LUMO, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, E_HOMO-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume, wherein polarizability is the polarizability; E_HOMO-E_LUMO is the energy gap between the frontier molecular orbitals; I_LPcount is the number of lone pairs on all iodine atoms; Atom_num is the total number of atoms; nBT is the number of chemical bonds; Mor32p is 3D-MoRSE signal 32, weighted by polarizability; nHDon is the number of hydrogen-bond donor N and O atoms; ATSC2i is the centred Broto-Moreau autocorrelation of lag 2, weighted by ionization potential; F01[C-N] is the frequency of C-N at topological distance 1; F10[O-O] is the frequency of O-O at topological distance 10; dipole_moment is the dipole moment; CCR_energy is the core-core repulsion energy; E_HOMO-1 is the energy of the second-highest occupied molecular orbital; Rperim is the ring perimeter of the molecule; Mor05u is 3D-MoRSE signal 05, unweighted; Mor02m is 3D-MoRSE signal 02, weighted by mass; nRCO is the number of aliphatic ketone groups; H-046 counts hydrogen atoms attached to an sp3-hybridized carbon with no halogen atom on the adjacent carbon; SdO is the sum of the E-states of =O; NtN is the number of ≡N atoms; H_Qmax is the highest charge on a hydrogen atom; H_Qmean is the mean charge on the hydrogen atoms in the molecule; nRCONH2 is the number of aliphatic primary amides; N-067 is Al2-NH; O-057 is the oxygen atom in phenol, enol or carboxyl OH; SsNH2 is the sum of the E-states of -NH2; CATS2D_01_AN is the CATS2D hydrogen-bond acceptor-negative charge descriptor at lag 01; CATS2D_03_DD is the CATS2D hydrogen-bond donor-donor descriptor at lag 03; CATS2D_03_DA is the CATS2D hydrogen-bond donor-acceptor descriptor at lag 03; B04[O-O] is the presence or absence of O-O at topological distance 4; nArNHR is the number of aromatic amines; O_LPcount is the number of lone pairs on all oxygen atoms; N_Qcount is the number of nitrogen atoms; Mor12i is 3D-MoRSE signal 12, weighted by ionization potential; H-047 counts hydrogen atoms attached to sp2- and sp3-hybridized carbon atoms; O-056 is the oxygen atom of a hydroxyl group; P-116 is the number of R3-P=X groups; NddsN is the number of -N(=)= atoms; B01[C-N] is the presence or absence of C-N at topological distance 1; F02[C-N] is the frequency of C-N at topological distance 2; F_LPcount is the number of lone pairs on all fluorine atoms; Br_LPcount is the number of lone pairs on all bromine atoms; H_Qcount is the number of hydrogen atoms; NaasC is the number of aasC atoms; SssCH2 is the sum of the E-states of -CH2-; F01[O-Si] is the frequency of O-Si at topological distance 1; and molarVolume is the molar volume;
S3. Calculate the molecular structure parameters of the organic compound: E according to formula (1), S according to formula (2), A according to formula (3), B according to formula (4), L according to formula (5), and V according to formula (6),
E = 0.61313 + 0.01169 × polarizability + 0.88701 × (E_HOMO - E_LUMO) + 0.12676 × I_LPcount - 0.29072 × Atom_num + 0.26076 × nBT - 0.34881 × Mor32p + 0.12675 × nHDon - 0.57231 × ATSC2i + 0.04305 × F01[C-N] + 0.14475 × F10[O-O]   formula (1),
S = 0.80280 + 0.05210 × dipole_moment + 0.00023 × CCR_energy + 1.96420 × E_HOMO-1 + 0.03975 × Rperim - 0.55400 × ATSC2i - 0.05361 × Mor05u + 0.01734 × Mor02m + 0.24280 × nRCO + 0.10889 × nHDon - 0.02352 × H-046 + 0.01438 × SdO + 0.43704 × NtN   formula (2),
A = -0.18760 + 0.41354 × H_Qmax + 0.83897 × H_Qmean + 0.20256 × nRCONH2 + 0.28056 × nHDon - 0.16539 × N-067 + 0.08320 × O-057 - 0.07177 × SsNH2 + 0.14845 × CATS2D_01_AN - 0.12936 × CATS2D_03_DD - 0.04406 × CATS2D_03_DA - 0.08829 × B04[O-O] - 0.21963 × nArNHR   formula (3),
B = -0.01310 + 0.08131 × O_LPcount + 0.13056 × N_Qcount - 0.09927 × Mor12i + 0.18232 × nRCO + 0.01458 × H-047 + 0.14627 × O-056 + 0.95757 × P-116 - 0.53368 × NddsN + 0.14104 × B01[C-N] + 0.03503 × F02[C-N]   formula (4),
L = 0.44713 + 0.03226 × polarizability - 0.16282 × F_LPcount + 0.07766 × Br_LPcount + 0.25237 × Atom_num - 0.35911 × H_Qcount + 0.48173 × nHDon - 0.08596 × NaasC + 0.06518 × SssCH2 - 0.43300 × F01[O-Si]   formula (5),
V = -0.00910 + 1.027 × (molarVolume / 100)   formula (6),
wherein molecular parameter E is the excess molar refraction, molecular parameter L is the hexadecane-water partition coefficient, molecular parameter A is the hydrogen-bond acidity, molecular parameter B is the hydrogen-bond basicity, molecular parameter S is the polarity/dipole moment parameter, and molecular parameter V is the McGowan characteristic molecular volume.
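The six linear models in step S3 can be evaluated mechanically once the descriptor values from step S2 are available. The following is a minimal sketch, not part of the claimed method: the names `MODELS` and `predict_parameters`, and the dictionary keys `"E_HOMO-E_LUMO"` and `"E_HOMO-1"`, are illustrative assumptions, and the descriptor values are assumed to have been computed beforehand in the same units used when the models were fitted (the claim does not state those units). Only the intercepts and coefficients are taken from formulas (1) to (6).

```python
# Each entry is (intercept, {descriptor name: coefficient}), copied from formulas (1)-(5).
# The key names for the orbital-energy terms are illustrative, not defined by the patent.
MODELS = {
    "E": (0.61313, {
        "polarizability": 0.01169, "E_HOMO-E_LUMO": 0.88701, "I_LPcount": 0.12676,
        "Atom_num": -0.29072, "nBT": 0.26076, "Mor32p": -0.34881, "nHDon": 0.12675,
        "ATSC2i": -0.57231, "F01[C-N]": 0.04305, "F10[O-O]": 0.14475,
    }),
    "S": (0.80280, {
        "dipole_moment": 0.05210, "CCR_energy": 0.00023, "E_HOMO-1": 1.96420,
        "Rperim": 0.03975, "ATSC2i": -0.55400, "Mor05u": -0.05361, "Mor02m": 0.01734,
        "nRCO": 0.24280, "nHDon": 0.10889, "H-046": -0.02352, "SdO": 0.01438,
        "NtN": 0.43704,
    }),
    "A": (-0.18760, {
        "H_Qmax": 0.41354, "H_Qmean": 0.83897, "nRCONH2": 0.20256, "nHDon": 0.28056,
        "N-067": -0.16539, "O-057": 0.08320, "SsNH2": -0.07177,
        "CATS2D_01_AN": 0.14845, "CATS2D_03_DD": -0.12936, "CATS2D_03_DA": -0.04406,
        "B04[O-O]": -0.08829, "nArNHR": -0.21963,
    }),
    "B": (-0.01310, {
        "O_LPcount": 0.08131, "N_Qcount": 0.13056, "Mor12i": -0.09927, "nRCO": 0.18232,
        "H-047": 0.01458, "O-056": 0.14627, "P-116": 0.95757, "NddsN": -0.53368,
        "B01[C-N]": 0.14104, "F02[C-N]": 0.03503,
    }),
    "L": (0.44713, {
        "polarizability": 0.03226, "F_LPcount": -0.16282, "Br_LPcount": 0.07766,
        "Atom_num": 0.25237, "H_Qcount": -0.35911, "nHDon": 0.48173, "NaasC": -0.08596,
        "SssCH2": 0.06518, "F01[O-Si]": -0.43300,
    }),
}


def predict_parameters(descriptors):
    """Return {"E", "S", "A", "B", "L", "V"} for one compound.

    `descriptors` maps descriptor names to numeric values; a missing
    descriptor raises KeyError rather than being silently treated as zero.
    """
    result = {}
    for name, (intercept, coefficients) in MODELS.items():
        # Linear model: intercept plus the coefficient-weighted descriptor values.
        result[name] = intercept + sum(
            coef * descriptors[key] for key, coef in coefficients.items()
        )
    # Formula (6): V depends only on the molar volume, scaled by 1/100.
    result["V"] = -0.00910 + 1.027 * (descriptors["molarVolume"] / 100)
    return result
```

Keeping the coefficients in a table rather than in six hand-written expressions makes it easier to check each value against the formulas above, and descriptors shared between models (for example nHDon, which appears in E, S, A and L) only need to be supplied once.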
CN201811378715.5A 2018-11-19 2018-11-19 Method for predicting molecular structure parameters of chemicals Active CN109493922B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811378715.5A CN109493922B (en) 2018-11-19 2018-11-19 Method for predicting molecular structure parameters of chemicals

Publications (2)

Publication Number Publication Date
CN109493922A 2019-03-19
CN109493922B 2021-06-29

Family

ID=65696276

Country Status (1)

Country Link
CN (1) CN109493922B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103788276A (en) * 2005-12-09 2014-05-14 陶氏环球技术有限责任公司 Processes of controlling molecular weight distribution in ethylene/alpha-olefin compositions
US20150293057A1 (en) * 2012-10-29 2015-10-15 University Of Utah Research Foundation Functionalized nanotube sensors and related methods
CN106588802A (en) * 2016-10-31 2017-04-26 南京工程学院 Bis(tetrazole-2-oxy-4-hydro)amine, design method, and application thereof
CN107563133A (en) * 2017-08-30 2018-01-09 大连理工大学 Using the method for the chlorine radical reaction rate constant of quantitative structure activity relationship model prediction organic chemicals
CN108140920A (en) * 2015-10-27 2018-06-08 住友化学株式会社 Magnesium air electrode for cell and magnesium air battery and aromatic compound and metal complex

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111986740A (en) * 2020-09-03 2020-11-24 平安国际智慧城市科技股份有限公司 Compound classification method and related equipment
CN111986740B (en) * 2020-09-03 2024-05-14 深圳赛安特技术服务有限公司 Method for classifying compounds and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant