CN109493922A - Method for predicting molecular structural parameters of chemicals - Google Patents
- Publication number: CN109493922A (application CN201811378715.5A)
- Authority: CN (China)
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Landscapes: Organic Low-Molecular-Weight Compounds And Preparation Thereof (AREA)
Abstract
The present invention provides a method for predicting molecular structural parameters of chemicals. The method comprises optimizing the molecular structure of an organic compound and calculating the parameters of the organic compound from the optimized molecular structure. The method, which supports prediction with multi-parameter linear free energy relationships, is applicable to many classes of organic compounds. The models for E, S, A, B, L and V are built on measured data for 3838 compounds, which gives a very broad applicability domain, and they use linear regression, so the algorithm is transparent, simple and easy to interpret. Predicting the partition coefficients of chemicals through multi-parameter linear free energy relationships with this method is simple, fast and inexpensive; it can provide data support for chemical regulation and is of great significance for the ecological risk assessment of chemicals.
Description
Technical field
The present invention relates to the field of testing strategies for ecological risk assessment and, in particular, to a method for predicting molecular structural parameters of chemicals.
Background art
The distribution, reaction rates, bioconcentration and toxic effects of organic chemicals in ecological and environmental systems all depend on their partitioning behavior. Determining the equilibrium partition coefficients of substances by experiment is time-consuming, costly and error-prone. When a compound is difficult to obtain, or when the number of compounds to be tested is large, relying on measurement alone to obtain molecular structural parameters becomes even more difficult. It is therefore necessary to develop reliable methods for predicting the molecular structural parameters that describe the equilibrium partitioning of substances in the environment.
Multi-parameter linear free energy relationships have been shown to characterize the equilibrium partitioning of organic chemicals in a wide range of environmental and technical partitioning systems, and to give accurate predictions. However, the molecular structural parameters used in these relationships (E, the excess molar refraction; L, the hexadecane-water partition coefficient; A, the hydrogen-bond acidity; B, the hydrogen-bond basicity; S, the polarity/dipole moment; V, the McGowan characteristic molecular volume) are obtained through complex and varied experimental methods. The Chemical Abstracts Service (CAS) has registered more than 135 million chemicals, and the number grows by about 15,000 per day. Faced with so many organic chemicals, determining their molecular structural parameters by experiment alone is clearly impractical. At present only about 4000 substances have experimental values for these parameters. There is therefore an urgent need for non-experimental techniques that yield molecular structural parameter values efficiently and rapidly, so that multi-parameter linear free energy relationships can be predicted and the needs of ecological risk assessment and management of organic chemicals can be met.
Summary of the invention
The object of the present invention is to provide a simple, fast and efficient method for predicting molecular structural parameters of organic chemicals. The method predicts the molecular structural parameters of a compound from its molecular structure, from which its partition coefficients can then be estimated, providing the basic data needed for chemical risk assessment and management.
To achieve the above object, the present invention provides a method for predicting molecular structural parameters of chemicals, the method comprising:
S1. optimizing the molecular structure of the organic compound with Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms beyond the computational range of the method, and adding the keywords pop=NBO and Volume; the optimized molecular structure must be stable, with no imaginary frequencies;
S2. calculating, from the optimized molecular structure, the values of polarizability, E_HOMO-E_LUMO, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, E_HOMO-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume, wherein: polarizability is the polarizability; E_HOMO-E_LUMO is the frontier molecular orbital energy gap; I_LPcount is the number of lone electron pairs on all iodine atoms; Atom_num is the total number of atoms; nBT is the number of chemical bonds; Mor32p is 3D-MoRSE signal 32, weighted by polarizability; nHDon is the number of hydrogen-bond donor N and O atoms; ATSC2i is the centred Broto-Moreau autocorrelation of lag 2, weighted by ionization potential; F01[C-N] is the frequency of C-N at topological distance 1; F10[O-O] is the frequency of O-O at topological distance 10; dipole_moment is the dipole moment; CCR_energy is the core-core repulsion energy; E_HOMO-1 is the energy of the second-highest occupied orbital; Rperim is the ring perimeter of the molecule; Mor05u is 3D-MoRSE signal 05, unweighted; Mor02m is 3D-MoRSE signal 02, weighted by mass; nRCO is the number of aliphatic ketone groups; H-046 is a hydrogen atom attached to an sp3-hybridized carbon atom with no halogen atom on the adjacent carbon atom; SdO is the sum of the =O E-states; NtN is the number of ≡N nitrogen atoms; H_Qmax is the highest charge on a hydrogen atom; H_Qmean is the mean charge on the hydrogen atoms in the molecule; nRCONH2 is the number of aliphatic primary amides; N-067 is Al2-NH; O-057 is the oxygen atom of a phenol, enol or carboxyl group; SsNH2 is the sum of the -NH2 E-states; CATS2D_01_AN is the CATS2D hydrogen-bond acceptor/negative-charge descriptor at lag 01; CATS2D_03_DD is the CATS2D hydrogen-bond donor/donor descriptor at lag 03; CATS2D_03_DA is the CATS2D hydrogen-bond donor/acceptor descriptor at lag 03; B04[O-O] is the presence/absence of O-O at topological distance 4; nArNHR is the number of aromatic amines; O_LPcount is the number of lone electron pairs on all oxygen atoms; N_Qcount is the number of nitrogen atoms; Mor12i is 3D-MoRSE signal 12, weighted by ionization potential; H-047 is a hydrogen atom attached to an sp2- or sp3-hybridized carbon atom; O-056 is the oxygen atom of a hydroxyl group; P-116 is the number of R3-P=X groups; NddsN is the number of -N(=)= nitrogen atoms; B01[C-N] is the presence/absence of C-N at topological distance 1; F02[C-N] is the frequency of C-N at topological distance 2; F_LPcount is the number of lone electron pairs on all fluorine atoms; Br_LPcount is the number of lone electron pairs on all bromine atoms; H_Qcount is the number of hydrogen atoms; NaasC is the number of aasC carbon atoms; SssCH2 is the sum of the -CH2- E-states; F01[O-Si] is the frequency of O-Si at topological distance 1; and molarVolume is the molar volume;
S3. calculating the molecular structural parameter E of the organic compound according to formula (7), S according to formula (8), A according to formula (9), B according to formula (10), L according to formula (11) and V according to formula (12):
E = 0.61313 + 0.01169 polarizability + 0.88701 (E_HOMO - E_LUMO) + 0.12676 I_LPcount - 0.29072 Atom_num + 0.26076 nBT - 0.34881 Mor32p + 0.12675 nHDon - 0.57231 ATSC2i + 0.04305 F01[C-N] + 0.14475 F10[O-O]    formula (7),
S = 0.80280 + 0.05210 dipole_moment + 0.00023 CCR_energy + 1.96420 E_HOMO-1 + 0.03975 Rperim - 0.55400 ATSC2i - 0.05361 Mor05u + 0.01734 Mor02m + 0.24280 nRCO + 0.10889 nHDon - 0.02352 H-046 + 0.01438 SdO + 0.43704 NtN    formula (8),
A = -0.18760 + 0.41354 H_Qmax + 0.83897 H_Qmean + 0.20256 nRCONH2 + 0.28056 nHDon - 0.16539 N-067 + 0.08320 O-057 - 0.07177 SsNH2 + 0.14845 CATS2D_01_AN - 0.12936 CATS2D_03_DD - 0.04406 CATS2D_03_DA - 0.08829 B04[O-O] - 0.21963 nArNHR    formula (9),
B = -0.01310 + 0.08131 O_LPcount + 0.13056 N_Qcount - 0.09927 Mor12i + 0.18232 nRCO + 0.01458 H-047 + 0.14627 O-056 + 0.95757 P-116 - 0.53368 NddsN + 0.14104 B01[C-N] + 0.03503 F02[C-N]    formula (10),
L = 0.44713 + 0.03226 polarizability - 0.16282 F_LPcount + 0.07766 Br_LPcount + 0.25237 Atom_num - 0.35911 H_Qcount + 0.48173 nHDon - 0.08596 NaasC + 0.06518 SssCH2 - 0.43300 F01[O-Si]    formula (11),
V = -0.00910 + 1.027 (molarVolume/100)    formula (12),
wherein E is the excess molar refraction, L is the hexadecane-water partition coefficient, A is the hydrogen-bond acidity, B is the hydrogen-bond basicity, S is the polarity/dipole moment, and V is the McGowan characteristic molecular volume;
The organic compound may be an alkane, alkene, alkyne, alcohol, ether, phenol, ketone, aldehyde, ester, quinone, substituted biphenyl, aniline, halogenated hydrocarbon, nitroaromatic compound, alkylbenzene, azobenzene, organic acid, benzamide, phthalate, polybrominated diphenyl ether, polycyclic aromatic hydrocarbon, sulfonic acid, organophosphorus compound, organosulfur compound, organoiodine compound, organofluorine compound, heterocyclic compound or organosilicon compound.
The method for predicting molecular structural parameters of chemicals provided by the present invention is applicable to many classes of organic compounds. The models for the molecular structural parameters E, S, A, B, L and V are built on measured data for 3838 compounds, which gives a very broad applicability domain, and they use linear regression, so the algorithm is transparent, simple and easy to interpret. Multi-parameter linear free energy relationships of organic compounds evaluated with the predicted parameters E, S, A, B, L and V give accurate results. Predicting multi-parameter linear free energy relationships of organic compounds with this method is simple, fast and inexpensive; it can provide data support for chemical regulation and is of great significance for the ecological risk assessment of chemicals.
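As an illustration of step S1, the following minimal Python sketch assembles a Gaussian input file with the keywords named in the method. The function name build_gaussian_input and the water geometry are hypothetical. The freq keyword is an assumption: the text only requires that the optimized structure has no imaginary frequencies, which in practice means a frequency calculation is run. The LANL2DZ pseudopotential for heavy atoms is not covered by this sketch.

```python
def build_gaussian_input(title, atoms, charge=0, multiplicity=1):
    """Return the text of a Gaussian input file for step S1.

    atoms: list of (element, x, y, z) tuples, coordinates in angstroms.
    """
    # Route section: geometry optimization at B3LYP/6-31G(d) with the
    # pop=NBO and Volume keywords from step S1. "freq" is an assumption
    # added so that imaginary frequencies can be checked afterwards.
    route = "# opt freq B3LYP/6-31G(d) pop=NBO volume"
    lines = [route, "", title, "", f"{charge} {multiplicity}"]
    for element, x, y, z in atoms:
        lines.append(f"{element:<2s} {x:12.6f} {y:12.6f} {z:12.6f}")
    lines.append("")  # the molecule specification ends with a blank line
    return "\n".join(lines) + "\n"


# Hypothetical starting geometry for water, for illustration only.
water = [
    ("O", 0.000000, 0.000000, 0.117300),
    ("H", 0.000000, 0.757200, -0.469200),
    ("H", 0.000000, -0.757200, -0.469200),
]
print(build_gaussian_input("water, S1 optimization", water))
```

Keeping the route line in one place makes it easy to apply the same settings to every compound in a batch.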
Other features and advantages of the present invention are described in detail in the detailed description below.
Detailed description
Specific embodiments of the present invention are described in detail below. It should be understood that the specific embodiments described herein serve only to illustrate and explain the present invention and are not intended to limit it.
The present invention provides a method for predicting molecular structural parameters of chemicals, the method comprising:
S1. optimizing the molecular structure of the organic compound with Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms beyond the computational range of the method, and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies;
S2. calculating, from the optimized molecular structure, the values of polarizability, E_HOMO-E_LUMO, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, E_HOMO-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume, wherein: polarizability is the polarizability; E_HOMO-E_LUMO is the frontier molecular orbital energy gap; I_LPcount is the number of lone electron pairs on all iodine atoms; Atom_num is the total number of atoms; nBT is the number of chemical bonds; Mor32p is 3D-MoRSE signal 32, weighted by polarizability; nHDon is the number of hydrogen-bond donor N and O atoms; ATSC2i is the centred Broto-Moreau autocorrelation of lag 2, weighted by ionization potential; F01[C-N] is the frequency of C-N at topological distance 1; F10[O-O] is the frequency of O-O at topological distance 10; dipole_moment is the dipole moment; CCR_energy is the core-core repulsion energy; E_HOMO-1 is the energy of the second-highest occupied orbital; Rperim is the ring perimeter of the molecule; Mor05u is 3D-MoRSE signal 05, unweighted; Mor02m is 3D-MoRSE signal 02, weighted by mass; nRCO is the number of aliphatic ketone groups; H-046 is a hydrogen atom attached to an sp3-hybridized carbon atom with no halogen atom on the adjacent carbon atom; SdO is the sum of the =O E-states; NtN is the number of ≡N nitrogen atoms; H_Qmax is the highest charge on a hydrogen atom; H_Qmean is the mean charge on the hydrogen atoms in the molecule; nRCONH2 is the number of aliphatic primary amides; N-067 is Al2-NH; O-057 is the oxygen atom of a phenol, enol or carboxyl group; SsNH2 is the sum of the -NH2 E-states; CATS2D_01_AN is the CATS2D hydrogen-bond acceptor/negative-charge descriptor at lag 01; CATS2D_03_DD is the CATS2D hydrogen-bond donor/donor descriptor at lag 03; CATS2D_03_DA is the CATS2D hydrogen-bond donor/acceptor descriptor at lag 03; B04[O-O] is the presence/absence of O-O at topological distance 4; nArNHR is the number of aromatic amines; O_LPcount is the number of lone electron pairs on all oxygen atoms; N_Qcount is the number of nitrogen atoms; Mor12i is 3D-MoRSE signal 12, weighted by ionization potential; H-047 is a hydrogen atom attached to an sp2- or sp3-hybridized carbon atom; O-056 is the oxygen atom of a hydroxyl group; P-116 is the number of R3-P=X groups; NddsN is the number of -N(=)= nitrogen atoms; B01[C-N] is the presence/absence of C-N at topological distance 1; F02[C-N] is the frequency of C-N at topological distance 2; F_LPcount is the number of lone electron pairs on all fluorine atoms; Br_LPcount is the number of lone electron pairs on all bromine atoms; H_Qcount is the number of hydrogen atoms; NaasC is the number of aasC carbon atoms; SssCH2 is the sum of the -CH2- E-states; F01[O-Si] is the frequency of O-Si at topological distance 1; and molarVolume is the molar volume;
S3. calculating the molecular structural parameter E of the organic compound according to formula (13), S according to formula (14), A according to formula (15), B according to formula (16), L according to formula (17) and V according to formula (18):
E = 0.61313 + 0.01169 polarizability + 0.88701 (E_HOMO - E_LUMO) + 0.12676 I_LPcount - 0.29072 Atom_num + 0.26076 nBT - 0.34881 Mor32p + 0.12675 nHDon - 0.57231 ATSC2i + 0.04305 F01[C-N] + 0.14475 F10[O-O]    formula (13),
S = 0.80280 + 0.05210 dipole_moment + 0.00023 CCR_energy + 1.96420 E_HOMO-1 + 0.03975 Rperim - 0.55400 ATSC2i - 0.05361 Mor05u + 0.01734 Mor02m + 0.24280 nRCO + 0.10889 nHDon - 0.02352 H-046 + 0.01438 SdO + 0.43704 NtN    formula (14),
A = -0.18760 + 0.41354 H_Qmax + 0.83897 H_Qmean + 0.20256 nRCONH2 + 0.28056 nHDon - 0.16539 N-067 + 0.08320 O-057 - 0.07177 SsNH2 + 0.14845 CATS2D_01_AN - 0.12936 CATS2D_03_DD - 0.04406 CATS2D_03_DA - 0.08829 B04[O-O] - 0.21963 nArNHR    formula (15),
B = -0.01310 + 0.08131 O_LPcount + 0.13056 N_Qcount - 0.09927 Mor12i + 0.18232 nRCO + 0.01458 H-047 + 0.14627 O-056 + 0.95757 P-116 - 0.53368 NddsN + 0.14104 B01[C-N] + 0.03503 F02[C-N]    formula (16),
L = 0.44713 + 0.03226 polarizability - 0.16282 F_LPcount + 0.07766 Br_LPcount + 0.25237 Atom_num - 0.35911 H_Qcount + 0.48173 nHDon - 0.08596 NaasC + 0.06518 SssCH2 - 0.43300 F01[O-Si]    formula (17),
V = -0.00910 + 1.027 (molarVolume/100)    formula (18),
wherein E is the excess molar refraction, L is the hexadecane-water partition coefficient, A is the hydrogen-bond acidity, B is the hydrogen-bond basicity, S is the polarity/dipole moment, and V is the McGowan characteristic molecular volume;
The organic compound may be an alkane, alkene, alkyne, alcohol, ether, phenol, ketone, aldehyde, ester, quinone, substituted biphenyl, aniline, halogenated hydrocarbon, nitroaromatic compound, alkylbenzene, azobenzene, organic acid, benzamide, phthalate, polybrominated diphenyl ether, polycyclic aromatic hydrocarbon, sulfonic acid, organophosphorus compound, organosulfur compound, organoiodine compound, organofluorine compound, heterocyclic compound or organosilicon compound.
The method for predicting molecular structural parameters of chemicals provided by the present invention is applicable to many classes of organic compounds. The models for the molecular structural parameters E, S, A, B, L and V are built on measured data for 3838 compounds, which gives a very broad applicability domain, and they use linear regression, so the algorithm is transparent, simple and easy to interpret. The predicted parameters E, S, A, B, L and V are used to predict multi-parameter linear free energy relationships of organic compounds. Predicting these relationships with the method provided by the present invention is simple, fast and inexpensive; it can provide data support for chemical regulation and is of great significance for the ecological risk assessment of chemicals.
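The requirement in step S1 that the optimized structure has no imaginary frequencies can be checked automatically. The sketch below is a minimal example that assumes the frequency job writes lines beginning with "Frequencies --" to its log file and reports imaginary modes as negative numbers, as Gaussian output normally does. The function name and the log excerpt are hypothetical.

```python
def imaginary_frequencies(log_text):
    """Return every negative (imaginary) frequency found in the log text."""
    found = []
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Frequencies --"):
            # Everything after the first "--" is a list of frequencies in cm^-1.
            values = [float(tok) for tok in stripped.split("--", 1)[1].split()]
            found.extend(v for v in values if v < 0.0)
    return found


# Made-up log excerpt, for illustration only.
sample_log = """\
 Frequencies --    -45.1200               102.3300               250.0000
 Red. masses --      1.0500                 2.1000                 3.2000
 Frequencies --    300.1000               400.2000               500.3000
"""
print(imaginary_frequencies(sample_log))  # prints [-45.12]
```

An empty list means the structure passed the check and can be used in step S2. Any entry in the list means the geometry should be re-optimized first.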
The present invention is described in more detail below by way of embodiments.
Embodiment 1
Given the compound 4-nitrochlorobenzene (CAS No. 100-00-5), predict the value of its molecular structural parameter E. First, the molecular structure of the compound is optimized with Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms beyond the computational range of the method and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies. From the optimized molecular structure, Dragon 6.0 is used to calculate polarizability, E_HOMO-E_LUMO, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N] and F10[O-O], giving 85.02, -0.180, 0, 14, 14, -0.160, 0, 0.249, 1 and 0, respectively. Formula (19) then gives a predicted value of 0.98, which agrees with the experimental value; the prediction is good.
E = 0.61313 + 0.01169 polarizability + 0.88701 (E_HOMO - E_LUMO) + 0.12676 I_LPcount - 0.29072 Atom_num + 0.26076 nBT - 0.34881 Mor32p + 0.12675 nHDon - 0.57231 ATSC2i + 0.04305 F01[C-N] + 0.14475 F10[O-O]    formula (19).
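The arithmetic of this embodiment can be reproduced with a few lines of Python. The sketch below is illustrative only: the coefficients and descriptor values are copied from the text above, while the dictionary layout and the function name predict_E are assumptions made for the example.

```python
# Intercept and coefficients of formula (19), copied from the text.
E_INTERCEPT = 0.61313
E_COEFFICIENTS = {
    "polarizability": 0.01169,
    "E_HOMO-E_LUMO": 0.88701,
    "I_LPcount": 0.12676,
    "Atom_num": -0.29072,
    "nBT": 0.26076,
    "Mor32p": -0.34881,
    "nHDon": 0.12675,
    "ATSC2i": -0.57231,
    "F01[C-N]": 0.04305,
    "F10[O-O]": 0.14475,
}


def predict_E(descriptors):
    """Evaluate formula (19) for a dict mapping descriptor names to values."""
    return E_INTERCEPT + sum(
        coeff * descriptors[name] for name, coeff in E_COEFFICIENTS.items()
    )


# Descriptor values quoted in this embodiment for 4-nitrochlorobenzene.
nitrochlorobenzene = {
    "polarizability": 85.02,
    "E_HOMO-E_LUMO": -0.180,
    "I_LPcount": 0,
    "Atom_num": 14,
    "nBT": 14,
    "Mor32p": -0.160,
    "nHDon": 0,
    "ATSC2i": 0.249,
    "F01[C-N]": 1,
    "F10[O-O]": 0,
}

E_pred = predict_E(nitrochlorobenzene)
print(f"E = {E_pred:.2f}")  # prints "E = 0.98"
```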
Embodiment 2
Given the compound 1,4-diisopropylbenzene (CAS No. 100-18-5), predict its S value. First, the molecular structure of the compound is optimized with Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms beyond the computational range of the method and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies. From the optimized molecular structure, Dragon 6.0 is used to calculate dipole_moment, CCR_energy, E_HOMO-1, Rperim, ATSC2i, Mor05u, Mor02m, nRCO, nHDon, H-046, SdO and NtN, giving 0.0191, 682.158, -0.24112, 6, 0.6, -4.027, 11.018, 0, 0, 14, 0 and 0, respectively. Formula (20) then gives a predicted value of 0.47; the experimental value is 0.474, so the prediction is good.
S = 0.80280 + 0.05210 dipole_moment + 0.00023 CCR_energy + 1.96420 E_HOMO-1 + 0.03975 Rperim - 0.55400 ATSC2i - 0.05361 Mor05u + 0.01734 Mor02m + 0.24280 nRCO + 0.10889 nHDon - 0.02352 H-046 + 0.01438 SdO + 0.43704 NtN    formula (20).
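Because formula (20) is linear, the contribution of each descriptor to S can be inspected term by term, which is what makes the model easy to interpret. The sketch below uses the coefficients and values quoted in this embodiment; the helper name s_contributions is hypothetical.

```python
# Intercept and coefficients of formula (20), copied from the text.
S_INTERCEPT = 0.80280
S_COEFFICIENTS = {
    "dipole_moment": 0.05210,
    "CCR_energy": 0.00023,
    "E_HOMO-1": 1.96420,
    "Rperim": 0.03975,
    "ATSC2i": -0.55400,
    "Mor05u": -0.05361,
    "Mor02m": 0.01734,
    "nRCO": 0.24280,
    "nHDon": 0.10889,
    "H-046": -0.02352,
    "SdO": 0.01438,
    "NtN": 0.43704,
}


def s_contributions(descriptors):
    """Return {name: coefficient * value} for every term of formula (20)."""
    return {
        name: coeff * descriptors[name] for name, coeff in S_COEFFICIENTS.items()
    }


# Descriptor values quoted in this embodiment for 1,4-diisopropylbenzene.
diisopropylbenzene = {
    "dipole_moment": 0.0191, "CCR_energy": 682.158, "E_HOMO-1": -0.24112,
    "Rperim": 6, "ATSC2i": 0.6, "Mor05u": -4.027, "Mor02m": 11.018,
    "nRCO": 0, "nHDon": 0, "H-046": 14, "SdO": 0, "NtN": 0,
}

terms = s_contributions(diisopropylbenzene)
S_pred = S_INTERCEPT + sum(terms.values())

# Show the non-zero terms, largest magnitude first.
for name, value in sorted(terms.items(), key=lambda kv: -abs(kv[1])):
    if value != 0.0:
        print(f"{name:>14s} {value:+.4f}")
print(f"S = {S_pred:.2f}")  # prints "S = 0.47"
```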
Embodiment 3
Given the compound acetanisole (CAS No. 100-06-1), predict its A value. First, the molecular structure of the compound is optimized with Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms beyond the computational range of the method and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies. From the optimized molecular structure, Dragon 6.0 is used to calculate H_Qmax, H_Qmean, nRCONH2, nHDon, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O] and nArNHR, giving 0.179382, 0.1588766, 0, 0, 0, 0, 0, 0, 0, 0, 0 and 0, respectively. Formula (21) then gives a predicted value of 0.019; the experimental value is 0, so the prediction is good.
A = -0.18760 + 0.41354 H_Qmax + 0.83897 H_Qmean + 0.20256 nRCONH2 + 0.28056 nHDon - 0.16539 N-067 + 0.08320 O-057 - 0.07177 SsNH2 + 0.14845 CATS2D_01_AN - 0.12936 CATS2D_03_DD - 0.04406 CATS2D_03_DA - 0.08829 B04[O-O] - 0.21963 nArNHR    formula (21).
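Since every count descriptor in this embodiment is zero, formula (21) reduces to the intercept plus the two hydrogen-charge terms. As a check on the arithmetic, using only the values quoted above:

```latex
\begin{aligned}
A &= -0.18760 + 0.41354 \times 0.179382 + 0.83897 \times 0.1588766 \\
  &\approx -0.18760 + 0.07418 + 0.13329 \\
  &\approx 0.0199
\end{aligned}
```

This corresponds to the quoted 0.019 when truncated to three decimal places.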
Embodiment 4
Given the compound propylbenzene (CAS No. 103-65-1), predict the logarithm of its partition coefficient in methanol/water. First, the molecular structure of the compound is optimized with Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms beyond the computational range of the method and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies. From the optimized molecular structure, Dragon 6.0 is used to calculate the values of polarizability, E_HOMO-E_LUMO, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, E_HOMO-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume. According to formulas (22), (23), (24), (25) and (26), the molecular parameters E, S, A, B and V of propylbenzene are calculated as 0.626, 0.380, -0.016, 0.225 and 1.126, respectively. Formula (27), reported in the 2004 publication of Abraham M. H. et al., then gives a logarithmic partition coefficient of propylbenzene in methanol/water of 3.42; the experimental value is 3.52, so the prediction is good.
E = 0.61313 + 0.01169 polarizability + 0.88701 (E_HOMO - E_LUMO) + 0.12676 I_LPcount - 0.29072 Atom_num + 0.26076 nBT - 0.34881 Mor32p + 0.12675 nHDon - 0.57231 ATSC2i + 0.04305 F01[C-N] + 0.14475 F10[O-O]    formula (22),
S = 0.80280 + 0.05210 dipole_moment + 0.00023 CCR_energy + 1.96420 E_HOMO-1 + 0.03975 Rperim - 0.55400 ATSC2i - 0.05361 Mor05u + 0.01734 Mor02m + 0.24280 nRCO + 0.10889 nHDon - 0.02352 H-046 + 0.01438 SdO + 0.43704 NtN    formula (23),
A = -0.18760 + 0.41354 H_Qmax + 0.83897 H_Qmean + 0.20256 nRCONH2 + 0.28056 nHDon - 0.16539 N-067 + 0.08320 O-057 - 0.07177 SsNH2 + 0.14845 CATS2D_01_AN - 0.12936 CATS2D_03_DD - 0.04406 CATS2D_03_DA - 0.08829 B04[O-O] - 0.21963 nArNHR    formula (24),
B = -0.01310 + 0.08131 O_LPcount + 0.13056 N_Qcount - 0.09927 Mor12i + 0.18232 nRCO + 0.01458 H-047 + 0.14627 O-056 + 0.95757 P-116 - 0.53368 NddsN + 0.14104 B01[C-N] + 0.03503 F02[C-N]    formula (25),
V = -0.00910 + 1.027 (molarVolume/100)    formula (26),
log K = 0.299 E - 0.671 S + 0.080 A - 3.389 B + 3.512 V + 0.329    formula (27).
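Once E, S, A, B and V are available, evaluating a multi-parameter linear free energy relationship such as formula (27) is a single weighted sum. The following sketch shows one way to organize that step. The Descriptors type, the constant METHANOL_WATER and the function log_k are hypothetical names; the coefficients are those of formula (27) and the inputs are the rounded values quoted in this embodiment.

```python
from typing import NamedTuple


class Descriptors(NamedTuple):
    """Predicted molecular structural parameters of one compound."""
    E: float
    S: float
    A: float
    B: float
    V: float


# Coefficients (e, s, a, b, v, constant) of formula (27), methanol/water.
METHANOL_WATER = (0.299, -0.671, 0.080, -3.389, 3.512, 0.329)


def log_k(d, system=METHANOL_WATER):
    """Evaluate log K = e*E + s*S + a*A + b*B + v*V + c for one system."""
    e, s, a, b, v, c = system
    return e * d.E + s * d.S + a * d.A + b * d.B + v * d.V + c


# Rounded parameter values quoted in this embodiment for propylbenzene.
propylbenzene = Descriptors(E=0.626, S=0.380, A=-0.016, B=0.225, V=1.126)
print(f"log K (methanol/water) = {log_k(propylbenzene):.2f}")
```

Passing a different coefficient tuple as system lets the same function be used for other solvent/water pairs.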
Embodiment 5
Given the compound bromobutane (CAS No. 109-65-9), predict the logarithm of its partition coefficient in ethanol/water. First, the molecular structure of the compound is optimized with Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms beyond the computational range of the method and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies. From the optimized molecular structure, Dragon 6.0 is used to calculate the values of polarizability, E_HOMO-E_LUMO, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, E_HOMO-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume. According to formulas (22), (23), (24), (25) and (26), the molecular parameters E, S, A, B and V of bromobutane are calculated as 0.252, 0.279, 0.019, 0.052 and 0.799, respectively. Formula (28), reported in the 2004 publication of Abraham M. H. et al., then gives a logarithmic partition coefficient of bromobutane in ethanol/water of 3.45; the experimental value is 3.52, so the prediction is good.
log K = 0.409 E - 0.959 S + 0.186 A - 3.645 B + 3.928 V + 0.208    formula (28).
Embodiment 6
Given the compound dimethyl ether (CAS No. 115-10-6), predict the logarithm of its partition coefficient in pentanol/water. First, the molecular structure of the compound is optimized with Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms beyond the computational range of the method and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies. From the optimized molecular structure, Dragon 6.0 is used to calculate the values of polarizability, E_HOMO-E_LUMO, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, E_HOMO-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume. According to formulas (22), (23), (24), (25) and (26), the molecular parameters E, S, A, B and V of propylbenzene are calculated as 0.252, 0.279, 0.019, 0.052 and 0.799, respectively. Formula (29), reported in the 2004 publication of Abraham M. H. et al., then gives a logarithmic partition coefficient of bromobutane in ethanol/water of 3.45; the experimental value is 3.52, so the prediction is good.
log K = 0.521 E - 1.294 S + 0.208 A - 3.908 B + 4.208 V + 0.08    formula (29).
As the above embodiments show, the method for predicting molecular structural parameters of chemicals provided by the present invention is applicable to many classes of organic compounds. The organic compound may be an alkane, alkene, alkyne, alcohol, ether, phenol, ketone, aldehyde, ester, quinone, substituted biphenyl, aniline, halogenated hydrocarbon, nitroaromatic compound, alkylbenzene, azobenzene, organic acid, benzamide, phthalate, polybrominated diphenyl ether, polycyclic aromatic hydrocarbon, sulfonic acid, organophosphorus compound, organosulfur compound, organoiodine compound, organofluorine compound, heterocyclic compound or organosilicon compound. The models for the molecular structural parameters E, S, A, B, L and V are built on measured data for 3838 compounds, which gives a very broad applicability domain, and they use linear regression, so the algorithm is transparent, simple and easy to interpret. The predicted parameters E, S, A, B, L and V are used to predict multi-parameter linear free energy relationships of organic compounds.
Predicting multi-parameter linear free energy relationships of organic compounds with the method provided by the present invention is simple, fast and inexpensive; it can provide data support for chemical regulation and is of great significance for the ecological risk assessment of chemicals.
The foregoing is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the present invention.
Claims (1)
1. A method for predicting molecular structural parameters of chemicals, characterized in that the method comprises:
S1. optimizing the molecular structure of the organic compound with Gaussian at the B3LYP/6-31G(d) level, adding a pseudopotential with LANL2DZ for atoms beyond the computational range of the method, and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable, with no imaginary frequencies;
S2. calculating, from the optimized molecular structure, the values of polarizability, E_HOMO-E_LUMO, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, E_HOMO-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume, wherein: polarizability is the polarizability; E_HOMO-E_LUMO is the frontier molecular orbital energy gap; I_LPcount is the number of lone electron pairs on all iodine atoms; Atom_num is the total number of atoms; nBT is the number of chemical bonds; Mor32p is 3D-MoRSE signal 32, weighted by polarizability; nHDon is the number of hydrogen-bond donor N and O atoms; ATSC2i is the centred Broto-Moreau autocorrelation of lag 2, weighted by ionization potential; F01[C-N] is the frequency of C-N at topological distance 1; F10[O-O] is the frequency of O-O at topological distance 10; dipole_moment is the dipole moment; CCR_energy is the core-core repulsion energy; E_HOMO-1 is the energy of the second-highest occupied orbital; Rperim is the ring perimeter of the molecule; Mor05u is 3D-MoRSE signal 05, unweighted; Mor02m is 3D-MoRSE signal 02, weighted by mass; nRCO is the number of aliphatic ketone groups; H-046 is a hydrogen atom attached to an sp3-hybridized carbon atom with no halogen atom on the adjacent carbon atom; SdO is the sum of the =O E-states; NtN is the number of ≡N nitrogen atoms; H_Qmax is the highest charge on a hydrogen atom; H_Qmean is the mean charge on the hydrogen atoms in the molecule; nRCONH2 is the number of aliphatic primary amides; N-067 is Al2-NH; O-057 is the oxygen atom of a phenol, enol or carboxyl group; SsNH2 is the sum of the -NH2 E-states; CATS2D_01_AN is the CATS2D hydrogen-bond acceptor/negative-charge descriptor at lag 01; CATS2D_03_DD is the CATS2D hydrogen-bond donor/donor descriptor at lag 03; CATS2D_03_DA is the CATS2D hydrogen-bond donor/acceptor descriptor at lag 03; B04[O-O] is the presence/absence of O-O at topological distance 4; nArNHR is the number of aromatic amines; O_LPcount is the number of lone electron pairs on all oxygen atoms; N_Qcount is the number of nitrogen atoms; Mor12i is 3D-MoRSE signal 12, weighted by ionization potential; H-047 is a hydrogen atom attached to an sp2- or sp3-hybridized carbon atom; O-056 is the oxygen atom of a hydroxyl group; P-116 is the number of R3-P=X groups; NddsN is the number of -N(=)= nitrogen atoms; B01[C-N] is the presence/absence of C-N at topological distance 1; F02[C-N] is the frequency of C-N at topological distance 2; F_LPcount is the number of lone electron pairs on all fluorine atoms; Br_LPcount is the number of lone electron pairs on all bromine atoms; H_Qcount is the number of hydrogen atoms; NaasC is the number of aasC carbon atoms; SssCH2 is the sum of the -CH2- E-states; F01[O-Si] is the frequency of O-Si at topological distance 1; and molarVolume is the molar volume;
S3. Calculate the organic compound molecular structure parameter E according to formula (1), parameter S according to formula (2), parameter A according to formula (3), parameter B according to formula (4), parameter L according to formula (5), and parameter V according to formula (6):
E = 0.61313 + 0.01169 × polarizability + 0.88701 × (E_HOMO - E_LUMO) + 0.12676 × I_LPcount
    - 0.29072 × Atom_num + 0.26076 × nBT - 0.34881 × Mor32p + 0.12675 × nHDon
    - 0.57231 × ATSC2i + 0.04305 × F01[C-N] + 0.14475 × F10[O-O]
Formula (1),
S = 0.80280 + 0.05210 × dipole_moment + 0.00023 × CCR_energy + 1.96420 × E_(HOMO-1)
    + 0.03975 × Rperim - 0.55400 × ATSC2i - 0.05361 × Mor05u + 0.01734 × Mor02m
    + 0.24280 × nRCO + 0.10889 × nHDon - 0.02352 × H-046 + 0.01438 × SdO + 0.43704 × NtN
Formula (2),
A = -0.18760 + 0.41354 × H_Qmax + 0.83897 × H_Qmean + 0.20256 × nRCONH2 + 0.28056 × nHDon
    - 0.16539 × N-067 + 0.08320 × O-057 - 0.07177 × SsNH2 + 0.14845 × CATS2D_01_AN
    - 0.12936 × CATS2D_03_DD - 0.04406 × CATS2D_03_DA - 0.08829 × B04[O-O] - 0.21963 × nArNHR
Formula (3),
B = -0.01310 + 0.08131 × O_LPcount + 0.13056 × N_Qcount - 0.09927 × Mor12i + 0.18232 × nRCO
    + 0.01458 × H-047 + 0.14627 × O-056 + 0.95757 × P-116 - 0.53368 × NddsN
    + 0.14104 × B01[C-N] + 0.03503 × F02[C-N]
Formula (4),
L = 0.44713 + 0.03226 × polarizability - 0.16282 × F_LPcount + 0.07766 × Br_LPcount
    + 0.25237 × Atom_num - 0.35911 × H_Qcount + 0.48173 × nHDon - 0.08596 × NaasC
    + 0.06518 × SssCH2 - 0.43300 × F01[O-Si]
Formula (5),
V = -0.00910 + 1.027 × (molarVolume / 100)
Formula (6),
Wherein molecular parameter E is the excess molar refraction, molecular parameter L is the hexadecane-air partition coefficient, molecular parameter A is the hydrogen-bond acidity, molecular parameter B is the hydrogen-bond basicity, molecular parameter S is the dipolarity/polarizability, and molecular parameter V is the McGowan characteristic molecular volume.
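Formulas (1) to (6) are ordinary linear models, so once the descriptor values from step S2 are available, they can be evaluated with a short script. The Python sketch below is illustrative only and is not part of the claimed method. The coefficients are copied from formulas (1) to (6), while the names `MODELS` and `predict` and the dictionary keys used for the descriptors are assumptions introduced here. The source does not state descriptor units or any scaling, so the sketch simply applies the coefficients to whatever values are supplied.

```python
# Illustrative sketch: evaluate formulas (1)-(6) as plain linear models.
# MODELS, predict and the descriptor keys are hypothetical names chosen for
# this example. "EHomo-ELumo" and "EHomo-1" are labels for the orbital-energy
# descriptors, not arithmetic expressions.
MODELS = {
    "E": (0.61313, {
        "polarizability": 0.01169, "EHomo-ELumo": 0.88701, "I_LPcount": 0.12676,
        "Atom_num": -0.29072, "nBT": 0.26076, "Mor32p": -0.34881,
        "nHDon": 0.12675, "ATSC2i": -0.57231, "F01[C-N]": 0.04305,
        "F10[O-O]": 0.14475,
    }),
    "S": (0.80280, {
        "dipole_moment": 0.05210, "CCR_energy": 0.00023, "EHomo-1": 1.96420,
        "Rperim": 0.03975, "ATSC2i": -0.55400, "Mor05u": -0.05361,
        "Mor02m": 0.01734, "nRCO": 0.24280, "nHDon": 0.10889,
        "H-046": -0.02352, "SdO": 0.01438, "NtN": 0.43704,
    }),
    "A": (-0.18760, {
        "H_Qmax": 0.41354, "H_Qmean": 0.83897, "nRCONH2": 0.20256,
        "nHDon": 0.28056, "N-067": -0.16539, "O-057": 0.08320,
        "SsNH2": -0.07177, "CATS2D_01_AN": 0.14845, "CATS2D_03_DD": -0.12936,
        "CATS2D_03_DA": -0.04406, "B04[O-O]": -0.08829, "nArNHR": -0.21963,
    }),
    "B": (-0.01310, {
        "O_LPcount": 0.08131, "N_Qcount": 0.13056, "Mor12i": -0.09927,
        "nRCO": 0.18232, "H-047": 0.01458, "O-056": 0.14627, "P-116": 0.95757,
        "NddsN": -0.53368, "B01[C-N]": 0.14104, "F02[C-N]": 0.03503,
    }),
    "L": (0.44713, {
        "polarizability": 0.03226, "F_LPcount": -0.16282, "Br_LPcount": 0.07766,
        "Atom_num": 0.25237, "H_Qcount": -0.35911, "nHDon": 0.48173,
        "NaasC": -0.08596, "SssCH2": 0.06518, "F01[O-Si]": -0.43300,
    }),
}


def predict(parameter, descriptors):
    """Return E, S, A, B, L or V for one compound.

    `descriptors` maps descriptor names to numeric values; a missing
    descriptor raises KeyError rather than being silently treated as zero.
    """
    if parameter == "V":
        # Formula (6): V depends only on the molar volume.
        return -0.00910 + 1.027 * (descriptors["molarVolume"] / 100)
    intercept, coefficients = MODELS[parameter]
    return intercept + sum(c * descriptors[name] for name, c in coefficients.items())


if __name__ == "__main__":
    # Made-up input value, used only to show the call.
    print(round(predict("V", {"molarVolume": 100.0}), 4))
```

For a hypothetical molarVolume of 100, formula (6) gives V = -0.00910 + 1.027 × 1.00 = 1.0179. Keeping each model as an intercept plus a coefficient table mirrors the layout of the formulas, which makes it straightforward to check the script against the claim term by term.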
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811378715.5A CN109493922B (en) | 2018-11-19 | 2018-11-19 | Method for predicting molecular structure parameters of chemicals |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109493922A (en) | 2019-03-19 |
CN109493922B (en) | 2021-06-29 |
Family
ID=65696276
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201811378715.5A | Active | CN109493922B (en) | 2018-11-19 | 2018-11-19 | Method for predicting molecular structure parameters of chemicals |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109493922B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103788276A (en) * | 2005-12-09 | 2014-05-14 | 陶氏环球技术有限责任公司 | Processes of controlling molecular weight distribution in ethylene/alpha-olefin compositions |
US20150293057A1 (en) * | 2012-10-29 | 2015-10-15 | University Of Utah Research Foundation | Functionalized nanotube sensors and related methods |
CN106588802A (en) * | 2016-10-31 | 2017-04-26 | 南京工程学院 | Bis(tetrazole-2-oxy-4-hydro)amine, design method, and application thereof |
CN107563133A (en) * | 2017-08-30 | 2018-01-09 | 大连理工大学 | Using the method for the chlorine radical reaction rate constant of quantitative structure activity relationship model prediction organic chemicals |
CN108140920A (en) * | 2015-10-27 | 2018-06-08 | 住友化学株式会社 | Magnesium air electrode for cell and magnesium air battery and aromatic compound and metal complex |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111986740A (en) * | 2020-09-03 | 2020-11-24 | 平安国际智慧城市科技股份有限公司 | Compound classification method and related equipment |
CN111986740B (en) * | 2020-09-03 | 2024-05-14 | 深圳赛安特技术服务有限公司 | Method for classifying compounds and related equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109493922B (en) | 2021-06-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Cláudio et al. | Extended scale for the hydrogen-bond basicity of ionic liquids | |
Sun et al. | Influence of the delocalization error and applicability of optimal functional tuning in density functional calculations of nonlinear optical properties of organic donor–acceptor chromophores | |
Feixas et al. | Aromaticity of distorted benzene rings: exploring the validity of different indicators of aromaticity | |
Coote | Reliable theoretical procedures for the calculation of electronic-structure information in hydrogen abstraction reactions | |
Maschio et al. | Intermolecular interaction energies in molecular crystals: comparison and agreement of localized Møller–Plesset 2, dispersion-corrected density functional, and classical empirical two-body calculations | |
Raghavendra et al. | Unpaired and σ bond electrons as H, Cl, and Li bond acceptors: an anomalous one-electron blue-shifting chlorine bond | |
Suresh et al. | A novel electrostatic approach to substituent constants: doubly substituted benzenes | |
Cyrański et al. | Global and local aromaticity of linear and angular polyacenes | |
Beno et al. | The C7H10 Potential Energy Landscape: Concerted Transition States and Diradical Intermediates for the Retro-Diels-Alder Reaction and [1,3] Sigmatropic Shifts of Norbornene | |
Gavezzotti | Quantitative ranking of crystal packing modes by systematic calculations on potential energies and vibrational amplitudes of molecular dimers | |
Koleva et al. | Electrophile affinity: a reactivity measure for aromatic substitution | |
Kolboe | Proton affinity calculations with high level methods | |
Yu et al. | Baird’s rule in substituted fulvene derivatives: an information-theoretic study on triplet-state aromaticity and antiaromaticity | |
Fleming et al. | A bacteria-based genetic assay detects prion formation | |
Kleinpeter et al. | Antiaromaticity proved by the anisotropic effect in 1H NMR spectra | |
Güell et al. | Aromaticity analysis of lithium cation/π complexes of aromatic systems | |
Gharagheizi et al. | Development of a quantitative structure–liquid thermal conductivity relationship for pure chemical compounds | |
Hirao et al. | Theoretical study of reactivities in electrophilic aromatic substitution reactions: reactive hybrid orbital analysis | |
CN109493922A (en) | A method of prediction chemicals molecular structural parameter | |
Gharagheizi et al. | Group contribution model for the prediction of refractive indices of organic compounds | |
Chandrakumar et al. | A systematic study on the reactivity of Lewis acid-base complexes through the local hard-soft acid-base principle | |
Hemelsoet et al. | Reactivity indices for radical reactions involving polyaromatics | |
Sivaramakrishnan et al. | Ring conserved isodesmic reactions: a new method for estimating the heats of formation of aromatics and PAHs | |
Feixas et al. | Analysis of Hückel’s [4n+2] Rule through Electronic Delocalization Measures | |
Carissan et al. | Hückel-Lewis projection method: A “weights watcher” for mesomeric structures |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||