CN117809757A - Molecular force field fitting method and device - Google Patents


Publication number
CN117809757A
Authority
CN
China
Prior art keywords
target molecule
fragment
fragments
target
force field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311842262.8A
Other languages
Chinese (zh)
Inventor
徐义富
郭震宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Tengmai Pharmaceutical Technology Co ltd
Original Assignee
Suzhou Tengmai Pharmaceutical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Tengmai Pharmaceutical Technology Co ltd
Priority to CN202311842262.8A
Publication of CN117809757A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C10/00 Computational theoretical chemistry, i.e. ICT specially adapted for theoretical aspects of quantum chemistry, molecular mechanics, molecular dynamics or the like

Landscapes

  • Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The disclosure relates to a molecular force field fitting method and device, wherein the method comprises the following steps: fragmenting each target molecule according to the 3D structure of each target molecule to determine fragments of each target molecule, wherein the target molecules are ligand molecules for binding with receptor proteins; generating a fragment set and determining the corresponding relation between each fragment in the fragment set and each target molecule; geometrically optimizing each target molecule and each fragment to determine the most stable conformation of each target molecule; respectively carrying out electrostatic potential or single-point energy calculation on each fragment of each target molecule in the most stable conformation of each target molecule to obtain a corresponding calculation result of each fragment; calculating charge information of each target molecule; according to the calculation result corresponding to each fragment and the charge information of the target molecules, the total energy of each fragment under different dihedral angles is calculated, and finally the force field parameter of each target molecule is fitted, so that the molecular force field fitting has high precision, good accuracy and high fitting speed.

Description

Molecular force field fitting method and device
Technical Field
The disclosure relates to the technical field of drug research and development, in particular to a molecular force field fitting method and device.
Background
Molecular force fields (FFs) are the basis of molecular modeling. Relying on the accuracy of the force field and the speed of GPUs, molecular modeling can be used in applications ranging from fast virtual screening to detailed free energy calculation. In the related art, force fields such as GAFF2, OPLS4, and CHARMM are transferable force fields (general force fields). Such a force field first defines specific atom types for elements that share the same chemical environment in a molecule; the bonded parameters (bond length, bond angle, dihedral angle, etc.) and the non-bonded parameters of the force field model are defined in terms of these atom types. A parameter set is then created and trained on small molecules so as to cover a wide chemical space. For any target compound, force field parameters can be assigned directly, provided that the atom types of the compound can be matched in the pre-trained parameter set. However, the number of atom types defined in such a force field model is small; for example, GAFF2 has only 97 atom types. The chemical space covered is therefore still limited, and falls far short of what is needed for property predictions with higher accuracy requirements, such as calculating the binding free energy between small molecules and protein targets.
Disclosure of Invention
In view of this, the present disclosure proposes a method and apparatus for fitting a molecular force field.
According to an aspect of the present disclosure, there is provided a molecular force field fitting method, the method comprising:
fragmenting each target molecule according to the 3D structure of each target molecule to determine fragments of each target molecule, wherein each target molecule is a ligand molecule for binding with a receptor protein;
generating a fragment set according to fragments of each target molecule, and determining the corresponding relation between each fragment in the fragment set and each target molecule;
geometrically optimizing each target molecule to determine the most stable conformation of each target molecule;
respectively carrying out electrostatic potential or single-point energy calculation on each fragment of each target molecule according to the most stable conformation of each target molecule to obtain a corresponding calculation result of each fragment in each target molecule;
fitting charge information of each target molecule according to each calculation result;
according to the calculation result corresponding to the fragments of each target molecule and the charge information of the target molecule, calculating the total energy of each fragment of the target molecule under different dihedral angles, and fitting out force field parameters of all dihedral angles of each fragment based on the total energy;
and integrating the force field parameters corresponding to the fragments in each target molecule, and outputting formatted force field parameter files for all target molecules.
In one possible implementation manner, the fragmenting processing is performed on each target molecule according to the 3D structure of each target molecule, so as to determine fragments of each target molecule, including:
scoring each atom in each target molecule according to the 3D structure of each target molecule to obtain an atom score for each atom, wherein atoms belonging to different atom types receive different atom scores, and atoms of the same atom type may be located in one or more chemical structure environments;
adding a label to each atomic bond in each target molecule according to the 3D structure of each target molecule, wherein different kinds of atomic bonds are given different labels;
cutting the atomic bonds marked with the first label in the target molecule to obtain the primary fragments of the target molecule;
wherein, for a target molecule comprising two primary fragments, the two primary fragments are directly linked; for a target molecule comprising more than two primary fragments, connecting two adjacent primary fragments to form a secondary fragment; fragments of the target molecule include primary fragments or secondary fragments.
In one possible implementation manner, generating a fragment set according to fragments of each target molecule, and determining a correspondence between each fragment in the fragment set and each target molecule, including:
counting fragments of each target molecule to obtain a fragment set to be de-duplicated;
removing repeated fragments in the fragment set to be de-duplicated to obtain a fragment set;
and determining the corresponding relation between each fragment in the fragment set and each target molecule according to the fragments of each target molecule.
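The counting, de-duplication, and correspondence steps above can be sketched as follows. This is a minimal illustration only: it assumes each fragment is identified by a canonical string key (for example a canonical SMILES string), which the disclosure does not specify, and the function name `build_fragment_set` and the example molecules are hypothetical.

```python
from collections import defaultdict


def build_fragment_set(fragments_by_molecule):
    """Collect fragments from all molecules, drop repeats, and record
    which molecules each unique fragment belongs to.

    Assumption: two fragments are duplicates when their string keys match.
    """
    fragment_to_molecules = defaultdict(list)
    for molecule_id, fragments in fragments_by_molecule.items():
        for fragment_key in fragments:
            owners = fragment_to_molecules[fragment_key]
            if molecule_id not in owners:
                owners.append(molecule_id)
    # Dict keys keep first-seen order, so this list is already de-duplicated.
    fragment_set = list(fragment_to_molecules)
    return fragment_set, dict(fragment_to_molecules)


# Hypothetical input: two target molecules sharing one fragment.
fragments_by_molecule = {
    "mol_A": ["c1ccccc1C", "CC(=O)N"],
    "mol_B": ["CC(=O)N", "CCO"],
}
fragment_set, correspondence = build_fragment_set(fragments_by_molecule)
```

Because the shared fragment appears only once in `fragment_set`, any downstream quantum chemical calculation on it would need to be run once and could then be reused for every molecule listed in `correspondence`.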
In one possible implementation, performing geometric optimization on each of the target molecules to determine a most stable conformation of each of the target molecules includes:
optimizing the geometric structure of each target molecule by adopting a density functional method according to the initial molecular coordinates of each target molecule to obtain optimized molecular coordinates;
and determining the most stable conformation of each target molecule according to the optimized molecular coordinates corresponding to each target molecule.
In one possible implementation manner, the electrostatic potential or the single-point energy calculation is performed on each fragment of each target molecule according to the most stable conformation of each target molecule, so as to obtain a corresponding calculation result of each fragment in each target molecule, which includes:
for a target molecule whose charge type is the first type, calculating the single-point energy of each fragment in the target molecule according to the optimized molecular coordinates of the target molecule and a preset calculation method, and taking the single-point energy as the calculation result; and/or
for a target molecule whose charge type is the second type, determining new fragments for the target molecule according to the twistable (rotatable) dihedral angles on the fragments of the target molecule and a preset interval in degrees, performing geometric optimization on the new fragments, and calculating the electrostatic potentials of the new fragments at the different dihedral angles as the calculation result.
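The enumeration of new fragments over the twistable dihedral angles can be sketched as follows. This is a minimal illustration, assuming each twistable dihedral is identified by four atom indices and that the preset interval divides 360 degrees evenly; the 30 degree interval, the function name `enumerate_scan_jobs`, and the fragment identifiers are hypothetical and are not taken from the disclosure.

```python
def enumerate_scan_jobs(fragment_dihedrals, interval_degrees, start_degrees=-180.0):
    """List every (fragment, dihedral, angle) combination to be computed.

    Assumption: the scan covers one full turn starting at start_degrees.
    """
    if interval_degrees <= 0 or 360 % interval_degrees != 0:
        raise ValueError("interval must be positive and divide 360 evenly")
    n_steps = int(360 // interval_degrees)
    angles = [start_degrees + step * interval_degrees for step in range(n_steps)]

    jobs = []
    for fragment_id, dihedrals in fragment_dihedrals.items():
        for dihedral in dihedrals:
            for angle in angles:
                # Each job corresponds to one "new fragment": the same fragment
                # with this dihedral set to this angle.
                jobs.append((fragment_id, dihedral, angle))
    return jobs


# Hypothetical fragments with their twistable dihedrals (atom index tuples).
fragment_dihedrals = {
    "frag_1": [(0, 1, 2, 3)],
    "frag_2": [(4, 5, 6, 7), (5, 6, 7, 8)],
}
jobs = enumerate_scan_jobs(fragment_dihedrals, interval_degrees=30)
```

Each job would then be handed to a quantum chemistry program for constrained geometric optimization and electrostatic potential calculation; that step is outside the scope of this sketch.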
In one possible implementation, fitting the charge information of each target molecule according to each calculation result includes:
adding virtual atoms into each target molecule according to the types of atoms in each target molecule to form a new target molecule;
constructing a target matrix according to the optimized molecular coordinates of the new target molecules corresponding to the target molecules and the electrostatic potential of the corresponding new fragments aiming at the target molecules with the charge type of the second type;
calculating the atomic charge of the new target molecule according to the target matrix;
and removing the virtual atoms from the new target molecule, assigning the atomic charge of each virtual atom to its corresponding original atom in the target molecule, and taking the resulting atomic charge of each atom in the target molecule as the charge information of the target molecule.
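One common way to realize a charge fit of this kind is a constrained least-squares fit of point charges to the electrostatic potential, sketched below. The disclosure does not give the form of the target matrix, so everything here is an assumption: the matrix is taken to be the normal-equation matrix extended by a Lagrange multiplier that fixes the total charge, the charge of a virtual atom is added onto its parent atom, and all names, coordinates, and charges are synthetic.

```python
import numpy as np


def fit_esp_charges(site_coords, grid_coords, esp_values, total_charge=0.0):
    """Fit point charges so that sum_i q_i / r_ij reproduces the ESP at grid point j."""
    site_coords = np.asarray(site_coords, dtype=float)
    grid_coords = np.asarray(grid_coords, dtype=float)
    distances = np.linalg.norm(
        grid_coords[:, None, :] - site_coords[None, :, :], axis=2
    )
    design = 1.0 / distances  # design[j, i] = 1 / r_ij (atomic units assumed)
    n_sites = site_coords.shape[0]
    # Assumed "target matrix": normal equations plus a total-charge constraint row.
    matrix = np.zeros((n_sites + 1, n_sites + 1))
    matrix[:n_sites, :n_sites] = design.T @ design
    matrix[:n_sites, n_sites] = 1.0
    matrix[n_sites, :n_sites] = 1.0
    rhs = np.append(design.T @ np.asarray(esp_values, dtype=float), total_charge)
    return np.linalg.solve(matrix, rhs)[:n_sites]


def fold_virtual_sites(charges, parent_of_virtual, n_real):
    """Remove virtual atoms by adding each virtual charge onto its parent atom."""
    folded = np.array(charges[:n_real], dtype=float)
    for virtual_index, parent_index in parent_of_virtual.items():
        folded[parent_index] += charges[virtual_index]
    return folded


# Synthetic example: two real atoms plus one virtual atom attached to atom 1.
rng = np.random.default_rng(0)
sites = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [2.2, 0.0, 0.0]])
true_charges = np.array([0.4, -0.6, 0.2])
directions = rng.normal(size=(60, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
grid = np.array([1.0, 0.0, 0.0]) + 4.0 * directions  # points on a sphere
esp = (1.0 / np.linalg.norm(grid[:, None, :] - sites[None, :, :], axis=2)) @ true_charges

fitted = fit_esp_charges(sites, grid, esp, total_charge=0.0)
folded = fold_virtual_sites(fitted, parent_of_virtual={2: 1}, n_real=2)
```

In practice the electrostatic potential would come from the quantum chemical calculation on the new fragments rather than from known charges, and restraints on the charges are often added; neither is shown here.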
In one possible implementation manner, according to the calculation result corresponding to the fragments of each target molecule and the charge information of the target molecule, calculating total energy of each fragment of the target molecule under different dihedral angles, and fitting force field parameters of all dihedral angles of each fragment based on the total energy, including:
calculating the coordinates of each atom in the target molecule and the total energy of the corresponding new fragment under different dihedral angles according to the preset interval degrees based on the atomic charge of each atom in the target molecule;
fitting a potential energy curve based on quantum mechanics of each dihedral angle corresponding to the target molecule according to the total energy of each new fragment in the target molecule under different dihedral angles;
and fitting the dihedral angle force field parameters of each new fragment by a least square method according to the initial force field parameters and the atomic charge and energy of each atom in the new fragment under each dihedral angle conformation.
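The least-squares fit of the dihedral parameters can be sketched as follows. The disclosure does not state the functional form of the dihedral term, so this sketch assumes a cosine series with zero phase, `k_n * (1 + cos(n * phi))`, fitted to the difference between the quantum mechanical energy and a molecular mechanics energy computed without the dihedral term being fitted; the function name, the periodicities, and the synthetic energies are all hypothetical.

```python
import numpy as np


def fit_dihedral_terms(angles_deg, qm_energies, mm_energies_no_torsion, max_periodicity=3):
    """Fit an energy offset and force constants k_1..k_N by linear least squares."""
    phi = np.radians(np.asarray(angles_deg, dtype=float))
    # What the dihedral term has to reproduce at each scanned angle.
    target = np.asarray(qm_energies, dtype=float) - np.asarray(
        mm_energies_no_torsion, dtype=float
    )
    columns = [np.ones_like(phi)]  # constant offset between QM and MM energy zeros
    for n in range(1, max_periodicity + 1):
        columns.append(1.0 + np.cos(n * phi))  # assumed zero-phase cosine term
    design = np.column_stack(columns)
    coefficients, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coefficients[0], coefficients[1:]


# Synthetic scan at 30 degree intervals with known parameters.
angles = np.arange(-180, 180, 30)
true_offset = 0.7
true_k = np.array([1.2, 0.5, 0.3])
mm_energies = 0.2 * np.sin(np.radians(angles)) ** 2  # placeholder MM energies
torsion = sum(
    k * (1.0 + np.cos(n * np.radians(angles))) for n, k in enumerate(true_k, start=1)
)
qm_energies = mm_energies + true_offset + torsion

offset, force_constants = fit_dihedral_terms(angles, qm_energies, mm_energies)
```

Fixing the phases keeps the problem linear, so a single call to `numpy.linalg.lstsq` is enough; fitting the phases as well would require a nonlinear optimizer.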
According to another aspect of the present disclosure there is provided a molecular force field fitting device, the device comprising:
the fragment generation module is used for carrying out fragmentation treatment on each target molecule according to the 3D structure of each target molecule to determine fragments of each target molecule, wherein each target molecule is a ligand molecule for binding with receptor protein;
The relation determining module is used for generating a fragment set according to fragments of each target molecule and determining the corresponding relation between each fragment in the fragment set and each target molecule;
the geometric optimization module is used for carrying out geometric optimization on each target molecule and determining the most stable conformation of each target molecule;
the first calculation module is used for respectively carrying out electrostatic potential or single-point energy calculation on each fragment of each target molecule according to the most stable conformation of each target molecule to obtain a corresponding calculation result of each fragment in each target molecule;
a charge information calculation module for calculating charge information of each target molecule according to each calculation result;
the force field parameter fitting module is used for calculating the total energy of each fragment of the target molecule under different dihedral angles according to the calculation result corresponding to each fragment of the target molecule and the charge information of the target molecule, and fitting out force field parameters of all dihedral angles of each fragment based on the total energy;
and the result determining module is used for integrating the force field parameters corresponding to the fragments in each target molecule and outputting formatted force field parameter files of all the target molecules.
In one possible implementation, the fragment generation module includes:
the molecule scoring submodule is used for scoring each atom in each target molecule according to the 3D structure of each target molecule to obtain an atom score for each atom, wherein atoms belonging to different atom types receive different atom scores, and atoms of the same atom type may be located in one or more chemical structure environments;
the label setting submodule is used for adding a label to each atomic bond in each target molecule according to the 3D structure of each target molecule, wherein different kinds of atomic bonds are given different labels;
the cutting submodule is used for cutting the atomic bonds marked with the first label in the target molecule to obtain the primary fragments of the target molecule;
wherein, for a target molecule comprising two primary fragments, the two primary fragments are directly linked; for a target molecule comprising more than two primary fragments, connecting two adjacent primary fragments to form a secondary fragment; fragments of the target molecule include primary fragments or secondary fragments.
In one possible implementation, the relationship determination module includes:
the fragment counting sub-module is used for counting fragments of each target molecule to obtain a fragment set to be de-duplicated;
The duplicate removal sub-module is used for removing duplicate fragments in the fragment set to be duplicate removed to obtain a fragment set;
and the relation determining submodule is used for determining the corresponding relation between each fragment in the fragment set and each target molecule according to the fragments of each target molecule.
In one possible implementation, the geometric optimization module includes:
the coordinate optimization submodule is used for optimizing the geometric structure of each target molecule by adopting a density functional method according to the initial molecular coordinate of each target molecule to obtain an optimized molecular coordinate;
the conformation determining submodule is used for determining the most stable conformation of each target molecule according to the optimized molecular coordinates corresponding to each target molecule.
In one possible implementation, the first computing module includes:
the first calculation submodule is used for, for a target molecule whose charge type is the first type, calculating the single-point energy of each fragment in the target molecule according to the optimized molecular coordinates of the target molecule and a preset calculation method, and taking the single-point energy as the calculation result; and/or
the second calculation submodule is used for, for a target molecule whose charge type is the second type, determining new fragments for the target molecule according to the twistable (rotatable) dihedral angles on the fragments of the target molecule and a preset interval in degrees, performing geometric optimization on the new fragments, and calculating the electrostatic potentials of the new fragments at the different dihedral angles as the calculation result.
In one possible implementation, the charge information calculation module includes:
a virtual atom adding submodule for adding virtual atoms in each target molecule according to the types of atoms in each target molecule to form a new target molecule;
the matrix construction submodule is used for constructing a target matrix according to the optimized molecular coordinates of the new target molecules corresponding to the target molecules and the electrostatic potential of the corresponding new fragments aiming at the target molecules with the charge type of the second type;
an atomic charge calculation sub-module for calculating the atomic charge of the new target molecule according to the target matrix;
and the atomic charge determination submodule is used for removing the virtual atoms from the new target molecule, assigning the atomic charge of each virtual atom to its corresponding original atom in the target molecule, and taking the resulting atomic charge of each atom in the target molecule as the charge information of the target molecule.
In one possible implementation, the force field parameter fitting module comprises:
the energy fitting sub-module is used for calculating the coordinates of each atom in the target molecule and the total energy of the corresponding new fragment under different dihedral angles according to the preset interval degrees based on the atomic charge of each atom in the target molecule;
The curve fitting submodule is used for fitting a potential energy curve based on quantum mechanics of each dihedral angle corresponding to the target molecule according to the total energy of each new fragment in the target molecule under different dihedral angles;
and the force field fitting sub-module is used for fitting the dihedral angle force field parameters of each new fragment through a least square method according to the initial force field parameters and the atomic charge and energy of each atom in the new fragment under each dihedral angle conformation.
According to another aspect of the present disclosure there is provided a molecular force field fitting device comprising: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to implement the above-described method when executing the instructions stored by the memory.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer program instructions, wherein the computer program instructions, when executed by a processor, implement the above-described method.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, which when run in a processor of an electronic device, performs the above method.
According to the molecular force field fitting method and device provided by the embodiments of the present disclosure, each target molecule is fragmented according to its 3D structure to determine the fragments of each target molecule, the target molecules being ligand molecules for binding with receptor proteins; a fragment set is generated according to the fragments of each target molecule, and the correspondence between each fragment in the fragment set and each target molecule is determined; each target molecule is geometrically optimized to determine its most stable conformation; electrostatic potential or single-point energy calculation is performed on each fragment of each target molecule according to the most stable conformation of each target molecule to obtain a calculation result for each fragment in each target molecule; the charge information of each target molecule is fitted according to the calculation results; the total energy of each fragment of the target molecule at different dihedral angles is calculated according to the calculation results corresponding to the fragments of each target molecule and the charge information of the target molecule, and the force field parameters of all dihedral angles of each fragment are fitted based on the total energy; and the force field parameters corresponding to the fragments in each target molecule are integrated, and formatted force field parameter files of all target molecules are output. The resulting molecular force field fit has high precision, good accuracy, and high fitting speed, and can be used for calculating the binding free energy between the target molecule and the receptor protein.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features and aspects of the present disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 shows a flow chart of a molecular force field fitting method according to an embodiment of the present disclosure.
Fig. 2A-2E illustrate schematic diagrams of examples of atom types according to an embodiment of the disclosure.
Fig. 3 shows a schematic diagram of an example of an atomic bond score calculation mode according to an embodiment of the present disclosure.
Fig. 4 shows a schematic diagram of an example of tag arrangement of atomic bonds in a target molecule according to an embodiment of the present disclosure.
FIG. 5 shows a schematic diagram of an example of in-cut traversal in a target molecule according to an embodiment of the disclosure.
FIG. 6 illustrates a graph of molecular mechanical energy values at corresponding dihedral angles for an exemplary fragment according to an embodiment of the present disclosure.
Fig. 7 shows a block diagram of a molecular force field fitting device according to an embodiment of the present disclosure.
Fig. 8 is a block diagram illustrating an apparatus 1900 for molecular force field fitting, according to an example embodiment.
Detailed Description
Various exemplary embodiments, features and aspects of the disclosure will be described in detail below with reference to the drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
In addition, numerous specific details are set forth in the following detailed description in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art have not been described in detail in order not to obscure the present disclosure.
In order to solve the above technical problems, embodiments of the present disclosure provide a method and an apparatus for fitting a molecular force field, where fragmentation processing is performed on each target molecule according to a 3D structure of each target molecule, so as to determine fragments of each target molecule, where the target molecule is a ligand molecule for binding to a receptor protein; generating a fragment set according to fragments of each target molecule, and determining the corresponding relation between each fragment in the fragment set and each target molecule; geometrically optimizing each target molecule to determine the most stable conformation of each target molecule; respectively carrying out electrostatic potential or single-point energy calculation on each fragment of each target molecule in the most stable conformation of each target molecule to obtain a corresponding calculation result of each fragment in each target molecule; calculating charge information of each target molecule; according to the calculation result corresponding to the fragments of each target molecule and the charge information of the target molecule, calculating the force field parameters of each fragment of the target molecule under different dihedral angles, wherein the force field parameters comprise molecular mechanical energy values; and determining the optimized molecular coordinates, charge information and force field parameters corresponding to the most stable conformation of each target molecule as a molecular force field fitting result. The obtained molecular force field fitting result has high precision, good accuracy and high fitting speed, and can be used for calculating the binding free energy of the target molecule and the receptor protein.
As shown in fig. 1, the method includes steps S101 to S107.
In step S101, fragmenting each target molecule according to the 3D structure of each target molecule, so as to determine fragments of each target molecule, where the target molecule is a ligand molecule for binding to a receptor protein.
In this example, the receptor protein may be a protein molecule against which the drug is developed, and the ligand molecule may be a substance capable of specifically binding to the receptor protein. In drug development, there may be a plurality of ligand molecules bound to the receptor protein for which molecular force field fitting is required, that is, there may be a plurality of target molecules, and the number of target molecules is not limited in the present disclosure.
In this embodiment, the 3D structure of the target molecule may include three-dimensional coordinates (i.e., the initial molecular coordinates described herein) representing the 3D structure of the target molecule.
In some embodiments, step S101 may include: scoring each atom in each target molecule according to the 3D structure of each target molecule to obtain an atom score for each atom; adding a label (break bond label) to each atomic bond (also referred to herein as a chemical bond) in each target molecule according to the 3D structure of the target molecule, wherein the label of each chemical bond is either a first label "Y" or a second label "N", indicating respectively that the molecule will or will not be cleaved at the position of that chemical bond to obtain fragments (also referred to as fragment molecules, small molecule fragments, etc.), and different kinds of atomic bonds are given different labels; and cutting the atomic bonds marked with the first label in the target molecule to obtain the primary fragments of the target molecule. For a target molecule comprising two primary fragments, the two primary fragments are directly linked; for a target molecule comprising more than two primary fragments, two adjacent primary fragments are linked to form a secondary fragment. The fragments of the target molecule include primary fragments or secondary fragments.
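The cutting and linking described in this step can be sketched on a plain bond list as follows. This is a minimal illustration under stated assumptions: atoms are numbered from 0, each bond already carries its "Y" or "N" label, primary fragments are taken to be the connected components left after removing the "Y" bonds, and two primary fragments count as adjacent when a cut bond joined them. The function name and the six-atom chain are hypothetical.

```python
def fragment_molecule(n_atoms, bonds):
    """Return (primary_fragments, secondary_fragments) as sorted atom index lists.

    bonds: list of (atom_i, atom_j, label), label "Y" (cut) or "N" (keep).
    """
    neighbors = {atom: set() for atom in range(n_atoms)}
    for atom_i, atom_j, label in bonds:
        if label == "N":  # only uncut bonds hold a primary fragment together
            neighbors[atom_i].add(atom_j)
            neighbors[atom_j].add(atom_i)

    # Primary fragments: connected components of the uncut bond graph.
    component_of = {}
    primary = []
    for start in range(n_atoms):
        if start in component_of:
            continue
        members, stack = set(), [start]
        while stack:
            atom = stack.pop()
            if atom in members:
                continue
            members.add(atom)
            component_of[atom] = len(primary)
            stack.extend(neighbors[atom] - members)
        primary.append(sorted(members))

    # Secondary fragments: each pair of primary fragments joined by a cut bond.
    secondary, seen_pairs = [], set()
    for atom_i, atom_j, label in bonds:
        if label != "Y":
            continue
        pair = tuple(sorted((component_of[atom_i], component_of[atom_j])))
        if pair[0] != pair[1] and pair not in seen_pairs:
            seen_pairs.add(pair)
            secondary.append(sorted(primary[pair[0]] + primary[pair[1]]))
    return primary, secondary


# Hypothetical six-atom chain with two bonds labeled for cutting.
bonds = [(0, 1, "N"), (1, 2, "Y"), (2, 3, "N"), (3, 4, "Y"), (4, 5, "N")]
primary, secondary = fragment_molecule(6, bonds)
```

A cut bond inside a ring may leave both of its atoms in the same component; the sketch skips such bonds when forming secondary fragments instead of pairing a fragment with itself.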
In this embodiment, each atom of the target molecule may be scored according to preset atom types to obtain an atom score for each atom. Atoms belonging to different atom types receive different atom scores; atoms of the same atom type may be located in one or more chemical structure environments, and those chemical structure environments are similar to one another.
For example, the atom scores of several atom types fall into the following five groups: [1,4,7,8,10], [2,3,5,6,9,11,12,13], [30,31,32], [50,51,52,53,54,55], [60,61,62,63,64,65,66,67]. Each group of numbers, for example [1,4,7,8,10], contains the atom scores of atoms in similar chemical structure environments, and each number within a group may be the atom score of atoms in one or more similar chemical structure environments. With this multi-level definition, the number of atom types can exceed 500, the chemical environment of each atom is defined more finely, and the resulting force field parameters are more accurate.
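The two-level grouping of atom scores can be sketched as a simple lookup. The five groups are copied from the example above; the names `ATOM_SCORE_GROUPS`, `group_of`, and `same_group` are hypothetical, and how a score is actually assigned to an atom from its chemical environment is not shown.

```python
# Groups of atom scores taken from the example in the text.
ATOM_SCORE_GROUPS = [
    [1, 4, 7, 8, 10],
    [2, 3, 5, 6, 9, 11, 12, 13],
    [30, 31, 32],
    [50, 51, 52, 53, 54, 55],
    [60, 61, 62, 63, 64, 65, 66, 67],
]

# Reverse lookup: atom score -> index of the group it belongs to.
GROUP_OF_SCORE = {
    score: index
    for index, group in enumerate(ATOM_SCORE_GROUPS)
    for score in group
}


def group_of(score):
    """Return the index of the group that contains this atom score."""
    return GROUP_OF_SCORE[score]


def same_group(score_a, score_b):
    """True when two atom scores describe similar chemical structure environments."""
    return group_of(score_a) == group_of(score_b)
```

For instance, `same_group(1, 10)` is true because both scores sit in the first group, while `same_group(1, 2)` is false.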
Figs. 2A-2E give schematic examples of the chemical structure environments of atoms corresponding to different atom scores, where X1, X2, X3, and X4 each represent any one of H, F, Cl, and Br, and Y1, Y2, Y3, and Y4 each represent an atom other than C, X1, X2, X3, and X4. "?" means that there is no restriction on the atom at that position. The chemical structures drawn in red in Figs. 2A-2E are partial structures shown only schematically. The atom scores given under the structures shown in Figs. 2A-2E are all the atom scores of the central atoms of the corresponding structures. The chemical structure environments of the atoms corresponding to the atom scores [1,4,7,8,10] include the 9 kinds in Fig. 2A. The chemical structure environments of the atoms corresponding to the atom scores [2,3,5,6,9,11,12,13] include the 8 kinds in Fig. 2B. The chemical structure environments of the atoms corresponding to the atom scores [30,31,32] include the 4 kinds in Fig. 2C. The chemical structure environments of the atoms corresponding to the atom scores [50,51,52,53,54,55] include the 7 kinds in Fig. 2D. The chemical structure environments of the atoms corresponding to the atom scores [60,61,62,63,64,65,66,67] include at least the 6 kinds in Fig. 2E. It will be appreciated that the chemical structures of the atom types are only schematically shown in Figs. 2A-2E, and the atom types and their corresponding atom scores may be set according to actual needs, which is not limited by the present disclosure.
In this embodiment, an atomic bond is a bond between two atoms in the target molecule, and may be a covalent bond, an ionic bond, a metallic bond, or a hydrogen bond. A label may be added to an atomic bond according to the type of the atomic bond, its position in the target molecule, the types of the atoms forming it, and the like, and atomic bonds with different kinds of characteristic information may have different labels. The labels of the atomic bonds may include a first label (which may be denoted by Y), marking a bond that subsequently needs to be cut to form fragments, and a second label (which may be denoted by N), marking a bond that does not need to be cut.
The process of performing atomic bond tag setting for a target molecule may include the following first to seventh steps.
In the first step, the label of each atomic bond on a macrocycle is set as the first label. A macrocycle may refer to a cyclic structure containing at least 10 atoms in the target molecule.
Each atom on the macrocycle may be marked according to the objects it is connected to. For example, an atom attached to at least one ring atom and a terminal atom (TA atom, i.e., a terminal atom such as H, F, Cl, or Br) may be labeled as a CER atom. An atom that is attached to one or more ring atoms and simultaneously to at least two aliphatic atoms may be labeled as a CMR atom; an atom that is attached only to ring atoms, and to at least two of them, may also be labeled as a CMR atom; atoms on the macrocycle, attached to the macrocycle and not attached to the atom are labeled as CM atoms (i.e., atoms simultaneously attached to multiple aliphatic atoms, including carbon atoms, sulfur atoms, etc.). A small ring may refer to a cyclic structure containing fewer than 10 atoms in the target molecule.
In the second step, the type of each bond (i.e., atomic bond) connected to an atom that is not in a small ring is determined, and a corresponding type value is set according to the type. Type values may be set separately for different types of atomic bonds, and a type value identifies the chemical environment in which the bond is located. The type value of an atomic bond may be set according to at least one of the following: whether the bond is in a small ring; whether the bond is a single bond, a double bond, a triple bond, a conjugated bond, or a multiple bond; whether the bond is in a ring or a chain; the ring type of the ring to which the bond is attached (fused ring, exocyclic, bridged ring, aromatic ring, etc.); and the like. For example, the type values of different atomic bonds may be set as shown in table 1 below.
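The first step can be illustrated with a minimal sketch. The function name label_macrocycle_bonds and the representation of a ring as an ordered list of atom indices are assumptions made for illustration; only the 10-atom threshold and the Y label come from the text.

```python
def label_macrocycle_bonds(rings, threshold=10):
    """Give the first label 'Y' to every bond on a ring with at least `threshold` atoms.

    rings: list of rings, each an ordered list of atom indices (hypothetical layout).
    Returns a dict mapping frozenset({a, b}) -> 'Y'.
    """
    labels = {}
    for ring in rings:
        if len(ring) < threshold:
            continue  # small ring: its bonds are not labeled in this step
        for i, atom in enumerate(ring):
            nxt = ring[(i + 1) % len(ring)]  # wrap around to close the ring
            labels[frozenset((atom, nxt))] = "Y"
    return labels


rings = [list(range(6)), list(range(10, 22))]  # a 6-membered and a 12-membered ring
labels = label_macrocycle_bonds(rings)
print(len(labels))  # 12 bonds, all from the 12-membered ring
```

Bonds on the 6-membered ring stay unlabeled here and are left to the later steps.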
TABLE 1 Example type values of atomic bonds
In the third step, the label of any atomic bond whose connected atoms include an atom with an atomic score of 0 and/or a terminal atom is set as the second label N.
In the fourth step, the label of any atomic bond whose type value is not any one of S, eS, bS, br, and J is set as the second label N.
In the fifth step, the label of any atomic bond that has a type value of any one of eS, bS, and br and is not in a cyclic structure of the target molecule is set as the first label Y.
In the sixth step, the label of a single bond belonging to a conjugated bond system is set as the first label Y. For example, the label of the single bond between the second C and the third C of -C=C-C- may be set as the first label Y.
In the seventh step, for the remaining atomic bonds that have not yet been labeled, the atomic bond score (break_bond_score) is first calculated from the atomic scores of the two atoms forming the bond, and the label of the atomic bond is then determined based on the atomic bond score and the score type (score style) corresponding to the atomic bond score.
In this embodiment, the atomic bond score can be calculated by the following formula:
break_bond_score=min(p_score_A,p_score_B)×10+max(p_score_A,p_score_B)
where break_bond_score represents the atomic bond score, and p_score_A and p_score_B respectively represent the positions, within the score type, of the atomic scores of the two atoms making up the atomic bond. As shown in fig. 3, for the atomic bond A-B, the atomic bond score = min(2, 4) × 10 + max(2, 4) = 24.
The score types corresponding to different atomic bond scores, and the labels corresponding to atomic bonds with different atomic bond scores under different score types, can be preset. In some embodiments, these labels may be recorded in table 2 below, so that the label of an atomic bond in the target molecule can be determined quickly by table lookup.
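The formula can be checked with a few lines of code. This is a minimal sketch; the function and argument names mirror the symbols in the formula and are not taken from any published implementation.

```python
def break_bond_score(p_score_a: int, p_score_b: int) -> int:
    """Atomic bond score: the smaller position gives the tens digit, the larger the units."""
    return min(p_score_a, p_score_b) * 10 + max(p_score_a, p_score_b)


print(break_bond_score(2, 4))  # 24, the fig. 3 example
print(break_bond_score(4, 2))  # 24, the order of the two atoms does not matter
```

Because min and max are used, the score is symmetric in the two atoms, so a bond A-B and a bond B-A map to the same table entry.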
TABLE 2 Correspondence between atomic bond scores, score types, and labels of atomic bonds
In this embodiment, the labels of the atomic bonds may also be adjusted before the target molecule is cut to generate fragments. If the atomic score of one of the atoms constituting an atomic bond is in [2,3,5,6,9,11,12,13], the atomic score of the other atom is in [1,4,7,8,10], and the label of the atomic bond is the first label Y, the label may be changed to the second label N. If the atomic scores of both atoms constituting the atomic bond are not in [2,3,5,6,9,11,12,13], and the positions of both atoms are not, partially or wholly, among the positions ['RC', 'RHC', 'RFC', 'RFMC'], the label of the atomic bond is set as the second label N. An atom at the RC position may refer to an atom in a ring that is attached to two atoms of that ring and also to at least one aliphatic atom outside the ring. An atom at the RHC position may refer to an atom that is itself in a ring, is attached to at least one other ring, and is also attached to at least one aliphatic atom. An atom at the RFC position may refer to an atom that is itself on multiple rings, is attached to three atoms on a common ring, and is attached to at least one aliphatic atom. An atom at the RFMC position may refer to an atom that is in two rings at the same time, is attached to two ring atoms in the same ring, and is also attached to at least one aliphatic atom.
In this embodiment, when the target molecule is cut, a long-chain target molecule may be cut into short-chain fragments whose chains contain at most 3 atoms. As shown in fig. 4, assume that the long chain of the target molecule includes 8 atoms, numbered 1-8, and that the atomic bonds connecting these 8 atoms have already been labeled (e.g., N and Y in fig. 4). All atoms and atomic bonds in fig. 4 are then traversed. As shown in fig. 5, taking the traversal of the fragment containing atom "4" as an example, atoms "2", "3", and "5" cannot be split from atom "4". Therefore, starting from atoms "2", "3", and "5" respectively, the search continues through the atoms connected to each of them until an atomic bond labeled "Y" or a "TA" atom is reached, at which point the search stops and the atoms found are placed in a list. The atomic bonds marked with the first label in the target molecule are then cut according to the lists obtained by the traversal, to obtain the first-stage (primary) fragments of the target molecule.
Fragment cutting thus provides a customized molecular fragmentation method: on the one hand, the chemical environment of each atom in a fragment is kept as close as possible to its chemical environment in the original molecule; on the other hand, the amount of quantum chemical computation is kept under control.
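The first adjustment rule can be sketched as follows. The names GROUP_A, GROUP_B, and adjust_label are hypothetical. The second rule depends on the RC/RHC/RFC/RFMC position labels and is deliberately not reproduced here.

```python
# Atomic score groups quoted in the text; the constant names are assumptions.
GROUP_A = {1, 4, 7, 8, 10}
GROUP_B = {2, 3, 5, 6, 9, 11, 12, 13}


def adjust_label(score_1: int, score_2: int, label: str) -> str:
    """Change 'Y' to 'N' when one atom's score is in GROUP_B and the other's is in GROUP_A."""
    crosses_groups = (
        (score_1 in GROUP_B and score_2 in GROUP_A)
        or (score_1 in GROUP_A and score_2 in GROUP_B)
    )
    if label == "Y" and crosses_groups:
        return "N"
    return label  # every other case is left unchanged by this rule


print(adjust_label(2, 1, "Y"))  # N
print(adjust_label(2, 3, "Y"))  # Y
```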
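The traversal described above can be sketched as a breadth-first search that only follows bonds labeled N, so that each connected group of atoms becomes one first-stage fragment. This is a simplified illustration, not the patented procedure itself: the function name, the data layout, and the labels on the example chain are assumptions and are not taken from fig. 4. A terminal atom simply has no further neighbours, so the search stops there on its own.

```python
from collections import deque


def first_stage_fragments(atoms, bonds):
    """Group atoms into fragments by cutting every bond labeled 'Y'.

    atoms: iterable of atom ids; bonds: dict mapping (a, b) -> 'Y' or 'N'.
    """
    neighbours = {a: [] for a in atoms}
    for (a, b), label in bonds.items():
        if label == "N":  # only uncut bonds keep two atoms in the same fragment
            neighbours[a].append(b)
            neighbours[b].append(a)
    seen, fragments = set(), []
    for start in atoms:
        if start in seen:
            continue
        queue, fragment = deque([start]), []
        seen.add(start)
        while queue:
            atom = queue.popleft()
            fragment.append(atom)
            for nxt in neighbours[atom]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        fragments.append(sorted(fragment))
    return fragments


# Hypothetical labels on an 8-atom chain (not the labels of fig. 4).
atoms = list(range(1, 9))
bonds = {(1, 2): "N", (2, 3): "Y", (3, 4): "N", (4, 5): "N",
         (5, 6): "Y", (6, 7): "N", (7, 8): "N"}
print(first_stage_fragments(atoms, bonds))  # [[1, 2], [3, 4, 5], [6, 7, 8]]
```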
In step S102, a fragment set is generated according to fragments of each target molecule, and a correspondence between each fragment in the fragment set and each target molecule is determined.
In this embodiment, step S102 may include: collecting the fragments of each target molecule to obtain a fragment set to be de-duplicated; removing duplicate fragments from the fragment set to be de-duplicated to obtain the fragment set; and determining the correspondence between each fragment in the fragment set and each target molecule according to the fragments of each target molecule. For example, if there are 2 target molecules, the first having fragment 1 and fragment 2 and the second having fragment 1 and fragment 3, the fragment set to be de-duplicated includes "fragment 1, fragment 2, fragment 1, and fragment 3", and the fragment set obtained after de-duplication includes "fragment 1, fragment 2, and fragment 3". In this way, the amount of subsequent computation can be reduced, and the speed and efficiency of molecular force field fitting can be improved.
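The example above can be reproduced with a short sketch. It assumes that each fragment is represented by a hashable identifier such as a canonical string; how fragment identity is established is not specified here, and the function name is hypothetical.

```python
def build_fragment_set(fragments_by_molecule):
    """Return the de-duplicated fragment list and a fragment -> molecules mapping."""
    fragment_set = []
    fragment_to_molecules = {}
    for molecule, fragments in fragments_by_molecule.items():
        for fragment in fragments:
            if fragment not in fragment_to_molecules:
                fragment_set.append(fragment)  # first time this fragment is seen
                fragment_to_molecules[fragment] = []
            if molecule not in fragment_to_molecules[fragment]:
                fragment_to_molecules[fragment].append(molecule)
    return fragment_set, fragment_to_molecules


fragments_by_molecule = {
    "molecule 1": ["fragment 1", "fragment 2"],
    "molecule 2": ["fragment 1", "fragment 3"],
}
fragment_set, mapping = build_fragment_set(fragments_by_molecule)
print(fragment_set)           # ['fragment 1', 'fragment 2', 'fragment 3']
print(mapping["fragment 1"])  # ['molecule 1', 'molecule 2']
```

Keeping the mapping alongside the de-duplicated set means that a result computed once for a shared fragment can be reused by every target molecule that contains it.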
In step S103, geometric optimization is performed on each target molecule, and the most stable conformation of each target molecule is determined.
In this embodiment, step S103 may include: optimizing the geometric structure of each target molecule by adopting a density functional method according to the initial molecular coordinates of each target molecule to obtain optimized molecular coordinates; and determining the most stable conformation of each target molecule according to the optimized molecular coordinates corresponding to each target molecule. The initial molecular coordinates and the optimized molecular coordinates may be coordinate values including X-axis, Y-axis, and Z-axis of each atom in the target molecule.
In step S104, electrostatic potential or single-point energy calculation is performed on each fragment of each target molecule according to the most stable conformation of each target molecule, so as to obtain a corresponding calculation result of each fragment in each target molecule.
In this embodiment, step S104 may include the following two implementations:
Mode one: for a target molecule whose charge type is the first type, the single-point energy of each fragment in the target molecule is calculated, as the calculation result, according to the optimized molecular coordinates of the target molecule and a preset calculation method. The single-point energy of a fragment may refer to the energy of the fragment itself.
The first type may be the type of molecular charge obtained by calculating the molecular charge of the target molecule based on AM1-BCC, a charge calculation method used in Amber molecular dynamics simulations. It first optimizes the structure at the AM1 level and derives atomic charges from the AM1 calculation, and then applies bond charge corrections (BCC) so that the resulting charges approximate electrostatic-potential-fitted charges.
Mode two: for a target molecule whose charge type is the second type, new fragments are determined for the target molecule according to the twistable dihedral angles on the fragments of the target molecule and the preset interval degrees, geometric optimization is performed on the new fragments, and the electrostatic potentials of the new fragments under different dihedral angles are calculated as the calculation result.
The second type may be the type of molecular charge obtained by calculating the molecular charge of the target molecule based on RESP (restrained electrostatic potential). RESP is a method for calculating the charges of small organic molecules: the electrostatic potential is first obtained with quantum chemistry software (such as Gaussian or ORCA), and the charges are then obtained by restrained fitting to that potential. In some embodiments, the second type may also refer to the type of molecular charge obtained by calculating the molecular charge of the target molecule based on RESP2, an updated version of RESP.
The preset interval degrees may be set according to actual needs. For example, with an interval of 30°, new fragments with dihedral angles of -150°, -120°, -90°, -60°, -30°, 0°, 30°, 60°, 90°, 120°, 150°, and 180° are generated for each fragment, and geometric optimization is then performed on each new fragment in a manner similar to step S103. The smaller the preset interval, the more accurate the obtained calculation result.
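The list of scan angles follows directly from the preset interval. The sketch below assumes an integer interval that divides 360 evenly; the function name is hypothetical.

```python
def dihedral_grid(interval_deg: int = 30):
    """Dihedral angles in degrees covering (-180, 180] in steps of `interval_deg`."""
    if interval_deg <= 0 or 360 % interval_deg != 0:
        raise ValueError("interval must be a positive divisor of 360")
    return list(range(-180 + interval_deg, 181, interval_deg))


print(dihedral_grid(30))
# [-150, -120, -90, -60, -30, 0, 30, 60, 90, 120, 150, 180]
```

A 30° interval gives 12 constrained geometries per twistable dihedral angle, while a 10° interval gives 36, which is the trade-off between accuracy and quantum chemical cost noted above.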
In step S105, charge information of each target molecule is fitted according to each calculation result.
In this embodiment, step S105 may include: adding virtual atoms to each target molecule according to the types of the atoms in the target molecule (i.e., the atom types described herein) to form a new target molecule; for a target molecule whose charge type is the second type, constructing a target matrix according to the optimized molecular coordinates of the new target molecule corresponding to the target molecule and the electrostatic potentials of the corresponding new fragments; calculating the atomic charges of the new target molecule from the target matrix using linear algebra; and removing the virtual atoms from the new target molecule, assigning the atomic charge of each virtual atom to the corresponding original atom in the target molecule, and determining the atomic charge of each atom in the target molecule as the charge information of the target molecule.
Setting virtual atoms according to the atom types of the atoms at different positions in the target molecule allows the charge information to be calculated more accurately.
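The last part of step S105, removing the virtual atoms and returning their charges to real atoms, can be sketched as below. It is assumed here that "assigning" means adding the virtual atom's charge to the real atom it was created for, which keeps the total molecular charge unchanged. The atom names, charge values, and function name are purely illustrative.

```python
def fold_virtual_charges(charges, virtual_parent):
    """Remove virtual atoms and add each one's charge to its parent atom.

    charges: dict atom name -> fitted charge (real and virtual atoms).
    virtual_parent: dict virtual atom name -> real atom it was added for.
    """
    folded = {a: q for a, q in charges.items() if a not in virtual_parent}
    for virtual, parent in virtual_parent.items():
        folded[parent] += charges[virtual]
    return folded


# Illustrative numbers only, not fitted charges.
charges = {"C1": 0.10, "Cl1": -0.20, "V1": 0.05, "H1": 0.05}
virtual_parent = {"V1": "Cl1"}
folded = fold_virtual_charges(charges, virtual_parent)
print(sorted(folded))  # ['C1', 'Cl1', 'H1']
```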
In step S106, according to the calculation result corresponding to the fragments of each target molecule and the charge information of the target molecule, the total energy of each fragment of the target molecule under different dihedral angles is calculated, and the force field parameters of all dihedral angles of each fragment are fitted based on the total energy.
In this embodiment, step S106 may include: calculating the coordinates of each atom in the target molecule and the total energy of the corresponding new fragment under different dihedral angles according to the preset interval degrees based on the atomic charge of each atom in the target molecule; fitting a potential energy curve based on quantum mechanics of each dihedral angle corresponding to the target molecule according to the total energy of each new fragment in the target molecule under different dihedral angles; and fitting dihedral angle force field parameters of each new fragment by a least square method according to the initial force field parameters and the atomic charge and energy of each atom in the fragment under each dihedral angle conformation.
In some embodiments, the initial force field parameters may be GAFF2 force field parameters. GAFF (General Amber Force Field) is a force field describing small organic molecules and polymers. GAFF2 is an upgraded version of GAFF that improves the accuracy of the description of small organic molecules and polymers.
In some embodiments, the force field parameters of each new fragment in the target molecule at the next dihedral angle can be fitted by a least square method according to the initial force field parameters and the atomic charge and energy of each atom in the fragment at the current dihedral angle, with the force field parameters fitted at the current dihedral angle then being used as the initial force field parameters for the next fit. In this way, the finally obtained force field parameters can be as close as possible to the quantum mechanics (QM) result, improving the accuracy and precision of the overall fit. The method can be combined with a GPU-accelerated quantum chemistry computation tool, and the torsion terms of the molecules can be re-fitted on the fly, so that the torsion terms of each molecule are as close as possible to the QM result.
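A least-squares fit of a dihedral term can be sketched as follows. The functional form is an assumption: the energy profile is modeled as a constant plus a cosine series with zero phase, E(phi) ≈ c0 + k1·cos(phi) + k2·cos(2·phi) + k3·cos(3·phi), and the target energies are synthetic numbers generated from known coefficients rather than QM data. The helper names are hypothetical, and the normal equations are solved in plain Python to keep the example self-contained.

```python
import math


def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fit_torsion(angles_deg, target_energy, max_n=3):
    """Least-squares coefficients [c0, k1, ..., k_max_n] of the assumed cosine series."""
    rows = [[1.0] + [math.cos(n * math.radians(a)) for n in range(1, max_n + 1)]
            for a in angles_deg]
    m = max_n + 1
    # Normal equations (A^T A) x = A^T y of the least-squares problem.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    aty = [sum(r[i] * e for r, e in zip(rows, target_energy)) for i in range(m)]
    return solve_linear(ata, aty)


angles = list(range(-150, 181, 30))   # the 30-degree scan grid
true_coeffs = [1.0, 0.5, -0.3, 0.8]   # synthetic "reference" coefficients
energies = [true_coeffs[0] + sum(true_coeffs[n] * math.cos(n * math.radians(a))
                                 for n in (1, 2, 3)) for a in angles]
fitted = fit_torsion(angles, energies)
print([round(c, 6) for c in fitted])  # close to the synthetic coefficients
```

In practice the target energies would be the difference between the QM energies at each dihedral angle and the molecular mechanics energies computed without the torsion term being fitted; that choice, like the cosine form, is an assumption of this sketch.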
In step S107, the force field parameters corresponding to the fragments in each target molecule are integrated, and the formatted force field parameter files of all the target molecules are output.
In this embodiment, the force field parameters corresponding to the fragments in each target molecule are combined to form the molecular force field fitting result, and the formatted force field parameter files of all target molecules are finally output. For example, the following files may be included: vacuum.frmod, the force field parameter file; vacuum.mol2, which contains information such as atomic coordinates and charges; leaprc_header, which contains atom type and element information; and rule.txt, which contains the topology information of the virtual atoms. The molecular force field fitting result can also be used as a basis for calculating the binding free energy between fragments and receptor proteins. Because the fitting result obtained by the method provided by the present disclosure is more accurate, the accuracy and precision of the binding free energy can be improved, providing a better basis for ligand molecule selection in drug research and development.
After the molecular force field fitting result is calculated by the molecular force field fitting method provided by the present disclosure, a result diagram as shown in fig. 6 can be generated. This visualizes the fitting result, intuitively demonstrates the accuracy of the method provided by the present disclosure, and provides a better basis for selection in drug research and development.
Fig. 6 is a graph of molecular mechanical energy values at a certain dihedral angle for the same fragment, where the blue "QM" curve is the energy value corresponding to QM, the yellow "MM" curve is the total energy obtained by the method of the present disclosure (see step S106 for details), and the green "GAFF2" curve is the energy value calculated using GAFF2. Referring to fig. 6, it can be seen that, for the molecular mechanical energy value of the same fragment at a given dihedral angle, the result of the method provided by the present disclosure is closer to QM and far more accurate than that of GAFF2.
As shown in fig. 7, an embodiment of the present disclosure further provides a molecular force field fitting device, the device including:
a fragmentation module 71, configured to perform fragmentation processing on each target molecule according to a 3D structure of each target molecule, and determine fragments of each target molecule, where the target molecule is a ligand molecule for binding to a receptor protein;
a relationship determining module 72, configured to generate a fragment set according to fragments of each target molecule, and determine a correspondence between each fragment in the fragment set and each target molecule;
a geometric optimization module 73, configured to perform geometric optimization on each of the target molecules, and determine a most stable conformation of each of the target molecules;
a first calculation module 74, configured to calculate electrostatic potential or single-point energy of each fragment of each target molecule according to the most stable conformation of each target molecule, so as to obtain a corresponding calculation result of each fragment in each target molecule;
a charge information calculation module 75 for fitting charge information of each target molecule according to each calculation result;
a force field parameter fitting module 76, configured to calculate total energy of each fragment of the target molecule under different dihedral angles according to the calculation result corresponding to each fragment of the target molecule and the charge information of the target molecule, and fit force field parameters of all dihedral angles of each fragment based on the total energy;
a result determining module 77, configured to synthesize force field parameters corresponding to fragments in each target molecule, and output formatted force field parameter files of all target molecules.
In one possible implementation, the fragmentation module includes:
a molecular marking submodule, configured to mark each atom in each target molecule according to the 3D structure of each target molecule to obtain the atomic score of each atom, where atoms belonging to different atom types have different atomic scores, and atoms of the same atom type are located in at least one kind of chemical structure environment;
a label setting submodule, configured to add a label to each atomic bond in each target molecule according to the 3D structure of each target molecule, where atomic bonds of different types are given different labels;
a cutting submodule, configured to cut the atomic bonds marked with the first label in the target molecule to obtain the first-stage (primary) fragments of the target molecule;
wherein, for a target molecule comprising two primary fragments, the two primary fragments are directly linked; for a target molecule comprising more than two primary fragments, two adjacent primary fragments are connected to form a secondary fragment; and the fragments of the target molecule include the primary fragments or the secondary fragments.
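The construction of secondary fragments can be sketched as follows. It is assumed here that two primary fragments are "adjacent" when a cut bond (first label Y) joins an atom of one to an atom of the other; the function name and the example data are hypothetical.

```python
def secondary_fragments(primary, cut_bonds):
    """Join each pair of primary fragments that share a cut bond.

    primary: list of atom-id lists; cut_bonds: list of (a, b) bonds labeled 'Y'.
    """
    owner = {atom: i for i, frag in enumerate(primary) for atom in frag}
    pairs = sorted({tuple(sorted((owner[a], owner[b])))
                    for a, b in cut_bonds if owner[a] != owner[b]})
    return [sorted(primary[i] + primary[j]) for i, j in pairs]


primary = [[1, 2], [3, 4, 5], [6, 7, 8]]
cut_bonds = [(2, 3), (5, 6)]
print(secondary_fragments(primary, cut_bonds))
# [[1, 2, 3, 4, 5], [3, 4, 5, 6, 7, 8]]
```

Each secondary fragment restores one cut bond, so the atoms on both sides of that bond see a chemical environment closer to the one they have in the intact molecule.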
In one possible implementation, the relationship determination module includes:
a fragment counting submodule, configured to collect the fragments of each target molecule to obtain a fragment set to be de-duplicated;
a de-duplication submodule, configured to remove duplicate fragments from the fragment set to be de-duplicated to obtain the fragment set;
a relationship determining submodule, configured to determine the correspondence between each fragment in the fragment set and each target molecule according to the fragments of each target molecule.
In one possible implementation, the geometric optimization module includes:
a coordinate optimization submodule, configured to optimize the geometric structure of each target molecule by a density functional method according to the initial molecular coordinates of each target molecule to obtain optimized molecular coordinates;
a conformation determining submodule, configured to determine the most stable conformation of each target molecule according to the optimized molecular coordinates corresponding to each target molecule.
In one possible implementation, the first computing module includes:
a first calculation submodule, configured to, for a target molecule whose charge type is the first type, calculate the single-point energy of each fragment in the target molecule as the calculation result according to the optimized molecular coordinates of the target molecule and a preset calculation method; and/or
a second calculation submodule, configured to, for a target molecule whose charge type is the second type, determine new fragments for the target molecule according to the twistable dihedral angles on the fragments of the target molecule and the preset interval degrees, perform geometric optimization on the new fragments, and calculate the electrostatic potentials of the new fragments under different dihedral angles as the calculation result.
In one possible implementation, the charge information calculation module includes:
a virtual atom adding submodule for adding virtual atoms in each target molecule according to the types of atoms in each target molecule to form a new target molecule;
a matrix construction submodule, configured to, for a target molecule whose charge type is the second type, construct a target matrix according to the optimized molecular coordinates of the new target molecule corresponding to the target molecule and the electrostatic potentials of the corresponding new fragments;
an atomic charge calculation submodule, configured to calculate the atomic charges of the new target molecule according to the target matrix;
an atomic charge determination submodule, configured to remove the virtual atoms from the new target molecule, assign the atomic charge of each virtual atom to the corresponding original atom in the target molecule, and determine the atomic charge of each atom in the target molecule as the charge information of the target molecule.
In one possible implementation, the force field parameter fitting module comprises:
an energy fitting submodule, configured to calculate, based on the atomic charge of each atom in the target molecule and according to the preset interval degrees, the coordinates of each atom in the target molecule and the total energy of the corresponding new fragment under different dihedral angles;
a curve fitting submodule, configured to fit a quantum-mechanics-based potential energy curve for each dihedral angle corresponding to the target molecule according to the total energy of each new fragment in the target molecule under different dihedral angles;
a force field fitting submodule, configured to fit the dihedral angle force field parameters of each new fragment by a least square method according to the initial force field parameters and the atomic charge and energy of each atom in the new fragment under each dihedral angle conformation.
In some embodiments, the functions or modules included in the apparatus provided by the embodiments of the present disclosure may be used to perform the methods described in the foregoing method embodiments, and specific implementation and beneficial effects of the functions or modules may refer to the descriptions in the foregoing method embodiments, which are not repeated herein for brevity.
It should be noted that, although the molecular force field fitting method and device are described above by way of the foregoing embodiments, those skilled in the art will understand that the present disclosure should not be limited thereto. In fact, the user can flexibly set each step and module according to personal preference and/or the actual application scenario, so long as the technical solution of the present disclosure is satisfied.
The disclosed embodiments also provide a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described method. The computer readable storage medium may be a volatile or nonvolatile computer readable storage medium.
The embodiment of the disclosure also provides an electronic device, which comprises: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to implement the above-described method when executing the instructions stored by the memory.
Embodiments of the present disclosure also provide a computer program product comprising computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, which when run in a processor of an electronic device, performs the above method.
Fig. 8 is a block diagram illustrating an apparatus 1900 for molecular force field fitting, according to an example embodiment. For example, the apparatus 1900 may be provided as a server or terminal device. Referring to fig. 8, the apparatus 1900 includes a processing component 1922 that further includes one or more processors and memory resources represented by memory 1932 for storing instructions, such as application programs, that are executable by the processing component 1922. The application programs stored in memory 1932 may include one or more modules each corresponding to a set of instructions. Further, processing component 1922 is configured to execute instructions to perform the methods described above.
The apparatus 1900 may further comprise a power component 1926 configured to perform power management of the apparatus 1900, a wired or wireless network interface 1950 configured to connect the apparatus 1900 to a network, and an input/output interface 1958 (I/O interface). The apparatus 1900 may operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as memory 1932, including computer program instructions executable by processing component 1922 of apparatus 1900 to perform the above-described methods.
The present disclosure may be a system, method, and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for causing a processor to implement aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include: a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disk read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized by utilizing state information of the computer readable program instructions, and the electronic circuitry can execute the computer readable program instructions to implement aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing description of the embodiments of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the technical improvements in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A method of molecular force field fitting, the method comprising:
fragmenting each target molecule according to the 3D structure of each target molecule to determine fragments of each target molecule, wherein each target molecule is a ligand molecule for binding with a receptor protein;
generating a fragment set according to fragments of each target molecule, and determining the corresponding relation between each fragment in the fragment set and each target molecule;
geometrically optimizing each target molecule to determine the most stable conformation of each target molecule;
respectively carrying out electrostatic potential or single-point energy calculation on each fragment of each target molecule according to the most stable conformation of each target molecule to obtain a corresponding calculation result of each fragment in each target molecule;
fitting charge information of each target molecule according to each calculation result;
according to the calculation result corresponding to the fragments of each target molecule and the charge information of the target molecule, calculating the total energy of each fragment of the target molecule under different dihedral angles, and fitting out force field parameters of all dihedral angles of each fragment based on the total energy;
and integrating force field parameters corresponding to fragments in each target molecule, and outputting formatted force field parameter files of all target molecules.
2. The method of claim 1, wherein fragmenting each target molecule according to its 3D structure to determine fragments of each target molecule comprises:
scoring each atom in each target molecule according to the 3D structure of each target molecule to obtain an atomic score for each atom, wherein atoms belonging to different atom types have different atomic scores, and atoms of the same atom type are located in at least one kind of chemical structure environment;
adding a label to each atomic bond in each target molecule according to the 3D structure of each target molecule, wherein atomic bonds of different types receive different labels;
cutting the atomic bonds marked with the first label in the target molecule to obtain primary fragments of the target molecule;
wherein, for a target molecule comprising two primary fragments, the two primary fragments are directly linked; for a target molecule comprising more than two primary fragments, connecting two adjacent primary fragments to form a secondary fragment; fragments of the target molecule include primary fragments or secondary fragments.
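As a non-limiting illustration, the bond-cutting step of claim 2 could be sketched in Python as follows. The sketch assumes a molecule is given as an atom count plus a list of labelled bonds; the label name `CUT` (standing in for the "first label"), the function names, and the toy six-atom chain are all hypothetical and do not come from the disclosure. Atom scoring and label assignment are outside the sketch.

```python
from collections import defaultdict

CUT = "cut"  # hypothetical name for the "first label" that marks bonds to cut


def primary_fragments(n_atoms, bonds):
    """Return the connected components left after removing bonds labelled CUT."""
    adjacency = defaultdict(set)
    for i, j, label in bonds:
        if label != CUT:
            adjacency[i].add(j)
            adjacency[j].add(i)
    assigned, fragments = set(), []
    for start in range(n_atoms):
        if start in assigned:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first walk over the uncut bonds
            atom = stack.pop()
            if atom not in component:
                component.add(atom)
                stack.extend(adjacency[atom] - component)
        assigned |= component
        fragments.append(frozenset(component))
    return fragments


def secondary_fragments(fragments, bonds):
    """Join each pair of primary fragments that a cut bond used to connect."""
    owner = {atom: k for k, frag in enumerate(fragments) for atom in frag}
    pairs = set()
    for i, j, label in bonds:
        if label == CUT and owner[i] != owner[j]:
            pairs.add((min(owner[i], owner[j]), max(owner[i], owner[j])))
    return [fragments[a] | fragments[b] for a, b in sorted(pairs)]


# Toy six-atom chain 0-1-2-3-4-5 with two bonds marked for cutting.
bonds = [(0, 1, "keep"), (1, 2, CUT), (2, 3, "keep"), (3, 4, CUT), (4, 5, "keep")]
primary = primary_fragments(6, bonds)
secondary = secondary_fragments(primary, bonds)
```

With exactly two primary fragments, `secondary_fragments` returns a single fragment that links them directly; with more than two, it returns one joined fragment per adjacent pair.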
3. The method of claim 1, wherein generating a fragment set from fragments of each target molecule and determining correspondence between each fragment in the fragment set and each target molecule comprises:
counting fragments of each target molecule to obtain a fragment set to be de-duplicated;
removing repeated fragments in the fragment set to be de-duplicated to obtain a fragment set;
and determining the corresponding relation between each fragment in the fragment set and each target molecule according to the fragments of each target molecule.
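As a non-limiting illustration, the de-duplication and correspondence steps of claim 3 could be sketched as follows. The sketch assumes each fragment has already been reduced to a canonical string key so that identical fragments compare equal; the function name, the molecule names, and the example keys are hypothetical.

```python
def build_fragment_set(fragments_by_molecule):
    """De-duplicate fragments and record which molecules contain each one."""
    fragment_set = []  # unique fragments, in first-seen order
    owners = {}        # fragment key -> list of molecule names containing it
    for molecule, fragments in fragments_by_molecule.items():
        for fragment in fragments:
            if fragment not in owners:
                owners[fragment] = []
                fragment_set.append(fragment)
            if molecule not in owners[fragment]:
                owners[fragment].append(molecule)
    return fragment_set, owners


# Hypothetical input: fragment keys per target molecule, with repeats.
fragments_by_molecule = {
    "mol_A": ["c1ccccc1C", "CC(=O)N"],
    "mol_B": ["CC(=O)N", "CCO", "CCO"],
}
fragment_set, owners = build_fragment_set(fragments_by_molecule)
```

Because each unique fragment is stored once, a later calculation on that fragment can be reused by every molecule listed in `owners`.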
4. The method of claim 1, wherein geometrically optimizing each of the target molecules to determine a most stable conformation of each of the target molecules comprises:
optimizing the geometric structure of each target molecule by adopting a density functional method according to the initial molecular coordinates of each target molecule to obtain optimized molecular coordinates;
and determining the most stable conformation of each target molecule according to the optimized molecular coordinates corresponding to each target molecule.
5. The method of claim 1, wherein the step of calculating the electrostatic potential or the single point energy of each fragment of each target molecule based on the most stable conformation of each target molecule to obtain a corresponding calculation result of each fragment of each target molecule comprises:
for a target molecule whose charge type is a first type, calculating the single-point energy of each fragment in the target molecule according to the optimized molecular coordinates of the target molecule and a preset calculation method, and taking the single-point energy as the calculation result; and/or
for a target molecule whose charge type is a second type, determining new fragments for the target molecule according to the twistable dihedral angles on the fragments of the target molecule and a preset angular interval, performing geometric optimization on the new fragments, and calculating the electrostatic potentials of the new fragments at different dihedral angles to obtain the calculation result.
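As a non-limiting illustration, generating new fragment geometries at a preset dihedral interval could be sketched as follows. The sketch only rotates the atoms on one side of the twistable bond; the subsequent geometric optimization and electrostatic potential calculation named in claim 5 require a quantum chemistry package and are not shown. The function names, the 30-degree interval, and the four-atom coordinates are assumptions made for the example.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _add(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def _scale(a, s):
    return (a[0] * s, a[1] * s, a[2] * s)


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees."""
    b0, b1, b2 = _sub(p0, p1), _sub(p2, p1), _sub(p3, p2)
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))
    v = _sub(b0, _scale(b1, _dot(b0, b1)))  # b0 projected off the bond axis
    w = _sub(b2, _scale(b1, _dot(b2, b1)))  # b2 projected off the bond axis
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def rotate_about_axis(point, origin, axis_unit, degrees):
    """Rodrigues rotation of a point about an axis passing through origin."""
    t = math.radians(degrees)
    v = _sub(point, origin)
    rotated = _add(
        _add(_scale(v, math.cos(t)), _scale(_cross(axis_unit, v), math.sin(t))),
        _scale(axis_unit, _dot(axis_unit, v) * (1.0 - math.cos(t))),
    )
    return _add(origin, rotated)


def scan_conformers(coords, quad, moving, step=30):
    """Return (target_angle, coordinates) pairs covering -180 to 180 degrees."""
    i, j, k, l = quad
    axis = _sub(coords[k], coords[j])
    axis = _scale(axis, 1.0 / math.sqrt(_dot(axis, axis)))
    start = dihedral(coords[i], coords[j], coords[k], coords[l])
    conformers = []
    for target in range(-180, 180, step):
        new_coords = list(coords)
        for atom in moving:  # only atoms beyond the rotated bond are moved
            new_coords[atom] = rotate_about_axis(coords[atom], coords[k], axis, target - start)
        conformers.append((target, new_coords))
    return conformers


# Hypothetical four-atom fragment; atom 3 sits on the moving side of bond 1-2.
coords = [(1.0, 0.2, -0.5), (0.0, 0.0, 0.0), (0.0, 0.0, 1.5), (0.9, 0.6, 2.0)]
conformers = scan_conformers(coords, (0, 1, 2, 3), moving=[3], step=30)
```

Each entry of `conformers` would then be the starting geometry for one constrained optimization in a real workflow.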
6. The method of claim 5, wherein fitting charge information for each of the target molecules based on each of the calculations comprises:
adding virtual atoms into each target molecule according to the types of atoms in each target molecule to form a new target molecule;
for a target molecule whose charge type is the second type, constructing a target matrix according to the optimized molecular coordinates of the new target molecule corresponding to the target molecule and the electrostatic potentials of the corresponding new fragments;
calculating the atomic charge of the new target molecule according to the target matrix;
and removing the virtual atoms from the new target molecule, assigning the atomic charge of each virtual atom to the corresponding original atom in the target molecule, and determining the atomic charge of each atom in the target molecule as the charge information of the target molecule.
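As a non-limiting illustration, the final step of claim 6 could be sketched as follows, under the assumption that "assigning" a virtual atom's charge means adding it to the charge of the atom the virtual atom was attached to. That reading, the function name, the atom names, and the charge values are hypothetical; the construction and solution of the target matrix are not shown.

```python
import math


def fold_virtual_charges(charges, virtual_parent):
    """Remove virtual atoms and add each one's charge to its parent atom.

    charges:        atom name -> fitted charge (real and virtual atoms)
    virtual_parent: virtual atom name -> name of the real atom it belongs to
    """
    real = {name: q for name, q in charges.items() if name not in virtual_parent}
    for virtual, parent in virtual_parent.items():
        real[parent] += charges[virtual]
    return real


# Hypothetical fitted charges for a fragment with one virtual site on Cl1.
charges = {"C1": 0.15, "Cl1": -0.30, "X_Cl1": 0.15}
virtual_parent = {"X_Cl1": "Cl1"}
folded = fold_virtual_charges(charges, virtual_parent)
```

Folding the charges this way leaves the total molecular charge unchanged, which can serve as a quick consistency check.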
7. The method of claim 6, wherein calculating total energy of each fragment of the target molecule at different dihedral angles based on the calculation result corresponding to each fragment of the target molecule and the charge information of the target molecule, and fitting force field parameters of all dihedral angles of each fragment based on the total energy, comprises:
calculating, based on the atomic charge of each atom in the target molecule, the coordinates of each atom in the target molecule and the total energy of the corresponding new fragment at different dihedral angles spaced by the preset interval degrees;
fitting a quantum-mechanics-based potential energy curve for each dihedral angle of the target molecule according to the total energy of each new fragment in the target molecule at the different dihedral angles;
and fitting the dihedral angle force field parameters of each new fragment by a least square method according to the initial force field parameters and the atomic charge and energy of each atom in the new fragment under each dihedral angle conformation.
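As a non-limiting illustration, a least-squares fit of dihedral parameters to a scanned energy curve could be sketched as follows. The claim does not state the functional form; the sketch assumes a cosine series with zero phase, E(phi) = c + sum over n of k_n * cos(n * phi), which is a common choice in molecular mechanics and makes the fit linear in c and k_n. The function names and the synthetic reference energies are hypothetical, and the use of initial force field parameters and atomic charges to prepare those energies is not shown.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_torsion(angles_deg, energies, max_periodicity=3):
    """Least-squares fit of [c, k_1, ..., k_N] via the normal equations."""
    rows = [
        [1.0] + [math.cos(n * math.radians(a)) for n in range(1, max_periodicity + 1)]
        for a in angles_deg
    ]
    size = max_periodicity + 1
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(size)] for i in range(size)]
    atb = [sum(r[i] * e for r, e in zip(rows, energies)) for i in range(size)]
    return solve_linear(ata, atb)


# Synthetic reference curve built from known coefficients, scanned every 15 degrees.
true_params = [1.0, 0.5, -0.3, 1.2]
angles = list(range(-180, 180, 15))
energies = [
    true_params[0]
    + sum(true_params[n] * math.cos(n * math.radians(a)) for n in range(1, 4))
    for a in angles
]
fitted = fit_torsion(angles, energies)
```

On noise-free synthetic data the fit should recover the coefficients used to build the curve; with real reference energies the residual indicates how well the assumed series describes the dihedral.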
8. A molecular force field fitting device, the device comprising:
the fragment generation module is used for carrying out fragmentation treatment on each target molecule according to the 3D structure of each target molecule to determine fragments of each target molecule, wherein each target molecule is a ligand molecule for binding with receptor protein;
the relation determining module is used for generating a fragment set according to fragments of each target molecule and determining the corresponding relation between each fragment in the fragment set and each target molecule;
the geometric optimization module is used for carrying out geometric optimization on each target molecule and determining the most stable conformation of each target molecule;
the first calculation module is used for respectively carrying out electrostatic potential or single-point energy calculation on each fragment of each target molecule according to the most stable conformation of each target molecule to obtain a corresponding calculation result of each fragment in each target molecule;
a charge information calculation module for calculating charge information of each target molecule according to each calculation result;
the force field parameter fitting module is used for calculating the total energy of each fragment of the target molecule under different dihedral angles according to the calculation result corresponding to each fragment of the target molecule and the charge information of the target molecule, and fitting out force field parameters of all dihedral angles of each fragment based on the total energy;
and the result determining module is used for integrating the force field parameters corresponding to the fragments in each target molecule and outputting formatted force field parameter files of all the target molecules.
9. A molecular force field fitting device comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the method of any one of claims 1 to 7 when executing the instructions stored by the memory.
10. A non-transitory computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the method of any of claims 1 to 7.
CN202311842262.8A 2023-12-28 2023-12-28 Molecular force field fitting method and device Pending CN117809757A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311842262.8A CN117809757A (en) 2023-12-28 2023-12-28 Molecular force field fitting method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311842262.8A CN117809757A (en) 2023-12-28 2023-12-28 Molecular force field fitting method and device

Publications (1)

Publication Number Publication Date
CN117809757A (en) 2024-04-02

Family

ID=90419563

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311842262.8A Pending CN117809757A (en) 2023-12-28 2023-12-28 Molecular force field fitting method and device

Country Status (1)

Country Link
CN (1) CN117809757A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1553953A (en) * 2000-07-07 2004-12-08 维西根生物技术公司 Real-time sequence determination
WO2021103402A1 (en) * 2020-04-21 2021-06-03 深圳晶泰科技有限公司 Molecule force field fitting method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination