CN107038348B

CN107038348B - Drug target prediction method based on protein-ligand interaction fingerprint

Info

Publication number: CN107038348B
Application number: CN201710309067.7A
Authority: CN
Inventors: 李国菠; 吴勇; 刘莎; 于竹君
Original assignee: Sichuan University
Current assignee: Sichuan University
Priority date: 2017-05-04
Filing date: 2017-05-04
Publication date: 2020-03-10
Anticipated expiration: 2037-05-04
Also published as: CN107038348A

Abstract

A drug target prediction method based on protein-ligand interaction fingerprints. The method collects a large number of diverse crystal structures of target-ligand complexes and constructs a reference protein-ligand interaction fingerprint model for each; predicts the possible binding modes of the drug under test with each target by molecular docking; builds the interaction fingerprint model of the drug and the target; calculates the similarity between that fingerprint and the reference interaction fingerprint model, as well as the affinity of the drug for the target; ranks the targets in the target library by integrating the docking score, the fingerprint similarity and the affinity; and outputs the potential targets of the drug. By using the interaction fingerprint method to rank and predict drug-target interaction modes, the invention overcomes the low success rate of molecular docking in predicting those modes; and by ranking the targets with the comprehensive index Cvalue, it exploits the advantages of each method and thereby improves the accuracy of drug target prediction.

Description

Drug target prediction method based on protein-ligand interaction fingerprint

Technical Field

The invention relates to the field of computer-aided drug molecule design, and in particular to a novel method for predicting drug targets that combines molecular docking with interaction fingerprints; specifically, it relates to a drug target prediction method based on protein-ligand interaction fingerprints.

Background

Drug target identification refers to the discovery of targets for the action of a drug or active compound by some means. The identification of drug targets plays a key role in the fields of drug research and development, chemical biology and the like, such as the elucidation of drug action molecular mechanisms, the development of new applications of old drugs, the development of new modes of combined drugs and the like. Currently, a variety of experimental approaches for drug target identification have been developed, with chemical proteomics being the most widely used. The method adopts the concept of 'fishing', firstly fixes the drug to be detected on a biochip or connects a biotin label to capture the protein tightly combined with the drug, then separates the protein by methods such as affinity chromatography and the like, identifies the protein by high-sensitivity mass spectrometry, and finally carries out further bioinformatics analysis on the protein, thereby finally determining the action target of the drug. However, chemical proteomics and other experimental approaches tend to be time consuming, expensive and difficult to implement. To save time and research costs, various computer-aided drug target prediction methods have been applied to drug target identification studies in recent years. Since predicting a target by a computational method requires further experimental validation, a hybrid method, i.e., a computational method organically combined with an experiment, is gradually being popularized. In this hybrid approach, computer-aided target prediction methods are often used first, and thus its predictive power plays a crucial role in the successful identification of the final drug target.

Existing computer-aided drug target prediction methods fall broadly into two categories: ligand-based and structure-based methods. Ligand-based approaches typically infer potential targets by calculating the chemical structural similarity of a given drug or compound to the active compounds of known targets; if the given drug or compound is highly similar to certain active compounds, the targets of those active compounds may also be targets of the given drug or compound. The ligand-based method is simple in principle and quite effective, but it applies only when the chemical structures are highly similar, and it cannot take the three-dimensional structure of the drug target into account, which greatly limits its scope and accuracy. The structure-based approach evaluates how well the drug and a potential target match in shape and electrostatics in three dimensions, and from this infers the likely targets of the drug. Reverse docking is the most common structure-based target prediction method: it uses molecular docking to predict the interaction mode and affinity of a given drug or compound with each target, ranks the targets accordingly, and thereby identifies possible targets of the drug. This method takes full account of the three-dimensional structure of the target protein, but molecular docking still has unsolved problems, such as protein flexibility, scoring function accuracy and solvent water molecules, and these result in the low prediction accuracy of reverse docking. In recent years, integrated drug target prediction strategies have been proposed that combine the respective advantages of ligand-based and structure-based approaches. Such strategies improve the accuracy of drug target prediction to some extent. In summary, the existing computer-aided drug target prediction methods each have advantages, but also have shortcomings that are difficult to overcome, so that target prediction accuracy is not high and the success rate of drug target identification suffers. There is therefore still a need for a new drug target prediction method that improves target prediction accuracy and thereby provides an effective tool for drug target identification.

Disclosure of Invention

The purpose of the invention is to provide a novel method for predicting drug targets. The method integrates molecular docking, a protein-ligand interaction fingerprint method and a protein-ligand affinity prediction method to predict targets; it takes full account of the important structural features of each target and improves the accuracy of target prediction.

The basic idea of the invention is as follows: collect a large number of diverse crystal structures of target-ligand complexes (hereinafter, complexes); construct a reference protein-ligand interaction fingerprint model for each complex; predict the possible binding modes of the drug under test with each target by molecular docking; build the interaction fingerprint model of the drug and the target for each possible binding mode; calculate the similarity between these fingerprints and the reference interaction fingerprint model, as well as the affinity of the drug for the target; rank the targets of the target library by integrating the docking score, the fingerprint similarity and the affinity; and output the potential targets of the drug. The idea rests on three points: 1) the diversity and richness of the targets and of the protein-ligand interaction fingerprint models in the target library comprehensively reflect the interaction characteristics of the complex structures, so that the resulting target prediction system is general and practical; 2) the protein-ligand interaction fingerprint analysis method considers the key structural features of each target, so that the predicted drug-target binding modes can be ranked accurately, which addresses the inability of molecular docking scoring functions to rank those binding modes correctly; 3) the targets are ranked by a comprehensive index that integrates the docking score, the fingerprint similarity and the affinity, so that the advantages of each method are exploited, the limitations of any single method are overcome, and the accuracy of target prediction is improved.

The purpose of the invention is achieved by the following steps:

collecting a large number of diverse crystal structures of target-ligand complexes (referred to simply as complexes), constructing a reference protein-ligand interaction fingerprint model for each complex, predicting the possible binding modes of a given drug with each target by molecular docking, building the interaction fingerprint model of the drug and the target, calculating the similarity between these fingerprints and the reference interaction fingerprint model as well as the affinity of the drug for the target, ranking the targets in the target library by integrating the docking score, the fingerprint similarity and the affinity, and outputting the potential targets of the given drug.

The prediction is performed as follows:

(1) collecting drug targets and establishing a drug target information base; collecting the crystal structures of all drug target-small molecule complexes from a protein crystal structure database, and establishing an active site database from the complex structures;

(2) analyzing, on the basis of the drug target active site database, the interaction features between the protein and the small-molecule compound in every collected complex crystal structure with an in-house protein-ligand interaction fingerprint method, and establishing a reference interaction fingerprint model library;

(3) predicting the possible binding modes of a given drug or compound with all targets by a molecular docking method, and building the interaction fingerprint model of the drug and the target for each possible binding mode;

(4) calculating the similarity between these fingerprints and the reference interaction fingerprint models, and determining the binding mode of the drug and the target according to the similarity value;

(5) predicting the affinity of the drug for the target in the binding mode so obtained, using a protein-ligand affinity prediction method;

(6) calculating a comprehensive evaluation index Cvalue from the docking score, the fingerprint similarity and the affinity value, ranking all targets in the target library by Cvalue, and outputting a list of potential targets of the given drug.

The specific steps of drug target prediction are as follows:

(1) constructing a target information base and an active site database:

collecting the name, biological category, related diseases and drug development information of each drug target from the public, freely available databases TTD, PubMed, PDBbind, ChEMBL and PDB, and establishing a drug target information base; for each target, collecting target-compound complex crystal structures from the PDB protein crystal structure database, all with a resolution better than 2.5 angstroms, and, if several complex crystal structures exist for the same target, selecting structures that contain small-molecule compounds of different classes; analyzing each collected complex crystal structure with an in-house script program and automatically constructing an active site database;

(2) constructing a reference interaction fingerprint model library:

analyzing the protein-compound interactions in each complex with the in-house IFP-analytes software, based on the collected protein-compound complex crystal structures and the active site database, to construct a reference interaction fingerprint model database;

(3) calculation of fingerprint model given the interaction of drug with target:

predicting the binding modes of a given drug or compound with all targets in the target library by a molecular docking method, with 10 possible binding modes generated for the given drug and each target; for each binding mode, the drug-target interaction fingerprint model is calculated by the same method as the reference interaction fingerprint and is likewise stored as a .ifp format file.

(4) Similarity calculation of the predicted interaction fingerprint for a given drug to a reference interaction fingerprint model:

calculating the similarity of the interaction fingerprint pattern corresponding to the given drug and the reference interaction fingerprint pattern model one by one for 10 predicted interaction patterns of each target, wherein the similarity is calculated according to the following formula (I):

In formula (I), IFPscore is the similarity value between the interaction fingerprint of the given drug and the reference interaction fingerprint; d_i is the total number of entries assigned "1" in the interaction fingerprint of the given drug; r_i is the total number of entries assigned "1" in the reference interaction fingerprint; c_i is the total number of entries assigned "1" in both the interaction fingerprint of the given drug and the reference interaction fingerprint; w_i is the weight of the corresponding interaction category in the fingerprint;

(5) prediction of affinity for a given drug to a target:

for each target, outputting, on the basis of the calculated fingerprint similarities, the binding mode of the drug and the target that has the highest similarity; performing affinity prediction for this binding mode with the ID-Score program, and outputting the predicted affinity value IDscore;

(6) comprehensive sequencing of targets:

calculating a comprehensive index Cvalue according to the molecular docking score, the fingerprint spectrogram similarity and the affinity predicted value, sequencing the targets according to the Cvalue, and calculating the Cvalue according to a formula (II);

in formula (II): IFPscore is the fingerprint similarity value, Dscore is the molecular docking score, and IDscore is the predicted affinity value; μ₁ is the mean of the fingerprint similarity values over all targets, μ₂ is the mean of the molecular docking scores over all targets, and μ₃ is the mean of the predicted affinity values over all targets; σ₁ is the standard deviation of the fingerprint similarity values over all targets, σ₂ is the standard deviation of the molecular docking scores over all targets, and σ₃ is the standard deviation of the predicted affinity values over all targets; w₁ is the weight of the fingerprint similarity value, w₂ is the weight of the molecular docking score, and w₃ is the weight of the predicted affinity value.

In step (1), the active site database is constructed automatically from the collected complex crystal structures by an in-house script program, as follows: first, the small-molecule compound in the complex crystal structure is identified automatically, and its coordinate center is taken as the center of the active site; then, 6 angstroms is added to each of the length, width and height of the small-molecule compound to give the size of the active site; finally, all protein residues within this range are selected as the active site, and the center coordinates, the active-site grid size and all residue data are stored as an active site file in .conf format.
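The box construction described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the in-house script: the "coordinate center" is read here as the center of the ligand's bounding box, the 6 angstroms is read as the total amount added to each dimension, and a residue is counted as part of the active site when at least one of its atoms lies inside the box. The names `ActiveSite` and `build_active_site` are hypothetical, and writing the .conf file is omitted because its layout is not described.

```python
from dataclasses import dataclass


@dataclass
class ActiveSite:
    center: tuple   # (x, y, z) of the box center, in angstroms
    size: tuple     # box edge lengths along x, y, z, in angstroms
    residues: list  # residue identifiers with at least one atom in the box


def build_active_site(ligand_atoms, protein_atoms, padding=6.0):
    """Derive an active-site box from ligand coordinates.

    ligand_atoms:  list of (x, y, z) tuples
    protein_atoms: list of (residue_id, (x, y, z)) tuples
    padding:       total length added to each ligand dimension (assumption)
    """
    if not ligand_atoms:
        raise ValueError("ligand has no atoms")
    mins = [min(a[i] for a in ligand_atoms) for i in range(3)]
    maxs = [max(a[i] for a in ligand_atoms) for i in range(3)]
    # Assumption: the center is the midpoint of the ligand bounding box.
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + padding for lo, hi in zip(mins, maxs))

    residues = []
    for res_id, xyz in protein_atoms:
        inside = all(abs(xyz[i] - center[i]) <= size[i] / 2 for i in range(3))
        if inside and res_id not in residues:
            residues.append(res_id)
    return ActiveSite(center, size, residues)
```

If the 6 angstroms were instead meant as a margin on every side, `padding` would simply be doubled; the text does not settle this.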

In step (2), the reference interaction fingerprint model database is constructed as follows: first, IFP-analyzers is used to analyze 8 types of interaction between the protein active-site residues and the compound, namely hydrogen bond donor, hydrogen bond acceptor, positive charge center, negative charge center, face-to-face pi-pi interaction, edge-to-face pi-pi interaction, hydrophobic interaction and ligand-metal ion interaction; if an interaction of a given type exists, the corresponding entry for the residue is assigned 1, and if not, it is assigned 0; then, a weight of 2 is set for the positive charge center, negative charge center and ligand-metal ion interaction, and a weight of 1 is set for the hydrogen bond donor, hydrogen bond acceptor, face-to-face pi-pi interaction, edge-to-face pi-pi interaction and hydrophobic interaction; the active-site residues, the interaction assignments and the weights together form an interaction fingerprint; with these steps, the complex structure of each target is analyzed, and a reference interaction fingerprint model is constructed and stored as a .ifp format file.
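As an illustration of the residue, assignment and weight layout described above, the following Python sketch encodes a fingerprint as one 8-entry 0/1 vector per active-site residue. It assumes the interactions have already been detected by some geometric analysis, which is not shown. The category names, the reading of the second pi-pi category as edge-to-face, and the function `build_fingerprint` are assumptions made for the example; this is not the IFP software itself, and the .ifp file layout is not reproduced.

```python
# Weights per interaction category, as stated in the text:
# 2 for charge centers and metal contacts, 1 for everything else.
INTERACTION_WEIGHTS = {
    "hbond_donor": 1,
    "hbond_acceptor": 1,
    "positive_center": 2,
    "negative_center": 2,
    "pi_face_to_face": 1,
    "pi_edge_to_face": 1,   # assumption: the second pi-pi category
    "hydrophobic": 1,
    "metal": 2,
}
INTERACTION_TYPES = list(INTERACTION_WEIGHTS)  # fixed column order


def build_fingerprint(residues, observed):
    """Return {residue_id: [0/1 for each interaction type]}.

    residues: ordered list of active-site residue identifiers
    observed: dict mapping residue_id -> set of detected interaction types
    """
    fingerprint = {}
    for res in residues:
        seen = observed.get(res, set())
        unknown = seen - set(INTERACTION_TYPES)
        if unknown:
            raise ValueError(f"unknown interaction types: {sorted(unknown)}")
        fingerprint[res] = [1 if t in seen else 0 for t in INTERACTION_TYPES]
    return fingerprint


# Usage with made-up observations for three residues.
example = build_fingerprint(
    ["Asn51", "Phe138", "Gly97"],
    {"Asn51": {"hbond_acceptor"}, "Phe138": {"pi_face_to_face", "hydrophobic"}},
)
```

Keeping the weights in a separate table, rather than inside the bit vectors, means the same fingerprint can be rescored if the weights are changed.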

The positive effects of the invention are as follows: a rich and diverse drug target database is established, a target prediction method based on interaction fingerprints is established, and the targets in the target database are ranked comprehensively by integrating the molecular docking score, the fingerprint similarity and the predicted affinity value. On the one hand, the method uses interaction fingerprints to rank and predict the drug-target binding modes, which overcomes the low success rate of molecular docking in predicting those modes; on the other hand, the targets are ranked by the comprehensive index Cvalue, which integrates the interaction fingerprint, molecular docking and affinity prediction methods, so that the drug-target interaction is evaluated from different angles and the advantages of each method are exploited, thereby improving the accuracy of drug target prediction.

Drawings

FIG. 1 is a flowchart of the target prediction method based on protein-ligand interaction fingerprinting according to the present invention.

FIG. 2 shows the distribution of the target library constructed according to the present invention.

FIG. 3 is an example of an interaction fingerprint in an embodiment of the present invention.

Detailed Description

FIG. 1 depicts the target prediction method based on protein-ligand interaction fingerprints. The input is the chemical structure of the drug, as an optimized three-dimensional structure. Following the target list of the target library, the constructed program retrieves the information of each target in turn and calls a molecular docking program to dock the input three-dimensional drug structure into the active site of target T_i, generating docking conformations of the drug molecule with target T_i (10 conformations in this example). The constructed program then performs fingerprint analysis on the docking conformations to generate the interaction fingerprint corresponding to each docking conformation for target T_i, calculates the similarity between the interaction fingerprint of each docking conformation and the reference interaction fingerprints of target T_i in the fingerprint library, and outputs the docking conformation with the highest similarity for target T_i. At this point the docking score and the fingerprint similarity value of that docking conformation are available; the affinity of that conformation for target T_i is then predicted, and the predicted affinity value is output. From the fingerprint similarity, docking score and predicted affinity value for target T_i, the comprehensive index Cvalue of target T_i is calculated. The Cvalue of the given drug with every target in the target library is calculated in this way, all targets are ranked by Cvalue, and finally a list of potential targets of the input drug is given. The specific steps are as follows:

(1) constructing a target information base and an active site database:

See Figure 2. The name, biological category, related diseases and drug development information of each drug target are collected from public, freely available databases such as TTD, PubMed, PDBbind, ChEMBL and PDB, and a drug target information base is established. The target library covers 2842 drug targets in 10 different biological categories, including enzymes, regulatory factors, binding proteins, transport proteins, receptors, signaling proteins, structural proteins, viral proteins and ion channels. For each target, target-compound complex crystal structures are collected from the PDB protein crystal structure database, all with a resolution better than 2.5 angstroms; if several complex crystal structures exist for the same target, structures containing small-molecule compounds of different classes are selected, and the small-molecule compound in each complex is required to be drug-like. The drug-likeness conditions are: 1) the small molecule is non-ionic; 2) the number of hydrogen bond donors does not exceed 5; 3) the number of hydrogen bond acceptors does not exceed 10; 4) the molecular weight is less than 600 daltons; 5) there are no more than 5 positive or negative charge centers; 6) the number of sulfur atoms does not exceed 1.
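The six drug-likeness conditions above amount to a simple filter over precomputed ligand properties. The Python sketch below shows one way to express it; the `LigandProps` record and its field names are hypothetical, and how each property is computed (for example by a cheminformatics toolkit) is outside the scope of the example.

```python
from dataclasses import dataclass


@dataclass
class LigandProps:
    is_ionic: bool            # condition 1: must be False
    hbond_donors: int         # condition 2: at most 5
    hbond_acceptors: int      # condition 3: at most 10
    molecular_weight: float   # condition 4: below 600 Da
    charge_centers: int       # condition 5: at most 5 (positive or negative)
    sulfur_atoms: int         # condition 6: at most 1


def is_drug_like(p: LigandProps) -> bool:
    """Apply the six drug-likeness conditions listed in the text."""
    return (
        not p.is_ionic
        and p.hbond_donors <= 5
        and p.hbond_acceptors <= 10
        and p.molecular_weight < 600.0
        and p.charge_centers <= 5
        and p.sulfur_atoms <= 1
    )


# Usage with made-up property values.
candidate = LigandProps(False, 2, 6, 412.5, 1, 0)
keep = is_drug_like(candidate)
```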
From the collected complex crystal structures, an active site database is constructed automatically by an in-house script program, as follows: first, the small-molecule compound in the complex crystal structure is identified automatically, and its coordinate center is taken as the center of the active site; then, 6 angstroms is added to each of the length, width and height of the small-molecule compound to give the size of the active site; finally, all protein residues within this range are selected as the active site, and the center coordinates, the active-site grid size and all residue data are stored as an active site file in .conf format.

(2) Constructing a reference interaction fingerprint model library:

The protein-compound interactions in each complex are analyzed with the in-house IFP-analytes software, based on the collected protein-compound complex crystal structures and the active site database, and a reference interaction fingerprint model database is constructed as follows: 1) IFP-analyzers is used to analyze 8 types of interaction between the protein active-site residues and the compound: hydrogen bond donor (D), hydrogen bond acceptor (A), positive charge center (P), negative charge center (N), face-to-face pi-pi interaction (F), edge-to-face pi-pi interaction (E), hydrophobic interaction (H) and ligand-metal ion interaction (M); 2) if an active-site residue has an interaction of a given type with the small molecule in the complex structure, the corresponding interaction type of that residue is assigned 1, and if not, it is assigned 0, and all active-site residues are processed in turn in this way; 3) a weight of 2 is set for the positive charge center, negative charge center and ligand-metal ion interaction categories, and a weight of 1 is set for the hydrogen bond donor, hydrogen bond acceptor, face-to-face pi-pi interaction, edge-to-face pi-pi interaction and hydrophobic interaction categories; the active-site residues, the interaction assignments and the weights together form an interaction fingerprint; 4) with the above steps, each complex structure in the target library is analyzed, and the corresponding interaction fingerprint (referred to as the reference interaction fingerprint) is constructed and stored as a .ifp format file.
The right panel of Figure 3 shows the resulting interaction fingerprint, in which the active-site residues of the complex contribute as follows: Asn51 provides a hydrogen bond acceptor, Met98 a hydrophobic group, Leu103 a hydrogen bond acceptor, Leu107 a hydrophobic group, Phe138 a face-to-face pi-pi interaction and a hydrophobic interaction, Tyr139 a hydrogen bond donor and a hydrophobic interaction, Trp162 a face-to-face pi-pi interaction and a hydrophobic interaction, and Thr184 a hydrogen bond donor.

(3) Calculation of fingerprint model given the interaction of drug with target:

The binding modes of a given drug or compound with all targets in the target library are predicted by a molecular docking method; 10 possible docking conformations are generated for the given drug and each target, and each docking conformation has a docking score Dscore. For each docking conformation, the corresponding interaction fingerprint model, referred to as the docking conformation interaction fingerprint, is calculated by the same method as the reference interaction fingerprint and is likewise stored as a .ifp format file.

(4) Similarity calculation of the docking conformation interaction fingerprint of a given drug to a reference interaction fingerprint model:

calculating the similarity IFPscore of the interaction fingerprint pattern corresponding to the given drug and the reference interaction fingerprint pattern model one by one according to the following formula (I) for 10 docking conformations of each target and the given drug:

In formula (I), IFPscore is the similarity value between the interaction fingerprint of the given drug and the reference interaction fingerprint; d_i is the total number of entries assigned "1" in the interaction fingerprint of the given drug; r_i is the total number of entries assigned "1" in the reference interaction fingerprint; c_i is the total number of entries assigned "1" in both the interaction fingerprint of the given drug and the reference interaction fingerprint; w_i is the weight of the corresponding interaction category in the fingerprint.
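Formula (I) itself is not reproduced in this text, so the following Python sketch should be read only as one plausible instance built from the symbols defined above. It assumes a weighted Tanimoto form, IFPscore = Σ w_i·c_i / Σ w_i·(d_i + r_i − c_i), with i running over the interaction categories; the actual formula of the invention may differ. The fingerprint layout (a dict of 8-entry 0/1 lists per residue) and the function name `ifp_score` are likewise assumptions.

```python
# Per-category weights in the order: donor, acceptor, positive, negative,
# face-to-face pi-pi, edge-to-face pi-pi, hydrophobic, metal.
WEIGHTS = [1, 1, 2, 2, 1, 1, 1, 2]


def ifp_score(drug_fp, ref_fp, weights=WEIGHTS):
    """Weighted Tanimoto similarity between two fingerprints (assumed form).

    drug_fp, ref_fp: dict mapping residue_id -> list of 0/1 values,
                     one value per interaction category.
    """
    n = len(weights)
    empty = [0] * n
    # d_i and r_i: number of 1s per category in each fingerprint.
    d = [sum(bits[i] for bits in drug_fp.values()) for i in range(n)]
    r = [sum(bits[i] for bits in ref_fp.values()) for i in range(n)]
    # c_i: number of 1s set for the same residue and category in both.
    c = [
        sum(1 for res, bits in drug_fp.items()
            if bits[i] and ref_fp.get(res, empty)[i])
        for i in range(n)
    ]
    shared = sum(w * ci for w, ci in zip(weights, c))
    union = sum(w * (di + ri - ci) for w, di, ri, ci in zip(weights, d, r, c))
    return shared / union if union else 0.0
```

Under this assumed form the score is 1.0 for identical fingerprints and 0.0 when no residue-level interaction is shared, which matches the role the text gives IFPscore as a similarity value.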

(5) Prediction of affinity for a given drug to a target:

for each target, outputting the docking conformation of the corresponding drug and the target when the similarity is highest according to the similarity of the fingerprint obtained by the calculation; for the docking conformation, an ID-Score program is used for performing affinity prediction, and an affinity prediction value IDscore is output.
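The selection step above, keeping for each target the docking conformation with the highest fingerprint similarity, can be sketched in Python as follows. The `Pose` record and `select_best_pose` are hypothetical names; calling ID-Score on the selected conformation is not shown, since its interface is not described here.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    pose_id: int      # index of the docking conformation (1..10 in the example)
    dscore: float     # docking score of this conformation
    ifpscore: float   # similarity to the reference interaction fingerprint


def select_best_pose(poses):
    """Return the pose with the highest IFPscore (first one wins on ties)."""
    if not poses:
        raise ValueError("no docking poses supplied")
    return max(poses, key=lambda p: p.ifpscore)


# Usage with made-up scores for three conformations of one target.
best = select_best_pose([
    Pose(1, 7.2, 0.41),
    Pose(2, 8.9, 0.35),
    Pose(3, 6.5, 0.62),
])
```

Note that the pose is chosen by fingerprint similarity rather than by docking score, so the conformation with the best Dscore (pose 2 in the made-up data) is not necessarily the one passed on to affinity prediction.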

(6) Comprehensive sequencing of targets:

In formula (II): IFPscore is the fingerprint similarity value, Dscore is the molecular docking score, and IDscore is the predicted affinity value; μ₁ is the mean of the fingerprint similarity values over all targets, μ₂ is the mean of the molecular docking scores over all targets, and μ₃ is the mean of the predicted affinity values over all targets; σ₁ is the standard deviation of the fingerprint similarity values over all targets, σ₂ is the standard deviation of the molecular docking scores over all targets, and σ₃ is the standard deviation of the predicted affinity values over all targets; w₁ is the weight of the fingerprint similarity value, w₂ is the weight of the molecular docking score, and w₃ is the weight of the predicted affinity value.
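Formula (II) is not reproduced in this text. Given that it is defined in terms of a mean, a spread and a weight for each of the three scores, one natural reading is a weighted sum of standardized scores, Cvalue = w₁(IFPscore − μ₁)/σ₁ + w₂(Dscore − μ₂)/σ₂ + w₃(IDscore − μ₃)/σ₃. The Python sketch below implements that reading purely as an assumption. It further assumes that higher is better for all three scores, that the spread is the population standard deviation, and that the weights are equal by default; none of these choices is confirmed by the text, and `rank_targets` is a hypothetical name.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of values; all zeros if there is no spread."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def rank_targets(scores, weights=(1.0, 1.0, 1.0), top_n=300):
    """Rank targets by an assumed Cvalue.

    scores:  dict mapping target name -> (IFPscore, Dscore, IDscore)
    weights: (w1, w2, w3); equal weights are an assumption
    top_n:   number of top-ranked targets to return
    Returns a list of (target, Cvalue) pairs, best first.
    """
    names = list(scores)
    columns = [z_scores([scores[n][k] for n in names]) for k in range(3)]
    cvalues = {
        name: sum(w * columns[k][i] for k, w in enumerate(weights))
        for i, name in enumerate(names)
    }
    ranked = sorted(cvalues.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


# Usage with made-up scores for three targets.
ranking = rank_targets({
    "T1": (0.9, 8.0, 7.0),
    "T2": (0.5, 6.0, 5.0),
    "T3": (0.1, 4.0, 3.0),
})
```

Standardizing each score before weighting keeps one score from dominating merely because of its numeric range; if the docking program reports lower scores as better, the sign of that term would need to be flipped.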

Through the steps, for a given drug, the target prediction method based on the protein-ligand fingerprint map comprehensively sorts all targets in the target library according to the comprehensive index Cvalue, and outputs 300 top-ranked targets as potential action targets of the drug. It is believed that such a method will provide a powerful tool for drug target identification, increasing the efficiency of drug target identification. The interaction fingerprint of this embodiment is shown in figure 3.

Claims

1. A drug target prediction method based on a protein-ligand interaction fingerprint spectrum is characterized by comprising the following steps: collecting a large number of diversified crystal structures of the target and ligand compound, simply referring the crystal structures of the target and ligand compound to the compound, constructing a reference protein-ligand interaction fingerprint model aiming at each compound, predicting a possible combination mode of a given drug and each target by adopting molecular docking, establishing the interaction fingerprint model of the drug and the target, calculating the similarity of the fingerprints and the reference interaction fingerprint model and the affinity of the drug and the target, sequencing the targets in a target library by integrating docking scoring, fingerprint similarity and affinity, and outputting potential targets of the drug;

the prediction was performed as follows:

(2) analyzing the interaction characteristics of the protein and the small molecular compound in all the collected compound crystal structures by using a protein-ligand interaction fingerprint method according to a drug target active site database, and establishing a reference interaction fingerprint model library;

(4) calculating the similarity of the fingerprint and a reference interaction fingerprint model, and determining the action mode of the drug and the target according to the similarity value;

2. The method of claim 1 for predicting a drug target based on a protein-ligand interaction fingerprint, wherein:

the specific steps of drug target prediction are as follows:

(1) constructing a target information base and an active site database:

collecting the name, biological category, related diseases and related information of drug development of a drug target from TTD, PubMed, PDBbind, ChEMBL and PDB public free databases, and establishing a drug target information base; for each target, collecting the target-compound crystal structures from a protein crystal structure PDB database, wherein the precision of all the structures is higher than 2.5 angstroms, and if a plurality of compound crystal structures exist in the same target, selecting small molecule compound structures containing different classes; automatically constructing an active site database by using a script program according to the collected crystal structure of the compound;

(2) constructing a reference interaction fingerprint model library:

analyzing the protein-compound interaction in each compound by using IFP-analytes software according to the collected protein-compound crystal structure and active site database to construct a reference interaction fingerprint model database;

(3) calculation of fingerprint model given the interaction of drug with target:

predicting the interaction pattern of a given drug or compound with all targets in a target library by using a molecular docking method, wherein 10 possible interaction patterns are generated by the given drug and each target; for each action mode, calculating a given drug and target interaction fingerprint model under the action mode according to a calculation method of a reference interaction fingerprint, and storing the given drug and target interaction fingerprint model as a .ifp format file;

IFPscore in formula (I) is the similarity value of the interaction fingerprint of a given drug to a reference interaction fingerprint; d_i is the total number of assignments of "1" in the interaction fingerprint for a given drug; r_i is the total number of assignments of "1" in the reference interaction fingerprint; c_i is the total number of assignments of "1" in both the interaction fingerprint of a given drug and the reference interaction fingerprint; w_i is the weight of each interaction category in the corresponding fingerprint;

(5) prediction of affinity for a given drug to a target:

(6) comprehensive sequencing of targets:

calculating a comprehensive index Cvalue according to the molecular docking score, the fingerprint spectrogram similarity and the affinity predicted value,

sequencing the targets according to Cvalue, and calculating the Cvalue according to a formula (II);

3. A drug target prediction method as claimed in claim 2 wherein:

step (1) according to the collected compound crystal structure, automatically constructing an active site database by using a script program, wherein the process comprises the following steps: firstly, automatically identifying a small molecular compound in a crystal structure of the compound, and selecting a coordinate center of the small molecular compound as an active site center; then, the length, width and height of the small molecular compound are respectively added with the distance of 6 angstroms to be the size of the active site; all protein residues in the active site range are selected as active sites, and the central coordinates, the grid size of the active sites and all residue data are stored as an active site file in .conf format.

4. A drug target prediction method as claimed in claim 2 wherein: in the step (2), constructing a reference interaction fingerprint model database, wherein the process comprises the following steps: firstly, analyzing 8 types of interaction between the protein active site residues and a compound by using IFP-analyzers, the interactions including a hydrogen bond donor, a hydrogen bond acceptor, a positive charge center, a negative charge center, a face-to-face pi-pi interaction, a hydrophobic interaction and a ligand-metal ion interaction, wherein if an interaction of a given type exists, the corresponding entry for the residue is assigned 1, and if no such interaction exists, the corresponding entry is assigned 0; then, setting a weighted value of 2 for the positive charge center, the negative charge center and the ligand-metal ion interaction, and setting a weighted value of 1 for the hydrogen bond donor, the hydrogen bond acceptor, the face-to-face pi-pi interaction and the hydrophobic interaction, wherein the active site residue, the interaction assignment and the weight jointly form an interaction fingerprint; by utilizing the steps, the complex structure of each target is analyzed, and a reference interaction fingerprint model is constructed and stored as a .ifp format file.