CN115761250A - Compound retrosynthesis method and device


Info

Publication number: CN115761250A
Application number: CN202211454720.6A
Authority: CN (China)
Prior art keywords: features, feature, sequence, compound, local
Legal status: Granted; currently active
Other languages: Chinese (zh)
Other versions: CN115761250B (granted publication)
Inventors: 侯静怡, 刘志杰, 贺威, 苏磊, 孟献兵
Assignee: University of Science and Technology Beijing (USTB)
Application filed by University of Science and Technology Beijing (USTB); priority to CN202211454720.6A

Landscapes

  • Complex Calculations (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a compound retrosynthesis method and device, relating to the technical field of retrosynthesis and in particular to a compound retrosynthesis method and device that combines multi-scale convolution with self-attention coding. The chemical formula features of the original compound are extracted through a Transformer-based sequence model; a multi-scale convolution module extracts multi-scale local features and differential features from these chemical formula features; and the multi-scale local features and differential features are input into an attention-based GRU sequence generation model, which performs regression generation of the raw materials and completes the compound retrosynthesis. The retrosynthesis method provided by the invention has clear application value for reducing compound R&D cost and improving compound R&D efficiency, and lays a model foundation for comprehensive, accurate compound retrosynthesis and for the wide application of compound synthesis.

Description

Compound retrosynthesis method and device
Technical Field
The invention relates to the technical field of retrosynthesis, and in particular to a compound retrosynthesis method and device combining multi-scale convolution and self-attention coding.
Background
Compound retrosynthesis is one of the main tools for studying compound synthesis routes: starting from a target compound, it searches for the raw materials from which the compound can be synthesized. In compound synthesis research it is difficult to derive a given complex compound directly from simple raw materials, so design assisted by compound retrosynthesis is essential. The general retrosynthetic reasoning process decomposes the compound step by step into simple raw materials and finally selects the most reasonable compound synthesis route. To extend such methods to the retrosynthesis of new compounds, the model must generalize well in the reasoning process.

Compound retrosynthesis methods generally use recurrent neural networks or attention-based Transformer models. In these methods, the target compound is first converted into a linear sequence according to the SMILES rules, and the sequence is then fed into a deep neural network that performs retrosynthetic reasoning over the feature sequence. Methods based on recurrent neural networks rely on implicit memory modeling and recursive output, so their inference is slow. They also suffer from two problems: (1) chemical molecular sequences are long, which easily causes exploding gradients; (2) stacking multiple recurrent layers easily leads to model degradation and prevents the network from learning richer features.

Recently, Transformer-based methods have gained importance. Instead of modeling hidden-state information transfer, a Transformer directly computes the relationship between all sequence features; it passes information across the chemical molecular sequence in parallel, learns quickly, and avoids problems such as exploding gradients. Although Transformer-based methods solve the long-range dependence problem of RNNs, their excessive attention to the global features of the sequence easily leads to insufficient extraction of the local features of chemical molecules.
Disclosure of Invention
The invention provides a compound retrosynthesis method and device, aimed at the problem in the prior art that Transformer models attend too heavily to the global features of a chemical molecular sequence and therefore extract the local features of chemical molecules insufficiently.
To solve this technical problem, the invention provides the following technical scheme:
In one aspect, a compound retrosynthesis method is provided. The method is applied to an electronic device and comprises the following steps:
S1: extracting the chemical formula features of the original compound through a Transformer-based sequence model;
S2: extracting multi-scale local features and differential features from the chemical formula features through a multi-scale convolution module;
S3: inputting the multi-scale local features and the differential features into an attention-based GRU sequence generation model, and performing regression generation of the raw materials to complete the compound retrosynthesis.
Optionally, in S1, extracting the chemical formula features of the original compound through a Transformer-based sequence model includes:
S11: converting the chemical formula of the original compound into a linear representation in SMILES format and tokenizing it;
S12: extracting the features of each token through a Transformer-based sequence model and representing them as the feature sequence of the chemical molecule;
S13: unifying the lengths of the feature sequences by an adaptive pooling method.
Optionally, in step S2, extracting multi-scale local features and differential features from the chemical formula features through a multi-scale convolution module includes:
S21: convolving the feature sequence with convolution kernels of different sizes in a multi-scale convolution feature fusion module to obtain several local feature maps, and fusing these local feature maps to obtain new local features;
S22: modeling the global features of the feature sequence with a Transformer-based sequence model, and subtracting the new local features from the global features to obtain the differential features.
Optionally, in step S21, convolving the feature sequence with convolution kernels of different sizes in the multi-scale convolution feature fusion module to obtain several local feature maps, and fusing the local feature maps to obtain new local features, includes:

inputting the feature sequence $X$ of the chemical molecule into a convolution module $S$ containing three convolution kernels of different sizes to generate local features over multiple receptive fields $Z = \{z_j\}_{j=1}^{J}$, where $z_j \in \mathbb{R}^{N \times d_1}$; the convolution module $S$ contains $K$ convolution kernels of size $1 \times f$, $S = \{s_k\}_{k=1}^{K}$, where $K = 3$ and $f = 2k + 1$;

stacking the $z_j$ along the scale dimension $J$ to obtain $Z \in \mathbb{R}^{J \times N \times d_1}$, and computing a local-feature attention map by matrix multiplication followed by the Softmax function, as in formula (1):

$$a_{ji} = \frac{\exp(z_j \cdot z_i)}{\sum_{i'=1}^{J} \exp(z_j \cdot z_{i'})} \tag{1}$$

where $a_{ji}$ represents the influence of the local feature map of the $i$-th scale on the local feature map of the $j$-th scale;

using the attention map $A$ and the features $Z$ to reintegrate the local features into $Z' = \{z'_j\}_{j=1}^{J}$, with $z'_j = \sum_{i=1}^{J} a_{ji} z_i$, and performing feature fusion along the dimension $J$ to obtain the new local features.
Optionally, in step S3, inputting the multi-scale local features and the differential features into the attention-based GRU sequence generation model and performing regression generation to complete the compound retrosynthesis includes:
performing regression generation of the raw materials through the attention-based GRU sequence generation model according to the multi-scale local features and the differential features;
during regression generation, selecting between the local features and the differential features at each step according to the output of the previous time step;
during training of the attention-based GRU sequence generation model, directly using the ground-truth value as the previous-step input, and during inference, generating the raw-material sequence through beam search to complete the compound retrosynthesis.
Optionally, step S3 further includes:
adding a prior constraint on the attention coefficients during training of the attention-based GRU sequence generation model: if the output at the current time step is not found in the product sequence, the attention coefficients of the differential features are constrained to sum to 1 and those of the local features to sum to 0; otherwise, the differential-feature attention coefficients sum to 0 and the local-feature attention coefficients sum to 1.
In one aspect, a compound retrosynthesis apparatus is provided. The apparatus is applied to an electronic device and includes:
a preliminary feature extraction module for extracting the chemical formula features of the original compound through a Transformer-based sequence model;
a local feature extraction module for extracting multi-scale local features and differential features from the chemical formula features through a multi-scale convolution module;
and a regression synthesis module for inputting the multi-scale local features and the differential features into an attention-based GRU sequence generation model and performing regression generation of the raw materials to complete the compound retrosynthesis.
Optionally, the preliminary feature extraction module is configured to convert the chemical formula of the original compound into a linear representation in SMILES format and tokenize it;
extract the features of each token through a Transformer-based sequence model and represent them as the feature sequence of the chemical molecule;
and unify the lengths of the feature sequences by an adaptive pooling method.
Optionally, the local feature extraction module is configured to convolve the feature sequence with convolution kernels of different sizes to obtain several local feature maps and fuse them to obtain new local features;
and model the global features of the feature sequence with a Transformer-based sequence model, subtracting the new local features from the global features to obtain the differential features.
Optionally, the local feature extraction module is configured to input the feature sequence $X$ of the chemical molecule into a convolution module $S$ containing three convolution kernels of different sizes to generate local features over multiple receptive fields $Z = \{z_j\}_{j=1}^{J}$, where $z_j \in \mathbb{R}^{N \times d_1}$; the convolution module $S$ contains $K$ convolution kernels of size $1 \times f$, $S = \{s_k\}_{k=1}^{K}$, where $K = 3$ and $f = 2k + 1$;

stack the $z_j$ along the dimension $J$ to obtain $Z \in \mathbb{R}^{J \times N \times d_1}$, and compute a local-feature attention map by matrix multiplication followed by the Softmax function, as in formula (1):

$$a_{ji} = \frac{\exp(z_j \cdot z_i)}{\sum_{i'=1}^{J} \exp(z_j \cdot z_{i'})} \tag{1}$$

where $a_{ji}$ represents the influence of the local feature map of the $i$-th scale on the local feature map of the $j$-th scale;

and use the attention map $A$ and the features $Z$ to reintegrate the local features into $Z' = \{z'_j\}_{j=1}^{J}$, performing feature fusion along the dimension $J$ to obtain the new local features.
In one aspect, an electronic device is provided, comprising a processor and a memory, the memory storing at least one instruction that is loaded and executed by the processor to implement the compound retrosynthesis method described above.
In one aspect, a computer-readable storage medium is provided, storing at least one instruction that is loaded and executed by a processor to implement the compound retrosynthesis method described above.
The technical scheme of the embodiments of the invention has at least the following beneficial effects:
In the above scheme, addressing the problems of attention-based Transformer models in the compound retrosynthesis setting, namely the excessive capture of global context information and the insufficient extraction of local features, a compound retrosynthesis method based on multi-scale coding and a self-attention mechanism is innovatively proposed. Specifically: local feature and differential feature extraction is completed with multi-scale convolution and an attention-based Transformer, and the compound retrosynthesis is completed with an attention-based GRU generation model whose attention coefficients are constrained by an introduced prior loss. The compound retrosynthesis method has clear application value for reducing compound R&D cost and improving compound R&D efficiency, and lays a model foundation for comprehensive, accurate compound retrosynthesis and the wide application of compound synthesis.
Drawings
To more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a method for the retrosynthesis of a compound according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method for the retrosynthesis of a compound according to an embodiment of the present invention;
FIG. 3 is a framework diagram of the retrosynthesis model of a compound retrosynthesis method provided in an embodiment of the present invention;
FIG. 4 is a diagram of the internal structure of the attention-based GRU of a compound retrosynthesis method provided by an embodiment of the invention;
FIG. 5 is a block diagram of a compound retrosynthesis apparatus provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
To make the technical problems, technical solutions, and advantages of the present invention clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
The embodiment of the invention provides a compound retrosynthesis method that can be implemented by an electronic device, where the electronic device may be a terminal or a server. As shown in FIG. 1, the flow chart of the compound retrosynthesis method combining multi-scale convolution and self-attention coding, the processing flow of the method may include the following steps:
S101: extracting the chemical formula features of the original compound through a Transformer-based sequence model;
S102: extracting multi-scale local features and differential features from the chemical formula features through a multi-scale convolution module;
S103: inputting the multi-scale local features and the differential features into an attention-based GRU sequence generation model, and performing regression generation of the raw materials to complete the compound retrosynthesis.
Optionally, in S101, extracting the chemical formula features of the original compound through a Transformer-based sequence model includes:
S111: converting the chemical formula of the original compound into a linear representation in SMILES format and tokenizing it;
S112: extracting the features of each token through a Transformer-based sequence model and representing them as the feature sequence of the chemical molecule;
S113: unifying the lengths of the feature sequences by an adaptive pooling method.
Optionally, in step S102, extracting multi-scale local features and differential features from the chemical formula features through a multi-scale convolution module includes:
S121: convolving the feature sequence with convolution kernels of different sizes in a multi-scale convolution feature fusion module to obtain several local feature maps, and fusing these local feature maps to obtain new local features;
S122: modeling the global features of the feature sequence with a Transformer-based sequence model, and subtracting the new local features from the global features to obtain the differential features.
Optionally, in step S121, convolving the feature sequence with convolution kernels of different sizes in the multi-scale convolution feature fusion module to obtain several local feature maps, and fusing the local feature maps to obtain new local features, includes:

inputting the feature sequence $X$ of the chemical molecule into a convolution module $S$ containing three convolution kernels of different sizes to generate local features over multiple receptive fields $Z = \{z_j\}_{j=1}^{J}$, where $z_j \in \mathbb{R}^{N \times d_1}$; the convolution module $S$ contains $K$ convolution kernels of size $1 \times f$, $S = \{s_k\}_{k=1}^{K}$, where $K = 3$ and $f = 2k + 1$;

stacking the $z_j$ along the scale dimension $J$ to obtain $Z \in \mathbb{R}^{J \times N \times d_1}$, and computing a local-feature attention map by matrix multiplication followed by the Softmax function, as in formula (1):

$$a_{ji} = \frac{\exp(z_j \cdot z_i)}{\sum_{i'=1}^{J} \exp(z_j \cdot z_{i'})} \tag{1}$$

where $a_{ji}$ represents the influence of the local feature map of the $i$-th scale on the local feature map of the $j$-th scale;

using the attention map $A$ and the features $Z$ to reintegrate the local features into $Z' = \{z'_j\}_{j=1}^{J}$, with $z'_j = \sum_{i=1}^{J} a_{ji} z_i$, and performing feature fusion along the dimension $J$ to obtain the new local features.
Optionally, in step S103, inputting the multi-scale local features and the differential features into the attention-based GRU sequence generation model and performing regression generation to complete the compound retrosynthesis includes:
performing regression generation of the raw materials through the attention-based GRU sequence generation model according to the multi-scale local features and the differential features;
during regression generation, selecting between the local features and the differential features at each step according to the output of the previous time step;
during training of the attention-based GRU sequence generation model, directly using the ground-truth value as the previous-step input, and during inference, generating the raw-material sequence through beam search to complete the compound retrosynthesis.
Optionally, step S103 further includes:
adding a prior constraint on the attention coefficients during training of the attention-based GRU sequence generation model: if the output at the current time step is not found in the product sequence, the attention coefficients of the differential features are constrained to sum to 1 and those of the local features to sum to 0; otherwise, the differential-feature attention coefficients sum to 0 and the local-feature attention coefficients sum to 1.
In the embodiment of the invention, addressing the problems of attention-based Transformer models in the compound retrosynthesis setting, namely the excessive capture of global context information and the insufficient extraction of local features, a compound retrosynthesis method based on multi-scale coding and a self-attention mechanism is innovatively provided. Specifically: local feature and differential feature extraction is completed with multi-scale convolution and an attention-based Transformer, and the compound retrosynthesis is completed with an attention-based GRU generation model whose attention coefficients are constrained by an introduced prior loss. The compound retrosynthesis method has clear application value for reducing compound R&D cost and improving compound R&D efficiency, and lays a model foundation for comprehensive, accurate compound retrosynthesis and the wide application of compound synthesis.
The embodiment of the invention provides a compound retrosynthesis method that can be implemented by an electronic device, where the electronic device may be a terminal or a server. As shown in FIG. 2, the flow chart of the compound retrosynthesis method combining multi-scale convolution and self-attention coding, the processing flow of the method may include the following steps:
S201: converting the chemical formula of the original compound into a linear representation in SMILES format and tokenizing it;
S202: extracting the features of each token through a Transformer-based sequence model and representing them as the feature sequence of the chemical molecule;
S203: unifying the lengths of the feature sequences by an adaptive pooling method.
In one possible embodiment, the invention first uses a Transformer-based sequence model to extract the chemical formula features of the compound. The chemical formula is first converted into a linear representation in SMILES format and tokenized; the sequence model extracts the features of each token; and an adaptive pooling method then unifies the lengths of the compounds' chemical molecular sequences.
In one possible embodiment, because the feature sequences of different chemical molecules have different lengths, the target length of the chemical molecular sequence is defined as $N$ in practice, and adaptive pooling then unifies the length of each chemical molecular sequence, yielding $X \in \mathbb{R}^{N \times d_1}$, where $d_1$ denotes the feature dimension. A toy sketch of this preprocessing stage follows.
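The sketch below illustrates S201 to S203 in PyTorch: a regex tokenizer of the kind commonly used for SMILES, a small Transformer encoder, and adaptive average pooling to a fixed length N. It is a minimal illustration, not the patent's implementation; the regex pattern, vocabulary size, model dimensions, and the omission of positional encoding are all assumptions made here for brevity.

```python
import re
import torch
import torch.nn as nn

# Regex in the style commonly used to tokenize SMILES strings (an assumption;
# the patent does not specify its tokenizer).
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|Si|@@|[BCNOPSFIbcnops]|[0-9]|\(|\)|=|#|\+|-|/|\\|\.|%[0-9]{2})"
)

def tokenize(smiles: str):
    """Split a SMILES string into chemically meaningful tokens."""
    return SMILES_TOKEN.findall(smiles)

class SmilesEncoder(nn.Module):
    """Transformer encoding of the token sequence followed by adaptive
    pooling to a fixed target length N (positional encoding omitted)."""
    def __init__(self, vocab_size=100, d_model=256, n=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.pool = nn.AdaptiveAvgPool1d(n)   # unifies variable lengths to N

    def forward(self, token_ids):                 # token_ids: (B, L)
        h = self.encoder(self.embed(token_ids))   # (B, L, d_model)
        # pool over the sequence axis, giving X in R^{N x d_1}
        return self.pool(h.transpose(1, 2)).transpose(1, 2)  # (B, N, d_model)

print(tokenize("CC(=O)Oc1ccccc1C(=O)O"))                      # aspirin as a toy input
print(SmilesEncoder()(torch.randint(0, 100, (2, 37))).shape)  # (2, 64, 256)
```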
S204: convolving the feature sequence with convolution kernels of different sizes in a multi-scale convolution feature fusion module to obtain several local feature maps, and fusing these local feature maps to obtain new local features;
In the embodiment of the invention, since multi-scale semantic information of the chemical molecular feature sequence must be extracted, a coding method combining multi-scale convolution with an attention mechanism is introduced to extract multi-scale local and differential features. First, a multi-scale convolution feature fusion module convolves the chemical molecular sequence features with kernels of different sizes, extracting explicit local features from the product molecular sequence.
In one possible implementation, a Transformer-based model requires every element in the sequence to attend to global context information, so the learned features over-attend to global information and lack a detailed description of local features. Extracting multi-scale local and differential features enhances the local expressiveness of the chemical molecular sequence features. The multi-scale local feature extraction proceeds as follows: convolution kernels of different sizes are first applied to the chemical molecular sequence features to obtain several local feature maps, which are then fused.
In one possible embodiment, the chemical molecular feature $X$ is input into a convolution module $S$ containing three convolution kernels of different sizes to generate local features over multiple receptive fields $Z = \{z_j\}_{j=1}^{J}$, where $z_j \in \mathbb{R}^{N \times d_1}$; the convolution module $S$ contains $K$ convolution kernels of size $1 \times f$, $S = \{s_k\}_{k=1}^{K}$, where $K = 3$ and $f = 2k + 1$. A sketch of this module follows.
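A minimal PyTorch sketch of the convolution module S is given below, assuming the three kernel sizes f = 3, 5, 7 implied by f = 2k + 1, and same-padding so that every scale keeps the sequence length N; the channel width is an illustrative assumption.

```python
import torch
import torch.nn as nn

class MultiScaleConv(nn.Module):
    """Three 1-D convolution branches (kernel sizes 3, 5, 7) over the token
    axis of X in R^{N x d_1}, stacked along a new scale dimension J."""
    def __init__(self, d_model=256, K=3):
        super().__init__()
        # same-padding keeps every scale's output at length N
        self.branches = nn.ModuleList(
            [nn.Conv1d(d_model, d_model, kernel_size=2 * k + 1, padding=k)
             for k in range(1, K + 1)]
        )

    def forward(self, x):                     # x: (B, N, d_model)
        x = x.transpose(1, 2)                 # Conv1d expects (B, C, N)
        # z_j: local features under the j-th receptive field
        zs = [branch(x).transpose(1, 2) for branch in self.branches]
        return torch.stack(zs, dim=1)         # Z: (B, J, N, d_model), J = K

Z = MultiScaleConv()(torch.randn(2, 64, 256))
print(Z.shape)  # torch.Size([2, 3, 64, 256])
```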
in one possible embodiment, the local features z extracted under different sizes of receptive fields are taken into account j The local structural information captured is different and a mechanism of attention is introduced to fuse the final local features. For is to
Figure BDA0003952939700000095
Stacking in dimension J to obtain
Figure BDA0003952939700000096
And performing matrix multiplication and Softmax function according to the following formula (1) in combination to calculate a local feature attention diagram A epsilon R J×J :
Figure BDA0003952939700000101
Wherein, a ji Representing the influence of the local feature map of the ith scale on the local feature map of the jth scale;
attention is sought to re-integrate local features from feature A and feature Z
Figure BDA0003952939700000102
And performing feature fusion on the dimension J to obtain a new local feature.
In a possible implementation, to let the network adaptively capture local features at multiple sizes, a set of learnable parameters $\alpha = \{\alpha_j\}_{j=1}^{J}$ is introduced, and feature fusion along the dimension $J$ gives the final local feature encoding $X_{loc} \in \mathbb{R}^{N \times d_1}$, as shown in equation (2):

$$X_{loc} = \sum_{j=1}^{J} \alpha_j \, z'_j \tag{2}$$

where each $\alpha_j$ is learnable and initialized to 1. A sketch of this fusion follows.
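The following sketch combines equations (1) and (2): attention across the J scales followed by the learnable weighted fusion. The flattening of each z_j before the dot products is an assumption made here; the patent states only that the attention map is obtained by matrix multiplication and Softmax.

```python
import torch
import torch.nn as nn

class ScaleAttentionFusion(nn.Module):
    """Attention over the J scales (eq. (1)) and learnable fusion (eq. (2))."""
    def __init__(self, J=3):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(J))  # alpha_j, initialized to 1

    def forward(self, Z):                     # Z: (B, J, N, d)
        B, J, N, d = Z.shape
        flat = Z.reshape(B, J, N * d)
        # a_ji = softmax_i(z_j . z_i): influence of scale i on scale j, eq. (1)
        A = torch.softmax(flat @ flat.transpose(1, 2), dim=-1)  # (B, J, J)
        Z_prime = (A @ flat).reshape(B, J, N, d)  # reintegrated local features
        # eq. (2): fuse along the scale dimension J with learnable alpha
        X_loc = (self.alpha.view(1, J, 1, 1) * Z_prime).sum(dim=1)
        return X_loc                          # (B, N, d)

X_loc = ScaleAttentionFusion()(torch.randn(2, 3, 64, 256))
print(X_loc.shape)  # torch.Size([2, 64, 256])
```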
S205: modeling the global features of the feature sequence with a Transformer-based sequence model, and subtracting the new local features from the global features to obtain the differential features.
In one possible embodiment, to satisfy both the precise representation of the chemical molecular sequence and the regression generation of the raw materials, the learned local features must be further refined to obtain the differential features. Global feature modeling is first performed on the chemical molecular sequence with a Transformer sequence model to obtain its global representation $X_{glb} \in \mathbb{R}^{N \times d_1}$; the differential features are then obtained as the difference between the global and local features, $X_{diff} = X_{glb} - X_{loc}$. A short sketch of this step follows.
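A short sketch of the differential-feature step, assuming a generic Transformer encoder for the global branch; the encoder depth and dimensions are illustrative.

```python
import torch
import torch.nn as nn

# Global branch: a Transformer models X_glb, and X_diff = X_glb - X_loc
# isolates the sequence information that the local branch did not capture.
layer = nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True)
global_encoder = nn.TransformerEncoder(layer, num_layers=2)

X = torch.randn(2, 64, 256)      # pooled sequence features from step 1
X_loc = torch.randn(2, 64, 256)  # fused multi-scale local features
X_glb = global_encoder(X)        # global representation of the sequence
X_diff = X_glb - X_loc           # differential features
```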
S206: inputting the multi-scale local features and the differential features into an attention-based GRU sequence generation model, and performing regression generation of the raw materials to complete the compound retrosynthesis.
In one possible embodiment, against the research background of compound retrosynthesis, the invention addresses the problems of compound representation and raw-material regression modeling. FIG. 3 shows the framework of the retrosynthesis model. According to the multi-scale local features and the differential features, regression generation of the raw materials is carried out through an attention-based GRU sequence generation model; the GRU (Gated Recurrent Unit) is a variant proposed to address the RNN gradient problem.
During generation, each step selects between the local features and the differential features according to the output of the previous time step; during training, the previous-step input directly uses the ground-truth value, and during inference the raw-material sequence is generated through beam search to complete the compound retrosynthesis. A generic sketch of the beam search follows.
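Beam search itself is standard; the sketch below shows only the inference loop, assuming some per-step decoder decode_step(prev_token, state) that returns next-token logits and an updated state (one such step is sketched after equation (3) below). The beam width, special-token ids, and the toy decoder are illustrative assumptions.

```python
import torch

def beam_search(decode_step, h0, bos=1, eos=2, width=5, max_len=128):
    """Keep the `width` highest log-probability hypotheses at each step."""
    beams = [([bos], 0.0, h0)]                     # (tokens, log-prob, state)
    for _ in range(max_len):
        candidates = []
        for tokens, score, h in beams:
            if tokens[-1] == eos:                  # finished hypotheses persist
                candidates.append((tokens, score, h))
                continue
            logits, h_new = decode_step(torch.tensor([tokens[-1]]), h)
            logp = torch.log_softmax(logits, dim=-1).squeeze(0)
            top = torch.topk(logp, width)
            for lp, tok in zip(top.values, top.indices):
                candidates.append((tokens + [tok.item()], score + lp.item(), h_new))
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:width]
        if all(b[0][-1] == eos for b in beams):
            break
    return beams[0][0]                             # best raw-material token ids

# toy decoder: ignores state and returns random logits over a 20-token vocabulary
toy = lambda tok, h: (torch.randn(1, 20), h)
print(beam_search(toy, h0=None, max_len=8))
```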
In one possible embodiment, an attention-based GRU sequence generation model is introduced to generate the raw materials. FIG. 4 shows the internal structure of the attention-based GRU. At each generation step, an attention mechanism selects between the local features and the differential features according to the output of the previous time step. During training, a prior constraint is added to the attention coefficients: if the output at the current time step is not found in the product sequence, the attention coefficients of the differential features are constrained to sum to 1 and those of the local features to sum to 0; otherwise, the differential-feature attention coefficients sum to 0 and the local-feature attention coefficients sum to 1.
In one possible embodiment, beam search is used at test time over the sequences generated by regression. Specifically, the learned local features $X_{loc}$, the differential features $X_{diff}$, and the pooled chemical molecular features $X \in \mathbb{R}^{N \times d_1}$ learned in step 1 are input into an attention-based sequence model to compute the attention factor vector $\beta$:

$$\beta = \mathrm{softmax}\big(f(W, [X_{loc}; X_{diff}])\big) \tag{3}$$

where $W$ represents the token features of the raw-material sequence, mapped in the same way as the token features of the product sequence in step 1, and $f(\cdot)$ is a nonlinear function measuring the correlation between features. The attention factors then weight $X_{loc}$ and $X_{diff}$, and regression proceeds through the attention-based GRU sequence generation model. Finally, end-to-end training of the retrosynthesis model is completed according to the Top-n prediction evaluation metric. A sketch of this decoding step and of the prior constraint follows.
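The sketch below gives one plausible reading of equation (3) and the prior constraint, realizing f(.) as a learned bilinear correlation score and the constraint as a squared-error penalty on the attention mass of each feature group; these concrete choices, along with all dimensions, are assumptions made here for illustration.

```python
import torch
import torch.nn as nn

class AttentionGRUStep(nn.Module):
    """One decoding step: attend over [X_loc; X_diff] conditioned on the
    previous output token (eq. (3)), then update the GRU state."""
    def __init__(self, d=256, vocab_size=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d)
        self.score = nn.Bilinear(d, d, 1)   # f(W, .): feature-correlation score
        self.cell = nn.GRUCell(d, d)
        self.out = nn.Linear(d, vocab_size)

    def forward(self, prev_token, h, X_loc, X_diff):
        w = self.embed(prev_token)                       # previous-step output
        feats = torch.cat([X_loc, X_diff], dim=1)        # (B, 2N, d)
        q = w.unsqueeze(1).repeat(1, feats.size(1), 1)
        beta = torch.softmax(self.score(q, feats).squeeze(-1), dim=-1)  # eq. (3)
        ctx = (beta.unsqueeze(-1) * feats).sum(dim=1)    # attended context
        h = self.cell(ctx, h)
        return self.out(h), h, beta

def prior_constraint_loss(beta, in_product):
    """Prior constraint on attention mass: when the current output token
    appears in the product sequence (in_product == 1), the local features
    should carry all attention; otherwise the differential features should."""
    N = beta.size(1) // 2
    local_mass = beta[:, :N].sum(dim=1)
    diff_mass = beta[:, N:].sum(dim=1)
    return ((local_mass - in_product) ** 2
            + (diff_mass - (1.0 - in_product)) ** 2).mean()

step = AttentionGRUStep()
X_loc, X_diff = torch.randn(2, 64, 256), torch.randn(2, 64, 256)
logits, h, beta = step(torch.tensor([5, 7]), torch.zeros(2, 256), X_loc, X_diff)
loss = prior_constraint_loss(beta, in_product=torch.tensor([1.0, 0.0]))
```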
In the embodiment of the invention, the overall design is divided into three steps. First, a Transformer-based sequence model extracts the sequence features of the chemical molecule. Second, a multi-scale convolution module extracts the local features of the chemical molecular sequence, and the differential features are obtained by combining them with the global features extracted by the attention-based Transformer. Third, an attention-based GRU sequence generation model is designed, a prior-constraint method for the attention coefficients is introduced, and regression generation of the raw materials is learned under the Top-n evaluation metric.
In the embodiment of the invention, addressing the problems of attention-based Transformer models in the compound retrosynthesis setting, namely the excessive capture of global context information and the insufficient extraction of local features, a compound retrosynthesis method based on multi-scale coding and a self-attention mechanism is innovatively provided. Specifically: local feature and differential feature extraction is completed with multi-scale convolution and an attention-based Transformer, and the compound retrosynthesis is completed with an attention-based GRU generation model whose attention coefficients are constrained by an introduced prior loss. The compound retrosynthesis method has clear application value for reducing compound R&D cost and improving compound R&D efficiency, and lays a model foundation for comprehensive, accurate compound retrosynthesis and the wide application of compound synthesis.
In the embodiment of the invention, a compound retrosynthesis model combining multi-scale convolution and self-attention coding is innovatively proposed, and the compound retrosynthesis task is completed on the basis of this model. The work is inspired by research on local feature extraction from chemical molecular sequences: when a chemical molecular sequence is modeled by a deep learning model, existing deep sequence models tend to summarize global information while neglecting the coding of local features. In the modeling process of the invention, the chemical formula is first converted into sequence form by the SMILES rules, and features are extracted with a Transformer-based sequence model. Considering the locality of chemical molecular information coding, a multi-scale convolution module and an attention-based Transformer module are introduced. To complete the retrosynthesis task, a raw-material generation learning method based on an attention GRU and a prior-loss constraint on the attention coefficients is introduced, so that the decomposed features contain only their own specific structural information and the initial expressions of the different raw materials are generated accurately. The resulting formula information of the retrosynthetic raw materials is accurate and complete, and the method is broadly applicable. Compared with traditional recurrent neural networks and attention-based Transformer models, the compound retrosynthesis model based on multi-scale information coding and the attention mechanism effectively avoids inaccurate feature extraction and incomplete information expression, and therefore has important theoretical and application value.
FIG. 5 is a block diagram illustrating a compound retrosynthesis apparatus according to an exemplary embodiment. Referring to FIG. 5, the apparatus 300 includes:
a preliminary feature extraction module 310 for extracting the chemical formula features of the original compound through a Transformer-based sequence model;
a local feature extraction module 320 for extracting multi-scale local features and differential features from the chemical formula features through a multi-scale convolution module;
and a regression synthesis module 330 for inputting the multi-scale local features and the differential features into an attention-based GRU sequence generation model and performing regression generation of the raw materials to complete the compound retrosynthesis.
Optionally, the preliminary feature extraction module 310 is configured to convert the chemical formula of the original compound into a linear representation in SMILES format and tokenize it;
extract the features of each token through a Transformer-based sequence model and represent them as the feature sequence of the chemical molecule;
and unify the lengths of the feature sequences by an adaptive pooling method.
Optionally, the local feature extraction module 320 is configured to convolve the feature sequence with convolution kernels of different sizes to obtain several local feature maps and fuse them to obtain new local features;
and model the global features of the feature sequence with a Transformer-based sequence model, subtracting the new local features from the global features to obtain the differential features.
Optionally, the local feature extraction module 320 is configured to input the feature sequence $X$ of the chemical molecule into a convolution module $S$ containing three convolution kernels of different sizes to generate local features over multiple receptive fields $Z = \{z_j\}_{j=1}^{J}$, where $z_j \in \mathbb{R}^{N \times d_1}$; the convolution module $S$ contains $K$ convolution kernels of size $1 \times f$, $S = \{s_k\}_{k=1}^{K}$, where $K = 3$ and $f = 2k + 1$;

stack the $z_j$ along the dimension $J$ to obtain $Z \in \mathbb{R}^{J \times N \times d_1}$, and compute a local-feature attention map by matrix multiplication followed by the Softmax function, as in formula (1):

$$a_{ji} = \frac{\exp(z_j \cdot z_i)}{\sum_{i'=1}^{J} \exp(z_j \cdot z_{i'})} \tag{1}$$

where $a_{ji}$ represents the influence of the local feature map of the $i$-th scale on the local feature map of the $j$-th scale;

and use the attention map $A$ and the features $Z$ to reintegrate the local features into $Z' = \{z'_j\}_{j=1}^{J}$, performing feature fusion along the dimension $J$ to obtain the new local features.
Optionally, the regression synthesis module 330 is configured to perform regression generation of the raw materials through the attention-based GRU sequence generation model according to the multi-scale local features and the differential features;
during regression generation, select between the local features and the differential features at each step according to the output of the previous time step;
and during training of the attention-based GRU sequence generation model, directly use the ground-truth value as the previous-step input, generating the raw-material sequence through beam search during inference to complete the compound retrosynthesis.
Optionally, a prior constraint is added to the attention coefficients during training of the attention-based GRU sequence generation model: if the output at the current time step is not found in the product sequence, the attention coefficients of the differential features are constrained to sum to 1 and those of the local features to sum to 0; otherwise, the differential-feature attention coefficients sum to 0 and the local-feature attention coefficients sum to 1.
In the embodiment of the invention, addressing the problems of attention-based Transformer models in the compound retrosynthesis setting, namely the excessive capture of global context information and the insufficient extraction of local features, a compound retrosynthesis method based on multi-scale coding and a self-attention mechanism is innovatively provided. Specifically: local feature and differential feature extraction is completed with multi-scale convolution and an attention-based Transformer, and the compound retrosynthesis is completed with an attention-based GRU generation model whose attention coefficients are constrained by an introduced prior loss. The compound retrosynthesis method has clear application value for reducing compound R&D cost and improving compound R&D efficiency, and lays a model foundation for comprehensive, accurate compound retrosynthesis and the wide application of compound synthesis.
FIG. 6 is a schematic structural diagram of an electronic device 400 according to an embodiment of the present invention. The electronic device 400 may vary considerably in configuration or performance and may include one or more processors (CPUs) 401 and one or more memories 402, where the memory 402 stores at least one instruction that is loaded and executed by the processor 401 to implement the following steps of the compound retrosynthesis method:
S1: extracting the chemical formula features of the original compound through a Transformer-based sequence model;
S2: extracting multi-scale local features and differential features from the chemical formula features through a multi-scale convolution module;
S3: inputting the multi-scale local features and the differential features into an attention-based GRU sequence generation model, and performing regression generation of the raw materials to complete the compound retrosynthesis.
In an exemplary embodiment, a computer-readable storage medium is also provided, such as a memory containing instructions executable by a processor in a terminal to perform the compound retrosynthesis method described above. For example, the computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
It will be understood by those skilled in the art that all or part of the steps of the above embodiments may be implemented in hardware or by a program instructing related hardware, where the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc.
The above description covers only preferred embodiments of the present invention and is not intended to limit the invention; any modifications, equivalent substitutions, improvements, and the like made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A compound retrosynthesis method, characterized by comprising the following steps:
S1: extracting the chemical formula features of the original compound through a Transformer-based sequence model;
S2: extracting multi-scale local features and differential features from the chemical formula features through a multi-scale convolution module;
S3: inputting the multi-scale local features and the differential features into an attention-based GRU sequence generation model, and performing regression generation of the raw materials to complete the compound retrosynthesis.
2. The method of claim 1, wherein in S1, extracting the chemical formula features of the original compound through a Transformer-based sequence model comprises:
S11: converting the chemical formula of the original compound into a linear representation in SMILES format and tokenizing it;
S12: extracting the features of each token through a Transformer-based sequence model and representing them as the feature sequence of the chemical molecule;
S13: unifying the lengths of the feature sequences by an adaptive pooling method.
3. The method according to claim 2, wherein in step S2, extracting multi-scale local features and differential features from the chemical formula features through a multi-scale convolution module comprises:
S21: convolving the feature sequence with convolution kernels of different sizes in a multi-scale convolution feature fusion module to obtain several local feature maps, and fusing these local feature maps to obtain new local features;
S22: modeling the global features of the feature sequence with a Transformer-based sequence model, and subtracting the new local features from the global features to obtain the differential features.
4. The method according to claim 3, wherein in step S21, convolving the feature sequence with convolution kernels of different sizes in the multi-scale convolution feature fusion module to obtain several local feature maps, and fusing the local feature maps to obtain new local features, comprises:

inputting the feature sequence $X$ of the chemical molecule into a convolution module $S$ containing three convolution kernels of different sizes to generate local features over multiple receptive fields $Z = \{z_j\}_{j=1}^{J}$, where $J = 3$ and $z_j \in \mathbb{R}^{N \times d_1}$; the convolution module $S$ contains $K$ convolution kernels of size $1 \times f$, $S = \{s_k\}_{k=1}^{K}$, where $K = 3$ and $f = 2k + 1$;

stacking the $z_j$ along the dimension $J$ to obtain $Z \in \mathbb{R}^{J \times N \times d_1}$, and computing a local-feature attention map by matrix multiplication followed by the Softmax function, as in formula (1):

$$a_{ji} = \frac{\exp(z_j \cdot z_i)}{\sum_{i'=1}^{J} \exp(z_j \cdot z_{i'})} \tag{1}$$

where $a_{ji}$ represents the influence of the local feature map of the $i$-th scale on the local feature map of the $j$-th scale;

using the attention map $A$ and the features $Z$ to reintegrate the local features into $Z' = \{z'_j\}_{j=1}^{J}$, and performing feature fusion along the dimension $J$ to obtain the new local features.
5. The method according to claim 1, wherein in step S3, inputting the multi-scale local features and the differential features into the attention-based GRU sequence generation model and performing regression generation to complete the compound retrosynthesis comprises:
performing regression generation of the raw materials through the attention-based GRU sequence generation model according to the multi-scale local features and the differential features;
during regression generation, selecting between the local features and the differential features at each step according to the output of the previous time step;
during training of the attention-based GRU sequence generation model, directly using the ground-truth value as the previous-step input, and during inference, generating the raw-material sequence through beam search to complete the compound retrosynthesis.
6. The method according to claim 5, wherein step S3 further comprises:
adding a prior constraint on the attention coefficients during training of the attention-based GRU sequence generation model: if the output at the current time step is not found in the product sequence, the attention coefficients of the differential features are constrained to sum to 1 and those of the local features to sum to 0; otherwise, the differential-feature attention coefficients sum to 0 and the local-feature attention coefficients sum to 1.
7. A compound retrosynthesis apparatus, suitable for the method according to any one of claims 1 to 6, the apparatus comprising:
a preliminary feature extraction module for extracting the chemical formula features of the original compound through a Transformer-based sequence model;
a local feature extraction module for extracting multi-scale local features and differential features from the chemical formula features through a multi-scale convolution module;
and a regression synthesis module for inputting the multi-scale local features and the differential features into an attention-based GRU sequence generation model and performing regression generation of the raw materials to complete the compound retrosynthesis.
8. The apparatus of claim 7, wherein the preliminary feature extraction module is configured to convert the chemical formula of the original compound into a linear representation in SMILES format and tokenize it;
extract the features of each token through a Transformer-based sequence model and represent them as the feature sequence of the chemical molecule;
and unify the lengths of the feature sequences by an adaptive pooling method.
9. The apparatus according to claim 8, wherein the local feature extraction module is configured to convolve the feature sequence with convolution kernels of different sizes in a multi-scale convolution feature fusion module to obtain several local feature maps, and fuse them to obtain new local features;
and model the global features of the feature sequence with a Transformer-based sequence model, subtracting the new local features from the global features to obtain the differential features.
10. The apparatus of claim 9, wherein the local feature extraction module is configured to input the feature sequence $X$ of the chemical molecule into a convolution module $S$ containing three convolution kernels of different sizes to generate local features over multiple receptive fields $Z = \{z_j\}_{j=1}^{J}$, where $J = 3$ and $z_j \in \mathbb{R}^{N \times d_1}$; the convolution module $S$ contains $K$ convolution kernels of size $1 \times f$, $S = \{s_k\}_{k=1}^{K}$, where $K = 3$ and $f = 2k + 1$;

stack the $z_j$ along the dimension $J$ to obtain $Z \in \mathbb{R}^{J \times N \times d_1}$, and compute a local-feature attention map by matrix multiplication followed by the Softmax function, as in formula (1):

$$a_{ji} = \frac{\exp(z_j \cdot z_i)}{\sum_{i'=1}^{J} \exp(z_j \cdot z_{i'})} \tag{1}$$

where $a_{ji}$ represents the influence of the local feature map of the $i$-th scale on the local feature map of the $j$-th scale;

and use the attention map $A$ and the features $Z$ to reintegrate the local features into $Z' = \{z'_j\}_{j=1}^{J}$, performing feature fusion along the dimension $J$ to obtain the new local features.
CN202211454720.6A (priority and filing date 2022-11-21): Compound retrosynthesis method and device. Active. Granted as CN115761250B.

Priority Applications (1)

CN202211454720.6A (granted as CN115761250B): priority date 2022-11-21, filing date 2022-11-21, "Compound retrosynthesis method and device"

Applications Claiming Priority (1)

CN202211454720.6A (granted as CN115761250B): priority date 2022-11-21, filing date 2022-11-21, "Compound retrosynthesis method and device"

Publications (2)

CN115761250A (en): published 2023-03-07
CN115761250B (en): published 2023-10-10

Family

ID: 85333446

Family Applications (1)

CN202211454720.6A (active): Compound retrosynthesis method and device; priority date 2022-11-21, filing date 2022-11-21

Country Status (1)

CN: CN115761250B

Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111524557A (en) * 2020-04-24 2020-08-11 腾讯科技(深圳)有限公司 Inverse synthesis prediction method, device, equipment and storage medium based on artificial intelligence
WO2022073452A1 (en) * 2020-10-07 2022-04-14 武汉大学 Hyperspectral remote sensing image classification method based on self-attention context network
KR20220050758A (en) * 2020-10-16 2022-04-25 현대자동차주식회사 Multi-directional scene text recognition method and system based on multidimensional attention mechanism
CN112397155A (en) * 2020-12-01 2021-02-23 中山大学 Single-step reverse synthesis method and system
CN113470758A (en) * 2021-07-06 2021-10-01 北京科技大学 Chemical reaction yield prediction method based on cause and effect discovery and multi-structure information coding
CN113907706A (en) * 2021-08-29 2022-01-11 北京工业大学 Electroencephalogram seizure prediction method based on multi-scale convolution and self-attention network
CN114220496A (en) * 2021-11-30 2022-03-22 华南理工大学 Deep learning-based inverse synthesis prediction method, device, medium and equipment
CN114360638A (en) * 2021-12-15 2022-04-15 华东师范大学 Compound-protein interaction prediction method based on deep learning
CN114530258A (en) * 2022-01-28 2022-05-24 华南理工大学 Deep learning drug interaction prediction method, device, medium and equipment
CN114743615A (en) * 2022-02-14 2022-07-12 北京科技大学 Small sample drug chemical reaction representation and automatic classification method and device
CN115047421A (en) * 2022-04-14 2022-09-13 杭州电子科技大学 Radar target identification method based on Transformer
CN114841261A (en) * 2022-04-29 2022-08-02 华南理工大学 Increment width and deep learning drug response prediction method, medium, and apparatus
CN114882430A (en) * 2022-04-29 2022-08-09 东南大学 Lightweight early fire detection method based on Transformer
CN114997176A (en) * 2022-05-19 2022-09-02 上海大学 Method, apparatus and medium for identifying descriptor of text data
CN115222998A (en) * 2022-09-15 2022-10-21 杭州电子科技大学 Image classification method

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
HE ZHAO et al.: "Review of Video Predictive Understanding: Early Action Recognition and Future Action Prediction", 《arXiv》
PHILIPPE SCHWALLER et al.: "Extraction of organic chemistry grammar from unsupervised learning of chemical reactions", 《Science Advances》
UMIT V. UCAK et al.: "Retrosynthetic reaction pathway prediction through neural machine translation of atomic environments", 《Nature Communications》
XIANLUN TANG et al.: "A Multi-scale Convolutional Attention Based GRU Network for Text Classification", 《IEEE》
ZIQIAO WANG et al.: "Cross-phenological-region crop mapping framework using Sentinel-2 time series Imagery: A new perspective for winter crops in China", 《Elsevier》
饶晓洁 et al.: "基于多层注意力和消息传递网络的药物相互作用预测方法" (Drug-drug interaction prediction based on multi-layer attention and message passing networks), 《自动化学报》 (Acta Automatica Sinica)

Also Published As

CN115761250B (en): published 2023-10-10


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant