CN114817578A - Scientific and technological thesis citation relation representation learning method, system and storage medium - Google Patents
- Publication number
- CN114817578A (application CN202210745739.XA)
- Authority
- CN
- China
- Prior art keywords: scientific, graph, paper, technological, representation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F16/382 — Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using citations
- G06N3/045 — Combinations of networks
- G06N3/047 — Probabilistic or stochastic networks
- G06N3/048 — Activation functions
- G06N3/088 — Non-supervised learning, e.g. competitive learning
- G06F2216/03 — Data mining
- Y02P90/30 — Computing systems specially adapted for manufacturing
Abstract
The invention provides a scientific and technological paper citation relation representation learning method, system and storage medium, wherein the method comprises the following steps: acquiring a relational graph of scientific and technological papers, wherein each node in the relational graph represents a scientific and technological paper and each edge represents a citation relation between papers; determining a first scientific and technological paper feature matrix and a scientific and technological paper adjacency matrix based on the relational graph; constructing a graph autoencoder; and inputting the first feature matrix and the adjacency matrix into the graph autoencoder to obtain a first embedded representation of each paper. The method enables citation relations between scientific and technological papers to be represented more accurately.
Description
Technical Field
The invention relates to the technical field of computers, and in particular to a scientific and technological paper citation relation representation learning method, system and storage medium.
Background
Graph embedding is a method of converting nodes, edges, and their features into a (lower-dimensional) vector space while preserving properties such as graph structure and information to the maximum extent.
Recent studies have shown that there are many ways to learn graph embedding representations, each at a different level of granularity. DeepWalk is a walk-based graph embedding technique; a walk is a concept from graph theory whereby a graph can be traversed by moving from one node to another, as long as they are connected by a common edge. node2vec is one of the earliest deep learning methods that attempted to learn from graph-structured data. graph2vec, a modification of node2vec, essentially learns to embed subgraphs of the input graph; these predetermined subgraphs have a set number of edges specified by the user. The resulting subgraph embeddings are then passed to a neural network for classification.
Unlike the previous embedding techniques, SDNE does not use random walks. Instead, it attempts to learn from two different measures: first-order proximity (two nodes are considered similar if they share an edge) and second-order proximity (two nodes are considered similar if they share many neighbors). LINE explicitly defines two functions, one for first-order proximity and the other for second-order proximity; the second-order approach performs significantly better than the first, suggesting that higher orders may yield a further improvement in accuracy. HARP improves on these solutions and avoids local optima through better weight initialization: it uses graph coarsening to aggregate related nodes into "super nodes", essentially a graph preprocessing step that simplifies the graph and speeds up training.
Mutual information (MI) measures the interdependence between two random variables. DGI was the earliest method to apply a mutual information constraint to graph-structured data: it maximizes the mutual information between a global graph summary and each of its nodes in order to learn information-rich node representations. However, DGI currently has two limitations: first, it ignores the interdependence between node embeddings and node attributes; second, it does not adequately mine the various relations between nodes. Existing methods therefore cannot perform better representation learning on the relational graph, and thus cannot obtain accurate embedded representations of the citation relations between scientific and technological papers. How to represent these citation relations more accurately is consequently a technical problem to be solved urgently.
Disclosure of Invention
In view of the above, the present invention provides a scientific and technological thesis citation relationship representation learning method, system and storage medium, so as to solve one or more problems in the prior art.
According to one aspect of the invention, a scientific and technological paper citation relation representation learning method is disclosed, comprising the following steps:
acquiring a relational graph of scientific and technological papers, wherein each node in the relational graph represents a scientific and technological paper and each edge represents a citation relation between papers;
determining a first scientific and technological paper feature matrix and a scientific and technological paper adjacency matrix based on the relational graph;
constructing a graph autoencoder constrained by mutual information;
inputting the first feature matrix and the adjacency matrix into the graph autoencoder to obtain a first embedded representation of the nodes;
performing a transposition operation on the first feature matrix to obtain a second scientific and technological paper feature matrix;
inputting the second feature matrix and the adjacency matrix into the graph autoencoder to obtain a second embedded representation of the nodes;
determining a mutual information constraint loss for the graph autoencoder based on the first and second embedded representations.
In some embodiments of the invention, the graph autoencoder comprises a plurality of graph convolution layers.
In some embodiments of the invention, the mutual information constraint loss function is:

L_MIC = −(1 / 2N) · ( Σ_{i=1…N} E_(X,A)[ log D(h_i, s) ] + Σ_{i=1…N} E_(X̃,A)[ log(1 − D(h̃_i, s)) ] )

wherein L_MIC is the mutual information constraint loss; N is the number of nodes in the relational graph; E_(X,A) denotes the first maximum likelihood estimate and E_(X̃,A) the second; s = Σ ε h_i is the aggregated vector, where ε is a set of weight factors; H = [h_1, …, h_N] is the first embedded representation and H̃ = [h̃_1, …, h̃_N] is the second; D(h_i, s) denotes the probability score from node representation h_i to the aggregated vector s, and D(h̃_i, s) denotes the probability score from h̃_i to the aggregated vector s.
In some embodiments of the invention, the loss function of the graph autoencoder based on mutual information constraint is:

L = E_q(Z|X,A)[ log p(A|Z) ] + α · L_MIC

wherein L_MIC is the mutual information constraint loss, α is a balance parameter, Z is the node representation, X is the first scientific and technological paper feature matrix, A is the scientific and technological paper adjacency matrix, E_q(Z|X,A) denotes the maximum likelihood estimate, q(Z|X,A) is the Gaussian distribution calculated by the graph convolution layers in the graph autoencoder, and p(A|Z) is the reconstruction (generative) distribution.
In some embodiments of the present invention, the graph autoencoder is a variational graph autoencoder, and inputting the first scientific and technological paper feature matrix and the scientific and technological paper adjacency matrix into the graph autoencoder to obtain the first embedded representation of the nodes comprises:
inputting the first scientific and technological paper feature matrix and the scientific and technological paper adjacency matrix into a graph convolutional neural network, determining a Gaussian distribution based on the graph convolutional neural network, and sampling from the determined Gaussian distribution to obtain the first embedded representation.
In some embodiments of the invention, determining the Gaussian distribution based on the graph convolutional neural network comprises:
calculating a mean and a variance based on the graph convolutional neural network.
In some embodiments of the present invention, the loss of the variational graph autoencoder includes a cross entropy term and a KL divergence term.
In some embodiments of the invention, the loss of the variational graph autoencoder based on mutual information constraint is:

L = E_q(Z|X,A)[ log p(A|Z) ] − KL[ q(Z|X,A) ‖ p(Z) ] + α · L_MIC

wherein α is a balance parameter, Z is the first embedded representation, X is the first scientific and technological paper feature matrix, A is the scientific and technological paper adjacency matrix, E_q(Z|X,A) denotes the maximum likelihood estimate, q(Z|X,A) is the Gaussian distribution calculated by the graph convolution layers in the graph autoencoder, p(A|Z) is the reconstruction distribution, and KL[·‖·] denotes the KL divergence.
According to another aspect of the present invention, a scientific paper citation relation representation learning system is also disclosed, which comprises a processor and a memory, wherein the memory stores computer instructions, the processor is used for executing the computer instructions stored in the memory, and when the computer instructions are executed by the processor, the system realizes the steps of the method according to any one of the above embodiments.
According to another aspect of the present invention, a computer-readable storage medium is also disclosed, on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the method according to any of the above embodiments.
The invention discloses a scientific and technological paper citation relation representation learning method and system, in which a first scientific and technological paper feature matrix and a scientific and technological paper adjacency matrix are determined based on a relational graph of the papers, and both are input into a graph autoencoder based on mutual information constraint, thereby obtaining a first embedded representation of each scientific and technological paper from the graph autoencoder.
In addition to the above, the disclosed scientific and technological paper citation relation representation learning method constructs a graph of scientific and technological papers according to their existing relations, obtains node representations through the proposed encoder, and builds a relation map from the learned node representations, from which deeper associations between papers can also be obtained. The disclosed variational graph autoencoder based on mutual information constraint applies a mutual-information-maximization constraint to graph-structured scientific and technological paper data: it attaches the constraint of maximizing agreement between the global and local representations to the graph encoder and jointly optimizes the autoencoder loss and the mutual information, so that the learned node representations capture richer global graph properties and node neighborhood information, thereby improving the quality of the paper representations.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
It will be appreciated by those skilled in the art that the objects and advantages that can be achieved with the present invention are not limited to the specific details set forth above, and that these and other objects that can be achieved with the present invention will be more clearly understood from the detailed description that follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention. For purposes of illustrating and describing some portions of the present invention, corresponding parts may be exaggerated in the drawings, i.e., may be larger relative to other components in an exemplary device actually made according to the present invention. In the drawings:
fig. 1 is a flowchart illustrating a scientific and technological paper citation relationship representation learning method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of an overall architecture of a variational diagram auto-encoder based on a scientific and technological paper association diagram network and mutual information constraint according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
It should be noted that, in order to avoid obscuring the present invention with unnecessary details, only the structures and/or processing steps closely related to the scheme according to the present invention are shown in the drawings, and other details not closely related to the present invention are omitted.
It should be emphasized that the term "comprises/comprising", when used herein, specifies the presence of stated features, elements, steps or components, but does not preclude the presence or addition of one or more other features, elements, steps or components.
Existing unsupervised or self-supervised graph representation learning is a general technique for representing graph-structured data in a low-dimensional space, and is of great significance for promoting graph data mining tasks. The main difficulty lies in how to encode the original node features and edge associations of graph-structured data into a low-dimensional embedding space. As a widely used deep model, the graph autoencoder learns graph embeddings in an unsupervised manner by minimizing the reconstruction error of the graph data, but its reconstruction loss ignores the distribution of the latent representations, resulting in a poor embedding effect. The inventors found in their research that maximizing the mutual information between the local node representations and the global representation of graph-structured data enables the learned node representations to summarize the information shared between nodes and to be used for downstream tasks. The invention therefore provides a scientific and technological paper citation relation representation learning method and system that jointly optimize the loss of the graph autoencoder and the mutual information, so that the learned node representations capture richer information and node interactions, improving the quality and accuracy of the node representations.
Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. In the drawings, the same reference numerals denote the same or similar parts, or the same or similar steps.
Fig. 1 is a flowchart of a scientific and technological paper citation relation representation learning method according to an embodiment of the present invention. As shown in fig. 1, the method comprises at least steps S10 to S40.
Step S10: acquiring a relational graph of scientific and technological papers, wherein each node in the relational graph represents a scientific and technological paper and each edge represents a citation relation between papers.
In this step, the scientific and technological papers are first constructed into a relational graph according to their existing relations. Illustratively, a relational graph G formed by N nodes is defined as G = (V, E), where V denotes the set of nodes, E denotes the corresponding set of edges, V = {n_1, …, n_N}, and n_i denotes the i-th node (or vertex) of G. In this step, the nodes represent scientific and technological papers, and the edges between nodes represent the citation relations between them.
Step S20: a first scientific paper feature matrix and a scientific paper adjacency matrix are determined based on the relational graph.
The adjacency matrix is denoted as A, and the first scientific and technological paper feature matrix is denoted as X. A is an N×N matrix, where N is the number of nodes in the relational graph, i.e. A ∈ R^(N×N). In the adjacency matrix A, A_ij = 1 if node i and node j are connected, and A_ij = 0 if they are not; every element of A is therefore 0 or 1. Each node n_i ∈ V is associated with a feature vector x_i ∈ R^F; all the feature vectors form the feature matrix X = [x_1, …, x_N], where X ∈ R^(N×F), N is the number of nodes in the relational graph, and F is the feature dimension.
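As a minimal sketch of step S20 (the edge list and feature values below are illustrative assumptions, not data from the patent):

```python
# Build the adjacency matrix A and a feature matrix X for a toy citation graph.
N = 4  # number of papers (nodes)
F = 3  # feature dimension
edges = [(0, 1), (0, 2), (1, 3)]  # paper i and paper j linked by a citation

# A in {0,1}^(N x N): A[i][j] = 1 iff papers i and j are connected.
A = [[0] * N for _ in range(N)]
for i, j in edges:
    A[i][j] = 1
    A[j][i] = 1  # stored symmetrically, treating citations as undirected edges

# X in R^(N x F): one F-dimensional feature vector per paper (dummy one-hot here).
X = [[1.0 if k == i % F else 0.0 for k in range(F)] for i in range(N)]
```

In practice X would hold real paper features (e.g. bag-of-words vectors), but the shapes and the 0/1 structure of A are as described above.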
Step S30: and constructing a graph automatic encoder constrained by mutual information.
Mutual information I(X, Y) measures the degree of correlation between two random variables X and Y: when X and Y are completely independent, their mutual information equals 0; otherwise, the larger I(X, Y) is, the higher the correlation between X and Y. Currently, most unsupervised learning algorithms are directed at targets already defined in the input space, whereas a "good" representation should also have strong expressive power for targets not defined in the input space. In the present invention, therefore, the encoder is trained by maximizing the mutual information between the local features and the global features of the graph-structured network. This encourages the encoder to prefer information shared by all nodes: if certain specific information (e.g. noise) exists only in certain neighborhoods, encoding it does not increase the mutual information, so the encoder is discouraged from doing so.
Step S40: and inputting the first scientific paper feature matrix and the scientific paper adjacency matrix into an automatic graph encoder to obtain a first embedded representation of the node.
In this step, the graph auto-encoder based on the mutual information constraint constructed in step S30 performs representation learning on the reference relationship of the scientific paper, and specifically inputs the first scientific paper feature matrix and the scientific paper adjacency matrix into the graph auto-encoder, thereby obtaining the first embedded representation.
In addition to the above, the scientific and technological paper citation relation representation learning method may further comprise the following steps. Step S50: performing a transposition operation on the first scientific and technological paper feature matrix to obtain a second scientific and technological paper feature matrix. Step S60: inputting the second feature matrix and the scientific and technological paper adjacency matrix into the graph autoencoder to obtain a second embedded representation of the nodes. Step S70: determining a mutual information constraint loss for the graph autoencoder based on the first and second embedded representations. In this embodiment, the first feature matrix serves as the positive sample and the second feature matrix as the negative sample; through weight sharing, the graph autoencoder obtains the first and second embedded representations from the positive and negative samples. A mutual information constraint loss function over the two embedded representations is then constructed, and based on this loss the parameters of the graph autoencoder are further optimized, so that a better latent node representation of the citation relations is obtained.
Illustratively, the mutual information constraint loss function is:

L_MIC = −(1 / 2N) · ( Σ_{i=1…N} E_(X,A)[ log D(h_i, s) ] + Σ_{i=1…N} E_(X̃,A)[ log(1 − D(h̃_i, s)) ] )

wherein L_MIC is the mutual information constraint loss; N is the number of nodes in the relational graph; E_(X,A) denotes the first maximum likelihood estimate and E_(X̃,A) the second; s = Σ ε h_i is the aggregated vector, where ε is a set of weight factors; H = [h_1, …, h_N] is the first embedded representation and H̃ = [h̃_1, …, h̃_N] is the second; D(h_i, s) denotes the probability score from node representation h_i to the aggregated vector s, and D(h̃_i, s) denotes the probability score from h̃_i to the aggregated vector s.
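This loss can be sketched in plain Python, assuming (as in DGI) uniform weight factors ε = 1/N for the readout s and a bilinear discriminator D(h, s) = σ(hᵀ W s); all helper names and toy values are illustrative assumptions:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mi_constraint_loss(H, H_neg, W):
    """L_MIC with readout s = (1/N) * sum_i h_i (uniform weight factors) and a
    bilinear discriminator D(h, s) = sigmoid(h . W s)."""
    N, K = len(H), len(H[0])
    s = [sum(h[k] for h in H) / N for k in range(K)]  # aggregated vector s
    Ws = matvec(W, s)
    pos = sum(math.log(sigmoid(dot(h, Ws))) for h in H)            # positive pairs (h_i, s)
    neg = sum(math.log(1.0 - sigmoid(dot(h, Ws))) for h in H_neg)  # negative pairs (h~_i, s)
    return -(pos + neg) / (2 * N)

rng = random.Random(0)
K = 4
H = [[rng.gauss(0, 1) for _ in range(K)] for _ in range(3)]      # first embedded representation
H_neg = [[rng.gauss(0, 1) for _ in range(K)] for _ in range(3)]  # second (corrupted) representation
W = [[1.0 if i == j else 0.0 for j in range(K)] for i in range(K)]  # identity weights for the demo
loss = mi_constraint_loss(H, H_neg, W)
```

Minimizing this loss pushes D toward 1 for positive pairs and toward 0 for negative pairs, which maximizes a lower bound on the mutual information between node and summary representations.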
In addition, the node embedding matrix learned by the graph autoencoder is denoted H = [h_1, …, h_N], where each h_i ∈ R^K is the K-dimensional embedding vector of node n_i ∈ V, so that H ∈ R^(N×K) is the node embedding matrix of the relational graph G. Here n_i denotes the i-th node of G, N is the number of nodes in the relational graph, K is the node embedding dimension, and V is the set of nodes. In this embodiment, the goal of node representation learning is to learn H such that it simultaneously preserves the structural information A of G and its feature information X. Once learning is complete, H can be used as the input feature matrix for downstream tasks such as node classification, link prediction and clustering. The node representation learning problem is thus equivalent to learning an encoding function that takes the adjacency matrix A and the feature matrix X as input and generates the node representation H, i.e. H = f(A, X).
In one embodiment, the output of each graph convolutional network (GCN) layer in the graph autoencoder is used as the input of the next layer:

H^(l+1) = σ( D̃^(−1/2) Ã D̃^(−1/2) H^(l) W^(l) )

wherein the input to the l-th convolution layer is H^(l) ∈ R^(N×D) (the initial input is H^(0) = X), N is the number of nodes in the graph, X is the input first scientific and technological paper feature matrix, and each node is represented by a D-dimensional feature vector. Ã = A + I_N is the adjacency matrix with added self-connections, where A is the input adjacency matrix and I_N is the identity matrix; D̃ is the degree matrix with D̃_ii = Σ_j Ã_ij, and Ã_ij indicates whether node i and node j are connected in Ã (1 when connected, 0 when not). W^(l) are the network parameters to be trained, and σ(·) is the corresponding activation function, here the commonly used ReLU activation function. The number of convolution layers in the graph autoencoder is not particularly limited and may be set according to actual needs.
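The layer update above can be sketched in plain Python (a minimal illustration, not the patent's implementation; the toy matrices are assumed values):

```python
import math

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def gcn_layer(A, H, W):
    """One layer: H' = ReLU(D~^(-1/2) (A + I) D~^(-1/2) H W)."""
    N = len(A)
    A_tilde = [[A[i][j] + (1 if i == j else 0) for j in range(N)] for i in range(N)]  # A + I
    d_inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in A_tilde]                       # D~^(-1/2)
    A_hat = [[A_tilde[i][j] * d_inv_sqrt[i] * d_inv_sqrt[j] for j in range(N)]
             for i in range(N)]                                                       # normalized adjacency
    Z = matmul(matmul(A_hat, H), W)
    return [[max(0.0, z) for z in row] for row in Z]  # ReLU activation

# Toy graph: two papers connected by one citation; identity features and weights.
A = [[0, 1], [1, 0]]
H0 = [[1.0, 0.0], [0.0, 1.0]]
W0 = [[1.0, 0.0], [0.0, 1.0]]
H1 = gcn_layer(A, H0, W0)  # → [[0.5, 0.5], [0.5, 0.5]]
```

For this symmetric two-node graph every normalized-adjacency entry is 0.5, so each output row mixes both nodes' features equally.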
Specifically, the method adopts a graph convolutional neural network as the encoder to obtain the potential embedded representation of the nodes; this process is expressed by the formula Z = GCN(X, A). The graph convolutional neural network is treated as a function that takes the feature matrix X and the adjacency matrix A as input; the encoded output Z ∈ R^(N×K) represents the potential embedded representations of all nodes. The graph autoencoder uses the inner product as a decoder to reconstruct the original graph adjacency matrix, as shown in the formula Â = σ(Z Zᵀ). If the learned Z is to represent the nodes well, the reconstructed adjacency matrix Â should be made as similar as possible to the original adjacency matrix A, since the adjacency matrix determines the structure of the graph. Therefore, in training the graph autoencoder, the following cross entropy can be adopted as the loss function:

L = E_q(Z|X,A)[ log p(A|Z) ]

wherein E_q(Z|X,A) denotes the maximum likelihood estimate (the expectation over q(Z|X,A)), and p(A|Z) denotes the reconstruction distribution.
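The inner-product decoder and the cross-entropy reconstruction loss described above might be sketched as follows (a self-contained illustration; the toy Z and A are assumed values):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def decode(Z):
    """Inner-product decoder: A^_ij = sigmoid(z_i . z_j), i.e. A^ = sigmoid(Z Z^T)."""
    N = len(Z)
    return [[sigmoid(dot(Z[i], Z[j])) for j in range(N)] for i in range(N)]

def reconstruction_loss(A, A_hat):
    """Mean binary cross entropy between the original A and the reconstruction A^."""
    N = len(A)
    total = 0.0
    for i in range(N):
        for j in range(N):
            p = min(max(A_hat[i][j], 1e-12), 1.0 - 1e-12)  # clamp for numerical safety
            total -= A[i][j] * math.log(p) + (1 - A[i][j]) * math.log(1.0 - p)
    return total / (N * N)

Z = [[2.0, 0.0], [0.0, 2.0]]   # toy latent representations of two papers
A_hat = decode(Z)              # off-diagonal entries are sigmoid(0) = 0.5
A = [[1, 0], [0, 1]]           # toy "original" adjacency (with self-loops)
loss = reconstruction_loss(A, A_hat)
```

Because the decoder is a symmetric inner product, Â is always symmetric, which matches an undirected relational graph.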
The graph autoencoder in the invention can be an ordinary graph autoencoder or a variational graph autoencoder. When the graph autoencoder adopted by the scientific-paper citation relation representation learning method is constrained by mutual information, the loss function of the graph autoencoder based on the mutual information constraint is:

$\mathcal{L} = -\mathbb{E}_{q(Z|X,A)}[\log p(A|Z)] + \alpha\,\mathcal{L}_{\mathrm{MIC}}$

where $\mathcal{L}_{\mathrm{MIC}}$ is the mutual information constraint loss, $\alpha$ is the balance parameter, $Z$ is the node representation, $X$ is the first scientific-paper feature matrix, $A$ is the scientific-paper adjacency matrix, $\mathbb{E}_{q(Z|X,A)}$ denotes the maximum-likelihood expectation, $q(Z|X,A)$ is the Gaussian distribution computed by the graph convolution layers of the graph autoencoder, and $p(A|Z)$ is the reconstruction distribution of the adjacency matrix.
The variational graph autoencoder adds a variational constraint on top of the graph autoencoder. In one embodiment,
inputting the first scientific-paper feature matrix and the scientific-paper adjacency matrix into the graph autoencoder to obtain the first embedded representation of the nodes comprises: inputting the first scientific-paper feature matrix and the scientific-paper adjacency matrix into a graph convolutional neural network, determining a Gaussian distribution based on the graph convolutional neural network, and sampling from the determined Gaussian distribution to obtain the first embedded representation. Determining the Gaussian distribution based on the graph convolutional neural network includes calculating a mean and a variance with the network. For the variational graph autoencoder based on the mutual information constraint, the loss comprises a cross-entropy term and a KL-divergence term.
Illustratively, in a variational graph autoencoder the node representation $Z$ is no longer produced by a deterministic function but is obtained by sampling from a (multidimensional) Gaussian distribution: the graph convolutional neural network determines the Gaussian distribution, and $Z$ is then sampled from it. A Gaussian distribution is uniquely determined by its first two moments, so only the mean and the variance need to be calculated. The variational graph autoencoder uses graph convolutional networks to calculate the mean and the variance respectively, by the formulas $\mu = \mathrm{GCN}_{\mu}(X, A)$ and $\log\sigma = \mathrm{GCN}_{\sigma}(X, A)$, where $X$ is the first scientific-paper feature matrix input to the graph autoencoder and $A$ is the input scientific-paper adjacency matrix. The first-layer parameters $W^{(0)}$ of $\mathrm{GCN}_{\mu}$ and $\mathrm{GCN}_{\sigma}$ are shared, while their second-layer parameters $W^{(1)}$ are independent of each other.
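The shared-first-layer encoder can be sketched as follows in NumPy; the function name `vgae_encode` and the toy dimensions are illustrative assumptions:

```python
import numpy as np

def normalize_adj(A):
    """Symmetrically normalized adjacency with self-loops."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def vgae_encode(X, A, W0, W_mu, W_logstd):
    """GCN_mu and GCN_sigma share the first-layer parameters W0;
    their second-layer parameters (W_mu, W_logstd) are independent."""
    A_norm = normalize_adj(A)
    H = np.maximum(A_norm @ X @ W0, 0.0)   # shared first layer, ReLU
    mu = A_norm @ H @ W_mu                 # second layer of GCN_mu
    log_std = A_norm @ H @ W_logstd        # second layer of GCN_sigma
    return mu, log_std

rng = np.random.default_rng(1)
A = np.array([[0., 1.], [1., 0.]])
X = np.eye(2)
mu, log_std = vgae_encode(X, A, rng.normal(size=(2, 4)),
                          rng.normal(size=(4, 2)), rng.normal(size=(4, 2)))
```

Both outputs have one row per node, giving the per-node mean and log standard deviation of the Gaussian.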
Furthermore, after the mean vector and the covariance matrix are obtained, $Z$ cannot be sampled directly, because the sampling operation provides no gradient information; a reparameterization is needed: $Z = \mu + \sigma \odot \epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$. The decoder of the variational graph autoencoder is the same inner product as in the graph autoencoder; the variational graph autoencoder likewise wants the reconstructed graph to be as similar as possible to the original graph, and additionally wants the distribution computed by the GCN layers to be as close as possible to the standard Gaussian distribution. The loss of the variational graph autoencoder therefore consists of both a cross-entropy term and a KL divergence. For example, the loss function of the variational graph autoencoder is $\mathcal{L} = -\mathbb{E}_{q(Z|X,A)}[\log p(A|Z)] + \mathrm{KL}\big(q(Z|X,A)\,\|\,p(Z)\big)$, where $q(Z|X,A) = \prod_{i=1}^{N} q(z_i|X,A)$ with $q(z_i|X,A) = \mathcal{N}\big(z_i \mid \mu_i, \mathrm{diag}(\sigma_i^2)\big)$. Here $Z$ is the first embedded representation, $X$ is the first scientific-paper feature matrix, $A$ is the scientific-paper adjacency matrix, $q(Z|X,A)$ is the distribution computed by the graph convolution layers, $p(A|Z)$ is the reconstruction distribution, and $p(Z) = \prod_i \mathcal{N}(z_i \mid 0, I)$ is the standard Gaussian prior.
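A small NumPy sketch of the reparameterization trick and the diagonal-Gaussian KL term follows; function names are illustrative assumptions, and the closed-form KL is the standard expression for $\mathrm{KL}(\mathcal{N}(\mu,\sigma^2)\,\|\,\mathcal{N}(0,I))$:

```python
import numpy as np

def reparameterize(mu, log_std, rng):
    """Z = mu + sigma * eps with eps ~ N(0, I): sampling with a gradient path
    through mu and sigma, which a plain sample from N(mu, sigma^2) lacks."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(log_std) * eps

def kl_to_standard_normal(mu, log_std):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over dims, averaged over nodes."""
    var = np.exp(2.0 * log_std)
    return float(0.5 * np.sum(var + mu**2 - 1.0 - 2.0 * log_std, axis=1).mean())

rng = np.random.default_rng(2)
mu = np.zeros((3, 2))
log_std = np.zeros((3, 2))
Z = reparameterize(mu, log_std, rng)
kl = kl_to_standard_normal(mu, log_std)   # 0 when q is already standard normal
```

Because `kl` vanishes exactly when the encoder outputs the prior, it acts as the "variational constraint" pulling $q(Z|X,A)$ toward the standard Gaussian.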
Further, the loss function of the variational graph autoencoder based on the mutual information constraint is:

$\mathcal{L} = -\mathbb{E}_{q(Z|X,A)}[\log p(A|Z)] + \mathrm{KL}\big(q(Z|X,A)\,\|\,p(Z)\big) + \alpha\,\mathcal{L}_{\mathrm{MIC}}$

where $\alpha$ is the balance parameter, $Z$ is the first embedded representation, $X$ is the first scientific-paper feature matrix, $A$ is the scientific-paper adjacency matrix, $\mathbb{E}_{q(Z|X,A)}$ denotes the maximum-likelihood expectation, $q(Z|X,A)$ is the Gaussian distribution computed by the graph convolution layers of the graph autoencoder, $p(A|Z)$ is the reconstruction distribution, $\mathrm{KL}(\cdot\|\cdot)$ denotes the KL divergence, and $N$ is the number of nodes in the relation graph. The mutual information constraint loss is

$\mathcal{L}_{\mathrm{MIC}} = -\frac{1}{2N}\sum_{i=1}^{N}\Big(\mathbb{E}_{(X,A)}\big[\log \mathcal{D}(h_i, s)\big] + \mathbb{E}_{(\tilde{X},A)}\big[\log\big(1 - \mathcal{D}(\tilde{h}_i, s)\big)\big]\Big)$

where $\mathbb{E}_{(X,A)}$ is the first maximum-likelihood expectation and $\mathbb{E}_{(\tilde{X},A)}$ the second, $s = \sum_i \varepsilon_i h_i$ with $\varepsilon$ a set of weight factors, $H = [h_1 \ldots h_N]$ is the first embedded representation, $\tilde{H} = [\tilde{h}_1 \ldots \tilde{h}_N]$ is the second embedded representation, $\mathcal{D}(h_i, s)$ is the probability score from node $h_i$ to the summary vector $s$, and $\mathcal{D}(\tilde{h}_i, s)$ is the probability score from node $\tilde{h}_i$ to the summary vector $s$.
For the variational graph autoencoder based on the mutual information constraint in the above embodiment, the embedding space of the node representations is optimized on top of the variational-graph-autoencoder framework; adding the mutual information constraint allows the latent node representations to be learned better. The overall architecture diagram is shown in Fig. 2.
In particular, the variational graph autoencoder based on the mutual information constraint is constructed from its loss function; the first scientific-paper feature matrix serves as the positive sample and the second scientific-paper feature matrix as the negative sample, and the mutual-information-constrained variational graph autoencoder is trained on these positive and negative samples. The original feature matrix $X$ is the first scientific-paper feature matrix; the negative sample $\tilde{X}$ is obtained by shuffling the rows of $X$ while keeping the adjacency matrix $A$ unchanged. In this case, the negative-sample graph and the positive-sample graph are isomorphic.
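The row-shuffling corruption that produces the negative sample can be sketched in one function; the name `corrupt_features` is an assumption for illustration:

```python
import numpy as np

def corrupt_features(X, rng):
    """Negative sample X~: permute the rows of X (i.e. reassign node features
    to different nodes); the adjacency matrix A is kept unchanged."""
    return X[rng.permutation(X.shape[0])]

rng = np.random.default_rng(3)
X = np.arange(12.0).reshape(4, 3)   # 4 papers, 3-dimensional features
X_neg = corrupt_features(X, rng)
# X_neg contains the same rows as X, generally in a different order.
```

Since only the feature-to-node assignment changes while the edges stay fixed, the corrupted graph has the same structure as the original, matching the isomorphism remark above.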
Furthermore, after the positive- and negative-sample graphs are encoded by the constructed mutual-information-constrained variational graph autoencoder, the node representations of the positive and negative samples, $H$ and $\tilde{H}$, are obtained. Here, the invention seeks node (i.e., local) representations that capture the global information content of the entire graph, represented by a summary vector $s$: $H$ is fed into a readout function and summarized as $s = \sum_{i=1}^{N} \varepsilon_i h_i$, where $N$ is the number of nodes and $\varepsilon$ is a set of weight factors representing the importance of each node to the whole graph. After initialization, $\varepsilon$ participates in the gradient back-propagation of the whole model and is continuously optimized, assigning more weight to nodes of high importance so as to learn $s$ better.
This embodiment introduces a mutual information maximization strategy: the probability that a positive-sample node representation belongs to the summary vector $s$ should be as large as possible, while the probability that a negative-sample node representation belongs to $s$ should be as small as possible. MIC-VGAE (the variational graph autoencoder based on the mutual information constraint) employs a discriminator $\mathcal{D}$ as a proxy for maximizing local-global mutual information, where $\mathcal{D}(h_i, s)$ scores the probability that the node representation $h_i$ belongs to the summary vector $s$; the larger the score, the higher that probability. The discriminator is the bilinear function $\mathcal{D}(h_i, s) = \sigma(h_i^{\top} W s)$, where $W$ is a learnable weight matrix, $h_i$ is the embedded representation vector of node $i$, and $s$ is the summary vector.
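The weighted readout and the bilinear discriminator can be sketched as follows; the function names and the uniform initialization of the weight factors are illustrative assumptions (in the method the weights are learned):

```python
import numpy as np

def readout(H, eps_weights):
    """Summary vector s = sum_i eps_i * h_i (learned weighted readout)."""
    return eps_weights @ H                    # (N,) @ (N, F) -> (F,)

def discriminator(h, s, W):
    """D(h, s) = sigmoid(h^T W s): score that node representation h
    belongs to the graph summary s."""
    return 1.0 / (1.0 + np.exp(-(h @ W @ s)))

rng = np.random.default_rng(4)
H = rng.normal(size=(5, 8))                   # positive node representations
eps_weights = np.full(5, 1.0 / 5)             # uniform init; optimized by backprop
W = rng.normal(size=(8, 8))
s = readout(H, eps_weights)
score = discriminator(H[0], s, W)             # a probability score in (0, 1)
```

The sigmoid keeps every score strictly between 0 and 1, so it can be used directly inside the binary cross-entropy objective below.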
To maximize mutual information, the invention uses a noise-contrastive objective with a standard binary cross-entropy (BCE) loss between samples from the joint distribution (positive examples) and the product of marginals (negative examples). The following formula is the mutual information constraint loss in MIC-VGAE:

$\mathcal{L}_{\mathrm{MIC}} = -\frac{1}{2N}\sum_{i=1}^{N}\Big(\log \mathcal{D}(h_i, s) + \log\big(1 - \mathcal{D}(\tilde{h}_i, s)\big)\Big)$

The variational graph autoencoder based on the mutual information constraint can then be defined as $\mathcal{L}_{\mathrm{MIC\text{-}VGAE}} = \mathcal{L}_{\mathrm{VGAE}} + \alpha\,\mathcal{L}_{\mathrm{MIC}}$, and the graph autoencoder based on the mutual information constraint as $\mathcal{L}_{\mathrm{MIC\text{-}GAE}} = \mathcal{L}_{\mathrm{GAE}} + \alpha\,\mathcal{L}_{\mathrm{MIC}}$, where the balance parameter $\alpha$ is introduced to control the relative importance of the two components.
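The BCE mutual-information loss over positive and corrupted node representations can be sketched as one NumPy function; the name `mic_loss` and the toy inputs are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mic_loss(H_pos, H_neg, eps_weights, W, tiny=1e-9):
    """BCE mutual-information loss: pushes D(h_i, s) toward 1 for positive
    nodes and D(h~_i, s) toward 0 for corrupted nodes."""
    s = eps_weights @ H_pos                               # summary of the positive graph
    pos = sigmoid(np.einsum('nf,fg,g->n', H_pos, W, s))   # D(h_i, s) for all i
    neg = sigmoid(np.einsum('nf,fg,g->n', H_neg, W, s))   # D(h~_i, s) for all i
    N = H_pos.shape[0]
    return float(-(np.log(pos + tiny).sum()
                   + np.log(1.0 - neg + tiny).sum()) / (2.0 * N))

rng = np.random.default_rng(5)
H_pos = rng.normal(size=(6, 4))
H_neg = H_pos[rng.permutation(6)]             # representations of the corrupted graph
loss = mic_loss(H_pos, H_neg, np.full(6, 1.0 / 6), rng.normal(size=(4, 4)))
# The total training objective would add this, scaled by alpha, to the
# GAE or VGAE reconstruction loss.
```

Because every discriminator score lies strictly in (0, 1), both log terms are negative and the loss is always positive; minimizing it maximizes the noise-contrastive estimate of local-global mutual information.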
Illustratively, the flow of constructing the mutual-information-constrained variational graph autoencoder and the mutual-information-constrained graph autoencoder is as follows:

Input: the scientific-paper relation graph $G = (X, A)$, the model name, the trade-off parameter $\alpha$, the network parameters, and the number of training iterations.

1. Initialize all network parameters.
2. while the number of iterations has not been reached:
3. Encode the positive sample $(X, A)$ and the negative sample $(\tilde{X}, A)$ to obtain $H$ and $\tilde{H}$, and compute the summary vector $s$.
4. Compute the mutual information constraint loss $\mathcal{L}_{\mathrm{MIC}}$.
5. If the model adopts a variational graph autoencoder: compute the total loss $\mathcal{L}_{\mathrm{VGAE}} + \alpha\,\mathcal{L}_{\mathrm{MIC}}$.
6. Otherwise: compute the total loss $\mathcal{L}_{\mathrm{GAE}} + \alpha\,\mathcal{L}_{\mathrm{MIC}}$.
7. Back-propagate the total loss and update the network parameters.
end while
Correspondingly, the invention also discloses a scientific-paper citation relation representation learning system comprising a processor and a memory, wherein the memory stores computer instructions and the processor is configured to execute them; when the computer instructions are executed by the processor, the steps of the method of any of the above embodiments are implemented.
According to the above embodiments, the scientific-paper representation learning method based on the scientific-paper association-graph network and the mutual information constraint constructs the scientific papers into a graph according to their existing relationships, obtains node representations by encoding with the proposed encoder, and builds a relation graph over the learned node representations to obtain deep association relationships among the scientific papers. The variational graph autoencoder based on the scientific-paper association-relation graph network and the mutual information constraint can maximize the mutual information between each node representation and the global graph summary, so as to learn graph node representations in the unsupervised setting.
The invention applies the mutual-information-maximization constraint to scientific-paper data with a graph structure. It proposes a strategy that adds to the graph encoder a constraint maximizing the agreement between the global representation and each local representation, and jointly optimizes the autoencoder loss and the mutual-information maximization, so that the learned scientific-paper node representations capture richer global graph attributes and node-neighborhood information, improving the quality of the scientific-paper representations.
In summary, the scientific and technological paper citation relationship representation learning method and system disclosed by the present invention determine the first scientific and technological paper feature matrix and the scientific and technological paper adjacency matrix based on the relational graph of the scientific and technological paper, and input the first scientific and technological paper feature matrix and the scientific and technological paper adjacency matrix to the graph auto-encoder, so as to obtain the first embedded representation of each scientific and technological paper based on the graph auto-encoder.
In addition, the invention also discloses a computer readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the method according to any of the above embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative components, systems, and methods described in connection with the embodiments disclosed herein may be implemented as hardware, software, or combinations of both. Whether this is done in hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include electronic circuits, semiconductor memory devices, ROM, flash memory, Erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.
It should also be noted that the exemplary embodiments mentioned in this patent describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
Features described and/or illustrated with respect to one embodiment may be used in the same or a similar way in one or more other embodiments, and/or in combination with or in place of the features of the other embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes may be made to the embodiment of the present invention by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A scientific and technical paper citation relation representation learning method is characterized by comprising the following steps:
acquiring a relational graph of the scientific and technical papers, wherein each node in the relational graph represents a scientific and technical paper, and each edge in the relational graph represents a citation relationship between scientific and technical papers;
determining a first scientific and technological paper feature matrix and a scientific and technological paper adjacency matrix based on the relational graph;
constructing a graph automatic encoder constrained by mutual information;
inputting the first scientific and technological paper feature matrix and the scientific and technological paper adjacency matrix into an automatic graph encoder to obtain a first embedded representation of a node;
performing transposition operation on the first scientific and technological thesis feature matrix to obtain a second scientific and technological thesis feature matrix;
inputting the second scientific and technological paper feature matrix and the scientific and technological paper adjacency matrix into an automatic graph encoder to obtain a second embedded representation of the node;
determining a mutual information constraint penalty for the graph autoencoder based on the first embedded representation and the second embedded representation.
2. The scientific paper citation relationship representation learning method of claim 1, wherein the graph autoencoder includes multiple convolutional layers.
3. The scientific and technological paper citation relationship representation learning method of claim 1, wherein the mutual information constraint loss function is:

$\mathcal{L}_{\mathrm{MIC}} = -\frac{1}{2N}\sum_{i=1}^{N}\Big(\mathbb{E}_{(X,A)}\big[\log \mathcal{D}(h_i, s)\big] + \mathbb{E}_{(\tilde{X},A)}\big[\log\big(1 - \mathcal{D}(\tilde{h}_i, s)\big)\big]\Big)$

wherein $\mathcal{L}_{\mathrm{MIC}}$ is the mutual information constraint loss, $N$ is the number of nodes in the relation graph, $\mathbb{E}_{(X,A)}$ represents the first maximum-likelihood expectation and $\mathbb{E}_{(\tilde{X},A)}$ the second, $s = \sum_i \varepsilon_i h_i$ with $\varepsilon$ a set of weight factors, $H = [h_1 \ldots h_N]$ is the first embedded representation, $\tilde{H} = [\tilde{h}_1 \ldots \tilde{h}_N]$ is the second embedded representation, $\mathcal{D}(h_i, s)$ is the probability score from node $h_i$ to the summary vector $s$, and $\mathcal{D}(\tilde{h}_i, s)$ is the probability score from node $\tilde{h}_i$ to the summary vector $s$.
4. The scientific paper citation relationship representation learning method as claimed in claim 3, wherein the loss function of the graph autoencoder based on the mutual information constraint is:

$\mathcal{L} = -\mathbb{E}_{q(Z|X,A)}[\log p(A|Z)] + \alpha\,\mathcal{L}_{\mathrm{MIC}}$

wherein $\mathcal{L}_{\mathrm{MIC}}$ is the mutual information constraint loss, $\alpha$ is the balance parameter, $Z$ is the node representation, $X$ is the first scientific-paper feature matrix, $A$ is the scientific-paper adjacency matrix, $\mathbb{E}_{q(Z|X,A)}$ denotes the maximum-likelihood expectation, $q(Z|X,A)$ is the Gaussian distribution computed by the graph convolution layers of the graph autoencoder, and $p(A|Z)$ is the reconstruction distribution of the adjacency matrix.
5. The method as claimed in claim 3, wherein the graph autoencoder is a variational graph autoencoder, and inputting the first scientific-paper feature matrix and the scientific-paper adjacency matrix into the graph autoencoder to obtain the first embedded representation of the node comprises:
and inputting the first scientific and technological paper feature matrix and the scientific and technological paper adjacency matrix into a graph convolution neural network, determining Gaussian distribution based on the graph convolution neural network, and sampling from the determined Gaussian distribution to obtain a first embedded representation.
6. The scientific paper citation relationship representation learning method of claim 5, wherein determining the Gaussian distribution based on the graph convolution neural network includes:
mean and variance are calculated based on the graph convolution neural network.
7. The scientific article citation relationship representation learning method of claim 6, wherein the losses of the variational diagram auto-encoder include cross entropy and KL divergence.
8. The scientific paper citation relationship representation learning method as claimed in claim 7, wherein the loss of the variational graph autoencoder based on the mutual information constraint is:

$\mathcal{L} = -\mathbb{E}_{q(Z|X,A)}[\log p(A|Z)] + \mathrm{KL}\big(q(Z|X,A)\,\|\,p(Z)\big) + \alpha\,\mathcal{L}_{\mathrm{MIC}}$

wherein $\alpha$ is the balance parameter, $Z$ is the first embedded representation, $X$ is the first scientific-paper feature matrix, $A$ is the scientific-paper adjacency matrix, $\mathbb{E}_{q(Z|X,A)}$ denotes the maximum-likelihood expectation, $q(Z|X,A)$ is the Gaussian distribution computed by the graph convolution layers of the graph autoencoder, $p(A|Z)$ is the reconstruction distribution of the adjacency matrix, and $\mathrm{KL}(\cdot\|\cdot)$ denotes the KL divergence.
9. A scientific and technical paper citation relationship representation learning system comprising a processor and a memory, wherein the memory stores computer instructions and the processor is configured to execute the computer instructions stored in the memory; when the computer instructions are executed by the processor, the steps of the method of any one of claims 1 to 8 are implemented.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210745739.XA CN114817578B (en) | 2022-06-29 | 2022-06-29 | Scientific and technological thesis citation relation representation learning method, system and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114817578A true CN114817578A (en) | 2022-07-29 |
CN114817578B CN114817578B (en) | 2022-09-09 |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116561591A (en) * | 2023-07-10 | 2023-08-08 | 北京邮电大学 | Training method for semantic feature extraction model of scientific and technological literature, feature extraction method and device |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150135056A1 (en) * | 2013-11-08 | 2015-05-14 | Business Objects Software Ltd. | Techniques for Creating Dynamic Interactive Infographics |
CN110580289A (en) * | 2019-08-28 | 2019-12-17 | 浙江工业大学 | Scientific and technological paper classification method based on stacking automatic encoder and citation network |
US20200081445A1 (en) * | 2018-09-10 | 2020-03-12 | Drisk, Inc. | Systems and Methods for Graph-Based AI Training |
CN112084328A (en) * | 2020-07-29 | 2020-12-15 | 浙江工业大学 | Scientific and technological thesis clustering analysis method based on variational graph self-encoder and K-Means |
CN113268993A (en) * | 2021-05-31 | 2021-08-17 | 之江实验室 | Mutual information-based attribute heterogeneous information network unsupervised network representation learning method |
CN114037014A (en) * | 2021-11-08 | 2022-02-11 | 西北工业大学 | Reference network clustering method based on graph self-encoder |
Non-Patent Citations (1)
Title |
---|
Ding Heng, Ren Weiqiang, Cao Gaohui: "Representation learning of academic literature based on unsupervised graph neural networks", Journal of the China Society for Scientific and Technical Information (《情报学报》) * |
Also Published As
Publication number | Publication date |
---|---|
CN114817578B (en) | 2022-09-09 |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |