CN114817578A - Scientific and technological thesis citation relation representation learning method, system and storage medium - Google Patents
- Publication number
- CN114817578A (application CN202210745739.XA)
- Authority
- CN
- China
- Prior art keywords: scientific, graph, paper, technological, representation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F16/382 — Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using citations
- G06N3/045 — Combinations of networks
- G06N3/047 — Probabilistic or stochastic networks
- G06N3/048 — Activation functions
- G06N3/088 — Non-supervised learning, e.g. competitive learning
- G06F2216/03 — Data mining
- Y02P90/30 — Computing systems specially adapted for manufacturing
Abstract
The invention provides a scientific and technological paper citation relation representation learning method, system and storage medium, wherein the method comprises the following steps: acquiring a relational graph of scientific and technological papers, wherein each node in the relational graph represents a scientific and technological paper and each edge represents a citation relation between papers; determining a first scientific and technological paper feature matrix and a scientific and technological paper adjacency matrix based on the relational graph; constructing a graph autoencoder; and inputting the first feature matrix and the adjacency matrix into the graph autoencoder to obtain a first embedded representation of each paper. The method enables citation relations between scientific and technological papers to be represented more accurately.
Description
Technical Field
The invention relates to the technical field of computers, and in particular to a scientific and technological paper citation relation representation learning method, system and storage medium.
Background
Graph embedding is a method of converting nodes, edges, and their features into a (lower-dimensional) vector space while preserving properties such as graph structure and information to the maximum extent.
Recent studies have shown that there are many ways to learn graph embedding representations, each at a different level of granularity. DeepWalk is a walk-based graph embedding technique; a walk is a concept from graph theory whereby a graph can be traversed by moving from one node to another, as long as they are connected by a common edge. node2vec is one of the earliest deep learning methods that attempted to learn from graph-structured data. graph2vec, a modification of node2vec, essentially learns to embed subgraphs of the input graph; these predetermined subgraphs have a set number of edges specified by the user. The resulting subgraph embeddings are then passed to a neural network for classification.
Unlike the previous embedding techniques, SDNE does not use random walks. Instead, it attempts to learn from two different measures: first-order proximity (two nodes are considered similar if they share an edge) and second-order proximity (two nodes are considered similar if they share many neighbors). LINE explicitly defines two functions, one for first-order proximity and the other for second-order proximity; the second-order approach performs significantly better than the first, suggesting that higher orders may yield a further improvement in accuracy. HARP improves on these solutions and avoids local optima through better weight initialization: it uses graph coarsening to aggregate related nodes into "super nodes", essentially a graph preprocessing step that simplifies the graph and speeds up training.
Mutual information (MI) measures the interdependence between two random variables. DGI was the earliest method to apply a mutual information constraint to graph-structured data: it maximizes the mutual information between a global graph summary and each of its nodes in order to learn information-rich node representations. However, DGI currently has two limitations: first, it ignores the interdependence between node embeddings and node attributes; second, it does not adequately mine the various relations between nodes. Existing methods therefore cannot perform better representation learning on the relational graph, and thus cannot obtain accurate embedded representations of the citation relations between scientific and technological papers. How to represent these citation relations more accurately is consequently a technical problem to be solved urgently.
Disclosure of Invention
In view of the above, the present invention provides a scientific and technological thesis citation relationship representation learning method, system and storage medium, so as to solve one or more problems in the prior art.
According to one aspect of the invention, a scientific and technological paper citation relation representation learning method is disclosed, comprising the following steps:
acquiring a relational graph of scientific and technological papers, wherein each node in the relational graph represents a scientific and technological paper and each edge represents a citation relation between papers;
determining a first scientific and technological paper feature matrix and a scientific and technological paper adjacency matrix based on the relational graph;
constructing a graph autoencoder constrained by mutual information;
inputting the first feature matrix and the adjacency matrix into the graph autoencoder to obtain a first embedded representation of the nodes;
performing a transposition operation on the first feature matrix to obtain a second scientific and technological paper feature matrix;
inputting the second feature matrix and the adjacency matrix into the graph autoencoder to obtain a second embedded representation of the nodes;
determining a mutual information constraint loss for the graph autoencoder based on the first and second embedded representations.
In some embodiments of the invention, the graph autoencoder comprises a plurality of graph convolution layers.
In some embodiments of the invention, the mutual information constraint loss function is:

L_MIC = −(1 / 2N) · ( Σ_{i=1…N} E_(X,A)[ log D(h_i, s) ] + Σ_{i=1…N} E_(X̃,A)[ log(1 − D(h̃_i, s)) ] )

wherein L_MIC is the mutual information constraint loss; N is the number of nodes in the relational graph; E_(X,A) denotes the first maximum likelihood estimate and E_(X̃,A) the second; s = Σ ε h_i is the aggregated vector, where ε is a set of weight factors; H = [h_1, …, h_N] is the first embedded representation and H̃ = [h̃_1, …, h̃_N] is the second; D(h_i, s) denotes the probability score from node representation h_i to the aggregated vector s, and D(h̃_i, s) denotes the probability score from h̃_i to the aggregated vector s.
In some embodiments of the invention, the loss function of the graph autoencoder based on mutual information constraint is:

L = E_q(Z|X,A)[ log p(A|Z) ] + α · L_MIC

wherein L_MIC is the mutual information constraint loss, α is a balance parameter, Z is the node representation, X is the first scientific and technological paper feature matrix, A is the scientific and technological paper adjacency matrix, E_q(Z|X,A) denotes the maximum likelihood estimate, q(Z|X,A) is the Gaussian distribution calculated by the graph convolution layers in the graph autoencoder, and p(A|Z) is the reconstruction (generative) distribution.
In some embodiments of the present invention, the graph autoencoder is a variational graph autoencoder, and inputting the first scientific and technological paper feature matrix and the scientific and technological paper adjacency matrix into the graph autoencoder to obtain the first embedded representation of the nodes comprises:
inputting the first scientific and technological paper feature matrix and the scientific and technological paper adjacency matrix into a graph convolutional neural network, determining a Gaussian distribution based on the graph convolutional neural network, and sampling from the determined Gaussian distribution to obtain the first embedded representation.
In some embodiments of the invention, determining the Gaussian distribution based on the graph convolutional neural network comprises:
calculating a mean and a variance based on the graph convolutional neural network.
In some embodiments of the present invention, the loss of the variational graph autoencoder includes a cross entropy term and a KL divergence term.
In some embodiments of the invention, the loss of the variational graph autoencoder based on mutual information constraint is:

L = E_q(Z|X,A)[ log p(A|Z) ] − KL[ q(Z|X,A) ‖ p(Z) ] + α · L_MIC

wherein α is a balance parameter, Z is the first embedded representation, X is the first scientific and technological paper feature matrix, A is the scientific and technological paper adjacency matrix, E_q(Z|X,A) denotes the maximum likelihood estimate, q(Z|X,A) is the Gaussian distribution calculated by the graph convolution layers in the graph autoencoder, p(A|Z) is the reconstruction distribution, and KL[·‖·] denotes the KL divergence.
According to another aspect of the present invention, a scientific paper citation relation representation learning system is also disclosed, which comprises a processor and a memory, wherein the memory stores computer instructions, the processor is used for executing the computer instructions stored in the memory, and when the computer instructions are executed by the processor, the system realizes the steps of the method according to any one of the above embodiments.
According to another aspect of the present invention, a computer-readable storage medium is also disclosed, on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the method according to any of the above embodiments.
The invention discloses a scientific and technological paper citation relation representation learning method and system, in which a first scientific and technological paper feature matrix and a scientific and technological paper adjacency matrix are determined based on a relational graph of the papers, and both are input into a graph autoencoder based on mutual information constraint, thereby obtaining a first embedded representation of each scientific and technological paper from the graph autoencoder.
In addition to the above, the disclosed scientific and technological paper citation relation representation learning method constructs a graph of scientific and technological papers according to their existing relations, obtains node representations through the proposed encoder, and builds a relation map from the learned node representations, from which deeper associations between papers can also be obtained. The disclosed variational graph autoencoder based on mutual information constraint applies a mutual-information-maximization constraint to graph-structured scientific and technological paper data: it attaches the constraint of maximizing agreement between the global and local representations to the graph encoder and jointly optimizes the autoencoder loss and the mutual information, so that the learned node representations capture richer global graph properties and node neighborhood information, thereby improving the quality of the paper representations.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
It will be appreciated by those skilled in the art that the objects and advantages that can be achieved with the present invention are not limited to the specific details set forth above, and that these and other objects that can be achieved with the present invention will be more clearly understood from the detailed description that follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention. For purposes of illustrating and describing some portions of the present invention, corresponding parts may be exaggerated in the drawings, i.e., may be larger relative to other components in an exemplary device actually made according to the present invention. In the drawings:
fig. 1 is a flowchart illustrating a scientific and technological paper citation relationship representation learning method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of an overall architecture of a variational diagram auto-encoder based on a scientific and technological paper association diagram network and mutual information constraint according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
It should be noted that, in order to avoid obscuring the present invention with unnecessary details, only the structures and/or processing steps closely related to the scheme according to the present invention are shown in the drawings, and other details not closely related to the present invention are omitted.
It should be emphasized that the term "comprises/comprising", when used herein, specifies the presence of stated features, elements, steps or components, but does not preclude the presence or addition of one or more other features, elements, steps or components.
Existing unsupervised or self-supervised graph representation learning is a general technique for representing graph-structured data in a low-dimensional space, and is of great significance for promoting graph data mining tasks. The main difficulty lies in how to encode the original node features and edge associations of graph-structured data into a low-dimensional embedding space. As a widely used deep model, the graph autoencoder learns graph embeddings in an unsupervised manner by minimizing the reconstruction error of the graph data, but its reconstruction loss ignores the distribution of the latent representations, resulting in a poor embedding effect. The inventors found in their research that maximizing the mutual information between the local node representations and the global representation of graph-structured data enables the learned node representations to summarize the information shared between nodes and to be used for downstream tasks. The invention therefore provides a scientific and technological paper citation relation representation learning method and system that jointly optimize the loss of the graph autoencoder and the mutual information, so that the learned node representations capture richer information and node interactions, improving the quality and accuracy of the node representations.
Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. In the drawings, the same reference numerals denote the same or similar parts, or the same or similar steps.
Fig. 1 is a flowchart of a scientific and technological paper citation relation representation learning method according to an embodiment of the present invention. As shown in fig. 1, the method comprises at least steps S10 to S40.
Step S10: acquiring a relational graph of scientific and technological papers, wherein each node in the relational graph represents a scientific and technological paper and each edge represents a citation relation between papers.
In this step, the scientific and technological papers are first constructed into a relational graph according to their existing relations. Illustratively, a relational graph G formed by N nodes is defined as G = (V, E), where V denotes the set of nodes, E denotes the corresponding set of edges, V = {n_1, …, n_N}, and n_i denotes the i-th node (or vertex) of G. In this step, the nodes represent scientific and technological papers, and the edges between nodes represent the citation relations between them.
Step S20: a first scientific paper feature matrix and a scientific paper adjacency matrix are determined based on the relational graph.
The adjacency matrix is denoted as A, and the first scientific and technological paper feature matrix is denoted as X. A is an N×N matrix, where N is the number of nodes in the relational graph, i.e. A ∈ R^(N×N). In the adjacency matrix A, A_ij = 1 if node i and node j are connected, and A_ij = 0 if they are not; every element of A is therefore 0 or 1. Each node n_i ∈ V is associated with a feature vector x_i ∈ R^F; all the feature vectors form the feature matrix X = [x_1, …, x_N], where X ∈ R^(N×F), N is the number of nodes in the relational graph, and F is the feature dimension.
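As a minimal sketch of step S20 (the edge list and feature values below are illustrative assumptions, not data from the patent):

```python
# Build the adjacency matrix A and a feature matrix X for a toy citation graph.
N = 4  # number of papers (nodes)
F = 3  # feature dimension
edges = [(0, 1), (0, 2), (1, 3)]  # paper i and paper j linked by a citation

# A in {0,1}^(N x N): A[i][j] = 1 iff papers i and j are connected.
A = [[0] * N for _ in range(N)]
for i, j in edges:
    A[i][j] = 1
    A[j][i] = 1  # stored symmetrically, treating citations as undirected edges

# X in R^(N x F): one F-dimensional feature vector per paper (dummy one-hot here).
X = [[1.0 if k == i % F else 0.0 for k in range(F)] for i in range(N)]
```

In practice X would hold real paper features (e.g. bag-of-words vectors), but the shapes and the 0/1 structure of A are as described above.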
Step S30: and constructing a graph automatic encoder constrained by mutual information.
Mutual information I(X, Y) measures the degree of correlation between two random variables X and Y: when X and Y are completely independent, their mutual information equals 0; otherwise, the larger I(X, Y) is, the higher the correlation between X and Y. Currently, most unsupervised learning algorithms are directed at targets already defined in the input space, whereas a "good" representation should also have strong expressive power for targets not defined in the input space. In the present invention, therefore, the encoder is trained by maximizing the mutual information between the local features and the global features of the graph-structured network. This encourages the encoder to prefer information shared by all nodes: if certain specific information (e.g. noise) exists only in certain neighborhoods, encoding it does not increase the mutual information, so the encoder is discouraged from doing so.
Step S40: and inputting the first scientific paper feature matrix and the scientific paper adjacency matrix into an automatic graph encoder to obtain a first embedded representation of the node.
In this step, the graph auto-encoder based on the mutual information constraint constructed in step S30 performs representation learning on the reference relationship of the scientific paper, and specifically inputs the first scientific paper feature matrix and the scientific paper adjacency matrix into the graph auto-encoder, thereby obtaining the first embedded representation.
In addition to the above, the scientific and technological paper citation relation representation learning method may further comprise the following steps. Step S50: performing a transposition operation on the first scientific and technological paper feature matrix to obtain a second scientific and technological paper feature matrix. Step S60: inputting the second feature matrix and the scientific and technological paper adjacency matrix into the graph autoencoder to obtain a second embedded representation of the nodes. Step S70: determining a mutual information constraint loss for the graph autoencoder based on the first and second embedded representations. In this embodiment, the first feature matrix serves as the positive sample and the second feature matrix as the negative sample; through weight sharing, the graph autoencoder obtains the first and second embedded representations from the positive and negative samples. A mutual information constraint loss function over the two embedded representations is then constructed, and based on this loss the parameters of the graph autoencoder are further optimized, so that a better latent node representation of the citation relations is obtained.
Illustratively, the mutual information constraint loss function is:

L_MIC = −(1 / 2N) · ( Σ_{i=1…N} E_(X,A)[ log D(h_i, s) ] + Σ_{i=1…N} E_(X̃,A)[ log(1 − D(h̃_i, s)) ] )

wherein L_MIC is the mutual information constraint loss; N is the number of nodes in the relational graph; E_(X,A) denotes the first maximum likelihood estimate and E_(X̃,A) the second; s = Σ ε h_i is the aggregated vector, where ε is a set of weight factors; H = [h_1, …, h_N] is the first embedded representation and H̃ = [h̃_1, …, h̃_N] is the second; D(h_i, s) denotes the probability score from node representation h_i to the aggregated vector s, and D(h̃_i, s) denotes the probability score from h̃_i to the aggregated vector s.
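This loss can be sketched in plain Python, assuming (as in DGI) uniform weight factors ε = 1/N for the readout s and a bilinear discriminator D(h, s) = σ(hᵀ W s); all helper names and toy values are illustrative assumptions:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mi_constraint_loss(H, H_neg, W):
    """L_MIC with readout s = (1/N) * sum_i h_i (uniform weight factors) and a
    bilinear discriminator D(h, s) = sigmoid(h . W s)."""
    N, K = len(H), len(H[0])
    s = [sum(h[k] for h in H) / N for k in range(K)]  # aggregated vector s
    Ws = matvec(W, s)
    pos = sum(math.log(sigmoid(dot(h, Ws))) for h in H)            # positive pairs (h_i, s)
    neg = sum(math.log(1.0 - sigmoid(dot(h, Ws))) for h in H_neg)  # negative pairs (h~_i, s)
    return -(pos + neg) / (2 * N)

rng = random.Random(0)
K = 4
H = [[rng.gauss(0, 1) for _ in range(K)] for _ in range(3)]      # first embedded representation
H_neg = [[rng.gauss(0, 1) for _ in range(K)] for _ in range(3)]  # second (corrupted) representation
W = [[1.0 if i == j else 0.0 for j in range(K)] for i in range(K)]  # identity weights for the demo
loss = mi_constraint_loss(H, H_neg, W)
```

Minimizing this loss pushes D toward 1 for positive pairs and toward 0 for negative pairs, which maximizes a lower bound on the mutual information between node and summary representations.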
In addition, the node embedding matrix learned by the graph autoencoder is denoted H = [h_1, …, h_N], where each h_i ∈ R^K is the K-dimensional embedding vector of node n_i ∈ V, so that H ∈ R^(N×K) is the node embedding matrix of the relational graph G. Here n_i denotes the i-th node of G, N is the number of nodes in the relational graph, K is the node embedding dimension, and V is the set of nodes. In this embodiment, the goal of node representation learning is to learn H such that it simultaneously preserves the structural information A of G and its feature information X. Once learning is complete, H can be used as the input feature matrix for downstream tasks such as node classification, link prediction and clustering. The node representation learning problem is thus equivalent to learning an encoding function that takes the adjacency matrix A and the feature matrix X as input and generates the node representation H, i.e. H = f(A, X).
In one embodiment, the output of each graph convolutional network (GCN) layer in the graph autoencoder is used as the input of the next layer:

H^(l+1) = σ( D̃^(−1/2) Ã D̃^(−1/2) H^(l) W^(l) )

wherein the input to the l-th convolution layer is H^(l) ∈ R^(N×D) (the initial input is H^(0) = X), N is the number of nodes in the graph, X is the input first scientific and technological paper feature matrix, and each node is represented by a D-dimensional feature vector. Ã = A + I_N is the adjacency matrix with added self-connections, where A is the input adjacency matrix and I_N is the identity matrix; D̃ is the degree matrix with D̃_ii = Σ_j Ã_ij, and Ã_ij indicates whether node i and node j are connected in Ã (1 when connected, 0 when not). W^(l) are the network parameters to be trained, and σ(·) is the corresponding activation function, here the commonly used ReLU activation function. The number of convolution layers in the graph autoencoder is not particularly limited and may be set according to actual needs.
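The layer update above can be sketched in plain Python (a minimal illustration, not the patent's implementation; the toy matrices are assumed values):

```python
import math

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def gcn_layer(A, H, W):
    """One layer: H' = ReLU(D~^(-1/2) (A + I) D~^(-1/2) H W)."""
    N = len(A)
    A_tilde = [[A[i][j] + (1 if i == j else 0) for j in range(N)] for i in range(N)]  # A + I
    d_inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in A_tilde]                       # D~^(-1/2)
    A_hat = [[A_tilde[i][j] * d_inv_sqrt[i] * d_inv_sqrt[j] for j in range(N)]
             for i in range(N)]                                                       # normalized adjacency
    Z = matmul(matmul(A_hat, H), W)
    return [[max(0.0, z) for z in row] for row in Z]  # ReLU activation

# Toy graph: two papers connected by one citation; identity features and weights.
A = [[0, 1], [1, 0]]
H0 = [[1.0, 0.0], [0.0, 1.0]]
W0 = [[1.0, 0.0], [0.0, 1.0]]
H1 = gcn_layer(A, H0, W0)  # → [[0.5, 0.5], [0.5, 0.5]]
```

For this symmetric two-node graph every normalized-adjacency entry is 0.5, so each output row mixes both nodes' features equally.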
Specifically, the method adopts a graph convolutional neural network as the encoder to obtain the potential embedded representation of the nodes; this process is expressed by the formula Z = GCN(X, A). The graph convolutional neural network is treated as a function that takes the feature matrix X and the adjacency matrix A as input; the encoded output Z ∈ R^(N×K) represents the potential embedded representations of all nodes. The graph autoencoder uses the inner product as a decoder to reconstruct the original graph adjacency matrix, as shown in the formula Â = σ(Z Zᵀ). If the learned Z is to represent the nodes well, the reconstructed adjacency matrix Â should be made as similar as possible to the original adjacency matrix A, since the adjacency matrix determines the structure of the graph. Therefore, in training the graph autoencoder, the following cross entropy can be adopted as the loss function:

L = E_q(Z|X,A)[ log p(A|Z) ]

wherein E_q(Z|X,A) denotes the maximum likelihood estimate (the expectation over q(Z|X,A)), and p(A|Z) denotes the reconstruction distribution.
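The inner-product decoder and the cross-entropy reconstruction loss described above might be sketched as follows (a self-contained illustration; the toy Z and A are assumed values):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def decode(Z):
    """Inner-product decoder: A^_ij = sigmoid(z_i . z_j), i.e. A^ = sigmoid(Z Z^T)."""
    N = len(Z)
    return [[sigmoid(dot(Z[i], Z[j])) for j in range(N)] for i in range(N)]

def reconstruction_loss(A, A_hat):
    """Mean binary cross entropy between the original A and the reconstruction A^."""
    N = len(A)
    total = 0.0
    for i in range(N):
        for j in range(N):
            p = min(max(A_hat[i][j], 1e-12), 1.0 - 1e-12)  # clamp for numerical safety
            total -= A[i][j] * math.log(p) + (1 - A[i][j]) * math.log(1.0 - p)
    return total / (N * N)

Z = [[2.0, 0.0], [0.0, 2.0]]   # toy latent representations of two papers
A_hat = decode(Z)              # off-diagonal entries are sigmoid(0) = 0.5
A = [[1, 0], [0, 1]]           # toy "original" adjacency (with self-loops)
loss = reconstruction_loss(A, A_hat)
```

Because the decoder is a symmetric inner product, Â is always symmetric, which matches an undirected relational graph.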
The graph autoencoder in the invention can be an ordinary graph autoencoder or a variational graph autoencoder. When the graph autoencoder adopted by the scientific-paper citation relation representation learning method is constrained by mutual information, the loss function of the graph autoencoder based on the mutual information constraint is:

$\mathcal{L} = -\mathbb{E}_{q(Z|X,A)}[\log p(A|Z)] + \alpha\,\mathcal{L}_{\mathrm{MIC}}$

where $\mathcal{L}_{\mathrm{MIC}}$ is the mutual information constraint loss, $\alpha$ is the balance parameter, $Z$ is the node representation, $X$ is the first scientific-paper feature matrix, $A$ is the scientific-paper adjacency matrix, $\mathbb{E}_{q(Z|X,A)}$ denotes the maximum-likelihood expectation, $q(Z|X,A)$ is the Gaussian distribution computed by the graph convolution layers of the graph autoencoder, and $p(A|Z)$ is the reconstruction distribution of the adjacency matrix.
The variational graph autoencoder adds a variational constraint on top of the graph autoencoder. In one embodiment,
inputting the first scientific-paper feature matrix and the scientific-paper adjacency matrix into the graph autoencoder to obtain the first embedded representation of the nodes comprises: inputting the first scientific-paper feature matrix and the scientific-paper adjacency matrix into a graph convolutional neural network, determining a Gaussian distribution based on the graph convolutional neural network, and sampling from the determined Gaussian distribution to obtain the first embedded representation. Determining the Gaussian distribution based on the graph convolutional neural network includes calculating a mean and a variance with the network. For the variational graph autoencoder based on the mutual information constraint, the loss comprises a cross-entropy term and a KL-divergence term.
Illustratively, in a variational graph autoencoder the node representation $Z$ is no longer produced by a deterministic function but is obtained by sampling from a (multidimensional) Gaussian distribution: the graph convolutional neural network determines the Gaussian distribution, and $Z$ is then sampled from it. A Gaussian distribution is uniquely determined by its first two moments, so only the mean and the variance need to be calculated. The variational graph autoencoder uses graph convolutional networks to calculate the mean and the variance respectively, by the formulas $\mu = \mathrm{GCN}_{\mu}(X, A)$ and $\log\sigma = \mathrm{GCN}_{\sigma}(X, A)$, where $X$ is the first scientific-paper feature matrix input to the graph autoencoder and $A$ is the input scientific-paper adjacency matrix. The first-layer parameters $W^{(0)}$ of $\mathrm{GCN}_{\mu}$ and $\mathrm{GCN}_{\sigma}$ are shared, while their second-layer parameters $W^{(1)}$ are independent of each other.
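The shared-first-layer encoder can be sketched as follows in NumPy; the function name `vgae_encode` and the toy dimensions are illustrative assumptions:

```python
import numpy as np

def normalize_adj(A):
    """Symmetrically normalized adjacency with self-loops."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def vgae_encode(X, A, W0, W_mu, W_logstd):
    """GCN_mu and GCN_sigma share the first-layer parameters W0;
    their second-layer parameters (W_mu, W_logstd) are independent."""
    A_norm = normalize_adj(A)
    H = np.maximum(A_norm @ X @ W0, 0.0)   # shared first layer, ReLU
    mu = A_norm @ H @ W_mu                 # second layer of GCN_mu
    log_std = A_norm @ H @ W_logstd        # second layer of GCN_sigma
    return mu, log_std

rng = np.random.default_rng(1)
A = np.array([[0., 1.], [1., 0.]])
X = np.eye(2)
mu, log_std = vgae_encode(X, A, rng.normal(size=(2, 4)),
                          rng.normal(size=(4, 2)), rng.normal(size=(4, 2)))
```

Both outputs have one row per node, giving the per-node mean and log standard deviation of the Gaussian.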
Furthermore, after the mean vector and the covariance matrix are obtained, $Z$ cannot be sampled directly, because the sampling operation provides no gradient information; a reparameterization is needed: $Z = \mu + \sigma \odot \epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$. The decoder of the variational graph autoencoder is the same inner product as in the graph autoencoder; the variational graph autoencoder likewise wants the reconstructed graph to be as similar as possible to the original graph, and additionally wants the distribution computed by the GCN layers to be as close as possible to the standard Gaussian distribution. The loss of the variational graph autoencoder therefore consists of both a cross-entropy term and a KL divergence. For example, the loss function of the variational graph autoencoder is $\mathcal{L} = -\mathbb{E}_{q(Z|X,A)}[\log p(A|Z)] + \mathrm{KL}\big(q(Z|X,A)\,\|\,p(Z)\big)$, where $q(Z|X,A) = \prod_{i=1}^{N} q(z_i|X,A)$ with $q(z_i|X,A) = \mathcal{N}\big(z_i \mid \mu_i, \mathrm{diag}(\sigma_i^2)\big)$. Here $Z$ is the first embedded representation, $X$ is the first scientific-paper feature matrix, $A$ is the scientific-paper adjacency matrix, $q(Z|X,A)$ is the distribution computed by the graph convolution layers, $p(A|Z)$ is the reconstruction distribution, and $p(Z) = \prod_i \mathcal{N}(z_i \mid 0, I)$ is the standard Gaussian prior.
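A small NumPy sketch of the reparameterization trick and the diagonal-Gaussian KL term follows; function names are illustrative assumptions, and the closed-form KL is the standard expression for $\mathrm{KL}(\mathcal{N}(\mu,\sigma^2)\,\|\,\mathcal{N}(0,I))$:

```python
import numpy as np

def reparameterize(mu, log_std, rng):
    """Z = mu + sigma * eps with eps ~ N(0, I): sampling with a gradient path
    through mu and sigma, which a plain sample from N(mu, sigma^2) lacks."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(log_std) * eps

def kl_to_standard_normal(mu, log_std):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over dims, averaged over nodes."""
    var = np.exp(2.0 * log_std)
    return float(0.5 * np.sum(var + mu**2 - 1.0 - 2.0 * log_std, axis=1).mean())

rng = np.random.default_rng(2)
mu = np.zeros((3, 2))
log_std = np.zeros((3, 2))
Z = reparameterize(mu, log_std, rng)
kl = kl_to_standard_normal(mu, log_std)   # 0 when q is already standard normal
```

Because `kl` vanishes exactly when the encoder outputs the prior, it acts as the "variational constraint" pulling $q(Z|X,A)$ toward the standard Gaussian.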
Further, the loss function of the variational graph autoencoder based on the mutual information constraint is:

$\mathcal{L} = -\mathbb{E}_{q(Z|X,A)}[\log p(A|Z)] + \mathrm{KL}\big(q(Z|X,A)\,\|\,p(Z)\big) + \alpha\,\mathcal{L}_{\mathrm{MIC}}$

where $\alpha$ is the balance parameter, $Z$ is the first embedded representation, $X$ is the first scientific-paper feature matrix, $A$ is the scientific-paper adjacency matrix, $\mathbb{E}_{q(Z|X,A)}$ denotes the maximum-likelihood expectation, $q(Z|X,A)$ is the Gaussian distribution computed by the graph convolution layers of the graph autoencoder, $p(A|Z)$ is the reconstruction distribution, $\mathrm{KL}(\cdot\|\cdot)$ denotes the KL divergence, and $N$ is the number of nodes in the relation graph. The mutual information constraint loss is

$\mathcal{L}_{\mathrm{MIC}} = -\frac{1}{2N}\sum_{i=1}^{N}\Big(\mathbb{E}_{(X,A)}\big[\log \mathcal{D}(h_i, s)\big] + \mathbb{E}_{(\tilde{X},A)}\big[\log\big(1 - \mathcal{D}(\tilde{h}_i, s)\big)\big]\Big)$

where $\mathbb{E}_{(X,A)}$ is the first maximum-likelihood expectation and $\mathbb{E}_{(\tilde{X},A)}$ the second, $s = \sum_i \varepsilon_i h_i$ with $\varepsilon$ a set of weight factors, $H = [h_1 \ldots h_N]$ is the first embedded representation, $\tilde{H} = [\tilde{h}_1 \ldots \tilde{h}_N]$ is the second embedded representation, $\mathcal{D}(h_i, s)$ is the probability score from node $h_i$ to the summary vector $s$, and $\mathcal{D}(\tilde{h}_i, s)$ is the probability score from node $\tilde{h}_i$ to the summary vector $s$.
For the variational graph autoencoder based on the mutual information constraint in the above embodiment, the embedding space of the node representations is optimized on top of the variational-graph-autoencoder framework; adding the mutual information constraint allows the latent node representations to be learned better. The overall architecture diagram is shown in Fig. 2.
In particular, the variational graph autoencoder based on the mutual information constraint is constructed from its loss function; the first scientific-paper feature matrix serves as the positive sample and the second scientific-paper feature matrix as the negative sample, and the mutual-information-constrained variational graph autoencoder is trained on these positive and negative samples. The original feature matrix $X$ is the first scientific-paper feature matrix; the negative sample $\tilde{X}$ is obtained by shuffling the rows of $X$ while keeping the adjacency matrix $A$ unchanged. In this case, the negative-sample graph and the positive-sample graph are isomorphic.
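The row-shuffling corruption that produces the negative sample can be sketched in one function; the name `corrupt_features` is an assumption for illustration:

```python
import numpy as np

def corrupt_features(X, rng):
    """Negative sample X~: permute the rows of X (i.e. reassign node features
    to different nodes); the adjacency matrix A is kept unchanged."""
    return X[rng.permutation(X.shape[0])]

rng = np.random.default_rng(3)
X = np.arange(12.0).reshape(4, 3)   # 4 papers, 3-dimensional features
X_neg = corrupt_features(X, rng)
# X_neg contains the same rows as X, generally in a different order.
```

Since only the feature-to-node assignment changes while the edges stay fixed, the corrupted graph has the same structure as the original, matching the isomorphism remark above.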
Furthermore, after the positive- and negative-sample graphs are encoded by the constructed mutual-information-constrained variational graph autoencoder, the node representations of the positive and negative samples, $H$ and $\tilde{H}$, are obtained. Here, the invention seeks node (i.e., local) representations that capture the global information content of the entire graph, represented by a summary vector $s$: $H$ is fed into a readout function and summarized as $s = \sum_{i=1}^{N} \varepsilon_i h_i$, where $N$ is the number of nodes and $\varepsilon$ is a set of weight factors representing the importance of each node to the whole graph. After initialization, $\varepsilon$ participates in the gradient back-propagation of the whole model and is continuously optimized, assigning more weight to nodes of high importance so as to learn $s$ better.
This embodiment introduces a mutual information maximization strategy: the probability that a positive-sample node representation belongs to the summary vector $s$ should be as large as possible, while the probability that a negative-sample node representation belongs to $s$ should be as small as possible. MIC-VGAE (the variational graph autoencoder based on the mutual information constraint) employs a discriminator $\mathcal{D}$ as a proxy for maximizing local-global mutual information, where $\mathcal{D}(h_i, s)$ scores the probability that the node representation $h_i$ belongs to the summary vector $s$; the larger the score, the higher that probability. The discriminator is the bilinear function $\mathcal{D}(h_i, s) = \sigma(h_i^{\top} W s)$, where $W$ is a learnable weight matrix, $h_i$ is the embedded representation vector of node $i$, and $s$ is the summary vector.
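The weighted readout and the bilinear discriminator can be sketched as follows; the function names and the uniform initialization of the weight factors are illustrative assumptions (in the method the weights are learned):

```python
import numpy as np

def readout(H, eps_weights):
    """Summary vector s = sum_i eps_i * h_i (learned weighted readout)."""
    return eps_weights @ H                    # (N,) @ (N, F) -> (F,)

def discriminator(h, s, W):
    """D(h, s) = sigmoid(h^T W s): score that node representation h
    belongs to the graph summary s."""
    return 1.0 / (1.0 + np.exp(-(h @ W @ s)))

rng = np.random.default_rng(4)
H = rng.normal(size=(5, 8))                   # positive node representations
eps_weights = np.full(5, 1.0 / 5)             # uniform init; optimized by backprop
W = rng.normal(size=(8, 8))
s = readout(H, eps_weights)
score = discriminator(H[0], s, W)             # a probability score in (0, 1)
```

The sigmoid keeps every score strictly between 0 and 1, so it can be used directly inside the binary cross-entropy objective below.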
To maximize mutual information, the invention uses a noise-contrastive objective with a standard binary cross-entropy (BCE) loss between samples from the joint distribution (positive examples) and the product of marginals (negative examples). The following formula is the mutual information constraint loss in MIC-VGAE:

$\mathcal{L}_{\mathrm{MIC}} = -\frac{1}{2N}\sum_{i=1}^{N}\Big(\log \mathcal{D}(h_i, s) + \log\big(1 - \mathcal{D}(\tilde{h}_i, s)\big)\Big)$

The variational graph autoencoder based on the mutual information constraint can then be defined as $\mathcal{L}_{\mathrm{MIC\text{-}VGAE}} = \mathcal{L}_{\mathrm{VGAE}} + \alpha\,\mathcal{L}_{\mathrm{MIC}}$, and the graph autoencoder based on the mutual information constraint as $\mathcal{L}_{\mathrm{MIC\text{-}GAE}} = \mathcal{L}_{\mathrm{GAE}} + \alpha\,\mathcal{L}_{\mathrm{MIC}}$, where the balance parameter $\alpha$ is introduced to control the relative importance of the two components.
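The BCE mutual-information loss over positive and corrupted node representations can be sketched as one NumPy function; the name `mic_loss` and the toy inputs are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mic_loss(H_pos, H_neg, eps_weights, W, tiny=1e-9):
    """BCE mutual-information loss: pushes D(h_i, s) toward 1 for positive
    nodes and D(h~_i, s) toward 0 for corrupted nodes."""
    s = eps_weights @ H_pos                               # summary of the positive graph
    pos = sigmoid(np.einsum('nf,fg,g->n', H_pos, W, s))   # D(h_i, s) for all i
    neg = sigmoid(np.einsum('nf,fg,g->n', H_neg, W, s))   # D(h~_i, s) for all i
    N = H_pos.shape[0]
    return float(-(np.log(pos + tiny).sum()
                   + np.log(1.0 - neg + tiny).sum()) / (2.0 * N))

rng = np.random.default_rng(5)
H_pos = rng.normal(size=(6, 4))
H_neg = H_pos[rng.permutation(6)]             # representations of the corrupted graph
loss = mic_loss(H_pos, H_neg, np.full(6, 1.0 / 6), rng.normal(size=(4, 4)))
# The total training objective would add this, scaled by alpha, to the
# GAE or VGAE reconstruction loss.
```

Because every discriminator score lies strictly in (0, 1), both log terms are negative and the loss is always positive; minimizing it maximizes the noise-contrastive estimate of local-global mutual information.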
Illustratively, the flow of constructing the mutual-information-constrained variational graph autoencoder and the mutual-information-constrained graph autoencoder is as follows:

Input: the scientific-paper relation graph $G = (X, A)$, the model name, the trade-off parameter $\alpha$, the network parameters, and the number of training iterations.

1. Initialize all network parameters.
2. while the number of iterations has not been reached:
3. Encode the positive sample $(X, A)$ and the negative sample $(\tilde{X}, A)$ to obtain $H$ and $\tilde{H}$, and compute the summary vector $s$.
4. Compute the mutual information constraint loss $\mathcal{L}_{\mathrm{MIC}}$.
5. If the model adopts a variational graph autoencoder: compute the total loss $\mathcal{L}_{\mathrm{VGAE}} + \alpha\,\mathcal{L}_{\mathrm{MIC}}$.
6. Otherwise: compute the total loss $\mathcal{L}_{\mathrm{GAE}} + \alpha\,\mathcal{L}_{\mathrm{MIC}}$.
7. Back-propagate the total loss and update the network parameters.
end while
Correspondingly, the invention also discloses a scientific-paper citation relation representation learning system comprising a processor and a memory, wherein the memory stores computer instructions and the processor is configured to execute them; when the computer instructions are executed by the processor, the steps of the method of any of the above embodiments are implemented.
According to the above embodiments, the scientific-paper representation learning method based on the scientific-paper association-graph network and the mutual information constraint constructs the scientific papers into a graph according to their existing relationships, obtains node representations by encoding with the proposed encoder, and builds a relation graph over the learned node representations to obtain deep association relationships among the scientific papers. The variational graph autoencoder based on the scientific-paper association-relation graph network and the mutual information constraint can maximize the mutual information between each node representation and the global graph summary, so as to learn graph node representations in the unsupervised setting.
The invention applies the mutual-information-maximization constraint to scientific-paper data with a graph structure. It proposes a strategy that adds to the graph encoder a constraint maximizing the agreement between the global representation and each local representation, and jointly optimizes the autoencoder loss and the mutual-information maximization, so that the learned scientific-paper node representations capture richer global graph attributes and node-neighborhood information, improving the quality of the scientific-paper representations.
In summary, the scientific and technological paper citation relationship representation learning method and system disclosed by the present invention determine the first scientific and technological paper feature matrix and the scientific and technological paper adjacency matrix based on the relational graph of the scientific and technological paper, and input the first scientific and technological paper feature matrix and the scientific and technological paper adjacency matrix to the graph auto-encoder, so as to obtain the first embedded representation of each scientific and technological paper based on the graph auto-encoder.
In addition, the invention also discloses a computer readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the method according to any of the above embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative components, systems, and methods described in connection with the embodiments disclosed herein may be implemented as hardware, software, or combinations of both. Whether this is done in hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include electronic circuits, semiconductor memory devices, ROM, flash memory, Erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.
It should also be noted that the exemplary embodiments mentioned in this patent describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
Features described and/or illustrated with respect to one embodiment may be used in the same or a similar way in one or more other embodiments, and/or in combination with or in place of the features of the other embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes may be made to the embodiment of the present invention by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A scientific and technical paper citation relation representation learning method is characterized by comprising the following steps:
acquiring a relational graph of the scientific and technical papers, wherein each node in the relational graph represents a scientific and technical paper, and each edge in the relational graph represents a citation relationship between scientific and technical papers;
determining a first scientific and technological paper feature matrix and a scientific and technological paper adjacency matrix based on the relational graph;
constructing a graph automatic encoder constrained by mutual information;
inputting the first scientific and technological paper feature matrix and the scientific and technological paper adjacency matrix into an automatic graph encoder to obtain a first embedded representation of a node;
performing transposition operation on the first scientific and technological thesis feature matrix to obtain a second scientific and technological thesis feature matrix;
inputting the second scientific and technological paper feature matrix and the scientific and technological paper adjacency matrix into an automatic graph encoder to obtain a second embedded representation of the node;
determining a mutual information constraint penalty for the graph autoencoder based on the first embedded representation and the second embedded representation.
2. The scientific paper citation relationship representation learning method of claim 1, wherein the graph autoencoder includes multiple convolutional layers.
3. The scientific and technological paper citation relationship representation learning method of claim 1, wherein the mutual information constraint loss function is:

$\mathcal{L}_{\mathrm{MIC}} = -\frac{1}{2N}\sum_{i=1}^{N}\Big(\mathbb{E}_{(X,A)}\big[\log \mathcal{D}(h_i, s)\big] + \mathbb{E}_{(\tilde{X},A)}\big[\log\big(1 - \mathcal{D}(\tilde{h}_i, s)\big)\big]\Big)$

wherein $\mathcal{L}_{\mathrm{MIC}}$ is the mutual information constraint loss, $N$ is the number of nodes in the relation graph, $\mathbb{E}_{(X,A)}$ represents the first maximum-likelihood expectation and $\mathbb{E}_{(\tilde{X},A)}$ the second, $s = \sum_i \varepsilon_i h_i$ with $\varepsilon$ a set of weight factors, $H = [h_1 \ldots h_N]$ is the first embedded representation, $\tilde{H} = [\tilde{h}_1 \ldots \tilde{h}_N]$ is the second embedded representation, $\mathcal{D}(h_i, s)$ is the probability score from node $h_i$ to the summary vector $s$, and $\mathcal{D}(\tilde{h}_i, s)$ is the probability score from node $\tilde{h}_i$ to the summary vector $s$.
4. The scientific paper citation relationship representation learning method as claimed in claim 3, wherein the loss function of the graph autoencoder based on the mutual information constraint is:

$\mathcal{L} = -\mathbb{E}_{q(Z|X,A)}[\log p(A|Z)] + \alpha\,\mathcal{L}_{\mathrm{MIC}}$

wherein $\mathcal{L}_{\mathrm{MIC}}$ is the mutual information constraint loss, $\alpha$ is the balance parameter, $Z$ is the node representation, $X$ is the first scientific-paper feature matrix, $A$ is the scientific-paper adjacency matrix, $\mathbb{E}_{q(Z|X,A)}$ denotes the maximum-likelihood expectation, $q(Z|X,A)$ is the Gaussian distribution computed by the graph convolution layers of the graph autoencoder, and $p(A|Z)$ is the reconstruction distribution of the adjacency matrix.
5. The method as claimed in claim 3, wherein the graph autoencoder is a variational graph autoencoder, and inputting the first scientific-paper feature matrix and the scientific-paper adjacency matrix into the graph autoencoder to obtain the first embedded representation of the node comprises:
and inputting the first scientific and technological paper feature matrix and the scientific and technological paper adjacency matrix into a graph convolution neural network, determining Gaussian distribution based on the graph convolution neural network, and sampling from the determined Gaussian distribution to obtain a first embedded representation.
6. The scientific paper citation relationship representation learning method of claim 5, wherein determining the Gaussian distribution based on the graph convolution neural network includes:
mean and variance are calculated based on the graph convolution neural network.
7. The scientific article citation relationship representation learning method of claim 6, wherein the losses of the variational diagram auto-encoder include cross entropy and KL divergence.
8. The scientific paper citation relationship representation learning method as claimed in claim 7, wherein the loss of the variational graph autoencoder based on the mutual information constraint is:

$\mathcal{L} = -\mathbb{E}_{q(Z|X,A)}[\log p(A|Z)] + \mathrm{KL}\big(q(Z|X,A)\,\|\,p(Z)\big) + \alpha\,\mathcal{L}_{\mathrm{MIC}}$

wherein $\alpha$ is the balance parameter, $Z$ is the first embedded representation, $X$ is the first scientific-paper feature matrix, $A$ is the scientific-paper adjacency matrix, $\mathbb{E}_{q(Z|X,A)}$ denotes the maximum-likelihood expectation, $q(Z|X,A)$ is the Gaussian distribution computed by the graph convolution layers of the graph autoencoder, $p(A|Z)$ is the reconstruction distribution of the adjacency matrix, and $\mathrm{KL}(\cdot\|\cdot)$ denotes the KL divergence.
9. A scientific and technical paper citation relationship representation learning system comprising a processor and a memory, wherein the memory stores computer instructions and the processor is configured to execute the computer instructions stored in the memory; when the computer instructions are executed by the processor, the steps of the method of any one of claims 1 to 8 are implemented.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210745739.XA CN114817578B (en) | 2022-06-29 | 2022-06-29 | Scientific and technological thesis citation relation representation learning method, system and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114817578A true CN114817578A (en) | 2022-07-29 |
CN114817578B CN114817578B (en) | 2022-09-09 |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116561591A (en) * | 2023-07-10 | 2023-08-08 | 北京邮电大学 | Training method for semantic feature extraction model of scientific and technological literature, feature extraction method and device |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150135056A1 (en) * | 2013-11-08 | 2015-05-14 | Business Objects Software Ltd. | Techniques for Creating Dynamic Interactive Infographics |
CN110580289A (en) * | 2019-08-28 | 2019-12-17 | 浙江工业大学 | Scientific and technological paper classification method based on stacking automatic encoder and citation network |
US20200081445A1 (en) * | 2018-09-10 | 2020-03-12 | Drisk, Inc. | Systems and Methods for Graph-Based AI Training |
CN112084328A (en) * | 2020-07-29 | 2020-12-15 | 浙江工业大学 | Scientific and technological thesis clustering analysis method based on variational graph self-encoder and K-Means |
CN113268993A (en) * | 2021-05-31 | 2021-08-17 | 之江实验室 | Mutual information-based attribute heterogeneous information network unsupervised network representation learning method |
CN114037014A (en) * | 2021-11-08 | 2022-02-11 | 西北工业大学 | Reference network clustering method based on graph self-encoder |
Non-Patent Citations (1)
Title |
---|
Ding Heng, Ren Weiqiang, Cao Gaohui: "Representation learning of academic literature based on unsupervised graph neural networks", Journal of the China Society for Scientific and Technical Information (《情报学报》) * |
Also Published As
Publication number | Publication date |
---|---|
CN114817578B (en) | 2022-09-09 |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |