CN115130663A - Heterogeneous network attribute completion method based on graph neural network and attention mechanism - Google Patents

Heterogeneous network attribute completion method based on graph neural network and attention mechanism

Info

Publication number
CN115130663A
CN115130663A (application CN202211043710.3A)
Authority
CN
China
Prior art keywords
attribute
nodes
node
attributes
target node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211043710.3A
Other languages
Chinese (zh)
Other versions
CN115130663B (en)
Inventor
于彦伟 (Yu Yanwei)
王凯 (Wang Kai)
董军宇 (Dong Junyu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ocean University of China
Original Assignee
Ocean University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ocean University of China filed Critical Ocean University of China
Priority to CN202211043710.3A
Publication of CN115130663A
Application granted
Publication of CN115130663B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention discloses a heterogeneous network attribute completion method based on a graph neural network and an attention mechanism, belonging to the technical field of data processing. First, nodes similar to the target nodes (nodes with missing attributes) are captured from the network by combining K-nearest-neighbor search with cosine similarity between attributes, and their attributes are adaptively converted into a feature-domain network characterization of the target nodes. Then, based on a graph neural network and a transformer attention mechanism, the topological structure of the network and the attribute information of the nodes are analyzed hierarchically to obtain the spatial-domain network characterization of the target nodes. Finally, model parameters are learned with a Euclidean-distance loss function that combines the spatial-domain and feature-domain characterizations, so as to complete the missing attributes. Practical verification shows that the proposed attribute completion method is both efficient and accurate.

Description

Heterogeneous network attribute completion method based on graph neural network and attention mechanism
Technical Field
The invention relates to a heterogeneous network attribute completion method based on a graph neural network and an attention mechanism, and belongs to the technical field of data processing.
Background
Networks are ubiquitous in real life, and the connections among most real-world objects can be represented as networks: friend relationships among users form social networks, citation relationships among papers form citation networks, and connections among road sections form traffic networks. The networks above consist of nodes of a single type and are therefore called homogeneous networks. Heterogeneous networks, whose nodes are of different types, are even more common in the real world; for example, a shopping network consists of users and commodities, and an academic network consists of authors, their affiliations, and papers. Although these networks contain massive data, missing attributes (e.g., some users do not upload their age in shopping networks, and authors do not always fill in paper keywords in academic networks) pose a huge challenge to mining the potential value the networks contain. Completing the missing attributes can effectively improve the efficiency of data mining on the network, but doing so is relatively complex and must consider various factors, such as the connection relations between network nodes and the relation between the existing and missing attributes of the nodes. How to complete missing data in networks efficiently and accurately is therefore receiving increasing attention from academia and industry.
In the field of attribute completion, traditional methods generally analyze the connection relations of nodes and semantic text information to complete attributes, without taking the topology of the entire network into account. In recent years, graph neural network methods have shown high efficiency and accuracy in mining network information; for example, graph convolutional networks and graph attention networks perform excellently in capturing network topology and node attribute information. The development of graph neural networks also brings new possibilities to attribute completion, such as joint learning of attribute completion and commodity recommendation based on graph convolutional networks, and completion using the features learned by a graph neural network. These methods have achieved significant results in attribute completion, but there is still room for improvement.
Analyzing and summarizing the existing attribute completion methods reveals the following shortcomings: 1) unlike a homogeneous network, in which all nodes are of the same type, a heterogeneous network contains nodes of different types, so completing the missing attributes of its nodes requires comprehensively considering both homogeneous and heterogeneous node information; this problem has plagued many network characterization models. 2) Graph neural networks do not efficiently capture higher-order node information in the network. A GCN essentially generates characterizations by aggregating the information of the nodes surrounding a target node under a semi-supervised framework. In the attribute completion problem, the attributes of a target node may be related not only to its surrounding nodes but also to its higher-order nodes. Admittedly, the higher-order information of the target node can be captured by stacking GCN layers, but as the number of layers increases the node characterizations in the network become increasingly similar and lose their specificity, which in turn harms the accuracy of attribute completion.
Disclosure of Invention
In order to solve the heterogeneous network attribute completion problem more effectively, the invention provides an attribute completion method based on a graph neural network and an attention mechanism, so as to further improve the efficiency and accuracy of attribute completion and to provide methodological and technical support for the heterogeneous network attribute completion problem.
In order to achieve the purpose of the invention, the specific technical scheme adopted by the invention is as follows:
an attribute completion method based on a graph neural network and an attention mechanism comprises the following steps:
s1: acquiring a heterogeneous attribute network with missing attributes, wherein nodes with missing attributes are called target nodes and nodes with complete attributes are called source nodes;
s2: selecting the K source nodes most similar to the existing attributes of the target node with a K-nearest-neighbor (KNN) algorithm for attribute completion in the feature space;
s3: after screening out the K source nodes most similar to the target node by cosine similarity, giving each source node a learnable parameter that dynamically adjusts the weight of its attributes in completing the target node's attributes, obtaining the weighted attribute characterization of the feature space;
s4: aggregating the nodes directly connected to the target node in the heterogeneous attribute network to obtain the low-order characterization of the structural space, the aggregation being realized by a simplified graph neural network;
s5: obtaining the high-order nodes of the target node by random walk;
s6: giving different weights to the nodes in the high-order node sequence based on a transformer, obtaining the weighted high-order characterization of the structural space;
s7: concatenating the weighted attribute characterization of the feature space, the low-order characterization of the structural space, and the weighted high-order characterization of the structural space, feeding the result into a multilayer perceptron, and converting it into the attributes of the target node;
s8: adopting supervised learning: first artificially removing some attributes of some nodes, then completing them by reconstruction prediction, training continuously on the difference between the completed and real attributes, and finally using the trained model to complete the missing attribute values of the other nodes.
Further, in S1, a heterogeneous network with missing attributes is defined as $G = (V, E, X)$, where $V$ represents the set of vertices in the graph, $E$ represents the set of edges, $X \in \mathbb{R}^{n \times d}$ represents the attribute matrix of the vertices, $n$ is the number of vertices, $d$ is the feature dimension of each vertex, and $M \in \{0, 1\}^{n \times d}$ is a mark matrix: when $M_{ij} = 0$, the attribute $X_{ij}$ is missing; conversely, the corresponding attribute in $X$ is complete.
Further, in S2: a heterogeneous network usually contains a large number of nodes, and using the attributes of all source nodes to fill in the missing attributes of a target node is inefficient and impractical; therefore K-nearest neighbors is used to select the K source nodes most similar to the existing attributes of the target node for completion in the feature space, with cosine similarity (formula (1)) measuring the similarity of two attribute vectors:

$$\mathrm{sim}(x_u, x_v) = \frac{x_u \cdot x_v}{\|x_u\| \, \|x_v\|} \qquad (1)$$

where $\mathrm{sim}(x_u, x_v)$ indicates the similarity of the two nodes, $x_u$ represents the existing attributes of the target node, $x_v$ represents the attributes of the source node corresponding to the existing attributes of the target node, and a larger value indicates a higher similarity between the target node and the source node.
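As an illustrative aid (not part of the patent), the selection step can be sketched in a few lines of PyTorch; the dense attribute tensors, the boolean mask over the target node's observed dimensions, and all names here are assumptions of the sketch:

```python
import torch

def top_k_similar_sources(x_target, obs_mask, X_source, k=5, eps=1e-8):
    # Compare only on the attribute dimensions the target node actually has.
    xt = x_target[obs_mask]                                # (d_obs,)
    Xs = X_source[:, obs_mask]                             # (n_src, d_obs)
    # Cosine similarity of formula (1); eps guards against zero vectors.
    sims = (Xs @ xt) / (Xs.norm(dim=1) * xt.norm() + eps)
    top = torch.topk(sims, k)                              # k most similar source nodes
    return top.indices, top.values
```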
Further, in S3, many factors may affect how much a source node contributes to completing a target node's attributes in a heterogeneous network, for example whether the nodes are directly connected and the weight of the connecting edge. The dynamic adjustment is performed according to formula (2):

$$h_u^F = \sum_{v \in \mathcal{N}_K(u)} \alpha_v x_v \qquad (2)$$

where $h_u^F$ is the feature-space characterization of the target node $u$, $\mathcal{N}_K(u)$ is the set of the $K$ source nodes most similar to the target node $u$, $\alpha_v$ is the learnable adjustment weight corresponding to each source node, and $x_v$ is the feature vector of the source node.
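A possible realization of these learnable weights is sketched below; the softmax normalization is an assumption of the sketch, since the text only states that the weights are learnable and dynamically adjusted:

```python
import torch
import torch.nn as nn

class FeatureSpaceCompletion(nn.Module):
    """Weighted aggregation of the K most similar source nodes, as in formula (2)."""
    def __init__(self, k):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(k))  # one learnable weight per neighbor slot

    def forward(self, neighbor_feats):            # (k, d) features of the selected sources
        w = torch.softmax(self.alpha, dim=0)      # assumed normalization of the weights
        return (w.unsqueeze(1) * neighbor_feats).sum(dim=0)  # (d,) feature-space characterization
```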
Further, in S4, after the feature learning of the target node in the feature space is completed, its characterization in the structural space must be learned, as shown in formula (3):

$$H^{(l+1)} = \hat{A} H^{(l)} W^{(l)} \qquad (3)$$

where $W^{(l)}$ is the weight matrix of the $l$-th layer of the simplified graph neural network, $H^{(l)}$ is the output of the $l$-th layer, and $\hat{A}$ is the adjacency matrix.
A graph neural network essentially captures the topology and attribute information of the nodes in the network by aggregating the neighbor attributes of the target node, but as its layers are stacked the features aggregated at the target node become over-smoothed, i.e., the characterizations obtained through aggregation lose their specificity and distinctiveness. Therefore, a simplified graph neural network with only one layer is used to aggregate the neighbor nodes of the target node in the structural space, capturing topology and attribute features while keeping the target node's characterization specific.
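A one-layer aggregation in the spirit of formula (3) might look as follows; the symmetric normalization of the adjacency matrix is borrowed from common simplified-GCN practice and is an assumption of the sketch, not something the text prescribes:

```python
import torch
import torch.nn as nn

def normalize_adjacency(A):
    # Assumed normalization: A_hat = D^{-1/2} (A + I) D^{-1/2}.
    A_hat = A + torch.eye(A.size(0))
    d_inv_sqrt = A_hat.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * A_hat * d_inv_sqrt.unsqueeze(0)

class OneLayerSGC(nn.Module):
    """Single aggregation layer H' = A_hat H W (formula (3)); a single layer is
    kept deliberately, to avoid the over-smoothing discussed above."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, A_hat, H):                  # A_hat: (n, n), H: (n, in_dim)
        return self.W(A_hat @ H)                  # (n, out_dim) low-order characterizations
```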
Further, S5 is specifically: in order to capture the high-order information of the target node in the heterogeneous network, the high-order nodes of the target node are obtained by random walk; the random walk traverses the nodes in the graph in a manner combining the advantages of depth-first and breadth-first traversal, with the transition rule shown in formula (4):

$$P(v_{i+1} = y \mid v_i = x) = \frac{w_{xy}}{\sum_{z \in N(x)} w_{xz}} \qquad (4)$$

where $w_{xy}$ denotes the weight of the edge from node $x$ to node $y$, and $N(x)$ is the neighbor set of node $x$;
For each target node, a random walk rooted at that node is performed to obtain a node sequence. Since directly connected source nodes are already used in the simplified graph neural network, these nodes are deleted from the random-walk node sequence.
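The walk itself might be implemented as in the following sketch; the adjacency-list layout and the removal of first-hop neighbors follow the description above, while the names and data structures are illustrative:

```python
import random

def weighted_random_walk(adj, weights, start, length):
    """Edge-weight-proportional walk rooted at `start`, per formula (4).
    adj[u] is the neighbor list of node u; weights[u] the matching edge weights."""
    walk, cur = [], start
    for _ in range(length):
        nbrs = adj[cur]
        if not nbrs:
            break
        cur = random.choices(nbrs, weights=weights[cur], k=1)[0]
        walk.append(cur)
    # Drop the target's direct neighbors: the simplified GNN already aggregates them.
    first_hop = set(adj[start])
    return [v for v in walk if v not in first_hop and v != start]
```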
Further, in S6, since the nodes in the sequence generated by random walk influence the target node to different degrees, they are given different weights based on a transformer. The method comprises the following specific steps:
s6-1: first, linearly transforming the features of the target node and of the source-node sequence, then calculating the weights between the target node and the nodes in each sequence based on these linear transformations, and applying a softmax normalization to the obtained weights for numerical stability;
s6-2: then, applying to the node features in the source-node sequence a new linear transformation independent of the weight calculation;
s6-3: finally, multiplying each linearly transformed source-node feature by its weight and accumulating the results to obtain the high-order node characterization of the target node, calculated as in formula (5):
$$h_u^H = \sum_{v \in S_u} \alpha_{uv} \, W_V z_v, \qquad \alpha_{uv} = \underset{v \in S_u}{\mathrm{softmax}}\big( (W_Q z_u)^{\top} (W_K z_v) \big) \qquad (5)$$

where $z_u$ denotes the characterization of node $u$ aggregated by the simplified graph neural network, $z_v$ denotes the characterization of a node $v$ in the random-walk sequence $S_u$, $W_Q$, $W_K$ and $W_V$ are learnable projection parameter matrices, and $h_u^H$ denotes the high-order network characterization of node $u$.
S6-4: on this basis, the attention mechanism is expanded to a multi-head attention mechanism to capture multiple dependency relationships between the target node and the high-order source nodes. The multiple dependencies are then fed into an average pooling layer to obtain the final high-order network characterization, calculated as in formula (6):
$$h_u^H = \frac{1}{T} \sum_{t=1}^{T} h_u^{H,(t)} \qquad (6)$$

where $h_u^{H,(t)}$ denotes the high-order network characterization of node $u$ obtained by the $t$-th attention head, and $T$ indicates the total number of attention computations required.
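Taken together, formulas (5) and (6) amount to multi-head dot-product attention between the target node and its walk sequence, with the heads averaged instead of concatenated. A compact sketch follows; the $1/\sqrt{d}$ scaling is an assumption borrowed from the standard transformer and is not stated in the text:

```python
import torch
import torch.nn as nn

class HighOrderAttention(nn.Module):
    """Attention between one target node and its random-walk sequence
    (formula (5)), with average pooling over heads (formula (6))."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.heads = num_heads
        self.Wq = nn.Linear(dim, dim * num_heads, bias=False)
        self.Wk = nn.Linear(dim, dim * num_heads, bias=False)
        self.Wv = nn.Linear(dim, dim * num_heads, bias=False)

    def forward(self, z_target, z_seq):           # (dim,), (L, dim)
        d, h = z_seq.size(1), self.heads
        q = self.Wq(z_target).view(h, d)          # one query per head
        k = self.Wk(z_seq).view(-1, h, d)         # (L, h, d) keys
        v = self.Wv(z_seq).view(-1, h, d)         # values: independent linear map
        att = torch.softmax((k * q).sum(-1) / d ** 0.5, dim=0)  # (L, h) weights
        out = (att.unsqueeze(-1) * v).sum(dim=0)  # (h, d) one characterization per head
        return out.mean(dim=0)                    # (d,) average pooling over heads
```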
Further, in S7, the concatenation is computed as shown in formula (7):
$$\hat{x}_u = \mathrm{MLP}\big( h_u^F \,\Vert\, h_u^L \,\Vert\, h_u^H \big) \qquad (7)$$

where $\Vert$ represents the vector concatenation operation, $h_u^L$ is the low-order structure-space characterization from formula (3), and $\hat{x}_u$ is the predicted attribute value of the target node $u$.
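A sketch of this fusion step; the hidden width and the ReLU activation are illustrative choices, since the text does not specify the internals of the multilayer perceptron:

```python
import torch
import torch.nn as nn

class AttributePredictor(nn.Module):
    """Concatenate the three characterizations and map them back to the
    attribute space with a small multilayer perceptron (formula (7))."""
    def __init__(self, dim, attr_dim, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * dim, hidden), nn.ReLU(),
            nn.Linear(hidden, attr_dim),
        )

    def forward(self, h_feat, h_low, h_high):     # three (dim,) characterizations
        return self.mlp(torch.cat([h_feat, h_low, h_high], dim=-1))  # predicted attributes
```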
Further, S8 specifically includes:
s8-1: after the predicted attributes of the target node are obtained, the existing attributes are retained and the predicted values are filled into the missing positions to complete the attribute completion task, as shown in formula (8):
$$\hat{X} = M \odot X + (\mathbf{1} - M) \odot X^{pred} \qquad (8)$$

where $\hat{X}$ represents the predicted attributes of all nodes after attribute completion, $X^{pred}$ the attributes predicted by the model, $\odot$ the Hadamard product, and $\mathbf{1}$ a matrix whose elements are all 1;
s8-2: a loss function based on the Euclidean distance formula is set to measure the gap between the predicted attributes and the real attributes, calculated as in formula (9):
$$\mathcal{L} = \sum_{u \in V_T} \big\| x_u - \hat{x}_u \big\|_2 \qquad (9)$$

where $V_T$ represents the set of target nodes, $x_u$ is the true value of the missing attributes of node $u$, and $\hat{x}_u$ is the predicted value of the missing attributes of node $u$.
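Formulas (8) and (9) reduce to a few tensor operations; in this sketch M is the 0/1 mark matrix of S1 and, during training, X is assumed to hold the ground-truth values of the artificially removed attributes:

```python
import torch

def complete_and_loss(X, X_pred, M):
    # Formula (8): keep the observed attributes, fill predictions into the gaps.
    X_hat = M * X + (1 - M) * X_pred
    # Formula (9): Euclidean gap between prediction and truth on the held-out entries.
    loss = torch.norm((1 - M) * (X - X_pred), p=2)
    return X_hat, loss
```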
When completing heterogeneous network attributes, the invention considers the relevance between the target node and the source nodes in both the attribute space and the structural space, specifically as follows: in the attribute space, a K-nearest-neighbor algorithm finds the K source nodes most similar to the target node and adaptively assigns each of them a corresponding weight; the weighted sum yields the attribute-space characterization of the target node. In the structural space, a simplified graph neural network aggregates the first-order neighbor information of the target node to obtain its low-order characterization; a multi-head attention mechanism based on a transformer and random walk then captures the high-order neighbor information of the target node in the structural space to obtain its high-order characterization. Finally, the three characterizations are fused, a loss function based on the Euclidean distance guides the updating of the parameters of the whole model, and the missing attributes of the target node are completed.
The invention has the advantages and beneficial effects that:
compared with the traditional attribute completion method, the method introduces a network representation learning framework to perform attribute completion, and can better capture the structural characteristics of the target node in the network; the first-order neighbors of the target node are aggregated by using the simplified graph neural network, so that the accuracy of the obtained network representation can be ensured on the premise of improving the attribute completion time efficiency; the method uses an attention mechanism based on random walk and transformer to capture high-order neighbor information of a target node, and solves the problem of node representation over-smoothness caused by a simple stacking graph neural network module to a certain extent.
Through practical verification, the attribute completion method provided by the invention has the characteristics of high efficiency and high accuracy.
Drawings
FIG. 1 is an overall flow chart of the present invention.
Fig. 2 is a framework diagram of the present invention.
Fig. 3 is a topology structural diagram one of the weights between nodes obtained in the present invention.
Fig. 4 is a topology structural diagram two of the weights between the nodes obtained in the present invention.
FIG. 5 is a flow chart of the present invention for obtaining high-order node characterization based on an attention mechanism.
Detailed Description
The invention will be further described with reference to the accompanying figures 1-5 and the specific embodiments.
Example 1:
an attribute completion method based on a graph neural network and an attention mechanism is shown in the overall flow chart of fig. 1. The method comprises the following steps:
s1: acquiring a heterogeneous attribute network with missing attributes, wherein nodes with missing attributes are called target nodes and nodes with complete attributes are called source nodes; a heterogeneous attribute network with missing attributes is defined as $G = (V, E, X)$, where $V$ represents the set of vertices in the graph, $E$ represents the set of edges, $X \in \mathbb{R}^{n \times d}$ represents the attribute matrix of the vertices, $n$ is the number of vertices, $d$ is the feature dimension of each vertex, and $M \in \{0, 1\}^{n \times d}$ is a mark matrix: if $M_{ij} = 0$, the attribute $X_{ij}$ is missing; otherwise it is complete;
s2: selecting the K source nodes most similar to the existing attributes of the target node with a K-nearest-neighbor algorithm for attribute completion in the feature space; a heterogeneous network typically includes a large number of nodes, and using the attributes of all source nodes to fill in the missing attributes of the target nodes is inefficient and impractical; therefore the K source nodes most similar to the existing attributes of the target node are selected by K-nearest neighbors, with cosine similarity (formula (1)) measuring the similarity of two attribute vectors:

$$\mathrm{sim}(x_u, x_v) = \frac{x_u \cdot x_v}{\|x_u\| \, \|x_v\|} \qquad (1)$$

where $\mathrm{sim}(x_u, x_v)$ indicates the similarity of the two nodes, $x_u$ represents the existing attributes of the target node, $x_v$ represents the attributes of the source node corresponding to the existing attributes of the target node, and a larger value indicates a higher similarity between the target node and the source node;
s3: after the K source nodes most similar to the target node are screened out by cosine similarity, each source node is given a learnable parameter that dynamically adjusts the weight of its attributes in completing the target node's attributes, yielding the weighted attribute characterization of the feature space; in a heterogeneous network many factors may affect how much a source node contributes to completing a target node's attributes, for example whether the nodes are directly connected and the weight of the connecting edge; the dynamic adjustment is performed according to formula (2):

$$h_u^F = \sum_{v \in \mathcal{N}_K(u)} \alpha_v x_v \qquad (2)$$

where $h_u^F$ is the feature-space characterization of the target node $u$, $\mathcal{N}_K(u)$ is the set of the $K$ source nodes most similar to the target node $u$, $\alpha_v$ is the learnable adjustment weight corresponding to each source node, and $x_v$ is the feature vector of the source node;
s4: aggregating the nodes directly connected to the target node in the heterogeneous attribute network to obtain the low-order characterization of the structural space, the aggregation being realized by a simplified graph neural network; after the feature learning of the target node in the feature space is completed, its characterization in the structural space must be learned, as shown in formula (3):

$$H^{(l+1)} = \hat{A} H^{(l)} W^{(l)} \qquad (3)$$

where $W^{(l)}$ is the weight matrix of the $l$-th layer of the simplified graph neural network, $H^{(l)}$ is the output of the $l$-th layer, and $\hat{A}$ is the adjacency matrix;
s5: obtaining the high-order nodes of the target node by random walk; in order to capture the high-order information of the target node in the heterogeneous network, a random walk is first used to obtain its high-order nodes; the random walk traverses the nodes in the graph in a manner combining the advantages of depth-first and breadth-first traversal, as shown in formula (4):

$$P(v_{i+1} = y \mid v_i = x) = \frac{w_{xy}}{\sum_{z \in N(x)} w_{xz}} \qquad (4)$$

where $w_{xy}$ denotes the weight of the edge from node $x$ to node $y$, and $N(x)$ is the neighbor set of node $x$;
For each target node, a random walk rooted at that node is performed to obtain a node sequence; since directly connected source nodes are already used in the simplified graph neural network, these nodes are deleted from the random-walk node sequence;
s6: giving different weights to the nodes in the high-order node sequence based on a transformer, obtaining the weighted high-order characterization of the structural space; the method comprises the following specific steps:
s6-1: first, linearly transforming the features of the target node and of the source-node sequence, then calculating the weights between the target node and the nodes in each sequence based on these linear transformations, and applying a softmax normalization to the obtained weights for numerical stability;
s6-2: then, applying to the node features in the source-node sequence a new linear transformation independent of the weight calculation;
S6-3: and finally, giving the weight to the node characteristics of each source node sequence after linear transformation, and accumulating the node characteristics to obtain the high-order node characterization of the target node, wherein the calculation method is shown as a formula (5):
Figure 738726DEST_PATH_IMAGE086
(5)
wherein
Figure 25351DEST_PATH_IMAGE087
Representing nodes
Figure 414744DEST_PATH_IMAGE088
By simplifying the characterization that the neural network of the graph aggregates,
Figure 761411DEST_PATH_IMAGE089
representing nodes
Figure 259692DEST_PATH_IMAGE088
Is characterized in that the pressure difference between the pressure sensor and the pressure sensor,
Figure 84429DEST_PATH_IMAGE090
Figure 328329DEST_PATH_IMAGE091
and
Figure 845898DEST_PATH_IMAGE092
a projection parameter matrix representing the parameters that can be learned,
Figure 808037DEST_PATH_IMAGE093
representing nodes
Figure 436465DEST_PATH_IMAGE088
High-order network characterization of (1).
s6-4: on this basis, the attention mechanism is expanded to a multi-head attention mechanism to capture multiple dependency relationships between the target node and the high-order source nodes; the multiple dependencies are then fed into an average pooling layer to obtain the final high-order network characterization, calculated as in formula (6):

$$h_u^H = \frac{1}{T} \sum_{t=1}^{T} h_u^{H,(t)} \qquad (6)$$

where $h_u^{H,(t)}$ denotes the high-order network characterization of node $u$ obtained by the $t$-th attention head, and $T$ represents the total number of attention computations required;
s7: concatenating the weighted attribute characterization of the feature space, the low-order characterization of the structural space, and the weighted high-order characterization of the structural space, feeding the result into a multilayer perceptron, and converting it into the attributes of the target node; the concatenation is computed as in formula (7):

$$\hat{x}_u = \mathrm{MLP}\big( h_u^F \,\Vert\, h_u^L \,\Vert\, h_u^H \big) \qquad (7)$$

where $\Vert$ represents the vector concatenation operation and $\hat{x}_u$ is the predicted attribute value of the target node $u$.
s8: adopting supervised learning: first artificially removing some attributes of some nodes, then completing them by reconstruction prediction, training continuously on the difference between the completed and real attributes, and finally using the trained model to complete the missing attribute values of the other nodes; specifically:
s8-1: after the predicted attributes of the target node are obtained, the existing attributes are retained and the predicted values are filled into the missing positions to complete the attribute completion task, as shown in formula (8):

$$\hat{X} = M \odot X + (\mathbf{1} - M) \odot X^{pred} \qquad (8)$$

where $\hat{X}$ represents the predicted attributes of all nodes after attribute completion, $X^{pred}$ the attributes predicted by the model, $\odot$ the Hadamard product, and $\mathbf{1}$ a matrix whose elements are all 1;
s8-2: a loss function based on the Euclidean distance formula is set to measure the gap between the predicted attributes and the real attributes, calculated as in formula (9):

$$\mathcal{L} = \sum_{u \in V_T} \big\| x_u - \hat{x}_u \big\|_2 \qquad (9)$$

where $V_T$ represents the set of target nodes, $x_u$ is the true value of the missing attributes of node $u$, and $\hat{x}_u$ is the predicted value of the missing attributes of node $u$.
Example 2: this example performs module design using the method of Example 1 as its basis.
An attribute completion system based on a graph neural network and an attention mechanism comprises a data preprocessing module, a mark matrix construction module, a feature space characterization learning module, a low-order neighbor characterization learning module, a high-order neighbor characterization learning module, a characterization fusion module and an attribute reasoning module, and as shown in fig. 2, the following describes each part in detail:
the data preprocessing module: firstly, normalizing attribute features in an original data set, then dividing the data into a training set, a testing set and a verification set, randomly removing attribute information of nodes in the training set, and recording the removed attribute information as a true value to guide a model to learn.
The mark matrix construction module: traverses the node attributes in the data set and marks the missing attribute values of the nodes, forming the attribute mark matrix $M$.
The feature space characterization learning module: selects the nodes similar to the target node in the attribute space, gives them weights, and sums their features to obtain the attribute-space characterization of the target node.
The low-order neighbor representation learning module: attributes of first-order neighbor nodes of the target node are aggregated by using the simplified graph neural network, and as shown in fig. 3, low-order neighbor representations of the target node are obtained.
The high-order neighbor representation learning module: aggregating the attributes of the higher-order neighbor nodes of the target node through random walk and a transformer-based attention mechanism, as shown in fig. 4 and 5, obtaining the higher-order neighbor representation of the target node.
The characterization fusion module: and fusing the feature space representation, the low-order neighbor node representation and the high-order neighbor node representation of the target node.
The attribute reasoning module: and predicting the fusion representation of the target node through a multilayer perceptron to obtain the prediction attribute of the target node, and filling the predicted attribute into the corresponding missing attribute through a marking matrix.
Example 3: this embodiment performs example verification based on the above method and system.
In order to verify the accuracy of the attribute completion model proposed by the invention, experiments were conducted on three data sets: DBLP (DataBase systems and Logic Programming bibliography network), ACM (Association for Computing Machinery network), and IMDb (Internet Movie Database), using Heat Kernel and Correlation as evaluation indices and comparing against seven existing models.
The seven existing models are: Matrix Completion (MC), Expectation Maximization (EM), Multi-Layer Perceptron (MLP), Support Vector Regression (SVR), Heterogeneous Graph Attention Network (HGAT), Adaptive Graph Convolutional Network (AGCN), and Heterogeneous Graph Neural Network via Attribute Completion (HGNN-AC).
TABLE 1 comparative experimental results
[Table 1 appears as an image in the original publication; its numerical values are not reproduced here.]
The final experimental results are shown in table 1, wherein AC-HEN is the method provided by the present invention. It can be seen that on three real data sets, Heat Kernel and Correlation of the attribute completion method provided by the invention are significantly higher than those of other methods, which means that the model constructed by the invention is superior to other existing models, and the accuracy of attribute completion is higher.
The above embodiments are merely examples of the present invention, and the scope of the present invention is not limited thereto; substitutions and changes that those skilled in the art can readily conceive within the technical scope disclosed herein are included in the present invention, whose protection scope is therefore defined by the claims.

Claims (9)

1. A heterogeneous network attribute completion method based on a graph neural network and an attention mechanism is characterized by comprising the following steps:
s1: acquiring a heterogeneous attribute network with missing attributes, wherein nodes with the missing attributes are called target nodes, and nodes with complete attributes are called source nodes;
s2: selecting the K source nodes most similar to the existing attributes of the target node with a K-nearest-neighbor algorithm for attribute completion in the feature space;
s3: after the K source nodes most similar to the target node are screened out by cosine similarity, giving each source node a learnable parameter that dynamically adjusts the weight of its attributes in completing the target node's attributes, obtaining the weighted attribute characterization of the feature space;
s4: aggregating the nodes directly connected to the target node in the heterogeneous attribute network to obtain the low-order characterization of the structural space, the aggregation being realized by a simplified graph neural network;
s5: firstly, obtaining a high-order node of a target node in a random walk mode;
s6: different weights are given to nodes in the high-order node sequence based on a transformer, and high-order representation of the structure space, which is given with the weights, is obtained;
s7: concatenating the weighted attribute characterization of the feature space, the low-order characterization of the structural space, and the weighted high-order characterization of the structural space, feeding the result into a multilayer perceptron, and converting it into the attributes of the target node;
s8: adopting a supervised learning mode: first artificially removing some attributes of some nodes, then completing the attributes through reconstruction prediction, continuously training on the difference between the completed attributes and the real attributes, and finally completing the missing attribute values of the other nodes with the trained model.
2. The attribute completion method according to claim 1, wherein in S1, a heterogeneous network with missing attributes is defined as $G = (V, E, X)$, where $V$ represents the set of vertices in the graph, $E$ represents the set of edges, $X \in \mathbb{R}^{n \times d}$ represents the attribute matrix of the vertices, $n$ is the number of vertices, $d$ is the feature dimension of each vertex, and $M \in \{0, 1\}^{n \times d}$ is a mark matrix: when $M_{ij} = 0$, the attribute $X_{ij}$ is missing; conversely, the corresponding attribute in $X$ is complete.
3. The method according to claim 1, wherein in S2, K-nearest neighbors is first used to select the K source nodes most similar to the existing attributes of the target node for attribute completion in the feature space, with cosine similarity measuring the similarity of two attribute vectors:

$$\mathrm{sim}(x_u, x_v) = \frac{x_u \cdot x_v}{\|x_u\| \, \|x_v\|} \qquad (1)$$

where $\mathrm{sim}(x_u, x_v)$ indicates the similarity of the two nodes, $x_u$ represents the existing attributes of the target node, $x_v$ represents the attributes of the source node corresponding to the existing attributes of the target node, and a larger value indicates a higher similarity between the target node and the source node.
4. The attribute completion method according to claim 1, wherein in S3, the dynamic adjustment is performed according to formula (2):

$$h_u^F = \sum_{v \in \mathcal{N}_K(u)} \alpha_v x_v \qquad (2)$$

where $h_u^F$ is the feature-space characterization of the target node $u$, $\mathcal{N}_K(u)$ is the set of the $K$ source nodes most similar to the target node $u$, $\alpha_v$ is the learnable adjustment weight corresponding to each source node, and $x_v$ is the feature vector of the source node.
5. The attribute completion method according to claim 1, wherein in S4, after the feature learning of the target node in the feature space is completed, its characterization needs to be learned in the structural space, as shown in formula (3):

$$H^{(l+1)} = \hat{A} H^{(l)} W^{(l)} \qquad (3)$$

where $W^{(l)}$ is the weight matrix of the $l$-th layer of the simplified graph neural network, $H^{(l)}$ is the output of the $l$-th layer, and $\hat{A}$ is the adjacency matrix.
6. The attribute completion method according to claim 1, wherein S5 is specifically: in order to capture the high-order information of the target node in the heterogeneous network, the high-order nodes of the target node are obtained by random walk, which traverses the nodes in the graph in a manner combining the advantages of depth-first and breadth-first traversal, as shown in formula (4):

$$P(v_{i+1} = y \mid v_i = x) = \frac{w_{xy}}{\sum_{z \in N(x)} w_{xz}} \qquad (4)$$

where $w_{xy}$ denotes the weight of the edge from node $x$ to node $y$, and $N(x)$ is the neighbor set of node $x$.
7. The attribute completion method according to claim 1, wherein the S6 is specifically:
s6-1: first, linearly transforming the features of the target node and of the source-node sequence, then calculating the weights between the target node and the nodes in each sequence based on these linear transformations, and applying a softmax normalization to the obtained weights for numerical stability;
s6-2: then, applying to the node features in the source-node sequence a new linear transformation independent of the weight calculation;
s6-3: finally, multiplying each linearly transformed source-node feature by its weight and accumulating the results to obtain the high-order node characterization of the target node, calculated as in formula (5):

$$h_u^H = \sum_{v \in S_u} \alpha_{uv} \, W_V z_v, \qquad \alpha_{uv} = \underset{v \in S_u}{\mathrm{softmax}}\big( (W_Q z_u)^{\top} (W_K z_v) \big) \qquad (5)$$

where $z_u$ denotes the characterization of node $u$ aggregated by the simplified graph neural network, $z_v$ denotes the characterization of a node $v$ in the random-walk sequence $S_u$, $W_Q$, $W_K$ and $W_V$ are learnable projection parameter matrices, and $h_u^H$ denotes the high-order network characterization of node $u$;
s6-4: expanding the attention mechanism to a multi-head attention mechanism to capture multiple dependency relationships between the target node and the high-order source nodes; the multiple dependencies are then fed into an average pooling layer to obtain the final high-order network characterization, calculated as in formula (6):

$$h_u^H = \frac{1}{T} \sum_{t=1}^{T} h_u^{H,(t)} \qquad (6)$$

where $h_u^{H,(t)}$ denotes the high-order network characterization of node $u$ obtained by the $t$-th attention head, and $T$ indicates the total number of attention computations required.
8. The attribute completion method according to claim 1, wherein in S7, the concatenation is computed as shown in formula (7):
$$\hat{x}_u = \mathrm{MLP}\big( h_u^F \,\Vert\, h_u^L \,\Vert\, h_u^H \big) \qquad (7)$$

where $\Vert$ represents the vector concatenation operation and $\hat{x}_u$ is the predicted attribute value of the target node $u$.
9. The attribute completion method according to claim 1, wherein the S8 is specifically:
s8-1: after the predicted attributes of the target node are obtained, the existing attributes are retained and the predicted values are filled into the missing positions to complete the attribute completion task, as shown in formula (8):

$$\hat{X} = M \odot X + (\mathbf{1} - M) \odot X^{pred} \qquad (8)$$

where $\hat{X}$ represents the predicted attributes of all nodes after attribute completion, $X^{pred}$ the attributes predicted by the model, $\odot$ the Hadamard product, and $\mathbf{1}$ a matrix whose elements are all 1;
s8-2: a loss function based on the Euclidean distance formula is set to measure the gap between the predicted attributes and the real attributes, calculated as in formula (9):

$$\mathcal{L} = \sum_{u \in V_T} \big\| x_u - \hat{x}_u \big\|_2 \qquad (9)$$

where $V_T$ represents the set of target nodes, $x_u$ is the true value of the missing attributes of node $u$, and $\hat{x}_u$ is the predicted value of the missing attributes of node $u$.
CN202211043710.3A 2022-08-30 2022-08-30 Heterogeneous network attribute completion method based on graph neural network and attention mechanism Active CN115130663B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211043710.3A CN115130663B (en) 2022-08-30 2022-08-30 Heterogeneous network attribute completion method based on graph neural network and attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211043710.3A CN115130663B (en) 2022-08-30 2022-08-30 Heterogeneous network attribute completion method based on graph neural network and attention mechanism

Publications (2)

Publication Number Publication Date
CN115130663A (en) 2022-09-30
CN115130663B (en) 2023-10-13

Family

ID=83388013

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211043710.3A Active CN115130663B (en) 2022-08-30 2022-08-30 Heterogeneous network attribute completion method based on graph neural network and attention mechanism

Country Status (1)

Country Link
CN (1) CN115130663B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115759199A (en) * 2022-11-21 2023-03-07 山东大学 Multi-robot environment exploration method and system based on hierarchical graph neural network

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113094484A (en) * 2021-04-07 2021-07-09 西北工业大学 Text visual question-answering implementation method based on heterogeneous graph neural network
WO2021179640A1 (en) * 2020-03-10 2021-09-16 深圳大学 Graph model-based short video recommendation method, intelligent terminal and storage medium
WO2021179838A1 (en) * 2020-03-10 2021-09-16 支付宝(杭州)信息技术有限公司 Prediction method and system based on heterogeneous graph neural network model
CN114692867A (en) * 2022-03-24 2022-07-01 大连理工大学 Network representation learning algorithm combining high-order structure and attention mechanism
CN114723037A (en) * 2022-02-25 2022-07-08 上海理工大学 Heterogeneous graph neural network computing method for aggregating high-order neighbor nodes

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021179640A1 (en) * 2020-03-10 2021-09-16 深圳大学 Graph model-based short video recommendation method, intelligent terminal and storage medium
WO2021179838A1 (en) * 2020-03-10 2021-09-16 支付宝(杭州)信息技术有限公司 Prediction method and system based on heterogeneous graph neural network model
CN113094484A (en) * 2021-04-07 2021-07-09 西北工业大学 Text visual question-answering implementation method based on heterogeneous graph neural network
CN114723037A (en) * 2022-02-25 2022-07-08 上海理工大学 Heterogeneous graph neural network computing method for aggregating high-order neighbor nodes
CN114692867A (en) * 2022-03-24 2022-07-01 大连理工大学 Network representation learning algorithm combining high-order structure and attention mechanism

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HONGYAN CUI等: ""Self-training method based on GCN for semi-supervised short text classification"", 《INFORMATION SCIENCES》 *
DING Yu; WEI Hao; PAN Zhisong; LIU Xin: "A Survey of Network Representation Learning Algorithms", Computer Science, no. 09
YANG Baosheng: "Personalized News Recommendation Based on Attention Mechanism Enhanced Graph Convolutional Neural Networks", Journal of Lanzhou University of Arts and Science (Natural Science Edition), no. 05

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115759199A (en) * 2022-11-21 2023-03-07 山东大学 Multi-robot environment exploration method and system based on hierarchical graph neural network
CN115759199B (en) * 2022-11-21 2023-09-26 山东大学 Multi-robot environment exploration method and system based on hierarchical graph neural network

Also Published As

Publication number Publication date
CN115130663B (en) 2023-10-13

Similar Documents

Publication Publication Date Title
CN112529168B (en) GCN-based attribute multilayer network representation learning method
CN109389151B (en) Knowledge graph processing method and device based on semi-supervised embedded representation model
CN113961759B (en) Abnormality detection method based on attribute map representation learning
CN113255895B (en) Structure diagram alignment method and multi-diagram joint data mining method based on diagram neural network representation learning
CN114565053B (en) Deep heterogeneous graph embedded model based on feature fusion
Choi et al. Identifying emerging technologies to envision a future innovation ecosystem: A machine learning approach to patent data
CN115859793A (en) Attention-based method and system for detecting abnormal behaviors of heterogeneous information network users
CN112784118A (en) Community discovery method and device in graph sensitive to triangle structure
CN115130663B (en) Heterogeneous network attribute completion method based on graph neural network and attention mechanism
Ma et al. Class-imbalanced learning on graphs: A survey
CN115699058A (en) Feature interaction through edge search
Du et al. Image recommendation algorithm combined with deep neural network designed for social networks
CN114936307A (en) Method for constructing normal graph model
CN115114484A (en) Abnormal event detection method and device, computer equipment and storage medium
Sheng et al. Personalized recommendation of location-based services using spatio-temporal-aware long and short term neural network
CN117408336A (en) Entity alignment method for structure and attribute attention mechanism
CN117093928A (en) Self-adaptive graph node anomaly detection method based on spectral domain graph neural network
CN115545833A (en) Recommendation method and system based on user social information
Tang et al. Hypergraph structure inference from data under smoothness prior
Zhou et al. A structure distinguishable graph attention network for knowledge base completion
CN113744023A (en) Dual-channel collaborative filtering recommendation method based on graph convolution network
Chen et al. Semi-supervised heterogeneous graph learning with multi-level data augmentation
CN116702214B (en) Privacy data release method and system based on coherent proximity and Bayesian network
CN116402589B (en) Commodity recommendation method and system based on knowledge graph and rotary coding
CN114936296B (en) Indexing method, system and computer equipment for super-large-scale knowledge map storage

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant