CN115310594A

CN115310594A - Method for improving expandability of network embedding algorithm

Info

Publication number: CN115310594A
Application number: CN202210953351.9A
Authority: CN
Inventors: 陈东明; 谢飞; 张陛圣; 聂铭硕; 王冬琦
Original assignee: Northeastern University China
Current assignee: Northeastern University China
Priority date: 2022-08-10
Filing date: 2022-08-10
Publication date: 2022-11-08

Abstract

The invention discloses a method for improving the expandability of a network embedding algorithm, which relates to the field of network representation learning; the method is suitable for large-scale graph network representation learning, the expandability of network embedding is improved and the embedding quality is improved through graph fusion, graph coarsening, graph embedding and embedding refinement operations, and the embedding result can be used for graph downstream tasks such as network role discovery, social recommendation system and social influence prediction. Labels of super nodes in the coarsest graph are obtained by calculating a label matrix of the original graph and a mapping matrix of the original graph and the coarsest graph and participate in model training as training labels of a supervised graph embedding algorithm represented by GCN, and the problem that the supervised graph embedding algorithm cannot be processed by an existing multi-layer strategy is solved. The method effectively improves the capability of processing a large-scale network by using the graph embedding algorithm, and simultaneously, the expandability of the method is verified by performing experiments on a large-scale graph data set Friendster.

Description

Method for improving expandability of network embedding algorithm

Technical Field

The invention belongs to the field of network representation learning, and relates to a method for improving the expandability of a network embedding algorithm.

Background

In recent years, network embedding has attracted great attention because it is widely applied to a series of tasks such as network role discovery, social recommendation system, and social influence prediction. Although these new embedding methods tend to have obvious advantages over the traditional methods, the current graph embedding methods still have some drawbacks in terms of accuracy and scalability.

On one hand, random walk-based embedding algorithms such as deep walk and node2vec only carry out network embedding based on the topological structure of the network under the condition that the node attribute characteristics are not included, and therefore the embedding capability of the deep walk and node2vec is greatly limited. Subsequently, based on the concept that node embedding is smooth across the entire graph, a graph convolution network GCN has emerged, which, although simplifying graph convolution at each layer with topology and node feature information, may be affected by high frequency noise in the initial node features, which affects the embedding quality.

On the other hand, most embedding algorithms are computationally expensive and are typically memory intensive and difficult to scale to large network datasets (e.g., networks with over 100 million nodes). For almost all network embedding algorithms, how to apply the embedding algorithms under large-scale networks such as real social networks, communication networks or citation networks is always a key problem. The Graph Neural Network (GNN) is no exception, and it is difficult to extend the graph neural network because in the case of large data volume, many core computation steps require a considerable time overhead, for example, graphcage needs to collectively aggregate feature information from neighborhoods, and when there are multiple superimposed GNN layers, the final embedding vector of one node involves computing a large number of intermediate embedding from its neighboring nodes, which not only results in a drastic increase in the computation amount between nodes, but also results in a high memory usage rate for storing intermediate results. This occurs because the network is not common euclidean data, the neighborhood structure of each node is not the same, and therefore it cannot be directly applied to batch processing, and the laplacian of the graph is difficult to compute when there are millions of nodes and edges. It can be said that scalability will determine whether the network embedding algorithm can be applied to large networks in the real world.

Disclosure of Invention

In order to solve the problems in the prior art, the invention provides a method for improving the expandability of a network embedding algorithm, which is suitable for large-scale graph network representation learning, improves the expandability of network embedding and improves the embedding quality through graph fusion, graph coarsening, graph embedding and embedding refinement operations, and the embedding result can be used for graph downstream tasks such as network role discovery, social recommendation system, social influence prediction and the like.

In order to achieve the purpose, the technical scheme adopted by the invention is as follows:

a method for improving the expandability of a network embedding algorithm comprises the steps of graph fusion, graph coarsening, graph embedding and embedding refinement, and comprises the following steps:

step 1: for an original graph

The original graph is an undirected graph, the node characteristic matrix of the undirected graph is converted into a characteristic graph and is fused with the original topology of the original graph, and A is calculated _fusion ＝f(A _topo And X), wherein,

represented as a contiguous matrix of the array of pixels,

is represented as a matrix of node characteristics,

representing weighted graphs

The adjacency matrix of (a);

and 2, step: using hybrid coarsening to transform an original graph

Coarsening to

Is a graph which is subjected to first coarsening,

is the final coarsest graph after m coarsening;

and step 3: in the coarsest figure

Executing a graph embedding method g (-) to obtain an embedding result epsilon;

and 4, step 4: obtaining an original graph according to the embedding result epsilon obtained in the step 3

Is embedded in ₀ 。

The step 1 specifically comprises the following steps:

step 1.1: original graph for | V | nodes

Its adjacency matrix is expressed as

Its node feature matrix is

Wherein K represents the dimension of the feature vector of the corresponding node;

step 1.2: generating a k nearest neighbor graph according to the L2-norm between the attribute vectors of each node pair by using a local spectral clustering algorithm, and converting the original graph into a new graph

Is converted into a node feature map

Wherein the L2-norm is the Euclidean distance;

step 1.3: from the original graph

The cosine similarity between the attribute vectors of any two nodes assigns a weight to each edge of the k-nearest neighbor graph, i.e.

Wherein, X _i,: And X _j,: Is the attribute vector for nodes i and j;

step 1.4: combining the topological graph and the attribute graph, and constructing a fusion graph through weighting, wherein the fusion graph is shown as a formula (1):

A _fusion ＝A _topo +βA _feat (1)

wherein beta is used to balance topology information and node characteristic information during the fusion process,

representing weighted graphs

Of a contiguous matrix of _feat Is a node characteristic matrix of the k nearest neighbor graph.

The generation of the k nearest neighbor graph in the step 1.2 specifically includes the following steps:

step 1.2.1: taking a random graph signal x as a random vector, and representing by combining a linear combination of a feature vector u of a graph Laplacian;

step 1.2.2: filtering out a high-frequency component of a random graph signal x by adopting a low-pass graph filter, wherein the high-frequency component of the random graph signal x is a feature vector corresponding to a high feature value of a graph Laplacian operator, and obtaining a smooth vector by applying a smoothing function to the random graph signal x

As shown in formula (2):

step 1.2.3: solving linear equations with Gauss-Seidel iteration

Obtaining T initial random vectors T = (x: (x)) ¹ ),...,x ^(t) ) Wherein

representing the original picture

A laplacian matrix of;

step 1.2.4: embedding each node into a T-dimensional space based on an initial random vector T, and calculating low-dimensional embedded vectors of a node p and a node q

And

similarity, if the similarity meets a similarity threshold, the node p and the node q are similar nodes;

the node similarity is determined by the spectrum node affinity of adjacent nodes p and q, and is shown in formula (3):

wherein:

wherein, a _p,q For the spectral node affinities of neighboring nodes p and q,

embedding vectors for the kth low dimension of node p

Embedding vectors for the kth low dimension of node q

The similarity threshold is set to be greater than 60%;

step 1.2.5: iterating the step 1.2.1 to the step 1.2.4, and further aggregating the aggregated node set as a super node until the original graph is obtained

And if the node similarity between any two nodes does not meet the similarity threshold, determining a final node cluster, selecting the first k nearest neighbors in each cluster, and constructing a k nearest neighbor graph.

The step 2 is a graph coarsening method, which specifically comprises the following steps:

step 2.1: inputting an original graph

And setting the total coarsening layer number m, wherein i is more than or equal to 0 and less than or equal to m-1;

step 2.2: by projecting a matrix M _i,i+1 Coarsening information is stored for coarsening a plurality of nodes into supernodes in a hybrid coarsening, wherein,

step 2.3: graph built at i +1 level by matrix operation

Adjacent matrix A of _i+1 Calculating A _i+1 ＝M _i,i+ ₁ ^T A _i M _i,i+1 Wherein M is _i,i+1 Is derived from

To

A mapping matrix of _i Is composed of

The adjacency matrix of (a);

step 2.4: calculate the map at the i-1 level

Second-order neighbor coarsening mapping matrix of (1)

Mark matrix

Initializing a node for storing a first-order neighbor coarsening map

Pair V in ascending order of node degree _i-1 Sorting is carried out;

step 2.5: if v and u are not marked and u is a neighbor node of v, finding a first-order coarsening node u of v according to the information interaction probability t _i,j ，

Finally marking the node u and the node v;

step 2.6: based on

And

calculating a mapping matrix M by matrix operation _i-1,i I.e. if it is to

The nodes a, b and c in the node B are coarsened into

D, then the matrix M is mapped _i-1,i The value of the row a, the row d, the column b, the row c and the column d in the table is 1, otherwise, the value is 0; at M _i-1,i Each column in the column represents a super node in the next layer of graph, the node value of the coarse super node in the column is 1, and the others are 0;

step 2.7: is calculated to obtain

Adjacent matrix A of _i As shown in formula (5):

A _i ＝M _i-1,i ^T A _i-1 M _i-1,i (5)

wherein, M _i-1,i To be driven from

To

A mapping matrix of _i-1 Is composed of

The adjacency matrix of (a);

step 2.8: repeating the step 2.5 to the step 2.7, and traversing all the nodes after the ascending sequence;

step 2.9: repeating the step 2.1 to the step 2.8 until the total coarsening layer number m is reached;

step 2.10: obtaining i is more than or equal to 0 and less than or equal to m-1 times of coarsening

And M _i,i+1 。

The information interaction probability t in the step 2.5 _i,j Measuring the similarity between the nodes and determining whether to combine the nodes, comprising the following steps:

step 2.5.1: traverse the original graph

All node pairs in;

step 2.5.2: for the original picture

Carrying out second-order neighbor coarsening, and merging two nodes with the same neighbor node set;

step 2.5.3: traversing the coarsened graph of the second-order neighbor;

step 2.5.4: calculating the information interaction probability t between node pairs _i,j As shown in formula (6):

wherein,

is the weight of the edge between node i and node j, d _i Degree of node i, d _j Degree of node j;

step 2.5.5: first order neighbor coarsening is performed on the graph, and the graph is not coarsened and t _i,j The largest neighbor node will merge.

The step 3 is graph embedding, and specifically comprises the following steps:

step 3.1: in the coarsest figure

Upper application graph embedding g (·);

step 3.2: if the graph embedding method g (-) is an unsupervised embedding method, the coarsest graph and the original graph are expanded

The corresponding relation between the node labels and the characteristics;

step 3.3: if graph embedding method g (-) is a supervised embedding method, no extension is needed.

The graph embedding method g (-) in the step 3.2 is an unsupervised embedding method, and specifically comprises the following steps:

step 3.2.1: according to the original diagram

And constructing an original graph by using Lb node labels

Tag matrix of

Each row of the matrix is a label vector in the form of one-hot node;

step 3.2.2: constructing an original graph

Mapping matrix M to the coarsest graph _0,i I.e. M _0,i ＝M _0,1 M _1,2 …M _i,i-1 M _0,i ；M _0,i The rows in the matrix are in a one-hot form, and the columns are coarsened nodes;

step 3.2.3: by the label matrix Lb ₀ Obtaining labels of the coarsened nodes, taking mode of each column of labels as the labels of the coarsened nodes to obtain a label matrix

Step 3.2.4: original graph

Node attribute feature F ₀ Original picture

Feature mapping matrix M to the coarsest graph _0,i (ii) a First to M _0,i Normalization is carried out, and then a characteristic matrix of the coarsest graph is obtained through matrix operation

The step 4 specifically comprises the following steps:

step 4.1: using a local refinement process driven by gihonov regularization, node embedding on the graph is smoothed by minimizing the formula, as shown in equation (7):

wherein L is _i Is a normalized laplacian matrix of the signal,

for embedding for smoothing the i-th layer,. Epsilon _i Embedding for the ith layer;

and 4.2: taking the derivative of the objective function in the equation in step 4.1 equal to 0, as shown in equation (8):

wherein epsilon _i For the embedding of the ith layer, I is the identity matrix, L _i Is a normalized laplacian matrix of the signal,

embedding for smoothing the ith layer;

step 4.3: the mapped embedded matrix is smoothed using a low-pass filter, as shown in equation (9):

wherein,

a degree matrix is normalized twice for the ith layer,

the normalized adjacency matrix for the ith layer,

to be driven from

To

Projection matrix of epsilon _i+1 Embedding the (i + 1) th layer;

step 4.4: the original map is obtained by iterating step 4.2

Is embedded in ₀ 。

Adopt the produced beneficial effect of above-mentioned technical scheme to lie in:

1. the invention provides a method for improving the expandability of a network embedding algorithm. And carrying out graph coarsening by using a mixed coarsening strategy based on second-order neighbor coarsening and first-order neighbor coarsening, and carrying out embedding refinement by using an effective graph filter after calculating and embedding on the coarsest graph. And a large number of experiments prove that the algorithm has more advanced performance compared with other multi-layer strategies.

2. The labels of the super nodes in the coarsest graph are obtained through the calculated label matrix of the original graph and the mapping matrix of the original graph and the coarsest graph, and the labels are used as training labels of a supervised graph embedding algorithm represented by GCN to participate in model training, so that the problem that the supervised graph embedding algorithm cannot be processed by the existing multilayer strategy is solved.

3. The expandability of the graph embedding algorithm is improved. The strategy is irrelevant to the bottom map embedding algorithm, and the capacity of the map embedding algorithm for processing a large-scale network is improved. Finally, the expandability of the strategy is proved by performing real verification on a large-scale graph data set Friendster.

Drawings

FIG. 1 is a schematic diagram of a method for improving the scalability of a network embedding algorithm according to the present invention;

FIG. 2 is a schematic diagram of a hybrid coarsening strategy in a graph coarsening stage of the present invention;

FIG. 3 is a schematic diagram of Micro-F1 values under different coarsening layers after the invention combines the baseline algorithm embedded by various graphs;

FIG. 4 is a comparison graph of CPU time at different coarseness layers after the invention incorporates the baseline algorithm embedded by each type of graph;

FIG. 5 is a schematic diagram of the Micro-F1 values under different coarsening layers for the scalability experiment on a large-scale network according to the present invention;

FIG. 6 is a graph of node number and edge number changes in a coarsened graph, obtained from a scalability experiment on a large-scale network according to the present invention;

FIG. 7 is a system use diagram of a prototype system developed in the present invention;

FIG. 8 is a schematic diagram of system functional modules of a prototype system developed in the present invention;

FIG. 9 is a timing diagram of a model training lifecycle of a prototype system developed in the present invention;

FIG. 10 is a timing diagram of a data set upload for a prototype system developed in the present invention;

FIG. 11 is a data set upload interface screenshot of a prototype system developed in the present invention;

FIG. 12 is a static web data source introduction interface screenshot of a prototype system developed in the present invention;

FIG. 13 is a static network data detail introduction interface screenshot of a prototype system developed in the present invention;

FIG. 14 is a screenshot of a model training module interface of a prototype system developed in the present invention;

FIG. 15 is a screenshot of a node classification task result interface of the prototype system developed in the present invention;

FIG. 16 is a resource occupancy interface screenshot of the current server of the prototype system developed in the present invention.

Detailed Description

The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.

The embodiment provides a method for improving the extensibility of a network embedding algorithm, as shown in fig. 1, which includes the following steps:

step 1: for an original graph

The original graph is an undirected graph, the node characteristic matrix of the undirected graph is converted into a characteristic graph and is fused with the original topology of the original graph to calculate A _fusion ＝f(A _topo And X), wherein,

represented as a contiguous matrix of the array of pixels,

represented as a matrix of the characteristics of the nodes,

representing weighted graphs

The adjacency matrix of (a); the method specifically comprises the following steps:

step 1.1: original graph for | V | nodes

Its adjacency matrix is expressed as

Its node feature matrix is

Is converted into a node feature map

Wherein the L2-norm is the Euclidean distance;

As shown in formula (1):

step 1.2.3: solving a system of linear equations using Gauss-Seidel iterations

To obtain T initial random vectors T = (x) ⁽¹⁾ ,...,x ^(t) ) Wherein

representing the original picture

A laplacian matrix of;

And

similarity, if the similarity meets a similarity threshold, the node p and the node q are similar nodes, and the node similarity is determined by the spectrum node affinity of the adjacent nodes p and q, as shown in formula (2);

wherein:

wherein, a _p,q For the spectral node affinities of neighboring nodes p and q,

embedding vectors for the kth low dimension of node p

Embedding vectors for the kth low dimension of node q

The similarity threshold is set to be greater than 60%;

Determining a final node cluster if no node similarity between any two nodes meets a similarity threshold, selecting the first k nearest neighbors in each cluster, and constructing a k nearest neighbor graph;

step 1.3: according to the original diagram

Wherein X _i,: And X _j,: Is the attribute vector for nodes i and j;

step 1.4: combining the topological graph and the attribute graph, and constructing a fusion graph through weighting, wherein the formula (4) is as follows:

A _fusion ＝A _topo +βA _feat (4)

wherein beta is used for balancing topology structure information and node characteristic information in the fusion process,

representing weighted graphs

Of a neighboring matrix of _feat A node feature matrix of the k nearest neighbor graph is obtained;

step 2: using hybrid coarsening to transform an original graph

Coarsening to

Is a graph which is subjected to the coarsening for the first time,

is the final coarsest graph after m coarsening;

in this embodiment, the original graph is coarsened by a hybrid coarsening strategy

Iteratively coarsening into smaller graphs; the two-stage neighbor coarsening strategy can effectively coarsen the graph and simultaneously reserve the global structure; as shown in fig. 2;

second-order neighbor coarsening; because the nodes sharing similar adjacent nodes have higher second-order similarity, two nodes with the same neighbor node set are merged; as shown in FIG. 2, node v ₁ ，v ₂ And v ₃ Will be merged since they are both connected to node v ₄ And has a common neighbor set v ₄ V node ₅ Unlike their neighbor set;

first-order neighbor coarsening; after the second-order coarsening, a plurality of non-coarsened and adjacent nodes still exist, and the nodes also have high probability to carry out information exchange so as to have similar characteristics, so that the strategy measures whether two non-coarsened adjacent nodes should be merged or not through the information exchange probability; if at node v _i Of neighbors, slave node v _i To node v _j Information propagation probability of

Maximum, then node v _j Is node v _i Most probabilistically spread the information, but this does not mean that node v is _j Also most probable to node v _i Information is propagated; the strategy uses the information interaction probability t _i,j The similarity of the two nodes is measured, and is shown in a formula (5); not coarsened and t _i,j The largest neighbor node will be merged; as shown in FIG. 2, after the second order folding, the weights

Degree of node d ₇ ＝4，d ₈ ＝2，d ₉ =2, so t _7,8 =1/8 and t _8,9 =1/4, therefore node v ₈ And v ₉ Will be merged;

the method specifically comprises the following steps:

step 2.1: inputting an original graph

step 2.2: by projecting a matrix M _i,i+1 Saving coarsening information for coarsening a plurality of nodes into supernodes in the hybrid coarsening, wherein,

step 2.3: constructing graphs at the i +1 level by matrix operations

Adjacent matrix A of _i+1 Calculating A _i+1 ＝M _i,i+ ₁ ^T A _i M _i,i+1 Wherein M is _i,i+1 To be driven from

To

Of a mapping matrix of _i Is shown as a drawing

The adjacency matrix of (a);

step 2.4: calculate the map at the i-1 level

Second-order neighbor coarsening mapping matrix of (1)

Mark matrix

Initializing a node for storing a first-order neighbor coarsening mapping

Pair V in ascending order of node degree _i-1 Sorting is carried out;

Finally marking the node u and the node v;

the information interaction probability t in the step 2.5 _i,j Measuring between nodesSimilarity, whether to merge nodes is determined, and the method comprises the following steps:

step 2.5.1: traverse the original graph

All node pairs in;

step 2.5.2: to the picture

Performing second-order neighbor coarsening, and merging two nodes with the same neighbor node set;

step 2.5.3: traversing the coarsened graph of the second-order neighbor;

step 2.5.4: calculating information interaction probability t between node pairs _i,j As shown in formula (6):

wherein,

step 2.5.5: first order neighbor coarsening is performed on the graph, and the graph is not coarsened and t _i,j The largest neighbor node will be merged;

step 2.6: based on

And

calculating mapping matrix M by matrix operation _i-1,i I.e. if it is to

The nodes a, b and c in the network are coarsened into

D, then the matrix M is mapped _i-1,i The value of the row a, the row d, the row b, the row c and the column d is 1, otherwise, the value is 0; at M _i-1,i Each column in the column represents a super node in the next layer of graph, the node value of the coarse super node in the column is 1, and the others are 0;

step 2.7: is calculated to obtain

Adjacent matrix A of _i As shown in formula (7):

A _i ＝M _i-1,i ^T A _i-1 M _i-1,i (7)

wherein M is _i-1,i To be driven from

To

A mapping matrix of _i-1 Is shown as a drawing

The adjacency matrix of (a);

And M _i,i+1 ；

And step 3: in the coarsest figure

Executing a graph embedding method g (-) to obtain an embedding result epsilon;

the method specifically comprises the following steps:

step 3.1: in the coarsest figure

Upper application graph embedding g (·);

step 3.2: if the graph embedding method g (-) is an unsupervised embedding method, the corresponding relation of the node labels and the characteristics between the coarsest graph and the original graph is expanded;

the step 3.2 is an unsupervised embedding method for the graph embedding method g (-) and specifically comprises the following steps:

step 3.2.1: from the original graph

And constructing an original graph by using Lb node labels

Tag matrix of

Each row of the matrix is a label vector in the form of one-hot node;

step 3.2.2: constructing a mapping matrix M from an original image to a coarsest image _0,i I.e. M _0,i ＝M _0,1 M _1,2 …M _i,i-1 M _0,i ；M _0,i The rows in the matrix are in a one-hot form, and the columns are coarsened nodes;

Step 3.2.4: node attribute feature F of original graph ₀ Feature mapping matrix M from original graph to coarsest graph _0,i (ii) a First to M _0,i Normalization is carried out, and then a characteristic matrix of the coarsest graph is obtained through matrix operation

Step 3.3: if the graph embedding method g (-) is a supervised embedding method, no expansion is needed;

network embedding is carried out on the coarsest graph, so that the global structure of the original graph can be visually captured; performing m times of iterative coarsening on the graph, and performing coarseness on the coarsest graph

Apply graph embedding method g (·); will be provided with

Is represented by epsilon _m Thus, therefore, it is

Since the invention is independent of the graph embedding method employed, any graph embedding algorithm can be used for basic embedding;

it is worth noting that the research of the original multilayer method is that the supervised embedding algorithms such as GCN and the like cannot be processed, because the corresponding relation between the node labels and the characteristics between the coarsest graph and the original graph cannot be found, the invention widens the limit through the matrix operation;

Is embedded in ₀ ；

The step 4 is graph thinning, namely, the embedding is thinned back to the original graph through a low-pass graph filter, and meanwhile, the smoothness of the embedding on the original graph is ensured; the method specifically comprises the following steps:

step 4.1: using a local refinement process driven by gihonov regularization, the node embedding on the graph is smoothed by a minimization formula, as shown in equation (8):

wherein L is _i Is a normalized laplacian matrix of the signal,

and 4.2: taking the derivative of the objective function in the equation in step 4.1 equal to 0, as shown in equation (9):

embedding for smoothing the ith layer;

step 4.3: the mapped embedded matrix is smoothed using a low-pass filter, as shown in equation (10):

wherein,

a degree matrix is normalized twice for the ith layer,

the normalized adjacency matrix for the ith layer,

to be driven from

To

Projection matrix of epsilon _i+1 Embedding the (i + 1) th layer;

step 4.4: the embedding epsilon of the original graph is obtained by iterating step 4.2 ₀ 。

To illustrate the effectiveness of the method of the invention, we performed the following experiments:

experimental analysis comparing various kinds of graph-embedded baseline algorithms, the experimental results are shown in fig. 3-4 in the accuracy and time of use under different coarsening layers, and the method comprises the following steps:

step S1: experimental selection of a Pubmed data set;

step S1.1: selecting a baseline algorithm comprising Deepwalk, node2vec, graRep and NetMF;

step S1.2: setting a hyper-parameter; all basic embedding methods, the embedding dimension d is set to 128; each node of the Deepwalk and node2vec algorithm performs 10 random walks, the walk length is 80, the window size is 10, the return parameter p =1.0 in the node2vec, and the in-out parameter q =0.5;

step S1.3: after the method and the baseline algorithm embedded in various graphs are operated, the Micro-F1 values and the time consumption under different coarsening layers are obtained; the calculation process of the Micro-F1 is as follows:

step S1.4: CPU time process _ time (the value of the sum of the system and user CPU time of the current process) for different algorithms, this time value is commonly used for the testing of code time, and the reported time includes all phase times of the model;

step S1.5: visualizing the Micro-F1 value and the use time, and carrying out comparative analysis;

FIG. 3 shows the embedded baseline algorithm of various graphs after the method of the invention (MLAGE) is combined with the Micro-F1 values and the time of use under different coarsening layers, where the baseline algorithm includes deep walk, node2vec, graRep and NetMF, and the experimental use is the Pubmed data set;

fig. 3 (a) shows the improvement of the embedding quality by MLAGE, the abscissa is the number of coarsening layers, the ordinate is the Micro-F1 value, and the number of layers equal to 0 represents the original embedding algorithm without MLAGE, and it can be observed that the embedding after MLAGE is always optimal in all methods; FIG. 3 (b) can see the effect of MLAGE on runtime, with the abscissa being the number of coarsening layers and the ordinate being the execution time, where CPU time is used as the execution time, and the execution time of MLAGE is the sum of CPU time at each stage; it can be seen that as the number of coarsening layers increases, the time consumption almost decreases exponentially; in general, the MLAGE is used for not only faster embedding speed, but also improving the embedding quality;

FIG. 4 is a comparison of Deepwalk, netMF and coarsened MLAGE (DW, l = 7) and MLAGE (NMF, l = 7), and it can be seen that the multilayer method adds only a small amount of graph fusion, coarsening and thinning time, which sharply reduces the embedding time, and the total time is only 1/67 and 1/373 of the original time;

experimental analysis comparing accuracy and time of other multi-layer methods under different coarsening layers, experimental results are shown in the attached table 1-table 5; the method comprises the following steps:

step S2: selecting Cora, citeser and Pubmed data sets in an experiment;

step S2.1: the method uses Deepwalk based on random walk, graRep based on matrix decomposition, DGI based on a graph neural network and GCN based on a semi-supervised embedding algorithm of the graph neural network as an embedding kernel, and the method also uses the same embedding kernel;

step S2.2: setting a hyper-parameter; the embedding dimension d except DGI is set to 128, the embedding dimension d of DGI is 512, each node of the Deepwalk algorithm carries out 10 random walks, the walk length is 80, the window size is 10, the GCN and the DGI use an early stopping strategy, the learning rates of the early stopping strategy are 0.01 and 0.001 respectively, and the epoch iteration number is set to 200; in the aspect of GraphSAGE configuration, each epoch trains a two-layer model, the learning rate is 0.00001, and the batch size is 256;

step S2.3: setting different coarsening layers; l =1, 2, 3;

step S2.4: after the Method (MLAGE) and the baseline algorithm embedded in each graph in the step S2.1 are operated, the Micro-F1 values and the time consumption under different coarsening layers are obtained;

step S2.5: acquiring CPU time (process _ time) of different algorithms, namely the value of the sum of the CPU time of a system and a user of the current process;

step S2.6: visualizing the Micro-F1 value and the use time, and carrying out comparative analysis;

compared with other multilayer methods, the MLAGE shows results of three coarsening layers in a direct push type node classification experiment, and shows results of two coarsening layers in a inductive node classification experiment; the results of various network embedding show that MLAGE is irrelevant to the bottom layer embedding method, and the accuracy and speed of embedding can be improved on various data, even the performance of the most advanced network embedding algorithm in the prior art can be improved; the experimental results are shown in the attached tables 1 to 5, wherein Gzoom represents Graphzoom multilayer method, and l represents the number of coarsening layers;

specifically, for the direct push learning task, MLAGE improved the class Micro-F1 values for deep walk by 0.025, 0.043 and 0.064 on the Cora, cineser and Pubmed datasets, respectively, while achieving up to 8.1 times the run-time reduction, see table 1; for the classification Micro-F1 values of GraRep, the values are respectively improved by 0.047, 0.036 and 0.045, and meanwhile, the maximum 5 times of CPU operation time reduction is realized, see Table 2, in addition, the acceleration effect of the multilayer method is not obvious under the condition of the original extremely short operation time through the experiment that GraRep is used as an embedded kernel on Cora and Citeser data sets; MLAGE also achieved higher accuracy than DGI, with an acceleration ratio as high as 8.4 ×, see table 3;

TABLE 1 comparative experimental results of graph embedding algorithm Deepwalk based on random walk and multi-layer method using Deepwalk as embedding kernel

TABLE 2 comparative experimental results of graph embedding algorithm GraRep based on matrix decomposition and a multi-layer method using GraRep as embedding kernel

TABLE 3 comparison of experimental results of the graph neural network based unsupervised embedding algorithm DGI and the multi-layer method using DGI as the embedding kernel

The MLAGE can also improve the accuracy and the expandability of the network embedding of semi-supervised learning in most cases, see Table 4, but the GCN runs too fast on a small data set, so that the speed of the small data set cannot be improved, and the experimental result on the Citeser data set shows that the MLAGE (GCN) is longer than the GCN in use, because the semi-supervised algorithm needs to spend a little time on constructing a super-node label matrix of a rough graph before calculating the embedding, the time is more at nodes, the performance is more obvious on the Citeser small-size graph data set with fewer edges, the advantage of the coarsening speed cannot be embodied due to fewer edges, the time proportion for constructing the super-node label matrix is larger due to more nodes, and finally, the result that the MLAGE (GCN) is longer than the GCN in use is caused;

TABLE 4 comparison experiment results of semi-supervised embedding algorithm GCN based on graph neural network and MLAGE with GCN as embedding kernel

For induction learning tasks, MLAGE performed on PPI and Citeser better than the Micro-F1 values of GraphSAGE four aggregation functions, respectively, with acceleration ratios as high as 4.0 ×, see Table 5;

TABLE 5 Induction of Micro-F1 values and CPU time for tasks, baseline is the experimental result of GraphSAGE with four different aggregation functions

These results indicate that a method for improving network embedding algorithm scalability (MLAGE) improves embedding speed and quality, because the embedding model is trained only for the smallest coarsened graph on the coarsest level, in addition to reducing the size of the graph, the coarsening method of MLAGE further filters out redundant information in the original graph, so that the embedding method can more intuitively capture the global structure of the original graph;

the experiment verifies the expandability on the large-scale network, and the experimental result is shown in the attached figures 5-6, and comprises the following steps:

and step S3: selecting a Friendster data set with more than 200 ten thousand nodes and more than 700 ten thousand edges for experiment;

step S3.1: the method of the invention (MLAGE) uses DeepWalk as the embedding kernel;

step S3.2: setting a hyper-parameter; the embedding dimension d is set to be 128, each node of the deep walk algorithm carries out 10 times of random walks, the walk length is 80, and the window size is 10;

step S3.3: setting different coarsening layers; l =1, 2, 3, 4, 5, 6;

step S3.4: after the method and the baseline algorithm embedded in various graphs are operated, the Micro-F1 values and the time consumption under different coarsening layers are obtained;

step S3.5: obtaining the CPU time process _ time of different algorithms;

step S3.6: obtaining the node number change of the graph and the edge number change of the graph after different coarsenings;

step S3.7: visualizing the change of the number of nodes of the graph and the change of the number of edges of the graph after the Micro-F1 value, time consumption and different coarsenings are carried out, and carrying out comparative analysis;

in the experiment, the MLAGE uses Deepwalk as an embedded kernel, as shown in FIG. 5, after 1 layer of coarsening, the Micro-F1 scores of the MLAGE and the Deepwalk are equivalent, but the CPU time is greatly accelerated, and the acceleration ratio reaches 3.2 x; fig. 6 shows the number of nodes and edges of the coarsened graph, which can clearly show the coarsening capability of MLAGE, and the number of nodes after 6 layers of coarsening is reduced to 162,604, which is only 8.0% of the number of nodes in the original graph; the number of edges is reduced to 2,636,275 which is only 35.7 percent of the number of the edges of the original graph; in addition, with the increase of the number of coarsening layers, the embedding precision of the MLAGE is only reduced elegantly, and high embedding quality is still kept in 5-layer coarsening; this shows the core advantages of MLAGE: the large-scale network is effectively coarsened by combining redundant nodes with similar information, so that the graph structure attribute which is important for a basic embedding model of a bottom layer is reserved, and when basic embedding is applied to the coarsest graph, the global structure of the graph can be visually captured, so that high-quality embedding is realized;

partial interface screenshots of system function modules of the prototype system developed in the invention are shown in fig. 7, fig. 8, fig. 9, fig. 10, fig. 11, fig. 12, fig. 13, fig. 14, fig. 15 and fig. 16, and are used for training and testing the model through a clear system page, and supporting functions of adjusting hyper-parameters, managing data sets, visualizing the training process, displaying test results in a chart and the like; the method comprises the following steps:

and step S4: constructing a prototype system; the system use diagram of the prototype system is shown in fig. 7, the system function module is shown in fig. 8, the model training life cycle timing diagram is shown in fig. 9, and the data set uploading timing diagram is shown in fig. 10;

step S4.1: uploading a data set, as shown in FIG. 11;

step S4.2: view data sets, as shown in FIGS. 12-13;

step S4.3: selecting a data set and a downstream task through a drop-down box; as shown in fig. 14;

step S4.4: the super parameters are set, and input limitation and prompt are provided for input of the super parameters, so that training collapse caused by wrong input is avoided; as shown in fig. 14;

step S4.5: clicking to start training to automatically jump to a training process page, displaying the total training progress at the top of the page, and displaying the training process below the page; a plurality of references can be provided for experiments according to the index change trend; as shown in fig. 15-16;

step S4.6: after training is finished, the automatic downloading network embedded file embed.npy is sent to the local, and a test task is executed;

step S4.7: the system will automatically download the log file of the detailed test results locally.

Claims

1. A method for improving expandability of a network embedding algorithm is characterized in that: the method comprises the following steps:

step 1: for an original graph

represented as a contiguous matrix of the array of pixels,

represented as a matrix of the characteristics of the nodes,

representing weighted graphs

The adjacency matrix of (a);

step 2: using hybrid coarsening to transform an original graph

Coarsening to

Is a graph which is subjected to first coarsening,

after m times of coarseningThe final coarsest graph;

and step 3: in the coarsest figure

Executing a graph embedding method g (-) to obtain an embedding result epsilon;

Is embedded in ₀ 。

2. The method for improving scalability of network embedding algorithms according to claim 1, wherein: the step 1 specifically comprises the following steps:

step 1.1: original graph for | V | nodes

Its adjacency matrix is expressed as

Its node feature matrix is

Is converted into a node feature map

Wherein the L2-norm is the Euclidean distance;

step 1.3: according to the original diagram

The cosine similarity between the attribute vectors of any two nodes assigns a weight to each edge of the k nearest neighbor graph, i.e. each edge is weighted

Wherein, X _i,: And X _j,: Is the attribute vector for nodes i and j;

A _fusion ＝A _topo +βA _feat (1)

representing weighted graphs

Of a neighboring matrix of _feat Is a node characteristic matrix of the k nearest neighbor graph.

3. The method for improving scalability of network embedding algorithms according to claim 2, wherein: the generation of the k-nearest neighbor graph in the step 1.2 specifically comprises the following steps:

step 1.2.1: taking a random graph signal x as a random vector, and representing by combining a linear combination of feature vectors u of a graph Laplacian operator;

As shown in formula (2):

step 1.2.3: solving linear equations with Gauss-Seidel iteration

T initial random vectors T = (x) are obtained ⁽¹⁾ ,...,x ^(t) ) Wherein, in the process,

representing the original picture

A laplacian matrix of;

And

step 1.2.5: iterating the step 1.2.1 to the step 1.2.4, and further aggregating the aggregated node set as a super node until the original graph y ₀ And if the node similarity between any two nodes does not meet the similarity threshold, determining a final node cluster, selecting the first k nearest neighbors in each cluster, and constructing a k nearest neighbor graph.

4. The method for improving scalability of network embedding algorithms according to claim 3, wherein: the node similarity is determined by the spectrum node affinity of adjacent nodes p and q, and is shown in formula (3):

wherein:

wherein, a _p,q For the spectral node affinities of neighboring nodes p and q,

embedding vectors for the kth low dimension of node p

Embedding vectors for the kth low dimension of node q

5. The method for improving scalability of network embedding algorithms according to claim 4, wherein: the similarity threshold is set to be greater than 60%.

6. The method for improving scalability of network embedding algorithms according to claim 1, wherein: the step 2 is a graph coarsening method, which specifically comprises the following steps:

step 2.1: inputting an original graph

step 2.3: constructing graphs at the i +1 level by matrix operations

Adjacent matrix A of _i+1 Calculating A _i+1 ＝M _i,i+1 ^T A _i M _i,i+1 Wherein M is _i,i+1 To be driven from

To

Of a mapping matrix of _i Is composed of

The adjacency matrix of (a);

step 2.4: calculate the map at the i-1 level

Second-order neighbor coarsening mapping matrix of (1)

Mark matrix

Initializing a node for storing a first-order neighbor coarsening map

Pair V in ascending order of node degree _i-1 Sorting is carried out;

Finally marking the node u and the node v;

step 2.6: based on

And

calculating a mapping matrix M by matrix operation _i-1,i I.e. if it is to

The nodes a, b and c in the network are coarsened into

step 2.7: is calculated to obtain

Adjacent matrix A of _i As shown in formula (5):

A _i ＝M _i-1,i ^T A _i-1 M _i-1,i (5)

wherein, M _i-1,i To be driven from

To

A mapping matrix of _i-1 Is composed of

The adjacency matrix of (a);

step 2.8: repeating the steps 2.5 to 2.7, and traversing all the nodes after the ascending sequence;

And M _i,i+1 。

7. The method for improving scalability of network embedding algorithms according to claim 6, wherein: the information interaction probability t in the step 2.5 _i,j Measuring the similarity between the nodes and determining whether to merge the nodes, comprising the following steps:

step 2.5.1: traverse the original graph

All node pairs in;

step 2.5.2: for the original picture

step 2.5.3: traversing the coarsened graph of the second-order neighbor;

wherein,

8. The method for improving scalability of network embedding algorithms according to claim 1, wherein: the step 3 is graph embedding, and specifically comprises the following steps:

step 3.1: in the coarsest figure

Upper application graph embedding g (·);

The corresponding relation between node labels and characteristics;

9. The method for improving scalability of network embedding algorithms according to claim 8, wherein: the graph embedding method g (-) in the step 3.2 is an unsupervised embedding method, and specifically comprises the following steps:

step 3.2.1: according to the original diagram

And constructing an original graph by Lb node labels

Tag matrix of

Each row of the matrix is a label vector in the form of one-hot node;

step 3.2.2: constructing an original graph

To the coarsestMapping matrix M of the graph _0,i I.e. M _0,i ＝M _0,1 M _1,2 …M _i,i-1 M _0,i ；M _0,i The rows in the matrix are in a one-hot form, and the columns are coarsened nodes;

step 3.2.3: through the label matrix Lb ₀ Obtaining labels of the coarsened nodes, taking mode of each column of labels as the labels of the coarsened nodes to obtain a label matrix

Step 3.2.4: original graph

Node attribute feature F ₀ Original picture

10. The method for improving scalability of network embedding algorithms according to claim 1, wherein: the step 4 specifically comprises the following steps: