CN115879507A - Large-scale graph generation method based on deep adversarial learning - Google Patents

Large-scale graph generation method based on deep adversarial learning Download PDF

Info

Publication number
CN115879507A
Authority
CN
China
Prior art keywords
graph
community
node
encoder
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211167773.XA
Other languages
Chinese (zh)
Inventor
程大伟
许辰昊
蒋昌俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University filed Critical Tongji University
Priority to CN202211167773.XA priority Critical patent/CN115879507A/en
Publication of CN115879507A publication Critical patent/CN115879507A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a large-scale graph generation method based on deep adversarial learning. Given the adjacency matrix A and the feature matrix X of a graph G, the method samples A and X, inputs them into a graph attention encoder to obtain the structural information of the graph, and applies a community detection algorithm to obtain the ground-truth community labels; the community information and the graph representation output by the graph attention encoder are fed to a community decoder to generate the community labels corresponding to the nodes; the parameters of the graph attention encoder and the community decoder are adjusted by back propagation to guide them toward a community-preserving latent space; the community information and the graph representation output by the graph attention encoder are fed to a graph decoder to generate edge probabilities; the edge probabilities are used to simulate the graph score matrix, from which a new graph generated by the model is finally sampled. The proposed model achieves a good balance between the quality and the efficiency (scalability) of graph simulation.

Description

Large-scale graph generation method based on deep adversarial learning
Technical Field
The invention relates to the technical field of large-scale graph generation models, in particular to a large-scale graph generation method based on deep adversarial learning.
Background
Research on graph generation models has a long history; traditional methods such as the B-A model, the Chung-Lu model, the Kronecker graph model, the BTER model, exponential random graphs and stochastic block models are well designed to simulate specific graph families. For example, the exponential random graph model (ERGM) relies on an expressive probabilistic model that learns weights on node features to model the likelihood of edges in a graph; in practice, however, this approach is limited because it can only capture graph structures with sufficient statistical information. The Kronecker graph model relies on the Kronecker matrix product to efficiently generate large adjacency matrices; although this approach is scalable and can learn some graph properties (e.g., degree distributions) from data, it is still largely limited in the graph structures it can represent. The BTER model matches the average clustering coefficient within each community and corrects the degree distribution through a two-stage edge sampling process, and it accounts for the community structure of the graph by modeling the graph as a two-level E-R graph. Notably, SBM and its variants DCSBM and MMSB also consider the community structure of graphs, but they suffer from the simplicity of the stochastic model, resulting in poor community-structure preservation on real-life graph generation tasks; specifically, they have only one parameter to capture each community (i.e., the edges within the community) and one parameter to represent the connection probability of each pair of communities (i.e., the edges between the two communities).
In recent years, some deep neural network-based techniques (e.g., VGAE, DeepGMG, GraphRNN, Graphite, GRAN, CondGen) have been proposed to solve the graph generation problem, and they significantly improve the quality of graph generation compared with conventional methods. For example, Graphite and VGAE use the variational auto-encoder (VAE) technique, where graph neural networks are used for inference (encoding) and generation (decoding); since Graphite and VGAE assume a fixed set of vertices, they can only learn from a single graph. NetGAN performs more efficiently than VGAE by learning random walks on the graph, but it is not scalable since it generates fixed-size graphs. In DeepGMG, graph neural networks are used to represent the probabilistic dependencies between the nodes and edges of a graph, which can correctly learn distributions over graphs; however, generating a graph with m edges, n vertices and diameter D(G) requires O(mn²D(G)) complexity, so it also suffers from poor scalability.
Currently, GraphRNN generates graphs through a recurrent neural network (RNN) sequence, but it is not permutation-invariant, since computing the likelihood requires marginalizing over possible permutations of the node order of the adjacency matrix. GRAN improves the scalability of GraphRNN by generating one block of nodes and the associated edges at each step in an autoregressive fashion, but it is still not permutation-invariant. CondGen overcomes this permutation-invariance challenge by using a GCN as the encoder and handling the graph generation problem in the embedding space. Graph U-Nets select specific nodes to implement upsampling and downsampling of the graph to obtain a graph representation; however, none of these methods considers the observed community structure of the graph in the learning process. SBMGNN is a variant of SBM equipped with deep learning techniques, but its graph neural network is used to infer the parameters of an overlapping stochastic block model, which are not directly tied to community preservation, so it offers no improvement in community preservation over other deep-learning-based graph generation models.
Generative adversarial networks (GANs) have shown remarkable results in various tasks such as image generation, image translation, super-resolution imaging and multimedia synthesis, and GANs have recently been applied to network-science tasks such as network embedding, semi-supervised learning and graph generation. For the graph generation task, the prior structural knowledge specified by the sample dataset is crucial, especially when the community structure must be maintained. For the community structure of a graph, some models using pooling strategies can be trained to represent communities (clusters), but representing and generating these community structures simultaneously remains a challenge; for example, NetGAN generates graphs by random walks, for which maintaining the community structure is very important.
Disclosure of Invention
This section is intended to summarize some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. Simplifications or omissions may be made in this section, in the abstract and in the title of the application to avoid obscuring their purpose; such simplifications or omissions are not intended to limit the scope of the invention.
The present invention has been made in view of the above-mentioned conventional problems.
Therefore, the technical problem solved by the invention is as follows: limited by their complexity, deep learning models often scale poorly, and conventional methods do not address the community-preservation property of graphs, so the related performance of graph generation is also poor.
In order to solve the above technical problems, the invention provides the following technical solution, comprising:
for a graph G, an adjacency matrix A and a feature matrix X are given; after sampling, they are input into a graph attention encoder to obtain the structural information of the graph, and a community detection algorithm is applied to obtain the ground-truth community labels;
feeding the community information and the graph representation output by the graph attention encoder to a community decoder to generate community labels corresponding to the nodes;
adjusting the parameters of the graph attention encoder and the community decoder by back propagation to guide them toward a community-preserving latent space;
feeding the community information and the graph representation output by the graph attention encoder to a graph decoder to generate edge probabilities;
and using the edge probabilities to simulate the graph score matrix, from which a new graph generated by the model is finally sampled.
As a preferred aspect of the large-scale graph generation method based on deep adversarial learning according to the present invention: the reconstruction of the graph structure also needs to be enhanced before sampling, comprising:
for the graph G, the adjacency matrix A and the feature matrix X are given and input into a ladder encoder to obtain the structural information of the graph, and the community detection algorithm is applied to obtain the ground-truth community labels;
feeding the community information and the graph representation output by the ladder encoder to a discriminator to determine whether the input graph is a fake graph distinct from the real graph;
meanwhile, the coarsened graph of each level distributes its community structure features to the original nodes through a differentiable layer-wise message passing process;
and decoding a series of community information of each original node to enhance the reconstruction of the graph structure.
As a preferred aspect of the large-scale graph generation method based on deep adversarial learning according to the present invention: the ladder encoder includes graph convolution, graph pooling, graph readout and graph transpose pooling.
As a preferred aspect of the large-scale graph generation method based on deep adversarial learning according to the present invention, comprising:
generating a new graph with the observed hierarchical community structure distribution by using variational inference before decoding the node features;
and selecting a multi-layer perceptron as the inference model to complete the mapping from the reconstructed features to the prior distribution.
As a preferred aspect of the large-scale graph generation method based on deep adversarial learning according to the present invention, the discriminator comprises:
the discrimination task requiring the graph features obtained by the ladder encoder, i.e., the output matrix of the graph readout layer;
and the discriminator being optimized by formulating a minimax game, with joint training and parameter updates by gradient ascent.
As a preferred aspect of the large-scale graph generation method based on deep adversarial learning according to the present invention, comprising:
generating edges for node i by sampling from the categorical distribution parameterized by the i-th row of A_out;
selecting entries of A_out until the number of edges reaches a predefined number;
the total time complexity of generating the new graph being O(n²).
As a preferred aspect of the large-scale graph generation method based on deep adversarial learning according to the present invention, the graph decoder comprises:
decoding the hierarchical representation sequence;
and predicting the node links.
As a preferred aspect of the large-scale graph generation method based on deep adversarial learning according to the present invention, comprising:
given the input node features X ∈ R^{n×d_in} and an ego graph;
aggregating messages from the graph structure using a multi-head attention mechanism to obtain the hidden variable h_u of the central node u of the corresponding ego graph.
As a preferred aspect of the large-scale graph generation method based on deep adversarial learning according to the present invention: for each ego graph, the message aggregation comprises

h_u = (h_u^(1) ‖ h_u^(2) ‖ … ‖ h_u^(h_attn)) W_out,

where h_u denotes one row of the hidden variables of the graph attention coding layer, i.e., the hidden variable on node u; W_out denotes the output projection matrix; h_attn denotes the number of attention heads; and d_att denotes the dimension of the attention vector a.
As a preferred aspect of the large-scale graph generation method based on deep adversarial learning according to the present invention: a probability distribution based on node degree is used as the initial-node sampling strategy,

P(u) = deg_u / Σ_{v∈V} deg_v,

where deg_u denotes the degree of node u; assuming that n_s nodes are sampled as initial central nodes in each traversal, the n_s ego graphs are sampled as the input of the encoding process, and the initial node set is denoted as V_c.
The invention has the following beneficial effects: the invention provides a new graph generation model, a large-scale graph generation method (LSGEN), which not only preserves the community structure and other important attributes of a real graph, but also, compared with other learning-based graph generation models, reduces graph simulation time and improves scalability. A generator and a discriminator are carefully designed within a unified GAN framework: the generator is a hierarchical graph variational auto-encoder that can learn a permutation-invariant representation of the input graph and generate a new graph from the node representations; the discriminator judges whether an embedding comes from a real graph or a simulated graph; and a differentiable ladder network is introduced to realize graph pooling and message passing at different community-structure levels, which is more effective than simply stacking deeper graph convolution layers. Meanwhile, the invention also introduces a scalable version, SLSGEN, which has a shorter pipeline and faster training: in SLSGEN, an efficient graph attention auto-encoder framework is designed to generate community-preserving graphs; an ego graph (EgoGraph) sampling and bipartite computation-graph assembly strategy is adopted, and a mini-batch-based method is implemented to train the graph generator; and a data-parallel and model-parallel architecture is provided for training and inference of the scalable graph generation model. Extensive experiments on synthetic and real graphs show that, compared with baseline methods, the proposed model achieves a good balance between the quality and the efficiency (scalability) of graph simulation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise. Wherein:
FIG. 1 is a block diagram of LSGEN in the large-scale graph generation method based on deep adversarial learning according to an embodiment of the present invention;
FIG. 2 is a flow diagram of SLSGEN in the large-scale graph generation method based on deep adversarial learning according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of two computation graphs of the large-scale graph generation method based on deep adversarial learning according to an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, specific embodiments accompanied with figures are described in detail below, and it is apparent that the described embodiments are a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present invention, shall fall within the protection scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those specifically described and will be readily apparent to those of ordinary skill in the art without departing from the spirit of the present invention, and therefore the present invention is not limited to the specific embodiments disclosed below.
Furthermore, the references herein to "one embodiment" or "an embodiment" refer to a particular feature, structure, or characteristic that may be included in at least one implementation of the present invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
The present invention will be described in detail with reference to the drawings, wherein the cross-sectional views illustrating the structure of the device are not necessarily enlarged to scale, and are merely exemplary, which should not limit the scope of the present invention. In addition, the three-dimensional dimensions of length, width and depth should be included in the actual fabrication.
Meanwhile, in the description of the present invention, it should be noted that the terms "upper, lower, inner and outer" and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of describing the present invention and simplifying the description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation and operate, and thus, cannot be construed as limiting the present invention. Furthermore, the terms first, second, or third are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The terms "mounted, connected and connected" in the present invention are to be understood broadly, unless otherwise explicitly specified or limited, for example: can be fixedly connected, detachably connected or integrally connected; they may be mechanically, electrically, or directly connected, or indirectly connected through intervening media, or may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
Example 1
For a long time, traditional methods (such as the B-A model, the Chung-Lu model, etc.) have been well designed to simulate specific graph families, but they often rely on a prior distribution of graphs, are limited by model simplicity, and do not perform well in terms of generation quality.
In recent years, some deep neural network-based techniques (such as VGAE and DeepGMG) have been proposed to solve the graph generation problem, and they significantly improve the quality of graph generation compared with conventional methods; however, limited by their complexity, deep learning models often scale poorly, and previous methods do not address the community-preservation property of graphs, so the related performance of graph generation is also poor.
Referring to fig. 1, 2 and 3, a large-scale graph generation method based on deep adversarial learning is provided for an embodiment of the present invention, and includes:
s1: for the graph G, an adjacent matrix A and a characteristic matrix X are given and input into a ladder encoder to obtain structural information of the graph, and a real value of the community label is obtained by applying a community detection algorithm.
S2: the community information and graph representation output by the ladder encoder are fed to a discriminator to determine if the input graph is a false graph distinct from the real graph, while the coarsened graph of each level will distribute its community structure characteristics to the original nodes through a differentiable layer messaging process.
S3: and decoding a series of community information of each original node to enhance the reconstruction of the graph structure.
S4: and for the graph G, the adjacent matrix A and the characteristic matrix X are given, after sampling, the graph is input into a graph attention encoder to obtain the structure information of the graph, and the real value of the community label is obtained by applying a community detection algorithm.
S5: community information and graph representations output by the graph attention encoder are fed to a community decoder, and community labels corresponding to the nodes are generated.
S6: parameters of the attention encoder and community decoder are adjusted by back propagation to guide the parameters to potential space maintained by the community.
S7: the community information and graph representation output by the graph attention encoder are fed to a graph encoder, generating edge probabilities.
S8: and (4) simulating the graph fractional matrix by using the edge probability, and finally sampling to obtain a new graph generated by the model.
Referring to fig. 1, in order to better explain the implementation principle of the LSGEN algorithm provided by the present invention, the following detailed description is made in this embodiment:
(1) Ladder message-passing encoder
A ladder encoder is introduced; the model can adaptively adjust the pooling strategy and extract the community structure information of the nodes. The node features X and the adjacency matrix A ∈ {0,1}^{n×n} are taken as the input of the proposed encoder. For each graph G, the identity matrix is used as its default node features X ∈ R^{n×d}, so each node has a d-dimensional feature; together with the adjacency matrix A, the input graph G is coarsened using stacked convolution and pooling layers.
Graph convolution: the classical messaging model is represented by the Graph Convolution Network (GCN).
The propagated message Z ∈ R^{n×d'} is computed as follows:

Z = σ(D̂^{-1/2} Â D̂^{-1/2} X W)

where D̂ denotes the degree matrix of Â; Â = A + I_n denotes the adjacency matrix with self-loops; W ∈ R^{d×d'} is a trainable parameter of the graph convolution layer, with kernel size d'; σ denotes the activation function (ReLU by default); and X denotes the node features derived from a spectral embedding of the adjacency matrix A, i.e., X = X(A).
If higher powers of the self-loop adjacency matrix (e.g., Â²) are used to improve the connectivity of the graph, information can flow faster between nodes. The time complexity of graph convolution is O(m + n), where m denotes the number of edges.
Graph pooling: simply passing messages through GCNs would require a very deep network to capture the structural information, especially when large sparse graphs with low connectivity are encountered, so an efficient way of obtaining a hierarchical representation of the graph is needed. The graph is coarsened hierarchically by learning a series of assignment matrices S^(l) ∈ R^{n_l×n_{l+1}}, l = 1, …, k, which define the coarsening strategy, where n_l, n_{l+1} and k denote the number of input nodes, the number of output nodes and the number of layers, respectively.
The assignment matrix is calculated as follows:
Z^(l) = σ(GCN_{l,embed}(X^(l), A^(l)))
S^(l) = softmax(GCN_{l,pool}(Z^(l), A^(l)))
where σ is the ReLU activation function, X^(l) ∈ R^{n_l×d} and A^(l) ∈ R^{n_l×n_l} denote the feature matrix and the adjacency matrix of the n_l cluster nodes, respectively, and Z^(l) denotes the feature matrix of the l-th layer carrying structural information. The two GCNs are used to collect structural information and to infer the pooling strategy of layer l, respectively. Because one graph undergoes multiple graph convolution and pooling operations, PairNorm is applied after each GCN so that deep GCNs can be stacked without over-smoothing. The assignment matrix can be viewed as a predicted node-community assignment and will be constrained by the true labels. Given the assignment matrix S^(l), the coarsened adjacency matrix A^(l+1) and the new embedding X^(l+1) can be generated as follows:

A^(l+1) = S^(l)T A^(l) S^(l)

X^(l+1) = S^(l)T Z^(l)
the stack graph volume and pooling layer may obtain a series of node representations at different levels, and if layer k has only one node after pooling, the corresponding allocation matrix will be
Figure BDA00038620719800000811
So that pooling of the map corresponds to the sum of the map readouts, the total temporal complexity of the map pooling is O (m + n).
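For concreteness, a minimal sketch of one coarsening level as described above, reusing the gcn_layer sketch from the graph-convolution example; the dense matrices and the names W_embed and W_pool are assumptions made for illustration.

    import numpy as np

    def pool_level(A, X, W_embed, W_pool):
        """One pooling level: Z = GCN_embed, S = softmax(GCN_pool),
        then A' = S^T A S and X' = S^T Z."""
        Z = gcn_layer(A, X, W_embed)                         # node embeddings Z^(l)
        logits = gcn_layer(A, X, W_pool)                     # pooling logits, shape n_l x n_{l+1}
        logits = logits - logits.max(axis=1, keepdims=True)  # numerically stable softmax
        S = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
        A_coarse = S.T @ A @ S                               # coarsened adjacency A^(l+1)
        X_coarse = S.T @ Z                                   # coarsened features X^(l+1)
        return A_coarse, X_coarse, S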
Graph readout: the node representations of each graph are folded into a graph representation by graph readout; the output feature s_i of the i-th level coarsened graph is therefore read out as follows:

s_i = Σ_{j=1}^{n_i} x_{ij}

s = [s_1 ‖ s_2 ‖ … ‖ s_k]

where k is the number of layers per graph, x_{ij} denotes the representation of the j-th node of the i-th level graph, and ‖ denotes combining all representations along a new dimension. The time complexity of graph readout is O(n), and the final graph representation s ∈ R^{k×d} is the input of the graph discriminator.
Graph transpose pooling: unlike upsampling of the coarsened graph, which would require a proper unpooling of the graph to reconstruct the node representations, this embodiment introduces a differentiable method to distribute information from a coarsened graph back to the detailed graph. The proposed distribution method uses a transposed version of the assignment matrix, so that the features X^(l) of the l-th coarsened level are mapped back to the n original nodes through the chain of assignment matrices of the lower levels, e.g.,

Z_rec^(l) = S^(1) S^(2) … S^(l-1) X^(l),

and the reconstructed node representation Z_rec ∈ R^{n×k×d} is obtained as

Z_rec = [Z_rec^(1) ‖ Z_rec^(2) ‖ … ‖ Z_rec^(k)],

where ‖ denotes combining all node representations along a new dimension. Afterwards, Z_rec serves as the input of the decoder D proposed in this embodiment. The total time complexity of transpose pooling is O(m + n). Note that in this work, this embodiment adds a variational inference module on the latent distribution of the nodes to control the output of the encoder.
(2) Variational inference
This embodiment uses variational inference before decoding the node features in order to generate a new graph with the observed hierarchical community structure distribution, mapping the reconstructed features to the prior distribution N(μ, diag(σ²)); a multi-layer perceptron (MLP) is selected as the inference model. The inference process is formulated as follows:

g(Z_rec, φ) = σ(Z_rec φ_0) φ_1

μ = g(Z_rec, φ_μ)

log σ = g(Z_rec, φ_σ)

ε ~ N(0, I)

Z_vae = μ + σ ⊙ ε

where φ denotes a set of parameters of the MLP, g(·)_i denotes the i-th row of g(·), and Z_vae ∈ R^{n×k×d}. The time complexity of the inference module is O(kn). Probabilistic variational inference keeps the node representations away from the zero center, which intuitively makes the node representations sparser and helps preserve the community structure of the nodes.
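A minimal sketch of the inference step with the reparameterization trick, assuming the reconstructed node features have been flattened into a two-dimensional array; the two weight matrices standing in for the MLP heads are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def variational_sample(Z_rec, W_mu, W_logsig):
        """Map reconstructed features to N(mu, diag(sigma^2)) and sample Z_vae."""
        mu = Z_rec @ W_mu                        # mean of the approximate posterior
        log_sigma = Z_rec @ W_logsig             # log standard deviation
        eps = rng.standard_normal(mu.shape)      # eps ~ N(0, I)
        Z_vae = mu + np.exp(log_sigma) * eps     # reparameterization trick
        # KL divergence to the standard Gaussian prior N(0, I), used as L_prior
        kl = 0.5 * np.sum(np.exp(2 * log_sigma) + mu**2 - 1.0 - 2 * log_sigma)
        return Z_vae, kl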
After the variational inference module, new node features are sampled from the prior distribution to generate a new graph; however, a fully connected network by itself was found to be unable to handle the task of generating graphs with a complex, hierarchical community structure, so this embodiment proposes a graph decoder to solve this problem.
(3) Graph decoder
The decoder proposed in this embodiment comprises two steps: first decoding the hierarchical representation sequence, and then predicting the node links. The hierarchical community structure is embedded with a gated recurrent unit (GRU) to obtain the node features h_k, where k denotes the number of community-structure levels; the decoded features are obtained by the following formula:

h_l = GRU(Z_rec^(l), h_{l-1}), l = 1, …, k,

where h_l denotes the hidden state of the coarsened graph, h_0 is a zero matrix, Z_rec^(l) denotes the node features of the l-th coarsened graph, and h_k denotes the decoded node features carrying the hierarchical community information. After the node representations are obtained, the link prediction is given as follows:

g_θ(h_k) = σ(h_k θ_0) θ_1

p_θ(A_ij | h_{k,i}, h_{k,j}) = σ(g_θ(h_{k,i})^T g_θ(h_{k,j}))

A_rec = [ p_θ(A_ij | h_{k,i}, h_{k,j}) ]_{i,j}

where g_θ(h_{k,i}) is a two-layer MLP that extracts community information to help generate edges, h_{k,i} denotes the feature of the i-th node, and A_rec ∈ R^{n×n} denotes the probability matrix of the link prediction. To speed up this process when the decoder is trained on a large graph, n_s (n_s ≪ n) nodes are sampled to obtain A_rec ∈ R^{n_s×n_s}.
Specifically, the nodes are sampled without replacement to assemble the subgraph according to a node-degree policy, as follows:

P_i = deg_i / Σ_j deg_j,

where P_i is the probability of selecting node i and deg_i denotes the degree of node i; the time complexity of the decoder is thus O(n_s²).
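The link-prediction step and the degree-based subgraph sampling can be sketched as follows; the two-layer MLP g_θ is reduced to two weight matrices, and all names are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def link_prediction(h_k, theta0, theta1):
        """p(A_ij) = sigmoid(g(h_i)^T g(h_j)) with a two-layer MLP g."""
        g = np.maximum(h_k @ theta0, 0.0) @ theta1   # g_theta(h_k), one row per node
        return 1.0 / (1.0 + np.exp(-(g @ g.T)))      # predicted edge probability matrix

    def sample_subgraph_nodes(deg, n_s):
        """Sample n_s nodes without replacement, proportionally to node degree."""
        p = deg / deg.sum()
        return rng.choice(len(deg), size=n_s, replace=False, p=p)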
(4) Discriminator and optimization
The graph discriminator: the discrimination task requires the graph features obtained by the encoder, i.e., the output matrix s ∈ R^{k×d} of the graph readout layer, with s = E(A); a two-layer MLP classifier is used as the discriminator D, which is defined as:
D_φ(A) = σ(MLP(s, φ))
where φ represents the parameters of the MLP and σ represents the sigmoid activation function.
Optimizing the discriminator: formally, G and D play a minimax game with the value function V(G, D):

min_G max_D V(G, D) = E_{A~p_data}[log D(A)] + E_{Z}[log(1 - D(G(Z)))],
where the latent codes Z_vae and Z_s are sampled from the approximate posterior distribution and the Gaussian prior distribution, respectively. In addition, in order to improve the discriminator with the clustering results, a clustering consistency loss L_clus is introduced, which measures the agreement between the assignment matrices S^(l) and the real community partition of the observed graph; by default, the hierarchical community detection result obtained with the Louvain community detection algorithm serves as the real community partition. During training, φ_D is updated by gradient ascent:

φ_D ← φ_D + η ∇_{φ_D}[log D(A) - L_clus]        (for graphs from the real dataset)
φ_D ← φ_D + η ∇_{φ_D}[log(1 - D(A'))]           (for generated graphs)

When a graph from the real dataset is judged, the parameters are updated with the upper rule; to ensure that community-structure preservation and the other optimization objectives are fully considered, training continues until L_clus and log D(A) converge. When a generated graph is judged, the parameters are updated with the lower rule.
Generator optimization: the generator aims to minimize the log-probability that the discriminator correctly classifies the graphs reconstructed by G. In order to improve the performance of the decoder D and guarantee permutation invariance, a mapping consistency loss L_rec is introduced from CycleGAN:

L_rec = Σ_i ‖A_i - A'_i‖,

where A'_i denotes the pseudo-adjacency matrix reconstructed from A_i; in practice, collapse of the encoder E can be controlled by the mapping consistency. The decoder is updated by descending its gradient with respect to φ_D:

∇_{φ_D} [ log(1 - D(A')) + L_rec ],

where A' denotes the reconstructed adjacency matrix; after updating the decoder, the encoder is updated by descending its gradient with respect to φ_E:

∇_{φ_E} [ log(1 - D(A')) + L_rec + L_prior(q(Z|A) ‖ p(Z)) ],

where the Gaussian prior p(Z) is set to N(0, I) and L_prior(·) denotes the Kullback-Leibler (KL) divergence between the two distributions. With this improved encoder and decoder, the generation process can generate new graphs of arbitrary size with a similar community structure.
(5) Generating new graphs
After training, this embodiment samples n_s (n_s ≪ n) nodes to obtain A_sub ∈ R^{n_s×n_s}, and combines the output matrices obtained from the generator and verified by the discriminator to generate the adjacency matrix A_out ∈ R^{n×n}. Specifically, an empty A_out is initialized and filled with the adjacency matrix A_sub of each subgraph until the number of generated edges meets the requirement. Selecting a threshold to binarize each edge, or sampling each edge from a Bernoulli distribution parameterized by A_out, may respectively lead to low-degree nodes being ignored and to high-variance output. To solve these problems, this embodiment uses the following strategy: edges are generated for node i by sampling from the categorical distribution parameterized by the i-th row of A_out; entries of A_out are selected until the number of edges reaches a predefined number; the total time complexity of generating the new graph is O(n²).
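A minimal sketch of the edge-sampling strategy above: for each node i, edges are drawn from the categorical distribution given by the i-th row of the score matrix until a predefined edge budget is reached; the round-robin visiting order and the symmetric edge insertion are assumptions made for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_edges(A_out, num_edges):
        """Sample edges from the row-wise categorical distributions of A_out."""
        n = A_out.shape[0]
        A_new = np.zeros((n, n), dtype=np.int8)
        edges, i = 0, 0
        while edges < num_edges:                       # assumes num_edges is feasible
            p = A_out[i] / A_out[i].sum()              # categorical distribution of row i
            j = rng.choice(n, p=p)                     # sample an endpoint for node i
            if i != j and A_new[i, j] == 0:
                A_new[i, j] = A_new[j, i] = 1          # add undirected edge (i, j)
                edges += 1
            i = (i + 1) % n                            # move to the next node
        return A_new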
Referring to fig. 2, in order to reduce the parameter and computation costs, the hierarchical ladder encoder and the discriminator are abandoned and the core codec part is retained: the graph attention encoder, the community decoder and the graph decoder together form the auto-encoding-based framework of SLSGEN. To better explain the large-scale graph generation method based on SLSGEN, this embodiment is described in detail below.
(1) Simplified autoencoder architecture
To avoid whole-graph computation in the GCN, a graph attention network is used to measure edge importance in the sampled local structures. Specifically, given the input node features X ∈ R^{n×d_in} and an ego graph (EgoGraph), a multi-head attention mechanism is used to aggregate messages from the graph structure to obtain the hidden variable h_u of the central node u of the corresponding ego graph, where d_enc denotes the dimension of the hidden variables after the encoding process. For each ego graph, the message aggregation formula is as follows:

h_u = (h_u^(1) ‖ h_u^(2) ‖ … ‖ h_u^(h_attn)) W_out,

where h_u denotes one row of the hidden variables of the graph attention coding layer, i.e., the hidden variable on node u; W_out denotes the output projection matrix; h_attn denotes the number of attention heads; and d_att denotes the dimension of the attention vector a. Each head h_u^(i) of the graph attention layer is computed as follows:

h_u^(i) = σ( Σ_{v∈N(u)} α_uv^(i) W^(i) x_v ),

where σ denotes the activation function, N(u) denotes the neighbors of node u, and α_uv^(i) denotes the importance of the edge (u, v) in the i-th head, computed as follows:

α_uv^(i) = softmax_v( LeakyReLU( a_i^T [ W^(i) x_u ‖ W^(i) x_v ] ) ),

where a_i denotes the attention vector of the i-th attention head and LeakyReLU denotes a non-linear activation function with negative-input slope α = 0.2.
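A minimal NumPy sketch of one attention head over an ego graph, following the formulas above; the dense masked-softmax implementation and the variable names are assumptions, while the negative slope of 0.2 matches the description.

    import numpy as np

    def attention_head(A, X, W, a, slope=0.2):
        """One head: alpha_uv = softmax_v(LeakyReLU(a^T [Wx_u || Wx_v]))."""
        H = X @ W                                        # projected features W x_v
        d = H.shape[1]
        e = (H @ a[:d])[:, None] + (H @ a[d:])[None, :]  # a^T [Wx_u || Wx_v] for all pairs
        e = np.where(e > 0, e, slope * e)                # LeakyReLU, negative slope 0.2
        e = np.where(A > 0, e, -1e9)                     # keep only edges of the ego graph
        alpha = np.exp(e - e.max(axis=1, keepdims=True))
        alpha = alpha / alpha.sum(axis=1, keepdims=True) # softmax over the neighbors v
        return np.maximum(alpha @ H, 0.0)                # h^(i) = ReLU(sum_v alpha_uv W x_v)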
This embodiment uses community labels to direct the latent variables toward a community-preserving latent space. Furthermore, to avoid O(n²) complexity in the decoder, the inner-product decoder is abandoned and a linear decoder is used to obtain the score of each edge: the community decoder predicts the community-assignment probability matrix of the nodes, and the linear edge decoder predicts the edge probability matrix. In practice, the real community label of a node is computed by the Louvain community detection algorithm. After the community labels of the sampled nodes are generated, the cross-entropy loss against the real community labels is computed and the parameters of the community decoder and the encoder are updated; the adjacency vectors of the sampled nodes are then generated, and an approximation loss is computed against the adjacency matrix of the real graph, averaged over the set V_c of sampled central nodes of size n_s, using the score matrix from the graph decoder and the real community labels Y_c. By adjusting n_s, a balance can be struck between generating a high-quality graph and fast model training. Assuming that n_s nodes are sampled for model training, the space complexity of the proposed auto-encoder architecture is O(n × (n_s + d_in)), and the time complexity of each training iteration scales with n_s rather than n.
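A minimal sketch of the two linear decoders and the training losses described above: a linear-plus-softmax community decoder trained with cross-entropy against the Louvain labels, and a linear edge decoder trained with a binary cross-entropy approximation loss against the rows of the real adjacency matrix; the weight matrices and the exact loss combination are illustrative assumptions.

    import numpy as np

    def softmax(x):
        x = x - x.max(axis=1, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=1, keepdims=True)

    def decode_and_loss(H, W_comm, W_edge, y_comm, A_rows):
        """Community labels and edge scores for the n_s sampled central nodes."""
        Y_hat = softmax(H @ W_comm)                    # predicted community probabilities
        A_hat = 1.0 / (1.0 + np.exp(-(H @ W_edge)))    # predicted edge probabilities (scores)
        n_s = H.shape[0]
        ce = -np.log(Y_hat[np.arange(n_s), y_comm] + 1e-9).mean()     # community cross-entropy
        bce = -(A_rows * np.log(A_hat + 1e-9)
                + (1 - A_rows) * np.log(1 - A_hat + 1e-9)).mean()     # adjacency approximation
        return ce + bce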
(2) Scalable graph sampling
In order to achieve scalable graph generation while maintaining the generation performance of the model, the complete graph is decomposed into multiple ego graphs, and the task of generating the complete graph is approximated by running the SLSGEN core architecture on the ego graphs multiple times. Specifically, representative nodes are selected in the training phase to model the complete graph structure; nodes with higher degree are associated with outliers with lower probability, so in order to focus on representative nodes and edges and generate a high-quality graph, a probability distribution based on node degree is used as the initial-node sampling strategy:

P(u) = deg_u / Σ_{v∈V} deg_v,

where deg_u denotes the degree of node u. Assuming that n_s nodes are sampled as initial central nodes in each traversal, the n_s ego graphs are sampled as the input of the encoding process, and the initial node set is denoted as V_c.
Referring to fig. 3, to reduce the time consumption of the encoding process, the node representations are encoded in parallel over multiple ego graphs, as shown in fig. 3 (a), reducing the computational complexity from O(n) to O(n/b), where b denotes the number of ego graphs processed in parallel at a time, i.e., the batch size. For efficient training, the batch size is set to the size of the initial set of sampled central nodes, i.e., b = n_s, so the computational complexity is parallelized to O(n/n_s).
In order to further reduce space consumption, this embodiment uses a truncation mechanism to control space usage: duplicate nodes are ignored when sampling replacement nodes, and th is used as a threshold to control the worst-case space requirement; once the total number of neighbors of a node exceeds th, the algorithm switches from a full-neighbor sampling strategy to a th-neighbor sampling strategy.
In order to avoid repeated computation on high-degree nodes, after the ego graphs are sampled, all ego graphs are merged into k bipartite computation graphs as shown in fig. 3 (b), and message passing and edge-importance computation are performed on these computation graphs simultaneously.
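A minimal sketch of ego-graph extraction with the truncation threshold th: centers are drawn with the degree-proportional distribution above and each center keeps at most th neighbors; the 1-hop depth and the returned tuple layout are assumptions made for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_ego_graphs(A, n_s, th):
        """Sample n_s one-hop ego graphs, keeping at most th neighbors per center."""
        deg = A.sum(axis=1)
        centers = rng.choice(A.shape[0], size=n_s, replace=False, p=deg / deg.sum())
        ego_graphs = []
        for u in centers:
            nbrs = np.flatnonzero(A[u])                        # full neighbor list of u
            if len(nbrs) > th:                                 # truncation caps the space usage
                nbrs = rng.choice(nbrs, size=th, replace=False)
            nodes = np.concatenate(([u], nbrs))
            ego_graphs.append((u, nodes, A[np.ix_(nodes, nodes)]))
        return ego_graphs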
(3) Data parallel and model parallel
When training is performed on a large graph with more than one million nodes, GPU memory is easily exhausted; therefore, data-parallel training on multiple GPU machines is necessary. Assuming a dual-GPU machine, according to fig. 3 (c), the model parameters and the sampled ego graphs are placed into the two GPUs respectively, and in each GPU the respective ego graphs are assembled into a computation graph; in this embodiment, the representations of the central nodes, the prediction score matrices and the gradient values of the parameters to be updated are computed, and the ego-graph merging and the encoder-decoder modules can be executed fully in parallel; a serial dependency on the updated embeddings of certain nodes arises only when there is a significant difference in training time between the two GPUs.
For example, in fig. 3 (c), three node embeddings require synchronization between the two GPUs; according to the ego-graph sampling policy, the number of such embeddings is limited to O(t^k), where k is the depth of the sampled ego graphs and t is the truncation value, so the communication cost is controllable. With this data-parallel strategy, the upper limit of SLSGEN is 2 billion nodes (i.e., 2^31 - 1 nodes), which is far more than other learning-based approaches.
If graphs with more than 2^31 nodes must be handled, model parallelism is used: specifically, the model is divided into c copies and the training data is divided into blocks, each containing fewer than 2^31 nodes, and the model is trained on a cluster of machines. In this case, assuming the depth of the sampled ego graphs is 1, the time complexity of the communication between machines is less than O(t), where t is the truncation value used when sampling the ego graphs, so the communication cost is manageable.
Preferably, it should be noted in this embodiment that, like other learning-based graph generators, LSGEN also uses some classical models, such as the VAE and CycleGAN, so as to achieve a graph generation quality far exceeding that of conventional graph generation methods and to be competent for various graph generation tasks; however, this embodiment focuses on two new objectives: efficiency (scalability) and community preservation of the learned model, which required developing new techniques in the present invention, e.g., inferring the hierarchy information separately with VAEs, which facilitates community preservation in graph generation; furthermore, the mapping consistency of CycleGAN and the VAE are used for permutation-invariant graph generation, which is crucial for a scalable implementation of the sampling strategy.
Preferably, this embodiment further notes that SLSGEN provides a learning-based solution with better scalability by using CPU memory or even the hard disk. Specifically, an efficient graph-attention auto-encoder architecture is designed for community-preserving graph generation; an ego-graph sampling and bipartite computation-graph assembly strategy is adopted, and a mini-batch-based method is implemented to train the graph generator; and a data-parallel and model-parallel architecture for training and inference of the scalable graph generation model is proposed. In summary, SLSGEN reduces the training time of the original LSGEN model by a factor of 4 and reduces the memory usage by a factor of 4.
It should be recognized that embodiments of the present invention can be realized and implemented in computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer readable memory. The methods may be implemented in a computer program using standard programming techniques, including a non-transitory computer-readable storage medium configured with the computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner, according to the methods and figures described in the detailed description. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose.
Further, the operations of processes described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes described herein (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) collectively executed on one or more processors, by hardware, or combinations thereof. The computer program includes a plurality of instructions executable by one or more processors.
Further, the method may be implemented in any type of computing platform operatively connected to a suitable interface, including but not limited to a personal computer, mini computer, mainframe, workstation, networked or distributed computing environment, separate or integrated computer platform, or in communication with a charged particle tool or other imaging device, and the like. Aspects of the invention may be embodied in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, optically read and/or write storage medium, RAM, ROM, or the like, such that it may be read by a programmable computer, which when read by the storage medium or device, is operative to configure and operate the computer to perform the procedures described herein. Further, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. The invention described herein includes these and other different types of non-transitory computer-readable storage media when such media include instructions or programs that implement the steps described above in conjunction with a microprocessor or other data processor. The invention also includes the computer itself when programmed according to the methods and techniques described herein. A computer program can be applied to input data to perform the functions described herein to transform the input data to generate output data that is stored to non-volatile memory. The output information may also be applied to one or more output devices, such as a display. In a preferred embodiment of the invention, the transformed data represents physical and tangible objects, including particular visual depictions of physical and tangible objects produced on a display.
As used in this application, the terms "component," "module," "system," and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, or software in execution. For example, a component may be, but is not limited to being: a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of example, both an application running on a computing device and the computing device can be a component. One or more components can reside within a process and/or thread of execution and a component can be localized on one computer and/or distributed between two or more computers. In addition, these components can execute from various computer readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the internet with other systems by way of the signal).
It should be noted that the above-mentioned embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, which should be covered by the claims of the present invention.

Claims (10)

1. A large-scale graph generation method based on deep adversarial learning, characterized by comprising:
for a graph G, an adjacency matrix A and a feature matrix X are given; after sampling, they are input into a graph attention encoder to obtain the structural information of the graph, and a community detection algorithm is applied to obtain the ground-truth community labels;
feeding the community information and the graph representation output by the graph attention encoder to a community decoder to generate community labels corresponding to the nodes;
adjusting the parameters of the graph attention encoder and the community decoder by back propagation to guide them toward a community-preserving latent space;
feeding the community information and the graph representation output by the graph attention encoder to a graph decoder to generate edge probabilities;
and using the edge probabilities to simulate the graph score matrix, from which a new graph generated by the model is finally sampled.
2. The large-scale graph generation method based on deep adversarial learning according to claim 1, characterized in that the reconstruction of the graph structure also needs to be enhanced before sampling, comprising:
for the graph G, the adjacency matrix A and the feature matrix X are given and input into a ladder encoder to obtain the structural information of the graph, and the community detection algorithm is applied to obtain the ground-truth community labels;
feeding the community information and the graph representation output by the ladder encoder to a discriminator to determine whether the input graph is a fake graph distinct from the real graph;
meanwhile, the coarsened graph of each level distributes its community structure features to the original nodes through a differentiable layer-wise message passing process;
and decoding a series of community information of each original node to enhance the reconstruction of the graph structure.
3. The large-scale graph generation method based on deep adversarial learning according to claim 2, characterized in that the ladder encoder includes graph convolution, graph pooling, graph readout, and graph transpose pooling.
4. The large-scale graph generation method based on deep adversarial learning according to claim 2 or 3, characterized by comprising:
generating a new graph with the observed hierarchical community structure distribution by using variational inference before decoding the node features;
and selecting a multi-layer perceptron as the inference model to complete the mapping from the reconstructed features to the prior distribution.
5. The large-scale graph generation method based on deep adversarial learning according to claim 4, characterized in that the discriminator comprises:
the discrimination task requiring the graph features obtained by the ladder encoder, i.e., the output matrix of the graph readout layer;
and the discriminator being optimized by formulating a minimax game, with joint training and parameter updates by gradient ascent.
6. The large-scale graph generation method based on deep adversarial learning according to claim 5, characterized by comprising:
generating edges for node i by sampling from the categorical distribution parameterized by the i-th row of A_out;
selecting entries of A_out until the number of edges reaches a predefined number;
the total time complexity of generating the new graph being O(n²).
7. The large-scale graph generation method based on deep adversarial learning according to claim 1, characterized in that the graph decoder comprises:
decoding the hierarchical representation sequence;
and predicting the node links.
8. The large-scale graph generation method based on deep adversarial learning according to claim 1 or 7, characterized by comprising:
given the input node features X ∈ R^{n×d_in} and an ego graph;
aggregating messages from the graph structure using a multi-head attention mechanism to obtain the hidden variable h_u of the central node u of the corresponding ego graph.
9. The large-scale graph generation method based on deep adversarial learning according to claim 8, characterized in that, for each ego graph, the message aggregation comprises

h_u = (h_u^(1) ‖ h_u^(2) ‖ … ‖ h_u^(h_attn)) W_out,

where h_u denotes one row of the hidden variables of the graph attention coding layer, i.e., the hidden variable on node u; W_out denotes the output projection matrix; h_attn denotes the number of attention heads; and d_att denotes the dimension of the attention vector a.
10. The large-scale graph generation method based on deep adversarial learning according to claim 9, characterized in that a probability distribution based on node degree is used as the initial-node sampling strategy,

P(u) = deg_u / Σ_{v∈V} deg_v,

where deg_u denotes the degree of node u; assuming that n_s nodes are sampled as initial central nodes in each traversal, the n_s ego graphs are sampled as the input of the encoding process, and the initial node set is denoted as V_c.
CN202211167773.XA 2022-09-23 2022-09-23 Large-scale graph generation method based on deep adversarial learning Pending CN115879507A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211167773.XA CN115879507A (en) 2022-09-23 2022-09-23 Large-scale graph generation method based on deep adversarial learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211167773.XA CN115879507A (en) 2022-09-23 2022-09-23 Large-scale graph generation method based on deep adversarial learning

Publications (1)

Publication Number Publication Date
CN115879507A true CN115879507A (en) 2023-03-31

Family

ID=85769994

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211167773.XA Pending CN115879507A (en) 2022-09-23 2022-09-23 Large-scale graph generation method based on deep adversarial learning

Country Status (1)

Country Link
CN (1) CN115879507A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117194721A (en) * 2023-08-22 2023-12-08 黑龙江工程学院 Method and device for generating graph data and computer equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination