CN116319426A - Network sparsification measurement method and system based on graph neural network - Google Patents
- Publication number: CN116319426A
- Application number: CN202310304303.1A
- Authority
- CN
- China
- Prior art keywords
- network
- layer
- data
- service quality
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2491—Mapping quality of service [QoS] requirements between different networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Abstract
The invention relates to the technical field of network sparsification measurement, and provides a network sparsification measurement method based on a graph neural network, comprising the following steps: S1: sparsely sample the end-to-end network service quality data; S2: based on the sparsely sampled data, establish and train a network service quality prediction model; S3: use the historically learned spatio-temporal network service quality information for learning and reasoning at future moments by means of a continuous learning function. The technical scheme fully mines the spatio-temporal correlation of edge cloud network service quality and uses that correlation to complete the whole-network end-to-end service quality matrix. The method solves the problem that traditional algorithms filter out part of the useful data while filtering abnormal data, and can cope with complex and changeable network service quality data. It also solves the problem that traditional algorithms cannot perform continuous inference during completion.
Description
Technical Field
The invention relates to the technical field of network sparsification measurement, in particular to a network sparsification measurement method and system based on a graph neural network.
Background
In recent years, the demand for distributed systems has been increasing. The advent of increasingly delay-sensitive applications, such as VR, AR and autonomous driving, has led service providers to deploy services ever more densely and distributedly. The edge cloud, an emerging edge computing model that allows for dense and distributed service deployment, is attracting attention. Facing such a distributed edge cloud system, the service provider needs to know all end-to-end network quality of service (QoS) to make appropriate resource arrangements and requests. In addition, when a distributed system fails, comprehensive and accurate end-to-end quality of service data can better help service providers estimate the number of affected users and resolve the failure.
However, the dense deployment of services presents a significant challenge to the monitoring of network QoS. For example, for an edge cloud system with n nodes, performing a full end-to-end QoS measurement at one time requires on the order of n^2 probes, which places an unbearable burden on the system. Even if these burdens are not considered, network errors, data transmission delays, etc. will impose a strong sparsity on the full QoS data. The present invention is therefore directed to filling in the truly measured, sparse end-to-end QoS by a data-filling algorithm. In this way, the complexity of measurement can be reduced, the measurement efficiency can be improved, and the control over the network QoS can be enhanced.
Most existing network service quality completion algorithms are mathematics-based matrix completion algorithms and cannot cope with the complex variability of edge cloud data. At the same time, they have the following disadvantages:
(1) The complex variability of the edge cloud service quality is not considered. The matrix is decomposed mathematically and then completed, so useful features in the data are weakened while part of the abnormal data is filtered out, which affects prediction accuracy.
(2) The effect of previously learned patterns on future data completion is not considered. Applications such as VR, AR and autonomous driving place high real-time requirements on network services, which requires the network service provider to perceive the network quality of service in real time as well. The mathematics-based approach retrains and re-infers every time a completion is made, and the completion patterns already learned cannot be applied to the inference of new data.
(3) The spatio-temporal relationship is not used to aid quality-of-service data reasoning. Network service quality data has strong spatio-temporal correlation; in particular, the end-to-end delay between two servers is strongly correlated with the physical distance between them. The traditional mathematical approach cannot effectively exploit the spatio-temporal correlation among nodes, which affects both the accuracy and the computational efficiency of reasoning.
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide a network sparseness measurement method and system based on a graph neural network, which are used for solving the problems mentioned in the background art.
The above object of the present invention is achieved by the following technical solutions:
a network sparsification measurement method based on a graph neural network comprises the following steps:
s1: sparse sampling is carried out on the end-to-end network service quality data;
s2: based on the sparsely sampled data, a network service quality prediction model is established and trained;
s3: and using the historically learned spatio-temporal network service quality information for learning and reasoning at future moments by adopting a continuous learning function mode.
Further, the end-to-end network quality of service data includes end-to-end delay data, end-to-end bandwidth data, and end-to-end packet loss rate data.
Further, in step S1, the thinning sampling is performed on the end-to-end network quality of service data, specifically:
s11: formulate a network sparsification measurement plan: for a distributed system containing n network nodes and n 2 end-to-end paths, randomly sample all the end-to-end paths at a sampling rate alpha, wherein the random sampling is performed by Bernoulli sampling;
s12: execute the network sparsification measurement plan and store its sampling results at a central node.
Further, in step S2, a network quality of service prediction model is built and trained, specifically:
s21: constructing a delay matrix of the end-to-end path into graph structure data;
s22: a feature encoder based on the graph neural network adopts multiple message-passing encoding layers to encode the features represented by the graph structure data;
s23: adopting a delay predictor based on a multi-layer perceptron to restore the hidden characteristic generated after the graph structure data is encoded in the step S22 into a complete network service quality prediction matrix;
s24: adopt a gradient descent method to perform repeated training optimization and minimize the error.
Further, in step S21, the delay matrix of the end-to-end path is constructed as the graph structure data, specifically:
for the distributed system comprising n network nodes, the constructed graph structure data is G(V, E), where V = {v_1, v_2, ..., v_n} is the set of all n network nodes and e_ij ∈ E represents the network quality of service from network node i to network node j.
Further, in step S22, the feature encoder based on the graph neural network encodes the features represented by the graph structure data by using a multi-layer encoding layer based on message passing, specifically:
for any node v ∈ V, the network service quality information on the edges between v and its neighbor nodes is embedded into the representation of v by aggregation, where the aggregation formula is:

a_v^l = AGG_l( { σ( W_enc^l · CONCAT( h_u^(l-1), e_(u,v)^(l-1) ) ) : u ∈ N(v) } )

wherein AGG_l is the aggregation function, σ is the nonlinear sigmoid function, CONCAT denotes the concatenation of two vectors, l is the layer index, v is the node to be encoded, W_enc^l is the learnable encoding parameter matrix of layer l, h_u^(l-1) is the feature representation of node u at layer l-1, and e_(u,v)^(l-1) is the feature representation at layer l-1 of the quality of service from node u to node v;
after aggregation is finished, the aggregate a_v^l obtained at the current layer is combined with the representation of v from the previous layer to jointly obtain the output y_v^l of the current layer, expressed as:

y_v^l = σ( W_agg^l · CONCAT( a_v^l, h_v^(l-1) ) )

wherein W_agg^l is the learnable aggregation parameter matrix of layer l and h_v^(l-1) is the output of layer l-1; when the current layer is the first layer, the feature x_0 of v is used to initialize h^0;
Defining an inter-layer residual, injecting said inter-layer residual into the output y of the first layer l Specifically, the information of the previous layer is input to the current layer in a ratio of 1-lambda:
Define an initial residual to compensate for the gradient vanishing and gradient explosion caused by multi-layer graph convolutional encoding, expressed as:

h^l ← h^l + β · x_0, with β = ||h^l||_F / ||x_0||_F

wherein x_0 is the initial feature matrix of all nodes, h^l is the feature matrix formed by concatenating all node features of layer l, and β is an adaptive scaling factor determined using the Frobenius norm;
finally, the inter-layer residual and the initial residual are superimposed to form the output of the graph neural network:

h^l = λ · y^l + (1-λ) · h^(l-1) + β · x_0

At this point, the encoding of the features represented by the graph structure data at layer l is complete.
Further, in step S23, a delay predictor based on a multi-layer perceptron is adopted to restore the implicit characteristics generated after the encoding of the graph structure data in step S22 to a complete network quality of service prediction matrix, which specifically includes:
Decoding based on a multi-layer perceptron is adopted to reconstruct the service quality matrix, restoring it from its high-dimensional implicit representation, expressed as:

Â_t = (W · h) · (W · h)^T + b

wherein Â_t is the service quality prediction output of the final model, t is the current moment, W is the learnable parameter matrix for output, T is the matrix transposition operation, and b is a bias coefficient matrix.
Further, in step S24, a gradient descent method is adopted to minimize the error, specifically:
the difference between the predicted value and the true value is described using the mean square error, expressed as:

l = (1/|Ω|) · Σ_((i,j)∈Ω) ( Â_t[i,j] − A_t[i,j] )^2

where l is the error between the predicted value and the true value, Ω is the set of sampled entries, Â_t is the predicted service quality, and A_t is the true service quality;
On the basis of the obtained error, repeated training optimization is performed by gradient descent, and the learning rate is dynamically adjusted with the Adam optimizer during training, until the error change between two successive optimizations is smaller than a preset threshold.
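As a minimal sketch of this optimisation loop: plain gradient descent with the stated stopping rule on the error change between two optimisations. The Adam optimiser's adaptive learning-rate adjustment is omitted for brevity, and all function and parameter names here are illustrative, not from the patent.

```python
def optimise(loss_fn, grad_fn, w, lr=0.1, tol=1e-8, max_iter=10000):
    """Repeated training optimisation by gradient descent, stopping when
    the change in error between two optimisations falls below tol."""
    prev = loss_fn(w)
    for _ in range(max_iter):
        w = w - lr * grad_fn(w)          # gradient-descent step
        cur = loss_fn(w)
        if abs(prev - cur) < tol:        # error change below threshold
            break
        prev = cur
    return w

# toy example: minimise the error (w - 3)^2
w_star = optimise(lambda w: (w - 3) ** 2, lambda w: 2 * (w - 3), 0.0)
```

In the patent's setting, `loss_fn` would be the masked mean square error over the sampled QoS entries and `w` the encoder and decoder parameters.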
Further, in step S3, the historical learned spatio-temporal network quality of service information is used for learning and reasoning at a future time by adopting a continuous learning function, specifically:
F′ n+1 =G(F 0 ,F 1 ,...,F n )
wherein G is the continuous learning function, the input is the historical models F_0, F_1, ..., F_n, and the output is the initial model F′_(n+1) at time n+1; at time n+1, F′_(n+1) is first used to initialize the model, and training is performed on the basis of that initialization.
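The precise form of G is not fixed by the text; one illustrative instantiation (an assumption, not the patent's definition) is to warm-start F′_(n+1) with an exponentially weighted average of the historical models' parameters, so recent models weigh more:

```python
def continual_init(history, decay=0.5):
    """One possible continuous-learning function G: average the parameter
    vectors of the historical models F_0..F_n with exponentially decaying
    weights (the oldest model gets the smallest weight) to produce the
    initial model for time n+1."""
    n = len(history)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    total = sum(weights)
    dim = len(history[0])
    return [sum(w * f[k] for w, f in zip(weights, history)) / total
            for k in range(dim)]

# two historical models F_0, F_1 with two parameters each
f_init = continual_init([[1.0, 2.0], [3.0, 4.0]])
```

Training at time n+1 then starts from `f_init` instead of a random initialization, which is what lets the framework reuse historically learned spatio-temporal patterns.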
A graph neural network-based network sparsification measurement system for performing the graph neural network-based network sparsification measurement method as described above, comprising:
the sparse data acquisition module is used for sparsely sampling end-to-end network service quality data;
The model building and training module is used for building a network service quality prediction model and training based on the sparsely sampled data;
and the continuous learning module is used for applying the historically learned spatio-temporal network service quality information to learning and reasoning by means of a continuous learning function.
A computer device comprising a memory and one or more processors, the memory having stored therein computer code which, when executed by the one or more processors, causes the one or more processors to perform a method as described above.
A computer readable storage medium storing computer code which, when executed, performs a method as described above.
Compared with the prior art, the invention has at least one of the following beneficial effects:
(1) It fully mines the spatio-temporal correlation of the edge cloud network service quality and completes the whole-network end-to-end service quality matrix by utilizing the spatio-temporal correlation among the network service qualities.
(2) The method solves the problem that the traditional algorithm filters partial useful data while filtering abnormal data, and can cope with complex and changeable network service quality data.
(3) The method solves the problem that the traditional algorithm cannot infer continuously during completion. With the aid of the historical data model, the model can be applied to the inference of new data, reducing the computational cost caused by retraining, so that the whole framework can perform fast, continuous and stable inference.
Drawings
FIG. 1 is an overall flow chart of a network sparsification measurement method based on a graph neural network;
fig. 2 is an overall structure diagram of the network sparsification measurement system based on the graph neural network.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Conventional fault prediction is mainly based on methods such as supervised classification or prediction, while the edge cloud scenario presents challenges such as more complex scenes and more uncontrollable factors.
The method mainly has the following technical breakthroughs:
(1) The patent provides an algorithm for supplementing network service quality data based on a Graph Neural Network (GNN), which can fully utilize space-time correlation among nodes to supplement missing parts in all end-to-end service quality data.
(2) On the basis of the graph neural network, this patent also designs an adaptive residual structure to overcome the weakening of rapidly changing, useful features that occurs when a graph neural network is used. The residual structure effectively extracts high-frequency, valuable features from the original data.
(3) This patent adopts the mode of continuous learning to reduce neural network's calculation cost. The patent proposes a continuous learning-based approach that utilizes inference knowledge learned from existing data to infer unknown parts of quality of service data. In this way, we can reduce the number of iterations of the neural network, thereby reducing the computational complexity.
The invention relates to the term interpretation:
graph neural network:
the graph neural network (Graph Neural Network, GNN) is a framework, developed in recent years, for applying deep learning directly to graph structure data; its excellent performance has attracted a high degree of attention and intensive exploration by scholars. The GNN converts the graph structure data into a standard representation by applying certain strategies to the nodes and edges in the graph, and feeds it into various neural networks for training, achieving excellent results on tasks such as node classification, edge information propagation and graph clustering.
The history of GNNs can be traced back to 2005, when Gori et al. first proposed the GNN concept, using RNNs to handle undirected, directed, labeled and cyclic graphs. After this, Scarselli et al. and Micheli et al. inherited and developed this mode of GNN algorithm and improved it to some degree. Early GNNs mainly used RNNs as the main framework, generating a vector representation for each node through simple feature mapping and node aggregation, and could not cope well with the complex and changeable graph data found in reality. For this situation, Bruna et al. proposed applying CNNs to graphs; through clever transformations of the convolution operator, graph convolutional networks (Graph Convolutional Network, GCN) were proposed, and many variants were derived. The GCN realizes the translation invariance, local perception and weight sharing of the CNN on graphs, and provides conceptual guidance and reference for the construction and improvement of other GNN frameworks.
From the GNN concept proposed by Gori et al. in 2005, to the advent of the GCN providing an effective processing paradigm for non-Euclidean structured data, to the proposal of different GNN framework variants such as GAE, GAT, GRN and GGN, and to the application of GNNs in various fields, the GNN has undergone a process from nonexistence to existence, and from existence to optimization, in both theory and practice, and the GNN architecture family continues to develop and improve. From this history, one can see the continuous improvement and optimization of GNN algorithms and structures.
Residual structure:
A residual network is characterized by being easy to optimize, and its accuracy can be improved by adding considerable depth. The residual blocks inside a deep neural network are connected by skip connections, which alleviates the gradient vanishing problem caused by increasing depth in deep neural networks.
Continuous learning:
Continuous learning refers to learning models for a large number of tasks in sequence without forgetting the knowledge obtained from previous tasks. This is an important concept because, under supervised learning, machine learning models are trained to be optimal functions for a given data set or data distribution. In a real-world environment, however, data is rarely static and may change, and a typical ML model may degrade when faced with unseen data. This phenomenon is known as catastrophic forgetting.
Bernoulli sampling:
Bernoulli sampling is a random sampling method based on the Bernoulli distribution. In Bernoulli sampling, each event has two possible outcomes, such as success or failure, heads or tails. These outcomes are randomly generated from a Bernoulli distribution with a parameter p, where p represents the probability of success. If an event is successful, we mark it as 1; otherwise it is marked as 0.
Bernoulli sampling is often used in machine learning and statistics, for example in binary classification tasks, we can use bernoulli sampling to generate sample data, where success represents positive examples and failure represents negative examples.
An implementation of bernoulli sampling may use a pseudo-random number generator, such as a function in a random module in Python. In particular, we can use the random () function to generate a random number r between 0 and 1, if r is less than or equal to p, we mark the event as 1, otherwise as 0. In this way we can generate a sample dataset where the result of each event is randomly generated based on the probability of p.
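The implementation described above can be sketched as follows. One small deviation from the text: a strict `< p` comparison is used so that p = 0 is guaranteed to yield no successes and p = 1 all successes, since `random.random()` returns values in [0, 1).

```python
import random

def bernoulli_sample(p, n, seed=None):
    """Generate n Bernoulli(p) outcomes: 1 (success) with probability p,
    otherwise 0, using Python's pseudo-random number generator."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]

draws = bernoulli_sample(0.3, 10, seed=42)
```

With a fixed seed the draws are reproducible, which is convenient when replaying a measurement plan.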
First embodiment
As shown in fig. 1, the present embodiment provides a network sparseness measurement method based on a graph neural network, including the following steps:
s1: sparsely sample the end-to-end network service quality data.
S2: establish and train a network service quality prediction model based on the sparsely sampled data.
S3: use the historically learned spatio-temporal network service quality information for learning and reasoning at future moments by means of a continuous learning function.
Each step of the present invention is described in detail below:
in step S1, the sparse sampling is performed on the end-to-end network quality of service data, specifically:
s11: formulate a network sparsification measurement plan: for a distributed system containing n network nodes and n 2 end-to-end paths, randomly sample all the end-to-end paths at a sampling rate alpha, wherein the random sampling is performed by Bernoulli sampling.
S12: execute the network sparsification measurement plan and store its sampling results at a central node.
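The plan of steps S11 and S12 can be sketched as a Bernoulli mask over the n x n path matrix; function and parameter names here are illustrative, not from the patent.

```python
import random

def make_measurement_plan(n, alpha, seed=0):
    """Bernoulli-sample the n x n end-to-end paths at sampling rate alpha.
    mask[i][j] == 1 means the QoS of the path from node i to node j is
    measured and its result is reported to the central node."""
    rng = random.Random(seed)
    return [[1 if rng.random() < alpha else 0 for _ in range(n)]
            for _ in range(n)]

mask = make_measurement_plan(8, 0.3)
measured = sum(map(sum, mask))   # number of paths actually probed
```

The expected number of probes is alpha * n^2, which is what reduces the measurement burden relative to a full n^2 sweep.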
The end-to-end network service quality data comprises end-to-end delay data, end-to-end bandwidth data and end-to-end packet loss rate data.
In step S2, a network quality of service prediction model is built and trained, specifically:
s21: and constructing the delay matrix of the end-to-end path into graph structure data, wherein the graph structure data comprises the following specific steps of:
for the distributed system comprising n network nodes, the constructed graph structure data is G(V, E), where V = {v_1, v_2, ..., v_n} is the set of all n network nodes and e_ij ∈ E represents the network quality of service from network node i to network node j.
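A minimal sketch of this graph construction, under the assumption that unmeasured entries of the sparse delay matrix are marked `None`:

```python
def build_graph(delay_matrix):
    """Construct graph structure data G(V, E) from a sparsely measured
    end-to-end delay matrix. V is the node set {0..n-1}; E maps an edge
    (i, j) to the measured QoS from node i to node j. An unmeasured
    path (None) yields no edge."""
    n = len(delay_matrix)
    V = list(range(n))
    E = {(i, j): d
         for i, row in enumerate(delay_matrix)
         for j, d in enumerate(row)
         if d is not None}
    return V, E

V, E = build_graph([[0.0, 12.5, None],
                    [None, 0.0, 7.1],
                    [3.3, None, 0.0]])
```

Because edges are directed pairs (i, j), the asymmetric QoS from i to j and from j to i are kept as distinct edge attributes.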
S22: the characteristic encoder based on the graph neural network adopts a plurality of coding layers based on message transmission to code the characteristics shown by the graph structure data, and specifically comprises the following steps:
In the first part, the feature encoding based on the graph neural network, a multi-layer message-passing encoding layer is employed to encode the features. For any v ∈ V, we obtain an embedded representation of v by aggregating the attributes on all edges related to v, embedding the network service quality information between v and its neighbor nodes into the representation of v. The aggregation formula is:

a_v^l = AGG_l( { σ( W_enc^l · CONCAT( h_u^(l-1), e_(u,v)^(l-1) ) ) : u ∈ N(v) } )

wherein AGG_l is the aggregation function, σ is the nonlinear sigmoid function, CONCAT denotes the concatenation of two vectors, l is the layer index, v is the node to be encoded, W_enc^l is the learnable encoding parameter matrix of layer l, h_u^(l-1) is the feature representation of node u at layer l-1, and e_(u,v)^(l-1) is the feature representation at layer l-1 of the quality of service from node u to node v;
after aggregation is finished, the aggregate a_v^l obtained at the current layer is combined with the representation of v from the previous layer to jointly obtain the output y_v^l of the current layer, expressed as:

y_v^l = σ( W_agg^l · CONCAT( a_v^l, h_v^(l-1) ) )

wherein W_agg^l is the learnable aggregation parameter matrix of layer l and h_v^(l-1) is the output of layer l-1; when the current layer is the first layer, the feature x_0 of v is used to initialize h^0;
The node topology information encoding based on the graph neural network is now complete. However, aggregation based on the aggregate function AGG_l weakens the useful features of some nodes, so we propose an adaptive residual module to compensate for the GNN's weakening of rapidly changing network features. In this patent we define two residual paths in total, injecting the inter-layer residual and the initial residual, respectively, into the output y^l of layer l.
Defining an inter-layer residual, injecting said inter-layer residual into the output y of the first layer l Specifically, the information of the previous layer is input to the current layer in a ratio of 1-lambda:
on the basis of the above information fusion, the multi-layer coding stack can lead to potential gradient vanishing and gradient explosion, here we design another residual route to compensate for the gradient vanishing and gradient explosion caused by multi-layer picture convolution coding. Defining an initial residual error for compensating gradient disappearance and gradient explosion caused by multi-layer picture convolution coding, wherein the expression is as follows:
wherein x_0 is the initial feature matrix of all nodes; we scale x_0 by β and superimpose it onto the layer output. Because the value range of each layer relative to x_0 differs, we set β to an adaptive value to match the value range of different layers, an adaptive scaling determined using the Frobenius norm:

β = ||h^l||_F / ||x_0||_F
wherein h^l is the feature matrix formed by concatenating all node features of layer l, and x_0 is the initial feature matrix of all nodes;
finally, the inter-layer residual and the initial residual are superimposed to give the output h^l of the neural network layer; the expression is:

h^l = λ y^l + (1-λ) h^(l-1) + β x_0
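The superposition of the two residual paths on a layer's raw output can be sketched as below; the exact form of β (here the ratio of Frobenius norms, so the injected initial features match the current layer's magnitude) is an assumption, since the patent only states that β is determined using the Frobenius norm:

```python
import numpy as np

def layer_output_with_residuals(y_l, h_prev, x0, lam=0.8):
    """Superimpose the two residual paths on a layer's raw output y_l.
    Inter-layer residual: mix in the previous layer's output at ratio (1 - lam).
    Initial residual: add x0 scaled by an adaptive beta computed from
    Frobenius norms (assumed form) so its magnitude tracks the layer's."""
    mixed = lam * y_l + (1.0 - lam) * h_prev            # inter-layer residual
    beta = np.linalg.norm(mixed) / np.linalg.norm(x0)   # assumed adaptive scaling
    return mixed + beta * x0                            # initial residual

rng = np.random.default_rng(1)
y_l, h_prev, x0 = (rng.normal(size=(4, 3)) for _ in range(3))
h_l = layer_output_with_residuals(y_l, h_prev, x0)
print(h_l.shape)
```

With this choice of β, the injected term β·x_0 always has the same Frobenius norm as the fused layer output, regardless of the layer's value range.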
At this point, encoding of the features represented by the graph structure data at layer l is completed.
S23: using a delay predictor based on a multi-layer perceptron, restore the implicit features generated after encoding the graph structure data in step S22 into a complete network service quality prediction matrix, which specifically includes:
the second part is a delay predictor based on a multi-layer perceptron. In step S22, implicit features are generated for every network node v.
In this step, a decoder based on a multi-layer perceptron reconstructs the service quality matrix, restoring the quality of service matrix from its high-dimensional implicit representation; the expression is:

Â_t = W h^T + b
wherein Â_t is the service quality prediction output of the final model, t is the current time, W is the learnable parameter matrix for the output, T denotes the matrix transpose operation, and b is the bias coefficient matrix.
Through this decoder, we restore the quality of service matrix from its high-dimensional implicit representation.
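A minimal sketch of such a perceptron-based delay predictor is given below; since the patent specifies only "a multi-layer perceptron", the layer sizes, the ReLU hidden layer, and the pairwise-concatenation reading (each entry of the n×n matrix predicted from [h_i ; h_j]) are illustrative assumptions:

```python
import numpy as np

def mlp_decode(H, W1, b1, W2, b2):
    """MLP-based delay predictor: for every node pair (i, j), a small MLP
    maps the concatenated implicit features [h_i ; h_j] to one predicted
    QoS value, rebuilding the full n x n quality-of-service matrix."""
    n, d = H.shape
    # all pairwise concatenations, shape (n*n, 2d)
    pairs = np.concatenate(
        [np.repeat(H, n, axis=0), np.tile(H, (n, 1))], axis=1)
    hidden = np.maximum(0.0, pairs @ W1 + b1)   # ReLU hidden layer
    out = hidden @ W2 + b2                      # linear output layer
    return out.reshape(n, n)

rng = np.random.default_rng(2)
n, d, hdim = 5, 3, 8
H = rng.normal(size=(n, d))                 # encoder output (implicit features)
W1 = rng.normal(size=(2 * d, hdim)); b1 = np.zeros(hdim)
W2 = rng.normal(size=(hdim, 1));     b2 = np.zeros(1)
A_hat = mlp_decode(H, W1, b1, W2, b2)
print(A_hat.shape)
```

The decoder thus maps n implicit feature vectors back to a dense n×n prediction matrix, filling in the entries that sparse sampling never measured.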
S24: adopting the gradient descent method for repeated training optimization to minimize the error, which specifically includes:
the difference between the predicted value and the true value is described using the mean square error, expressed as:

l = (1/n²) Σ_{i,j} (Â_t(i,j) − A_t(i,j))²
wherein l is the error between the predicted value and the true value, Â_t is the predicted quality of service, and A_t is the true quality of service;
on the basis of the obtained error, repeated training optimization is performed by gradient descent, and the learning rate is dynamically adjusted with an Adam optimizer during training, until the error change between two successive optimization steps is smaller than a preset threshold (for example, 0.001).
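The optimization loop can be sketched as follows; a low-rank factorization stands in for the full GNN model so the sketch stays short (that stand-in, the sampling rate, and the hyperparameter values are assumptions), while the masked mean-square error, the Adam update, and the stop-when-error-change-below-threshold rule follow the description above:

```python
import numpy as np

def train_until_converged(A, mask, rank=2, lr=0.05, tol=1e-3, max_iter=5000):
    """Masked-MSE training with an Adam optimizer, stopped when the error
    change between two successive optimization steps falls below tol."""
    rng = np.random.default_rng(3)
    n = A.shape[0]
    U = rng.normal(scale=0.1, size=(n, rank))
    V = rng.normal(scale=0.1, size=(n, rank))
    params = [U, V]
    m = [np.zeros_like(p) for p in params]   # Adam first moments
    v = [np.zeros_like(p) for p in params]   # Adam second moments
    b1, b2, eps = 0.9, 0.999, 1e-8
    prev = np.inf
    for t in range(1, max_iter + 1):
        R = (U @ V.T - A) * mask             # residual on sampled entries only
        loss = (R ** 2).sum() / mask.sum()   # mean square error over samples
        if abs(prev - loss) < tol:           # preset convergence threshold
            break
        prev = loss
        grads = [2 * R @ V / mask.sum(), 2 * R.T @ U / mask.sum()]
        for i, g in enumerate(grads):        # Adam parameter update
            m[i] = b1 * m[i] + (1 - b1) * g
            v[i] = b2 * v[i] + (1 - b2) * g * g
            mh, vh = m[i] / (1 - b1 ** t), v[i] / (1 - b2 ** t)
            params[i] -= lr * mh / (np.sqrt(vh) + eps)
    return loss

rng = np.random.default_rng(4)
n = 8
A = rng.random((n, 2)) @ rng.random((2, n))        # toy low-rank QoS matrix
mask = (rng.random((n, n)) < 0.5).astype(float)    # Bernoulli-sampled entries
final_loss = train_until_converged(A, mask)
print(final_loss)
```

In practice one would hand the same masked loss and stopping rule to a framework optimizer rather than hand-rolling the Adam update.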
In step S3, the historically learned spatio-temporal network service quality information is used for learning and reasoning at future moments by means of a continuous learning function, which specifically includes:
since training and reasoning from scratch take a long time, and the implicit spatio-temporal correlations between nodes in the distributed system change little over a short period, we process the historical reasoning matrices by means of continuous learning and use the historically learned spatio-temporal network service quality information for learning and reasoning at future moments, thereby reducing the computational complexity of training and reasoning at new time points.
F′ n+1 =G(F 0 ,F 1 ,...,F n )
wherein G is the continuous learning function, its input is the historical models F_0, F_1, ..., F_n, and its output is the initial model F'_{n+1} at time n+1; at time n+1, F'_{n+1} is first used to initialize the model, and training is then performed on this basis.
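One simple instance of such a function G is sketched below; the patent does not fix a particular G, so taking it to be an exponentially weighted average of historical parameter vectors (most recent model weighted highest) is purely an illustrative assumption:

```python
import numpy as np

def continual_init(history):
    """A toy continuous-learning function G: produce the initial model
    F'_{n+1} from the historical models F_0..F_n by an exponentially
    weighted average of their parameter vectors (assumed form of G)."""
    k = len(history)
    weights = np.array([0.5 ** (k - 1 - i) for i in range(k)])
    weights /= weights.sum()          # most recent model gets the largest weight
    return sum(w * F for w, F in zip(weights, history))

# toy history: F_0..F_3 as 3-dimensional parameter vectors
F_hist = [np.full(3, float(t)) for t in range(4)]
F_init = continual_init(F_hist)       # warm start for time n+1
print(F_init)
```

Training at time n+1 then starts from F_init instead of a random initialization, which is what saves the from-scratch training cost mentioned above.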
Second embodiment
As shown in fig. 2, the present embodiment provides a network sparsification measurement system based on the graph neural network, for performing the network sparsification measurement method based on the graph neural network of the first embodiment, comprising:
the sparse data acquisition module 1, used for sparsely sampling the end-to-end network service quality data;
the model building and training module 2, used for building and training a network service quality prediction model based on the sparsely sampled data;
the continuous learning module 3, used for applying the historically learned spatio-temporal network service quality information to learning and reasoning at future moments by means of a continuous learning function.
A computer readable storage medium storing computer code which, when executed, performs the method described above. Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing related hardware; the program may be stored in a computer readable storage medium, and the storage medium may include: read-only memory (ROM), random access memory (RAM), magnetic or optical disk, and the like.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above examples, and all technical solutions belonging to the concept of the present invention belong to the protection scope of the present invention. It should be noted that modifications and adaptations to the present invention may occur to one skilled in the art without departing from the principles of the present invention and are intended to be within the scope of the present invention.
The technical features of the above-described embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination contains no contradiction, it should be considered within the scope of this description.
It should be noted that the above embodiments can be freely combined as needed.
Claims (10)
1. The network sparsification measurement method based on the graph neural network is characterized by comprising the following steps of:
s1: sparse sampling is carried out on the end-to-end network service quality data;
s2: based on the sparsely sampled data, a network service quality prediction model is established and trained;
s3: and using the historically learned spatio-temporal network service quality information for learning and reasoning at future moments by adopting a continuous learning function mode.
2. The network sparsification measurement method based on the graph neural network according to claim 1, wherein the end-to-end network service quality data includes end-to-end delay data, end-to-end bandwidth data, and end-to-end packet loss rate data.
3. The network sparsification measurement method based on the graph neural network according to claim 1, wherein in step S1, the sparsification sampling is performed on the end-to-end network quality of service data, specifically:
S11: formulating a network sparse measurement plan: for a distributed system comprising n network nodes and n² end-to-end paths, random sampling is performed over all the end-to-end paths according to a sampling rate α, the random sampling being performed in a Bernoulli sampling manner;
s12: and executing the network sparse measurement plan, and storing a sampling result of the network sparse measurement plan to a central node.
4. The network sparsification measurement method based on the graph neural network according to claim 2, wherein in step S2, a network service quality prediction model is built and trained, specifically:
s21: constructing a delay matrix of the end-to-end path into graph structure data;
S22: a feature encoder based on the graph neural network adopts multiple message-passing encoding layers to encode the features represented by the graph structure data;
S23: adopting a delay predictor based on a multi-layer perceptron to restore the implicit features generated after the graph structure data are encoded in step S22 into a complete network service quality prediction matrix;
S24: adopting a gradient descent method to perform repeated training optimization and minimize the error.
5. The network sparsification measurement method according to claim 4, wherein in step S21, the delay matrix of the end-to-end path is constructed as the graph structure data, specifically:
the distributed system comprises n network nodes, and the constructed graph structure data is G(V, E), wherein V = {v_1, v_2, ..., v_n} is the set of all n network nodes, and e_ij ∈ E represents the network quality of service from network node i to network node j.
6. The method for measuring network sparseness based on a graph neural network according to claim 5, wherein in step S22, the feature encoder based on the graph neural network encodes the features represented by the graph structure data by using a multi-layer encoding layer based on message passing, specifically:
for any u ∈ V, embedding network service quality information among the neighbor network nodes of u into the representation of u in an aggregation manner, wherein the aggregation formula is:

h_N(v)^l = AGG_l({ σ(W_l · CONCAT(h_u^(l-1), e_uv^(l-1))) : u ∈ N(v) })
wherein AGG_l is the aggregation function, σ is the nonlinear sigmoid function, CONCAT denotes the concatenation of two vectors, l is the layer index, v is the node to be encoded, W_l is the learnable encoding parameter matrix of layer l, h_u^(l-1) is the feature representation of node u at layer l-1, and e_uv^(l-1) is the feature representation of the quality of service from node u to node v at layer l-1;
after the aggregation is finished, the aggregated result h_N(v)^l of the current layer is combined with the representation of v from the previous layer to obtain the output y_v^l of the current layer; the expression is:

y_v^l = σ(W_l^agg · CONCAT(h_v^(l-1), h_N(v)^l))
wherein W_l^agg is the learnable aggregation parameter matrix of layer l, and h^(l-1) is the output of layer l-1; when the current layer is the first layer, h^0 is initialized with the feature x of v;
defining an inter-layer residual and injecting it into the output y^l of layer l; specifically, the information of the previous layer is input to the current layer at a ratio of 1-λ:

y^l ← λ y^l + (1-λ) h^(l-1)
defining an initial residual for compensating the gradient vanishing and gradient explosion caused by multi-layer graph convolution encoding, wherein the expression is as follows:
wherein x_0 is the initial feature matrix of all nodes, and β is an adaptive scaling determined using the Frobenius norm;
wherein h^l is the feature matrix formed by concatenating all node features of layer l, and x_0 is the initial feature matrix of all nodes;
finally, the inter-layer residual and the initial residual are superimposed to give the output h^l of the neural network layer; the expression is:

h^l = λ y^l + (1-λ) h^(l-1) + β x_0
at this point, encoding of the features represented by the graph structure data at layer l is completed.
7. The network sparsification measurement method according to claim 6, wherein in step S23, a delay predictor based on a multi-layer perceptron is adopted to restore the implicit characteristics generated after the encoding of the graph structure data in step S22 to a complete network quality of service prediction matrix, specifically:
reconstructing the service quality matrix by decoding based on a multi-layer perceptron, restoring the quality of service matrix from its high-dimensional implicit representation, wherein the expression is as follows:
8. The network sparsification measurement method according to claim 1, wherein in step S24, a gradient descent method is used to minimize the error, specifically:
the difference between the predicted value and the true value is described using the mean square error, expressed as:

l = (1/n²) Σ_{i,j} (Â_t(i,j) − A_t(i,j))²
wherein l is the error between the predicted value and the true value, Â_t is the predicted quality of service, and A_t is the true quality of service;
and on the basis of the obtained error, performing repeated training optimization by gradient descent, and dynamically adjusting the learning rate with an Adam optimizer during training, until the error change between two successive optimization steps is smaller than a preset threshold value.
9. The network sparsification measurement method according to claim 1, wherein in step S3, a continuous learning function is adopted to use the time-space network service quality information learned by history for learning and reasoning at future time, specifically:
F′ n+1 =G(F 0 ,F 1 ,...,F n )
wherein G is the continuous learning function, its input is the historical models F_0, F_1, ..., F_n, and its output is the initial model F'_{n+1} at time n+1; at time n+1, F'_{n+1} is first used to initialize the model, and training is then performed on this basis.
10. A graph neural network-based network sparsification measurement system for performing the graph neural network-based network sparsification measurement method of any of claims 1-9, comprising:
the sparse data acquisition module is used for carrying out sparse sampling on the end-to-end network service quality data;
the model building and training module is used for building a network service quality prediction model and training based on the sparsely sampled data;
and the continuous learning module is used for learning and reasoning the time-space network service quality information learned by the history in a mode of adopting a continuous learning function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310304303.1A CN116319426A (en) | 2023-03-27 | 2023-03-27 | Network sparsification measurement method and system based on graph neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116319426A true CN116319426A (en) | 2023-06-23 |
Family
ID=86818555
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310304303.1A Pending CN116319426A (en) | 2023-03-27 | 2023-03-27 | Network sparsification measurement method and system based on graph neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116319426A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117149443A (en) * | 2023-10-30 | 2023-12-01 | 江西师范大学 | Edge computing service deployment method based on neural network |
CN117149443B (en) * | 2023-10-30 | 2024-01-26 | 江西师范大学 | Edge computing service deployment method based on neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||