CN115686846B - Container cluster online deployment method integrating graph neural network and reinforcement learning in edge computing - Google Patents

Info

Publication number: CN115686846B (earlier publication: CN115686846A)
Application number: CN202211347967.8A
Authority: CN (China)
Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Inventors: 陈卓, 朱博文, 周川
Current and original assignee: Chongqing University of Technology
Application filed by Chongqing University of Technology
Other languages: Chinese (zh)

Classifications
    • Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract

The invention provides a container cluster online deployment method fusing a graph neural network and reinforcement learning in edge computing, which comprises the following steps: S1, extracting the topological association relations existing between containers through a graph convolutional network; S2, inferring the deployment strategy with a sequence-to-sequence network aided by the graph convolutional network. With this method, containers can be reasonably deployed in edge computing according to the constructed optimization model.

Description

Container cluster online deployment method integrating graph neural network and reinforcement learning in edge computing
Technical Field
The invention relates to the technical field of edge deployment, and in particular to a container cluster online deployment method integrating a graph neural network and reinforcement learning in edge computing.
Background
With the rapid development of wireless access technology in recent years, various mobile internet and novel Internet of Things applications are continuously emerging. Services increasingly exhibit new characteristics such as shorter response-time requirements, higher service-quality requirements, growing resource demands, and dynamically changing resource-demand scale, and the cloud computing mode of concentrating IT resources in a data center to provide services for users can hardly meet these new requirements. The near-end computing mode represented by edge computing has therefore attracted great attention: by deploying service nodes in a distributed manner at the network edge, closer to users, edge computing lets mobile users access services on nearby edge service nodes, which can remarkably improve service quality and effectively reduce the resource load of the data center. By introducing virtualization technology, an edge service provider can abstract the physical resources of edge nodes into Virtual Network Function units (VNFs), improve the utilization efficiency of IT resources while meeting user service demands, and thereby reduce its operating expenditure (OPEX). Currently, Virtual-Machine-based VNFs (VM-VNFs) are the most widely used form of virtualization. However, VM-VNFs suffer from limitations such as slow start-up and migration and large resource overhead, which make them sluggish in the face of dynamic task demands. With the recent rise of serverless computing, network functions can instead be deployed in the form of containers (CT), forming container-based VNFs (CT-VNFs). CT-VNFs are increasingly adopted by edge service providers thanks to their lighter resource usage, shorter service start-up time, and higher migration efficiency.
Providing services to tasks at the edge often requires deploying multiple container units on edge service nodes and interconnecting them to build a Container Cluster (CC). For example, a real-time data analysis service with information-security requirements may require functional units including a firewall, an IDS, several computing units, and a load balancer. These functional units are all mapped onto the same or different edge service nodes in the form of containers, and a virtual network is built to interconnect them. The complexity of the service itself and the high demands on service efficiency make optimized CC deployment in an edge computing environment a challenging problem, which must simultaneously consider: 1) the multiple characteristics of the service's resource requests; 2) the logical associations among multiple containers; 3) the IT resources remaining on the currently available edge nodes; 4) the energy-consumption expense of container deployment; 5) the quality-of-service degradation that container deployment may cause; and so on.
Disclosure of Invention
The invention aims to at least solve the technical problems existing in the prior art, and in particular creatively provides a container cluster online deployment method fusing a graph neural network and reinforcement learning in edge computing.
In order to achieve the above object, the present invention provides a container cluster online deployment method fusing a graph neural network and reinforcement learning in edge computing, comprising the following steps:
S1, extracting the topological association relations existing between containers through a graph convolutional network;
S2, inferring the deployment strategy with a sequence-to-sequence network aided by the graph convolutional network.
In a preferred embodiment of the invention, the layer-wise propagation of the graph convolutional network in step S1 is:

H^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}\right)

where H^{(l+1)} represents the features of layer l+1;
σ(·) represents an activation function;
\tilde{D} represents the degree matrix of the matrix \tilde{A};
\tilde{D}^{-1/2} represents raising the matrix \tilde{D} to the power (−1/2);
A represents the relationship matrix between the nodes in the graph G;
\tilde{A} represents the adjacency matrix of the undirected graph G with added self-connections;
H^{(l)} represents the features of layer l;
W^{(l)} represents the training parameter matrix of layer l.
In a preferred embodiment of the present invention, the deployment strategy in step S2 is:

\pi(p \mid c, \theta) = P_r\{A_t = p \mid S_t = c, \theta_t = \theta\}

where π(p | c, θ) represents the probability of outputting deployment policy p for a given input c;
θ represents the training parameters of the model;
P_r represents the probability of outputting deployment policy p;
A_t represents the action at time t;
S_t represents the state at time t;
θ_t represents the training parameters at time t.
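As an illustration of such a stochastic deployment policy, the sketch below samples a deployment location from a softmax distribution over candidate physical nodes (the scoring values and node count are invented for the example; the patent's actor network computes these probabilities with a Seq2Seq decoder, not shown here):

```python
import numpy as np

def softmax(z):
    """Turn raw scores into a probability distribution pi(p | c, theta)."""
    e = np.exp(z - z.max())          # shift for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)

# Assumed scores over 4 candidate physical nodes for one container.
scores = np.array([2.0, 0.5, 1.0, -1.0])
pi = softmax(scores)                 # probability of each deployment location
action = rng.choice(len(pi), p=pi)   # A_t: sampled deployment location
```

High-scoring nodes receive proportionally higher deployment probability, matching the policy's preference for high-benefit placements.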
In a preferred embodiment of the present invention, step S2 is followed by step S3, in which the critic network evaluates the return obtained after the actor's action is executed.
In a preferred embodiment of the present invention, step S3 is followed by step S4, in which the actor network updates the optimization model parameters according to the output of the critic module.
In a preferred embodiment of the invention, the optimization model is:

max (total charge − total energy expenditure)   (1.1)

\text{total charge} = \sum_{k \in N} \sum_{i \in I} \sum_{j \in V_i} x_{i,j}^{k} \left[ G_c (1 - \eta_{k,c})\, d_{i,j}^{c} + G_m\, d_{i,j}^{m} + G_s\, d_{i,j}^{s} \right]   (1.2)

where N represents the set of physical nodes;
G_c represents the benefit per unit computing resource;
η_{k,c} represents the utilization of the computing resources on physical node k;
I represents the service request set;
V_i represents the container set of service request i;
x_{i,j}^{k} represents a binary flag bit; x_{i,j}^{k} = 1 when container j of request i is deployed on physical node k;
d_{i,j}^{c} represents the computing-resource demand of container j of request i;
G_m represents the benefit per unit memory resource;
d_{i,j}^{m} represents the memory-resource demand of container j of request i;
G_s represents the benefit per unit storage resource;
d_{i,j}^{s} represents the storage-resource demand of container j of request i.

\text{total energy expenditure} = C \sum_{k \in N} \left[ (E_k^{\max} - E_k^{idle}) \frac{\sum_{i \in I} \sum_{j \in V_i} x_{i,j}^{k} d_{i,j}^{c}}{D_k^{c}} + u_k E_k^{idle} \right]   (1.3)

where N represents the set of physical nodes;
E_k^{\max} represents the maximum energy consumption value of physical node k;
E_k^{idle} represents the idle energy consumption value of physical node k;
I represents the service request set;
V_i represents the container set of service request i;
x_{i,j}^{k} represents a binary flag bit; x_{i,j}^{k} = 1 when container j of request i is deployed on physical node k;
d_{i,j}^{c} represents the computing-resource demand of container j of request i;
D_k^{c} represents the total amount of computing resources of physical node k;
u_k represents a binary flag bit; when u_k = 1, physical node k is in an active state;
C represents the unit energy-expense coefficient.
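The objective (1.1) with charging rule (1.2) and energy rule (1.3) can be sketched numerically as follows (a minimal sketch with invented coefficients and demands; the function name and data layout are assumptions, not the patent's implementation):

```python
import numpy as np

def provider_profit(x, d_cpu, d_mem, d_sto, D_cpu, G_c, G_m, G_s,
                    E_max, E_idle, C):
    """Total charge (1.2) minus total energy expenditure (1.3).

    x[k, j]  -- binary flag: container j deployed on physical node k
    d_*[j]   -- resource demands of container j
    D_cpu[k] -- total computing resources of node k
    """
    eta = (x * d_cpu).sum(axis=1) / D_cpu          # CPU utilization per node
    charge = ((x * d_cpu).sum(axis=1) * G_c * (1 - eta)   # service-effect coeff.
              + (x * d_mem).sum(axis=1) * G_m
              + (x * d_sto).sum(axis=1) * G_s).sum()
    u = (x.sum(axis=1) > 0).astype(float)          # node active flags u_k
    energy = C * ((E_max - E_idle) * eta + u * E_idle).sum()
    return charge - energy

# 2 nodes, 3 containers; containers 0,1 -> node 0, container 2 -> node 1
x = np.array([[1, 1, 0],
              [0, 0, 1]], dtype=float)
profit = provider_profit(
    x,
    d_cpu=np.array([2.0, 1.0, 3.0]),
    d_mem=np.array([1.0, 1.0, 2.0]),
    d_sto=np.array([4.0, 2.0, 2.0]),
    D_cpu=np.array([10.0, 10.0]),
    G_c=2.0, G_m=1.0, G_s=0.5,
    E_max=np.array([100.0, 100.0]),
    E_idle=np.array([20.0, 20.0]),
    C=0.05)
```

With these toy numbers the total charge is 16.4 and the energy expense is 4.4, so the provider's profit is 12.0.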
In a preferred embodiment of the invention, the optimization model may alternatively be: min (total energy expenditure), where min(·) denotes taking the minimum and max(·) denotes taking the maximum.
\text{total energy expenditure} = C \sum_{k \in N} \left[ (E_k^{\max} - E_k^{idle}) \frac{\sum_{i \in I} \sum_{j \in V_i} x_{i,j}^{k} d_{i,j}^{c}}{D_k^{c}} + u_k E_k^{idle} \right]

where N represents the set of physical nodes;
E_k^{\max} represents the maximum energy consumption value of physical node k;
E_k^{idle} represents the idle energy consumption value of physical node k;
I represents the service request set;
V_i represents the container set of service request i;
x_{i,j}^{k} represents a binary flag bit; x_{i,j}^{k} = 1 when container j of request i is deployed on physical node k;
d_{i,j}^{c} represents the computing-resource demand of container j of request i;
D_k^{c} represents the total amount of computing resources of physical node k;
u_k represents a binary flag bit; when u_k = 1, physical node k is in an active state;
C represents the unit energy-expense coefficient.
In a preferred embodiment of the invention, the constraints of the optimization model are:

\eta_{k,c} = \frac{\sum_{i \in I} \sum_{j \in V_i} x_{i,j}^{k} d_{i,j}^{c}}{D_k^{c}}, \quad 0 \le \eta_{k,c} \le 1   (1.4)

where η_{k,c} represents the utilization of the computing resources on physical node k;
I represents the service request set;
N represents the set of physical nodes;
x_{i,j}^{k} represents a binary flag bit; x_{i,j}^{k} = 1 when container j of request i is deployed on physical node k;
d_{i,j}^{c} represents the computing-resource demand of container j of request i;
D_k^{c} represents the total amount of computing resources of physical node k.

\sum_{k \in N} x_{i,j}^{k} = 1, \quad \forall i \in I, \; \forall j \in V_i   (1.5)

where N represents the set of physical nodes;
x_{i,j}^{k} represents a binary flag bit; x_{i,j}^{k} = 1 when container j of request i is deployed on physical node k;
I represents the service request set;
V_i represents the container set of service request i.

\sum_{i \in I} \sum_{m, n \in V_i} b_{m,n}^{i}\, x_{i,m}^{k_u}\, x_{i,n}^{k_v} \le B_{k_u, k_v}   (1.6)

where I represents the service request set;
V_i represents the container set of service request i;
b_{m,n}^{i} represents the bandwidth requirement between container m and container n of request i;
x_{i,m}^{k_u} represents a binary flag bit; x_{i,m}^{k_u} = 1 when container m of request i is deployed on physical node k_u;
x_{i,n}^{k_v} represents a binary flag bit; x_{i,n}^{k_v} = 1 when container n of request i is deployed on physical node k_v;
B_{k_u,k_v} represents the total bandwidth resources between physical nodes k_u and k_v.

\sum_{i \in I} \sum_{j \in V_i} x_{i,j}^{k} d_{i,j}^{c} \le D_k^{c}, \quad \forall k \in N   (1.7)
\sum_{i \in I} \sum_{j \in V_i} x_{i,j}^{k} d_{i,j}^{m} \le D_k^{m}, \quad \forall k \in N   (1.8)
\sum_{i \in I} \sum_{j \in V_i} x_{i,j}^{k} d_{i,j}^{s} \le D_k^{s}, \quad \forall k \in N   (1.9)

where I represents the service request set;
N represents the set of physical nodes;
x_{i,j}^{k} represents a binary flag bit; x_{i,j}^{k} = 1 when container j of request i is deployed on physical node k;
d_{i,j}^{c} represents the computing-resource demand of container j of request i;
D_k^{c} represents the total amount of computing resources of physical node k;
d_{i,j}^{m} represents the memory-resource demand of container j of request i;
D_k^{m} represents the total amount of memory resources of physical node k;
d_{i,j}^{s} represents the storage-resource demand of container j of request i;
D_k^{s} represents the total amount of storage resources of physical node k.
In a preferred embodiment of the invention, the model is updated as:

\theta_{k+1} = \theta_k + \alpha \nabla_{\theta} L(\theta_k)

where θ_{k+1} represents the model parameters at the next step;
θ_k represents the model parameters at the current step;
α represents the learning rate;
\nabla_{\theta} L(\theta_k) represents the Lagrangian gradient approximated using Monte Carlo sampling.
In a preferred embodiment of the present invention, the model update further comprises:

L(b) = \frac{1}{M} \sum_{i=1}^{M} \left( Q(c, p_i) - b(c, p_i) \right)^2

where L(b) represents the mean squared error between the evaluation value b(c, p) given by the baseline critic and the reward value Q(c, p);
M represents the number of samples;
Q(c, p_i) represents the reward obtained for decision p_i made by the algorithm given the input container cluster c;
b(c, p_i) represents the evaluation value given by the baseline critic b for the given input container cluster c and decision p_i.
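The actor update rule and the critic's mean-squared-error objective above can be sketched together (the learning rate, sample rewards, and gradient vector are invented for the example; in the actual method the gradient comes from Monte Carlo policy-gradient estimation, which is not reproduced here):

```python
import numpy as np

def critic_loss(Q, b):
    """Mean squared error between observed rewards Q(c, p_i)
    and the critic's baseline estimates b(c, p_i), over M samples."""
    Q, b = np.asarray(Q), np.asarray(b)
    return float(((Q - b) ** 2).mean())

def actor_step(theta, grad, alpha=0.01):
    """Gradient ascent: theta_{k+1} = theta_k + alpha * grad."""
    return theta + alpha * grad

loss = critic_loss([1.0, 2.0, 3.0], [1.5, 2.0, 2.5])   # M = 3 samples
theta_next = actor_step(np.array([0.1, -0.2]), np.array([1.0, 1.0]))
```

The actor moves its parameters toward higher return, while the critic is fitted to minimize this loss so its baseline tracks the observed rewards.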
In summary, by adopting the above technical scheme, containers can be reasonably deployed in edge computing according to the constructed optimization model.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a schematic diagram of a container cluster deployment in an edge network environment of the present invention.
FIG. 2 is a schematic diagram of a reinforcement learning model decision-reward cycle of the present invention.
FIG. 3 is a schematic diagram of the model training process of the present invention.
Fig. 4 is a detailed schematic diagram of the actor network model of the present invention.
FIG. 5 is a schematic representation of the training history of the present invention in three experimental scenarios;
where (a) is the training history (small-scale scenario), (b) is the training history (medium-scale scenario), (c) is the training history (large-scale scenario), (d) is the training loss (small-scale scenario), (e) is the training loss (medium-scale scenario), (f) is the training loss (large-scale scenario).
FIG. 6 is a comparative schematic of the solution time of the present invention.
FIG. 7 is a comparative schematic diagram of the present invention in terms of deployment error rate.
FIG. 8 is a graph showing a comparison of cumulative benefits over a period of time in accordance with the present invention;
where (a) is a cumulative revenue comparison (small scale scenario), (b) is a cumulative revenue comparison (medium scale scenario), (c) is a cumulative revenue comparison (large scale scenario).
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
The invention mainly comprises: modeling of the container cluster deployment problem in an edge computing network environment, and a solving framework for the edge-computing container-cluster deployment strategy based on Actor-Critic reinforcement learning. A graph convolutional network is introduced to extract the features of the network topology among the containers of a container cluster, and the result is used as input to the attention mechanism of a Seq2Seq network to improve the quality of the output solution; the encoder part of the Seq2Seq network embeds and encodes the container cluster, and the decoder part outputs the corresponding container deployment positions. An Actor-Critic reinforcement learning framework is adopted to train the network: no label mapping is needed, and the actor network and the critic network train and learn from each other for autonomous improvement. The solution given by the trained network significantly improves system revenue.
In the same period of time, an edge computing platform may receive different numbers of service requests; each request may require different functions, different functions require containers of different types and numbers, and even the same number of containers may have uncertain communication requirements. The most intuitive impact of service-request size and kind is the change of virtual nodes and links, i.e., a change in topology configuration. Workload fluctuations typically change the resource demand of a virtual node or link, i.e., a change in resource configuration. Two different container clusters mapped onto the underlying physical network are shown in FIG. 1.
1. Reinforcement learning solving framework combined with graph convolution network
In the invention, the model is trained with an Actor-Critic reinforcement learning framework. The whole model involves two neural networks: the actor network and the critic network. Their workflow is shown in FIG. 2: for a given container cluster input to the decision system, the agent (actor network) gives an appropriate decision A_t according to the current network state S_t; in our problem this is the deployment policy Placement, which indicates the deployment location of each container in the container cluster. The environment then evaluates the deployment policy and generates corresponding feedback (reward) R_{t+1} indicating the quality of the deployment policy; at the same time, the environment is updated to the new post-deployment state S_{t+1}. The critic network evaluates the return (i.e., the Lagrangian value) obtained after the actor's action is executed, and the evaluation result is the baseline; the actor network updates the model parameters based on the output of the critic module (the actor network updates its parameters in the direction of higher return). The training process of the model is shown in detail in FIG. 3.
In the invention, on the basis of neural combinatorial optimization theory, the topological link relations existing in the container cluster are extracted by a graph convolutional network (GCN), so that the agent can perceive the topological structure of the container cluster in advance and give a deployment strategy more accurately. Specifically, we use a graph convolutional network and a sequence-to-sequence model based on an encoder-decoder structure to infer deployment policies. For the container clusters of the same training batch, we use the following method: the feature-information groups of several container clusters and a block-diagonal matrix are input into the graph convolutional network for training. To explain the operation of the model more clearly, assume that a set of container clusters [Q, V, W] needs to be mapped into the underlying physical network. Each container cluster corresponding to a service request has a variable number of containers m, for example Q = (f_1, f_2, ..., f_m). The container clusters [Q, V, W] serve as input to the GCN network, the containers Q = (f_1, f_2, ..., f_m) of a cluster serve as input to the encoder, and the decoder part outputs a deployment policy P = (p_1, p_2, ..., p_m) indicating the deployment location of each container. The actor network model of the proposed method is shown in FIG. 4.
One part of the task request is input to the GCN network for topology feature extraction, and the other part is input to the encoder part of the Seq2Seq network to control the order of container deployment. The output of the GCN network and the output of the encoder are combined by matrix dot-multiplication and fed to the decoder part of the Seq2Seq network, and the decoder finally gives the deployment strategy of the containers.
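The block-diagonal batching of several container-cluster adjacency matrices mentioned above can be sketched as follows (the helper and the toy clusters are illustrative assumptions; cross-cluster entries stay zero, so one GCN pass handles the whole batch without mixing clusters):

```python
import numpy as np

def block_diag(*mats):
    """Stack adjacency matrices of several container clusters into one
    block-diagonal matrix for batched GCN training."""
    n = sum(m.shape[0] for m in mats)
    out = np.zeros((n, n))
    pos = 0
    for m in mats:
        k = m.shape[0]
        out[pos:pos + k, pos:pos + k] = m   # place each cluster on the diagonal
        pos += k
    return out

# Two clusters: a 2-container pair and a 3-container chain
A_q = np.array([[0, 1], [1, 0]], dtype=float)
A_v = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
A_batch = block_diag(A_q, A_v)   # 5x5; no edges across the two clusters
```

The corresponding feature matrices are simply stacked row-wise to match the block order.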
The invention builds an optimization model from the perspective of an edge computing service provider, hoping to reduce the total energy expenditure while satisfying user service requests as far as possible, so as to maximize the service provider's benefit.
max (total charge − total energy expenditure)   (1.1)
The objective function is divided into two parts. Equation (1.2) charges the edge computing service provider according to the corresponding rules on leased resources, i.e., for the physical resources occupied by container j ∈ V_i contained in service request i ∈ I: the computing resources d_{i,j}^{c}, memory resources d_{i,j}^{m}, and storage resources d_{i,j}^{s}, multiplied by the corresponding charging coefficients G_c, G_m, and G_s respectively. Notably, we creatively add a service-effect coefficient (1 − η_{k,c}) to the charging rule for computing resources, to capture the reduction of service capability caused by the increased competition among containers for physical resources.
\text{total charge} = \sum_{k \in N} \sum_{i \in I} \sum_{j \in V_i} x_{i,j}^{k} \left[ G_c (1 - \eta_{k,c})\, d_{i,j}^{c} + G_m\, d_{i,j}^{m} + G_s\, d_{i,j}^{s} \right]   (1.2)

where N represents the set of physical nodes;
G_c represents the benefit per unit computing resource;
η_{k,c} represents the utilization of the computing resources on physical node k;
I represents the service request set;
V_i represents the container set of service request i;
x_{i,j}^{k} represents a binary flag bit; x_{i,j}^{k} = 1 when container j of request i is deployed on physical node k;
d_{i,j}^{c} represents the computing-resource demand of container j of request i;
G_m represents the benefit per unit memory resource;
d_{i,j}^{m} represents the memory-resource demand of container j of request i;
G_s represents the benefit per unit storage resource;
d_{i,j}^{s} represents the storage-resource demand of container j of request i.
In equation (1.3) we define the energy expenditure generated by the underlying physical network. Considering that energy expense accounts for a large part of the service provider's daily operating expenditure, our optimization model considers only the energy expense as the operator's expenditure. E_k^{max} is the maximum energy consumption value of physical node k and E_k^{idle} is its idle energy consumption value; we use the product of (E_k^{max} − E_k^{idle}) and the computing-resource occupancy ratio (Σ_{i∈I} Σ_{j∈V_i} x_{i,j}^{k} d_{i,j}^{c}) / D_k^{c} to represent the load-dependent energy consumption of physical node k. Energy is also consumed when a physical node is idle, so the idle energy consumption value E_k^{idle} of each active node k is added, and finally the sum of the two is multiplied by the unit energy-expense coefficient to represent the total energy expenditure of the service provider.
\text{total energy expenditure} = C \sum_{k \in N} \left[ (E_k^{\max} - E_k^{idle}) \frac{\sum_{i \in I} \sum_{j \in V_i} x_{i,j}^{k} d_{i,j}^{c}}{D_k^{c}} + u_k E_k^{idle} \right]   (1.3)

where N represents the set of physical nodes;
E_k^{\max} represents the maximum energy consumption value of physical node k;
E_k^{idle} represents the idle energy consumption value of physical node k;
I represents the service request set;
V_i represents the container set of service request i;
x_{i,j}^{k} represents a binary flag bit; x_{i,j}^{k} = 1 when container j of request i is deployed on physical node k;
d_{i,j}^{c} represents the computing-resource demand of container j of request i;
D_k^{c} represents the total amount of computing resources of physical node k;
u_k represents a binary flag bit; when u_k = 1, physical node k is in an active state;
C represents the unit energy-expense coefficient.
The optimization model is subject to several constraints. Constraint (1.4) defines the utilization η_{k,c} of the computing resources on physical node k and limits its value range to [0, 1].

\eta_{k,c} = \frac{\sum_{i \in I} \sum_{j \in V_i} x_{i,j}^{k} d_{i,j}^{c}}{D_k^{c}}, \quad 0 \le \eta_{k,c} \le 1   (1.4)

where η_{k,c} represents the utilization of the computing resources on physical node k;
I represents the service request set;
N represents the set of physical nodes;
x_{i,j}^{k} represents a binary flag bit; x_{i,j}^{k} = 1 when container j of request i is deployed on physical node k;
d_{i,j}^{c} represents the computing-resource demand of container j of request i;
D_k^{c} represents the total amount of computing resources of physical node k.
Constraint (1.5) specifies that the j-th container of the i-th service request can only be deployed on one physical node and cannot be redeployed.

\sum_{k \in N} x_{i,j}^{k} = 1, \quad \forall i \in I, \; \forall j \in V_i   (1.5)

where N represents the set of physical nodes;
x_{i,j}^{k} represents a binary flag bit; x_{i,j}^{k} = 1 when container j of request i is deployed on physical node k;
I represents the service request set;
V_i represents the container set of service request i.
Constraint (1.6) specifies that the bandwidth resources occupied by the communication between containers m and n of a service request i, located on physical nodes k_u and k_v respectively, do not exceed the total bandwidth resources between physical nodes k_u and k_v.

\sum_{i \in I} \sum_{m, n \in V_i} b_{m,n}^{i}\, x_{i,m}^{k_u}\, x_{i,n}^{k_v} \le B_{k_u, k_v}   (1.6)

where I represents the service request set;
V_i represents the container set of service request i;
b_{m,n}^{i} represents the bandwidth requirement between container m and container n of request i;
x_{i,m}^{k_u} represents a binary flag bit; x_{i,m}^{k_u} = 1 when container m of request i is deployed on physical node k_u;
x_{i,n}^{k_v} represents a binary flag bit; x_{i,n}^{k_v} = 1 when container n of request i is deployed on physical node k_v;
B_{k_u,k_v} represents the total bandwidth resources between physical nodes k_u and k_v.
Constraints (1.7), (1.8) and (1.9) specify that the sum of the resources of all containers deployed on a physical node does not exceed that node's total computing, memory, and storage resources, respectively.

\sum_{i \in I} \sum_{j \in V_i} x_{i,j}^{k} d_{i,j}^{c} \le D_k^{c}, \quad \forall k \in N   (1.7)
\sum_{i \in I} \sum_{j \in V_i} x_{i,j}^{k} d_{i,j}^{m} \le D_k^{m}, \quad \forall k \in N   (1.8)
\sum_{i \in I} \sum_{j \in V_i} x_{i,j}^{k} d_{i,j}^{s} \le D_k^{s}, \quad \forall k \in N   (1.9)

where I represents the service request set;
N represents the set of physical nodes;
x_{i,j}^{k} represents a binary flag bit; x_{i,j}^{k} = 1 when container j of request i is deployed on physical node k;
d_{i,j}^{c} represents the computing-resource demand of container j of request i;
D_k^{c} represents the total amount of computing resources of physical node k;
d_{i,j}^{m} represents the memory-resource demand of container j of request i;
D_k^{m} represents the total amount of memory resources of physical node k;
d_{i,j}^{s} represents the storage-resource demand of container j of request i;
D_k^{s} represents the total amount of storage resources of physical node k.
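A feasibility check covering constraints (1.5) and (1.7)–(1.9) might be sketched as follows (the data layout and function name are assumptions; the bandwidth constraint (1.6) is omitted because it also needs the inter-node topology):

```python
import numpy as np

def deployment_feasible(x, d_cpu, d_mem, d_sto, D_cpu, D_mem, D_sto):
    """True iff deployment x satisfies constraints (1.5) and (1.7)-(1.9).

    x[k, j] -- binary flag: container j deployed on physical node k.
    """
    each_once = (x.sum(axis=0) == 1).all()             # (1.5) one node per container
    cpu_ok = ((x * d_cpu).sum(axis=1) <= D_cpu).all()  # (1.7) computing capacity
    mem_ok = ((x * d_mem).sum(axis=1) <= D_mem).all()  # (1.8) memory capacity
    sto_ok = ((x * d_sto).sum(axis=1) <= D_sto).all()  # (1.9) storage capacity
    return bool(each_once and cpu_ok and mem_ok and sto_ok)

x_good = np.array([[1, 0], [0, 1]], dtype=float)
x_bad = np.array([[1, 1], [0, 1]], dtype=float)   # container 1 deployed twice
d = np.array([2.0, 3.0])
cap = np.array([4.0, 4.0])
ok = deployment_feasible(x_good, d, d, d, cap, cap, cap)
bad = deployment_feasible(x_bad, d, d, d, cap, cap, cap)
```

Such a check is the kind of guard an environment can use to reject infeasible actor decisions during training.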
2. topological relation description based on graph convolution network
The invention adopts the graph convolutional network to extract the topological relations of the input container cluster, and uses the extracted features to help the agent give a more accurate deployment strategy without violating the constraints, thereby reducing the container deployment cost and improving the overall benefit of the edge computing service provider.
Let the graph of a container cluster be denoted by G = (N, E), where N represents the vertices in the graph, i.e., the containers in the container cluster, and E represents the edges in the graph, i.e., the links resulting from communication between containers in the container cluster. The features of the vertices in G form an N×D matrix X, where D represents the number of features. The relationship between containers is represented by an N×N matrix A, i.e., the adjacency matrix of G. The layer-wise propagation of the graph convolutional network is shown in equation (10).
H^{(l+1)} = σ( D̃^{-1/2} Ã D̃^{-1/2} H^{(l)} W^{(l)} )    (10)

wherein H^{(l+1)} represents the features of layer l+1; σ(·) represents an activation function, such as ReLU or Sigmoid (our model uses ReLU); Ã = A + I_N is the adjacency matrix of the undirected graph G with added self-connections, where A is the adjacency matrix of G (the relationship matrix between the nodes in G) and I_N is the identity matrix of order N; Ã_ij denotes the element in row i, column j of Ã; D̃ is the degree matrix of Ã, with D̃_ii = Σ_j Ã_ij, and D̃^{-1/2} denotes raising the matrix D̃ to the power −1/2; H^{(l)} represents the features of layer l, with H^{(0)} = X for the input layer, where X is the feature matrix formed by the node features of graph G; and W^{(l)} represents the training parameter matrix of layer l.
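As a minimal sketch, one propagation step of equation (10) can be written with NumPy; the 3-container toy graph, feature sizes, and random weights below are illustrative assumptions, not values from the text:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One propagation step of equation (10):
    H_next = sigma(D~^(-1/2) A~ D~^(-1/2) H W), with ReLU as sigma."""
    N = A.shape[0]
    A_tilde = A + np.eye(N)              # add self-connections: A~ = A + I_N
    d = A_tilde.sum(axis=1)              # degree of each vertex in A~
    D_inv_sqrt = np.diag(d ** -0.5)      # D~^(-1/2)
    H_next = D_inv_sqrt @ A_tilde @ D_inv_sqrt @ H @ W
    return np.maximum(H_next, 0.0)       # ReLU activation

# toy container-cluster graph: 3 containers in a chain, 2 features, 4 hidden units
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = np.random.rand(3, 2)                 # N x D feature matrix
W = np.random.rand(2, 4)                 # training parameter matrix of the layer
H1 = gcn_layer(A, X, W)
print(H1.shape)                          # (3, 4)
```

Adding self-connections before normalization keeps each container's own features in the aggregation, exactly as Ã = A + I_N does in equation (10).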
3. Constraint optimization based on strategy gradients
Let C denote the set of container clusters and let c (c ∈ C) denote one container cluster; the policy function for c is expressed as:
π(p|c,θ) = P_r{A_t = p | S_t = c, θ_t = θ}

where π(p|c,θ) represents the probability of outputting deployment policy p for a given input c; θ represents the training parameters of the model; P_r represents the probability of outputting deployment policy p; A_t represents the action at time t; S_t represents the state at time t; θ_t represents the training parameters at time t.
The policy function gives, at time t with input c and parameters θ, the probability P_r of outputting deployment policy p. The policy assigns a higher probability to a high-benefit deployment policy p and a lower probability to a low-benefit one. The interaction of the input container clusters with the output policies over period T generates a trajectory τ = (c_1, p_1, ..., c_T, p_T) of a Markov decision process, whose probability can be expressed as:
P_θ(c_1, p_1, ..., c_T, p_T) = p(c_1) ∏_{t=1}^{T} π_θ(p_t | c_t) p(c_{t+1} | c_t, p_t)

wherein P_θ(c_1, p_1, ..., c_T, p_T) represents the probability that trajectory τ = (c_1, p_1, ..., c_T, p_T) occurs under parameters θ; p(c_1) represents the probability that state c_1 occurs (i.e., that the input at time t = 1 is c_1); T represents the period; π_θ(p_t | c_t) represents the probability that, at time t with current state c_t (the input container cluster), the agent with parameters θ takes action p_t (the output deployment policy); p(c_{t+1} | c_t, p_t) represents the probability that the system state (the input container cluster) at time t+1 is c_{t+1}, given that the state at time t is c_t and the action (the output deployment policy) is p_t; c_1 represents the system state at time t = 1; p_1 represents the deployment policy at time t = 1; c_t represents the input at time t; and p_t represents the deployment policy output at time t.
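A minimal sketch of the trajectory-probability product above; the initial-state, policy, and transition probabilities are hypothetical example values:

```python
def trajectory_probability(p_c1, policy_probs, transition_probs):
    """P_theta(tau) = p(c_1) * prod_t pi_theta(p_t | c_t) * p(c_{t+1} | c_t, p_t).

    policy_probs[t]     = pi_theta(p_t | c_t)   for t = 1..T
    transition_probs[t] = p(c_{t+1} | c_t, p_t) for t = 1..T
    """
    prob = p_c1
    for pi_t, trans_t in zip(policy_probs, transition_probs):
        prob *= pi_t * trans_t
    return prob

# toy two-step trajectory with assumed probabilities
P = trajectory_probability(0.5, [0.8, 0.6], [0.9, 1.0])
print(P)   # 0.5 * (0.8 * 0.9) * (0.6 * 1.0) = 0.216
```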
In the above policy function, the probability of deployment policy p_t for the current input container cluster c_t depends on the deployment positions p_(<t) of the previous container clusters and on the system state. For simplicity, we assume that the system state is fully defined by the container cluster C, so the policy function outputs only a probability distribution over the deployment locations of the container cluster. The goal of the policy gradient method is to find the optimal set of parameters θ* that yields the optimal deployment locations of the container clusters. To this end, we need to define an objective function to describe the quality of a deployment policy:
J_R(θ|c) = E_{p~π_θ(·|c)}[ R(p) ]

wherein J_R(θ|c) represents the policy quality corresponding to input c; E denotes expectation; R(p) represents the service benefit corresponding to deployment policy p; and p ~ π_θ(·|c) denotes deployment policies p drawn for the given input c.
In the above formula, we use the expected service benefit R(p) of a deployment policy for a given container cluster c as the objective function describing the quality of the deployment policy. Because the agent infers deployment policies over all container clusters, the benefit expectation can then be defined as an expectation over the container probability distribution:
J_R(θ) = E_{c~C}[ J_R(θ|c) ]

wherein J_R(θ) represents the policy quality, i.e., the expected value of the benefit; E denotes expectation; J_R(θ|c) represents the policy quality corresponding to input c; and c ~ C denotes drawing container clusters c from C.
Similarly, the expected penalty incurred by violating the constraints can be expressed as:
J_C(θ) = E_{c~C}[ J_C(θ|c) ]

wherein J_C(θ) represents the expected value of the penalty; E denotes expectation; J_C(θ|c) represents the penalty value corresponding to input c; and c ~ C denotes drawing container clusters c from C.
here we define four constraint signals, respectively: a computing resource cpu, a memory resource mem, a storage resource sto, and a bandwidth resource bw. The final optimization objective can be converted to an unconstrained problem by lagrangian relaxation techniques:
J_L(λ,θ) = J_R(θ) + Σ_i λ_i J_{C_i}(θ) = J_R(θ) + J_ξ(θ)

wherein J_L(λ,θ) represents the Lagrangian value, formed by adding to the expected benefit J_R(θ) the weighted sum of the expected penalty values J_{C_i}(θ) corresponding to the various resources; λ represents the weights of the four constraint signals, with λ_i the weight of the i-th constraint signal; J_R(θ) represents the policy quality, i.e., the expected value of the benefit; and J_ξ(θ) = Σ_i λ_i J_{C_i}(θ) is the weighted sum of the expected penalty values of the four constraint signals. Next, we calculate the gradient of J_L(λ,θ) using the log-likelihood method.
∇_θ J_L(λ,θ) = E_{p~π_θ(·|c)}[ Q(c,p) ∇_θ log π_θ(p|c) ]

wherein ∇_θ denotes the gradient operation with respect to θ; J_L(λ,θ) represents the Lagrangian value defined above; E denotes expectation; π_θ(p|c) represents the policy function for input c; Q(c,p) represents the reward earned under the decision p made by the algorithm for the given input container cluster c; and p ~ π_θ(·|c) denotes deployment policies p drawn for the given input c.
In the above equation, Q(c,p) describes the reward available for a given input c and the decision p made by the algorithm. It is calculated by adding the weighted sum ξ(p) of all constraint-violation values C_i(p) to the benefit value R(p), as shown in (18):
Q(c,p) = R(p) + ξ(p) = R(p) + Σ_i λ_i C_i(p)    (18)

wherein Q(c,p) represents the reward earned under decision p made by the algorithm for the given input container cluster c; R(p) represents the reward available to the system under decision p; ξ(p) represents the weighted sum of the penalty values of all constraint signals under decision p; λ_i represents the weight of the i-th constraint signal; and C_i(p) represents the penalty value generated by the i-th constraint signal under decision p.
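A minimal sketch of equation (18); the weights λ_i for the four constraint signals and the penalty values are hypothetical example numbers:

```python
# hypothetical weights for the four constraint signals
LAMBDA = {"cpu": 1.0, "mem": 1.0, "sto": 0.5, "bw": 0.5}

def q_value(benefit, penalties):
    """Q(c, p) = R(p) + xi(p), where xi(p) = sum_i lambda_i * C_i(p)."""
    xi = sum(LAMBDA[sig] * penalties[sig] for sig in LAMBDA)
    return benefit + xi

# deployment p with benefit R(p) = 10 and one violated cpu constraint
Q = q_value(10.0, {"cpu": -2.0, "mem": 0.0, "sto": 0.0, "bw": 0.0})
print(Q)   # 10.0 + 1.0 * (-2.0) = 8.0
```

Encoding violations as negative penalty terms lets the same Q(c,p) both reward high-benefit deployments and discount infeasible ones.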
Then we approximate the Lagrangian gradient using Monte Carlo sampling:

∇_θ J_L(λ,θ) ≈ (1/m) Σ_{i=1}^{m} Q(c, p_i) ∇_θ log π_θ(p_i | c)

where m is the number of samples. To reduce the variance of the gradient and accelerate the model's convergence, we use the critic network, composed of a simple RNN, as the reference evaluator b. The Lagrangian gradient can then be expressed as:

∇_θ J_L(λ,θ) ≈ (1/m) Σ_{i=1}^{m} ( Q(c, p_i) − b(c, p_i) ) ∇_θ log π_θ(p_i | c)

wherein ∇_θ J_L(λ,θ) represents the Lagrangian gradient; m represents the number of samples; Q(c, p_i) represents the reward earned under decision p_i made by the algorithm for the given input container cluster c; b(c, p_i) represents the evaluation value given by the reference evaluator b for the given input container cluster c and decision p_i; and ∇_θ log π_θ(p_i | c) represents the gradient of the logarithm of the policy function.
Finally, the parameters θ of the network model are updated by the stochastic gradient descent method:

θ_{k+1} = θ_k − α ∇_θ J_L(λ,θ)

wherein θ_{k+1} represents the model parameters at the next step; θ_k represents the model parameters at the current step; α represents the learning rate; and ∇_θ J_L(λ,θ) represents the Lagrangian gradient approximated using Monte Carlo sampling.
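A minimal sketch of the sampled policy-gradient update above, reduced to a toy softmax policy over four physical nodes; the per-node rewards are assumed values and a running mean stands in for the RNN reference evaluator. It is written as gradient ascent on the expected reward:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

K = 4                                  # hypothetical number of physical nodes
theta = np.zeros(K)                    # policy parameters
alpha = 0.1                            # learning rate
Q = np.array([1.0, 3.0, 0.5, 2.0])     # assumed reward Q(c, p) per node choice
baseline = 0.0                         # running-mean stand-in for b(c, p)

for step in range(200):
    probs = softmax(theta)
    m = 8                              # Monte Carlo samples
    grad = np.zeros(K)
    for _ in range(m):
        p_i = rng.choice(K, p=probs)
        # gradient of log softmax policy: one-hot(p_i) - probs
        grad += (Q[p_i] - baseline) * (np.eye(K)[p_i] - probs)
    grad /= m
    theta += alpha * grad              # ascend the expected reward
    baseline = 0.9 * baseline + 0.1 * float(probs @ Q)

print(int(np.argmax(theta)))           # node 1, the highest-reward choice
```

Subtracting the baseline b(c, p_i) does not change the expected gradient but shrinks its variance, which is exactly why the reference evaluator is introduced.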
The reference evaluator gives the evaluation value b(c,p) of the current container cluster's return; the parameters σ of the reference evaluator are then updated by the stochastic gradient descent method, based on the mean square error between b(c,p) and the reward value Q(c,p):
L(σ) = (1/m) Σ_{i=1}^{m} ( Q(c, p_i) − b(c, p_i) )²

wherein L(σ) represents the mean square error between the evaluation value b(c,p) given by the reference evaluator and the reward value Q(c,p); m represents the number of samples; Q(c, p_i) represents the reward earned under decision p_i made by the algorithm for the given input container cluster c; and b(c, p_i) represents the evaluation value given by the reference evaluator b for the given input container cluster c and decision p_i.
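A minimal sketch of the mean-square-error objective used to update the reference evaluator; the sampled rewards and evaluation values are hypothetical example numbers:

```python
import numpy as np

def critic_loss(Q_vals, b_vals):
    """Mean square error between sampled rewards Q(c, p_i) and the
    reference evaluator's predictions b(c, p_i)."""
    Q_vals = np.asarray(Q_vals, dtype=float)
    b_vals = np.asarray(b_vals, dtype=float)
    return float(np.mean((Q_vals - b_vals) ** 2))

loss = critic_loss([3.0, 1.0, 2.0], [2.0, 2.0, 2.0])
print(loss)   # ((1)^2 + (-1)^2 + 0^2) / 3 = 2/3
```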
The training process of the container cluster deployment algorithm based on graph convolutional network and neural combinatorial optimization can be described as in Table 1:
TABLE 1 Description of the training process of the container cluster deployment algorithm based on graph convolutional network and neural combinatorial optimization
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.

Claims (7)

1. A container cluster online deployment method fusing a graph neural network and reinforcement learning in edge computing, characterized by comprising the following steps:
S1, extracting the topological association relations existing between containers through a graph convolutional network; the actor network updating the parameters of the optimization model according to the output of the critic module; wherein the optimization model is:
max (total charge − total energy expenditure)    (1.1)
total charge = Σ_{i∈I} Σ_{j∈V_i} Σ_{k∈N} x_{i,j}^k ( G_c c_i^j + G_m m_i^j + G_s s_i^j )

wherein N represents the set of physical nodes; G_c represents the benefit per unit computing resource; η_{k,c} represents the utilization of computing resources on physical node k; I represents the set of service requests; V_i represents the container set of service request i; x_{i,j}^k represents a binary flag bit, with x_{i,j}^k = 1 when container j of request i is deployed on physical node k; c_i^j represents the computing resource demand of container j of request i; G_m represents the benefit per unit memory resource; m_i^j represents the memory resource demand of container j of request i; G_s represents the benefit per unit storage resource; and s_i^j represents the storage resource demand of container j of request i;
total energy expenditure = c Σ_{k∈N} u_k ( E_k^idle + ( E_k^max − E_k^idle ) Σ_{i∈I} Σ_{j∈V_i} x_{i,j}^k c_i^j / C_k )

wherein N represents the set of physical nodes; E_k^max represents the maximum energy consumption value of physical node k; E_k^idle represents the idle energy consumption value of physical node k; I represents the set of service requests; V_i represents the container set of service request i; x_{i,j}^k represents a binary flag bit, with x_{i,j}^k = 1 when container j of request i is deployed on physical node k; c_i^j represents the computing resource demand of container j of request i; C_k represents the total computing resources of physical node k; u_k represents a binary flag bit, with u_k = 1 when physical node k is in the active state; and c represents the unit energy consumption expenditure coefficient;
alternatively, min (total energy expenditure)
total energy expenditure = c Σ_{k∈N} u_k ( E_k^idle + ( E_k^max − E_k^idle ) Σ_{i∈I} Σ_{j∈V_i} x_{i,j}^k c_i^j / C_k )

wherein N represents the set of physical nodes; E_k^max represents the maximum energy consumption value of physical node k; E_k^idle represents the idle energy consumption value of physical node k; I represents the set of service requests; V_i represents the container set of service request i; x_{i,j}^k represents a binary flag bit, with x_{i,j}^k = 1 when container j of request i is deployed on physical node k; c_i^j represents the computing resource demand of container j of request i; C_k represents the total computing resources of physical node k; u_k represents a binary flag bit, with u_k = 1 when physical node k is in the active state; and c represents the unit energy consumption expenditure coefficient;
S2, inferring the deployment strategy through a sequence-to-sequence network with the aid of the graph convolutional network.
2. The method for online deployment of container clusters fusing graph neural networks and reinforcement learning in edge computing according to claim 1, wherein the hierarchical propagation of the graph convolutional network in step S1 is:
H^{(l+1)} = σ( D̃^{-1/2} Ã D̃^{-1/2} H^{(l)} W^{(l)} )

wherein H^{(l+1)} represents the features of layer l+1; σ(·) represents an activation function; D̃ represents the degree matrix of Ã, and D̃^{-1/2} denotes raising the matrix D̃ to the power −1/2; A represents the relationship matrix between the nodes in graph G; Ã represents the adjacency matrix of the undirected graph G with added self-connections; H^{(l)} represents the features of layer l; and W^{(l)} represents the training parameter matrix of layer l.
3. The method for on-line deployment of container clusters in edge computing with fusion of graph neural networks and reinforcement learning according to claim 1, wherein the deployment strategy in step S2 is:
π(p|c,θ) = P_r{A_t = p | S_t = c, θ_t = θ}

where π(p|c,θ) represents the probability of outputting deployment policy p for a given input c; θ represents the training parameters of the model; P_r represents the probability of outputting deployment policy p; A_t represents the action at time t; S_t represents the state at time t; and θ_t represents the training parameters at time t.
4. The method for online deployment of container clusters in edge computing incorporating graph neural networks and reinforcement learning according to claim 1, further comprising step S3 after step S1, wherein the critic network evaluates the return obtained after performing the actor's action.
5. The method for on-line deployment of container clusters fusing graph neural network and reinforcement learning in edge computing according to claim 1, wherein constraint conditions of an optimization model are as follows:
η_{k,c} = ( Σ_{i∈I} Σ_{j∈V_i} x_{i,j}^k c_i^j ) / C_k ≤ 1, ∀k ∈ N

wherein η_{k,c} represents the utilization of computing resources on physical node k; I represents the set of service requests; N represents the set of physical nodes; x_{i,j}^k represents a binary flag bit, with x_{i,j}^k = 1 when container j of request i is deployed on physical node k; c_i^j represents the computing resource demand of container j of request i; and C_k represents the total computing resources of physical node k;
Σ_{k∈N} x_{i,j}^k = 1, ∀i ∈ I, ∀j ∈ V_i

wherein N represents the set of physical nodes; x_{i,j}^k represents a binary flag bit, with x_{i,j}^k = 1 when container j of request i is deployed on physical node k; I represents the set of service requests; and V_i represents the container set of service request i;
Σ_{i∈I} Σ_{m,n∈V_i} b_i^{m,n} x_{i,m}^{k_u} x_{i,n}^{k_v} ≤ B_{k_u·k_v}

wherein I represents the set of service requests; V_i represents the container set of service request i; b_i^{m,n} represents the bandwidth demand between container m and container n of request i; x_{i,m}^{k_u} represents a binary flag bit, with x_{i,m}^{k_u} = 1 when container m of request i is deployed on physical node k_u; x_{i,n}^{k_v} represents a binary flag bit, with x_{i,n}^{k_v} = 1 when container n of request i is deployed on physical node k_v; and B_{k_u·k_v} represents the total bandwidth resources between physical nodes k_u and k_v;
Σ_{i∈I} Σ_{j∈V_i} x_{i,j}^k c_i^j ≤ C_k, ∀k ∈ N
Σ_{i∈I} Σ_{j∈V_i} x_{i,j}^k m_i^j ≤ M_k, ∀k ∈ N
Σ_{i∈I} Σ_{j∈V_i} x_{i,j}^k s_i^j ≤ S_k, ∀k ∈ N

wherein I represents the set of service requests; N represents the set of physical nodes; x_{i,j}^k represents a binary flag bit, with x_{i,j}^k = 1 when container j of request i is deployed on physical node k; c_i^j represents the computing resource demand of container j of request i; C_k represents the total computing resources of physical node k; m_i^j represents the memory resource demand of container j of request i; M_k represents the total memory resources of physical node k; s_i^j represents the storage resource demand of container j of request i; and S_k represents the total storage resources of physical node k.
6. The method for online deployment of container clusters fusing graph neural network and reinforcement learning in edge computing according to claim 1, wherein the model is updated as follows:
θ_{k+1} = θ_k − α ∇_θ J_L(λ,θ)

wherein θ_{k+1} represents the model parameters at the next moment; θ_k represents the model parameters at the current moment; α represents the learning rate; and ∇_θ J_L(λ,θ) represents the Lagrangian gradient approximated using Monte Carlo sampling.
7. The method for online deployment of container clusters in edge computing that fuses graph neural networks and reinforcement learning of claim 6, wherein the model updating further comprises:
L(σ) = (1/m) Σ_{i=1}^{m} ( Q(c, p_i) − b(c, p_i) )²

wherein L(σ) represents the mean square error between the evaluation value b(c,p) given by the reference evaluator and the reward value Q(c,p); m represents the number of samples; Q(c, p_i) represents the reward earned under decision p_i made by the algorithm for the given input container cluster c; and b(c, p_i) represents the evaluation value given by the reference evaluator b for the given input container cluster c and decision p_i.
CN202211347967.8A 2022-10-31 2022-10-31 Container cluster online deployment method integrating graph neural network and reinforcement learning in edge calculation Active CN115686846B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211347967.8A CN115686846B (en) 2022-10-31 2022-10-31 Container cluster online deployment method integrating graph neural network and reinforcement learning in edge calculation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211347967.8A CN115686846B (en) 2022-10-31 2022-10-31 Container cluster online deployment method integrating graph neural network and reinforcement learning in edge calculation

Publications (2)

Publication Number Publication Date
CN115686846A CN115686846A (en) 2023-02-03
CN115686846B true CN115686846B (en) 2023-05-02

Family

ID=85045641

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211347967.8A Active CN115686846B (en) 2022-10-31 2022-10-31 Container cluster online deployment method integrating graph neural network and reinforcement learning in edge calculation

Country Status (1)

Country Link
CN (1) CN115686846B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116069512B (en) * 2023-03-23 2023-08-04 之江实验室 Serverless efficient resource allocation method and system based on reinforcement learning
CN117149443B (en) * 2023-10-30 2024-01-26 江西师范大学 Edge computing service deployment method based on neural network

Citations (2)

Publication number Priority date Publication date Assignee Title
CN110008819A (en) * 2019-01-30 2019-07-12 武汉科技大学 A kind of facial expression recognizing method based on figure convolutional neural networks
CN113568675A (en) * 2021-07-08 2021-10-29 广东利通科技投资有限公司 Internet of vehicles edge calculation task unloading method based on layered reinforcement learning

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
EP3792831A1 (en) * 2019-09-11 2021-03-17 Siemens Aktiengesellschaft Method for generating an adapted task graph
CN112631717B (en) * 2020-12-21 2023-09-05 重庆大学 Asynchronous reinforcement learning-based network service function chain dynamic deployment system and method
CN112711475B (en) * 2021-01-20 2022-09-06 上海交通大学 Workflow scheduling method and system based on graph convolution neural network
US20220124543A1 (en) * 2021-06-30 2022-04-21 Oner Orhan Graph neural network and reinforcement learning techniques for connection management
CN113778648B (en) * 2021-08-31 2023-07-11 重庆理工大学 Task scheduling method based on deep reinforcement learning in hierarchical edge computing environment

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN110008819A (en) * 2019-01-30 2019-07-12 武汉科技大学 A kind of facial expression recognizing method based on figure convolutional neural networks
CN113568675A (en) * 2021-07-08 2021-10-29 广东利通科技投资有限公司 Internet of vehicles edge calculation task unloading method based on layered reinforcement learning

Also Published As

Publication number Publication date
CN115686846A (en) 2023-02-03

Similar Documents

Publication Publication Date Title
CN115686846B (en) Container cluster online deployment method integrating graph neural network and reinforcement learning in edge calculation
CN109818786B (en) Method for optimally selecting distributed multi-resource combined path capable of sensing application of cloud data center
Guim et al. Autonomous lifecycle management for resource-efficient workload orchestration for green edge computing
Rkhami et al. Learn to improve: A novel deep reinforcement learning approach for beyond 5G network slicing
Bahrpeyma et al. An adaptive RL based approach for dynamic resource provisioning in Cloud virtualized data centers
CN114936708A (en) Fault diagnosis optimization method based on edge cloud collaborative task unloading and electronic equipment
Aslam et al. Using artificial neural network for VM consolidation approach to enhance energy efficiency in green cloud
CN108073442B (en) Simulation request execution time prediction method based on depth fuzzy stack self-coding
CN116009990B (en) Cloud edge collaborative element reinforcement learning computing unloading method based on wide attention mechanism
Luan et al. LRP‐based network pruning and policy distillation of robust and non‐robust DRL agents for embedded systems
CN113543160A (en) 5G slice resource allocation method and device, computing equipment and computer storage medium
Huang et al. Learning-aided fine grained offloading for real-time applications in edge-cloud computing
Qin et al. Dynamic IoT service placement based on shared parallel architecture in fog-cloud computing
CN116126534A (en) Cloud resource dynamic expansion method and system
CN112906745B (en) Integrity intelligent network training method based on edge cooperation
CN115499511A (en) Micro-service active scaling method based on space-time diagram neural network load prediction
CN113783726B (en) SLA-oriented resource self-adaptive customization method for edge cloud system
Liu et al. Hidden markov model based spot price prediction for cloud computing
Bhargavi et al. Uncertainty aware resource provisioning framework for cloud using expected 3-SARSA learning agent: NSS and FNSS based approach
Li et al. An automated VNF manager based on parameterized action MDP and reinforcement learning
CN111913780A (en) Resource prediction and scheduling method in cloud computing
CN117648174B (en) Cloud computing heterogeneous task scheduling and container management method based on artificial intelligence
Damaševičius et al. Short time prediction of cloud server round-trip time using a hybrid neuro-fuzzy network
Dixit et al. Machine Learning Based Adaptive Auto-scaling Policy for Resource Orchestration in Kubernetes Clusters
Su et al. An Attention Mechanism-based Microservice Placement Scheme for On-star Edge Computing Nodes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant