CN114489925A - Containerized service scheduling framework and flexible scheduling algorithm - Google Patents


Info

Publication number
CN114489925A
CN114489925A
Authority
CN
China
Prior art keywords
task
node
scheduling
nodes
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111498294.1A
Other languages
Chinese (zh)
Inventor
曾纪钧
龙震岳
张小陆
梁哲恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Power Grid Co Ltd
Original Assignee
Guangdong Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Power Grid Co Ltd filed Critical Guangdong Power Grid Co Ltd
Priority to CN202111498294.1A priority Critical patent/CN114489925A/en
Publication of CN114489925A publication Critical patent/CN114489925A/en
Pending legal-status Critical Current


Classifications

    • G - PHYSICS; G06 - COMPUTING; CALCULATING OR COUNTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/45558 - Hypervisor-specific management and integration aspects
    • G06F 9/5038 - Allocation of resources to service a request, considering the execution order of a plurality of tasks
    • G06F 9/5072 - Grid computing
    • G06F 9/5077 - Logical partitioning of resources; management or configuration of virtualized resources
    • G06F 9/5083 - Techniques for rebalancing the load in a distributed system
    • G06F 2009/4557 - Distribution of virtual machine instances; migration and load balancing
    • G06F 2009/45591 - Monitoring or debugging support
    • G06F 2009/45595 - Network integration; enabling network access in virtual machine instances
    • G06F 2209/502 - Proximity

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a containerized service scheduling framework and an elastic scheduling algorithm, comprising: a user request undergoes semantic analysis and forwarding through the API gateway of the master node; the parsed actual operation objects are created and scheduled, and all objects are stored in a key-value database; and additional controller processes perform health monitoring on the objects in the cluster. The invention integrates the Docker container technology and the Kubernetes container management technology to provide a container-technology-based service scheduling framework for the edge side of the power Internet of Things. To make up for the shortcomings of the edge-side service scheduling algorithm under this framework, the initialization and updating of pheromones, the setting of heuristic factors and the like are optimized and improved on the basis of the standard ant colony algorithm, and an elastic dynamic resource scheduling algorithm based on the ant colony algorithm is designed, so that load balancing of edge task allocation is achieved and limited edge resources can exert their maximum effect.

Description

Containerized service scheduling framework and flexible scheduling algorithm
Technical Field
The invention relates to the technical field of power Internet of things, in particular to a containerization service scheduling framework and an elastic scheduling algorithm.
Background
In the new generation of the power Internet of Things, high bandwidth, high density and low latency are the main characteristics of services. If the original cloud computing mode is still adopted, the requirements on speed, latency and stability cannot be met. Edge computing is a computing mode that provides services for users at the edge of the network and is characterized by low latency and high bandwidth. With the application of edge computing technology, the service scheduling function gradually shifts toward the edge side. However, the existing edge-side scheduling strategy of the power Internet of Things has low resource allocation efficiency and poor flexibility, resulting in high data processing latency and poor service scalability.
Disclosure of Invention
This section is for the purpose of summarizing some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. In this section, as well as in the abstract and the title of the invention of this application, simplifications or omissions may be made to avoid obscuring the purpose of the section, the abstract and the title, and such simplifications or omissions are not intended to limit the scope of the invention.
The present invention has been made in view of the above and/or other problems occurring in the existing containerized service scheduling framework and flexible scheduling algorithm.
Therefore, the invention aims to solve the problems that the existing edge-side scheduling strategy of the power Internet of Things has low resource allocation efficiency and poor flexibility, resulting in high data processing latency and poor service scalability.
In order to solve the technical problems, the invention provides the following technical scheme: a containerized service scheduling framework and an elastic scheduling algorithm, comprising: a user request undergoes semantic analysis and forwarding through the API gateway of the master node;
creating and scheduling the analyzed actual operation objects, and storing all the objects into a key value database;
an additional controller process performs health monitoring on objects in the cluster;
dynamic resource scheduling is realized based on the HPA horizontal scaling controller in Kubernetes; the horizontal scaling controller aggregates, through a metrics aggregator, the metric data collected by third-party tools, uses the metrics for the elastic scheduling algorithm to calculate the optimal scheme of elastic resource scheduling, and performs dynamic scaling. The specific scaling flow is:
creating the HPA horizontal scaling controller resources;
setting the period of the controller manager;
running the application instances and collecting resource usage;
taking the tolerance into account.
As a preferred solution of the containerized service scheduling framework and flexible scheduling algorithm of the present invention, wherein: the master node serves as the control center of the whole cluster and is mainly responsible for the allocation, scheduling and recovery of resources, and all slave nodes serve as a resource pool to be partitioned by the master node.
As a preferred solution of the containerized service scheduling framework and flexible scheduling algorithm of the present invention, wherein: all nodes are virtual machines, the use of multiple virtual machines guarantees node availability, and all nodes in the whole framework run as containers through the Docker engine.
As a preferred solution of the containerized service scheduling framework and flexible scheduling algorithm of the present invention, wherein: the cluster storage adopts a distributed database etcd, and stores metadata and expected states and current states of all resources in a high-availability key value storage mode.
As a preferred solution of the containerized service scheduling framework and flexible scheduling algorithm of the present invention, wherein: on the basis of the standard ant colony algorithm, the initialization and updating of pheromones, the setting of heuristic factors, and the like are optimized and improved so as to accomplish the target tasks and meet the requirements of the edge computing environment.
The load balance of the edge nodes is measured by the load imbalance degree, which takes a value between 0 and 1; the smaller the value, the more uniform the task load distribution across the edge nodes and the higher the overall performance of the system. The concrete expression (given as an image in the original) is a function of u_i^cpu, u_i^gpu and u_i^mem, which respectively represent the CPU, GPU and memory utilization of edge node i (i = 1, …, n), and of ū^cpu, ū^gpu and ū^mem, which respectively represent the average CPU, GPU and memory utilization over all edge nodes.
As a preferred solution of the containerized service scheduling framework and flexible scheduling algorithm of the present invention, wherein: the heuristic factor η_ij mainly expresses the desired strength of placing task i at edge node j; the larger the value of η_ij, the greater the probability that the task is placed at that node:

η_ij(t) = Q · match_ij(t)

where Q is an arbitrary constant and match_ij(t) is defined as the cosine similarity between the resources required by the task awaiting allocation and the idle resources of the node, which is equivalent to representing the similarity between task i and node j by the angle between their resource vectors;
the smaller the angle, the higher the similarity between the two and the higher the possibility that the task is allocated at the node:

match_ij(t) = Σ_{a=1..A} free_j^a · req_i^a / ( sqrt(Σ_{a=1..A} (free_j^a)^2) · sqrt(Σ_{a=1..A} (req_i^a)^2) )

where A represents the number of resource types the edge node can provide for the task (CPU, GPU and memory are taken as the demanded resources), free_j^a is the idle amount of the a-th resource of edge node j, and req_i^a is the demand of task i for resource a.
As a preferred solution of the containerized service scheduling framework and flexible scheduling algorithm of the present invention, wherein: the allocation between tasks and nodes is a many-to-one mapping. During task allocation it must be ensured that the amount of each resource required by a task is smaller than the amount of the node's idle resources; by comparing a task's demands for CPU, GPU and memory with the node's available resources, tasks are prevented from being allocated to nodes with insufficient resources, namely:

Σ_j cpu_ij ≤ CPU_i
Σ_j gpu_ij ≤ GPU_i
Σ_j mem_ij ≤ MEM_i

where cpu_ij, gpu_ij and mem_ij respectively represent the CPU, GPU and memory usage of task j on edge node i, and CPU_i, GPU_i and MEM_i respectively represent the available CPU, GPU and memory of edge node i.
As a preferred solution of the containerized service scheduling framework and flexible scheduling algorithm of the present invention, wherein: in the pheromone update of the original ant colony algorithm, pheromone evaporates on all paths, which causes the pheromone of rarely visited nodes to become very low and even approach zero. Therefore, paths that have not been travelled are exempted from evaporation; the specific change is:

τ_ij(t+1) = (1 - ρ) · τ_ij(t) + Δτ_ij,  if Δτ_ij > 0
τ_ij(t+1) = τ_ij(t),  if Δτ_ij = 0

where ρ represents the evaporation factor of the pheromone and 1 - ρ its residual factor. Δτ_ij = 0 indicates that the task is not allocated on the node.
As a preferred solution of the containerized service scheduling framework and flexible scheduling algorithm of the present invention, wherein: the overall algorithm flow is as follows:
Input: the edge node set PM = (pm1, pm2, …, pmn), the task demand set TN = (tn1, tn2, …, tnn), the pheromone heuristic factor α, the expectation heuristic factor β, the pheromone evaporation rate ρ, and other initial requirements.
Output: the optimal task allocation scheme and the load imbalance degree.
As a preferred solution of the containerized service scheduling framework and flexible scheduling algorithm of the present invention, wherein: initializing the number of ants AntNum and attaching its required resources to each task in the order of task submission;
randomly distributing the n ants carrying task demands onto random nodes, and calculating the probability that the kth ant allocates task i to node j;
after the kth ant completes all task allocations, locally updating the pheromone on the allocated nodes;
after all ants have completed deployment, calculating the load imbalance degree of the allocation scheme, comparing it with the historical record, and recording the optimal allocation scheme and the minimum load imbalance degree;
judging whether all ants have finished;
judging whether the iteration count or the preset load balance degree has been reached; when the algorithm iteration finishes, returning the optimal path solution; otherwise, recalculating the probability that the kth ant allocates task i to node j.
The invention has the beneficial effects that: the invention integrates the Docker container technology and the Kubernetes container management technology to provide a container-technology-based service scheduling framework for the edge side of the power Internet of Things. To make up for the shortcomings of the edge-side service scheduling algorithm under this framework, the initialization and updating of pheromones, the setting of heuristic factors and the like are optimized and improved on the basis of the standard ant colony algorithm, and an elastic dynamic resource scheduling algorithm based on the ant colony algorithm is designed, so that load balancing of edge task allocation is achieved and limited edge resources can exert their maximum effect.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise. Wherein:
fig. 1 is a scene diagram of a containerized service scheduling framework and an elastic scheduling algorithm.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those specifically described and will be readily apparent to those of ordinary skill in the art without departing from the spirit of the present invention, and therefore the present invention is not limited to the specific embodiments disclosed below.
Furthermore, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
Example 1
Referring to fig. 1, a first embodiment of the present invention provides a containerized service scheduling framework and an elastic scheduling algorithm. As shown in fig. 1, a distributed master/slave architecture is adopted for the overall architecture. The master node serves as the control center of the whole cluster and is mainly responsible for resource allocation, scheduling and recovery; all slave nodes serve as a resource pool to be partitioned by the master node. All nodes are virtual machines; the use of multiple virtual machines guarantees node availability and minimizes the risk of node failure. All nodes in the whole architecture run as containers through the Docker engine. Cluster storage uses the distributed database etcd, which stores metadata and the desired and current states of all resources in a highly available key-value store, ensuring data safety and stable system operation.
The whole framework operation flow comprises the following steps: the user request undergoes semantic analysis and forwarding through the API gateway of the master node; the parsed actual operation objects are created and scheduled, and all objects are stored in the key-value database; additional controller processes perform health monitoring on the objects in the cluster.
Dynamic resource scheduling is realized based on the HPA horizontal scaling controller in Kubernetes. The horizontal scaling controller aggregates, through a metrics aggregator, the metric data collected by third-party tools, uses the metrics for the elastic scheduling algorithm to calculate the optimal scheme of elastic resource scheduling, and performs dynamic scaling. The specific scaling flow is:
Create the HPA horizontal scaling controller resources: set the allowed range of application instance counts and the average CPU usage limit, set the Pod resource limits according to the application type, and set the tolerance parameter, which defaults to 0.1;
Set the period of the controller manager, i.e. the interval at which resource usage is queried, to 30 seconds;
Run the application instances and collect resource usage: the target instance count is calculated from the metric values and the configured limit as (sum of metric values / configured limit);
Take the tolerance into account: if the deviation exceeds the tolerance range, adjust the number of instances; otherwise, recalculate the target instance count. The target instance count may not exceed the configured instance range; if it stays within the range, scale out to it; if it exceeds the range, scale out to the maximum instance count and repeat the calculation (sum of metric values / configured limit) to obtain the next target instance count.
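The target-instance computation in the flow above can be sketched as follows. This is a minimal illustration using the publicly documented Kubernetes HPA rule, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), with the default tolerance of 0.1; the function and parameter names are illustrative, not from the patent.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas, tolerance=0.1):
    """Sketch of the HPA scaling rule described above.

    Scaling is skipped when the metric ratio is within the tolerance
    (default 0.1); otherwise the desired count is
        ceil(current_replicas * current_metric / target_metric),
    clamped to the configured [min_replicas, max_replicas] range.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:          # within tolerance: no change
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))
```

For example, 4 instances averaging 90% CPU against a 60% target scale out to 6, while 62% against a 60% target stays at 4 because the ratio is within the tolerance.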
On the basis of the standard ant colony algorithm, the initialization and updating of pheromones, the setting of heuristic factors, and the like are optimized and improved so as to accomplish the target tasks and meet the requirements of the edge computing environment.
The load balance of the edge nodes is measured by the load imbalance degree, which takes a value between 0 and 1; the smaller the value, the more uniform the task load distribution across the edge nodes and the higher the overall performance of the system. The concrete expression is Equation 1 (given as an image in the original), a function of u_i^cpu, u_i^gpu and u_i^mem, which respectively represent the CPU, GPU and memory utilization of edge node i (i = 1, …, n), and of ū^cpu, ū^gpu and ū^mem, which respectively represent the average CPU, GPU and memory utilization over all edge nodes.
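Since Equation 1 itself appears only as an image in the original, the sketch below uses an assumed imbalance measure consistent with the description: a value in [0, 1] that is smaller when per-node utilizations are more uniform, here the mean absolute deviation of per-node utilization from the cluster average, averaged over the three resource types.

```python
def load_imbalance(cpu, gpu, mem):
    """Illustrative load-imbalance degree (the exact Equation 1 is an
    image in the original; this mean-absolute-deviation form is an
    assumption matching the description: range [0, 1], smaller means
    a more uniform load distribution).

    cpu, gpu, mem: per-node utilizations in [0, 1], one entry per edge node.
    """
    n = len(cpu)
    total = 0.0
    for util in (cpu, gpu, mem):
        avg = sum(util) / n  # cluster-average utilization of this resource
        total += sum(abs(u - avg) for u in util) / n
    return total / 3.0  # average over CPU, GPU and memory
```

A perfectly balanced cluster yields 0.0, while one fully loaded node next to one idle node yields 0.5.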
The heuristic factor η_ij mainly expresses the desired strength of placing task i at edge node j; the larger the value of η_ij, the greater the probability that the task is placed at that node. This is Equation 2:

η_ij(t) = Q · match_ij(t)    (Equation 2)

where Q is an arbitrary constant and match_ij(t) is defined as the cosine similarity between the resources required by the task awaiting allocation and the idle resources of the node, which is equivalent to representing the similarity between task i and node j by the angle between their resource vectors. The smaller the angle, the higher the similarity between the two and the higher the possibility that the task is allocated at the node. This is Equation 3:

match_ij(t) = Σ_{a=1..A} free_j^a · req_i^a / ( sqrt(Σ_{a=1..A} (free_j^a)^2) · sqrt(Σ_{a=1..A} (req_i^a)^2) )    (Equation 3)

where A represents the number of resource types the edge node can provide for the task (CPU, GPU and memory are taken as the demanded resources), free_j^a is the idle amount of the a-th resource of edge node j, and req_i^a is the demand of task i for resource a.
From Equation 3 it follows that the greater the matching degree between the task to be allocated and the edge node, the better the node can satisfy the task's performance requirements, and the higher the probability that the task is allocated on that node. During task allocation the matching degree changes continuously over time, so match_ij(t) represents the matching degree of task i and edge node j at time t; once a task is placed on an edge node, the idle resources remaining after the task's required resources are allocated are computed before the node's matching degree with other tasks is calculated at the next step.
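The cosine-similarity matching of Equation 3 can be sketched as follows; resource vectors are ordered (CPU, GPU, memory), and the function name is illustrative.

```python
import math

def match_degree(free, demand):
    """Cosine similarity between a node's idle resources and a task's
    demanded resources (Equation 3): the smaller the angle between the
    two vectors, the better the task fits the node.

    free:   idle amount of each resource type on the node, e.g. (cpu, gpu, mem)
    demand: amount of each resource type the task requires, same order."""
    dot = sum(f * d for f, d in zip(free, demand))
    norm = math.sqrt(sum(f * f for f in free)) * math.sqrt(sum(d * d for d in demand))
    return dot / norm if norm else 0.0
```

A node whose idle resources are proportional to the demand scores 1.0; a node whose free capacity is skewed toward resources the task does not need scores lower.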
The allocation between tasks and nodes is a many-to-one mapping. During task allocation it must be ensured that the amount of each resource required by a task is smaller than the amount of the node's idle resources; by comparing a task's demands for CPU, GPU and memory with the node's available resources, tasks are prevented from being allocated to nodes with insufficient resources, namely:

Σ_j cpu_ij ≤ CPU_i    (Equation 4)
Σ_j gpu_ij ≤ GPU_i    (Equation 5)
Σ_j mem_ij ≤ MEM_i    (Equation 6)

where cpu_ij, gpu_ij and mem_ij respectively represent the CPU, GPU and memory usage of task j on edge node i, and CPU_i, GPU_i and MEM_i respectively represent the available CPU, GPU and memory of edge node i.
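The constraint of Equations 4-6 amounts to a per-resource sum check, sketched below with illustrative names.

```python
def within_capacity(node_avail, tasks_on_node):
    """Constraint check of Equations 4-6: for each resource type, the
    total amount used by all tasks placed on a node must not exceed the
    node's available amount.

    node_avail:    available amounts, e.g. {"cpu": 8, "gpu": 2, "mem": 16}
    tasks_on_node: list of per-task usage dicts with the same keys."""
    for res, avail in node_avail.items():
        used = sum(task[res] for task in tasks_on_node)
        if used > avail:  # demand exceeds the node's idle resources
            return False
    return True
```

A scheduler would call this before committing an allocation, skipping nodes for which the check fails.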
In the pheromone update of the original ant colony algorithm, pheromone evaporates on all paths, which causes the pheromone of rarely visited nodes to become very low and even approach zero. Therefore, paths that have not been travelled are exempted from evaporation; the specific change is:

τ_ij(t+1) = (1 - ρ) · τ_ij(t) + Δτ_ij,  if Δτ_ij > 0    (Equation 7)
τ_ij(t+1) = τ_ij(t),  if Δτ_ij = 0    (Equation 8)

where ρ represents the evaporation factor of the pheromone and 1 - ρ its residual factor. Δτ_ij = 0 indicates that the task is not allocated on the node.
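The modified update of Equations 7-8 can be sketched as follows, storing pheromone and deposited pheromone as dictionaries keyed by (task, node) path; the names are illustrative.

```python
def update_pheromone(tau, delta, rho=0.5):
    """Modified pheromone update (Equations 7-8): paths that received no
    new pheromone this iteration (delta == 0, i.e. no task was allocated
    there) are left unchanged instead of evaporating, so rarely used
    nodes do not decay toward zero.

    tau:   current pheromone per path, {(task, node): value}
    delta: pheromone deposited this iteration, {(task, node): value}
    rho:   evaporation factor; 1 - rho is the residual factor."""
    new_tau = {}
    for path, t in tau.items():
        d = delta.get(path, 0.0)
        if d == 0.0:
            new_tau[path] = t                    # untravelled: no evaporation
        else:
            new_tau[path] = (1.0 - rho) * t + d  # evaporate, then deposit
    return new_tau
```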
The overall algorithm flow is as follows:
Input: the edge node set PM = (pm1, pm2, …, pmn), the task demand set TN = (tn1, tn2, …, tnn), the pheromone heuristic factor α, the expectation heuristic factor β, the pheromone evaporation rate ρ, and other initial requirements.
Output: the optimal task allocation scheme and the load imbalance degree.
Step 1: initialize the number of ants AntNum, the maximum iteration count maxIter, the heuristic factors α and β, and related parameters, and attach its required resources to each task in the order of task submission.
Step 2: randomly distribute the n ants carrying task demands onto random nodes, and calculate the probability that the kth ant allocates task i to node j. Then, by roulette-wheel selection, randomly choose one node among the nodes meeting the constraints as the allocation node of the task, and deploy the task on that node. The probability selection formula is Equation 9:

p_ij^k(t) = [τ_ij(t)]^α · [η_ij(t)]^β / Σ_{s ∈ allowed_k} [τ_is(t)]^α · [η_is(t)]^β  for j ∈ allowed_k, and 0 otherwise    (Equation 9)

where p_ij^k(t) represents the probability that task i of the kth ant selects node j, τ_ij(t) is the pheromone concentration of path (i, j), η_ij(t) is the heuristic factor of path (i, j) as in Equation 2, and allowed_k is the allocable node list for task i.
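Step 2's roulette-wheel selection over Equation 9 can be sketched as follows; names and default exponents are illustrative.

```python
import random

def choose_node(task, allowed, tau, eta, alpha=1.0, beta=2.0, rng=random):
    """Roulette-wheel node selection of Step 2 (Equation 9): node j is
    chosen with probability proportional to
        tau[(task, j)] ** alpha * eta[(task, j)] ** beta
    over the allocable node list `allowed`."""
    weights = [tau[(task, j)] ** alpha * eta[(task, j)] ** beta for j in allowed]
    total = sum(weights)
    pick = rng.uniform(0.0, total)   # spin the wheel
    acc = 0.0
    for j, w in zip(allowed, weights):
        acc += w
        if pick <= acc:
            return j
    return allowed[-1]               # numerical guard
```

Nodes with more pheromone or a higher heuristic value are picked more often, but every feasible node keeps a nonzero chance, preserving exploration.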
Step 3: after the kth ant completes all task allocations, locally update the allocated nodes according to Equation 5.
Step 4: after all ants have completed deployment, calculate the load imbalance degree of the allocation scheme according to Equation 1, compare it with the historical record, and record the optimal allocation scheme and the minimum load imbalance degree.
Step 5: judge whether all ants have finished; if not, jump to Step 2. If all ants have finished the iteration, calculate and store the global optimal solution, and finally update the pheromone globally according to the optimal placement scheme and Equation 7.
Step 6: judge whether the iteration count or the preset load balance degree has been reached; when the algorithm iteration finishes, return the optimal path solution. Otherwise, return to Step 2.
It should be noted that the above-mentioned embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, which should be covered by the claims of the present invention.

Claims (10)

1. A containerized service scheduling framework, characterized by comprising:
user requests undergo semantic analysis and forwarding through the API gateway of the master node;
the parsed actual operation objects are created and scheduled, and all objects are stored in a key-value database;
an additional controller process performs health monitoring on the objects in the cluster;
dynamic resource scheduling is implemented based on the HPA horizontal scaling controller in Kubernetes; the horizontal scaling controller collects the metric data of third-party tools through a metrics aggregator, feeds the metrics to the elastic scheduling algorithm to compute the optimal elastic resource scheduling scheme, and performs dynamic scale-out, the specific scale-out flow being:
creating the HPA horizontal scaling controller resource;
setting the period of the controller manager;
running application instances and collecting resource usage;
taking the tolerance into account.
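For illustration, the scale-out decision this flow relies on is Kubernetes' documented HPA rule, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), skipped when the metric ratio falls inside the tolerance band (0.1 by default). A minimal sketch (function name and replica bounds are hypothetical):

```python
import math

def desired_replicas(current, metric_value, target_value,
                     tolerance=0.1, min_replicas=1, max_replicas=10):
    """Kubernetes HPA core rule: desired = ceil(current * metric / target).
    Scaling is skipped when the metric ratio lies within the tolerance band."""
    ratio = metric_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current                      # within tolerance: no scaling
    desired = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, desired))
```

For example, 2 replicas at 200% of the target metric scale out to 4, while a 5% overshoot stays within tolerance and leaves the replica count unchanged.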
2. The containerized service scheduling framework of claim 1, wherein: the master node serves as the control center of the whole cluster and is mainly responsible for the allocation, scheduling and recovery of resources, while all slave nodes serve as a resource pool that the master node partitions.
3. The containerized service scheduling framework of claim 2, wherein: all nodes are virtual machines, with multiple virtual machines guaranteeing node availability, and every node in the architecture runs its workloads as containers through the Docker engine.
4. The containerized service scheduling framework of claim 3, wherein: the cluster storage adopts the distributed database etcd, which stores the metadata and the desired and current states of all resources as a highly available key-value store.
5. The flexible scheduling algorithm of the containerized service scheduling framework of any one of claims 1-4, wherein: on the basis of the standard ant colony algorithm, the pheromone initialization, pheromone updating, heuristic-factor settings and the like are optimized and improved so as to accomplish the target tasks and meet the requirements of the edge computing environment.
The load balance of the edge nodes is measured by the load-imbalance degree, valued between 0 and 1; the smaller the value, the more evenly the tasks are distributed across the edge nodes and the higher the overall performance of the system. Concretely:
$$
LB=\sqrt{\frac{1}{3n}\sum_{i=1}^{n}\left[\left(u_i^{cpu}-\bar{u}^{cpu}\right)^2+\left(u_i^{gpu}-\bar{u}^{gpu}\right)^2+\left(u_i^{mem}-\bar{u}^{mem}\right)^2\right]}
\tag{1}
$$
where $u_i^{cpu}$, $u_i^{gpu}$ and $u_i^{mem}$ denote the CPU, GPU and memory utilization of edge node i (i = 1, …, n), and $\bar{u}^{cpu}$, $\bar{u}^{gpu}$ and $\bar{u}^{mem}$ denote the average CPU, GPU and memory utilization over all edge nodes.
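A minimal sketch of this measure, assuming the root-mean-square reading of Equation (1) (the function name is hypothetical):

```python
import math

def load_imbalance(util):
    """util[i] = (cpu, gpu, mem) utilization of edge node i, each in [0, 1].
    Returns the RMS deviation from the cluster-wide averages: 0 when the
    load is perfectly even; larger values indicate a more skewed load."""
    n = len(util)
    avg = [sum(u[k] for u in util) / n for k in range(3)]
    var = sum((u[k] - avg[k]) ** 2 for u in util for k in range(3)) / (3 * n)
    return math.sqrt(var)
```

A perfectly balanced pair of nodes yields 0, while one fully loaded node next to an idle one yields 0.5.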
6. The flexible scheduling algorithm of the containerized service scheduling framework of claim 5, wherein: the heuristic factor $\eta_{ij}$ mainly expresses the expected strength of placing task i at edge node j; the larger the value of $\eta_{ij}$, the greater the probability that the task is placed at that node:
$$
\eta_{ij}=Q\cdot\cos\theta_{ij}
\tag{2}
$$
where Q is an arbitrary constant and $\cos\theta_{ij}$ is the cosine similarity between the resources required by the task to be allocated and the node's free resources, i.e. the similarity between task i and node j expressed through the angle between their resource vectors; the smaller the angle, the higher the similarity between the two and the greater the possibility that the task is allocated at that node:
$$
\cos\theta_{ij}=\frac{\sum_{a=1}^{A} f_i^{a}\, d_j^{a}}{\sqrt{\sum_{a=1}^{A}\left(f_i^{a}\right)^2}\,\sqrt{\sum_{a=1}^{A}\left(d_j^{a}\right)^2}}
\tag{3}
$$
where A is the number of resource types the edge node can provide for a task (CPU, GPU and memory are taken as the demanded resources), $f_i^{a}$ is the free amount of the a-th resource of edge node i, and $d_j^{a}$ is the demand of task j for resource a.
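A sketch of this cosine-similarity heuristic (function names hypothetical; vectors ordered as CPU, GPU, memory):

```python
import math

def cosine_similarity(free, need):
    """cos(theta) between a node's free-resource vector and a task's demand
    vector (A = 3 components: CPU, GPU, memory)."""
    dot = sum(f * d for f, d in zip(free, need))
    norm = math.sqrt(sum(f * f for f in free)) * math.sqrt(sum(d * d for d in need))
    return dot / norm if norm else 0.0

def eta(free, need, q=1.0):
    """Heuristic factor eta = Q * cos(theta): the better a node's free
    resources match the task's demand profile, the more desirable the node."""
    return q * cosine_similarity(free, need)
```

Proportional vectors (a node whose spare capacity mirrors the demand mix) give similarity 1, while orthogonal vectors give 0.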
7. The flexible scheduling algorithm of the containerized service scheduling framework of claim 6, wherein: the allocation between tasks and nodes is a many-to-one mapping; during allocation it must be guaranteed that the amount of each resource a task requires is smaller than the node's free amount of that resource, so that comparing the task's CPU, GPU and memory demands with the node's available resources prevents tasks from being allocated to nodes with insufficient resources, i.e.:
$$
u_j^{cpu}\le a_i^{cpu},\qquad u_j^{gpu}\le a_i^{gpu},\qquad u_j^{mem}\le a_i^{mem}
\tag{4}
$$
where $u_j^{cpu}$, $u_j^{gpu}$ and $u_j^{mem}$ denote task j's required usage of CPU, GPU and memory, and $a_i^{cpu}$, $a_i^{gpu}$ and $a_i^{mem}$ denote the available amount of CPU, GPU and memory on edge node i.
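The many-to-one feasibility constraint reduces to an element-wise comparison; a minimal sketch (names hypothetical, vectors ordered as CPU, GPU, memory):

```python
def feasible(need, available):
    """A task may be placed on a node only if each required amount
    (CPU, GPU, memory) does not exceed the node's available amount."""
    return all(n <= a for n, a in zip(need, available))
```

In the ant colony loop, this predicate filters the candidate nodes before the roulette-wheel draw, so an ant never places a task on an overloaded node.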
8. The flexible scheduling algorithm of the containerized service scheduling framework of claim 7, wherein: in the original ant colony algorithm the pheromone update evaporates on all paths, so the pheromone on rarely visited nodes keeps decreasing and even approaches zero. Therefore paths that are not traveled are exempted from evaporation; the update becomes:
$$
\tau_{ij}(t+1)=
\begin{cases}
(1-\rho)\,\tau_{ij}(t)+\Delta\tau_{ij}(t), & \Delta\tau_{ij}(t)>0\\
\tau_{ij}(t), & \Delta\tau_{ij}(t)=0
\end{cases}
\tag{5}
$$
where ρ denotes the pheromone volatilization factor and 1 − ρ the pheromone residual factor. $\Delta\tau_{ij}=0$ indicates that the task was not allocated on that node.
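A sketch of this selective update (names hypothetical): evaporation is applied only where a deposit occurred, so untouched paths keep their pheromone:

```python
def update_pheromone(tau, delta, rho=0.1):
    """Selective update: paths that received a deposit (delta > 0) evaporate
    by rho and gain the deposit; untouched paths keep their pheromone, so
    rarely used nodes do not decay toward zero."""
    return {k: (1 - rho) * v + delta.get(k, 0.0) if delta.get(k, 0.0) > 0 else v
            for k, v in tau.items()}
```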
9. The flexible scheduling algorithm of the containerized service scheduling framework of claim 8, wherein: the overall algorithm flow is as follows:
Input: the edge node set PM = {pm1, pm2, …, pmn}, the task demand set TN = {tn1, tn2, …, tnn}, the pheromone heuristic factor α, the expectation heuristic factor β, the pheromone volatilization rate ρ, and other initial requirements.
Output: the optimal task allocation scheme and its load imbalance degree.
10. The flexible scheduling algorithm of the containerized service scheduling framework of claim 9, wherein: the number of ants AntNum is initialized and the required resources are attached to each task in the order of task submission;
the n ants carrying task requirements are randomly placed on random nodes, and the probability that the k-th ant assigns task i to node j is calculated;
after the k-th ant completes all task assignments, the assigned nodes are locally updated;
after an ant completes its deployment, the load imbalance degree of its allocation scheme is calculated and compared with the historical record, and the optimal allocation scheme and the minimum load imbalance degree are recorded;
whether all ants have finished is judged;
whether the iteration limit or the preset load balance degree has been reached is judged: if so, the algorithm ends and the optimal path solution is returned; otherwise, the probability that the k-th ant assigns task i to node j is recalculated.
CN202111498294.1A 2021-12-09 2021-12-09 Containerized service scheduling framework and flexible scheduling algorithm Pending CN114489925A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111498294.1A CN114489925A (en) 2021-12-09 2021-12-09 Containerized service scheduling framework and flexible scheduling algorithm


Publications (1)

Publication Number Publication Date
CN114489925A true CN114489925A (en) 2022-05-13

Family

ID=81492393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111498294.1A Pending CN114489925A (en) 2021-12-09 2021-12-09 Containerized service scheduling framework and flexible scheduling algorithm

Country Status (1)

Country Link
CN (1) CN114489925A (en)


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114996025A (en) * 2022-08-01 2022-09-02 四川公众项目咨询管理有限公司 Network defense method based on edge technology
CN116450382A (en) * 2023-06-19 2023-07-18 鹏城实验室 Data processing method and system based on function definition
CN116684483A (en) * 2023-08-02 2023-09-01 北京中电普华信息技术有限公司 Method for distributing communication resources of edge internet of things proxy and related products
CN116684483B (en) * 2023-08-02 2023-09-29 北京中电普华信息技术有限公司 Method for distributing communication resources of edge internet of things proxy and related products
CN116887357A (en) * 2023-09-08 2023-10-13 山东海博科技信息***股份有限公司 Computing platform management system based on artificial intelligence
CN116887357B (en) * 2023-09-08 2023-12-19 山东海博科技信息***股份有限公司 Computing platform management system based on artificial intelligence

Similar Documents

Publication Publication Date Title
CN114489925A (en) Containerized service scheduling framework and flexible scheduling algorithm
CN107273185B (en) Load balancing control method based on virtual machine
CN113193984B (en) Air-space-ground integrated network resource mapping method and system
US8812653B2 (en) Autonomous intelligent workload management
US8380557B2 (en) Multi-tenant database management for service level agreement (SLA) profit maximization
CN110858161A (en) Resource allocation method, device, system, equipment and medium
Deboosere et al. Efficient resource management for virtual desktop cloud computing
CN111163178B (en) Game theory-based service deployment and task unloading method in edge computing
CN109617826A (en) A kind of storm dynamic load balancing method based on cuckoo search
CN113163498B (en) Virtual network resource allocation method and device based on genetic algorithm under 5G network slice
CN111917818B (en) Dynamic matching method for personalized service demands
CN109460301B (en) Method and system for configuring elastic resources of streaming data load
Supreeth et al. Hybrid genetic algorithm and modified-particle swarm optimization algorithm (GA-MPSO) for predicting scheduling virtual machines in educational cloud platforms
KR20180072295A (en) Dynamic job scheduling system and method for supporting real-time stream data processing in distributed in-memory environment
CN108595255B (en) Workflow task scheduling method based on shortest path algorithm in geographically distributed cloud
CN115134371A (en) Scheduling method, system, equipment and medium containing edge network computing resources
Fathi et al. Consolidating VMs in green cloud computing using harmony search algorithm
CN110990160B (en) Static security analysis container cloud elastic telescoping method based on load prediction
CN109067888A (en) The distributed cloudy resource multi-level Fusion management system in strange land
CN111176784A (en) Virtual machine integration method based on extreme learning machine and ant colony system
CN111913800B (en) Resource allocation method for optimizing cost of micro-service in cloud based on L-ACO
Chatterjee et al. A new clustered load balancing approach for distributed systems
Mishra et al. Adaptive scheduling of cloud tasks using ant colony optimization
CN108234617A (en) A kind of resource dynamic dispatching method under the mixing cloud mode towards electric system
CN112437449A (en) Joint resource allocation method and area organizer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination