CN116541178A - Dynamic load balancing method and device for Docker cloud platform - Google Patents
- Publication number
- CN116541178A (application CN202310821239.4A)
- Authority
- CN
- China
- Prior art keywords
- load
- nodes
- container
- preset
- load data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/4557—Distribution of virtual machine instances; Migration and load balancing
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides a dynamic load balancing method and device for a Docker cloud platform, wherein the method comprises the following steps: obtaining load data of a plurality of nodes and containers; judging whether the load data meets a preset load condition; if the load data does not meet the preset load condition, expanding or shrinking the container; obtaining the load unbalance degree among the nodes according to the load data of all the nodes, and judging whether the load unbalance degree among the nodes exceeds a preset load balance threshold; if the load unbalance degree among the nodes exceeds the preset load balance threshold, inputting the load data of the nodes and the containers into a load decision model to obtain a load balancing strategy; and scheduling the container resources according to the load balancing strategy. The method and the device can solve the problems of the prior art that, when realizing load balancing, only a single index is considered and heuristic algorithm calculation with poor scalability is adopted, making it difficult to cope with real network environments.
Description
Technical Field
The invention relates to the technical field of computer cloud computing, in particular to a dynamic load balancing method and device for a Docker cloud platform.
Background
With the rapid development and wide application of technologies such as big data and artificial intelligence, the importance of virtualization, the core technology underpinning cloud computing platforms, is self-evident. Compared with traditional operating-system-based virtualization, Docker container technology has the advantages of being lightweight and flexible, offering high resource utilization, fast startup, and easy migration, and has become a representative core technology in the cloud computing field. At present, more and more enterprises use a Docker cluster as their main task execution environment, deploying a number of different application containers on each node in the cluster to execute different tasks, so load imbalance of the Docker cloud platform easily occurs.
Load imbalance on a Docker cloud platform not only greatly reduces the availability, resource utilization, and stability of the system, but also wastes energy and space. In addition, the load balancing strategies of existing Docker cloud platforms are one-sided: they consider either load balancing among nodes or elastic scaling of containers in isolation, do not combine the two well, and neglect the users' quality of service (QoS). Moreover, when realizing load balancing, most of them consider only a single index and adopt heuristic calculation with poor scalability, which makes it difficult to cope with real network environments.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a dynamic load balancing method and device for a Docker cloud platform, aiming at the defects of the prior art, so as to at least solve the problems that, in the prior art, load balancing considers only a single index and adopts heuristic algorithm calculation with poor scalability, making it difficult to cope with real network environments.
In a first aspect, the present invention provides a dynamic load balancing method for a Docker cloud platform, including:
load data of a plurality of nodes and containers are obtained;
load data for each of the containers:
judging whether the load data meet a preset load condition according to the load data of the container;
if the load data does not meet the preset load condition, expanding or shrinking the container to meet the preset load condition;
when the load data of all the containers meet the preset load condition, obtaining the load unbalance degree among the nodes according to the load data of all the nodes, and judging whether the load unbalance degree among the nodes exceeds a preset load balance threshold value or not;
if the load unbalance degree among the nodes exceeds the preset load balance threshold, inputting load data of the nodes and the containers into a load decision model to obtain a load balancing strategy, wherein the load decision model is a model constructed based on an asynchronous advantage actor-critic (A3C) algorithm network, and the optimization target of the load decision model is determined by the task average instruction response time ratio of the user quality of service (QoS) and the load unbalance degree among the nodes;
and scheduling the container resources according to the load balancing strategy.
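As an illustrative sketch only (not part of the patent's disclosure), the control flow of the steps above can be expressed as follows; all function names, thresholds, and data shapes are hypothetical placeholders:

```python
# Hedged sketch of the claimed control loop. All names are assumptions;
# the patent does not specify an implementation.

def balance_step(nodes, containers, overload, underload, balance_threshold,
                 imbalance_of, decide_policy, scale, apply_policy):
    """One pass of the claimed method: per-container elastic scaling first,
    then cluster-level rebalancing via the load decision model."""
    # Steps 1-3: check each container against the preset load condition.
    for c in containers:
        load = c["load"]
        if load > overload:          # overloaded -> expand the container
            scale(c, direction="out")
        elif load < underload:       # underloaded -> shrink the container
            scale(c, direction="in")
    # Step 4: once every container satisfies the condition, measure the
    # load unbalance degree among nodes.
    imbalance = imbalance_of(nodes)
    # Steps 5-6: only invoke the (A3C-based) load decision model when the
    # unbalance degree exceeds the preset load balance threshold.
    if imbalance > balance_threshold:
        policy = decide_policy(nodes, containers)
        apply_policy(policy)
    return imbalance
```

The scaling, measurement, and decision steps are injected as callables here purely so the sketch stays self-contained; in the described architecture they would correspond to the acquisition, scheduling, and decision-model components.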
Further, the judging, according to the load data, whether the load data meets a preset load condition specifically includes:
if the load data is larger than a preset overload threshold or smaller than a preset underload threshold, judging that the load data does not meet the preset load condition;
and if the load data is not greater than a preset overload threshold value and is not less than a preset underload threshold value, judging that the load data meets the preset load condition.
Further, if the load data does not meet the preset load condition, expanding or shrinking the container specifically includes:
if the load data is larger than a preset overload threshold, expanding the container;
and if the load data is smaller than a preset underload threshold, shrinking the container.
Further, the obtaining the load imbalance degree between the nodes according to the load data of all the nodes specifically includes:
for all nodes at each of the T times, the following steps are performed:
acquiring the average central processing unit (CPU) utilization of all the nodes;
calculating the CPU utilization standard deviation of all the nodes according to the CPU average utilization;
and calculating the average value of the CPU utilization standard deviations of all the nodes at T moments, and taking the average value of the CPU utilization standard deviations of all the nodes as the load unbalance degree among the nodes.
Further, the calculation formula of the average CPU utilization of all the nodes is as follows:

$$\bar{u}_t=\frac{1}{N}\sum_{n=1}^{N}\sum_{m=1}^{M}u_{m,t}\,a_{mn,t}$$

wherein $\bar{u}_t$ represents the average CPU utilization of all nodes at time t, M is the number of containers, N is the number of nodes, T is the number of moments, $u_{m,t}$ is the CPU utilization of container m at time t, and $a_{mn,t}$ is the position relation between container m and node n at time t;

the calculation formula of the CPU utilization standard deviation of all the nodes is as follows:

$$\sigma_t=\sqrt{\frac{1}{N}\sum_{n=1}^{N}\Bigl(\sum_{m=1}^{M}u_{m,t}\,a_{mn,t}-\bar{u}_t\Bigr)^{2}}$$

wherein $\sigma_t$ is the CPU utilization standard deviation of all nodes at time t;

the calculation formula of the average value of the CPU utilization standard deviations of all the nodes at the T moments is as follows:

$$\bar{\sigma}=\frac{1}{T}\sum_{t=1}^{T}\sigma_t$$

wherein $\bar{\sigma}$ is the average value of the CPU utilization standard deviations of all nodes over the T moments.
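The unbalance-degree computation can be sketched in a few lines of standard-library Python (a minimal illustration; variable names and data layout are ours, not the patent's):

```python
from statistics import mean, pstdev

def node_utilizations(container_cpu, placement, n_nodes):
    """Per-node CPU utilization at one time instant: the sum of the CPU
    utilizations of the containers placed on each node.
    container_cpu[m] is u_{m,t}; placement[m] is the node index holding
    container m (the a_{mn,t} indicator in list form)."""
    util = [0.0] * n_nodes
    for m, cpu in enumerate(container_cpu):
        util[placement[m]] += cpu
    return util

def load_imbalance(samples, n_nodes):
    """Load unbalance degree among nodes: the mean over T time instants of
    the population standard deviation of per-node CPU utilization."""
    stds = []
    for container_cpu, placement in samples:
        util = node_utilizations(container_cpu, placement, n_nodes)
        stds.append(pstdev(util))  # sigma_t for this instant
    return mean(stds)              # mean of sigma_t over the T instants
```

`pstdev` (the population standard deviation, dividing by N rather than N-1) matches the 1/N form of the reconstructed formula.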
Further, before the load data of the plurality of nodes and the container is input into the load decision model to obtain the load balancing policy, the method further includes:
for each task in all containers, the following steps are performed:
acquiring the transmission time of the task to the corresponding container;
calculating the waiting time of the task after reaching the corresponding container according to the transmission time;
acquiring the execution time of the task in the corresponding container;
and calculating the average instruction response time ratio of all tasks in all containers according to the transmission time, the waiting time and the execution time of each task.
Further, the calculation formula of the transmission time of the task to the corresponding container is as follows:

$$T^{\mathrm{trans}}_{i,m}=\frac{D_i}{v_m}$$

wherein $T^{\mathrm{trans}}_{i,m}$ is the data transmission time of task i to container m, $D_i$ is the data transmission quantity of task i, and $v_m$ is the network transmission speed to container m;

the calculation formula of the waiting time after the task reaches the corresponding container is as follows:

$$T^{\mathrm{wait}}_{i,m}=\sum_{k=1}^{C}T^{\mathrm{exec}}_{k,m}$$

wherein $T^{\mathrm{wait}}_{i,m}$ is the waiting time after task i reaches container m, k = 1, …, C, C is the number of tasks that container m has not yet executed, and $T^{\mathrm{exec}}_{k,m}$ is the execution time of task k in container m;

the calculation formula of the execution time of the task in the corresponding container is as follows:

$$T^{\mathrm{exec}}_{i,m}=\frac{L_i}{(1-u_m)\,s_m}$$

wherein $T^{\mathrm{exec}}_{i,m}$ is the execution time of task i in container m, $L_i$ is the instruction length of task i, $u_m$ is the CPU utilization of container m, and $s_m$ is the instruction execution speed of container m;

the calculation formula of the average instruction response time ratio of all tasks in all containers is as follows:

$$\bar{R}=\frac{1}{I}\sum_{i=1}^{I}\frac{T^{\mathrm{trans}}_{i,m}+T^{\mathrm{wait}}_{i,m}+T^{\mathrm{exec}}_{i,m}}{T^{\mathrm{exec}}_{i,m}}$$

wherein $\bar{R}$ is the average instruction response time ratio of a total of I tasks within M containers, I being the number of tasks and M being the number of containers.
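A minimal sketch of the QoS metric follows. It assumes, as in the reconstructed formulas, that execution time scales as $L_i/((1-u_m)\,s_m)$ and that the per-task ratio is (transmission + waiting + execution) / execution; the data layout and field names are ours:

```python
def response_time_ratio(tasks, containers):
    """Average instruction response time ratio over all tasks.
    Each task: {"data": D_i, "instr": L_i, "container": m}.
    Each container: {"net_speed": v_m, "cpu_util": u_m,
                     "exec_speed": s_m, "queue_instr": [L_k, ...]}."""
    ratios = []
    for t in tasks:
        c = containers[t["container"]]
        free = 1.0 - c["cpu_util"]                  # available CPU fraction
        t_exec = t["instr"] / (free * c["exec_speed"])
        t_trans = t["data"] / c["net_speed"]
        # Waiting time: total execution time of the C not-yet-executed tasks.
        t_wait = sum(lk / (free * c["exec_speed"]) for lk in c["queue_instr"])
        ratios.append((t_trans + t_wait + t_exec) / t_exec)
    return sum(ratios) / len(ratios)
```

A ratio of 1 means the task spent no time transmitting or queuing; larger values mean worse perceived QoS.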
Further, the calculation formula of the optimization target of the load decision model is as follows:

$$\min F=w_1\,\bar{R}+w_2\,\bar{\sigma}$$

wherein $w_1$ is the weight of the task average instruction response time ratio, and $w_2$ is the weight of the average value of the CPU utilization standard deviations.
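The optimization target combines the two metrics as a weighted sum; a trivial sketch (the weight values shown are arbitrary defaults, not fixed by the patent):

```python
def optimization_target(avg_response_ratio, imbalance, w1=0.5, w2=0.5):
    """Objective the load decision model minimizes:
    w1 * task average instruction response time ratio (user QoS term)
    + w2 * mean CPU-utilization standard deviation (node unbalance term)."""
    return w1 * avg_response_ratio + w2 * imbalance
```

Tuning w1 upward favors user-perceived responsiveness; tuning w2 upward favors evenness of node load.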
In a second aspect, the present invention provides a dynamic load balancing device for a Docker cloud platform, including:
the acquisition module is used for acquiring load data of the plurality of nodes and the container;
the first judging module is connected with the acquiring module and is used for judging whether the load data meet a preset load condition according to the load data of the container;
the expansion and reduction module is connected with the first judging module and is used for expanding or reducing the container to meet the preset load condition if the load data does not meet the preset load condition;
the second judging module is connected with the expansion and reduction module and is used for obtaining the load unbalance degree among the nodes according to the load data of all the nodes when the load data of all the containers meet the preset load condition and judging whether the load unbalance degree among the nodes exceeds a preset load balance threshold value or not;
the input model module is connected with the second judging module and is used for inputting the load data of the plurality of nodes and the containers into a load decision model to obtain a load balancing strategy if the load unbalance degree among the nodes exceeds the preset load balancing threshold;
and the scheduling module is connected with the input model module and used for scheduling the container resources according to the load balancing strategy.
In a third aspect, the present invention provides a dynamic load balancing device for a Docker cloud platform, including a memory and a processor, where the memory stores a computer program, and the processor is configured to run the computer program to implement the dynamic load balancing method for a Docker cloud platform in the first aspect.
According to the dynamic load balancing method and device for the Docker cloud platform, load data of a plurality of nodes and containers are obtained; then, for the load data of each container: judging whether the load data meet a preset load condition according to the load data of the container; if the load data does not meet the preset load condition, expanding or shrinking the container to meet the preset load condition; when the load data of all the containers meet the preset load condition, obtaining the load unbalance degree among the nodes according to the load data of all the nodes, and judging whether the load unbalance degree among the nodes exceeds a preset load balance threshold; if the load unbalance degree among the nodes exceeds the preset load balance threshold, inputting load data of the nodes and the containers into a load decision model to obtain a load balancing strategy, wherein the load decision model is a model constructed based on an asynchronous advantage actor-critic (A3C) algorithm network, and the optimization target of the load decision model is determined by the task average instruction response time ratio of the user quality of service (QoS) and the load unbalance degree among the nodes; and finally, scheduling the container resources according to the load balancing strategy.
The invention sets three thresholds and performs reasonable resource scheduling using a deep reinforcement learning algorithm whose optimization target is jointly formed by the load unbalance degree among the nodes and the users' QoS. It can carry out efficient resource scheduling in time according to load changes of the working system, thereby effectively handling load imbalance in real network environments, and solves the prior-art problems that load balancing considers only a single index and uses heuristic algorithm calculation with poor scalability, making it difficult to cope with real network environments.
Drawings
Fig. 1 is a flowchart of a dynamic load balancing method of a Docker cloud platform according to embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of a dynamic load balancing architecture according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an Actor network model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a Critic network model according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a dynamic load balancing device for a Docker cloud platform according to embodiment 2 of the present invention;
fig. 6 is a schematic structural diagram of a dynamic load balancing device for a Docker cloud platform according to embodiment 3 of the present invention.
Detailed Description
In order to make the technical scheme of the present invention better understood by those skilled in the art, the following detailed description of the embodiments of the present invention will be given with reference to the accompanying drawings.
It is to be understood that the specific embodiments and figures described herein are merely illustrative of the invention, and are not limiting of the invention.
It is to be understood that the various embodiments of the invention and the features of the embodiments may be combined with each other without conflict.
It is to be understood that only the portions relevant to the present invention are shown in the drawings for convenience of description, and the portions irrelevant to the present invention are not shown in the drawings.
It should be understood that each unit and module in the embodiments of the present invention may correspond to only one physical structure, may be formed by a plurality of physical structures, or may be integrated into one physical structure.
It will be appreciated that, without conflict, the functions and steps noted in the flowcharts and block diagrams of the present invention may occur out of the order noted in the figures.
It is to be understood that the flowcharts and block diagrams of the present invention illustrate the architecture, functionality, and operation of possible implementations of systems, apparatuses, devices, methods according to various embodiments of the present invention. Where each block in the flowchart or block diagrams may represent a unit, module, segment, code, or the like, which comprises executable instructions for implementing the specified functions. Moreover, each block or combination of blocks in the block diagrams and flowchart illustrations can be implemented by hardware-based systems that perform the specified functions, or by combinations of hardware and computer instructions.
It should be understood that the units and modules related in the embodiments of the present invention may be implemented by software, or may be implemented by hardware, for example, the units and modules may be located in a processor.
Example 1:
the embodiment provides a dynamic load balancing method for a Docker cloud platform, as shown in fig. 1, which comprises the following steps:
step S101: load data of a plurality of nodes and containers is acquired.
It should be noted that, as shown in fig. 2, the dynamic load balancing method for the Docker cloud platform provided by this embodiment is applied to the dynamic load balancing architecture provided by this embodiment, whose overall structure is formed by two parts: the Docker cluster and the logic layer.
Specifically, the logic layer is the core of the realization of the whole dynamic load strategy and comprises a container resource acquisition module, a container data storage module and a container resource scheduling module. The container resource acquisition module is mainly responsible for collecting and transmitting the usage of various resources of the physical hosts and the Docker containers (i.e. the load data of the plurality of nodes and containers), including the CPU (Central Processing Unit), memory, network, IO (Input/Output), and the like. The container data storage module is responsible for storing the acquired data into a database, sorting the data into a specified format, and sending the data to the container resource scheduling module. The container resource scheduling module is the core of the whole logic layer and is mainly responsible for the expansion and contraction of containers and the realization of load balancing among nodes.
It should be noted that the heuristic algorithms commonly used at present must readjust their parameters and re-execute complex calculations whenever the network state changes, and tend to fall into local optima. To address this, the container resource scheduling module integrates a deep reinforcement learning algorithm, which can directly obtain the optimal strategy for container resource scheduling without repeating complex calculations, finally realizing the optimal allocation of container resources.
Specifically, load data of a host, a plurality of nodes and a container are obtained through a container resource acquisition module, wherein the load data of the host is mainly used for judging whether the container can be created.
Step S102: load data for each of the containers:
and judging whether the load data meets a preset load condition according to the load data of the container.
Specifically, if the load data is greater than a preset overload threshold or less than a preset underload threshold, judging that the load data does not meet the preset load condition, and if the load data is not greater than the preset overload threshold and not less than the preset underload threshold, judging that the load data meets the preset load condition.
Step S103: and if the load data does not meet the preset load condition, expanding or shrinking the container to meet the preset load condition.
Specifically, if the load data is greater than a preset overload threshold, the container is expanded, and if the load data is less than a preset underload threshold, the container is contracted.
Step S104: when the load data of all the containers meet the preset load condition, obtaining the load unbalance degree among the nodes according to the load data of all the nodes, and judging whether the load unbalance degree among the nodes exceeds a preset load balance threshold value.
Specifically, when all the container load data are not greater than a preset overload threshold and not less than a preset underload threshold, calculating to obtain the load imbalance degree (namely the load imbalance degree among the nodes) of the Docker container cluster according to the load data of all the nodes, and judging whether the load imbalance degree of the Docker container cluster exceeds the preset load balance threshold.
In an optional embodiment, the obtaining the load imbalance degree between the nodes according to the load data of all the nodes specifically includes:
for all nodes at each of the T times, the following steps are performed:
acquiring the average central processing unit (CPU) utilization of all the nodes;
calculating the CPU utilization standard deviation of all the nodes according to the CPU average utilization;
and calculating the average value of the CPU utilization standard deviations of all the nodes at T moments, and taking the average value of the CPU utilization standard deviations of all the nodes as the load unbalance degree among the nodes.
Specifically, the calculation formula of the average CPU utilization of all the nodes is as follows:

$$\bar{u}_t=\frac{1}{N}\sum_{n=1}^{N}\sum_{m=1}^{M}u_{m,t}\,a_{mn,t}$$

wherein $\bar{u}_t$ represents the average CPU utilization of all nodes at time t, M is the number of containers, N is the number of nodes, T is the number of moments, $u_{m,t}$ is the CPU utilization of container m at time t, and $a_{mn,t}$ is the position relation between container m and node n at time t, whose value is 1 when container m is located on node n and 0 otherwise;

the calculation formula of the CPU utilization standard deviation of all the nodes is as follows:

$$\sigma_t=\sqrt{\frac{1}{N}\sum_{n=1}^{N}\Bigl(\sum_{m=1}^{M}u_{m,t}\,a_{mn,t}-\bar{u}_t\Bigr)^{2}}$$

wherein $\sigma_t$ is the CPU utilization standard deviation of all nodes at time t;

the calculation formula of the average value of the CPU utilization standard deviations of all the nodes at the T moments is as follows:

$$\bar{\sigma}=\frac{1}{T}\sum_{t=1}^{T}\sigma_t$$

wherein $\bar{\sigma}$ is the average value of the CPU utilization standard deviations of all nodes over the T moments.
Step S105: if the load unbalance degree among the nodes exceeds the preset load balance threshold, inputting the load data of the nodes and the containers into a load decision model to obtain a load balancing strategy, wherein the load decision model is a model constructed based on an asynchronous advantage actor-critic (A3C) algorithm network, and the optimization target of the load decision model is determined by the task average instruction response time ratio of the user quality of service (QoS) and the load unbalance degree among the nodes.
Specifically, if the load imbalance degree of the Docker container cluster exceeds a preset load balancing threshold, inputting network states corresponding to load data of a plurality of nodes and containers into a trained load decision model, and outputting an optimal strategy (namely a load balancing strategy) of load balancing by the load decision model.
It should be noted that the A3C (Asynchronous Advantage Actor-Critic) algorithm is mainly responsible for learning a decision model that holds load balancing decision knowledge (i.e. the load decision model described above). The model comprehensively considers the load unbalance degree among the nodes and the users' QoS. Given that the number of containers is M and the number of nodes is N, the state space of the load decision model (i.e. the network state) can be represented by an M×N two-dimensional matrix A, where A_mn = 1 when container m is located on node n and A_mn = 0 otherwise. The action space of the load decision model (i.e. the load balancing strategy described above) represents the allocation of containers to virtual machines, with action_mn representing the migration of container m to node n:
the value of the reward function of the load decision model is related to the container cluster state, when a certain action is executed, the current container cluster state is good, the positive reward is given, otherwise, the negative reward is given:
wherein, the liquid crystal display device comprises a liquid crystal display device,for rewarding function->And->Is of different prize values, and +.>>/>>0,/>Is the network state after the container has migrated.
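The state matrix, migration action, and reward described above can be illustrated as follows. The concrete values of r1 and r2 and the use of a decrease in unbalance degree as the "state improves" test are our assumptions; the patent only requires r1 > r2 > 0:

```python
def migrate(state, m, n):
    """Apply action_{mn}: move container m to node n in the M x N
    placement matrix A (state[m][n] == 1 iff container m is on node n)."""
    new_state = [row[:] for row in state]   # leave the input state intact
    new_state[m] = [0] * len(state[m])
    new_state[m][n] = 1
    return new_state

def reward(imbalance_before, imbalance_after, r1=1.0, r2=0.5):
    """Positive reward r1 when the migration improves the cluster state
    (here: the unbalance degree decreases), negative reward -r2 otherwise,
    with r1 > r2 > 0 as the patent requires."""
    return r1 if imbalance_after < imbalance_before else -r2
```

In training, each worker thread would evaluate `migrate` candidates against the current state matrix and feed the resulting reward back into the A3C update.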
Specifically, the A3C algorithm trains the Actor-Critic in a plurality of parallel threads. The Actor network model is designed as shown in fig. 3: its input is the two-dimensional matrix of the state space, and its output is the probability of each action in the action space under the current state. The Critic network model is designed as shown in fig. 4 and specifically comprises an input layer, a first convolution layer, a second convolution layer, a Flatten layer, and a fully connected layer connected in sequence. In the Critic network model, the input is still the two-dimensional matrix, and the output is a value evaluation of performing a specified action; the Critic network structure is therefore approximately the same as the Actor network structure, except that its output layer has only one neuron.
In an optional embodiment, before the load data of the plurality of nodes and the container is input into the load decision model to obtain the load balancing policy, the method further includes:
for each task in all containers, the following steps are performed:
acquiring the transmission time of the task to the corresponding container;
calculating the waiting time of the task after reaching the corresponding container according to the transmission time;
acquiring the execution time of the task in the corresponding container;
and calculating the average instruction response time ratio of all tasks in all containers according to the transmission time, the waiting time and the execution time of each task.
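A minimal sketch of these steps. The response time ratio of a task is taken as (transmission + waiting + execution) / execution, and the execution-time model used here (instruction length over remaining capacity (1 − u)·v) is an assumption of this sketch:

```python
def transmission_time(data_amount, net_speed):
    # Time to move the task's data to its container.
    return data_amount / net_speed

def execution_time(instr_len, cpu_util, exec_speed):
    # Assumed model: the busier the container, the less capacity remains.
    return instr_len / ((1.0 - cpu_util) * exec_speed)

def avg_response_time_ratio(tasks):
    # tasks: list of dicts with "trans", "wait", "exec" times already computed
    ratios = [(t["trans"] + t["wait"] + t["exec"]) / t["exec"] for t in tasks]
    return sum(ratios) / len(ratios)

# Two illustrative tasks
t1 = {"trans": 2.0, "wait": 0.0, "exec": 4.0}   # ratio 1.5
t2 = {"trans": 1.0, "wait": 4.0, "exec": 5.0}   # ratio 2.0
print(avg_response_time_ratio([t1, t2]))  # 1.75
```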
Specifically, the calculation formula of the transmission time of the task to the corresponding container is:
T_{i,m}^{trans} = D_i / V_m ,

wherein T_{i,m}^{trans} is the data transmission time of task i to container m, D_i is the data transmission quantity of task i, and V_m is the network transmission speed to container m;
the calculation formula of the waiting time after the task reaches the corresponding container is as follows:
T_{i,m}^{wait} = Σ_{k=1}^{C} T_{k,m}^{exec} ,

wherein T_{i,m}^{wait} is the waiting time after task i reaches container m, k = 1, …, C indexes the C tasks that container m has not yet executed, and T_{k,m}^{exec} is the execution time of task k in container m;
the calculation formula of the execution time of the task in the corresponding container is as follows:
T_{i,m}^{exec} = L_i / ((1 − u_m)·v_m) ,

wherein T_{i,m}^{exec} is the execution time of task i in container m, L_i is the instruction length of task i, u_m is the CPU utilization of container m, and v_m is the instruction execution speed of container m;
the calculation formula of the average instruction response time ratio of all tasks in all containers is as follows:
W = (1/I) Σ_{i=1}^{I} (T_{i,m}^{trans} + T_{i,m}^{wait} + T_{i,m}^{exec}) / T_{i,m}^{exec} ,

wherein W is the average instruction response time ratio of a total of I tasks within the M containers, I is the number of tasks, M is the number of containers, and m denotes the container serving task i.
In an alternative embodiment, the calculation formula of the optimization target of the load decision model is:
F = w_1·W + w_2·σ̄ ,

wherein w_1 is the weight of the task average instruction response time ratio W, and w_2 is the weight of the average value σ̄ of the CPU utilization standard deviation.
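Assuming a weighted-sum form for the optimization target (a QoS term plus a load-imbalance term), a one-line sketch with illustrative weights w1 = w2 = 0.5:

```python
def objective(rtr, util_std_mean, w1=0.5, w2=0.5):
    # Weighted sum of user QoS (average instruction response time ratio)
    # and inter-node imbalance (mean CPU-utilization standard deviation).
    return w1 * rtr + w2 * util_std_mean

print(objective(1.75, 0.2))  # ≈ 0.975
```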
Step S106: and scheduling the container resources according to the load balancing strategy.
Specifically, according to a load balancing strategy, a container to be scheduled and a target node are selected, and the container to be scheduled is migrated and scheduled to the target node.
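The patent leaves the choice of container and target node to the learned policy; a simple illustrative stand-in is a greedy rule that moves the heaviest container on the most loaded node to the least loaded node:

```python
import numpy as np

def greedy_migration(A, u):
    # A: M x N placement matrix, u: per-container CPU utilization.
    # Returns (container, source_node, target_node); a stand-in for the
    # (container to be scheduled, target node) pair chosen by the model.
    node_util = A.T @ u                    # per-node load
    src = int(node_util.argmax())          # most loaded node
    dst = int(node_util.argmin())          # least loaded node
    on_src = np.where(A[:, src] == 1)[0]
    c = int(on_src[u[on_src].argmax()])    # heaviest container on the hot node
    return c, src, dst

A = np.array([[1, 0], [1, 0], [0, 1]])
u = np.array([0.6, 0.3, 0.1])
print(greedy_migration(A, u))  # (0, 0, 1)
```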
The dynamic load balancing method for a Docker cloud platform provided by the embodiment of the invention first obtains load data of a plurality of nodes and containers; then, for the load data of each container, judges whether the load data meets a preset load condition; if the load data does not meet the preset load condition, expands or shrinks the container so as to meet the preset load condition; when the load data of all the containers meets the preset load condition, obtains the load imbalance degree among the nodes according to the load data of all the nodes, and judges whether the load imbalance degree among the nodes exceeds a preset load balance threshold; if the load imbalance degree among the nodes exceeds the preset load balance threshold, inputs the load data of the plurality of nodes and containers into a load decision model to obtain a load balancing strategy, wherein the load decision model is a model constructed based on the asynchronous advantage actor-critic (A3C) algorithm network, and its optimization target is determined by the task average instruction response time ratio of the user quality of service (QoS) and the load imbalance degree among the nodes; finally, schedules the container resources according to the load balancing strategy.
The invention sets three thresholds and uses a deep reinforcement learning algorithm, whose optimization target is jointly formed by the load imbalance degree among the nodes and the QoS of users, to carry out reasonable resource scheduling. It can perform efficient resource scheduling in time according to the load changes of the working system, thereby effectively handling load imbalance in a real network environment and solving the problem that the prior art realizes load balancing through heuristic algorithms that consider only a single index and have poor scalability, which makes it difficult to cope with real network environments.
Example 2:
as shown in fig. 5, this embodiment provides a dynamic load balancing device for a Docker cloud platform, configured to execute the above dynamic load balancing method for a Docker cloud platform, the device including:
an acquisition module 11 for acquiring load data of a plurality of nodes and containers;
a first judging module 12, connected to the acquiring module 11, for judging whether the load data meets a preset load condition according to the load data of the container;
an expansion and contraction module 13, connected to the first judging module 12, configured to expand or contract a container to meet a preset load condition if the load data does not meet the preset load condition;
the second judging module 14 is connected with the expansion and contraction module 13, and is configured to obtain a load imbalance degree between nodes according to the load data of all the nodes when the load data of all the containers meet the preset load condition, and judge whether the load imbalance degree between the nodes exceeds a preset load balance threshold;
the input model module 15 is connected with the second judging module 14, and is configured to input the load data of the plurality of nodes and the container into a load decision model if the load imbalance degree between the nodes exceeds the preset load balancing threshold value, so as to obtain a load balancing strategy;
and the scheduling module 16 is connected with the input model module 15 and is used for scheduling the container resources according to the load balancing strategy.
Further, the first judging module 12 specifically includes:
the first judging unit is used for judging that the load data does not meet the preset load condition if the load data is larger than a preset overload threshold or smaller than a preset underload threshold;
and the second judging unit is used for judging that the load data meets the preset load condition if the load data is not greater than a preset overload threshold value and not less than a preset underload threshold value.
Further, the expansion and reduction module 13 specifically includes:
the expansion unit is used for expanding the container if the load data is larger than a preset overload threshold value;
and the reduction unit is used for reducing the container if the load data is smaller than a preset underload threshold value.
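The expansion and reduction units amount to a three-way threshold test on each container's load; a sketch with illustrative thresholds 0.8 (overload) and 0.2 (underload):

```python
def scaling_decision(load, overload=0.8, underload=0.2):
    # Returns "expand", "shrink", or "keep" for one container's load value.
    if load > overload:
        return "expand"   # load data above the preset overload threshold
    if load < underload:
        return "shrink"   # load data below the preset underload threshold
    return "keep"         # preset load condition is met

print([scaling_decision(x) for x in (0.9, 0.1, 0.5)])  # ['expand', 'shrink', 'keep']
```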
Further, the second judging module 14 specifically includes:
the acquisition unit is used for acquiring the average utilization rate of the CPU of all the nodes;
the first calculation unit is used for calculating the CPU utilization standard deviation of all the nodes according to the CPU average utilization;
and the second calculation unit is used for calculating the average value of the CPU utilization standard deviations of all the nodes at T moments and taking the average value of the CPU utilization standard deviations of all the nodes as the load unbalance degree among the nodes.
Further, the calculation formula of the average CPU utilization of all the nodes is as follows:
Ū_t = (1/N) Σ_{n=1}^{N} U_{n,t} , with U_{n,t} = Σ_{m=1}^{M} u_{m,t}·a_{mn,t} ,

wherein Ū_t is the average CPU utilization of all nodes at time t, U_{n,t} is the CPU utilization of node n at time t, M is the number of containers, N is the number of nodes, T is the number of time instants, u_{m,t} is the CPU utilization of container m at time t, and a_{mn,t} is the position relation between container m and node n at time t (a_{mn,t} = 1 when container m is located at node n, otherwise a_{mn,t} = 0);
the calculation formula of the CPU utilization standard deviation of all the nodes is as follows:
σ_t = sqrt( (1/N) Σ_{n=1}^{N} (U_{n,t} − Ū_t)² ) ,

wherein σ_t is the CPU utilization standard deviation of all nodes at time t;
the calculation formula of the average value of the standard deviation of the CPU utilization rates of all the nodes at the T moments is as follows:
σ̄ = (1/T) Σ_{t=1}^{T} σ_t ,

wherein σ̄ is the average value of the CPU utilization standard deviation of all nodes over the T time instants.
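Given per-node utilization samples over T time instants, the load imbalance degree (the mean over time of the per-instant standard deviation across nodes) can be sketched as:

```python
import numpy as np

def load_imbalance(U):
    # U: T x N matrix of per-node CPU utilization at T time instants.
    # Per-instant std across nodes, then averaged over time.
    return float(U.std(axis=1).mean())

U = np.array([[0.2, 0.8],   # unbalanced instant: std 0.3
              [0.5, 0.5]])  # balanced instant:   std 0.0
print(load_imbalance(U))    # ≈ 0.15
```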
Further, the apparatus further comprises:
the transmission time acquisition module is connected with the input model module 15 and is used for acquiring the transmission time of the task to the corresponding container;
the waiting time calculation module is connected with the transmission time acquisition module and is used for calculating the waiting time of the task after reaching the corresponding container according to the transmission time;
the execution time acquisition module is connected with the waiting time calculation module and is used for acquiring the execution time of the task in the corresponding container;
and the third calculation unit is connected with the execution time acquisition module and is used for calculating the average instruction response time ratio of all tasks in all containers according to the transmission time, the waiting time and the execution time.
Further, the calculation formula of the transmission time of the task to the corresponding container is as follows:
T_{i,m}^{trans} = D_i / V_m ,

wherein T_{i,m}^{trans} is the data transmission time of task i to container m, D_i is the data transmission quantity of task i, and V_m is the network transmission speed to container m;
the calculation formula of the waiting time after the task reaches the corresponding container is as follows:
T_{i,m}^{wait} = Σ_{k=1}^{C} T_{k,m}^{exec} ,

wherein T_{i,m}^{wait} is the waiting time after task i reaches container m, k = 1, …, C indexes the C tasks that container m has not yet executed, and T_{k,m}^{exec} is the execution time of task k in container m;
the calculation formula of the execution time of the task in the corresponding container is as follows:
T_{i,m}^{exec} = L_i / ((1 − u_m)·v_m) ,

wherein T_{i,m}^{exec} is the execution time of task i in container m, L_i is the instruction length of task i, u_m is the CPU utilization of container m, and v_m is the instruction execution speed of container m;
the calculation formula of the average instruction response time ratio of all tasks in all containers is as follows:
W = (1/I) Σ_{i=1}^{I} (T_{i,m}^{trans} + T_{i,m}^{wait} + T_{i,m}^{exec}) / T_{i,m}^{exec} ,

wherein W is the average instruction response time ratio of a total of I tasks within the M containers, I is the number of tasks, M is the number of containers, and m denotes the container serving task i.
Further, the calculation formula of the optimization target of the load decision model is as follows:
F = w_1·W + w_2·σ̄ ,

wherein w_1 is the weight of the task average instruction response time ratio W, and w_2 is the weight of the average value σ̄ of the CPU utilization standard deviation.
Example 3:
referring to fig. 6, the present embodiment provides a dynamic load balancing apparatus for a dock cloud platform, including a memory 21 and a processor 22, where the memory 21 stores a computer program, and the processor 22 is configured to run the computer program to execute the dynamic load balancing method for the dock cloud platform in embodiment 1.
The memory 21 is connected to the processor 22, the memory 21 may be a flash memory, a read-only memory, or other memories, and the processor 22 may be a central processing unit or a single chip microcomputer.
The dynamic load balancing method and device for a Docker cloud platform provided in embodiments 2 to 3 first obtain load data of a plurality of nodes and containers; then, for the load data of each container, judge whether the load data meets a preset load condition; if the load data does not meet the preset load condition, expand or shrink the container so as to meet the preset load condition; when the load data of all the containers meets the preset load condition, obtain the load imbalance degree among the nodes according to the load data of all the nodes, and judge whether the load imbalance degree among the nodes exceeds a preset load balance threshold; if the load imbalance degree among the nodes exceeds the preset load balance threshold, input the load data of the plurality of nodes and containers into a load decision model to obtain a load balancing strategy, wherein the load decision model is a model constructed based on the asynchronous advantage actor-critic (A3C) algorithm network, and its optimization target is determined by the task average instruction response time ratio of the user quality of service (QoS) and the load imbalance degree among the nodes; finally, schedule the container resources according to the load balancing strategy.
The invention sets three thresholds and uses a deep reinforcement learning algorithm, whose optimization target is jointly formed by the load imbalance degree among the nodes and the QoS of users, to carry out reasonable resource scheduling. It can perform efficient resource scheduling in time according to the load changes of the working system, thereby effectively handling load imbalance in a real network environment and solving the problem that the prior art realizes load balancing through heuristic algorithms that consider only a single index and have poor scalability, which makes it difficult to cope with real network environments.
It is to be understood that the above embodiments are merely illustrative of the application of the principles of the present invention, but not in limitation thereof. Various modifications and improvements may be made by those skilled in the art without departing from the spirit and substance of the invention, and are also considered to be within the scope of the invention.
Claims (10)
1. A dynamic load balancing method for a Docker cloud platform, the method comprising:
load data of a plurality of nodes and containers are obtained;
load data for each of the containers:
judging whether the load data meet a preset load condition according to the load data of the container;
if the load data does not meet the preset load condition, expanding or shrinking the container to meet the preset load condition;
when the load data of all the containers meet the preset load condition, obtaining the load unbalance degree among the nodes according to the load data of all the nodes, and judging whether the load unbalance degree among the nodes exceeds a preset load balance threshold value or not;
if the load unbalance degree among the nodes exceeds the preset load balance threshold, inputting the load data of the plurality of nodes and the containers into a load decision model to obtain a load balancing strategy, wherein the load decision model is a model constructed based on an asynchronous advantage actor-critic (A3C) algorithm network, and the optimization target of the load decision model is determined by the task average instruction response time ratio of the user quality of service QoS and the load unbalance degree among the nodes;
and scheduling the container resources according to the load balancing strategy.
2. The method according to claim 1, wherein the determining, according to the load data, whether the load data meets a preset load condition specifically includes:
if the load data is larger than a preset overload threshold or smaller than a preset underload threshold, judging that the load data does not meet the preset load condition;
and if the load data is not greater than a preset overload threshold value and is not less than a preset underload threshold value, judging that the load data meets the preset load condition.
3. The method according to claim 2, wherein expanding or contracting the container if the load data does not meet a preset load condition, specifically comprises:
if the load data is larger than a preset overload threshold, expanding the container;
and if the load data is smaller than a preset underload threshold value, the container is reduced.
4. The method according to claim 1, wherein the obtaining the load imbalance degree between the nodes according to the load data of all the nodes specifically includes:
for all nodes at each of the T times, the following steps are performed:
acquiring the average utilization rate of CPU of the central processing units of all the nodes;
calculating the CPU utilization standard deviation of all the nodes according to the CPU average utilization;
and calculating the average value of the CPU utilization standard deviations of all the nodes at T moments, and taking the average value of the CPU utilization standard deviations of all the nodes as the load unbalance degree among the nodes.
5. The method of claim 4, wherein the calculation formula of the average CPU utilization of all nodes is:
Ū_t = (1/N) Σ_{n=1}^{N} U_{n,t} , with U_{n,t} = Σ_{m=1}^{M} u_{m,t}·a_{mn,t} ,

wherein Ū_t is the average CPU utilization of all nodes at time t, U_{n,t} is the CPU utilization of node n at time t, M is the number of containers, N is the number of nodes, T is the number of time instants, u_{m,t} is the CPU utilization of container m at time t, and a_{mn,t} is the position relation between container m and node n at time t (a_{mn,t} = 1 when container m is located at node n, otherwise a_{mn,t} = 0);
the calculation formula of the CPU utilization standard deviation of all the nodes is as follows:
σ_t = sqrt( (1/N) Σ_{n=1}^{N} (U_{n,t} − Ū_t)² ) ,

wherein σ_t is the CPU utilization standard deviation of all nodes at time t;
the calculation formula of the average value of the standard deviation of the CPU utilization rates of all the nodes at the T moments is as follows:
σ̄ = (1/T) Σ_{t=1}^{T} σ_t ,

wherein σ̄ is the average value of the CPU utilization standard deviation of all nodes over the T time instants.
6. The method of claim 1, wherein before inputting the load data of the plurality of nodes and containers into the load decision model to derive the load balancing policy, the method further comprises:
for each task in all containers, the following steps are performed:
acquiring the transmission time of the task to the corresponding container;
calculating the waiting time of the task after reaching the corresponding container according to the transmission time;
acquiring the execution time of the task in the corresponding container;
and calculating the average instruction response time ratio of all tasks in all containers according to the transmission time, the waiting time and the execution time of each task.
7. The method of claim 1, wherein the calculation formula of the transmission time of the task to the corresponding container is:
T_{i,m}^{trans} = D_i / V_m ,

wherein T_{i,m}^{trans} is the data transmission time of task i to container m, D_i is the data transmission quantity of task i, and V_m is the network transmission speed to container m;
the calculation formula of the waiting time after the task reaches the corresponding container is as follows:
T_{i,m}^{wait} = Σ_{k=1}^{C} T_{k,m}^{exec} ,

wherein T_{i,m}^{wait} is the waiting time after task i reaches container m, k = 1, …, C indexes the C tasks that container m has not yet executed, and T_{k,m}^{exec} is the execution time of task k in container m;
the calculation formula of the execution time of the task in the corresponding container is as follows:
T_{i,m}^{exec} = L_i / ((1 − u_m)·v_m) ,

wherein T_{i,m}^{exec} is the execution time of task i in container m, L_i is the instruction length of task i, u_m is the CPU utilization of container m, and v_m is the instruction execution speed of container m;
the calculation formula of the average instruction response time ratio of all tasks in all containers is as follows:
W = (1/I) Σ_{i=1}^{I} (T_{i,m}^{trans} + T_{i,m}^{wait} + T_{i,m}^{exec}) / T_{i,m}^{exec} ,

wherein W is the average instruction response time ratio of a total of I tasks within the M containers, I is the number of tasks, M is the number of containers, and m denotes the container serving task i.
8. The method according to claim 5 or 7, wherein the calculation formula of the optimization objective of the load decision model is:
F = w_1·W + w_2·σ̄ ,

wherein w_1 is the weight of the task average instruction response time ratio W, and w_2 is the weight of the average value σ̄ of the CPU utilization standard deviation.
9. A dynamic load balancing device for a Docker cloud platform, characterized in that it comprises:
the acquisition module is used for acquiring load data of the plurality of nodes and the container;
the first judging module is connected with the acquiring module and is used for judging whether the load data meet a preset load condition according to the load data of the container;
the expansion and reduction module is connected with the first judging module and is used for expanding or reducing the container to meet the preset load condition if the load data does not meet the preset load condition;
the second judging module is connected with the expansion and reduction module and is used for obtaining the load unbalance degree among the nodes according to the load data of all the nodes when the load data of all the containers meet the preset load condition and judging whether the load unbalance degree among the nodes exceeds a preset load balance threshold value or not;
the input model module is connected with the second judging module and is used for inputting the load data of the plurality of nodes and the containers into a load decision model to obtain a load balancing strategy if the load unbalance degree among the nodes exceeds the preset load balancing threshold;
and the scheduling module is connected with the input model module and used for scheduling the container resources according to the load balancing strategy.
10. A Docker cloud platform dynamic load balancing device, comprising a memory and a processor, the memory having stored therein a computer program, the processor being arranged to run the computer program to implement the Docker cloud platform dynamic load balancing method of any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310821239.4A CN116541178B (en) | 2023-07-06 | 2023-07-06 | Dynamic load balancing method and device for Docker cloud platform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116541178A true CN116541178A (en) | 2023-08-04 |
CN116541178B CN116541178B (en) | 2023-10-20 |
Family
ID=87449216
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310821239.4A Active CN116541178B (en) | 2023-07-06 | 2023-07-06 | Dynamic load balancing method and device for Docker cloud platform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116541178B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102857577A (en) * | 2012-09-24 | 2013-01-02 | 北京联创信安科技有限公司 | System and method for automatic load balancing of cluster storage |
CN104202388A (en) * | 2014-08-27 | 2014-12-10 | 福建富士通信息软件有限公司 | Automatic load balancing system based on cloud platform |
CN106998303A (en) * | 2017-03-24 | 2017-08-01 | 中国联合网络通信集团有限公司 | The load-balancing method and SiteServer LBS of routing node |
US20170279877A1 (en) * | 2016-03-28 | 2017-09-28 | Industrial Technology Research Institute | Load balancing method, load balancing system, load balancing device and topology reduction method |
CN114816723A (en) * | 2021-01-29 | 2022-07-29 | 中移(苏州)软件技术有限公司 | Load balancing system, method and computer readable storage medium |
CN115987995A (en) * | 2022-08-31 | 2023-04-18 | 兴业银行股份有限公司 | Method and system for improving node resource utilization rate balance load of kubernets |
Non-Patent Citations (1)
Title |
---|
梅荣 (MEI Rong): "Research on Elastic Load Balancing Service Based on Cloud Computing", China Public Security, No. 01 * |
Also Published As
Publication number | Publication date |
---|---|
CN116541178B (en) | 2023-10-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||