CN113946436B - Resource pre-scheduling method based on load balancing - Google Patents

Resource pre-scheduling method based on load balancing

Info

Publication number
CN113946436B
CN113946436B (application CN202110863510.1A)
Authority
CN
China
Prior art keywords
server
user
resource
time
user demand
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110863510.1A
Other languages
Chinese (zh)
Other versions
CN113946436A (en)
Inventor
高岭
朱海蓉
郭子正
向东
李妍
许佶鹏
杨旭东
郭红波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NORTHWEST UNIVERSITY
Original Assignee
NORTHWEST UNIVERSITY
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NORTHWEST UNIVERSITY
Priority to CN202110863510.1A
Publication of CN113946436A
Application granted
Publication of CN113946436B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/48: Program initiating; program switching, e.g. by interrupt
    • G06F 9/4806: Task transfer initiation or dispatching
    • G06F 9/4843: Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005: Allocation of resources to service a request
    • G06F 9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 9/5038: Allocation of resources considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F 9/505: Allocation of resources considering the load
    • G06F 9/5083: Techniques for rebalancing the load in a distributed system

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

According to the result of user demand prediction, a resource server set matched with the user demand is determined. To select a resource pre-scheduling server, the resource use data of each server are collected per operation period and the current actual utilization rate of each server is obtained from the user requests it is executing. Whether a server is in a saturated state is then judged: if so, the server with the shortest waiting time is selected as the resource pre-scheduling server; otherwise the user demand prediction result is distributed over the server set, the load variance of the servers is obtained, and the server with the smallest variance is selected to process the user demand. Finally, the path with the smallest maximum link bandwidth utilization is selected as the resource pre-scheduling link. The method selects the resource pre-scheduling server by calculating the server load variance and selects the optimal resource pre-scheduling link by the principle of minimizing the maximum bandwidth utilization (max-min), which greatly improves the efficiency of resource scheduling and solves the problem that resources cannot be fully utilized.

Description

Resource pre-scheduling method based on load balancing
Technical Field
The invention relates to the field of cloud resource scheduling, in particular to a resource pre-scheduling method based on load balancing.
Background
Cloud platforms are constructed in different application forms, and digital resources are stored on the service platform. Resource users can access the cloud platform without time or place limitation and use the cloud service platform efficiently to acquire related resources.
In recent years, public digital culture projects benefiting the people, such as the national cultural information resource sharing project, the digital library popularization project and the public electronic reading room construction plan, have been implemented; a nationwide service network has been basically established, and a large-scale cloud digital resource library group and technical support service platform represented by the national public culture cloud have been formed, laying a foundation for the digitization and networking of the public culture service system. However, national public culture cloud services still face problems: the most reasonable resource allocation method and resource scheduling strategy cannot be found according to user demands, public culture cloud resources cannot be scheduled efficiently, and the resource utilization rate is low.
Unlike traditional resource scheduling, the development of the public culture cloud environment puts more resources in the cloud, and these resources need to be allocated and used reasonably, because the efficiency of cloud resource allocation affects the user's experience of cloud computing services. During cloud resource scheduling, because the cloud computing environment is highly dynamic and heterogeneous and different clients have different demands on cloud computing resources, resources must be scheduled reasonably for global optimality and overall server load balancing, and scheduling decisions must be made in time according to real-time prediction of user demand. In addition, static resource allocation and scheduling often cause resources to be insufficient or wasted, and manual dynamic adjustment has obvious hysteresis, so the execution state of tasks must be monitored in real time and server saturation judged intelligently through the resource utilization rate. Finally, in the resource scheduling process, the resource pre-scheduling server is selected through the server load variance, and the optimal resource pre-scheduling link is selected by the principle of minimizing the maximum bandwidth utilization. This greatly improves the efficiency of resource scheduling of the whole system and solves the problem that resources cannot be fully utilized.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a resource pre-scheduling method based on load balancing, which determines a resource pre-scheduling server set according to the prediction of the resources a user will require over a future time interval and the required resource distribution data; determines the scheduling server based on server saturation and server load variance; and finally determines the optimal pre-scheduling link by judging the link conditions between the public culture cloud servers and the user-side servers. The method remedies the defects of existing resource scheduling methods, such as low resource utilization, slow scheduling and unbalanced server resource load.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
a resource pre-scheduling method based on load balancing comprises the following steps:
step 1, server set selection:
determining the resource server set matched with the user demand according to the result of the user demand prediction, using the following condition:
C_ik − Σ_{m=1}^{M} N_mi(t−1)·D_mk ≥ D_nk
wherein C_ik is the total amount of resource k on server i, N_mi(t−1) is the number of class-m user requests waiting on server i at time unit t−1, M is the upper limit of user request classes, D_mk is the amount of resource k consumed by a class-m user request, and D_nk is the amount of resource k required by user request n; a server i belongs to the set if the condition holds for every resource k;
step 2, calculating the utilization rate of the server:
monitoring equipment is arranged at each server node, the resource use data of each server are collected per operation period, the user requests being executed on each server are analyzed, and the current actual utilization rate of the server is calculated, specifically comprising the following steps:
1) According to the characteristic that public culture cloud server node data change over time, collect the resource use data per operation period and obtain the number P_i of user requests being executed on each server i;
2) Calculate the actual utilization rate η_i of each server from P_i:
η_i = P_i / P_i^max
wherein P_i^max is the maximum number of user requests that server i can execute in parallel (a positive integer), N is the total number of servers in the public culture cloud, the denominator is the maximum number of user requests the server can execute in parallel, and the numerator is the number of user demands being executed at the current stage; when η_i = 1 the server is in a saturated state; when η_i = 0 no user demand is executing on the server and it is in a non-saturated state;
step 3, selecting a resource pre-scheduling server:
1) Judge from the actual utilization rate whether each server is in a saturated state, and if a server is in a non-saturated state, select a user demand n to be allocated;
(1) from the number of class-m user demands in time t, calculate the total number of user demands to be distributed in time t:
L_m(t) = L_m(t−1) + A_m(t) − H_m(t)
L(t) = Σ_{m=1}^{M} L_m(t)
wherein L_m(t) is the number of class-m user demands in time t, A_m(t) is the number of newly arrived class-m user demands in time t, H_m(t) is the number of class-m user demands completed in time t, and L(t) is the total number of user demands to be distributed in time t;
(2) if the sum of the total number of user demands to be allocated and the class-m user demands waiting at time unit t−1 is not greater than the waiting-queue length, select as n the user demand requiring the most resources among the demands to be allocated; otherwise select as n the user demand with the largest task weight;
wherein Max_ql is the waiting-queue length, N_mi(t−1) is the number of class-m demands waiting to be allocated at time unit t−1, and W_a is the weight of task a to be assigned, determined by T_a, the time user demand a has waited in the waiting queue, and by the amount of resources required by user demand a;
2) Select the server that processes the user demand when servers are in a non-saturated state;
(1) calculate the average load of the servers when user demand n is distributed to the server set:
Avg_DC(t) = (1/N) Σ_{i=1}^{N} l_i(t)
wherein l_i(t) is the load on server i at time unit t, obtained from the class-m demands N_mi(t−1) waiting to be allocated, the amount D_mk of resource k consumed by a class-m demand, the amount D_nk of resource k required by user demand n, and the probability P_ni(t) that user demand n is allocated to server i at time unit t; Avg_DC(t) is the average load of the servers when user demand n is distributed to the server set, and N is the total number of public culture cloud servers;
(2) calculate the load variance of the servers when user demand n is distributed to each server:
Var(t) = (1/N) Σ_{i=1}^{N} (l_i(t) − Avg_DC(t))²
wherein N is the total number of public culture cloud servers, l_i(t) is the load on server i at time unit t, and Avg_DC(t) is the average load of the servers when user demand n is distributed to the server set;
(3) analyze the server loads and select the server with the smallest load variance as the server that processes the user demand;
3) If the servers are in a saturated state, obtain the waiting time of every server from the execution time of its user requests, the number of queued user requests and the number of running user requests, and select the server with the shortest waiting time to process the user demand; detect the running state of the resources with the resource monitoring equipment and obtain the corresponding parameter values;
The resource waiting time WT_i of server i is calculated as:
WT_i = (Q_i / P_i) · I_i
wherein P_i is the number of user demands the server is currently executing, Q_i is the number of queued tasks on the server, and I_i is the time the server needs to complete one user demand;
step 4, link selection:
measuring the balance of traffic in the network by the principle of minimizing the maximum bandwidth utilization (max-min), and selecting the path whose maximum link bandwidth utilization is smallest as the resource scheduling link;
the network topology from the public culture cloud to the user side is abstractly described as a directed graph G = (V, E), wherein V is the set of nodes in the network and E is the set of network links; the numbers of nodes and links are M = |V| and N = |E|; a path P from the source public culture cloud s to the destination user terminal t consists of a group of non-repeating links (l_1, l_2, l_3, …, l_n); each link l_i ∈ P corresponds to a length value, denoted len(l_i);
the sum of the length values of the links composing the path is:
len(P) = Σ_{i=1}^{n} len(l_i)
the bandwidth utilization of path P is:
U(P) = max_{l_i ∈ P} u(l_i)
wherein len(P) is the length of path P and u(l_i) is the bandwidth utilization of link l_i after the traffic passes through it.
In step 4, the link selection problem is abstracted as: in the directed graph G, find a path P from the source public culture cloud s to the destination user terminal t that, for a specified length upper bound D, satisfies:
len(P) ≤ D
the length of the path does not exceed the length bound, guaranteeing the user's quality of service, and among such paths the one with the smallest maximum link bandwidth utilization is selected as the resource scheduling link.
The beneficial effects of the invention are as follows:
the method selects a resource pre-scheduling server by calculating a server load variance, and selects an optimal resource pre-scheduling link by using a principle of minimizing maximum bandwidth utilization (max-min). The efficiency of resource scheduling is greatly improved, and the problem that resources cannot be fully utilized is solved.
Drawings
FIG. 1 is a schematic flow chart of the present invention;
fig. 2 is a schematic diagram of a resource scheduling network structure.
Detailed Description
The invention is further described below with reference to the drawings and examples.
As shown in fig. 1 and 2, a resource pre-scheduling method based on load balancing specifically includes the following steps:
step 1: and determining a resource server set matched with the user demand according to the result of the user demand prediction.
The calculation formula is as follows:
wherein C is ik N is the total amount of resource k on server i mi (t-1) is t-1 timeThe unit waits for M types of user requests on the server i, M is the upper limit of the types of the user requests, and D mk D, the amount of consumed resource k for m-class user requests nk Requesting the amount of the resource k required by n for the user;
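As a minimal Python sketch of this step-1 feasibility check (an illustrative reading, not the patented implementation: the selection inequality C_ik − Σ_m N_mi(t−1)·D_mk ≥ D_nk is assumed here because the original formula image is not reproduced in the text, and all function and variable names are invented for the example):

```python
def candidate_servers(C, N_wait, D_class, d_n):
    """Return indices of servers whose spare capacity of every
    resource k covers the amount d_n[k] required by user request n.

    C[i][k]       -- total amount of resource k on server i       (C_ik)
    N_wait[i][m]  -- class-m requests waiting on server i at t-1  (N_mi(t-1))
    D_class[m][k] -- resource k consumed by one class-m request   (D_mk)
    d_n[k]        -- resource k required by request n             (D_nk)
    """
    chosen = []
    for i, caps in enumerate(C):
        # spare[k] = C_ik minus what the waiting requests will consume
        spare = [caps[k] - sum(N_wait[i][m] * D_class[m][k]
                               for m in range(len(D_class)))
                 for k in range(len(caps))]
        # server i joins the set only if every resource k suffices
        if all(spare[k] >= d_n[k] for k in range(len(d_n))):
            chosen.append(i)
    return chosen
```

A server with ample spare capacity is kept while an overloaded one is filtered out before any scheduling decision is made.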
step 2: monitoring equipment is arranged at each server node, the resource use data of each server are collected according to the operation period, the user request which is being executed on each server is analyzed, and the current actual utilization rate of the server is calculated.
Further, the step S2 specifically includes the following steps:
s21: collecting the number P of user requests being executed on each server according to the characteristic that the public culture cloud server node data changes along with time and the use data of resources according to the operation period i
S22: according to the number P of user requests being executed by each server i Calculating the actual utilization rate eta of each server i
Wherein, the servers can jointly run P in parallel i Individual user requirements, P i More than 0 is a positive integer, N is the number of total servers in public culture cloud, the denominator represents the maximum number of user requests which can be executed by the servers in parallel, and the numerator represents the number of user requirements executed at the current stage, when eta i When=1, the representative server is in saturation; when eta i When=0, no executing user demand exists on the representative server, and the representative server is in a non-saturated state;
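The utilization and saturation test of step 2 can be sketched as follows (a hedged reading: the ratio η_i = P_i / P_i^max is assumed from the prose description of numerator and denominator, since the original formula is not preserved; the names are illustrative):

```python
def utilization(running, capacity):
    """eta_i = running user requests / max parallel capacity of the server."""
    if capacity <= 0:
        raise ValueError("capacity must be a positive integer")
    return running / capacity

def is_saturated(running, capacity):
    """A server is saturated when eta_i reaches 1 (no free parallel slot)."""
    return utilization(running, capacity) >= 1.0
```

With `utilization(0, capacity) == 0.0` the server is idle and unsaturated, matching the η_i = 0 case in the text.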
step 3: and judging the saturation state of the server according to the current actual utilization rate of the server, if the server is in the unsaturated state, calculating the load variance of the server when the user demand prediction result is distributed to the server set, and selecting the server with the smallest variance as the server for processing the user demand. If the servers are in a saturated state, the waiting time of all the servers is obtained according to the execution time of the server user requests, the queuing number of the server user requests and the operation user request number of the servers, and the server with the shortest waiting time is selected.
Further, the step S3 specifically includes the following steps:
s31: judging whether the servers are in a saturated state or not according to the actual utilization rate of each server, and selecting a user requirement n to be allocated if the servers are in a non-saturated state;
1) According to the m-class user demand number in the t time, calculating the total user demand number to be distributed in the t time, wherein the calculation formula is as follows:
L m (t)=L m (t-1)+A m (t)-H m (t)
wherein L is m (t) is the number of m-class user demands in t time, A m (t) is the number of m types of user demands newly arrived in t time, H m (t) is the number of m types of user demands completed in t time, and L (t) is the total number of user demands to be distributed in t time;
2) If the sum of the total user demand number to be allocated and the m types of user demands waiting to be allocated in the time unit t-1 is not greater than the length of the waiting queue, selecting the user demand with the largest required resource in the user demands to be allocated as n, otherwise, selecting the user demand with the largest task weight to be allocated as n;
wherein Max is ql To wait for the queue length, N mi (t-1) class m resources waiting to be allocated for t-1 time units, W a For the weight of task a to be assigned, T a For the time that the user demand a waits in the waiting sequence to be allocated,the resource amount required by the user demand a to be allocated;
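The queue update and the demand-selection rule of S31 can be sketched in Python. Note the hedges: the total L(t) = Σ_m L_m(t) is inferred from the variable definitions, and the task-weight formula is not preserved in the source, so `wait_time * resources_needed` is assumed purely for illustration; all names are invented:

```python
def update_queue_counts(L_prev, arrived, finished):
    """Per class m: L_m(t) = L_m(t-1) + A_m(t) - H_m(t); also return total L(t)."""
    L_now = [l + a - h for l, a, h in zip(L_prev, arrived, finished)]
    return L_now, sum(L_now)

def pick_demand(pending, total_to_assign, waiting_m, max_queue_len):
    """pending: list of (demand_id, resources_needed, wait_time).

    Below the queue-length threshold Max_ql pick the most resource-hungry
    demand; otherwise pick the demand with the largest weight W_a
    (assumed here to be wait_time * resources_needed).
    """
    if total_to_assign + waiting_m <= max_queue_len:
        return max(pending, key=lambda d: d[1])[0]
    return max(pending, key=lambda d: d[2] * d[1])[0]
```

Above the threshold, long-waiting demands gain weight and are served first, which prevents starvation of small requests.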
s32: selecting a server which processes the user demand when the server is in a non-saturated state;
1) The average load of each server when the regional user demand n is distributed to the server set is calculated, and the calculation formula is as follows:
wherein the method comprises the steps ofDenoted as load on t time cell server i, P ni (t) probability of allocation of user demand n to server i for t time units, D mk The amount of consumed resource k for class m user demand, D nk For the amount of resources k required by the user demand N, N mi (t-1) class m resources waiting to be allocated for t-1 time units, avg DC (t) the average load of each server when the user demand N is distributed to the server set, wherein N is the number of public culture cloud total servers;
2) When the user demand n is distributed to each server, the load variance of the server is calculated, and the calculation formula is as follows:
3) Analyzing the load of the server, and selecting the server with the smallest load variance as the server for processing the user demand;
wherein N is the number of public culture cloud total servers,denoted as t time cell serviceLoad on device i, avg DC (t) an average load for each server when the user demand n is assigned to the set of servers;
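The variance-based choice of S32 can be sketched directly from the definitions of Avg_DC(t) and the population variance over the N servers (a minimal sketch; the per-server load values are taken as given inputs, and the names are illustrative):

```python
def load_variance(loads):
    """Mean load Avg_DC(t) and population variance over the N servers."""
    n = len(loads)
    avg = sum(loads) / n
    return avg, sum((l - avg) ** 2 for l in loads) / n

def best_server(base_loads, demand_load):
    """Tentatively place demand n on each unsaturated server and keep the
    placement that yields the smallest load variance (best balance)."""
    best_i, best_var = None, float("inf")
    for i in range(len(base_loads)):
        trial = list(base_loads)
        trial[i] += demand_load      # hypothetical placement on server i
        _, var = load_variance(trial)
        if var < best_var:
            best_i, best_var = i, var
    return best_i
```

Placing the demand on the least-loaded server usually minimizes the variance, so the rule naturally steers work toward underused machines.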
s32: if the servers are in a saturated state, calculating the waiting time of all the servers, and selecting the server with the shortest waiting time to process the user demands. Detecting the running state of the resource according to the resource monitoring equipment, and simultaneously obtaining a corresponding parameter value;
the resource waiting time of the server is calculated, and the calculation formula is as follows:
wherein P is i Representing the number of user demands that the server is currently executing,indicating the task queuing number of the server, I i Indicating the time required for the server to fulfill a user's demand;
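The saturated-case tie-break can be sketched as below. The waiting-time formula WT_i = (Q_i / P_i) · I_i is an assumption (queued tasks drain through P_i parallel slots, each taking I_i), since the original formula image is not reproduced; the names are illustrative:

```python
def waiting_time(running, queued, per_demand_time):
    """Estimated wait: `queued` demands drain through `running` parallel
    slots, each demand taking `per_demand_time` (assumed WT = Q/P * I)."""
    return queued / running * per_demand_time

def shortest_wait_server(servers):
    """servers: list of (server_id, running, queued, per_demand_time).
    Return the id of the server with the shortest estimated wait."""
    return min(servers, key=lambda s: waiting_time(s[1], s[2], s[3]))[0]
```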
s4: measuring the balance degree of the flow in the network according to the principle of minimizing the maximum bandwidth utilization rate (max-min), and selecting the path with the minimum maximum link bandwidth utilization rate as a resource scheduling link;
and abstractly describing the network topology from the public culture cloud to the user side by using a directed graph G= (V, E). V denotes a set of nodes in the network and E denotes a set of network links. The number of nodes and the number of links are respectively represented by M and N, namely M= |V| and N= |E|. A path P from a source public culture cloud s to a destination user terminal t is formed by a group of non-repeated links (I 1 ,I 2 ,I 3 ,…,I n ) Composition; for the followingAll correspond to a length value, use +.>A representation;
the sum of all length values that make up the path is:
the bandwidth utilization of path P is:
where len (P) denotes the length of path P, isIndicating that traffic is passing through link l i Bandwidth utilization of the back link.
Abstracting the selection of a new link problem as: in a directed graph G, a path P from a source public culture cloud s to a destination user t is found, and an upper length limit D is specified to enable the path P to meet the following conditions:
len(P)≤D
the length of each path does not exceed the long limit, the service quality of the user is guaranteed, and the path with the smallest maximum link bandwidth utilization rate is selected as the resource scheduling link.
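The step-4 link selection (len(P) ≤ D, minimize the maximum link bandwidth utilization) can be sketched with a brute-force enumeration of loop-free paths; this is only a sketch for small topologies, not the patented algorithm, and U(P) = max u(l_i) is assumed from the "smallest maximum link bandwidth utilization" criterion:

```python
def simple_paths(graph, s, t, path=None):
    """Enumerate loop-free s->t paths in a directed graph given as
    {node: [(next_node, length, utilization), ...]}.
    Each yielded path is a list of (src, dst, length, utilization) edges."""
    path = path or [s]
    if s == t:
        yield []
        return
    for nxt, length, util in graph.get(s, []):
        if nxt in path:            # skip cycles: links must not repeat nodes
            continue
        for rest in simple_paths(graph, nxt, t, path + [nxt]):
            yield [(s, nxt, length, util)] + rest

def choose_link(graph, s, t, D):
    """Among paths with len(P) <= D, pick the one minimizing the
    maximum link bandwidth utilization U(P) = max u(l_i)."""
    best, best_u = None, float("inf")
    for p in simple_paths(graph, s, t):
        if not p:
            continue
        if sum(e[2] for e in p) > D:   # length bound guarantees QoS
            continue
        u = max(e[3] for e in p)       # bottleneck utilization of the path
        if u < best_u:
            best, best_u = p, u
    return best, best_u
```

On larger graphs a constrained shortest-path search would replace the enumeration, but the selection rule (length bound first, then the min-max bottleneck) is the same.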

Claims (2)

1. A resource pre-scheduling method based on load balancing, characterized by comprising the following steps:
step 1, server set selection:
determining the resource server set matched with the user demand according to the result of the user demand prediction, using the following condition:
C_ik − Σ_{m=1}^{M} N_mi(t−1)·D_mk ≥ D_nk
wherein C_ik is the total amount of resource k on server i, N_mi(t−1) is the number of class-m user requests waiting on server i at time unit t−1, M is the upper limit of user request classes, D_mk is the amount of resource k consumed by a class-m user request, and D_nk is the amount of resource k required by user request n; a server i belongs to the set if the condition holds for every resource k;
step 2, calculating the utilization rate of the server:
monitoring equipment is arranged at each server node, the resource use data of each server are collected per operation period, the user requests being executed on each server are analyzed, and the current actual utilization rate of the server is calculated, specifically comprising the following steps:
1) According to the characteristic that public culture cloud server node data change over time, collect the resource use data per operation period and obtain the number P_i of user requests being executed on each server i;
2) Calculate the actual utilization rate η_i of each server from P_i:
η_i = P_i / P_i^max
wherein P_i^max is the maximum number of user requests that server i can execute in parallel (a positive integer), N is the total number of servers in the public culture cloud, the denominator is the maximum number of user requests the server can execute in parallel, and the numerator is the number of user demands being executed at the current stage; when η_i = 1 the server is in a saturated state; when η_i = 0 no user demand is executing on the server and it is in a non-saturated state;
step 3, selecting a resource pre-scheduling server:
1) Judge from the actual utilization rate whether each server is in a saturated state, and if a server is in a non-saturated state, select a user demand n to be allocated;
(1) from the number of class-m user demands in time t, calculate the total number of user demands to be distributed in time t:
L_m(t) = L_m(t−1) + A_m(t) − H_m(t)
L(t) = Σ_{m=1}^{M} L_m(t)
wherein L_m(t) is the number of class-m user demands in time t, A_m(t) is the number of newly arrived class-m user demands in time t, H_m(t) is the number of class-m user demands completed in time t, and L(t) is the total number of user demands to be distributed in time t;
(2) if the sum of the total number of user demands to be allocated and the class-m user demands waiting at time unit t−1 is not greater than the waiting-queue length, select as n the user demand requiring the most resources among the demands to be allocated; otherwise select as n the user demand with the largest task weight;
wherein Max_ql is the waiting-queue length, N_mi(t−1) is the number of class-m demands waiting to be allocated at time unit t−1, and W_a is the weight of task a to be assigned, determined by T_a, the time user demand a has waited in the waiting queue, and by the amount of resources required by user demand a;
2) Select the server that processes the user demand when servers are in a non-saturated state;
(1) calculate the average load of the servers when user demand n is distributed to the server set:
Avg_DC(t) = (1/N) Σ_{i=1}^{N} l_i(t)
wherein l_i(t) is the load on server i at time unit t, obtained from the class-m demands N_mi(t−1) waiting to be allocated, the amount D_mk of resource k consumed by a class-m demand, the amount D_nk of resource k required by user demand n, and the probability P_ni(t) that user demand n is allocated to server i at time unit t; Avg_DC(t) is the average load of the servers when user demand n is distributed to the server set, and N is the total number of public culture cloud servers;
(2) calculate the load variance of the servers when user demand n is distributed to each server:
Var(t) = (1/N) Σ_{i=1}^{N} (l_i(t) − Avg_DC(t))²
wherein N is the total number of public culture cloud servers, l_i(t) is the load on server i at time unit t, and Avg_DC(t) is the average load of the servers when user demand n is distributed to the server set;
(3) analyze the server loads and select the server with the smallest load variance as the server that processes the user demand;
3) If the servers are in a saturated state, obtain the waiting time of every server from the execution time of its user requests, the number of queued user requests and the number of running user requests, and select the server with the shortest waiting time to process the user demand; detect the running state of the resources with the resource monitoring equipment and obtain the corresponding parameter values;
The resource waiting time WT_i of server i is calculated as:
WT_i = (Q_i / P_i) · I_i
wherein P_i is the number of user demands the server is currently executing, Q_i is the number of queued tasks on the server, and I_i is the time the server needs to complete one user demand;
step 4, link selection:
measuring the balance of traffic in the network by the principle of minimizing the maximum bandwidth utilization (max-min), and selecting the path whose maximum link bandwidth utilization is smallest as the resource scheduling link;
the network topology from the public culture cloud to the user side is abstractly described as a directed graph G = (V, E), wherein V is the set of nodes in the network and E is the set of network links; the numbers of nodes and links are M = |V| and N = |E|; a path P from the source public culture cloud s to the destination user terminal t consists of a group of non-repeating links (l_1, l_2, l_3, …, l_n); each link l_i ∈ P corresponds to a length value, denoted len(l_i);
the sum of the length values of the links composing the path is:
len(P) = Σ_{i=1}^{n} len(l_i)
the bandwidth utilization of path P is:
U(P) = max_{l_i ∈ P} u(l_i)
wherein len(P) is the length of path P and u(l_i) is the bandwidth utilization of link l_i after the traffic passes through it.
2. The method for pre-scheduling resources based on load balancing according to claim 1, wherein step 4 abstracts the selection of a new link as the following problem: in the directed graph G, find a path P from the source public culture cloud s to the destination user t, with a specified upper length limit D, such that path P satisfies:
len(P)≤D
the length of each path does not exceed the length limit D, so that the service quality of the user is guaranteed, and the path with the smallest maximum link bandwidth utilization is selected as the resource scheduling link.
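The constrained path selection of claim 2 can be sketched as follows. This is an illustrative sketch under stated assumptions: the adjacency-list graph encoding (edge tuples of target, length, utilization) and the exhaustive depth-first search over simple paths are implementation choices not specified by the patent, which only defines the objective (minimize the maximum link utilization subject to len(P) ≤ D).

```python
def best_path(graph, s, t, D):
    """Enumerate simple s->t paths in a directed graph and return the one
    with len(P) <= D whose maximum link bandwidth utilization is smallest.
    graph: {u: [(v, length, utilization), ...]}  (illustrative structure)"""
    best_util, best_p = float("inf"), None

    def dfs(u, visited, path, length, max_u):
        nonlocal best_util, best_p
        if length > D:              # length limit violated: prune this branch
            return
        if u == t:
            if max_u < best_util:   # smaller maximum utilization wins
                best_util, best_p = max_u, path[:]
            return
        for v, l, util in graph.get(u, []):
            if v not in visited:    # non-repeating links/nodes: simple paths only
                visited.add(v)
                path.append(v)
                dfs(v, visited, path, length + l, max(max_u, util))
                path.pop()
                visited.remove(v)

    dfs(s, {s}, [s], 0, 0.0)
    return best_p, best_util
```

On a small example with two candidate paths of equal length, the one whose busiest link is at 0.4 utilization is preferred over the one whose busiest link is at 0.9.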
CN202110863510.1A 2021-07-29 2021-07-29 Resource pre-scheduling method based on load balancing Active CN113946436B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110863510.1A CN113946436B (en) 2021-07-29 2021-07-29 Resource pre-scheduling method based on load balancing

Publications (2)

Publication Number Publication Date
CN113946436A CN113946436A (en) 2022-01-18
CN113946436B true CN113946436B (en) 2024-03-08

Family

ID=79327727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110863510.1A Active CN113946436B (en) 2021-07-29 2021-07-29 Resource pre-scheduling method based on load balancing

Country Status (1)

Country Link
CN (1) CN113946436B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115811317A (en) * 2022-10-17 2023-03-17 中国人民大学 Stream processing method and system based on self-adaptive non-decompression direct calculation
CN115617279B (en) * 2022-12-13 2023-03-31 北京中电德瑞电子科技有限公司 Distributed cloud data processing method and device and storage medium
CN116048821B (en) * 2023-04-03 2023-06-16 深圳市昊源诺信科技有限公司 High-utilization AI server and resource allocation method thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102185759A (en) * 2011-04-12 2011-09-14 Tian Wenhong Multi-physical server load equalizing method and device capable of meeting requirement characteristic
CN104023042A (en) * 2013-03-01 2014-09-03 清华大学 Cloud platform resource scheduling method
CN107770096A (en) * 2017-12-11 2018-03-06 State Grid Henan Electric Power Company Information and Communication Company A kind of SDN/NFV network dynamic resource allocation algorithms based on load balancing
CN112650579A (en) * 2020-10-30 2021-04-13 西北大学 Gridding-based cloud resource distribution scheduling method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
A cloud resource scheduling method based on load balancing; Chen Bin; Gan Maolin; Li Juan; Computer Technology and Development (Issue 06); full text *
Research on balanced scheduling methods for virtual machine resources in a cloud computing environment; Li Jihan; Li Sufen; Zhang Yunyong; Fang Bingyi; Telecommunications Science (Issue 04); full text *
Design of a Tsinghua cloud platform construction and scheduling scheme based on OpenStack; Zhao Shaoka; Li Liyao; Ling Xiao; Xu Cong; Yang Jiahai; Journal of Computer Applications (Issue 12); full text *

Also Published As

Publication number Publication date
CN113946436A (en) 2022-01-18

Similar Documents

Publication Publication Date Title
CN113946436B (en) Resource pre-scheduling method based on load balancing
Chen et al. Performance analysis and uplink scheduling for QoS-aware NB-IoT networks in mobile computing
US20170142177A1 (en) Method and system for network dispatching
CN107038071B (en) Storm task flexible scheduling algorithm based on data flow prediction
CN110990159A (en) Historical data analysis-based container cloud platform resource quota prediction method
CN109617826A (en) A kind of storm dynamic load balancing method based on cuckoo search
CN110351571A (en) Live video cloud transcoding resource allocation and dispatching method based on deeply study
CN107566535B (en) Self-adaptive load balancing method based on concurrent access timing sequence rule of Web map service
CN108021447B (en) Method and system for determining optimal resource strategy based on distributed data
WO2024094104A1 (en) Dynamic feedback weighted cloud storage resource scheduling method, apparatus and device
CN116389491B (en) Cloud edge computing power resource self-adaptive computing system
Li et al. Edge cloud resource expansion and shrinkage based on workload for minimizing the cost
CN113822456A (en) Service combination optimization deployment method based on deep reinforcement learning in cloud and mist mixed environment
Kang et al. Application of adaptive load balancing algorithm based on minimum traffic in cloud computing architecture
CN109861850A (en) A method of the stateless cloud workflow load balance scheduling based on SLA
CN105119751A (en) Service evaluation and selection method based on environment real-time perceiving
CN110990160B (en) Static security analysis container cloud elastic telescoping method based on load prediction
CN113596868A (en) 5G network slice resource management mechanism based on SDN and NFV
Zheng et al. Dynamic load balancing and pricing in grid computing with communication delay
CN117493020A (en) Method for realizing computing resource scheduling of data grid
CN110262880B (en) Distributed data center energy consumption overhead optimization-oriented job scheduling method
CN113032146A (en) Robust service supply method for multi-access edge computing environment
Milocco et al. Evaluating the upper bound of energy cost saving by proactive data center management
Li et al. Self-adaptive load-balancing strategy based on a time series pattern for concurrent user access on Web map service
CN117076117A (en) Intelligent media meeting place scheduling method and system based on new communication architecture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant