CN115242797B - Micro-service architecture-oriented client load balancing method and system - Google Patents


Info

Publication number: CN115242797B
Application number: CN202210692502.XA
Authority: CN (China)
Other versions: CN115242797A
Other languages: Chinese (zh)
Prior art keywords: service, server, utilization rate, load, client
Legal status: Active (granted)
Inventors: 吴昊, 贺小伟, 赵军壮
Assignee: NORTHWEST UNIVERSITY (original and current)
Application filed by NORTHWEST UNIVERSITY; priority to CN202210692502.XA.

Classifications

    • H04L67/1008: Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H04L41/142: Network analysis or design using statistical or mathematical methods
    • H04L41/145: Network analysis or design involving simulating, designing, planning or modelling of a network
    • H04L41/147: Network analysis or design for predicting network behaviour


Abstract

The application discloses a client load balancing method and system oriented to a micro-service architecture, comprising the following steps: after receiving service request information sent by a user, the client acquires from a registry the instance registration table of the servers deploying the service in the cluster; it then acquires the historical resource-utilization information of all servers deploying the service from the information collector module, and predicts each node's resource utilization in the next second using a server resource prediction algorithm based on an improved ARIMA model; each server performs a dynamic-weight load calculation on the historical resource utilization and the predicted next-second resource utilization to obtain its load value and returns the load-value information to the client, which compares the values and selects the server with the optimal load value to execute the request. The method and system distribute user requests reasonably and improve the utilization of cluster resources.

Description

Micro-service architecture-oriented client load balancing method and system
Technical Field
The application relates to the field of information technology, and in particular to a client load balancing method and system oriented to a micro-service architecture.
Background
With the rapid development of the Internet and its industry, Internet enterprises now largely develop systems in a micro-service architecture mode. However, as service-access concurrency grows exponentially, the load pressure on server hosts keeps increasing, and load balancing has become a focus of research for scientific researchers and Internet vendors at home and abroad. How to distribute service requests evenly among multiple servers is currently a very important research area.
Current research on load balancing algorithms falls into two categories: static and dynamic. A static load balancing algorithm schedules service nodes according to their prior information and distributes tasks without considering each node's current load; examples include the round-robin algorithm, the weighted round-robin algorithm, the zone-aware round-robin algorithm, and the random algorithm. Over time, static algorithms tend to leave the cluster's servers unevenly loaded, so cluster server resources are under-utilized and system efficiency drops. A dynamic load balancing algorithm calculates each service node's real-time load from selected load metrics, schedules the nodes accordingly, and distributes tasks; examples include the least-connections algorithm and the dynamic weighted round-robin algorithm. With dynamic algorithms, however, the load-balancing weights are often set by manual experience; when user demand changes, the weight-distribution strategy cannot be adjusted in real time, which affects response time and introduces a certain lag.
In addition, in systems built on a micro-service architecture, server-side load balancing is implemented by using a dedicated instance server as a load balancer that forwards requests. When facing a large number of user requests, the only remedy for this centralized computation is to scale up the load balancer to increase its throughput. Likewise, server-side load balancing can only be configured with a fixed policy; requests cannot be forwarded according to each server's load value unless the source code is changed, which obviously consumes considerable material and financial resources.
Therefore, how to dynamically and evenly forward service requests according to the servers' resource values in the cluster and return the final result to the client, so that the whole system can better handle high concurrency, is a problem to be solved at present.
Disclosure of Invention
The application aims to provide a client load balancing method and system oriented to a micro-service architecture, which relieve the pressure on the server; in addition, the service registration table is dynamically updated in real time, which guarantees the validity of the acquired service registration instances.
The method specifically comprises the following steps:
a method for load balancing of clients oriented to a micro-service architecture, the method comprising:
S1: acquire the service request information of a user, and the client acquires from the registry the instance registration table of the servers deploying the service in the cluster;
S2: acquire from the information collector the historical resource-utilization information of all node servers deploying the service, and predict each node server's resource utilization in the next second using an ARIMA model;
S3: each node server performs the dynamic-weight load calculation on the historical resource utilization and the predicted next-second resource utilization to obtain its load value and returns it to the client, which compares the load values and selects the node server with the minimum load value to execute the request.
Optionally, predicting the next-second resource utilization of the corresponding node with the ARIMA model includes:
(1) taking the processed data as the input data sequence, testing whether the data are stationary with the unit-root test, and, if the sequence is not stationary, differencing the sequence until the unit-root test is satisfied, thereby determining the differencing order;
(2) determining the optimal model orders from the tailing and truncation behaviour of the sample autocorrelation function (ACF) and partial autocorrelation function (PACF), and outputting the corresponding predicted values from the sample data;
(3) converting the normalized historical values and predicted values of the node server's CPU utilization and memory utilization back to actual data by inverse normalization.
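Step (1) above can be sketched in code. This is a minimal illustration, not the patent's implementation: a real system would apply a proper ADF unit-root test at each differencing step (e.g. via a statistics library), and the variance-reduction heuristic below merely stands in for that test.

```python
import numpy as np

def difference(seq):
    """First-order difference of a 1-D sequence."""
    return np.diff(seq)

def choose_diff_order(seq, max_d=2):
    """Difference the series until it looks stationary, returning the order d.

    A crude heuristic (differencing stops reducing the variance) stands in
    for the ADF unit-root test, purely for illustration.
    """
    series = np.asarray(seq, dtype=float)
    for d in range(max_d + 1):
        diffed = difference(series)
        if len(diffed) < 2 or np.var(diffed) >= np.var(series):
            return d, series          # treat the current series as stationary
        series = diffed
    return max_d, series

# A linear trend (non-stationary) becomes constant after one difference.
trend = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
d, stationary = choose_diff_order(trend)
```

For the linear-trend input, one differencing pass suffices, so d = 1 and the differenced series is constant.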
Optionally, the step S2 specifically includes:
taking the resource utilization rate of the server in a period of time as input quantity, modeling an ARIMA model, wherein the modeling process is as follows:
s2.1, judging whether the sample sequence is a stable sequence, if not, adopting a difference method for the sample sequence until the sequence is stable, and simultaneously determining a difference order d to obtain a time sequence model after difference;
s2.2, calculating an autocorrelation function and a partial autocorrelation function of the sample sequence, and calculating a q value of the MA model and a p value of the AR model by utilizing a red pool information criterion AIC, wherein the AIC standard function is as follows:
AIC=n logσ 2 +(p+q+1)log n (1);
wherein n is the number of samples, σ 2 To fit the sum of squares of the residuals, p is the order of autoregressive, q is the order of moving smoothing;
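Equation (1) is straightforward to evaluate. The sketch below illustrates how the (p, q) pair minimizing the AIC would be chosen; the candidate residual variances are made-up numbers for illustration only, not values from the patent.

```python
import math

def aic(n, sigma2, p, q):
    """AIC criterion from equation (1): AIC = n*log(sigma^2) + (p+q+1)*log(n)."""
    return n * math.log(sigma2) + (p + q + 1) * math.log(n)

# Pick the (p, q) pair with the smallest AIC.  The residual variances below
# are invented purely for illustration.
candidates = {(1, 0): 0.90, (1, 1): 0.80, (2, 1): 0.79}
best = min(candidates, key=lambda pq: aic(100, candidates[pq], *pq))
```

The extra penalty term (p + q + 1)·log(n) keeps the criterion from always favouring the largest model: (2, 1) barely lowers the residual variance relative to (1, 1), so its larger penalty leaves (1, 1) the winner.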
s2.3, estimating the values of parameters in a linear prediction model by using a least square method, wherein the ARIMA model is as follows:
wherein ,is an autoregressive coefficient; p is the order of autoregressive; θ i Is a motion smoothing coefficient; q is the order of motion smoothing; x is x t A load value at time t when the load data is a time series; epsilon i A zero-mean white noise sequence;
the residual term is
wherein zt Is the real load value of the load data at the moment of the data t,is a predicted object observation;
definition w i For the weight of the data lagging the i-order, expressed by the square of the residual, i.e
Constructing a weighting matrix;
thenNamely, the weighted least square parameter estimation value;
wherein ,
x i is a time series; y is an (n-p) x 1 matrix of real values before the prediction time t.
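The weighted least-squares estimate described above can be illustrated numerically. The sketch below assumes residuals from a prior fit are available and builds W = diag(1/e_i²) so that terms with larger residuals receive lower weight; the matrices and residuals are invented for illustration.

```python
import numpy as np

def wls_estimate(X, Y, residuals):
    """Weighted least-squares estimate  beta = (X^T W X)^-1 X^T W Y,
    with W = diag(1/e_i^2) so larger residuals get lower weight."""
    w = 1.0 / np.square(residuals)
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ Y)

# Illustrative system whose exact solution is beta = [2, -1].
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
beta_true = np.array([2.0, -1.0])
Y = X @ beta_true
e = np.array([0.5, 1.0, 2.0])      # pretend residuals from a prior OLS fit
beta = wls_estimate(X, Y, e)
```

Because Y is generated exactly from beta_true, any positive-definite weighting recovers the same coefficients; the weighting matters when the observations carry heteroscedastic noise.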
Optionally, before receiving the service request information sent by the user, the method further includes:
initializing the registry: the service providers and service callers periodically send heartbeat messages to the registry, and changed or deactivated service instances are updated or removed from the registration table; initializing the servers: each server runs an independent thread that collects the server's resource utilization once per second and stores it in the corresponding information collector; and initializing the client's load-balancing executor and cache.
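The heartbeat-and-evict pattern of the registry initialization can be sketched as follows. The class shape, the timeout value, and the address format are illustrative assumptions; the patent only specifies that stale or deactivated instances are removed.

```python
import time

class Registry:
    """Minimal sketch of a heartbeat-driven registry: providers send periodic
    heartbeats, and instances whose heartbeat is stale are removed.  The
    3-second timeout is an assumption for illustration, not from the patent."""

    def __init__(self, timeout=3.0):
        self.timeout = timeout
        self.instances = {}        # (service, addr) -> last heartbeat time

    def heartbeat(self, service, addr, now=None):
        self.instances[(service, addr)] = time.time() if now is None else now

    def evict_stale(self, now=None):
        now = time.time() if now is None else now
        self.instances = {k: t for k, t in self.instances.items()
                          if now - t <= self.timeout}

    def lookup(self, service):
        return [addr for (svc, addr) in self.instances if svc == service]

reg = Registry(timeout=3.0)
reg.heartbeat("order-service", "10.0.0.1:8080", now=0.0)
reg.heartbeat("order-service", "10.0.0.2:8080", now=2.0)
reg.evict_stale(now=4.0)           # 10.0.0.1 is now stale (4.0 - 0.0 > 3.0)
```

After eviction only the instance whose heartbeat is within the timeout remains visible to lookups, which is what keeps the client's instance table valid.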
Optionally, after receiving the service request information sent by a user, determining whether the requested service is deployed on a cluster includes:
acquiring the registration table from the registry, looking up the service-instance registration table according to the service information requested by the user, and judging from it whether the service is deployed by a server cluster;
if no server cluster deploys the service, screening the corresponding node server out of the service-instance registration table to process the request sent by the user;
if a server cluster deploys the service, acquiring all node servers deploying the service according to the service-registration instance table and simultaneously sending them requests to collect their load values; each node server acquires its resource utilization from the information-collector thread and uses the normalized data as the input for the ARIMA-model prediction, where the server resources include the server's per-second CPU utilization and memory utilization.
Optionally, each node server acquiring its own historical resource-utilization information includes:
(1) each node server starts its own resource-information processing at the same time and acquires the values of the current server's CPU and memory utilization over a period of time; if the CPU or memory utilization is missing for a certain second, it is filled with the average of the values 1 s before and 1 s after;
(2) if the CPU-utilization and memory-utilization data are correct, the node server normalizes its CPU-utilization and memory-utilization sample sequences, inputs the normalized data into the improved ARIMA model for training, obtains the optimal ARIMA model, and predicts the CPU utilization and memory utilization of the next second.
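The gap-filling rule of step (1) can be sketched directly. The edge-case fallback (a gap at the very start or end uses the single available neighbour) is an assumption, since the text only covers interior gaps.

```python
def fill_gaps(samples):
    """Replace a missing per-second reading (None) with the average of the
    neighbouring seconds.  Edge gaps fall back to the single available
    neighbour, which is an assumption for illustration."""
    filled = list(samples)
    for i, v in enumerate(filled):
        if v is None:
            prev = filled[i - 1] if i > 0 else None
            nxt = filled[i + 1] if i + 1 < len(filled) else None
            neighbours = [x for x in (prev, nxt) if x is not None]
            filled[i] = sum(neighbours) / len(neighbours)
    return filled

cpu = fill_gaps([0.20, None, 0.40, 0.35])
```

The missing second is replaced by the mean of 0.20 and 0.40, so the sample sequence fed to normalization has no holes.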
Optionally, the node server performing the dynamic-weight load calculation includes:
(1) each node server calculates the variation of its current CPU utilization and the variation of its memory utilization;
(2) the weight ratio of CPU to memory is obtained from the variation of the CPU utilization and the variation of the memory utilization;
(3) finally, the node server's load value is calculated from the current CPU-utilization and memory-utilization values and the calculated CPU and memory weights.
Optionally, the node server and the client communicate through HTTP; after the node server calculates its load value, it sends the client a request carrying the node server's load value and server-related information, including the server's application port number, the server's IP address, and the service name.
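A minimal sketch of the message a node might send back over HTTP follows. The field names (`loadValue`, `serverIp`, `serverPort`, `serviceName`) are assumptions for illustration: the patent names the contents of the message but not the wire format.

```python
import json

def build_load_report(load_value, ip, port, service_name):
    """Assemble the node's load report; field names are illustrative
    assumptions, not from the patent."""
    return {
        "loadValue": load_value,
        "serverIp": ip,
        "serverPort": port,
        "serviceName": service_name,
    }

report = build_load_report(0.58, "10.0.0.2", 8080, "order-service")
payload = json.dumps(report)       # body of the HTTP message to the client
```

The client can then deserialize each node's payload and compare the `loadValue` fields directly.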
A system for implementing any of the micro-service-architecture-oriented client load balancing methods of the present application, the system comprising:
a client: used for acquiring the service request information sent by the user, sending information to and receiving information from the cluster servers, and comparing load values;
a server side: used for accepting user information requests, receiving the information sent by the client, recording the server resource utilization, and performing the dynamic-weight load calculation.
The beneficial effects of the application are as follows:
after receiving the service request sent by the user, the client side and the server side start to execute respective processes. The client acquires the instance registry for deploying the request service from the registry and caches the registry locally, so that the pressure of the server is reduced, and in addition, the service registry adopts real-time dynamic update, so that the validity of the acquired service registration instance can be ensured.
The application obtains the resource value of each service node, and uses the weight calculation based on the improved ARIMA model to predict the utilization rate value of the next second and the server load value to be distributed on each node server for processing, thereby avoiding the centralized calculation on one node server in the past and relieving the pressure of the node server. Furthermore, the load value of the current node of the corresponding server is calculated by using a dynamic service weight method according to the node resource information and the ARIMA algorithm model predicted value, so that the load information of the nodes in the cluster can be dynamically monitored in real time. Further, after the load value is calculated, each node sends the load value information of the node and the server information of the node to a client (consumer), and the client selects an optimal server according to the sent load value to execute the processing of the user information request, so that the utilization rate of cluster resources is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some examples of the present application and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without the inventive effort of a person skilled in the art.
FIG. 1 shows a flow chart of one embodiment of a load balancing method according to the present application;
FIG. 2 is a flowchart of another method for balancing load of clients facing to a micro-service architecture according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a load balancing system facing to a micro-service architecture according to an embodiment of the present application; in the figure: 1. a client; 2, a server side;
fig. 4 shows a schematic diagram of a client load balancing system facing to a micro-service architecture according to an embodiment of the present application;
fig. 5 is a throughput comparison graph of the various algorithms.
Detailed Description
For the purposes of promoting an understanding of the principles and technical aspects and advantages of embodiments of the application, reference will now be made in detail to the drawings and specific examples. The following examples or figures are illustrative of the application and are not intended to limit the scope of the application.
As shown in fig. 1, the method for balancing the load of the client facing the micro-service architecture disclosed by the application comprises the following steps:
s101, receiving service request information sent by a user, and acquiring an instance registry for deploying the service in the cluster from a registry.
The registry may be a registry provided by an existing internet manufacturer, or may be a registry customized by a user by using open source software, so long as the purpose of the embodiment can be achieved. In addition, each server is in communication with the registry, and typically uses a heartbeat detection mechanism to ensure proper operation of the service instance. Further, the registry instance table obtained from the registry may be cached by various middleware as long as the purpose of the present embodiment can be achieved.
S102, acquiring historical information of resource utilization rates of all servers for deploying the service from an information collector, and predicting the resource utilization rate of the corresponding node in the next second by using a server resource prediction algorithm based on an improved ARIMA model.
A load-value prediction request is sent according to the service instances in the registration table; after receiving the request, the corresponding node inputs the historical resource values in its information collector as the input sample into the improved ARIMA model to obtain the node server's resource utilization in the next second. The request sent here may be defined by the user, whether in the RESTful style or the RPC style, as long as the purpose of this embodiment can be achieved.
S103, each server performs the dynamic-weight load calculation on the historical resource utilization and the predicted next-second resource utilization to obtain its load value, and returns the load-value information to the client, which compares the values and selects the server with the optimal load value to execute the request.
After the service-weight calculation module is called, the historical resource values and the predicted resource values are combined; the CPU and memory weights are calculated, and the corresponding server load value is obtained with the dynamic-weight load balancing method. The user request is then forwarded to the server with the optimal load value. This embodiment obtains the load values at different moments through dynamic weights, which reflect the servers' real conditions at those moments.
Fig. 2 is a flowchart of a method for balancing load of a client end of a micro-service oriented architecture according to the present application, including:
s201, initializing a registration center, a Redis cache, a client load balancing executor and an information collection module.
S202, receiving request information sent by a user, screening out micro-services to be called, and acquiring an instance registry for deploying the services from a registry.
S203: determining whether it is a cluster that deploys the service instance
Judging whether the service is deployed by the cluster according to the service registration instance table, if so, executing the next step. If the service is not deployed by the cluster, the request is forwarded to the corresponding server through the client load executor to be directly executed.
S204: and acquiring the historical resources in the information collection module from the corresponding node server in the service instance registry, and carrying out normalization processing on the data.
The server in the service instance registry collects and integrates the historical resource information of the local machine, namely, the historical resource information is acquired from the information collecting module, and the acquired data is standardized by utilizing the z-score.
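The z-score standardization mentioned above is z_i = (x_i − mean) / std. A small self-contained sketch:

```python
import math

def z_score(samples):
    """z-score standardization: z_i = (x_i - mean) / std (population std)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / n)
    return [(x - mean) / std for x in samples]

# Illustrative per-second CPU-utilization samples.
z = z_score([0.2, 0.4, 0.6, 0.8])
```

The standardized sequence has zero mean and unit standard deviation, which puts CPU and memory series on a common scale before they enter the ARIMA model.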
S205: each node server inputs the historical resource value in the information collector as input quantity into an ARIMA model, and predicts the utilization rate of each resource of the node server in the next second.
The specific implementation content of the step is that the CPU utilization rate and the memory utilization rate in a period of time are respectively put into different sequences to be used as input quantity, an ARIMA model is modeled, and the modeling process is as follows:
(1) And judging whether the sample sequence is a stable sequence or not according to the ADF unit root test, if not, adopting a difference method for the sample sequence until the sequence is stable, and simultaneously determining a difference order d to obtain a time sequence model after difference.
(2) Calculate the autocorrelation function and partial autocorrelation function of the sample sequence, and determine the q value of the MA (moving average) model and the p value of the AR (autoregressive) model using the Akaike information criterion (AIC). The AIC criterion function is:

AIC = n·log(σ²) + (p + q + 1)·log(n)  (1)

where n is the number of samples, σ² is the sum of squares of the fitted residuals, p is the autoregressive order, and q is the moving-average order.

(3) A common estimation method is the least-squares method. In the ARIMA model, write

x_t = φ_1·x_{t−1} + φ_2·x_{t−2} + … + φ_p·x_{t−p} + ε_t − θ_1·ε_{t−1} − … − θ_q·ε_{t−q}  (2)

where φ_i is the autoregressive coefficient; p is the autoregressive order; θ_i is the moving-average coefficient; q is the moving-average order; x_t is the load value at time t of the load-data time series; and ε_t is a zero-mean white-noise sequence. The residual term is

e_t = z_t − ẑ_t  (3)

where z_t is the real load value of the load data at time t and ẑ_t is the predicted observation.

Choosing a weighting matrix optimizes the least-squares parameter estimation, eliminating heteroscedasticity and achieving a better model fit. Define w_i as the weight of the data lagging i orders (as many lagged orders as there are differences); a term with a larger residual should receive a lower weight so as to yield a smaller error, and to eliminate the effect of sign it is expressed through the square of the residual, i.e.

w_i = 1/e_i²  (4)

The weighting matrix is constructed as

W = diag(w_1, w_2, …, w_n)

Then

β̂ = (XᵀWX)⁻¹·XᵀWY  (5)

is the weighted least-squares parameter estimate, where X is the matrix formed from the time series x_i, Y is the (n−p)×1 matrix of real values before the prediction time t, n is the number of samples, and p is the autoregressive order.
(4) And obtaining an ARIMA model according to the model parameters, and simultaneously checking the significance of the model, namely that the residual sequence is white noise. If the fitted model fails the test, the model is re-selected for re-fitting.
(5) And predicting the next second value of the CPU utilization rate and the memory utilization rate.
S206: calculate each server's load value with the dynamic-weight load calculation method from the historical resource values and the predicted values. The process is: take the absolute difference between the predicted resource utilization and the current node's resource utilization to obtain the variation of the CPU utilization and the variation of the memory utilization, then calculate the corresponding CPU and memory weights, and finally obtain the node server's load value from the weight values and the current resource values.
S207: compare the servers' load values and select the server corresponding to the minimum load value.
S208: execute the request.
As shown in fig. 3, the load balancing system of the micro service architecture provided by the present application includes:
the client 200 is responsible for obtaining service request information sent by a user, sending and receiving information to the cluster server, and comparing load values.
The server 201 is configured to accept a user information request, and is responsible for receiving information sent by a client, recording a server resource utilization rate, calculating a dynamic weight method, and the like.
The technical scheme of the application is further described below with reference to the accompanying drawings. As shown in fig. 4, the present application is mainly applied to the information technology processing field of the micro-service architecture separated from the client and the server; the method specifically comprises the following steps:
the client module is responsible for the functions of verification and filtration of user requests and the like.
The server module is responsible for providing service examples for the registry, and the client sends the functions of processing load value requests and the like. Specifically, a system server main body framework can be constructed through a Spring Cloud framework set, and a Spring Boot framework is used as a business model substrate to complete the development of related server realization.
Information transfer between the client and the server may be implemented using a RestTemplate.
The specific implementation process of the application is as follows:
step 1: initializing a system, and initializing a registration center module, a Redis cache module and an information collection module.
Step 2: the user sends a request; the application request sent from the client is obtained and filtered, e.g. erroneous requests are filtered out directly.
Step 3: process the request sent by the user to obtain the micro-service (consumer) to be accessed; if the Redis cache already holds the corresponding service-instance registration table, execute the next step directly. If it is not in the Redis cache, pull the corresponding micro-service instance registration table (including the service name, port number, IP address, and so on) from the registry and store the service's registration information in the Redis cache. Further, the service-instance registration table in Redis should be dynamically updated in real time.
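The cache-then-registry lookup of step 3 can be sketched as follows. A plain dict stands in for the Redis cache and the service data are invented for illustration; a real deployment would use a Redis client with the same get/put pattern.

```python
class CachedRegistryClient:
    """Sketch of step 3: look the service's instance table up in a local
    cache first and fall back to the remote registry on a miss.  A plain
    dict stands in for Redis here."""

    def __init__(self, registry):
        self.registry = registry       # service name -> list of instances
        self.cache = {}
        self.misses = 0

    def get_instances(self, service):
        if service not in self.cache:
            self.misses += 1
            self.cache[service] = self.registry[service]   # pull and cache
        return self.cache[service]

remote = {"order-service": [{"ip": "10.0.0.1", "port": 8080},
                            {"ip": "10.0.0.2", "port": 8080}]}
client = CachedRegistryClient(remote)
first = client.get_instances("order-service")    # miss: pulls from registry
second = client.get_instances("order-service")   # hit: served from cache
```

Only the first lookup touches the registry; subsequent requests for the same service are served from the cache, which is what relieves the registry's pressure.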
Step 4: the client load balancing executor acquires a service instance registry and judges whether the service is in cluster deployment, if not, the load balancing executor directly forwards a user request to process according to server information in the service instance registry; if the cluster is deployed, load value request information is sent to the cluster according to the service instance registry.
Step 5: after each node of the micro service cluster receives the load value request information sent by the client, a corresponding server resource history record including CPU utilization rate and memory utilization rate is extracted from the information collector module process, and data is standardized through z-score.
Step 6: and taking the obtained standard sample sequence as input quantity, inputting the input quantity into an ARIMA model prediction module, respectively obtaining a predicted value of the CPU utilization rate and a predicted value of the memory utilization rate in the next second, and combining the server resource historical value and the predicted value to serve as the input quantity of the next step.
Step 7: after the weight calculation module receives the server resource value and the predicted value, the weight occupied by the CPU and the memory is calculated, the load value of the server is calculated, each service node obtains the corresponding load value, and information is sent to the load balancing executor of the client.
Step 8: the client load balancing executor receives the load value information and the node server information sent by the micro service cluster, compares the load values, selects the node server with the optimal load value, and forwards the user request to the service of the node server.
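Step 8's compare-and-select can be sketched in a few lines; the report structure and the load values below are illustrative assumptions.

```python
def pick_server(load_reports):
    """Choose the node with the smallest (optimal) load value.  Each report
    pairs a server address with the load value that node sent back."""
    return min(load_reports, key=lambda r: r["loadValue"])

# Illustrative reports received from the micro-service cluster.
reports = [
    {"server": "10.0.0.1:8080", "loadValue": 0.71},
    {"server": "10.0.0.2:8080", "loadValue": 0.58},
    {"server": "10.0.0.3:8080", "loadValue": 0.64},
]
target = pick_server(reports)      # the user request is forwarded here
```

The client-side executor forwards the user request to `target`, so no dedicated load-balancer instance sits in the request path.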
Step 9: after receiving the forwarded request, the node corresponding to the cluster server processes the user request and returns the processing result to the client.
The load balancing system can dynamically reflect the load value of each service node in the cluster in real time, reasonably distribute the user request to the corresponding server, improve the utilization rate of the cluster and reduce the waiting time of the client.
The client load balancing algorithm for the micro-service architecture presented herein is compared with a polling (round-robin) algorithm and a greedy-algorithm-based load balancing algorithm. After the algorithms were implemented, experiments were carried out in the same environment and the throughput of each algorithm was compared; the experimental environment is shown in Table 1:
Table 1 Experimental environment
Evaluation index:
Throughput is the number of requests a system can process per unit time, measured in requests per second. It generally represents the processing speed of the system and reflects overall performance: the higher the throughput per unit time, the better the system performs.
Comparison of experimental results:
The client load balancing algorithm for the micro-service architecture presented herein is compared with the polling algorithm and the greedy-algorithm-based load balancing algorithm. After the algorithms were implemented, experiments were carried out in the same environment; the throughput of each algorithm was compared and analysed. The specific results are shown in Fig. 5 and the result analysis in Table 2.
TABLE 2
From the experimental results of the three algorithms, the polling algorithm has the highest throughput when the number of concurrent requests is small in the early stage, because the greedy-algorithm-based load balancer and the client load balancing method proposed herein must collect and process machine resource information, which lowers their throughput.
When the number of concurrent requests grows beyond 500, the greedy-algorithm-based load balancer and the proposed micro-service-architecture-oriented client load balancing method show a clear advantage: with many requests, the time spent computing and predicting server load is short relative to the execution time of the polling algorithm, so good load balancing is achieved. Moreover, the proposed method reflects the changes in the server load value and the node-predicted load in real time, yielding better decisions; under a large number of requests its throughput is markedly higher than that of the other algorithms.
The foregoing is merely illustrative of specific embodiments of the present application, and the scope of the application is not limited thereto; any modifications, equivalents, improvements and alternatives within the spirit and principles of the present application that would be apparent to those skilled in the art fall within the scope of the present application.

Claims (7)

1. A client load balancing method for a micro-service architecture, the method comprising:
S1: acquiring service request information of a user; the client obtains from the registry the instance registration table of the cluster deploying the service;
S2: acquiring the historical resource-utilization information of all node servers deploying the service from the information collector, and predicting the next-second resource utilization of each corresponding node server with an ARIMA model;
S3: each node server runs the dynamic-weight load method on the historical resource utilization and the predicted next-second resource utilization to obtain its node server load value; the load values are returned to the client for comparison, and the node server with the smallest load value is selected to execute the request;
wherein S2 specifically comprises the following steps:
taking the resource utilization of the server over a period of time as the input, the ARIMA model is built as follows:
S2.1: judging whether the sample sequence is stationary; if not, differencing the sample sequence until it is stationary, determining the difference order d and obtaining the differenced time-series model;
S2.2: calculating the autocorrelation function and partial autocorrelation function of the sample sequence, and determining the order q of the MA model and the order p of the AR model by the Akaike information criterion (AIC), whose criterion function is:
AIC = n·log(σ²) + (p + q + 1)·log(n)  (1);
where n is the number of samples, σ² is the sum of squares of the fitting residuals, p is the autoregressive order, and q is the moving-average order;
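A minimal sketch of order selection under criterion (1): compute the AIC for each candidate (p, q) and keep the pair with the smallest value. The `residual_var` mapping is an assumed input, standing in for the residual sum of squares σ² of each fitted candidate model:

```python
import math

def aic(n, sigma2, p, q):
    """AIC as in equation (1): n*log(sigma^2) + (p+q+1)*log(n)."""
    return n * math.log(sigma2) + (p + q + 1) * math.log(n)

def select_order(n, residual_var):
    """Pick the (p, q) pair minimizing AIC.

    residual_var maps (p, q) -> sigma^2 of the fitted ARMA(p, q) model.
    """
    return min(residual_var, key=lambda pq: aic(n, residual_var[pq], *pq))
```

The penalty term (p + q + 1)·log(n) discourages over-parameterized models that only marginally reduce the residuals.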
S2.3: estimating the values of the parameters in the linear prediction model by the least squares method, the ARIMA model being:
x_t = φ_1·x_{t-1} + φ_2·x_{t-2} + … + φ_p·x_{t-p} + ε_t + θ_1·ε_{t-1} + … + θ_q·ε_{t-q}  (2)
where φ_i is an autoregressive coefficient; p is the autoregressive order; θ_i is a moving-average coefficient; q is the moving-average order; x_t is the load value at time t of the load-data time series; ε_t is a zero-mean white noise sequence;
the residual term is e_t = z_t − x̂_t,
where z_t is the real load value of the load data at time t and x̂_t is the predicted observation;
defining w_i as the weight of the data lagging by i orders, expressed by the square of the residual, i.e. w_i = e_i²;
constructing the weighting matrix W = diag(w_1, w_2, …, w_n);
then β̂ = (XᵀWX)⁻¹XᵀWY is the weighted least squares parameter estimate,
where X is the matrix of lagged time-series values x_i, and Y is the (n−p)×1 matrix of real values before the prediction time t;
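In the single-coefficient case the weighted least squares estimate β̂ = (XᵀWX)⁻¹XᵀWY reduces to a scalar ratio, which the following sketch illustrates (function name and inputs are illustrative, not from the patent):

```python
def wls_scalar(x, y, w):
    """Weighted least squares for a single coefficient.

    With one regressor, (X^T W X)^{-1} X^T W Y reduces to
    sum(w*x*y) / sum(w*x*x).
    """
    num = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    den = sum(wi * xi * xi for wi, xi in zip(w, x))
    return num / den
```

When y = β·x holds exactly, the estimate recovers β regardless of the residual-square weights w_i.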
the node server performs the calculation of the dynamic-weight load method as follows:
(1) each node server calculates the variation of its current CPU utilization and the variation of its memory utilization;
(2) the weight ratio occupied by the CPU and the memory is obtained from the variation of the CPU utilization and the variation of the memory utilization;
(3) finally, the load value of the node server is calculated from the current CPU utilization and memory utilization values and the computed CPU and memory weights.
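Steps (1)-(3) above can be sketched as follows; the exact weighting formula is not given in the claim, so this sketch assumes the simplest choice, weights proportional to each resource's recent variation:

```python
def load_value(cpu_hist, mem_hist):
    """Hypothetical dynamic-weight load score.

    The resource whose utilization is changing faster receives the
    larger weight; the load value is the weighted current utilization.
    """
    # step (1): variation of current CPU and memory utilization
    d_cpu = abs(cpu_hist[-1] - cpu_hist[-2])
    d_mem = abs(mem_hist[-1] - mem_hist[-2])
    # step (2): weight ratio from the variations (equal split if static)
    total = d_cpu + d_mem
    w_cpu = d_cpu / total if total else 0.5
    w_mem = 1.0 - w_cpu
    # step (3): load value from current utilizations and the weights
    return w_cpu * cpu_hist[-1] + w_mem * mem_hist[-1]
```

A node whose CPU is ramping up while memory stays flat is thus scored mainly by its CPU utilization.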
2. The client load balancing method for a micro-service architecture according to claim 1, wherein predicting the next-second resource utilization of the corresponding node with the ARIMA model comprises:
(1) using the processed data as the input data sequence, judging whether the data are stationary by the unit-root test; if the sequence is not stationary, differencing it until the unit-root test is satisfied, and determining the difference order;
(2) determining the optimal model order from the tailing and truncation properties of the sample autocorrelation function (ACF) and partial autocorrelation function (PACF), and outputting the corresponding predicted value from the sample data;
(3) converting the normalized historical resource values and predicted values of the node server's CPU utilization and memory utilization back into actual data by inverse normalization.
3. The client load balancing method for a micro-service architecture according to claim 1 or 2, wherein, before receiving the service request information sent by a user, the method further comprises:
initializing the registry: the service provider and the service caller periodically send heartbeat messages to the registry, which updates or removes changed or deactivated service instances; initializing the servers: each server sets up an independent thread to collect server resource utilization once per second and stores it in the corresponding information collector; and initializing the client load balancing executor and cache.
4. The client load balancing method for a micro-service architecture according to claim 1 or 2, wherein, after receiving the service request information sent by a user, determining whether the requested service is cluster-deployed comprises:
obtaining the registration table of the registry, searching the service instance registration table according to the service information requested by the user, and judging from it whether the service is deployed on a cluster of servers;
if no cluster deploys the service, screening out the corresponding node server from the service instance registration table to process the request sent by the user;
if a cluster deploys the service, acquiring all node servers deploying the service according to the service instance registration table and simultaneously sending them requests to collect server load values; each corresponding node server obtains the resource utilization from its information collector thread and uses the normalized data as the input of the ARIMA-model prediction, the server resources comprising the per-second CPU utilization and memory utilization of the server.
5. The client load balancing method for a micro-service architecture according to claim 1 or 2, wherein the corresponding node server obtains its historical resource-utilization information as follows:
(1) each node server starts its own resource information processing, acquiring the CPU and memory utilization values of the current server over a period of time; if the CPU or memory utilization is empty for a given second, it is filled with the average of the preceding and following seconds;
(2) if the CPU utilization and memory utilization data are correct, the node server normalizes its CPU and memory utilization sample sequences, inputs the normalized data into the improved ARIMA model for training, obtains the optimal ARIMA model, and predicts the CPU utilization and memory utilization of the next second.
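The gap-filling rule of step (1) above (a missing second is filled with the average of the neighbouring seconds) can be sketched as follows, with `None` standing in for an empty sample:

```python
def fill_gaps(samples):
    """Fill a missing (None) one-second sample with the average of
    the preceding and following seconds."""
    out = list(samples)
    for i, v in enumerate(out):
        # only interior gaps have both neighbours available
        if v is None and 0 < i < len(out) - 1:
            out[i] = (out[i - 1] + out[i + 1]) / 2
    return out
```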
6. The micro-service-architecture-oriented client load balancing method according to claim 1 or 2, wherein communication between the node server and the client is realized by an HTTP service; after the node server computes its load value, it sends the client a request carrying the node server's load value and server-related information, including the server application port number, the server IP address and the service name.
7. A system implementing the micro-service-architecture-oriented client load balancing method of any one of claims 1-6, wherein the micro-service-architecture-oriented client load balancing system comprises:
a client: for acquiring the service request information sent by a user, sending information to and receiving information from the cluster servers, and comparing load values;
a server side: for receiving user information requests, receiving the information sent by the client, recording the server resource utilization, and calculating the dynamic-weight load method.
CN202210692502.XA 2022-06-17 2022-06-17 Micro-service architecture-oriented client load balancing method and system Active CN115242797B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210692502.XA CN115242797B (en) 2022-06-17 2022-06-17 Micro-service architecture-oriented client load balancing method and system


Publications (2)

Publication Number Publication Date
CN115242797A CN115242797A (en) 2022-10-25
CN115242797B true CN115242797B (en) 2023-10-27

Family

ID=83669001

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210692502.XA Active CN115242797B (en) 2022-06-17 2022-06-17 Micro-service architecture-oriented client load balancing method and system

Country Status (1)

Country Link
CN (1) CN115242797B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116302509A (en) * 2023-02-21 2023-06-23 中船(浙江)海洋科技有限公司 Cloud server dynamic load optimization method and device based on CNN-Transformer

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109787855A (en) * 2018-12-17 2019-05-21 深圳先进技术研究院 Server Load Prediction method and system based on Markov chain and time series models
CN110149396A (en) * 2019-05-20 2019-08-20 华南理工大学 A kind of platform of internet of things construction method based on micro services framework
CN110704542A (en) * 2019-10-15 2020-01-17 南京莱斯网信技术研究院有限公司 Data dynamic partitioning system based on node load
CN111488200A (en) * 2020-06-28 2020-08-04 四川新网银行股份有限公司 Virtual machine resource utilization rate analysis method based on dynamic analysis model
CN113110933A (en) * 2021-03-11 2021-07-13 浙江工业大学 System with Nginx load balancing technology
CN113377544A (en) * 2021-07-06 2021-09-10 哈尔滨理工大学 Web cluster load balancing method based on load data dynamic update rate

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
CN102232282B (en) * 2010-10-29 2014-03-26 华为技术有限公司 Method and apparatus for realizing load balance of resources in data center
TWI725744B (en) * 2020-02-19 2021-04-21 先智雲端數據股份有限公司 Method for establishing system resource prediction and resource management model through multi-layer correlations


Non-Patent Citations (2)

Title
Rodrigo N. Calheiros. Workload Prediction Using ARIMA Model and Its Impact on Cloud Applications' QoS. IEEE Transactions on Cloud Computing, 2014. *
Li Huibin; He Lili. Dynamic weight load balancing algorithm based on a prediction threshold. Software Guide (软件导刊), 2020, No. 06. *


Similar Documents

Publication Publication Date Title
CN109918198B (en) Simulation cloud platform load scheduling system and method based on user characteristic prediction
CN109714400B (en) Container cluster-oriented energy consumption optimization resource scheduling system and method thereof
Ou et al. An adaptive multi-constraint partitioning algorithm for offloading in pervasive systems
US20040243915A1 (en) Autonomic failover of grid-based services
CN112783649A (en) Cloud computing-oriented interactive perception containerized micro-service resource scheduling method
CN103516807A (en) Cloud computing platform server load balancing system and method
CN112019620B (en) Web cluster load balancing method and system based on Nginx dynamic weighting
US9501326B2 (en) Processing control system, processing control method, and processing control program
CN115242797B (en) Micro-service architecture-oriented client load balancing method and system
Nastic et al. Polaris scheduler: Edge sensitive and slo aware workload scheduling in cloud-edge-iot clusters
CN111752678A (en) Low-power-consumption container placement method for distributed collaborative learning in edge computing
CN109783235A (en) A kind of load equilibration scheduling method based on principle of maximum entropy
CN115914392A (en) Computing power network resource scheduling method and system
CN114666335A (en) DDS-based distributed system load balancing device
CN112130927B (en) Reliability-enhanced mobile edge computing task unloading method
CN110471761A (en) Control method, user equipment, storage medium and the device of server
Pan et al. Sustainable serverless computing with cold-start optimization and automatic workflow resource scheduling
Garg et al. Optimal virtual machine scheduling in virtualized cloud environment using VIKOR method
CN111367632B (en) Container cloud scheduling method based on periodic characteristics
Mehta et al. A modified delay strategy for dynamic load balancing in cluster and grid environment
CN116755872A (en) TOPSIS-based containerized streaming media service dynamic loading system and method
CN116893900A (en) Cluster computing pressure load balancing method, system, equipment and IC design platform
CN106210120B (en) A kind of recommended method and its device of server
CN110704159B (en) Integrated cloud operating system based on OpenStack
Mohamed et al. A study of an adaptive replication framework for orchestrated composite web services

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant