CN115242797B - Micro-service architecture-oriented client load balancing method and system - Google Patents


Info

Publication number: CN115242797B
Application number: CN202210692502.XA
Authority: CN (China)
Other versions: CN115242797A
Other languages: Chinese (zh)
Prior art keywords: service, server, utilization rate, load, client
Legal status: Active (granted)
Inventors: 吴昊, 贺小伟, 赵军壮
Assignee: NORTHWEST UNIVERSITY (original and current)
Application filed by NORTHWEST UNIVERSITY; priority to CN202210692502.XA.

Classifications

    • H04L67/1008: Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H04L41/142: Network analysis or design using statistical or mathematical methods
    • H04L41/145: Network analysis or design involving simulating, designing, planning or modelling of a network
    • H04L41/147: Network analysis or design for predicting network behaviour


Abstract

The application discloses a client load balancing method and system oriented to a micro-service architecture, comprising the following steps: after receiving service request information sent by a user, the client acquires from a registry the instance registration table of the servers deploying the service in the cluster; it then acquires the historical resource-utilization information of all servers deploying the service from the information collector module, and predicts each node's resource utilization in the next second using a server resource prediction algorithm based on an improved ARIMA model; each server performs a dynamic-weight load calculation on the historical resource utilization and the predicted next-second resource utilization to obtain its load value and returns the load-value information to the client, which compares the values and selects the server with the optimal load value to execute the request. The method and system distribute user requests reasonably and improve the utilization of cluster resources.

Description

Micro-service architecture-oriented client load balancing method and system
Technical Field
The application relates to the field of information technology, and in particular to a client load balancing method and system oriented to a micro-service architecture.
Background
With the rapid development of the Internet and its industry, Internet enterprises now largely develop systems in a micro-service architecture mode. However, as service-access concurrency grows exponentially, the load pressure on server hosts keeps increasing, and load balancing has become a focus of research for scientific researchers and Internet vendors at home and abroad. How to distribute service requests evenly among multiple servers is currently a very important research area.
Current research on load balancing algorithms falls into two categories: static and dynamic. A static load balancing algorithm schedules service nodes according to their prior information and distributes tasks without considering each node's current load; examples include the round-robin algorithm, the weighted round-robin algorithm, the zone-aware round-robin algorithm, and the random algorithm. Over time, static algorithms tend to leave the cluster's servers unevenly loaded, so cluster server resources are under-utilized and system efficiency drops. A dynamic load balancing algorithm calculates each service node's real-time load from selected load metrics, schedules the nodes accordingly, and distributes tasks; examples include the least-connections algorithm and the dynamic weighted round-robin algorithm. With dynamic algorithms, however, the load-balancing weights are often set by manual experience; when user demand changes, the weight-distribution strategy cannot be adjusted in real time, which affects response time and introduces a certain lag.
In addition, in systems built on a micro-service architecture, server-side load balancing is implemented by using a dedicated instance server as a load balancer that forwards requests. When facing a large number of user requests, the only remedy for this centralized computation is to scale up the load balancer to increase its throughput. Likewise, server-side load balancing can only be configured with a fixed policy; requests cannot be forwarded according to each server's load value unless the source code is changed, which obviously consumes considerable material and financial resources.
Therefore, how to dynamically and evenly forward service requests according to the servers' resource values in the cluster and return the final result to the client, so that the whole system can better handle high concurrency, is a problem to be solved at present.
Disclosure of Invention
The application aims to provide a client load balancing method and system oriented to a micro-service architecture, which relieve the pressure on the server; in addition, the service registration table is dynamically updated in real time, which guarantees the validity of the acquired service registration instances.
The method specifically comprises the following steps:
a method for load balancing of clients oriented to a micro-service architecture, the method comprising:
S1: acquire the service request information of a user, and the client acquires from the registry the instance registration table of the servers deploying the service in the cluster;
S2: acquire from the information collector the historical resource-utilization information of all node servers deploying the service, and predict each node server's resource utilization in the next second using an ARIMA model;
S3: each node server performs the dynamic-weight load calculation on the historical resource utilization and the predicted next-second resource utilization to obtain its load value and returns it to the client, which compares the load values and selects the node server with the minimum load value to execute the request.
Optionally, predicting the next-second resource utilization of the corresponding node with the ARIMA model includes:
(1) taking the processed data as the input data sequence, testing whether the data are stationary with the unit-root test, and, if the sequence is not stationary, differencing the sequence until the unit-root test is satisfied, thereby determining the differencing order;
(2) determining the optimal model orders from the tailing and truncation behaviour of the sample autocorrelation function (ACF) and partial autocorrelation function (PACF), and outputting the corresponding predicted values from the sample data;
(3) converting the normalized historical values and predicted values of the node server's CPU utilization and memory utilization back to actual data by inverse normalization.
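Step (1) above can be sketched in code. This is a minimal illustration, not the patent's implementation: a real system would apply a proper ADF unit-root test at each differencing step (e.g. via a statistics library), and the variance-reduction heuristic below merely stands in for that test.

```python
import numpy as np

def difference(seq):
    """First-order difference of a 1-D sequence."""
    return np.diff(seq)

def choose_diff_order(seq, max_d=2):
    """Difference the series until it looks stationary, returning the order d.

    A crude heuristic (differencing stops reducing the variance) stands in
    for the ADF unit-root test, purely for illustration.
    """
    series = np.asarray(seq, dtype=float)
    for d in range(max_d + 1):
        diffed = difference(series)
        if len(diffed) < 2 or np.var(diffed) >= np.var(series):
            return d, series          # treat the current series as stationary
        series = diffed
    return max_d, series

# A linear trend (non-stationary) becomes constant after one difference.
trend = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
d, stationary = choose_diff_order(trend)
```

For the linear-trend input, one differencing pass suffices, so d = 1 and the differenced series is constant.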
Optionally, the step S2 specifically includes:
taking the resource utilization rate of the server in a period of time as input quantity, modeling an ARIMA model, wherein the modeling process is as follows:
s2.1, judging whether the sample sequence is a stable sequence, if not, adopting a difference method for the sample sequence until the sequence is stable, and simultaneously determining a difference order d to obtain a time sequence model after difference;
s2.2, calculating an autocorrelation function and a partial autocorrelation function of the sample sequence, and calculating a q value of the MA model and a p value of the AR model by utilizing a red pool information criterion AIC, wherein the AIC standard function is as follows:
AIC=n logσ 2 +(p+q+1)log n (1);
wherein n is the number of samples, σ 2 To fit the sum of squares of the residuals, p is the order of autoregressive, q is the order of moving smoothing;
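Equation (1) is straightforward to evaluate. The sketch below illustrates how the (p, q) pair minimizing the AIC would be chosen; the candidate residual variances are made-up numbers for illustration only, not values from the patent.

```python
import math

def aic(n, sigma2, p, q):
    """AIC criterion from equation (1): AIC = n*log(sigma^2) + (p+q+1)*log(n)."""
    return n * math.log(sigma2) + (p + q + 1) * math.log(n)

# Pick the (p, q) pair with the smallest AIC.  The residual variances below
# are invented purely for illustration.
candidates = {(1, 0): 0.90, (1, 1): 0.80, (2, 1): 0.79}
best = min(candidates, key=lambda pq: aic(100, candidates[pq], *pq))
```

The extra penalty term (p + q + 1)·log(n) keeps the criterion from always favouring the largest model: (2, 1) barely lowers the residual variance relative to (1, 1), so its larger penalty leaves (1, 1) the winner.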
s2.3, estimating the values of parameters in a linear prediction model by using a least square method, wherein the ARIMA model is as follows:
wherein ,is an autoregressive coefficient; p is the order of autoregressive; θ i Is a motion smoothing coefficient; q is the order of motion smoothing; x is x t A load value at time t when the load data is a time series; epsilon i A zero-mean white noise sequence;
the residual term is
wherein zt Is the real load value of the load data at the moment of the data t,is a predicted object observation;
definition w i For the weight of the data lagging the i-order, expressed by the square of the residual, i.e
Constructing a weighting matrix;
thenNamely, the weighted least square parameter estimation value;
wherein ,
x i is a time series; y is an (n-p) x 1 matrix of real values before the prediction time t.
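The weighted least-squares estimate described above can be illustrated numerically. The sketch below assumes residuals from a prior fit are available and builds W = diag(1/e_i²) so that terms with larger residuals receive lower weight; the matrices and residuals are invented for illustration.

```python
import numpy as np

def wls_estimate(X, Y, residuals):
    """Weighted least-squares estimate  beta = (X^T W X)^-1 X^T W Y,
    with W = diag(1/e_i^2) so larger residuals get lower weight."""
    w = 1.0 / np.square(residuals)
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ Y)

# Illustrative system whose exact solution is beta = [2, -1].
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
beta_true = np.array([2.0, -1.0])
Y = X @ beta_true
e = np.array([0.5, 1.0, 2.0])      # pretend residuals from a prior OLS fit
beta = wls_estimate(X, Y, e)
```

Because Y is generated exactly from beta_true, any positive-definite weighting recovers the same coefficients; the weighting matters when the observations carry heteroscedastic noise.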
Optionally, before receiving the service request information sent by the user, the method further includes:
initializing the registry: the service providers and service callers periodically send heartbeat messages to the registry, and changed or deactivated service instances are updated or removed from the registration table; initializing the servers: each server runs an independent thread that collects the server's resource utilization once per second and stores it in the corresponding information collector; and initializing the client's load-balancing executor and cache.
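The heartbeat-and-evict pattern of the registry initialization can be sketched as follows. The class shape, the timeout value, and the address format are illustrative assumptions; the patent only specifies that stale or deactivated instances are removed.

```python
import time

class Registry:
    """Minimal sketch of a heartbeat-driven registry: providers send periodic
    heartbeats, and instances whose heartbeat is stale are removed.  The
    3-second timeout is an assumption for illustration, not from the patent."""

    def __init__(self, timeout=3.0):
        self.timeout = timeout
        self.instances = {}        # (service, addr) -> last heartbeat time

    def heartbeat(self, service, addr, now=None):
        self.instances[(service, addr)] = time.time() if now is None else now

    def evict_stale(self, now=None):
        now = time.time() if now is None else now
        self.instances = {k: t for k, t in self.instances.items()
                          if now - t <= self.timeout}

    def lookup(self, service):
        return [addr for (svc, addr) in self.instances if svc == service]

reg = Registry(timeout=3.0)
reg.heartbeat("order-service", "10.0.0.1:8080", now=0.0)
reg.heartbeat("order-service", "10.0.0.2:8080", now=2.0)
reg.evict_stale(now=4.0)           # 10.0.0.1 is now stale (4.0 - 0.0 > 3.0)
```

After eviction only the instance whose heartbeat is within the timeout remains visible to lookups, which is what keeps the client's instance table valid.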
Optionally, after receiving the service request information sent by a user, determining whether the requested service is deployed on a cluster includes:
acquiring the registration table from the registry, looking up the service-instance registration table according to the service information requested by the user, and judging from it whether the service is deployed by a server cluster;
if no server cluster deploys the service, screening the corresponding node server out of the service-instance registration table to process the request sent by the user;
if a server cluster deploys the service, acquiring all node servers deploying the service according to the service-registration instance table and simultaneously sending them requests to collect their load values; each node server acquires its resource utilization from the information-collector thread and uses the normalized data as the input for the ARIMA-model prediction, where the server resources include the server's per-second CPU utilization and memory utilization.
Optionally, each node server acquiring its own historical resource-utilization information includes:
(1) each node server starts its own resource-information processing at the same time and acquires the values of the current server's CPU and memory utilization over a period of time; if the CPU or memory utilization is missing for a certain second, it is filled with the average of the values 1 s before and 1 s after;
(2) if the CPU-utilization and memory-utilization data are correct, the node server normalizes its CPU-utilization and memory-utilization sample sequences, inputs the normalized data into the improved ARIMA model for training, obtains the optimal ARIMA model, and predicts the CPU utilization and memory utilization of the next second.
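The gap-filling rule of step (1) can be sketched directly. The edge-case fallback (a gap at the very start or end uses the single available neighbour) is an assumption, since the text only covers interior gaps.

```python
def fill_gaps(samples):
    """Replace a missing per-second reading (None) with the average of the
    neighbouring seconds.  Edge gaps fall back to the single available
    neighbour, which is an assumption for illustration."""
    filled = list(samples)
    for i, v in enumerate(filled):
        if v is None:
            prev = filled[i - 1] if i > 0 else None
            nxt = filled[i + 1] if i + 1 < len(filled) else None
            neighbours = [x for x in (prev, nxt) if x is not None]
            filled[i] = sum(neighbours) / len(neighbours)
    return filled

cpu = fill_gaps([0.20, None, 0.40, 0.35])
```

The missing second is replaced by the mean of 0.20 and 0.40, so the sample sequence fed to normalization has no holes.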
Optionally, the node server performing the dynamic-weight load calculation includes:
(1) each node server calculates the variation of its current CPU utilization and the variation of its memory utilization;
(2) the weight ratio of CPU to memory is obtained from the variation of the CPU utilization and the variation of the memory utilization;
(3) finally, the node server's load value is calculated from the current CPU-utilization and memory-utilization values and the calculated CPU and memory weights.
Optionally, the node server and the client communicate through HTTP; after the node server calculates its load value, it sends the client a request carrying the node server's load value and server-related information, including the server's application port number, the server's IP address, and the service name.
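A minimal sketch of the message a node might send back over HTTP follows. The field names (`loadValue`, `serverIp`, `serverPort`, `serviceName`) are assumptions for illustration: the patent names the contents of the message but not the wire format.

```python
import json

def build_load_report(load_value, ip, port, service_name):
    """Assemble the node's load report; field names are illustrative
    assumptions, not from the patent."""
    return {
        "loadValue": load_value,
        "serverIp": ip,
        "serverPort": port,
        "serviceName": service_name,
    }

report = build_load_report(0.58, "10.0.0.2", 8080, "order-service")
payload = json.dumps(report)       # body of the HTTP message to the client
```

The client can then deserialize each node's payload and compare the `loadValue` fields directly.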
A system for implementing any of the micro-service-architecture-oriented client load balancing methods of the present application, the system comprising:
a client: used for acquiring the service request information sent by the user, sending information to and receiving information from the cluster servers, and comparing load values;
a server side: used for accepting user information requests, receiving the information sent by the client, recording the server resource utilization, and performing the dynamic-weight load calculation.
The beneficial effects of the application are as follows:
after receiving the service request sent by the user, the client side and the server side start to execute respective processes. The client acquires the instance registry for deploying the request service from the registry and caches the registry locally, so that the pressure of the server is reduced, and in addition, the service registry adopts real-time dynamic update, so that the validity of the acquired service registration instance can be ensured.
The application obtains the resource value of each service node, and uses the weight calculation based on the improved ARIMA model to predict the utilization rate value of the next second and the server load value to be distributed on each node server for processing, thereby avoiding the centralized calculation on one node server in the past and relieving the pressure of the node server. Furthermore, the load value of the current node of the corresponding server is calculated by using a dynamic service weight method according to the node resource information and the ARIMA algorithm model predicted value, so that the load information of the nodes in the cluster can be dynamically monitored in real time. Further, after the load value is calculated, each node sends the load value information of the node and the server information of the node to a client (consumer), and the client selects an optimal server according to the sent load value to execute the processing of the user information request, so that the utilization rate of cluster resources is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some examples of the present application and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without the inventive effort of a person skilled in the art.
FIG. 1 shows a flow chart of one embodiment of a load balancing method according to the present application;
FIG. 2 is a flowchart of another method for balancing load of clients facing to a micro-service architecture according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a load balancing system facing to a micro-service architecture according to an embodiment of the present application; in the figure: 1. a client; 2, a server side;
fig. 4 shows a schematic diagram of a client load balancing system facing to a micro-service architecture according to an embodiment of the present application;
fig. 5 is a throughput comparison graph of the various algorithms.
Detailed Description
For the purposes of promoting an understanding of the principles and technical aspects and advantages of embodiments of the application, reference will now be made in detail to the drawings and specific examples. The following examples or figures are illustrative of the application and are not intended to limit the scope of the application.
As shown in fig. 1, the method for balancing the load of the client facing the micro-service architecture disclosed by the application comprises the following steps:
s101, receiving service request information sent by a user, and acquiring an instance registry for deploying the service in the cluster from a registry.
The registry may be a registry provided by an existing internet manufacturer, or may be a registry customized by a user by using open source software, so long as the purpose of the embodiment can be achieved. In addition, each server is in communication with the registry, and typically uses a heartbeat detection mechanism to ensure proper operation of the service instance. Further, the registry instance table obtained from the registry may be cached by various middleware as long as the purpose of the present embodiment can be achieved.
S102, acquiring historical information of resource utilization rates of all servers for deploying the service from an information collector, and predicting the resource utilization rate of the corresponding node in the next second by using a server resource prediction algorithm based on an improved ARIMA model.
A load-value prediction request is sent according to the service instances in the registration table; after receiving the request, the corresponding node inputs the historical resource values in its information collector as the input sample into the improved ARIMA model to obtain the node server's resource utilization in the next second. The request sent here may be defined by the user, whether in the RESTful style or the RPC style, as long as the purpose of this embodiment can be achieved.
S103, each server performs the dynamic-weight load calculation on the historical resource utilization and the predicted next-second resource utilization to obtain its load value, and returns the load-value information to the client, which compares the values and selects the server with the optimal load value to execute the request.
After the service-weight calculation module is called, the historical resource values and the predicted resource values are combined; the CPU and memory weights are calculated, and the corresponding server load value is obtained with the dynamic-weight load balancing method. The user request is then forwarded to the server with the optimal load value. This embodiment obtains the load values at different moments through dynamic weights, which reflect the servers' real conditions at those moments.
Fig. 2 is a flowchart of a method for balancing load of a client end of a micro-service oriented architecture according to the present application, including:
s201, initializing a registration center, a Redis cache, a client load balancing executor and an information collection module.
S202, receiving request information sent by a user, screening out micro-services to be called, and acquiring an instance registry for deploying the services from a registry.
S203: determining whether it is a cluster that deploys the service instance
Judging whether the service is deployed by the cluster according to the service registration instance table, if so, executing the next step. If the service is not deployed by the cluster, the request is forwarded to the corresponding server through the client load executor to be directly executed.
S204: and acquiring the historical resources in the information collection module from the corresponding node server in the service instance registry, and carrying out normalization processing on the data.
The server in the service instance registry collects and integrates the historical resource information of the local machine, namely, the historical resource information is acquired from the information collecting module, and the acquired data is standardized by utilizing the z-score.
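The z-score standardization mentioned above is z_i = (x_i − mean) / std. A small self-contained sketch:

```python
import math

def z_score(samples):
    """z-score standardization: z_i = (x_i - mean) / std (population std)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / n)
    return [(x - mean) / std for x in samples]

# Illustrative per-second CPU-utilization samples.
z = z_score([0.2, 0.4, 0.6, 0.8])
```

The standardized sequence has zero mean and unit standard deviation, which puts CPU and memory series on a common scale before they enter the ARIMA model.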
S205: each node server inputs the historical resource value in the information collector as input quantity into an ARIMA model, and predicts the utilization rate of each resource of the node server in the next second.
The specific implementation content of the step is that the CPU utilization rate and the memory utilization rate in a period of time are respectively put into different sequences to be used as input quantity, an ARIMA model is modeled, and the modeling process is as follows:
(1) And judging whether the sample sequence is a stable sequence or not according to the ADF unit root test, if not, adopting a difference method for the sample sequence until the sequence is stable, and simultaneously determining a difference order d to obtain a time sequence model after difference.
(2) Calculate the autocorrelation function and partial autocorrelation function of the sample sequence, and determine the q value of the MA (moving average) model and the p value of the AR (autoregressive) model using the Akaike information criterion (AIC). The AIC criterion function is:

AIC = n·log(σ²) + (p + q + 1)·log(n)  (1)

where n is the number of samples, σ² is the sum of squares of the fitted residuals, p is the autoregressive order, and q is the moving-average order.

(3) A common estimation method is the least-squares method. In the ARIMA model, write

x_t = φ_1·x_{t−1} + φ_2·x_{t−2} + … + φ_p·x_{t−p} + ε_t − θ_1·ε_{t−1} − … − θ_q·ε_{t−q}  (2)

where φ_i is the autoregressive coefficient; p is the autoregressive order; θ_i is the moving-average coefficient; q is the moving-average order; x_t is the load value at time t of the load-data time series; and ε_t is a zero-mean white-noise sequence. The residual term is

e_t = z_t − ẑ_t  (3)

where z_t is the real load value of the load data at time t and ẑ_t is the predicted observation.

Choosing a weighting matrix optimizes the least-squares parameter estimation, eliminating heteroscedasticity and achieving a better model fit. Define w_i as the weight of the data lagging i orders (as many lagged orders as there are differences); a term with a larger residual should receive a lower weight so as to yield a smaller error, and to eliminate the effect of sign it is expressed through the square of the residual, i.e.

w_i = 1/e_i²  (4)

The weighting matrix is constructed as

W = diag(w_1, w_2, …, w_n)

Then

β̂ = (XᵀWX)⁻¹·XᵀWY  (5)

is the weighted least-squares parameter estimate, where X is the matrix formed from the time series x_i, Y is the (n−p)×1 matrix of real values before the prediction time t, n is the number of samples, and p is the autoregressive order.
(4) And obtaining an ARIMA model according to the model parameters, and simultaneously checking the significance of the model, namely that the residual sequence is white noise. If the fitted model fails the test, the model is re-selected for re-fitting.
(5) And predicting the next second value of the CPU utilization rate and the memory utilization rate.
S206: calculate each server's load value with the dynamic-weight load calculation method from the historical resource values and the predicted values. The process is: take the absolute difference between the predicted resource utilization and the current node's resource utilization to obtain the variation of the CPU utilization and the variation of the memory utilization, then calculate the corresponding CPU and memory weights, and finally obtain the node server's load value from the weight values and the current resource values.
S207: compare the servers' load values and select the server corresponding to the minimum load value.
S208: execute the request.
As shown in fig. 3, the load balancing system of the micro service architecture provided by the present application includes:
the client 200 is responsible for obtaining service request information sent by a user, sending and receiving information to the cluster server, and comparing load values.
The server 201 is configured to accept a user information request, and is responsible for receiving information sent by a client, recording a server resource utilization rate, calculating a dynamic weight method, and the like.
The technical scheme of the application is further described below with reference to the accompanying drawings. As shown in fig. 4, the present application is mainly applied to the information technology processing field of the micro-service architecture separated from the client and the server; the method specifically comprises the following steps:
the client module is responsible for the functions of verification and filtration of user requests and the like.
The server module is responsible for providing service examples for the registry, and the client sends the functions of processing load value requests and the like. Specifically, a system server main body framework can be constructed through a Spring Cloud framework set, and a Spring Boot framework is used as a business model substrate to complete the development of related server realization.
Information transfer between the client and the server may be implemented using a RestTemplate.
The specific implementation process of the application is as follows:
step 1: initializing a system, and initializing a registration center module, a Redis cache module and an information collection module.
Step 2: the user sends a request; the application request sent from the client is obtained and filtered, e.g. erroneous requests are filtered out directly.
Step 3: process the request sent by the user to obtain the micro-service (consumer) to be accessed; if the Redis cache already holds the corresponding service-instance registration table, execute the next step directly. If it is not in the Redis cache, pull the corresponding micro-service instance registration table (including the service name, port number, IP address, and so on) from the registry and store the service's registration information in the Redis cache. Further, the service-instance registration table in Redis should be dynamically updated in real time.
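The cache-then-registry lookup of step 3 can be sketched as follows. A plain dict stands in for the Redis cache and the service data are invented for illustration; a real deployment would use a Redis client with the same get/put pattern.

```python
class CachedRegistryClient:
    """Sketch of step 3: look the service's instance table up in a local
    cache first and fall back to the remote registry on a miss.  A plain
    dict stands in for Redis here."""

    def __init__(self, registry):
        self.registry = registry       # service name -> list of instances
        self.cache = {}
        self.misses = 0

    def get_instances(self, service):
        if service not in self.cache:
            self.misses += 1
            self.cache[service] = self.registry[service]   # pull and cache
        return self.cache[service]

remote = {"order-service": [{"ip": "10.0.0.1", "port": 8080},
                            {"ip": "10.0.0.2", "port": 8080}]}
client = CachedRegistryClient(remote)
first = client.get_instances("order-service")    # miss: pulls from registry
second = client.get_instances("order-service")   # hit: served from cache
```

Only the first lookup touches the registry; subsequent requests for the same service are served from the cache, which is what relieves the registry's pressure.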
Step 4: the client load balancing executor acquires a service instance registry and judges whether the service is in cluster deployment, if not, the load balancing executor directly forwards a user request to process according to server information in the service instance registry; if the cluster is deployed, load value request information is sent to the cluster according to the service instance registry.
Step 5: after each node of the micro service cluster receives the load value request information sent by the client, a corresponding server resource history record including CPU utilization rate and memory utilization rate is extracted from the information collector module process, and data is standardized through z-score.
Step 6: and taking the obtained standard sample sequence as input quantity, inputting the input quantity into an ARIMA model prediction module, respectively obtaining a predicted value of the CPU utilization rate and a predicted value of the memory utilization rate in the next second, and combining the server resource historical value and the predicted value to serve as the input quantity of the next step.
Step 7: after the weight calculation module receives the server resource value and the predicted value, the weight occupied by the CPU and the memory is calculated, the load value of the server is calculated, each service node obtains the corresponding load value, and information is sent to the load balancing executor of the client.
Step 8: the client load balancing executor receives the load value information and the node server information sent by the micro service cluster, compares the load values, selects the node server with the optimal load value, and forwards the user request to the service of the node server.
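Step 8's compare-and-select can be sketched in a few lines; the report structure and the load values below are illustrative assumptions.

```python
def pick_server(load_reports):
    """Choose the node with the smallest (optimal) load value.  Each report
    pairs a server address with the load value that node sent back."""
    return min(load_reports, key=lambda r: r["loadValue"])

# Illustrative reports received from the micro-service cluster.
reports = [
    {"server": "10.0.0.1:8080", "loadValue": 0.71},
    {"server": "10.0.0.2:8080", "loadValue": 0.58},
    {"server": "10.0.0.3:8080", "loadValue": 0.64},
]
target = pick_server(reports)      # the user request is forwarded here
```

The client-side executor forwards the user request to `target`, so no dedicated load-balancer instance sits in the request path.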
Step 9: after receiving the forwarded request, the node corresponding to the cluster server processes the user request and returns the processing result to the client.
The load balancing system can dynamically reflect the load value of each service node in the cluster in real time, reasonably distribute the user request to the corresponding server, improve the utilization rate of the cluster and reduce the waiting time of the client.
The client load balancing algorithm for the micro-service architecture presented herein is compared with a polling (round-robin) algorithm and a greedy-algorithm-based load balancing algorithm. After the algorithms were implemented, experiments were carried out in the same environment and the throughput of each algorithm was compared; the experimental environment is shown in Table 1:
Table 1 Experimental environment
Evaluation index:
Throughput is the number of requests a system can process per unit time, measured in requests per second. It generally represents the processing speed of the system and reflects overall performance: the higher the throughput per unit time, the better the system performs.
Comparison of experimental results:
The client load balancing algorithm for the micro-service architecture presented herein is compared with the polling algorithm and the greedy-algorithm-based load balancing algorithm. After the algorithms were implemented, experiments were carried out in the same environment; the throughput of each algorithm was compared and analysed. The specific results are shown in Fig. 5 and the result analysis in Table 2.
TABLE 2
From the experimental results of the three algorithms, the polling algorithm has the highest throughput when the number of concurrent requests is small in the early stage, because the greedy-algorithm-based load balancer and the client load balancing method proposed herein must collect and process machine resource information, which lowers their throughput.
When the number of concurrent requests grows beyond 500, the greedy-algorithm-based load balancer and the proposed micro-service-architecture-oriented client load balancing method show a clear advantage: with many requests, the time spent computing and predicting server load is short relative to the execution time of the polling algorithm, so good load balancing is achieved. Moreover, the proposed method reflects the changes in the server load value and the node-predicted load in real time, yielding better decisions; under a large number of requests its throughput is markedly higher than that of the other algorithms.
The foregoing is merely illustrative of specific embodiments of the present application, and the scope of the application is not limited thereto; any modifications, equivalents, improvements and alternatives within the spirit and principles of the present application that would be apparent to those skilled in the art fall within the scope of the present application.

Claims (7)

1. A client load balancing method for a micro-service architecture, the method comprising:
S1: acquiring service request information of a user; the client obtains from the registry the instance registration table of the cluster deploying the service;
S2: acquiring the historical resource-utilization information of all node servers deploying the service from the information collector, and predicting the next-second resource utilization of each corresponding node server with an ARIMA model;
S3: each node server runs the dynamic-weight load method on the historical resource utilization and the predicted next-second resource utilization to obtain its node server load value; the load values are returned to the client for comparison, and the node server with the smallest load value is selected to execute the request;
wherein S2 specifically comprises the following steps:
taking the resource utilization of the server over a period of time as the input, the ARIMA model is built as follows:
S2.1: judging whether the sample sequence is stationary; if not, differencing the sample sequence until it is stationary, determining the difference order d and obtaining the differenced time-series model;
S2.2: calculating the autocorrelation function and partial autocorrelation function of the sample sequence, and determining the order q of the MA model and the order p of the AR model by the Akaike information criterion (AIC), whose criterion function is:
AIC = n·log(σ²) + (p + q + 1)·log(n)  (1);
where n is the number of samples, σ² is the sum of squares of the fitting residuals, p is the autoregressive order, and q is the moving-average order;
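A minimal sketch of order selection under criterion (1): compute the AIC for each candidate (p, q) and keep the pair with the smallest value. The `residual_var` mapping is an assumed input, standing in for the residual sum of squares σ² of each fitted candidate model:

```python
import math

def aic(n, sigma2, p, q):
    """AIC as in equation (1): n*log(sigma^2) + (p+q+1)*log(n)."""
    return n * math.log(sigma2) + (p + q + 1) * math.log(n)

def select_order(n, residual_var):
    """Pick the (p, q) pair minimizing AIC.

    residual_var maps (p, q) -> sigma^2 of the fitted ARMA(p, q) model.
    """
    return min(residual_var, key=lambda pq: aic(n, residual_var[pq], *pq))
```

The penalty term (p + q + 1)·log(n) discourages over-parameterized models that only marginally reduce the residuals.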
S2.3: estimating the values of the parameters in the linear prediction model by the least squares method, the ARIMA model being:
x_t = φ_1·x_{t-1} + φ_2·x_{t-2} + … + φ_p·x_{t-p} + ε_t + θ_1·ε_{t-1} + … + θ_q·ε_{t-q}  (2)
where φ_i is an autoregressive coefficient; p is the autoregressive order; θ_i is a moving-average coefficient; q is the moving-average order; x_t is the load value at time t of the load-data time series; ε_t is a zero-mean white noise sequence;
the residual term is e_t = z_t − x̂_t,
where z_t is the real load value of the load data at time t and x̂_t is the predicted observation;
defining w_i as the weight of the data lagging by i orders, expressed by the square of the residual, i.e. w_i = e_i²;
constructing the weighting matrix W = diag(w_1, w_2, …, w_n);
then β̂ = (XᵀWX)⁻¹XᵀWY is the weighted least squares parameter estimate,
where X is the matrix of lagged time-series values x_i, and Y is the (n−p)×1 matrix of real values before the prediction time t;
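In the single-coefficient case the weighted least squares estimate β̂ = (XᵀWX)⁻¹XᵀWY reduces to a scalar ratio, which the following sketch illustrates (function name and inputs are illustrative, not from the patent):

```python
def wls_scalar(x, y, w):
    """Weighted least squares for a single coefficient.

    With one regressor, (X^T W X)^{-1} X^T W Y reduces to
    sum(w*x*y) / sum(w*x*x).
    """
    num = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    den = sum(wi * xi * xi for wi, xi in zip(w, x))
    return num / den
```

When y = β·x holds exactly, the estimate recovers β regardless of the residual-square weights w_i.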
the node server performs the calculation of the dynamic-weight load method as follows:
(1) each node server calculates the variation of its current CPU utilization and the variation of its memory utilization;
(2) the weight ratio occupied by the CPU and the memory is obtained from the variation of the CPU utilization and the variation of the memory utilization;
(3) finally, the load value of the node server is calculated from the current CPU utilization and memory utilization values and the computed CPU and memory weights.
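Steps (1)-(3) above can be sketched as follows; the exact weighting formula is not given in the claim, so this sketch assumes the simplest choice, weights proportional to each resource's recent variation:

```python
def load_value(cpu_hist, mem_hist):
    """Hypothetical dynamic-weight load score.

    The resource whose utilization is changing faster receives the
    larger weight; the load value is the weighted current utilization.
    """
    # step (1): variation of current CPU and memory utilization
    d_cpu = abs(cpu_hist[-1] - cpu_hist[-2])
    d_mem = abs(mem_hist[-1] - mem_hist[-2])
    # step (2): weight ratio from the variations (equal split if static)
    total = d_cpu + d_mem
    w_cpu = d_cpu / total if total else 0.5
    w_mem = 1.0 - w_cpu
    # step (3): load value from current utilizations and the weights
    return w_cpu * cpu_hist[-1] + w_mem * mem_hist[-1]
```

A node whose CPU is ramping up while memory stays flat is thus scored mainly by its CPU utilization.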
2. The client load balancing method for a micro-service architecture according to claim 1, wherein predicting the next-second resource utilization of the corresponding node with the ARIMA model comprises:
(1) using the processed data as the input data sequence, judging whether the data are stationary by the unit-root test; if the sequence is not stationary, differencing it until the unit-root test is satisfied, and determining the difference order;
(2) determining the optimal model order from the tailing and truncation properties of the sample autocorrelation function (ACF) and partial autocorrelation function (PACF), and outputting the corresponding predicted value from the sample data;
(3) converting the normalized historical resource values and predicted values of the node server's CPU utilization and memory utilization back into actual data by inverse normalization.
3. The client load balancing method for a micro-service architecture according to claim 1 or 2, wherein, before receiving the service request information sent by a user, the method further comprises:
initializing the registry: the service provider and the service caller periodically send heartbeat messages to the registry, which updates or removes changed or deactivated service instances; initializing the servers: each server sets up an independent thread to collect server resource utilization once per second and stores it in the corresponding information collector; and initializing the client load balancing executor and cache.
4. The client load balancing method for a micro-service architecture according to claim 1 or 2, wherein, after receiving the service request information sent by a user, determining whether the requested service is cluster-deployed comprises:
obtaining the registration table of the registry, searching the service instance registration table according to the service information requested by the user, and judging from it whether the service is deployed on a cluster of servers;
if no cluster deploys the service, screening out the corresponding node server from the service instance registration table to process the request sent by the user;
if a cluster deploys the service, acquiring all node servers deploying the service according to the service instance registration table and simultaneously sending them requests to collect server load values; each corresponding node server obtains the resource utilization from its information collector thread and uses the normalized data as the input of the ARIMA-model prediction, the server resources comprising the per-second CPU utilization and memory utilization of the server.
5. The client load balancing method for a micro-service architecture according to claim 1 or 2, wherein the corresponding node server obtains its historical resource-utilization information as follows:
(1) each node server starts its own resource information processing, acquiring the CPU and memory utilization values of the current server over a period of time; if the CPU or memory utilization is empty for a given second, it is filled with the average of the preceding and following seconds;
(2) if the CPU utilization and memory utilization data are correct, the node server normalizes its CPU and memory utilization sample sequences, inputs the normalized data into the improved ARIMA model for training, obtains the optimal ARIMA model, and predicts the CPU utilization and memory utilization of the next second.
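The gap-filling rule of step (1) above (a missing second is filled with the average of the neighbouring seconds) can be sketched as follows, with `None` standing in for an empty sample:

```python
def fill_gaps(samples):
    """Fill a missing (None) one-second sample with the average of
    the preceding and following seconds."""
    out = list(samples)
    for i, v in enumerate(out):
        # only interior gaps have both neighbours available
        if v is None and 0 < i < len(out) - 1:
            out[i] = (out[i - 1] + out[i + 1]) / 2
    return out
```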
6. The micro-service-architecture-oriented client load balancing method according to claim 1 or 2, wherein communication between the node server and the client is realized by an HTTP service; after the node server computes its load value, it sends the client a request carrying the node server's load value and server-related information, including the server application port number, the server IP address and the service name.
7. A system implementing the micro-service-architecture-oriented client load balancing method of any one of claims 1-6, wherein the micro-service-architecture-oriented client load balancing system comprises:
a client: for acquiring the service request information sent by a user, sending information to and receiving information from the cluster servers, and comparing load values;
a server side: for receiving user information requests, receiving the information sent by the client, recording the server resource utilization, and calculating the dynamic-weight load method.
CN202210692502.XA 2022-06-17 2022-06-17 Micro-service architecture-oriented client load balancing method and system Active CN115242797B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210692502.XA CN115242797B (en) 2022-06-17 2022-06-17 Micro-service architecture-oriented client load balancing method and system


Publications (2)

Publication Number Publication Date
CN115242797A CN115242797A (en) 2022-10-25
CN115242797B true CN115242797B (en) 2023-10-27

Family

ID=83669001

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210692502.XA Active CN115242797B (en) 2022-06-17 2022-06-17 Micro-service architecture-oriented client load balancing method and system

Country Status (1)

Country Link
CN (1) CN115242797B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116302509A (en) * 2023-02-21 2023-06-23 中船(浙江)海洋科技有限公司 Cloud server dynamic load optimization method and device based on CNN-Transformer

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109787855A (en) * 2018-12-17 2019-05-21 深圳先进技术研究院 Server Load Prediction method and system based on Markov chain and time series models
CN110149396A (en) * 2019-05-20 2019-08-20 华南理工大学 A kind of platform of internet of things construction method based on micro services framework
CN110704542A (en) * 2019-10-15 2020-01-17 南京莱斯网信技术研究院有限公司 Data dynamic partitioning system based on node load
CN111488200A (en) * 2020-06-28 2020-08-04 四川新网银行股份有限公司 Virtual machine resource utilization rate analysis method based on dynamic analysis model
CN113110933A (en) * 2021-03-11 2021-07-13 浙江工业大学 System with Nginx load balancing technology
CN113377544A (en) * 2021-07-06 2021-09-10 哈尔滨理工大学 Web cluster load balancing method based on load data dynamic update rate

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
CN102232282B (en) * 2010-10-29 2014-03-26 华为技术有限公司 Method and apparatus for realizing load balance of resources in data center
TWI725744B (en) * 2020-02-19 2021-04-21 先智雲端數據股份有限公司 Method for establishing system resource prediction and resource management model through multi-layer correlations


Non-Patent Citations (2)

Title
Rodrigo N. Calheiros. Workload Prediction Using ARIMA Model and Its Impact on Cloud Applications' QoS. IEEE Transactions on Cloud Computing, 2014. *
Li Huibin; He Lili. Dynamic weight load balancing algorithm based on a prediction threshold. Software Guide (软件导刊), 2020, No. 06. *


Similar Documents

Publication Publication Date Title
CN109918198B (en) Simulation cloud platform load scheduling system and method based on user characteristic prediction
CN109714400B (en) Container cluster-oriented energy consumption optimization resource scheduling system and method thereof
Ou et al. An adaptive multi-constraint partitioning algorithm for offloading in pervasive systems
US20040243915A1 (en) Autonomic failover of grid-based services
CN112783649A (en) Cloud computing-oriented interactive perception containerized micro-service resource scheduling method
CN103516807A (en) Cloud computing platform server load balancing system and method
CN112019620B (en) Web cluster load balancing method and system based on Nginx dynamic weighting
US9501326B2 (en) Processing control system, processing control method, and processing control program
CN115242797B (en) Micro-service architecture-oriented client load balancing method and system
Nastic et al. Polaris scheduler: Edge sensitive and slo aware workload scheduling in cloud-edge-iot clusters
CN111752678A (en) Low-power-consumption container placement method for distributed collaborative learning in edge computing
CN109783235A (en) A kind of load equilibration scheduling method based on principle of maximum entropy
CN115914392A (en) Computing power network resource scheduling method and system
CN114666335A (en) DDS-based distributed system load balancing device
CN112130927B (en) Reliability-enhanced mobile edge computing task unloading method
CN110471761A (en) Control method, user equipment, storage medium and the device of server
Pan et al. Sustainable serverless computing with cold-start optimization and automatic workflow resource scheduling
Garg et al. Optimal virtual machine scheduling in virtualized cloud environment using VIKOR method
CN111367632B (en) Container cloud scheduling method based on periodic characteristics
Mehta et al. A modified delay strategy for dynamic load balancing in cluster and grid environment
CN116755872A (en) TOPSIS-based containerized streaming media service dynamic loading system and method
CN116893900A (en) Cluster computing pressure load balancing method, system, equipment and IC design platform
CN106210120B (en) A kind of recommended method and its device of server
CN110704159B (en) Integrated cloud operating system based on OpenStack
Mohamed et al. A study of an adaptive replication framework for orchestrated composite web services

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant