CN115714775A

CN115714775A - Load balancing method and device

Info

Publication number: CN115714775A
Application number: CN202211256646.7A
Authority: CN
Inventors: 江河清; 刘军; 郭浩; 罗毅
Original assignee: Alibaba China Co Ltd
Current assignee: Alibaba China Co Ltd
Priority date: 2022-10-14
Filing date: 2022-10-14
Publication date: 2023-02-24

Abstract

The application discloses a load balancing method and device, relating to the technical field of cloud computing. The method comprises: acquiring parameter information of servers; determining the real-time weight of each server, and pre-grouping the servers by weight according to the real-time weights; when a call request is received, determining the number of available addresses in each weight group by using a dynamic routing result and the pre-grouping result; generating a weight calculation result by using the real-time weight of each weight group and the number of available addresses corresponding to that real-time weight; and determining the target call address of the call request according to the weight calculation result. Because of the pre-grouping, a large amount of repeated work in the weight calculation is moved to asynchronous background computation, and the costly per-request work of traversing addresses and fetching their parameters is eliminated, so that the method runs with very low performance loss even when the cluster is very large. Generating the weight calculation result from the dynamic routing result together with the pre-grouping result supports preheating calculation in large-scale deployment scenarios.

Description

Load balancing method and device

Technical Field

The application relates to the field of cloud computing, in particular to a load balancing method and device.

Background

This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.

Generally, in a remote call framework under a microservice architecture, a service is usually deployed on more than one machine, so one address must be selected from multiple addresses when a remote call is initiated. As shown in FIG. 2, address screening can be broadly divided into two stages: route screening and load balancing. The route screening stage screens out all available target addresses; for example, in an international deployment, compliance requirements may allow only servers in the same region to be called. The load balancing stage selects one address from the available target addresses to make the call. Common load balancing algorithms include the random algorithm, the weight-based random algorithm, and the consistent hashing algorithm. The weight-based random algorithm selects a server address through weight calculation.

For most applications, stalls may occur for a period of time after startup, caused by connection pool initialization, Java's C1 and C2 JIT compilation, and the like. The purpose of preheating (warming up) an application is to reduce the impact of these stalls on the service. Application preheating is built on the weight calculation in the load balancing rule: the weight of an application is reduced for a period of time after it starts, so that only a small amount of traffic enters the machine to warm up its environment. However, in an actual deployment, the weight calculation is performed on every request with time complexity O(N), which causes serious performance degradation when the server deployment is large.

Disclosure of Invention

The embodiments of the application provide a load balancing method and device, which at least solve the prior-art problem that, in a preheating scenario, a large amount of computing resources is consumed at run time when the number of servers and the request volume are large.

According to an aspect of the present application, there is provided a load balancing method, including: acquiring parameter information of servers, the parameter information comprising the address, the weight threshold, and the preheating duration of each server; determining the real-time weight of each server based on the weight threshold, the preheating duration, and the preheated duration, and pre-grouping the servers by weight according to the real-time weights to obtain a pre-grouping result, the pre-grouping result comprising a plurality of weight groups, each weight group comprising the addresses of one or more servers with the same real-time weight; when a call request is received, determining the number of available addresses in each weight group by using a dynamic routing result and the pre-grouping result, the dynamic routing result comprising one or more available addresses; generating a weight calculation result by using the real-time weight of each weight group and the number of available addresses corresponding to that real-time weight; and determining the target call address of the call request according to the weight calculation result.

According to another aspect of the application, there is also provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the above-mentioned method steps when executing the computer program.

According to another aspect of the application, there is also provided a computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the above-mentioned method steps.

According to another aspect of the application, there is also provided a computer program product comprising a computer program which, when executed by a processor, carries out the above-mentioned method steps.

In the embodiments of the application, parameter information of servers is first acquired, the parameter information comprising the address, the weight threshold, and the preheating duration of each server. The real-time weight of each server is then determined based on the weight threshold, the preheating duration, and the preheated duration, and the servers are pre-grouped by weight according to the real-time weights to obtain a pre-grouping result; the pre-grouping result comprises a plurality of weight groups, and each weight group comprises the addresses of one or more servers with the same real-time weight. When a call request is received, the number of available addresses in each weight group is determined by using a dynamic routing result and the pre-grouping result, the dynamic routing result comprising one or more available addresses. A weight calculation result is generated by using the real-time weight of each weight group and the number of available addresses corresponding to that real-time weight. Finally, the target call address of the call request is determined according to the weight calculation result. In the application, the addresses are pre-grouped using the weight threshold and the preheating duration in the parameter information of the servers. With pre-grouping, a large amount of repeated work in the weight calculation can be moved to asynchronous background computation, and the costly per-request work of traversing addresses and fetching their parameters is eliminated. The method can therefore run with very low performance loss even when the cluster is very large, and the amount of computation at call time is greatly reduced. After the pre-grouping result is obtained, the weight calculation result is generated by combining the dynamic routing result with the pre-grouping result, which achieves load balancing and supports preheating calculation in large-scale deployment scenarios.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. In the drawings:

FIG. 1 is a flow diagram of a method of load balancing according to an embodiment of the present application;

FIG. 2 is a schematic diagram of an address screening process according to an embodiment of the application;

FIG. 3 is a schematic diagram of the computational logic of a weight-based random algorithm in a warm-up scenario according to an embodiment of the present application;

FIG. 4 is an exemplary graph of weights computed at different times in a warm-up scenario according to an embodiment of the present application;

FIG. 5 is a schematic diagram illustrating an implementation process of a traversal-based addressing scheme according to an embodiment of the present application;

FIG. 6 is a diagram illustrating a manner of searching for an address based on a binary tree according to an embodiment of the present application;

FIG. 7 is a diagram illustrating pre-grouping results according to an embodiment of the present application;

FIG. 8 is a schematic diagram of a weight calculation process according to an embodiment of the present application;

FIG. 9 is a schematic diagram of a memory structure according to an embodiment of the present application.

Detailed Description

It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.

It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than presented herein.

In the prior art, weight calculation is widely used in products related to network transmission. In the field of cloud-native microservices, for example, brpc (also called ***-RPC, a remote procedure call framework developed by Baidu), gRPC (a high-performance, general-purpose open-source RPC framework developed by ***), and Envoy (an open-source edge and service proxy for cloud-native applications) all have weight-based addressing logic. The common weight-based addressing provided by brpc, gRPC, and Envoy must traverse the weight information of every target each time a request is initiated in order to perform the calculation. Once the number of target addresses reaches a certain magnitude, the performance loss of each request becomes extremely high and occupies a large share of resources.

FIG. 5 is a schematic diagram of the implementation process of a traversal-based addressing scheme according to an embodiment of the present application. The process can be divided into three stages: parameter reading, single weight calculation, and weight screening. Parameter reading means reading the parameter information required for the calculation from the parameter list of the server. Single weight calculation means calculating the weight of the current address from the parameters of that single address; in a preheating scenario, this calculation must take into account how long the server has been running since startup. Weight screening means aggregating the weight information of all addresses and selecting a server based on a random number. FIG. 3 shows the computational logic of the weight-based random algorithm in a warm-up scenario according to an embodiment of the present application: parameters such as the maximum weight and the warm-up time are first obtained from the parameter list of the server, the latest weight is then calculated based on the current time, and finally a server is selected based on a random number. FIG. 4 shows example weights calculated at different times in a warm-up scenario according to an embodiment of the present application, including the weight calculation results at warm-up times of 10 seconds, 50 seconds, and 100 seconds. FIG. 3 and FIG. 4 illustrate the implementation steps of the traversal-based addressing scheme.

The traversal-based addressing scheme must perform the calculation over all addresses on every call request, so its time complexity is O(N), where N is the number of server addresses. When N and the request volume are large, it consumes a large amount of computing resources at run time, which prevents the weight-based preheating load balancing scheme from being widely adopted.
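
The three stages of the traversal-based scheme can be sketched as follows. This is a minimal illustration, not the implementation of any of the frameworks named above: the dictionary keys (`address`, `weight`, `warmup`, `start_time`), the linear warm-up ramp, and the floor of 1 on the weight are all assumptions made for the example.

```python
import random

def warmup_weight(max_weight, warmup_seconds, uptime_seconds):
    """Assumed linear ramp: the weight grows with uptime up to max_weight."""
    if warmup_seconds <= 0 or uptime_seconds >= warmup_seconds:
        return max_weight
    # Assumed floor of 1 so a freshly started server still gets some traffic.
    return max(1, int(max_weight * uptime_seconds / warmup_seconds))

def select_by_traversal(servers, now, rng=random):
    """Pick one address; every call touches all N servers, hence O(N)."""
    # Parameter reading and single weight calculation, once per address.
    weights = [
        warmup_weight(s["weight"], s["warmup"], now - s["start_time"])
        for s in servers
    ]
    total = sum(weights)
    # Weight screening: walk the list until the random offset is used up.
    offset = rng.randrange(total)
    for server, weight in zip(servers, weights):
        if offset < weight:
            return server["address"]
        offset -= weight
    raise RuntimeError("unreachable when total > 0")
```

Both the list comprehension and the final loop visit every address, which is the per-request O(N) cost the paragraph above describes.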

In the prior art, brpc provides a locality-aware load balancing (LALB) algorithm, which uses a binary tree search in weight screening and thereby optimizes the performance of a single call. However, because the weight information is calculated dynamically, the binary tree must be continually rebuilt and maintained, and the LALB algorithm does not support dynamic routing (that is, dynamically removing some addresses before load balancing).

FIG. 6 is a schematic diagram of a binary-tree-based address lookup according to an embodiment of the present application. Compared with the traversal-based addressing scheme above, it separates weight calculation from weight screening: the weight of each address is calculated in advance and recorded in a binary tree, and each node of the tree records the sum of the weights of its left subtree, so that a lookup can be completed in O(log N) time at call time. However, this scheme has a major limitation: it cannot work together with dynamic routing. Because the binary tree is precomputed, the left-subtree weight sum of each node is fixed, and nodes cannot be dynamically removed at run time, so the scheme is unsuitable for complex scenarios.
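
The idea of precomputing weights so that a lookup takes O(log N) can be illustrated with a simpler stand-in for the binary tree: a sorted array of cumulative weights searched with Python's `bisect` module. This is not the tree of left-subtree sums shown in FIG. 6, only an equivalent lookup for illustration; the addresses and weights below are hypothetical.

```python
import bisect
import itertools
import random

def build_prefix_sums(weights):
    """Precompute cumulative weights once, ahead of any call."""
    return list(itertools.accumulate(weights))

def lookup(addresses, prefix_sums, offset):
    """Find the address that owns `offset` by binary search, in O(log N)."""
    return addresses[bisect.bisect_right(prefix_sums, offset)]

addresses = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]   # hypothetical addresses
prefix_sums = build_prefix_sums([20, 10, 5])        # [20, 30, 35]
picked = lookup(addresses, prefix_sums, random.randrange(prefix_sums[-1]))
```

The sketch also shows the limitation: if routing removes "10.0.0.2" for one request, the stored sums 30 and 35 no longer describe the remaining addresses, and the structure would have to be rebuilt before it could be searched.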

The following embodiments observe that traversing addresses to fetch parameters at run time involves a large number of repeated operations. To optimize performance, part of the repeated work in the weight calculation can be done in advance: after the addresses are fetched once, the address information is pre-grouped and its key information is stored, and subsequent calculation is based on the resulting pre-grouping result. This avoids calculating over all addresses on every call request, so that weight screening can be performed with very low performance loss even when the cluster is very large. In addition, in the following embodiments the load balancing calculation is performed in combination with the dynamic routing result, so dynamic routing is supported naturally.

First, the terms involved are explained.

Address: the address of an RPC (Remote Procedure Call) service provider.

Routing: screening the addresses of service providers according to certain rules.

Load balancing: selecting an optimal address from a plurality of addresses through a certain algorithm, so that the resources of all addresses are used to the fullest once the request volume reaches a certain level.

This embodiment provides a load balancing method. FIG. 1 is a flowchart of a load balancing method according to an embodiment of the present application; the method steps in FIG. 1 are described below.

Step S102, acquiring parameter information of servers; the parameter information comprises the address, the weight threshold, and the preheating duration of each server.

In this step, each server registers the parameter information relevant to calls with the registry in advance, and the consumer obtains the parameter information of the servers from the registry. The parameter information includes the address, the weight threshold, and the preheating duration of the server. The server side may comprise multiple machines, each corresponding to one set of parameter information. The weight threshold describes the performance of the machine corresponding to that set of parameters; a machine with a higher weight threshold can take on a larger amount of work. Referring to FIG. 7, the weight threshold (weight in the figure) of the machine at the first full address 10.0.0.1 is 100, and the weight threshold of the machine at the full address 10.0.0.3 is 50, which indicates that the machine at 10.0.0.1 has stronger performance and computing power than the machine at 10.0.0.3 and can take on relatively more tasks. The preheating duration describes how long the machine takes to complete preheating; for example, referring to FIG. 7, the preheating duration of the machine at the first full address 10.0.0.1 (denoted by warmup Time in the figure) is 50. The address indicates how the RPC service provider can be accessed.

Step S104, determining the real-time weight of each server based on the weight threshold, the preheating duration, and the preheated duration, and pre-grouping the servers by weight according to the real-time weights to obtain a pre-grouping result; the pre-grouping result comprises a plurality of weight groups; each weight group comprises the addresses of one or more servers with the same real-time weight.

In this step, the preheating duration is the time required for the preheating process from start to finish, and the preheated duration is the time elapsed from the start of preheating to the present. The progress of preheating is given by the proportion of the preheating duration that the preheated duration accounts for. The real-time weight of a server changes as preheating progresses and reaches the weight threshold when preheating is finished. The real-time weight of each server is therefore determined from the weight threshold, the preheating duration, and the preheated duration, and it describes how much of its full capacity the server can take on during the preheating stage. After the addresses of the servers are obtained, they are grouped according to the real-time weight of each address, and the resulting pre-grouping result is recorded. The pre-grouping result comprises a plurality of weight groups; each weight group comprises the addresses of one or more servers with the same real-time weight.

In the preheating scenario, weight pre-grouping must be performed continuously while machines are starting up and preheating. The pre-grouping stage can be maintained by a separate background thread. When the server deployment is large, pre-grouping moves more than half of the repeated calculation work to asynchronous background computation, which reduces the amount of calculation and improves processing efficiency. After the parameter information of the servers has been acquired once, the real-time weights are calculated from the acquired parameters to obtain the pre-grouping result, which is then used in the subsequent weight calculation step. Pre-calculating the preheating weights avoids traversing the addresses repeatedly at run time, reduces how often the consumer fetches addresses, and greatly reduces the amount of computation at call time.

Step S106, when receiving the calling request, determining the number of available addresses in each weight group by using the dynamic routing result and the pre-grouping result; the dynamic routing result includes one or more available addresses.

In this step, when a call request is received, allocation of a target call address for the call request begins. Before load balancing, the address information of the servers is screened through dynamic routing to obtain a dynamic routing result. Addresses that are unavailable or that do not meet the usage conditions are removed, so the dynamic routing result comprises one or more available addresses. In the pre-grouping result, all server addresses have been grouped into a plurality of weight groups; based on the dynamic routing result, the unavailable addresses in each weight group are removed, which gives the number of available addresses in each weight group.

In the above step, dynamic routing removes some unavailable addresses, and the addresses in each weight group of the pre-grouping result are then checked against the dynamic routing result, so dynamic routing is supported naturally. This step identifies the available addresses quickly and reduces the O(N) complexity of the traversal-based calculation scheme to nearly O(1), where N is the number of server machines, so that weight screening can be performed with very low performance loss even when the cluster is very large.
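
As a rough sketch of step S106, the count of available addresses per weight group is the size of the intersection between each group and the dynamic routing result. Python sets are used here only for readability; they do not by themselves give the near-O(1) cost, which the description attributes to the bitwise comparison introduced in the optional embodiment further on. The addresses and group contents are hypothetical.

```python
def count_available(pre_groups, routed_addresses):
    """Return {real-time weight: number of addresses that survive routing}."""
    routed = set(routed_addresses)
    return {weight: len(group & routed) for weight, group in pre_groups.items()}

# Hypothetical pre-grouping result: real-time weight -> addresses in the group.
pre_groups = {
    20: {"10.0.0.1", "10.0.0.2", "10.0.0.5"},
    10: {"10.0.0.3"},
    5: {"10.0.0.4"},
}
# Hypothetical dynamic routing result for one call request.
routed = ["10.0.0.1", "10.0.0.4", "10.0.0.5"]

counts = count_available(pre_groups, routed)
```

Because the groups are built ahead of time, the per-request work is limited to this comparison; no weight is recomputed when the call arrives.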

Step S108, generating a weight calculation result by using the real-time weight of the weight group and the number of available addresses corresponding to the real-time weight.

In this step, since each weight group contains one or more addresses with the same real-time weight, a group weight can be generated for each weight group from the real-time weight of the group and the number of available addresses in the group, and the group weights of all weight groups are summed to obtain the weight calculation result.

In the above steps, the weight calculation result is generated by using the real-time weight of the weight group and the number of available addresses corresponding to the real-time weight, so that complete preheating weight calculation for all the servers is avoided, and the weight calculation efficiency in a preheating scene is improved.

In a specific implementation, referring to the schematic diagram of the weight calculation process shown in FIG. 8, the number of available addresses in the weight group with weight 20 is 2, so the group weight of that group is calculated as 40 from the weight 20 and the number of available addresses 2. The total weight of each group is calculated in the same way. As shown in FIG. 8, the dynamic routing result contains 3 addresses; after comparison, 2 addresses remain in the weight-20 group, none remain in the weight-10 group, and 1 remains in the weight-5 group. The total weights of the groups are therefore 40, 0, and 5, respectively, and the total weight of the dynamic routing result is 45; that is, the weight calculation result is 45.
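
The arithmetic of the FIG. 8 example can be reproduced in a few lines. The function name and the dictionary layout are illustrative assumptions; the numbers are the ones given in the paragraph above.

```python
def weight_calculation(available_counts):
    """Multiply each group's real-time weight by its available-address count."""
    group_totals = {w: w * n for w, n in available_counts.items()}
    return group_totals, sum(group_totals.values())

# Available-address counts per weight group, as in the FIG. 8 example.
group_totals, total = weight_calculation({20: 2, 10: 0, 5: 1})
print(group_totals, total)
```

The work grows with the number of distinct weight groups rather than with the number of addresses, which is why the result stays cheap for large clusters.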

Step S110, determining the target call address of the call request according to the weight calculation result.

In this step, the weight calculation result is generated from the pre-grouping result, and the pre-grouping result is in turn obtained from the weight threshold, the preheating duration, and the preheated duration. The target call address obtained from the weight calculation result can therefore be applied in a preheating scenario and supports preheating calculation.
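
This passage does not spell out how the target address is derived from the weight calculation result. One plausible reading, in line with the weight-based random algorithm discussed earlier, is to draw a random offset below the total weight, find the weight group that the offset falls into, and take the corresponding address inside that group. The sketch below follows that assumption; the function name, the data layout, and the in-group indexing are all illustrative.

```python
import random

def pick_target(available_by_weight, rng=random):
    """available_by_weight: {real-time weight: list of available addresses}."""
    total = sum(w * len(addrs) for w, addrs in available_by_weight.items())
    if total == 0:
        return None  # nothing available after dynamic routing
    offset = rng.randrange(total)
    for weight, addrs in available_by_weight.items():
        span = weight * len(addrs)       # this group's share of the total
        if offset < span:
            # Every address in the group has the same weight, so the
            # position inside the group follows from integer division.
            return addrs[offset // weight]
        offset -= span
    return None

# Hypothetical available addresses per weight group (total weight 45).
available = {20: ["10.0.0.1", "10.0.0.5"], 10: [], 5: ["10.0.0.4"]}
target = pick_target(available)
```

Under this assumption, an address in the weight-20 group is four times as likely to be chosen as the address in the weight-5 group, so a server that is still preheating receives proportionally less traffic.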

It should be noted that the weight-based random algorithm can be extended to implement server preheating. The aim is to admit only a small amount of traffic by lowering the weight when a server has just started, and to admit full traffic once the work that happens while the server is running (such as connection pool initialization and Java C2 compilation) has completed, thereby reducing the impact of startup stalls.

The application provides a load balancing method in which parameter information of servers is first acquired, the parameter information comprising the address, the weight threshold, and the preheating duration of each server. The real-time weight of each server is then determined based on the weight threshold, the preheating duration, and the preheated duration, and the servers are pre-grouped by weight according to the real-time weights to obtain a pre-grouping result; the pre-grouping result comprises a plurality of weight groups, and each weight group comprises the addresses of one or more servers with the same real-time weight. When a call request is received, the number of available addresses in each weight group is determined by using a dynamic routing result and the pre-grouping result, the dynamic routing result comprising one or more available addresses. A weight calculation result is generated by using the real-time weight of each weight group and the number of available addresses corresponding to that real-time weight. Finally, the target call address of the call request is determined according to the weight calculation result. In the application, the addresses are pre-grouped using the weight threshold and the preheating duration in the parameter information of the servers. With pre-grouping, a large amount of repeated work in the weight calculation can be moved to asynchronous background computation, and the costly per-request work of traversing addresses and fetching their parameters is eliminated. The method can therefore run with very low performance loss even when the cluster is very large, and the amount of computation at call time is greatly reduced. After the pre-grouping result is obtained, the weight calculation result is generated by combining the dynamic routing result with the pre-grouping result, which achieves load balancing and supports preheating calculation in large-scale deployment scenarios.

To further improve the efficiency of weight calculation, as an optional embodiment, determining the number of available addresses in each weight group by using the dynamic routing result and the pre-grouping result is performed according to the following steps:

if the same address exists in both the dynamic routing result and a weight group, taking that address as an available address; and counting the number of available addresses in each weight group.

In this step, the dynamic routing result is compared with each weight group in turn; addresses present in both are taken as available addresses, addresses in a weight group that are absent from the dynamic routing result are removed, and the number of available addresses in each weight group is then counted.

In a specific implementation of this step, referring for example to the schematic diagram of the weight calculation process shown in FIG. 8, the dynamic routing result is compared with the weight group with weight 20, and the addresses in the dashed box with weight 20 in the figure are selected. At call time, the number of addresses currently remaining in each group, that is, the number of available addresses, can be obtained quickly by comparing the dynamic routing result with each weight group.

In an optional embodiment, taking the same address as an available address if it exists in both the dynamic routing result and the weight group may be performed according to the following steps:

sorting the addresses of the servers to obtain sorting information; according to the sorting information, marking with binary bits whether each server address is present in the dynamic routing result and in each weight group; and if the binary values of the marked dynamic routing result and a marked weight group at the same sorting position are the same, taking the address at that sorting position in the weight group as an available address.

In this step, the addresses of the servers are sorted according to a preset sorting rule, for example by the sequence number of the address or in a randomly assigned order, and the resulting sorting information is recorded. The dynamic routing result is then marked with binary bits in the order given by the sorting information to obtain a binary marking result; for example, an address that is present is marked "1" and an address that is absent is marked "0". The addresses in each weight group are marked with binary bits in the same order and in the same way, which is not repeated here. Finally, the marked dynamic routing result and each marked weight group are compared position by position to obtain the available addresses: if the binary values of the marked dynamic routing result and the marked weight group at the same sorting position are the same, the address at that sorting position in the weight group is taken as an available address. In an optional embodiment, this comparison includes performing a bitwise AND operation on the marked dynamic routing result and each marked weight group to determine whether the binary values at the same sorting position are the same. With a bitwise AND, a position yields 1 only when the address is marked as present in both.
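
The marking step can be sketched as follows, assuming the sorted address list is shared by the dynamic routing result and every weight group. The helper name `to_bitmask` and the addresses are hypothetical; the first address in the sorted order is mapped to the leftmost bit so that the printed value reads in the same order as the sorting information.

```python
def to_bitmask(sorted_addresses, present):
    """Mark each address in the agreed order: 1 if present, 0 if absent."""
    present = set(present)
    width = len(sorted_addresses)
    mask = 0
    for position, address in enumerate(sorted_addresses):
        if address in present:
            # Position 0 is stored in the most significant of `width` bits.
            mask |= 1 << (width - 1 - position)
    return mask

# Hypothetical sorted server addresses (the sorting information).
sorted_addresses = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"]

routing_mask = to_bitmask(sorted_addresses, ["10.0.0.1", "10.0.0.4", "10.0.0.5"])
group_mask = to_bitmask(sorted_addresses, ["10.0.0.1", "10.0.0.2", "10.0.0.5"])
both = routing_mask & group_mask   # addresses present in both markings
print(format(routing_mask, "05b"), format(group_mask, "05b"), format(both, "05b"))
```

Building a mask still visits each address once, but for the weight groups this can happen in the background pre-grouping stage, leaving only the AND operation for call time.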

In this step, the addresses actually available in each group are obtained at run time by a fast bitwise comparison, and dynamic routing is supported naturally. Because the embodiment compares with bit operations, it avoids traversing every address: obtaining the total weight no longer requires visiting each node but reduces to simple low-level CPU AND instructions, so the actual time cost is close to O(1) and is little affected by cluster size, which supports preheating calculation in large-scale deployment scenarios.

In a specific implementation of this step, refer to the schematic diagram of the storage structure shown in FIG. 9. In the present application, whether each address is present is marked with a binary bit. With 5 addresses, for example, a binary storage value of 00100 means that only the address at the 3rd position is present and all the others have been filtered out.

Taking fig. 9 as an example, the dynamic routing result contains three addresses, corresponding to the 1st, 4th and 5th bits, so its binary storage bits are 10011. The group cache with weight 20 contains three addresses, corresponding to the 1st, 2nd and 5th bits, so its binary storage bits are 11001. Similarly, the weight-10 group stores the value 00100, indicating that the 3rd address is present in that group, and the weight-5 group stores the value 00010, indicating that the 4th address is present in that group.

After each group of addresses is represented by binary storage bits, the intersection of two result sets can be obtained quickly through a binary AND operation. For example, ANDing 10011 with 11001 gives 10001, which indicates that the 1st and 5th addresses are present in both. By performing a bitwise AND operation on the binary value of each group and the dynamic routing result, the addresses remaining in each group under the current dynamic routing result are obtained, and from them the number of addresses actually present under each weight group.
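The fig. 9 values can be reproduced with a short Python sketch. The function name `count_available` and the string layout (the leftmost character stands for the 1st address) are illustrative assumptions and are not taken from the application.

```python
def count_available(route_bits, group_bits):
    """AND two equal-length bit strings; return the result and its number of 1 bits."""
    width = len(route_bits)
    both = int(route_bits, 2) & int(group_bits, 2)  # a single bitwise AND
    return format(both, f"0{width}b"), bin(both).count("1")

route = "10011"  # marked dynamic routing result from fig. 9
groups = {20: "11001", 10: "00100", 5: "00010"}  # weight -> marked weight group

# For every weight group: (remaining addresses as bits, number of available addresses)
remaining = {w: count_available(route, bits) for w, bits in groups.items()}
```

Under these assumptions, the weight-20 group keeps two addresses (10001), the weight-10 group keeps none (00000), and the weight-5 group keeps one (00010).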

After the number of addresses is obtained, the total weight value of each group can be obtained from the weight attribute of the group, and the total weight value over all groups is then obtained.

The weight pre-grouping stage mainly calculates the real-time weight of each address, so as to obtain weight information that is updated as the device warms up. As an optional implementation, determining the real-time weight of each server based on the weight threshold, the preheating duration and the preheated duration, and performing weight pre-grouping on each server according to the real-time weight, is performed according to the following steps: calculating a weight coefficient according to the preheating duration and the preheated duration; calculating the real-time weight of each address of the server by using the weight coefficient and the weight threshold; and adding the addresses with the same real-time weight to the same weight group.

In this step, the preheated duration indicates how far the preheating stage has progressed. The weight coefficient is calculated according to the preheating duration and the preheated duration; the specific calculation method can be selected according to actual requirements and is not specifically limited here. In implementation, the ratio of the preheated duration to the preheating duration may be used as the weight coefficient. Note that the value of the weight coefficient does not exceed 1. The real-time weight of each address of the server is then calculated by using the weight coefficient and the weight threshold, and the addresses with the same real-time weight are added to the same weight group, yielding a plurality of weight groups.
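The pre-grouping steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the weight coefficient is taken as the ratio of the elapsed (preheated) duration to the configured preheating duration, capped at 1, the product is truncated to an integer, and the function names and sample addresses are hypothetical.

```python
from collections import defaultdict

def real_time_weight(weight_threshold, preheating, preheated):
    """Scale the weight threshold by the warm-up progress, capped at 1."""
    if preheating <= 0:
        return weight_threshold  # assumption: no warm-up configured, full weight
    coefficient = min(1.0, preheated / preheating)
    return int(weight_threshold * coefficient)  # truncation is an assumption

def pre_group(servers):
    """Map each real-time weight to the addresses that currently have it."""
    groups = defaultdict(list)
    for address, threshold, preheating, preheated in servers:
        groups[real_time_weight(threshold, preheating, preheated)].append(address)
    return dict(groups)

# Hypothetical servers: (address, weight threshold, preheating s, preheated s)
servers = [
    ("10.0.0.1", 20, 10, 12),
    ("10.0.0.2", 20, 10, 10),
    ("10.0.0.3", 20, 10, 5),
    ("10.0.0.4", 20, 10, 2.5),
    ("10.0.0.5", 20, 10, 10),
]
groups = pre_group(servers)
```

With these sample values, three addresses fall into the weight-20 group and one each into the weight-10 and weight-5 groups. In a deployment, a background task would presumably rerun `pre_group` periodically while servers are warming up.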

For a specific implementation of this step, refer to the diagram of pre-grouping results according to an embodiment of the present application shown in fig. 7. Fig. 7 shows the weight distribution when the preheating time is 10s: 3 addresses have a weight calculation result of 20, 1 address has a weight calculation result of 10, and 1 address has a weight calculation result of 5. This step adopts weight pre-grouping and moves more than half of the repeated weight-calculation work to background asynchronous calculation, which helps remove the run-time hot spot of traversing addresses to fetch parameters and effectively reduces the amount of calculation.

In the embodiment of the application, in a preheating scenario the weight pre-grouping action needs to be executed continuously while machines start up and warm up. For a large-scale deployed cluster, however, the weight of a single machine does not need to be particularly accurate, so weight pre-calculation need not run frequently and its timeliness requirement is low. (By contrast, the LALB algorithm in bRPC uses a small time window and its cluster traffic scheduling depends on the precision of the weights, so it cannot be deployed when the cluster is large.) Therefore, to further reduce performance loss, as an optional implementation, the method may further perform the following step:

when the preheated duration of each server is greater than the preheating duration, if a call request is received, generating a weight calculation result based on the dynamic routing result by using a random algorithm.

In this step, if the preheated duration of each server is greater than the preheating duration, the preheating stage is finished. After the preheating stage is finished, the background calculation thread finds that the weights of all addresses are consistent, that is, the real-time weights of all addresses have reached the weight threshold and no longer change. The embodiment of the present application then automatically degenerates to an ordinary random algorithm: the grouping result is no longer compared at call time, and the weight calculation is performed directly based on the dynamic routing result and the random algorithm, which further reduces performance loss. By periodically checking the weights, the embodiment quickly skips the weight calculation logic when all weights are the same, thereby avoiding unnecessary calculation.
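A minimal Python sketch of this fast path is given below. It assumes that "all weights are consistent" can be detected as "the pre-grouping result holds at most one weight group"; the names `all_weights_equal` and `select`, and the `weighted_select` callback that stands in for the full weighted path, are hypothetical.

```python
import random

def all_weights_equal(groups):
    """Background check: at most one weight group means every address has the same weight."""
    return len(groups) <= 1

def select(routed, groups, weighted_select, rng=random):
    """Pick a target address from the dynamic routing result `routed`."""
    if all_weights_equal(groups):
        # Warm-up finished: skip grouping comparison, plain random choice.
        return rng.choice(routed)
    # Still warming up: fall back to the weighted calculation.
    return weighted_select(routed, groups)

# Hypothetical data: every address has already reached weight 20.
warmed = {20: ["a1", "a2", "a3"]}
picked = select(["a1", "a3"], warmed, weighted_select=None)
```

Passing the weighted path in as a callback keeps the sketch independent of how the weighted selection itself is implemented.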

After the weight calculation result is obtained, as an optional implementation, determining a target call address of the call request according to the weight calculation result is performed according to the following steps:

generating a weight value range by using the weight calculation result; determining a target weight within the weight value range by using a target random algorithm; and determining a target calling address of the calling request according to the real-time weight and the target weight in the pre-grouping result.

In this step, a weight value range is generated by using the weight calculation result: the weight calculation result may be used as one end point of the range and the value 0 as the other end point. After the weight value range is determined, a target weight is determined within it by using a target random algorithm. For example, when the weight calculation result is 45, a random number in [0, 45] may be drawn as the target weight selected this time. A weight reduction mechanism may then be used to find the group corresponding to that weight and, from it, the corresponding address information; that is, the target call address of the call request is determined according to the real-time weights in the pre-grouping result and the target weight.

In order to further improve the address screening efficiency, as an optional implementation, determining the target call address of the call request according to the real-time weight in the pre-grouping result and the target weight is performed according to the following steps:

sorting the address information of the server according to the real-time weight in the pre-grouping result to obtain an address sorting result; summing the real-time weights of one or more addresses of the server according to the address sorting result to obtain a summation result; and when the summation result is not less than the target weight, determining the address information corresponding to the summation result as the target call address.

In this step, because the real-time weights of all the addresses are recorded by group in the pre-grouping result, the real-time weights are accumulated in the order of the sorting result. When the summation result is not less than the target weight, the address whose weight was most recently added to the summation result is determined as the target call address.
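The cumulative selection can be sketched in Python as follows. The descending sort order, the function name `pick_by_cumulative_weight` and the sample addresses are assumptions made for illustration; the application does not fix them.

```python
def pick_by_cumulative_weight(weighted_addresses, target):
    """Accumulate weights in sorted order and return the first address
    at which the running sum is not less than the target weight."""
    # Assumption: sort by real-time weight, heaviest first (sorted() is stable).
    ordered = sorted(weighted_addresses, key=lambda item: item[1], reverse=True)
    running = 0
    for address, weight in ordered:
        running += weight
        if running >= target:
            return address
    raise ValueError("target weight exceeds the total weight")

# Hypothetical available addresses with real-time weights (total 45).
available = [("addr1", 20), ("addr5", 20), ("addr4", 5)]
chosen = pick_by_cumulative_weight(available, 30)
```

With a target weight of 30, the running sum passes 20 after `addr1` and reaches 40 at `addr5`, so `addr5` is selected.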

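As another optional implementation, determining the target call address of the call request according to the real-time weight in the pre-grouping result and the target weight is performed according to the following steps: recording the real-time weight of each address in the pre-grouping result by using a binary tree; and calculating the target call address of the call request based on the binary tree and the target weight.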

In this step, each node of the binary tree records the sum of the weights of its left subtree, so that the search can be completed in O(log N) time at call time, which improves processing efficiency.
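One possible shape of such a tree is sketched below in Python. It assumes an array-backed complete binary tree (children of node i at 2i+1 and 2i+2) and a target weight drawn from the half-open range [0, total); the class name `WeightTree` and these layout choices are illustrative and are not prescribed by the application.

```python
class WeightTree:
    """Complete binary tree in an array; each node stores its left-subtree weight sum."""

    def __init__(self, weighted_addresses):
        self.items = list(weighted_addresses)  # (address, weight) pairs
        self.left_sum = [0] * len(self.items)
        self.total = self._fill(0)

    def _fill(self, i):
        """Return the total weight of the subtree rooted at i, recording left sums."""
        if i >= len(self.items):
            return 0
        left = self._fill(2 * i + 1)
        right = self._fill(2 * i + 2)
        self.left_sum[i] = left
        return left + self.items[i][1] + right

    def find(self, target):
        """Return the address whose weight interval contains target."""
        i = 0
        while i < len(self.items):
            address, weight = self.items[i]
            if target < self.left_sum[i]:
                i = 2 * i + 1                      # target lies in the left subtree
            elif target < self.left_sum[i] + weight:
                return address                     # target lies in this node
            else:
                target -= self.left_sum[i] + weight
                i = 2 * i + 2                      # continue in the right subtree
        raise ValueError("target weight out of range")

tree = WeightTree([("addr1", 20), ("addr5", 20), ("addr4", 5)])
```

Each lookup descends one level per iteration, so the number of steps is bounded by the tree height, which is logarithmic in the number of addresses for this layout.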

In order to improve the efficiency of computing the weight calculation result, as a preferred embodiment, generating the weight calculation result by using the real-time weight of the weight group and the number of available addresses corresponding to the real-time weight may be performed according to the following steps: calculating a group weight value as the product of the real-time weight of the weight group and the number of available addresses corresponding to the real-time weight; and summing all the group weight values to obtain the weight calculation result.

In this step, the group weight value of each weight group can be calculated quickly by multiplication, and the group weight values are then summed to obtain the weight calculation result.
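As a worked example in Python, using the available-address counts that result from the fig. 9 bit values (two addresses in the weight-20 group, none in the weight-10 group, one in the weight-5 group); the function name `total_weight` is hypothetical.

```python
def total_weight(available_counts):
    """Sum of (real-time weight x number of available addresses) over all weight groups."""
    return sum(weight * count for weight, count in available_counts.items())

# weight -> number of available addresses under the current dynamic routing result
available_counts = {20: 2, 10: 0, 5: 1}
result = total_weight(available_counts)  # 20*2 + 10*0 + 5*1
```

The result is 45, which agrees with the example value of 45 used in the description when the target weight is drawn.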

Considering that the maximum utilization rate of all address resources can be reached after the request amount reaches a certain degree, after the target call address of the call request is determined according to the weight calculation result, the following steps can be executed: and calling the server side according to the target calling address.

The method can be applied to related products such as Dubbo3 (a microservice development framework), MSE (Micro Service Engine), microservice governance, gateways, SAE (Service Application Engine), ASM (Alibaba Cloud Service Mesh), CSB (Cloud Service Bus), EDAS (Enterprise Distributed Application Service), HSF3 (a microservice development framework, whose full name is High-speed Service Framework), and the like. By pre-grouping weights and quickly matching the available addresses of each weight group based on bit operations, the method compresses the time complexity of the address screening stage, so that the cluster can run with extremely low performance loss even when its scale is extremely large. In addition, the method enables the calculation of the normally-0 state weight, supports the dynamic calculation of the route and has the capability of landing in a complex scene.

In this embodiment, an electronic device is provided, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the method in the above embodiment is implemented.

The programs described above may be run on a processor or stored in memory (also referred to as computer-readable media). Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.

These computer programs may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks, and corresponding steps may be implemented by different modules.

An apparatus for implementing the above method is also provided in this embodiment. The apparatus is referred to as a load balancing apparatus, and includes: a parameter acquisition module, used for acquiring the parameter information of the server, where the parameter information comprises an address of the server, a weight threshold and a preheating duration; a pre-grouping module, used for determining the real-time weight of each server based on the weight threshold, the preheating duration and the preheated duration, and performing weight pre-grouping on each server according to the real-time weight to obtain a pre-grouping result, where the pre-grouping result comprises a plurality of weight groups and each weight group comprises the addresses of one or more servers with the same real-time weight; a comparison module, used for determining the number of available addresses in each weight group by using a dynamic routing result and the pre-grouping result, where the dynamic routing result comprises one or more available addresses; a weight calculation module, used for generating a weight calculation result by using the real-time weight of the weight group and the number of available addresses corresponding to the real-time weight; and a load balancing module, used for determining a target call address of the call request according to the weight calculation result.

The system or the apparatus is used for implementing the functions of the method in the foregoing embodiments, and each module in the system or the apparatus corresponds to each step in the method, which has been described in the method and is not described herein again.

Optionally, determining the number of available addresses in each of the weight groups by using the dynamic routing result and the pre-grouping result includes: if the same address exists in both the dynamic routing result and the weight group, taking that address as an available address; and counting the number of available addresses in each weight group.

Optionally, if the same address exists in the dynamic routing result and the weight set, taking the same address as an available address includes: sequencing the addresses of the server to obtain sequencing information; according to the sorting information, the binary digits are used for respectively marking the existence states of the addresses of the service terminals in the dynamic routing result and each weight group; and if the binary values of the marked dynamic routing result and the marked weight set at the same sequencing position are the same, taking the address of the sequencing position in the weight set as an available address.

Optionally, if binary values of the marked dynamic routing result and the marked weight group at the same sorting position are the same, taking the address of the sorting position in the weight group as an available address, including: and respectively performing bit-wise AND operation on the marked dynamic routing result and each marked weight group to judge whether the binary values of the same sequencing position are the same.

Optionally, determining a real-time weight of each service end based on the weight threshold, the preheating time length and the preheated time length, and performing weight pre-grouping on each service end according to the real-time weight, including: calculating a weight coefficient according to the preheating time length and the preheated time length; calculating the real-time weight of each address of the server by using the weight coefficient and the weight threshold value; and adding the addresses with the same real-time weight to the same weight group.

Optionally, the method further comprises: when the preheated duration of each server is greater than the preheating duration, if a call request is received, generating a weight calculation result based on the dynamic routing result by using a random algorithm.

Optionally, determining a target call address of the call request according to the weight calculation result includes: generating a weight value range by using the weight calculation result; determining a target weight within the weight value range using a target stochastic algorithm; and determining the target calling address of the calling request according to the real-time weight and the target weight in the pre-grouping result.

Optionally, determining a target call address of the call request according to the real-time weight and the target weight in the pre-grouping result includes: sorting the address information of the server according to the real-time weight in the pre-grouping result to obtain an address sorting result; summing the real-time weights of one or more addresses of the server according to the address sorting result to obtain a summation result; and when the summation result is not less than the target weight, determining the address information corresponding to the summation result as the target call address.

Optionally, determining a target call address of the call request according to the real-time weight and the target weight in the pre-grouping result includes: recording the real-time weight of each address in the pre-grouping result by using a binary tree; and calculating the target calling address of the calling request based on the binary tree and the target weight.

Optionally, generating a weight calculation result by using the real-time weight of the weight group and the number of available addresses corresponding to the real-time weight includes: calculating a group weight value as the product of the real-time weight of the weight group and the number of available addresses corresponding to the real-time weight; and summing all the group weight values to obtain the weight calculation result.

Optionally, after determining the target call address of the call request according to the weight calculation result, the method further includes: and calling the server side according to the target calling address.

The embodiment solves the problem that in the prior art, in a preheating scene, when the magnitude and the request quantity of the server side are large, the operation consumes large computing resources. According to the embodiment of the application, the address information is pre-grouped through the address, the weight threshold and the preheating time length in the parameter information of the server side, and the pre-grouping mode is adopted, so that a large amount of repeated calculation work can be moved to background asynchronous calculation in the weight calculation process, the heavy work of traversing addresses to obtain parameters in the operation process can be removed, the operation can be performed with extremely low performance loss when the cluster scale is extremely large, and the calculation amount in the calling process is greatly reduced; after the pre-grouping result is obtained, a weight calculation result is generated by combining the dynamic routing result and the pre-grouping result, so that load balancing is realized, and powerful support is provided for preheating calculation under a large-scale deployment scene.

There is also provided in this embodiment a computer readable storage medium storing a computer program which, when executed by a processor, implements the method in the above embodiments.

There is also provided in this embodiment a computer program product comprising a computer program which, when executed by a processor, implements the method in the above embodiments.

The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims

1. A method of load balancing, comprising:

acquiring parameter information of a server; the parameter information comprises an address of a server, a weight threshold and preheating duration;

determining real-time weights of all the service terminals based on the weight threshold, the preheating time length and the preheated time length, and performing weight pre-grouping on all the service terminals according to the real-time weights to obtain a pre-grouping result; the pre-grouping result comprises a plurality of weight groups; each weight group comprises addresses of one or more service terminals with the same real-time weight;

when a call request is received, determining the number of available addresses in each weight group by using a dynamic routing result and the pre-grouping result; the dynamic routing result comprises one or more available addresses;

generating a weight calculation result by using the real-time weight of the weight group and the number of available addresses corresponding to the real-time weight;

and determining the target calling address of the calling request according to the weight calculation result.

2. The method of claim 1, wherein determining the number of available addresses in each of the sets of weights using the dynamic routing results and the pre-grouping results comprises:

if the same address exists in both the dynamic routing result and the weight group, taking that address as an available address;

and counting the number of the available addresses in each weight group.

3. The method of claim 2, wherein if the same address exists in the dynamic routing result and the set of weights, using the same address as an available address comprises:

sequencing the addresses of the server to obtain sequencing information;

according to the sorting information, the binary digits are used for respectively marking the existence states of the addresses of the service terminals in the dynamic routing result and each weight group;

and if the binary values of the marked dynamic routing result and the marked weight set at the same sorting position are the same, taking the address of the sorting position in the weight set as an available address.

4. The method of claim 3, wherein if the binary values of the marked dynamic routing result and the marked weight group at the same sorting position are the same, the step of using the address of the sorting position in the weight group as the available address comprises:

and respectively performing bitwise AND operation on the marked dynamic routing result and each marked weight group to judge whether binary values at the same sequencing position are the same.

5. The method of claim 1, wherein determining real-time weights for each server based on the weight threshold, the warm-up duration, and the warmed-up duration, and pre-grouping the weights for each server according to the real-time weights comprises:

calculating a weight coefficient according to the preheating time length and the preheated time length;

calculating the real-time weight of each address of the server by using the weight coefficient and the weight threshold value;

and adding the addresses with the same real-time weight to the same weight group.

6. The method of claim 1, further comprising:

when the preheated time length of each server is longer than the preheating time length,

and if the calling request is received, generating a weight calculation result based on the dynamic routing result by using a random algorithm.

7. The method of claim 1, wherein determining the target call address of the call request according to the weight calculation result comprises:

generating a weight value range by using the weight calculation result;

determining a target weight within the weight value range using a target stochastic algorithm;

and determining the target calling address of the calling request according to the real-time weight and the target weight in the pre-grouping result.

8. The method of claim 7, wherein determining the target call address of the call request according to the real-time weight and the target weight in the pre-grouping result comprises:

sorting the address information of the server according to the real-time weight in the pre-grouping processing result to obtain an address sorting result;

summing the real-time weights of one or more addresses of the server according to the address sorting result to obtain a summation result;

and when the summation result is not less than the target weight, determining the address information corresponding to the summation result as a target call address.

9. The method of claim 7, wherein determining the target call address of the call request according to the real-time weight and the target weight in the pre-grouping result comprises:

recording the real-time weight of each address in the pre-grouping result by using a binary tree;

and calculating the target calling address of the calling request based on the binary tree and the target weight.

10. The method of claim 1, wherein generating a weight calculation result using the real-time weights of the set of weights and the number of available addresses corresponding to the real-time weights comprises:

calculating a grouping weight value by utilizing the product of the real-time weight of the weight group and the number of available addresses corresponding to the real-time weight;

and summing all the grouping weight values to obtain a weight calculation result.

11. The method according to any one of claims 1-10, wherein after determining the target call address of the call request according to the weight calculation result, further comprising:

and calling the server side according to the target calling address.

12. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 11 are implemented by the processor when executing the computer program.

13. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when being executed by a processor, carries out the method steps of any one of claims 1 to 11.

14. A computer program product, characterized in that the computer program product comprises a computer program which, when being executed by a processor, carries out the method steps of any one of claims 1 to 11.