CN114143327A

CN114143327A - Cluster resource quota allocation method and device and electronic equipment

Info

Publication number: CN114143327A
Application number: CN202111503812.4A
Authority: CN
Inventors: 韩向前; 谢健; 邸帅; 卢道和
Original assignee: WeBank Co Ltd
Current assignee: WeBank Co Ltd
Priority date: 2021-12-09
Filing date: 2021-12-09
Publication date: 2022-03-04
Anticipated expiration: 2041-12-09
Also published as: WO2023103342A1; CN114143327B

Abstract

Embodiments of the present application provide a cluster resource quota allocation method and device and electronic equipment. The method comprises: receiving a service processing request, sent by a client, for a service to be processed; performing resource quota verification on the service to be processed according to the latest load increment and pre-stored resource configuration information to obtain a verification result; if the verification passes, allocating a resource quota to the service to be processed according to the service processing request, and processing the service according to the allocated resource quota to obtain a service processing result; and if the verification fails, processing the service to be processed according to a pre-stored excess quota request processing rule. The method and device reduce inaccurate judgments of cluster load, improve the rationality and accuracy of resource quota allocation, and thereby help ensure that each financial service operates normally.

Description

Cluster resource quota allocation method and device and electronic equipment

Technical Field

The embodiment of the application relates to the technical field of big data, in particular to a cluster resource quota allocation method and device and electronic equipment.

Background

With the development of computer technology, more and more technologies are being applied in the financial field, and the traditional financial industry is gradually shifting toward financial technology (Fintech). Big data technology is no exception, but the security and real-time requirements of the financial industry also place higher demands on it. To meet the growing demands of financial services, clusters are used more and more widely.

In the prior art, one cluster generally serves a plurality of financial services. When it does, the request volume of one service may increase suddenly and occupy a large share of the cluster resources, leaving the other services with insufficient resources. To avoid this, a fixed resource quota may be set for each financial service; before each financial service request is processed, it is determined whether the request exceeds the quota, and if it does, an exception is thrown.

However, configuring a fixed resource quota usually requires knowing the current load of the cluster and the load it can bear, and these are generally determined from stress test data collected when the cluster was built. Because the capacity of the cluster to process service requests changes dynamically, and the initial stress test scenario rarely matches the actual production scenario, the cluster load is often judged inaccurately. This reduces the accuracy of resource quota allocation and in turn affects the normal operation of each financial service.

Disclosure of Invention

The embodiment of the application provides a cluster resource quota allocation method, a cluster resource quota allocation device and electronic equipment, so as to improve the accuracy of resource quota allocation.

In a first aspect, an embodiment of the present application provides a method for allocating a cluster resource quota, including:

receiving a service processing request of a service to be processed sent by a client;

performing resource quota verification on the service to be processed according to the latest load increment and prestored resource configuration information to obtain a verification result;

if the verification result is that the verification is passed, allocating a resource quota for the service to be processed according to the service processing request, and processing the service to be processed according to the allocated resource quota to obtain a service processing result;

and if the verification result is that the verification fails, processing the service to be processed according to a pre-stored excess quota request processing rule.

Optionally, the service processing request includes a load request amount, and the resource quota verifying is performed on the service to be processed according to the latest load increment and pre-stored resource configuration information to obtain a verification result, where the verifying includes:

summing the current load of the cluster in the pre-stored resource configuration information and the load request amount to obtain a summation result;

acquiring the latest load increment;

judging whether the summation result is higher than the sum of the latest load increment and a resource quota distributed for the service to be processed in the pre-stored resource configuration information;

and if the summation result is higher than the sum of the latest load increment and the resource quota distributed for the service to be processed in the pre-stored resource configuration information, determining that the verification result is verification failure.

Optionally, before the step of obtaining the latest load increment, the method further includes:

judging whether the summation result is higher than a resource quota distributed to the service to be processed in the prestored resource configuration information;

and if the summation result is higher than the resource quota distributed to the service to be processed in the pre-stored resource configuration information, executing the step of obtaining the latest load increment.

Optionally, after it is determined that the summation result is higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information, the method further includes:

determining whether the pre-stored resource configuration information contains overflow ratio information;

if the pre-stored resource configuration information contains overflow ratio information, determining the highest quota of the service to be processed according to the overflow ratio information;

judging whether the summation result is higher than the highest quota;

if the summation result is not higher than the highest quota, continuing to execute the steps of obtaining the latest load increment and later;

and if the summation result is higher than the highest quota, determining that the verification result is verification failure.
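The overflow check above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names are hypothetical, the formula RNQ × (1 + overflow percentage / 100) for the highest quota is borrowed from the configuration principle stated in the detailed description, and the behavior when no overflow ratio is configured (simply continuing) is an assumption.

```python
from typing import Optional


def highest_quota(rnq: float, overflow_percent: float) -> float:
    """Assumed rule: highest quota = RNQ * (1 + overflow percentage / 100)."""
    return rnq * (1 + overflow_percent / 100)


def overflow_precheck(summation: float, rnq: float,
                      overflow_percent: Optional[float]) -> str:
    """Return 'fail', or 'continue' to go on and obtain the latest load increment."""
    if overflow_percent is None:
        # Assumption: with no overflow ratio configured there is no ceiling to test.
        return "continue"
    if summation > highest_quota(rnq, overflow_percent):
        return "fail"
    return "continue"
```

With an RNQ of 100 and an overflow percentage of 20, a summation result of 125 fails immediately, while 115 proceeds to the load increment check.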

Optionally, the obtaining the latest load increment includes:

acquiring the total processing time length, a processing time length threshold value, the lowest cluster load and the highest cluster load according to a preset acquisition rule;

and determining the latest load increment according to the total processing time length, the processing time length threshold, the cluster lowest load, the cluster highest load and the cluster current load in the pre-stored resource configuration information.

Optionally, the determining the latest load increment according to the total processing time length, the processing time length threshold, the cluster lowest load, the cluster highest load, and the cluster current load in the pre-stored resource configuration information includes:

if the total processing time length is greater than or equal to the processing time length threshold value, determining that the latest load increment is zero;

if the total processing time length is smaller than the processing time length threshold value and the current cluster load in the pre-stored resource configuration information is smaller than the lowest cluster load, determining the latest load increment as the difference between the lowest cluster load and the current cluster load in the pre-stored resource configuration information;

if the total processing time length is smaller than the processing time length threshold value, and the cluster current load in the pre-stored resource configuration information is larger than the cluster lowest load and smaller than the cluster highest load, determining the latest load increment as the difference between the cluster highest load and the cluster current load in the pre-stored resource configuration information;

and if the total processing time length is smaller than the processing time length threshold value and the current load of the cluster in the pre-stored resource configuration information is larger than the highest load of the cluster, determining the latest load increment as a preset load threshold value.
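The four cases above amount to a piecewise rule, which can be sketched in Python as follows. This is a minimal sketch rather than the claimed implementation: the function and parameter names are hypothetical, and the handling of a current load exactly equal to the lowest or highest load (which the text leaves open) is an assumption.

```python
def latest_load_increment(total_time: float, time_threshold: float,
                          current: float, lowest: float, highest: float,
                          preset_threshold: float) -> float:
    """Piecewise rule for the latest load increment."""
    if total_time >= time_threshold:
        # Processing is already too slow: grant no extra headroom.
        return 0
    if current < lowest:
        return lowest - current
    if lowest < current < highest:
        return highest - current
    if current > highest:
        return preset_threshold
    # Assumption: the boundary cases (current == lowest or current == highest)
    # are not specified in the text, so no extra headroom is granted.
    return 0
```

For example, with a lowest load of 100 and a highest load of 200, a current load of 80 yields an increment of 20 and a current load of 150 yields an increment of 50, provided the total processing time is below its threshold.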

Optionally, the obtaining the lowest load of the cluster according to a preset obtaining rule includes:

acquiring a preset processing time threshold value, and the total processing time and the load of a target moment;

evaluating the processing time length threshold, the total processing time length and the load at the target moment according to a preset lowest load judgment rule to determine an initial cluster lowest load;

acquiring a plurality of historical lowest loads in a preset historical time period, and determining a historical average lowest load according to the plurality of historical lowest loads;

and taking the minimum value of the initial cluster lowest load and the historical average lowest load as the cluster lowest load.

Optionally, the obtaining the highest load of the cluster according to the preset obtaining rule includes:

evaluating the processing time length threshold, the total processing time length and the load at the target moment according to a preset highest load judgment rule to determine an initial cluster highest load;

acquiring a plurality of historical highest loads in a preset historical time period, and determining a historical average highest load according to the plurality of historical highest loads;

and taking the maximum value of the initial cluster highest load and the historical average highest load as the cluster highest load.
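The combination of the initial value with the historical average, for both the lowest and the highest load, can be sketched in Python as follows. This is a minimal sketch: the function names are hypothetical, the initial loads are taken as inputs because the judgment rules that produce them are not spelled out here, and falling back to the initial value when no history exists is an assumption.

```python
from statistics import mean
from typing import Sequence


def cluster_lowest_load(initial_lowest: float,
                        historical_lowest: Sequence[float]) -> float:
    """Minimum of the initial lowest load and the historical average lowest load."""
    if not historical_lowest:
        return initial_lowest  # assumption: no history yet, use the initial value
    return min(initial_lowest, mean(historical_lowest))


def cluster_highest_load(initial_highest: float,
                         historical_highest: Sequence[float]) -> float:
    """Maximum of the initial highest load and the historical average highest load."""
    if not historical_highest:
        return initial_highest  # assumption: no history yet, use the initial value
    return max(initial_highest, mean(historical_highest))
```

Taking the minimum for the lowest load and the maximum for the highest load widens the interval between the two bounds whenever the history disagrees with the initial estimate.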

Optionally, the method further includes:

and if the summation result is not higher than the sum of the latest load increment and the resource quota allocated to the service to be processed in the pre-stored resource configuration information, determining that the verification result is that the verification is passed.

Optionally, if the verification result is that the verification fails, processing the service to be processed according to a pre-stored excess quota request processing rule includes:

if the verification result is that the verification fails, judging whether the pre-stored resource configuration information contains a retry mark;

if the pre-stored resource configuration information contains a retry mark, adding the service processing request of the service to be processed into a request queue;

and if the pre-stored resource configuration information does not contain a retry mark, generating an exception handling prompt.
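The excess quota handling above can be sketched in Python as follows. This is a minimal sketch: the names `QuotaExceededError` and `handle_excess_quota` are hypothetical, the request is modeled as a plain dictionary, and raising an exception is only one assumed way of "generating an exception handling prompt".

```python
from collections import deque
from typing import Deque


class QuotaExceededError(Exception):
    """Hypothetical exception standing in for the exception handling prompt."""


def handle_excess_quota(request: dict, has_retry_flag: bool,
                        request_queue: Deque[dict]) -> str:
    """Queue the request for a later retry, or report that the quota is exceeded."""
    if has_retry_flag:
        request_queue.append(request)  # retried later, when load allows
        return "queued"
    raise QuotaExceededError(f"quota exceeded for request {request!r}")
```

Queuing instead of rejecting lets a burst of requests be absorbed over time rather than failing outright, at the cost of added latency for the queued requests.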

Optionally, before the obtaining the latest load increment, the method further includes:

judging whether a preset updating time length threshold value is reached;

if the update duration threshold is reached, acquiring the latest load increment;

or, judging whether the total processing quantity of the service processing requests reaches a preset quantity threshold value;

and if the number threshold is reached, acquiring the latest load increment.
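The two triggering conditions above can be sketched in Python as a small cache that recomputes the load increment only when one of them is met. This is a minimal sketch: the class name `LoadIncrementCache` is hypothetical, both conditions are checked together although the text presents them as alternatives, and counting processed requests since the last refresh is an assumed reading of "total processing quantity".

```python
from typing import Callable, Optional


class LoadIncrementCache:
    """Keeps the latest load increment and refreshes it when a trigger fires."""

    def __init__(self, update_seconds: float, count_threshold: int):
        self.update_seconds = update_seconds
        self.count_threshold = count_threshold
        self.value = 0.0
        self.last_time: Optional[float] = None
        self.last_count = 0

    def get(self, now: float, processed_total: int,
            compute: Callable[[], float]) -> float:
        due = (
            self.last_time is None
            or now - self.last_time >= self.update_seconds
            or processed_total - self.last_count >= self.count_threshold
        )
        if due:
            # A trigger fired: recompute and remember when this happened.
            self.value = compute()
            self.last_time = now
            self.last_count = processed_total
        return self.value
```

Between triggers, `get` returns the previously computed value, so the verification path does not pay for a recomputation on every request.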

In a second aspect, an embodiment of the present application provides a cluster resource quota allocating apparatus, including:

the receiving module is used for receiving a service processing request of a service to be processed, which is sent by a client;

the processing module is used for carrying out resource quota verification on the service to be processed according to the latest load increment and prestored resource configuration information to obtain a verification result;

the processing module is further configured to allocate a resource quota to the service to be processed according to the service processing request and process the service to be processed according to the allocated resource quota to obtain a service processing result if the verification result is that the verification is passed;

and the processing module is further configured to process the service to be processed according to a pre-stored excess quota request processing rule if the verification result is that the verification fails.

In a third aspect, an embodiment of the present application provides an electronic device, including: a processor, and a memory communicatively coupled to the processor;

the memory stores computer-executable instructions;

the processor executes the computer-executable instructions stored in the memory to implement the cluster resource quota allocation method as described in the first aspect and various possible designs of the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing computer-executable instructions; when a processor executes the computer-executable instructions, the cluster resource quota allocation method according to the first aspect and various possible designs of the first aspect is implemented.

In a fifth aspect, an embodiment of the present application provides a computer program product including a computer program; when the computer program is executed by a processor, the cluster resource quota allocation method according to the first aspect and various possible designs of the first aspect is implemented.

With the above scheme, a service processing request of the service to be processed sent by the client is received, and resource quota verification is then performed on the service to be processed according to the latest load increment and the pre-stored resource configuration information to obtain a verification result. In one implementation, if the verification passes, a resource quota is allocated to the service to be processed according to the service processing request, and the service is then processed according to the allocated resource quota to obtain a service processing result. In another implementation, if the verification fails, the service to be processed may be processed according to a pre-stored excess quota request processing rule. Because the resource quota is allocated by combining the pre-stored resource configuration information with a dynamically determined load increment, inaccurate judgments of cluster load are reduced, the rationality and accuracy of resource quota allocation are improved, and the normal operation of each financial service is further ensured.

Drawings

To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without inventive effort.

Fig. 1 is a schematic architecture diagram of an application system of a cluster resource quota allocation method provided in an embodiment of the present application;

Fig. 2 is a schematic flowchart of a cluster resource quota allocation method provided in an embodiment of the present application;

Fig. 3 is a schematic flowchart of a cluster resource quota allocation method according to another embodiment of the present application;

Fig. 4 is a schematic structural diagram of a cluster resource quota allocating apparatus according to an embodiment of the present application;

Fig. 5 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the above-described drawings (if any) are used to distinguish between similar elements and not necessarily to describe a particular sequential or chronological order. It is to be understood that terms so used are interchangeable under appropriate circumstances, such that the embodiments of the application described herein can be implemented in orders other than those illustrated or described. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

In the prior art, a financial service may be an account transfer, a loan amount adjustment, a balance inquiry, or the like. Each financial service relies on an allocated resource quota, and a fixed resource quota is generally allocated to each financial service. Configuring a fixed resource quota for each financial service usually requires knowing the current load of the cluster and the load it can bear, and these are generally determined from stress test data collected when the cluster was built. Because the capacity of the cluster to process service requests changes dynamically, and the initial stress test scenario rarely matches the actual production scenario, the cluster load is often judged inaccurately, which reduces the accuracy of resource quota allocation. In addition, in a scenario with multiple financial services, the request volume of a service typically follows a fluctuating curve, and the times at which its peaks and troughs occur can be predicted from the service scenario. To use cluster resources effectively, services whose request peaks fall at different times share the same cluster, and the request peak of each service is set as its quota. Determining the request peak is therefore important, and it usually depends on the experience of operation and maintenance personnel and on historical data. However, when an unexpected situation beyond prediction (for example, a master-slave switchover of the cluster caused by a host failure) makes the request volume of a service surge, all of the surging requests fail under the rate-limiting logic, which affects the normal operation of each financial service.

To address these technical problems, the present application allocates the resource quota for the service to be processed by combining the pre-stored resource configuration information with a dynamically determined load increment. This reduces inaccurate judgments of cluster load, improves the rationality and accuracy of resource quota allocation, and thereby helps ensure that each financial service operates normally.

Fig. 1 is a schematic architecture diagram of an application system of a cluster resource quota allocation method provided in an embodiment of the present application. As shown in Fig. 1, the application system includes a cluster 101, a database 102 and a client 103. The cluster 101 receives a service processing request sent by the client 103, obtains pre-stored resource configuration information from the database 102, performs resource quota verification on the service to be processed in combination with the most recently obtained load increment to obtain a verification result, and then processes the service according to the verification result.

There may be one or more clients 103, each of which may be, for example, a smartphone, a tablet, a personal computer, or a wearable smart device.

The cluster 101 may be an HBase cluster. HBase is a distributed, scalable, highly available, high-performance NoSQL database that supports random or real-time reads and writes on very large tables.

The technical solution of the present application will be described in detail below with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.

Fig. 2 is a schematic flowchart of a cluster resource quota allocation method provided in an embodiment of the present application, where the method of this embodiment may be executed by the cluster 101. As shown in fig. 2, the method of this embodiment may include:

S201: Receive a service processing request, sent by the client, for the service to be processed.

In this embodiment, when a user wants to implement a financial service, the user may send a service processing request corresponding to a service to be processed to the cluster through the client.

S202: Perform resource quota verification on the service to be processed according to the latest load increment and the pre-stored resource configuration information to obtain a verification result.

In this embodiment, after receiving the service processing request, the latest load increment and the pre-stored resource configuration information may be obtained, and the resource quota verification is performed on the service to be processed according to the obtained latest load increment and the pre-stored resource configuration information, so as to obtain a verification result.

The pre-stored resource configuration information may include pre-configured overflow ratio information, the resource quota (RNQ, Request Num Quota) allocated to each service to be processed, the current load of the cluster, and a retry flag. The RNQ is a quota on the number of requests per unit time, where the time unit may be sec, min, hour or day and the number of requests is expressed in req, for example 1000req/sec or 50000req/hour.

Specifically, the overflow ratio information can be set according to the actual application scenario; the specific setting follows the calculation described below. The overflow ratio information allows bursts of service access to be handled dynamically: while cluster resources are available, normal requests do not fail merely because the resource quota applied for by the service has reached its limit.

The principle for configuring the resource quota of each service to be processed is that, while the requirements of each service are met, the sum of the RNQs of all services to be processed is less than or equal to the lowest guaranteed load of the cluster, and the sum of (quota overflow percentage/100 + 1) × RNQ over all services to be processed is less than or equal to the highest reachable load. Specifically, when the cluster is first built, an existing stress test tool may be used to stress test the cluster and obtain its initial cluster lowest load (LowTPS, Low Transactions Per Second) and cluster highest load (UpTPS, Up Transactions Per Second). When a service goes online, operation and maintenance personnel may estimate the TPS of the service (Transactions Per Second, an important measure of cluster throughput) from the traffic of the service system and set the estimate as the RNQ of the service, while ensuring that the sum of the RNQs of all services is less than or equal to the cluster lowest load. If the sum of the RNQs is greater than the cluster lowest load, the cluster may be expanded to raise the cluster lowest load. Correspondingly, if the TPS of a service contains a spike (spur), the TPS peak with the spike removed may be set as the RNQ of the service; if the TPS of the service has no spike, the TPS peak itself may be set as the RNQ.
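The two inequalities of the configuration principle can be checked mechanically, as in the following Python sketch. It is only an illustration under stated assumptions: the function name `quota_config_is_valid` and the service names and numbers in the usage note are hypothetical, and each service is represented as an (RNQ, overflow percentage) pair.

```python
from typing import Dict, Tuple


def quota_config_is_valid(services: Dict[str, Tuple[float, float]],
                          low_tps: float, up_tps: float) -> bool:
    """Check sum(RNQ) <= LowTPS and sum((op / 100 + 1) * RNQ) <= UpTPS.

    `services` maps a service name to (rnq, overflow_percent).
    """
    sum_rnq = sum(rnq for rnq, _ in services.values())
    sum_max = sum((op / 100 + 1) * rnq for rnq, op in services.values())
    return sum_rnq <= low_tps and sum_max <= up_tps
```

For instance, two services with RNQs of 400 and 500 and overflow percentages of 50 and 20 need a cluster lowest load of at least 900 and a cluster highest load of at least 1200; otherwise the cluster would have to be expanded or the quotas reduced.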

In addition, the overflow ratio information may be calculated in a manner of:

Let the spur TPS of service 1 be $MTPS_1$, the spur TPS of service $n$ be $MTPS_n$, the overflow ratio information of service 1 be $op_1$, and the overflow ratio information of service $n$ be $op_n$. Then:

$$\left(\frac{\sum_{i=1}^{n} MTPS_i}{RNQ} - 1\right) \times 100 = \sum_{i=1}^{n} op_i$$

$$op_n = \frac{MTPS_n}{\sum_{i=1}^{n} MTPS_i} \times \sum_{i=1}^{n} op_i$$
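As a numerical illustration of the two formulas, the following Python sketch computes the total overflow ratio and splits it among the services in proportion to their spur TPS. It is a minimal sketch: the function name is hypothetical, and the `rnq` parameter is simply the RNQ term as it appears in the formula, since the text does not say whether it denotes a single service's quota or an aggregate.

```python
from typing import List, Sequence


def overflow_ratios(spur_tps: Sequence[float], rnq: float) -> List[float]:
    """Split the total overflow ratio among services by their share of spur TPS."""
    total_spur = sum(spur_tps)
    # (sum(MTPS_1..MTPS_n) / RNQ - 1) * 100 = sum(op_1..op_n)
    total_op = (total_spur / rnq - 1) * 100
    # op_n = MTPS_n / sum(MTPS_1..MTPS_n) * sum(op_1..op_n)
    return [m / total_spur * total_op for m in spur_tps]
```

With spur TPS values of 300 and 900 and an RNQ of 1000, the total overflow ratio is about 20 percent, split into roughly 5 and 15 (up to floating-point rounding).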

In addition, each service and each cluster can be continuously monitored to obtain the latest TPS of each service and the lowest load, highest load and current load of each cluster; index graphs can then be drawn to provide a data basis for the subsequent configuration of RNQ and overflow ratio information.

The cluster current load (CTPS, CurrentTPS) may be obtained by continuously monitoring the RegionServer service of the cluster and sampling the TotalRequestCount index of the RegionServer at fixed time intervals (5 seconds by default). Illustratively, if TotalRequestCount at time T is denoted RequestCount(T) and TotalRequestCount at time T1 is denoted RequestCount(T1), then CurrentTPS = (RequestCount(T1) - RequestCount(T)) / (T1 - T). TotalRequestCount represents the total number of service requests processed; it is an index provided by the cluster itself, can be obtained directly through an existing function, and is dynamic: its value increases by 1 each time a request is processed. After CurrentTPS is obtained, it may be stored in the pre-stored resource configuration information, that is, the current load of the cluster in the pre-stored resource configuration information is updated.
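The CurrentTPS calculation is a difference quotient over two counter samples, as the following Python sketch shows. The function name is hypothetical, and the counter values are passed in directly; how they are read from the cluster is deliberately left out, since no specific monitoring call is named in the text.

```python
def current_tps(count_t: int, count_t1: int, t: float, t1: float) -> float:
    """CurrentTPS = (RequestCount(T1) - RequestCount(T)) / (T1 - T)."""
    if t1 <= t:
        raise ValueError("T1 must be later than T")
    return (count_t1 - count_t) / (t1 - t)
```

For example, if the counter rises from 10000 to 15000 over the default 5-second interval, the cluster current load is 1000 requests per second.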

Further, the service processing request includes a load request amount, and the service to be processed is subjected to resource quota verification according to the latest load increment and pre-stored resource configuration information to obtain a verification result, which may specifically include:

Summing the current load of the cluster in the pre-stored resource configuration information and the load request amount to obtain a summation result.

Obtaining the latest load increment.

Judging whether the summation result is higher than the sum of the latest load increment and the resource quota allocated to the service to be processed in the pre-stored resource configuration information.

In addition, if the summation result is not higher than the sum of the latest load increment and the resource quota allocated to the service to be processed in the pre-stored resource configuration information, it is determined that the verification result is that the verification is passed.

Further, before the step of obtaining the latest load increment, the method may further include: judging whether the summation result is higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information.

In addition, if the summation result is not higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information, the verification result is determined to be passed.

Specifically, the service processing request may directly include the load request amount, or may include a type identifier indicating the request type, with different type identifiers corresponding to different load request amounts. Illustratively, the type identifier may be Put, Get, Scan or Multi: the load request amount corresponding to each of Put, Get and Scan is 1, and the load request amount corresponding to Multi is the number of regions involved.
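The mapping from request type to request amount can be sketched in Python as follows. This is a minimal sketch: the function name is hypothetical, the identifier strings are illustrative (the batch type is spelled "Multi" here), and rejecting unknown types with an error is an assumption.

```python
def load_request_amount(request_type: str, involved_regions: int = 1) -> int:
    """Map a request type identifier to the request amount it contributes."""
    if request_type in ("Put", "Get", "Scan"):
        return 1  # single-operation requests each count once
    if request_type == "Multi":
        return involved_regions  # batch requests count once per region involved
    raise ValueError(f"unknown request type: {request_type}")
```

A batch request touching 4 regions therefore counts as 4 toward the quota, while a single Put, Get or Scan counts as 1.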

Once the load request amount and the current load of the cluster are obtained, they can be summed to obtain a summation result. There are two possible cases: the summation result is less than or equal to the resource quota allocated to the service to be processed, or it is greater than that quota. In one approach, after the summation result is obtained, the latest load increment is obtained directly, and it is then judged whether the summation result is higher than the sum of the latest load increment and the resource quota allocated to the service to be processed in the pre-stored resource configuration information. That is, the latest load increment is obtained every time, the summation result is compared directly with that sum, and further processing is performed according to the comparison.

Alternatively, after the summation result is obtained, the latest load increment need not be obtained at first; instead, it is first judged whether the summation result is higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information. If the summation result is less than or equal to that resource quota, the verification result is determined to be that the verification is passed, and the service to be processed is subsequently processed according to the service processing request. If the summation result is greater than the resource quota allocated to the service to be processed, the latest load increment determined by the dynamic load calculation module is obtained, and it is then judged whether the summation result is greater than the sum of the latest load increment and the resource quota allocated to the service to be processed in the pre-stored resource configuration information. If the summation result is less than or equal to that sum, the verification result is determined to be that the verification is passed, and the service to be processed is subsequently processed according to the service processing request. If the summation result is higher than that sum, the verification result is directly determined to be that the verification fails.

A new load increment is obtained when a preset load increment trigger rule is met. When the trigger rule is met, the newly obtained load increment is the latest load increment; when it is not met, the most recently obtained load increment remains the latest load increment.

Illustratively, suppose the latest load increment is 10 and the resource quota allocated to the service is 100, so the sum of the latest load increment and the resource quota allocated to the service to be processed is 100 + 10 = 110. If the currently requested load is 11, the summation result is 100 + 11 = 111, which is higher than 110, and the verification result is that the verification fails. If the currently requested load is 9, the summation result is 100 + 9 = 109, which is lower than 110, and the verification result is that the verification is passed. By introducing the latest load increment and combining it with the resource quota, dynamic and accurate allocation of the resource quota is achieved, the rationality of the resource quota amount allocated to the service to be processed is improved, and the normal implementation of each service is further ensured.
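The comparison in this example can be sketched in Python as follows. This is only a minimal sketch, not the implementation of the present application: the function name `verify_quota` and its parameter names are illustrative assumptions, and the current cluster load of 100 is inferred from the arithmetic of the example.

```python
def verify_quota(current_load, load_request, quota, latest_increment):
    """Return True if the request passes resource quota verification.

    All names are hypothetical. The check follows the two-stage comparison
    described above: first against the quota, then against the quota plus
    the latest load increment.
    """
    total = current_load + load_request  # the summation result
    if total <= quota:
        # Within the pre-allocated quota: the load increment is not needed.
        return True
    # Above the quota: pass only within quota + latest load increment.
    return total <= quota + latest_increment


# Values from the example: quota 100, latest load increment 10.
print(verify_quota(100, 11, 100, 10))  # 111 compared with 110
print(verify_quota(100, 9, 100, 10))   # 109 compared with 110
```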

In addition, after the summation result is higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information, the method may further include:

and determining whether the pre-stored resource configuration information contains overflow ratio information.

And if the pre-stored resource configuration information contains overflow ratio information, determining the highest quota of the service to be processed according to the overflow ratio information.

And judging whether the summation result is higher than the highest quota.

And if the summation result is not higher than the highest quota, continuing to execute the steps of obtaining the latest load increment and the subsequent steps.

Specifically, after it is determined that the summation result is higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information, it may be determined whether the pre-stored resource configuration information includes overflow ratio information. When resource quotas are set for the services, a positive number can be set as the overflow ratio information, namely the maximum percentage by which the actual request volume of a service may exceed its quota when the overall resources of the cluster are sufficient; the default is 0, meaning no overflow. The highest quota of the service to be processed may be determined according to the overflow ratio information, that is, the maximum allowed request amount = (quota overflow percentage / 100 + 1) × the resource quota allocated to the service to be processed. After the highest quota is determined, it may be determined whether the summation result is greater than the highest quota. If the summation result is less than or equal to the highest quota, the latest load increment is obtained, and it is judged whether the summation result is greater than the sum of the latest load increment and the resource quota allocated to the service to be processed in the pre-stored resource configuration information. If the summation result is not higher than that sum, the verification result is determined to be that the verification is passed. If the summation result is higher than that sum, the verification result is determined to be that the verification fails.
And if the summation result is higher than the highest quota, determining that the verification result is verification failure.

For example, the highest quota may be 120, the latest load increment 10, and the resource quota allocated to the service 100, so the sum of the latest load increment and the resource quota allocated to the service to be processed is 100 + 10 = 110. If the currently requested load is 9, the summation result is 100 + 9 = 109, and the verification result is that the verification is passed. The summation result is first compared with the highest quota and, when it is not higher than the highest quota, is then compared with the sum of the latest load increment and the resource quota allocated to the service to be processed. This two-stage comparison improves the rationality of the resource quota amount allocated to the service to be processed.

Alternatively, the currently requested load may be 11, giving a summation result of 100 + 11 = 111. In this case, although the summation result is lower than the highest quota, it is higher than the sum of the latest load increment and the resource quota allocated to the service to be processed, so the verification result is that the verification fails. Setting the load increment raises the amount of resource quota that can be allocated to the service to be processed while preventing that amount from becoming too large, which further improves the rationality of the resource quota amount allocated to the service to be processed. At the same time, when the request volume of a service rises suddenly due to an unexpected situation, the situation in which all requests fail is reduced.
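The two examples above can be reproduced with a short Python sketch of the staged comparison. The function and parameter names are illustrative assumptions rather than names from the present application. The highest quota is computed as quota × (100 + overflow percentage) / 100, which is algebraically the same as the formula given above and avoids floating-point rounding for integer inputs. An overflow percentage of 0 (the stated default) makes the highest quota equal to the quota, so any summation result above the quota fails.

```python
def highest_quota(quota, overflow_percent):
    """Maximum allowed request amount: (overflow_percent / 100 + 1) * quota."""
    return quota * (100 + overflow_percent) / 100


def verify_with_overflow(total, quota, latest_increment, overflow_percent=0):
    """Staged check; `total` is current cluster load plus load request amount.

    Hypothetical helper: quota first, then highest quota, then the sum of
    the quota and the latest load increment.
    """
    if total <= quota:
        return True
    if total > highest_quota(quota, overflow_percent):
        return False  # above the highest quota: verification fails
    return total <= quota + latest_increment


# Example values: quota 100, overflow 20% (highest quota 120), increment 10.
for total in (109, 111, 125):
    print(total, verify_with_overflow(total, 100, 10, overflow_percent=20))
```

With these inputs only the summation result 109 passes: 111 is within the highest quota but above 110, and 125 is above the highest quota.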

S203: and if the verification result is that the verification is passed, allocating a resource quota for the service to be processed according to the service processing request, and processing the service to be processed according to the allocated resource quota to obtain a service processing result.

In this embodiment, the verification result may include two cases, one is verification passed, and the other is verification failed. If the verification result is that the verification is passed, it indicates that the current load condition of the cluster can serve the service to be processed, so that a resource quota can be allocated to the service to be processed according to the service processing request, and the service to be processed is processed according to the allocated resource quota, so as to obtain a service processing result.

S204: and if the verification result is that the verification fails, processing the service to be processed according to the pre-stored excess quota request processing rule.

In this embodiment, if the verification result is that the verification fails, the pre-stored excess quota request processing rule may first be obtained, and the service to be processed is then processed according to the obtained rule.

Further, if the verification result is that the verification fails, processing the service to be processed according to a pre-stored excess quota request processing rule, which may specifically include:

and if the verification result is that the verification fails, judging whether the pre-stored resource configuration information contains a retry mark.

And if the pre-stored resource configuration information contains a retry identifier, adding the service processing request of the service to be processed into a request queue.

Specifically, existing quota-based rate limiting logic simply compares the request concurrency with the quota; a request that exceeds the quota is discarded and an exception is thrown. For scenarios in which a write may be delayed but must not be discarded when the quota is exceeded, the existing approach may cause data loss and affect data integrity. In the present application, whether the service to be processed needs to be retried can be configured in advance and stored in the resource configuration information. In scenarios where the requirement on data consistency is high and data write failure is not allowed, a retry identifier may be configured for the service to be processed, so as to better maintain data integrity.

Correspondingly, if the pre-stored resource configuration information includes the retry identifier, the service processing request of the service to be processed may be added to the request queue, and the cluster may process the queued requests in the order in which they were added. If the pre-stored resource configuration information does not contain the retry identifier, an exception handling prompt can be generated, that is, an exception is thrown, to remind operation and maintenance personnel to perform maintenance in time. This gives operation and maintenance personnel more options when handling excess quota situations and thereby helps ensure the normal implementation of each service.
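The retry handling described above can be sketched as follows. This is a minimal illustration under stated assumptions: `ExcessQuotaError`, `handle_excess_quota`, and the use of an in-memory `deque` as the request queue are all hypothetical choices, not details given in the present application.

```python
from collections import deque


class ExcessQuotaError(Exception):
    """Hypothetical exception for an excess quota request that is not retried."""


def handle_excess_quota(request, retry_enabled, request_queue):
    """Queue the request for retry, or throw an exception.

    retry_enabled models whether the pre-stored resource configuration
    information contains the retry identifier.
    """
    if retry_enabled:
        request_queue.append(request)  # processed later in the order added
        return
    raise ExcessQuotaError(f"request {request!r} exceeds its resource quota")


queue = deque()
handle_excess_quota("write-1", True, queue)
handle_excess_quota("write-2", True, queue)
print(list(queue))  # requests remain in the order they were added
```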

With this scheme, the service processing request of the service to be processed sent by the client is received, and resource quota verification is then performed on the service to be processed according to the latest load increment and the pre-stored resource configuration information to obtain a verification result. In one implementation, if the verification result is that the verification is passed, a resource quota is allocated to the service to be processed according to the service processing request, and the service to be processed is then processed according to the allocated resource quota to obtain a service processing result. In another implementation, if the verification result is that the verification fails, the service to be processed may be processed according to the pre-stored excess quota request processing rule. Because the resource quota is allocated to the service to be processed by combining the pre-stored resource configuration information with the dynamically determined load increment, inaccurate judgment of the cluster load is reduced, the rationality and accuracy of resource quota allocation are improved, and the normal implementation of each financial service is further ensured.

Based on the method of fig. 2, the present specification also provides some specific embodiments of the method, which are described below.

Further, in another embodiment, obtaining the latest load increment may include:

and acquiring the total processing time length, the processing time length threshold value, the cluster lowest load and the cluster highest load according to a preset acquisition rule.

In this embodiment, the total processing duration may be denoted TotalCallTime, which is the time from when the cluster receives a request to be processed until processing is completed; this indicator directly reflects the processing performance of the current RegionServer (i.e., cluster) service. Correspondingly, TotalCallTime = QueueCallTime + ProcessCallTime. QueueCallTime is a RegionServer-level indicator in the cluster: after receiving a request to be processed from a client, the cluster puts the request into a request queue, and a dedicated thread then consumes the request from the queue and hands it to a processing thread; the time a request waits in the queue is QueueCallTime. ProcessCallTime is also a RegionServer-level indicator and refers to the time from when a request is consumed from the queue until processing is completed; it is a key indicator of the processing efficiency of the cluster.

The processing duration threshold may be denoted MaxCallTime, which is the maximum TotalCallTime that the service can tolerate, set from the service side's perspective. Illustratively, if a query of the service to be processed tolerates at most 0.5 s (the service application reports an error if the request cannot be processed within 0.5 s), MaxCallTime may be set to a value of no more than 0.5 s.

For the lowest load of the cluster, obtaining the lowest load of the cluster according to a preset obtaining rule, which may specifically include:

acquiring the preset processing duration threshold, and the total processing duration and the load at a target moment.

Judging the processing duration threshold, the total processing duration, and the load at the target moment according to a preset lowest load judgment rule, and determining an initial cluster lowest load.

Obtaining a plurality of historical lowest loads in a preset historical time period, and determining a historical average lowest load according to the plurality of historical lowest loads.

Specifically, the cluster lowest load may also be referred to as LowTPS, which represents the lower limit of TPS when TotalCallTime reaches MaxCallTime, that is, the TPS load that the current RegionServer service can provide when performance is at its worst. The load (TPS) is the number of transactions executed per second (Transactions Per Second) and is an important measure of cluster throughput. Correspondingly, the RegionServer service can be continuously monitored, with TotalCallTime at time t denoted CTt and TPS at time t denoted TPSt, and the following calculation can then be performed at fixed time intervals:

Within the time interval, the minimum value of TPSt over all moments at which MaxCallTime × 99% <= CTt <= MaxCallTime × 101% is taken as the LowTPS of the current time interval and archived. If CTt stays below MaxCallTime × 99% throughout the current time interval, the RegionServer is running under light load; the CTt values are sorted, and the TPS value corresponding to the maximum CTt is taken as the LowTPS of the current time interval. If CTt stays above MaxCallTime × 101% throughout the current time interval, the RegionServer is running in an overloaded state and the current time interval has no valid LowTPS data. Then the smaller of the average LowTPS over the historical time period and the LowTPS of the latest time interval is taken as the LowTPS of the current cluster service, with a validity period of one time interval. The time interval and the historical time period can be customized according to the actual application scenario; illustratively, the time interval may be any value from 3 to 6 minutes, and the historical time period may be any value from 1 to 3 months.
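The LowTPS rule above can be sketched in Python as follows. This is a hedged illustration only: the function names and the representation of samples as (CTt, TPSt) pairs are assumptions. The text does not say what to do when samples fall on both sides of the band with none inside it, or when the latest interval has no valid data; the sketch treats the former as "no valid data" and falls back to the historical average in the latter, and both choices are assumptions.

```python
def interval_low_tps(samples, max_call_time):
    """LowTPS for one time interval, or None if there is no valid data.

    samples: list of (total_call_time, tps) pairs observed in the interval.
    """
    if not samples:
        return None
    lower, upper = max_call_time * 0.99, max_call_time * 1.01
    in_band = [tps for ct, tps in samples if lower <= ct <= upper]
    if in_band:
        return min(in_band)
    if all(ct < lower for ct, _ in samples):
        # Light load: take the TPS observed at the largest TotalCallTime.
        return max(samples, key=lambda s: s[0])[1]
    # Overloaded throughout, or (by assumption) mixed with nothing in band.
    return None


def cluster_low_tps(historical_low_tps, latest_low_tps):
    """Smaller of the historical average LowTPS and the latest interval LowTPS."""
    if not historical_low_tps:
        return latest_low_tps  # assumption: no history recorded yet
    average = sum(historical_low_tps) / len(historical_low_tps)
    if latest_low_tps is None:
        return average  # assumption: fall back to the historical average
    return min(average, latest_low_tps)


samples = [(0.497, 800), (0.503, 760), (0.30, 400)]
latest = interval_low_tps(samples, max_call_time=0.5)
print(latest, cluster_low_tps([700, 800], latest))
```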

For the cluster highest load, acquiring the cluster highest load according to a preset acquisition rule, which may specifically include:

And judging the processing time length threshold value, the total processing time length and the load capacity at the target moment according to a preset highest load judgment rule, and determining the highest load of the initial cluster.

Obtaining a plurality of historical highest loads in a preset historical time period, and determining a historical average highest load according to the plurality of historical highest loads.

Specifically, the cluster highest load may also be referred to as UpTPS, which represents the upper limit of TPS when TotalCallTime reaches MaxCallTime, that is, the TPS load that the current RegionServer service can provide when performance is at its best. The load (TPS) is the number of transactions executed per second (Transactions Per Second) and is an important measure of cluster throughput. Correspondingly, the RegionServer service can be continuously monitored, with TotalCallTime at time t denoted CTt and TPS at time t denoted TPSt, and the following calculation can then be performed at fixed time intervals:

Within the time interval, the maximum value of TPSt over all moments at which MaxCallTime × 99% <= CTt <= MaxCallTime × 101% is taken as the UpTPS of the current time interval. If CTt stays below MaxCallTime × 99% throughout the current time interval, the RegionServer is running under light load; the CTt values are sorted, and the TPS value at the moment of maximum CTt, multiplied by 101%, is taken as the UpTPS of the time interval. If CTt stays above MaxCallTime × 101% throughout the current time interval, this indicates that the RegionServer is overloaded and the current time interval has no valid UpTPS data. Then the larger of the average UpTPS over the historical time period and the UpTPS of the latest time interval is taken as the UpTPS of the current RegionServer service, with a validity period of one time interval. The time interval and the historical time period can be customized according to the actual application scenario; illustratively, the time interval may be any value from 3 to 6 minutes, and the historical time period may be any value from 1 to 3 months.
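A corresponding Python sketch for UpTPS is given below. As before, the names and the (CTt, TPSt) sample representation are illustrative assumptions, as is the handling of intervals with no valid data (returning None and falling back to the historical average). It differs from the LowTPS rule in taking the maximum inside the band, scaling the light-load value by 101%, and taking the larger of the historical average and the latest value.

```python
def interval_up_tps(samples, max_call_time):
    """UpTPS for one time interval, or None if there is no valid data.

    samples: list of (total_call_time, tps) pairs observed in the interval.
    """
    if not samples:
        return None
    lower, upper = max_call_time * 0.99, max_call_time * 1.01
    in_band = [tps for ct, tps in samples if lower <= ct <= upper]
    if in_band:
        return max(in_band)
    if all(ct < lower for ct, _ in samples):
        # Light load: TPS at the largest TotalCallTime, scaled by 101%.
        return max(samples, key=lambda s: s[0])[1] * 1.01
    # Overloaded throughout, or (by assumption) mixed with nothing in band.
    return None


def cluster_up_tps(historical_up_tps, latest_up_tps):
    """Larger of the historical average UpTPS and the latest interval UpTPS."""
    if not historical_up_tps:
        return latest_up_tps  # assumption: no history recorded yet
    average = sum(historical_up_tps) / len(historical_up_tps)
    if latest_up_tps is None:
        return average  # assumption: fall back to the historical average
    return max(average, latest_up_tps)


samples = [(0.497, 800), (0.503, 760), (0.30, 400)]
latest = interval_up_tps(samples, max_call_time=0.5)
print(latest, cluster_up_tps([700, 800], latest))
```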

For the current cluster load, the value can be obtained directly from the pre-stored resource configuration information, in which the current cluster load is updated in real time. As in the foregoing embodiment, the RegionServer service may be continuously monitored, and at fixed time intervals (5 seconds by default) the TotalRequestCount indicator of the RegionServer may be sampled. With TotalRequestCount at time T denoted RequestCount(T) and TotalRequestCount at time T1 denoted RequestCount(T1), CurrentTPS = (RequestCount(T1) - RequestCount(T)) / (T1 - T). The newly determined CurrentTPS may then be stored in the resource configuration information, that is, only the current cluster load in the configuration information is updated. By updating the current cluster load in the resource configuration information in real time, the current load condition of the cluster can be accurately determined, which provides a basis for deciding whether more services can be added and at what scale. This avoids both wasted resources and overloaded operation, and ensures the normal implementation of each service.
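The CurrentTPS formula can be illustrated with a few lines of Python. The function name, argument names, and sample counter values are made up for illustration; only the formula itself comes from the text.

```python
def current_tps(count_t, count_t1, t, t1):
    """CurrentTPS = (RequestCount(T1) - RequestCount(T)) / (T1 - T).

    count_t and count_t1 are TotalRequestCount readings at times t and t1
    (in seconds). Names are hypothetical.
    """
    if t1 <= t:
        raise ValueError("t1 must be later than t")
    return (count_t1 - count_t) / (t1 - t)


# Hypothetical readings taken 5 seconds apart (the default sampling interval):
# 4000 additional requests over 5 s.
print(current_tps(120_000, 124_000, 0.0, 5.0))
```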

Further, after the total processing duration, the processing duration threshold, the cluster lowest load, the cluster highest load, and the current cluster load are obtained, the latest load increment (also referred to as the cluster bearable load) may be determined from them. Determining the latest load increment according to the total processing duration, the processing duration threshold, the cluster lowest load, the cluster highest load, and the current cluster load in the pre-stored resource configuration information may specifically include:

and if the total processing time length is greater than or equal to the processing time length threshold value, determining that the latest load increment is zero.

And if the total processing time length is smaller than the processing time length threshold value and the current cluster load in the pre-stored resource configuration information is smaller than the lowest cluster load, determining the latest load increment as the difference between the lowest cluster load and the current cluster load in the pre-stored resource configuration information.

And if the total processing time length is smaller than the processing time length threshold value, and the cluster current load in the pre-stored resource configuration information is larger than the cluster lowest load and smaller than the cluster highest load, determining the latest load increment as the difference between the cluster highest load and the cluster current load in the pre-stored resource configuration information.

Specifically, if TotalCallTime >= MaxCallTime, it is determined that the current RegionServer load has reached its maximum and quota overflow cannot be performed. If TotalCallTime < MaxCallTime and CurrentTPS < LowTPS, it is determined that the RegionServer is running under light load with idle capacity, and the load that may be added is LowTPS - CurrentTPS. If TotalCallTime < MaxCallTime and LowTPS < CurrentTPS < UpTPS, it is determined that the RegionServer has idle capacity, and the load that may be added is UpTPS - CurrentTPS. If TotalCallTime < MaxCallTime and CurrentTPS > UpTPS, it is determined that the RegionServer load has reached a new high while the processing capacity still meets requirements, and the load can be increased on a small scale. The size of the small-scale increase can be set according to the actual application scenario; illustratively, 10 TPS can be added.
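The four cases above amount to a piecewise function, sketched below in Python. The function and parameter names are illustrative assumptions. The text leaves the boundary values CurrentTPS = LowTPS and CurrentTPS = UpTPS unspecified; this sketch assigns the former to the middle band and the latter to the small-scale case, which is an assumption rather than something stated in the present application.

```python
def latest_load_increment(total_call_time, max_call_time,
                          current_tps, low_tps, up_tps, small_step=10):
    """Load that may still be added, following the four cases above.

    small_step is the small-scale increase (10 TPS in the example).
    """
    if total_call_time >= max_call_time:
        return 0  # load has reached its maximum: no quota overflow
    if current_tps < low_tps:
        return low_tps - current_tps  # light load
    if current_tps < up_tps:
        # Middle band; CurrentTPS == LowTPS lands here by assumption.
        return up_tps - current_tps
    # New high; CurrentTPS == UpTPS lands here by assumption.
    return small_step


# Hypothetical readings with MaxCallTime = 0.5 s, LowTPS = 500, UpTPS = 900.
for call_time, tps in [(0.6, 700), (0.3, 400), (0.3, 600), (0.3, 950)]:
    print(call_time, tps, latest_load_increment(call_time, 0.5, tps, 500, 900))
```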

In addition, when the latest load increment is obtained, there may be a plurality of trigger mechanisms, which may specifically be:

in one implementation, it may be determined whether a preset update duration threshold is reached.

And if the update duration threshold is reached, acquiring the latest load increment.

For example, the update duration threshold may be any value from 3 to 6 minutes. When the update duration threshold is reached, the rule for obtaining the latest load increment is triggered automatically and the latest load increment is obtained.

In another implementation, it may be determined whether the total processing quantity of the service processing requests reaches a preset quantity threshold.

And if the number threshold is reached, acquiring the latest load increment.

Illustratively, the quantity threshold may be 10000. When the number of service processing requests accumulates to 10000, the rule for obtaining the latest load increment is triggered automatically and the latest load increment is obtained. At the same time, the count of service processing requests can be cleared so that counting starts again from zero.

In addition, if neither the update duration threshold nor the quantity threshold has been reached when the latest load increment is to be obtained, the load increment obtained the previous time may be used as the latest load increment.
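The trigger mechanisms above can be combined in a small cache, sketched below in Python. The class name, the `compute` callback, the injectable `clock`, and the choice to compute an initial value at construction are all illustrative assumptions; the sketch only demonstrates recomputing on a time or count trigger and otherwise reusing the previous value.

```python
import time


class LoadIncrementCache:
    """Recompute the load increment on a time or request-count trigger."""

    def __init__(self, compute, max_age_seconds=300, max_requests=10000,
                 clock=time.monotonic):
        self._compute = compute          # callable returning a fresh increment
        self._max_age = max_age_seconds  # update duration threshold
        self._max_requests = max_requests  # quantity threshold
        self._clock = clock
        self._value = compute()          # assumption: compute once up front
        self._updated_at = clock()
        self._requests = 0

    def get(self):
        """Call once per service processing request."""
        self._requests += 1
        expired = self._clock() - self._updated_at >= self._max_age
        if expired or self._requests >= self._max_requests:
            self._value = self._compute()
            self._updated_at = self._clock()
            self._requests = 0           # restart the count from zero
        return self._value               # otherwise the previous value


cache = LoadIncrementCache(lambda: 10)
print(cache.get())
```

Passing a fake `clock` makes the time-based trigger easy to exercise in tests without waiting for the threshold to elapse.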

Fig. 3 is a schematic flowchart of a cluster resource quota allocation method according to another embodiment of the present application. As shown in fig. 3, in this embodiment the method may include: receiving a request to be processed of the service to be processed, parsing the request, and determining the load request amount. After the load request amount is determined, a summation result can be determined from the current cluster load in the pre-stored resource configuration information and the load request amount, and it is judged whether the summation result exceeds the resource quota allocated to the service to be processed. If not, the request to be processed is processed. If so, it is judged whether overflow ratio information is configured. If it is configured, the highest quota is determined according to the overflow ratio information, and it is judged whether the summation result exceeds the highest quota. If the highest quota is not exceeded, the latest load increment is obtained, and it is judged whether the summation result exceeds the sum of the latest load increment and the resource quota allocated to the service to be processed. If that sum is not exceeded, the request to be processed is processed. If that sum is exceeded, whether to retry is determined according to the pre-stored excess quota request processing rule: if so, the request is added to the request queue again; otherwise, an exception is thrown directly.

In addition, if the overflow ratio information is not configured, whether to retry is determined according to the pre-stored excess quota request processing rule: if so, the request is added to the request queue again; otherwise, an exception is thrown directly.

If the highest quota is exceeded, whether to retry is likewise determined according to the pre-stored excess quota request processing rule: if so, the request is added to the request queue again; otherwise, an exception is thrown directly.

Two mechanisms trigger recalculation of the latest load increment: a timed calculation, with a default period of 5 minutes, and a count-based calculation, in which a new round of calculation is triggered after the number of processed requests reaches a certain amount (10000 by default).

Based on the same idea, an embodiment of the present specification further provides a device corresponding to the foregoing method, and fig. 4 is a schematic structural diagram of the device for allocating a quota of a cluster resource provided in the embodiment of the present application, as shown in fig. 4, the device provided in this embodiment may include:

the receiving module 401 is configured to receive a service processing request of a service to be processed, which is sent by a client.

And the processing module 402 is configured to perform resource quota verification on the service to be processed according to the latest load increment and pre-stored resource configuration information, so as to obtain a verification result.

In this embodiment, the service processing request includes a load request amount, and the processing module 402 is further configured to:

The latest load increment is obtained.

In addition, the processing module 402 is further configured to:

and judging whether the summation result is higher than a resource quota distributed for the service to be processed in the pre-stored resource configuration information.

In addition, the processing module 402 is further configured to:

Further, the processing module 402 is further configured to:

And judging whether the summation result is higher than the highest quota.

The processing module 402 is further configured to allocate a resource quota to the to-be-processed service according to the service processing request if the verification result is that the verification is passed, and process the to-be-processed service according to the allocated resource quota to obtain a service processing result.

The processing module 402 is further configured to process the service to be processed according to a pre-stored excess quota request processing rule if the verification result is that the verification fails.

In this embodiment, the processing module 402 is further configured to:

Moreover, in another embodiment, the processing module 402 is further configured to:

In this embodiment, the processing module 402 is further configured to:

In addition, the processing module 402 is further configured to:

In this embodiment, the processing module 402 is further configured to:

In addition, the processing module 402 is further configured to:

and judging whether a preset updating time length threshold value is reached.

Or, judging whether the total processing quantity of the service processing requests reaches a preset quantity threshold value.

And if the number threshold is reached, acquiring the latest load increment.

The apparatus provided in the embodiment of the present application can implement the method of the embodiment shown in fig. 2, and the implementation principle and the technical effect are similar, which are not described herein again.

Fig. 5 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application. As shown in fig. 5, a device 500 according to the embodiment includes: a processor 501, and a memory 502 communicatively coupled to the processor. The processor 501 and the memory 502 are connected by a bus 503.

In a specific implementation process, the processor 501 executes the computer execution instruction stored in the memory 502, so that the processor 501 executes the cluster resource quota allocating method in the foregoing method embodiment.

For a specific implementation process of the processor 501, reference may be made to the above method embodiments, which implement the similar principle and technical effect, and this embodiment is not described herein again.

In the embodiment shown in fig. 5, it should be understood that the Processor may be a Central Processing Unit (CPU), other general purpose processors, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present invention may be embodied directly in a hardware processor, or in a combination of the hardware and software modules within the processor.

The memory may comprise high speed RAM memory and may also include non-volatile storage NVM, such as at least one disk memory.

The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, the buses in the figures of the present application are not limited to only one bus or one type of bus.

An embodiment of the present application further provides a computer-readable storage medium, where a computer execution instruction is stored in the computer-readable storage medium, and when a processor executes the computer execution instruction, the method for allocating a cluster resource quota according to the above method embodiment is implemented.

An embodiment of the present application further provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, the method for allocating a cluster resource quota as described above is implemented.

The computer-readable storage medium may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk. Readable storage media can be any available media that can be accessed by a general purpose or special purpose computer.

An exemplary readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. Of course, the readable storage medium may also be an integral part of the processor. The processor and the readable storage medium may reside in an Application Specific Integrated Circuit (ASIC). Of course, the processor and the readable storage medium may also reside as discrete components in the apparatus.

Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.

Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims

1. A method for allocating cluster resource quotas, comprising:

2. The method according to claim 1, wherein the service processing request includes a load request amount, and the performing resource quota verification on the service to be processed according to a latest load increment and pre-stored resource configuration information to obtain a verification result includes:

acquiring the latest load increment;

3. The method of claim 2, wherein, before the obtaining the latest load increment, the method further comprises:

4. The method according to claim 3, wherein after the step of determining that the summation result is higher than the resource quota allocated to the service to be processed in the pre-stored resource configuration information, the method further comprises:

judging whether the summation result is higher than the highest quota;

5. The method according to any of claims 2-4, wherein said obtaining the latest load increment comprises:

6. The method of claim 5, wherein the determining a latest load increment according to the total processing time, the processing time threshold, the cluster minimum load, the cluster maximum load, and the cluster current load in the pre-stored resource configuration information comprises:

7. The method according to claim 5, wherein the obtaining the cluster minimum load according to the preset obtaining rule comprises:

8. The method according to claim 5, wherein the obtaining the cluster maximum load according to the preset obtaining rule comprises:

9. The method of claim 2, further comprising:

10. The method according to any one of claims 1 to 4, wherein if the verification result is that the verification fails, processing the service to be processed according to a pre-stored excess quota request processing rule includes:

11. The method according to any of claims 2-4, further comprising, prior to said obtaining the latest load increment:

judging whether a preset updating time length threshold value is reached;

and if the updating time length threshold is reached, acquiring the latest load increment.

12. A cluster resource quota allocating apparatus, comprising:

13. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;

the memory stores computer-executable instructions;

the processor executes the memory-stored computer-executable instructions to implement the cluster resource quota allocation method of any of claims 1 to 11.

14. A computer-readable storage medium having computer-executable instructions stored thereon which, when executed by a processor, implement the cluster resource quota allocation method of any of claims 1 to 11.

15. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the method for cluster resource quota allocation according to any of claims 1 to 11.