CN107665144B

CN107665144B - Balanced scheduling center, method, system and device for distributed tasks

Info

Publication number: CN107665144B
Application number: CN201610615952.3A
Authority: CN
Inventors: 安伟佳
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Priority date: 2016-07-29
Filing date: 2016-07-29
Publication date: 2021-02-26
Anticipated expiration: 2036-07-29
Also published as: CN107665144A

Abstract

The invention discloses a balanced scheduling center, method, system and device for distributed tasks. The method comprises the following steps: creating a plurality of task groups according to functions or types, and configuring a corresponding scheduling policy for each task group; customizing a plurality of scheduling plans for each task group, wherein each scheduling plan corresponds to an executable task set; monitoring the current task execution amount of each task group, and calculating the task scheduling amount of each scheduling plan according to configuration parameters; calculating the task issuing amount of each task host according to the current task execution amount of each task group and the configuration parameters; and issuing executable tasks to the corresponding task host according to the issuing amount of each task host. The scheduling center comprises a configuration module, a task scheduling module, a task issuing module, a task cache and a monitoring module. The invention effectively improves task execution efficiency and task host utilization, and balances the load of the task hosts.

Description

Balanced scheduling center, method, system and device for distributed tasks

Technical Field

The invention relates to the technical field of data transmission, in particular to a distributed task balanced scheduling center, a method, a system and a device.

Background

With the development of computers and network information technology, some enterprise-level systems distribute a large number of tasks across different task hosts for execution. How to schedule these tasks so that they are reasonably allocated to the corresponding task hosts has become an important technical problem.

Currently, task scheduling execution is mainly divided into the following two types:

(1) Serial single-node task scheduling, in which a single machine schedules and executes tasks. Its advantage is that it is simple to implement and easy to maintain. Its disadvantage is that, when a large number of tasks must be scheduled and executed, the load on the single machine becomes so high that task execution performance cannot be guaranteed.

(2) Distributed task scheduling, in which multiple machines schedule and execute tasks in parallel. Its advantages are that it makes full use of multi-machine hardware resources, can carry a large number of tasks, scales out easily when the tasks exceed the load, and effectively overcomes the shortcoming of serial single-node task scheduling.

Currently, enterprise-level task scheduling is mostly distributed scheduling, which provides various scheduling policies, task synchronization and other mechanisms. For example:

the lightweight Quartz scheduling framework is adopted, mainly for scheduling tasks by time period or time point; or a ScheduledExecutor is adopted, a periodic scheduling thread pool provided by Java itself, which can execute tasks at a fixed period; or the ZooKeeper framework is used to synchronize distributed tasks and prevent repeated or out-of-order execution.

For example, the prior art adopts the following technical solutions to implement the scheduling of distributed tasks:

defining one or more executable tasks;

configuring a trigger period or an execution time point of each executable task;

starting Quartz, ScheduledExecutor or another scheduling framework, reading the scheduling configuration, and waiting for task execution to be triggered;

the scheduling framework triggers task execution and then enters the next scheduling period.

The executable tasks are synchronized through ZooKeeper to prevent the same task from being executed repeatedly on multiple machines.

However, the prior-art scheme relies on the scheduling framework alone and cannot monitor how tasks are actually executed. Its support for distributed multi-machine execution is therefore poor, and task backlogs and load imbalance easily arise on some hosts.

Disclosure of Invention

The technical problem to be solved by the present invention is to provide a distributed task balancing scheduling center, method, system and device, which flexibly configure and adjust scheduling policies and task issuing parameters, thereby balancing the task amount of each task host.

In order to solve the above technical problem, the present invention provides a distributed task balanced scheduling method, including:

creating a plurality of task groups according to functions or types, and configuring a corresponding scheduling strategy for each task group;

customizing a plurality of scheduling plans for each task group, wherein each scheduling plan corresponds to an executable task set;

monitoring the current task execution amount of each task group, calculating to obtain the task scheduling amount of each scheduling plan according to configuration parameters, and loading tasks with the amount corresponding to the task scheduling amount into a task queue in a task cache;

calculating to obtain the task issuing quantity of each task host according to the current task execution quantity and the configuration parameters of each task group;

and taking out executable tasks from the corresponding task queue in the task cache according to the issued quantity of each task host and issuing the executable tasks to the corresponding task host.

The scheduling policy comprises at least a grouping configuration of the task hosts; each scheduling plan comprises at least a scheduling period and a weight.

Preferably, the step of monitoring the current task execution amount of each task group includes:

receiving task execution feedback information of a task host to obtain a total task execution amount in a time period;

calculating the average task execution amount in the time period;

multiplying the average task execution amount by a preset fine-tuning coefficient to obtain the current task execution amount of each task group,

wherein the value of the fine-tuning coefficient is 0.8 to 1.2.
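The monitoring steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name and the idea of averaging over equally sized sampling intervals are assumptions, since the text does not state what the average is taken over.

```python
def current_execution_amount(executed_per_interval, fine_tune=1.0):
    """Average the execution counts reported for one task group and
    scale the average by the manually set fine-tuning coefficient.

    executed_per_interval: task counts per sampling interval (assumed layout).
    fine_tune: fine-tuning coefficient, typically between 0.8 and 1.2.
    """
    if fine_tune <= 0:
        raise ValueError("fine-tuning coefficient must be greater than 0")
    if not executed_per_interval:
        return 0.0  # no feedback received yet
    average = sum(executed_per_interval) / len(executed_per_interval)
    return average * fine_tune
```

A coefficient above 1 nudges the estimate upward and one below 1 nudges it downward, which is how an operator could bias the later scheduling and issuing calculations by hand.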

Preferably, the step of calculating the task scheduling amount of each scheduling plan includes:

calculating, from the current task queue length in the cache, the difference Q_e between the maximum task queue length and the current task queue length;

taking, provided that neither the maximum scheduling amount nor the maximum queue length is exceeded, the larger of [symbol not reproduced in the source] and Q_e as the total scheduling amount Q_t of this round;

calculating the weight sum W_sum of the scheduling plans of each task group by summing their weights W_i;

calculating the scheduling amount Q_i of each scheduling plan from the ratio of its weight W_i to the weight sum W_sum and the total scheduling amount Q_t, according to formula 1-1:

Q_i = Q_t × (W_i ÷ W_sum)    (1-1)
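Formula 1-1 is a weight-proportional split, which the following Python sketch illustrates. The function name is hypothetical, the total scheduling amount Q_t is taken as an input, and rounding down to whole tasks is an assumption of the sketch rather than something the text specifies.

```python
def scheduling_amounts(total_amount, weights):
    """Split the total scheduling amount Q_t across scheduling plans
    in proportion to their weights: Q_i = Q_t * (W_i / W_sum)."""
    weight_sum = sum(weights)  # W_sum
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    # Integer floor division keeps each Q_i a whole number of tasks
    # (assumed rounding rule; any remainder is simply not scheduled).
    return [total_amount * w // weight_sum for w in weights]


print(scheduling_amounts(100, [1, 2, 2]))  # [20, 40, 40]
```

Because of the rounding, the per-plan amounts may add up to slightly less than Q_t when the weights do not divide it evenly.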

Preferably, if the weights of the scheduling plans in the task group all differ from one another, the step of calculating the task issuing amount of each task host includes:

calculating the weight sum W_sum of the scheduling plans in the task group by summation;

calculating the issuing amount Ca_i for each scheduling plan according to formula 2-1 from the current task execution amount, the weight and the weight sum;

calculating the average required issuing amount of each task host from the current task queue length; and

taking the smaller of that average required issuing amount and Ca_i as the task issuing amount C of the task host;

if several scheduling plans in the task group share the same weight, the step of calculating the task issuing amount C of each task host includes:

calculating the sum of the deduplicated weights of the scheduling plans in the task group, where m is the number of deduplicated weights and W'_i is the i-th deduplicated weight;

calculating the issuing amount C'_w of each deduplicated weight according to formula 2-2 from the current task execution amount, the deduplicated weight and the deduplicated weight sum;

calculating the period sum I'_sum of the scheduling plans that share the same weight, where I_i is the scheduling period of the i-th scheduling plan;

calculating the inverse period ratio R_i of each scheduling plan under the same weight according to formula 2-3:

R_i = I'_sum ÷ I_i    (2-3)

calculating the sum R_sum of the inverse period ratios of the scheduling plans under the same weight according to formula 2-4;

obtaining the task issuing amount C'a_i of each scheduling plan in the period from its inverse period ratio R_i, the sum R_sum and the issuing amount C'_w of the deduplicated weight, according to formula 2-5:

C'a_i = C'_w × (R_i ÷ R_sum)    (2-5)

calculating the average required issuing amount of each task host from the current task queue length; and

taking the smaller of that average required issuing amount and C'a_i as the task issuing amount C of the task host.
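The inverse-period split of formulas 2-3 to 2-5 can be illustrated with a short Python sketch. The function name is hypothetical, and the issuing amount C'_w of the shared weight is passed in as an input because formula 2-2 is not reproduced in the text.

```python
def issuing_amounts_same_weight(weight_amount, periods):
    """Split the issuing amount C'_w of one shared weight across the
    scheduling plans that carry that weight, favouring shorter periods.

    weight_amount: C'_w, the issuing amount of the shared weight.
    periods: scheduling periods I_i of the plans sharing that weight.
    """
    period_sum = sum(periods)                    # I'_sum
    ratios = [period_sum / p for p in periods]   # R_i = I'_sum / I_i   (2-3)
    ratio_sum = sum(ratios)                      # R_sum, sum of R_i    (2-4)
    # C'a_i = C'_w * (R_i / R_sum)                                      (2-5)
    return [weight_amount * r / ratio_sum for r in ratios]


print(issuing_amounts_same_weight(40, [10, 20, 20]))  # [20.0, 10.0, 10.0]
```

In this example the plan with the 10-unit period receives twice as many tasks as each plan with a 20-unit period, which matches the intent of weighting by the inverse of the period.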

Preferably, the method for balanced scheduling of distributed tasks further includes the steps of setting a synchronization lock for the task to be executed and removing the synchronization lock after the task to be executed is completed.

Preferably, before executing the task, the task host sets the synchronization lock for the task to be executed by executing a setnx (key, value) command in the Redis database.
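A minimal sketch of the set-if-absent locking pattern follows. To stay runnable without a server it uses a small in-memory stand-in, `InMemoryStore`, which is hypothetical and only mimics the two calls used; the key naming and function names are likewise assumptions. The redis-py client exposes `setnx` and `delete` methods with compatible behaviour, so a real client could presumably be passed in instead.

```python
class InMemoryStore:
    """Hypothetical stand-in that mimics set-if-absent and delete."""

    def __init__(self):
        self._data = {}

    def setnx(self, key, value):
        # Store the value only if the key is absent; report success.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def delete(self, key):
        existed = key in self._data
        self._data.pop(key, None)
        return int(existed)


def run_with_lock(store, task_id, host_id, task):
    """Run task() only if this host wins the lock; return True if it ran."""
    key = f"task-lock:{task_id}"  # hypothetical key naming
    if not store.setnx(key, host_id):
        return False  # another host already holds the lock
    try:
        task()
        return True
    finally:
        store.delete(key)  # remove the lock once the task has finished
```

Note that a plain set-if-absent lock has no expiry, so a host that crashes between acquiring and releasing would leave the key behind; how the invention handles that case is not described in the text.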

The invention also provides a balanced scheduling center for distributed tasks, which comprises:

the configuration module is used for creating a plurality of task groups according to functions or types and configuring a corresponding scheduling strategy for each task group; customizing a plurality of scheduling plans for each task group, and dividing the task groups into corresponding executable task sets according to the scheduling plans;

the task scheduling module is used for calculating and obtaining the task scheduling amount of each scheduling plan according to the task execution feedback information and the configuration parameters of the scheduling plans, and loading the tasks with the number corresponding to the task scheduling amount into a task queue in a task cache;

the task issuing module calculates the issuing quantity of each task host according to the current task execution quantity and the configuration parameters of the scheduling plan in the task group, and takes out the executable tasks from the corresponding task queue in the task cache according to the issuing quantity and issues the executable tasks to the corresponding task host; and

the monitoring module is used for receiving the feedback information of the task hosts, acquiring the task execution amount of a set time period, and calculating the task execution amount of each task group in the time period.

Preferably, the task scheduling module includes:

the first data reading unit is used for reading the configuration parameters of the scheduling strategy and the scheduling plan and the current task execution amount obtained from the monitoring module;

the first calculation unit is used for sequentially obtaining corresponding parameter data from the first data reading unit according to a preset scheduling algorithm and calculating the task scheduling amount of each scheduling plan according to a corresponding scheduling algorithm flow; and

the task loading unit loads the tasks into the task queue in the cache according to the task scheduling amount of each scheduling plan.

Preferably, the task scheduling module further includes:

a synchronization lock identifier setting unit, used for setting a synchronization lock identifier for the scheduled executable task.

Preferably, the task issuing module includes:

the second data reading unit is used for reading the current task execution amount obtained by the monitoring module and reading the scheduling strategy configured by the configuration module and the configuration parameters of the scheduling plan;

the second calculation unit is used for sequentially obtaining corresponding parameter data from the second data reading unit according to a preset issuing algorithm and calculating to obtain issuing quantity of each task host according to a corresponding issuing algorithm process; and

the task issuing unit is used for taking out executable tasks from the corresponding task queue in the task cache according to the issuing amount of each task host and issuing them to the corresponding task host.

The invention also provides a distributed task balanced scheduling device, which comprises a memory and a processor, wherein the memory is used for storing data and instructions, and the processor is configured as follows according to the instructions:

monitoring the current task execution amount of each task group, calculating to obtain the task scheduling amount of each scheduling plan according to the task execution feedback information and the configuration parameters of the scheduling plan, and loading the tasks with the amount corresponding to the task scheduling amount into task queues in a task cache;

calculating to obtain the task issuing amount of each task host according to the current task execution amount of each task group and the configuration parameters of the scheduling plan;

The invention also provides a distributed task balanced scheduling system, which comprises:

the scheduling center described above; and

a task host cluster, used for receiving the tasks issued by the scheduling center, executing the tasks and sending task execution feedback information to the scheduling center.

Preferably, each task host in the task host cluster includes:

the task obtaining unit is used for obtaining the tasks with corresponding quantity from the dispatching center;

the task execution unit is used for executing the acquired task;

the feedback unit is used for sending task execution feedback information to the scheduling center; and

the synchronization lock setting unit is used for setting a synchronization lock for the task according to the synchronization lock identifier in the task before the task is executed, and removing the synchronization lock after the task is completed.

Preferably, the system for balanced scheduling of distributed tasks further includes a Redis database, where the synchronization lock setting unit of the task host executes a setnx (key, value) command in the Redis database to set the synchronization lock for the task, and after the task is executed, clears the synchronization lock.

Through its task scheduling and task issuing algorithms, the invention effectively improves task execution efficiency and task host utilization and balances the load of the task hosts. Synchronization locks are implemented through a Redis database, which avoids lock conflicts caused by network jitter and improves the stability of task execution. The invention manages tasks by group, allows scheduling plans to be customized flexibly, and supports a task execution feedback mechanism; it can monitor the health of task scheduling, task issuing and task execution, flexibly adjust the number of issued tasks according to the feedback information, and dynamically maintain the load balance of the task hosts.

Drawings

The above and other objects, features and advantages of the present invention will become more apparent by describing embodiments of the present invention with reference to the following drawings, in which:

FIG. 1 is a schematic diagram of a framework of a method and system for balanced scheduling of distributed tasks according to the present invention;

FIG. 2 is a schematic diagram illustrating the structure of a scheduling policy according to the present invention;

FIG. 3 is a schematic structural diagram of the balanced scheduling system for distributed tasks according to the present invention;

FIG. 4 is a schematic structural diagram of the task scheduling module;

FIG. 5 is a schematic diagram of the principle structure of the task issuing module according to the present invention;

FIG. 6 is a general flowchart of the method for balanced scheduling of distributed tasks according to the present invention;

FIG. 7 is a flowchart of task scheduling in the balanced scheduling of distributed tasks according to the present invention;

FIG. 8 is a schematic diagram of an issuing queue in the cache;

FIG. 9 is a flowchart of task delivery in the balanced scheduling of distributed tasks according to the present invention;

FIG. 10 is a schematic structural diagram of a task host according to the present invention;

FIG. 11 is a flowchart illustrating a task host executing a task according to the present invention.

Detailed Description

The present invention will be described below based on examples, but the present invention is not limited to only these examples. In the following detailed description of the present invention, certain specific details are set forth. It will be apparent to one skilled in the art that the present invention may be practiced without these specific details. Well-known methods, procedures, and processes have not been described in detail so as not to obscure the present invention. The figures are not necessarily drawn to scale.

The flowcharts and block diagrams in the figures illustrate the possible architectures, functions, and operations of systems, methods, and apparatuses according to embodiments of the present invention. Each block may represent a module, a program segment, or merely a code segment, which is an executable instruction for implementing a specified logical function. It should also be noted that the executable instructions that implement the specified logical functions may be recombined to create new modules and program segments. The blocks of the drawings, and the order of the blocks, are thus provided to better illustrate the processes and steps of the embodiments and should not be taken as limiting the invention itself.

To balance the load of the task hosts, the scheduling policy and the task issuing parameters must be flexibly configurable, and the task issuing parameters must be adjusted according to the current running conditions through scheduling, issuing, monitoring and statistics mechanisms. Fig. 1 is a schematic diagram of the principle framework of the distributed task balanced scheduling method and system of the present invention.

The scheduling center is responsible for task scheduling and task issuing, and loads a certain number of calculated tasks to be executed into a task cache during task scheduling. When the tasks are issued, the issued quantity of each task host is calculated through a task issuing algorithm, and the tasks with corresponding quantity are taken out from the cache and issued to the corresponding host in the task host cluster.

When tasks are scheduled, the task scheduling policy is read and scheduling is performed according to the scheduling algorithm.

When tasks are issued, the task scheduling policy is read, the calculation parameters are acquired, and the tasks are issued according to the task issuing algorithm.

When task synchronization is needed, a synchronization lock is set for an executable task through a Redis database, and only the task which is successfully set can be executed.

The task host packages information such as task execution results, the state of its executable task queue and the number of tasks executed in a period, and feeds it back to the scheduling center. Based on this feedback, the scheduling center dynamically adjusts the number of tasks it schedules and issues, namely the scheduling amount and the issuing amount.

Fig. 2 is a schematic diagram illustrating the structural principle of the scheduling policy described above.

Different task groups are created according to functions or types, as shown in fig. 2, there are 4 task groups, which are respectively a task group 1 to a task group 4, and each task group is configured with a respective scheduling policy, that is, a scheduling policy 1 to a scheduling policy 4. The scheduling policy includes at least task execution node grouping configuration information.

Many methods can be used for task scheduling and task issuing, and the method provided by the invention is one of them. The scheduling policy shown in fig. 2 may therefore use other methods besides the one described here, and it also includes a scheduling algorithm type parameter that identifies which method is used when the task group is scheduled and issued.

The task execution node grouping configuration information configures which nodes (i.e., task hosts) can execute which task group. Other configurations are also provided, such as a timeout policy, which sets how scheduling and issuing timeouts are handled. Further configurations are possible but are not described in detail here because they are not germane to the present invention.

For each task group, a plurality of scheduling plans are customized. Each scheduling plan corresponds to a set of executable tasks and mainly comprises a task type, a scheduling period, a weight, a scheduling script and the like.

The dispatching center calculates the dispatching amount of each dispatching plan through a dispatching algorithm, and loads the tasks corresponding to the dispatching plans into corresponding task queues in the cache respectively. And calculating the issuing quantity of each host according to an issuing algorithm, taking out the tasks of the issuing quantity from the corresponding task queue, and issuing the tasks to the corresponding task host.
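The load-then-issue flow through the cache can be sketched as follows. The class and method names are hypothetical, and keeping one FIFO queue per scheduling plan is an assumption made for the sketch; the text only says that tasks go into "corresponding task queues".

```python
from collections import deque


class TaskCache:
    """Holds one FIFO queue of pending tasks per scheduling plan (assumed layout)."""

    def __init__(self):
        self.queues = {}

    def load(self, plan_id, tasks):
        """Append the tasks chosen by the scheduling step to the plan's queue."""
        self.queues.setdefault(plan_id, deque()).extend(tasks)

    def take(self, plan_id, amount):
        """Remove and return up to `amount` tasks for issuing to a host."""
        queue = self.queues.get(plan_id, deque())
        count = min(amount, len(queue))
        return [queue.popleft() for _ in range(count)]


cache = TaskCache()
cache.load("plan-1", ["a", "b", "c"])
print(cache.take("plan-1", 2))  # ['a', 'b']
```

Capping `take` at the queue length means a host simply receives fewer tasks when the queue is running low, rather than an error.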

According to the foregoing principle and scheduling policy framework, the present invention provides a balanced scheduling system for distributed tasks, and a schematic structural diagram of an embodiment of the balanced scheduling system is shown in fig. 3.

The distributed task balanced scheduling system comprises a scheduling center 1 and a task host cluster 2, wherein the scheduling center 1 comprises:

a configuration module 11, configured to create a plurality of task groups according to functions or types, and configure a corresponding scheduling policy for each task group; customizing a plurality of scheduling plans for each task group, and dividing the task groups into corresponding executable task sets according to the scheduling plans;

the task scheduling module 12 calculates the scheduling amount of each scheduling plan according to the task execution feedback information and the configuration parameters of the scheduling plan, and loads the tasks to be executed into the corresponding task queues in the task cache 14;

the task issuing module 13 calculates an issuing amount of each task host according to the task execution feedback information and the configuration parameters of the scheduling plan in the task group, and takes out the executable task from the task queue in the task cache 14 according to the issuing amount and issues the executable task to the corresponding task host;

the task cache 14 is used for storing tasks to be executed, and comprises a plurality of task queues; and

the monitoring module 15 is configured to receive the task host feedback information, obtain the task execution amount in a set time period, and calculate the average task execution amount of each task group in the time period.

As described above, the scheduling policy at least includes task execution host grouping configuration, and in addition, may also include a task scheduling algorithm type, a timeout policy, and the like; the scheduling plan comprises a task type, a scheduling period, a weight, a scheduling script and the like.
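The configuration described above might be modelled with a few plain data classes. All class and field names here are illustrative assumptions, as are the choice of seconds for periods and timeouts and the default algorithm label; the text names the concepts but not their representation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SchedulingPlan:
    task_type: str
    period_seconds: int   # scheduling period (unit is an assumption)
    weight: int
    script: str           # scheduling script that selects the tasks


@dataclass
class SchedulingPolicy:
    host_group: List[str]                   # task hosts allowed to run the group
    algorithm_type: str = "balanced"        # hypothetical label for the algorithm
    timeout_seconds: Optional[int] = None   # optional timeout policy


@dataclass
class TaskGroup:
    name: str
    policy: SchedulingPolicy
    plans: List[SchedulingPlan] = field(default_factory=list)


group = TaskGroup(
    name="order-sync",
    policy=SchedulingPolicy(host_group=["host-1", "host-2"]),
    plans=[SchedulingPlan("sync", 60, 2, "select_pending_orders")],
)
```

Keeping the policy on the group and the period and weight on each plan mirrors the split the description makes between group-level and plan-level parameters.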

The task scheduling module, as shown in fig. 4, includes a first data reading unit 121, a first calculating unit 122, and a task loading unit 123. The first data reading unit 121 is configured to read the average task execution amount of the task group, the scheduling policy, and the configuration parameters in the scheduling plan, which are obtained by the monitoring module 15; the first calculating unit 122 is configured to sequentially obtain corresponding parameter data from the first data reading unit 121 according to a preset scheduling algorithm, and calculate a scheduling amount of each scheduling plan according to a corresponding scheduling algorithm flow; the task loading unit 123 loads the task into the corresponding task queue in the cache according to the scheduling amount of each scheduling plan.

The task scheduling module 12 further includes a synchronization lock identifier setting unit 124, configured to set a synchronization lock identifier for a task before the task is loaded into the cache, so that the task host, on reading the identifier when it executes the task, sets a synchronization lock for the task.

Fig. 5 is a schematic diagram of the principle structure of the task issuing module according to the present invention. The task issuing module 13 includes a second data reading unit 131, a second calculating unit 132, and a task issuing unit 133. The second data reading unit 131 reads the average task execution amount of the task group obtained by the monitoring module 15, together with the relevant parameters of the scheduling policy and the scheduling plans, such as the current task queue length, the weight and scheduling period of each scheduling plan, and the number of task hosts; the second calculating unit 132 obtains the corresponding parameter data from the second data reading unit 131 in the order required by the preset issuing algorithm, that is, according to its particular formulas and their order, and calculates the issuing amount of each task host; the task issuing unit 133 takes that number of tasks out of the corresponding task queue in the task cache and sends them to the task host.

As shown in fig. 6, a general flowchart of the balanced scheduling method for distributed tasks according to the present invention specifically includes:

step S1, a plurality of task groups are created according to functions or types, and a corresponding scheduling strategy is configured for each task group;

step S2, a plurality of scheduling plans are customized for each task group, and each scheduling plan corresponds to an executable task set;

step S3, monitoring the task execution amount of each task group, calculating the scheduling amount Q of each scheduling plan according to the scheduling strategy, and loading the tasks to be executed into the task queue in the cache;

step S4, calculating the issuing quantity C of each task host according to the task execution quantity of each task group and the scheduling strategy;

and step S5, taking out the executable task from the cache according to the delivery amount of each task host and delivering the executable task to the corresponding task host.

Specifically, the flow of step S3, in which the task execution amount of each task group is monitored and the task scheduling amount Q of each scheduling plan is calculated according to the scheduling policy, is shown in fig. 7 and includes the following steps:

step S31, reading the corresponding parameters from the scheduling policy, such as the maximum scheduling amount Q_max of the task group and the maximum issuing queue length Q_b, which serve as thresholds in the calculation;

step S32, receiving the task host feedback information from the monitoring module of the scheduling center to obtain the task execution amount of a time period; calculating the average task execution amount of the task group over a period of time, such as 5 minutes; and multiplying the average task execution amount by a preset fine-tuning coefficient to obtain the current task execution amount of each task group.

The fine-tuning coefficient is a floating-point coefficient greater than 0 that can be set manually and is mainly used to fine-tune the calculation result by hand; its value is usually, but not necessarily, between 0.8 and 1.2.

Step S33, calculating, from the current issuing queue length Q_a, the difference Q_e between the maximum issuing queue length Q_b and the current issuing queue length Q_a by the formula Q_e = Q_b - Q_a.

Step S34, provided that neither the maximum scheduling amount Q_max nor the maximum queue length Q_b is exceeded, taking the larger of [symbol not reproduced in the source] and Q_e as the total scheduling amount Q_t.

Step S35, calculating the weight sum W_sum of the scheduling plans, where W_i is the weight of the i-th scheduling plan and i is a natural number 1, 2, 3, ....

Step S36, calculating the scheduling amount Q_i of each scheduling plan from the ratio of its weight to the weight sum and the total scheduling amount, according to the formula Q_i = Q_t × (W_i ÷ W_sum).
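As a worked illustration of steps S33, S35 and S36, consider purely hypothetical values: a maximum issuing queue length of 500, a current queue length of 380, and three scheduling plans with weights 1, 2 and 3. The total scheduling amount is simply set equal to Q_e here for the sake of the arithmetic; that is an assumption of the example, not the rule of step S34.

```latex
\begin{align*}
Q_e &= Q_b - Q_a = 500 - 380 = 120 \\
Q_t &= 120 \quad \text{(assumed for this example)} \\
W_{sum} &= W_1 + W_2 + W_3 = 1 + 2 + 3 = 6 \\
Q_1 &= Q_t \times (W_1 \div W_{sum}) = 120 \times \tfrac{1}{6} = 20 \\
Q_2 &= Q_t \times (W_2 \div W_{sum}) = 120 \times \tfrac{2}{6} = 40 \\
Q_3 &= Q_t \times (W_3 \div W_{sum}) = 120 \times \tfrac{3}{6} = 60
\end{align*}
```

The three scheduling amounts add up to Q_t, as expected of a proportional split.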

Fig. 8 is a schematic diagram of the issuing queue of a task group in the cache according to an embodiment of the present invention. The task group has 7 scheduling plans, and the scheduling amounts Q_p1 to Q_p7 of the 7 scheduling plans are obtained through the above process.

When tasks are issued, the issuing algorithm differs slightly depending on whether any of the scheduling plans share the same weight. As shown in fig. 9, when the weights of the scheduling plans in the task group all differ from one another, the task issuing amount C of each task host is calculated as follows:

step S41a, calculating the weight sum of the scheduling plans in the task group;

step S42a, calculating the issuing amount Ca_i for each scheduling plan according to formula 2-1 from the current task execution amount, the weight and the weight sum;

step S43a, calculating the average required issuing amount of each task host from the current task queue length;

step S44a, taking the smaller of that average required issuing amount and Ca_i as the task issuing amount C of the task host.

If the plurality of scheduling plans in the task group have the same weight, the step of calculating the task issuing amount of each task host comprises the following steps:

Step S41b: calculate the sum of the de-duplicated weights of the multiple scheduling plans in the task group, where m is the number of de-duplicated weights and W'_i is the i-th de-duplicated weight;

Step S42b: calculate the issuing amount C'_w of each de-duplicated weight according to formula 2-2, from the current task execution amount, the de-duplicated weight, and the sum of the de-duplicated weights;

Step S43b: calculate the period sum I'_sum of the scheduling plans under the same weight, where I_i is the scheduling period of the i-th scheduling plan;

Step S44b: calculate the inverse period ratio R_i of each scheduling plan under the same weight according to formula 2-3;

R_i = I'_sum ÷ I_i (2-3);

Step S45b: sum the inverse period ratios of the scheduling plans under the same weight according to formula 2-4 to obtain the sum of inverse ratios R_sum;

Step S46b: according to formula 2-5, obtain the task issuing amount C'a_i of the scheduling plan in this period from the inverse period ratio R_i of the scheduling plan under the same weight, the sum of inverse ratios R_sum, and the issuing amount C'_w of the de-duplicated weight;

C'a_i = C'_w × (R_i ÷ R_sum) (2-5);

Step S47b: calculate the average required number of tasks from the length of the current task queue;

Step S48b: take the smaller of the average required number of tasks and C'a_i as the task host issuing amount C.
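For illustration, steps S43b to S46b and the final minimum of step S48b can be sketched as below. This is a minimal sketch under stated assumptions: the issuing amount C'_w of the de-duplicated weight (formula 2-2) and the average required number of tasks are taken as given inputs, because their formulas are not reproduced in this text. The function names are hypothetical.

```python
def issuance_by_inverse_period(c_w, periods):
    """Split C'_w among plans sharing one weight, favouring shorter periods.

    c_w: issuing amount C'_w of the de-duplicated weight (given as input).
    periods: maps a plan label to its scheduling period I_i.
    """
    i_sum = sum(periods.values())                        # I'_sum (step S43b)
    ratios = {p: i_sum / i for p, i in periods.items()}  # R_i, formula 2-3
    r_sum = sum(ratios.values())                         # R_sum (step S45b)
    # C'a_i = C'_w * (R_i / R_sum), formula 2-5
    return {p: c_w * r / r_sum for p, r in ratios.items()}


def host_issuing_amount(avg_required, plan_amount):
    """Step S48b: the smaller of the average required number and C'a_i."""
    return min(avg_required, plan_amount)


# Hypothetical example: three plans of equal weight with periods 10, 20, 20.
per_plan = issuance_by_inverse_period(100, {"a": 10, "b": 20, "c": 20})
c = host_issuing_amount(30, per_plan["a"])
```

In this example the plan with the shortest period receives the largest share of C'_w, and the host issuing amount is then capped by the average required number.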

The issuing amount of each task host is obtained through the above flow, and tasks are issued from the cache to the corresponding task hosts according to that amount.

In the present invention, the structure of the task host is shown in Fig. 10. The task host includes a task obtaining unit 21, a task execution unit 22, a feedback unit 23, and a synchronization lock setting unit 24. The task obtaining unit 21 obtains a corresponding number of tasks (i.e. the issuing amount calculated by the scheduling center) from the scheduling center; the task execution unit 22 executes the obtained tasks; the feedback unit 23 sends task execution feedback information to the scheduling center; and the synchronization lock setting unit 24 sets a synchronization lock for a task according to the synchronization lock identifier in the task before the task is executed, and removes the synchronization lock after the task is completed.

The process of the task host executing the task is shown in fig. 11.

Step S1a: obtain a number of tasks equal to the issuing amount from the scheduling center. Specifically, after completing its tasks, the task host requests tasks from the scheduling center; the scheduling center calculates the issuing amount for that task host according to the issuing algorithm and sends that number of tasks to the task host.

Step S2a: after the tasks are obtained, execute a setnx (key, value) command in the Redis database, according to the synchronization lock identifier in the task, to set the synchronization lock for the task.

Step S3a: execute the tasks in sequence.

Step S4a: judge whether a task is completed; if so, execute step S5a; if not, continue executing the task in step S3a.

Step S5a: remove the synchronization lock.

Step S6a: judge whether the feedback period has been reached; if so, execute step S7a; if not, continue executing tasks.

Step S7a: package the task execution status and feed it back to the task scheduling center.

Step S8a: judge whether all tasks are completed; if not, continue executing tasks; if all tasks are completed, end this task execution process and start a new task obtaining and execution process.
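The flow of steps S2a to S8a can be sketched roughly as follows. This is an illustrative sketch only, not the implementation of the invention. The callables `execute`, `acquire_lock`, `release_lock` and `send_feedback` are hypothetical hooks. The sketch also assumes that the feedback period is counted in completed tasks and that any remaining results are fed back once all tasks are done; the text specifies neither.

```python
def run_batch(tasks, execute, acquire_lock, release_lock, send_feedback,
              feedback_every=2):
    """Execute the issued tasks in sequence with locking and periodic feedback."""
    pending = []  # results not yet fed back to the scheduling center
    for task in tasks:
        if not acquire_lock(task):        # S2a: lock not set, skip this task
            continue
        try:
            result = execute(task)        # S3a/S4a: run the task to completion
        finally:
            release_lock(task)            # S5a: remove the synchronization lock
        pending.append((task, result))
        if len(pending) >= feedback_every:  # S6a: feedback period reached?
            send_feedback(pending)          # S7a: package and feed back
            pending = []
    if pending:                           # S8a: all tasks done, flush the rest
        send_feedback(pending)


# Hypothetical usage: the lock for "t2" is already held by another host.
sent = []
run_batch(
    ["t1", "t2", "t3"],
    execute=lambda t: t.upper(),
    acquire_lock=lambda t: t != "t2",
    release_lock=lambda t: None,
    send_feedback=lambda batch: sent.append(list(batch)),
)
```

After the batch ends, the task host would request a new set of tasks from the scheduling center, as described in step S1a.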

By default, a task can be fetched and executed by only one host at a time. If a task has not finished executing within one period, the next period schedules and issues it again, so several hosts could end up executing the same task at the same time. Each host therefore sets a lock before executing a task and executes the task only if the lock is set successfully; if setting the lock fails, the host waits for the timeout and returns directly.
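The set-if-absent behaviour that the lock relies on can be sketched as below. To keep the example self-contained, it uses a hypothetical in-memory class `InMemoryStore` that mimics only the SETNX and DEL semantics of Redis. It is single-process and is not a substitute for a real Redis server, where SETNX is atomic across clients. The key format `lock:task-1` and the host identifiers are also assumptions, and lock expiry and the timeout wait are not modelled.

```python
class InMemoryStore:
    """Hypothetical stand-in for Redis, with SETNX-like and DEL-like calls only."""

    def __init__(self):
        self._data = {}

    def setnx(self, key, value):
        """Set key only if it does not exist; return True if it was set."""
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def delete(self, key):
        """Remove key; return True if it existed."""
        return self._data.pop(key, None) is not None


def try_run_with_lock(store, lock_key, host_id, task):
    """Run `task` only if this host sets the lock; return None otherwise."""
    if not store.setnx(lock_key, host_id):
        return None                 # another host holds the lock
    try:
        return task()
    finally:
        store.delete(lock_key)      # remove the lock after the task completes


store = InMemoryStore()
held = store.setnx("lock:task-1", "host-A")  # host A already holds the lock
blocked = try_run_with_lock(store, "lock:task-1", "host-B", lambda: "ran")
store.delete("lock:task-1")                  # host A finishes and unlocks
allowed = try_run_with_lock(store, "lock:task-1", "host-B", lambda: "ran")
```

Returning `None` for a failed lock is a simplification; a caller whose tasks can legitimately return `None` would need a distinct signal.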

Through its task scheduling and task issuing algorithms, the invention effectively improves the task execution efficiency and the utilization rate of the task hosts and balances the load of the task hosts. Synchronization locking is implemented through a Redis database, which avoids lock conflicts caused by network jitter and improves the stability of task execution. The invention manages tasks by group, allows scheduling plans to be customized flexibly, and supports a task execution feedback mechanism; it can monitor the health of task dropping, task issuing, and task execution, flexibly adjust the number of issued tasks according to the feedback information, and dynamically maintain the load balance of the task hosts.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A balanced scheduling method for distributed tasks comprises the following steps:

receiving task execution feedback information of a task host to obtain a total task execution amount in a time period; calculating the average task execution amount in the time period; according to a preset fine adjustment coefficient, multiplying the average task execution quantity by the fine adjustment coefficient to obtain the current task execution quantity of each task group

Wherein the fine tuning coefficient is a floating point coefficient larger than 0;

calculating to obtain the task scheduling amount of each scheduling plan according to the task execution feedback information and the configuration parameters of the scheduling plan, and loading the tasks with the amount corresponding to the task scheduling amount into a task queue in a task cache;

2. The balanced scheduling method of distributed tasks according to claim 1, wherein the scheduling policy comprises at least a grouping configuration of task hosts; the dispatch plan includes at least a dispatch period and a weight.

3. The balanced scheduling method of distributed tasks according to claim 2, wherein the fine tuning coefficient has a value in the range of 0.8-1.2.

4. The method for balanced scheduling of distributed tasks according to claim 3, wherein said step of calculating the task scheduling amount for each scheduling plan comprises:

calculating, from the length of the current task queue in the cache, the difference Q_e between the length of the current task queue and the maximum task queue length;

taking the maximum value between the maximum scheduling amount and Q_e as the total scheduling amount Q_t of this task, wherein the maximum scheduling amount is a threshold set by the task group for the scheduling amount of each scheduling plan;

calculating the scheduling amount Q_i of the i-th scheduling plan according to formula 1-1, from the ratio of the weight W_i of the i-th scheduling plan to the weight sum W_sum of the plurality of scheduling plans and from the total scheduling amount Q_t of this task,

Q_i = Q_t × (W_i ÷ W_sum) (1-1).

5. The balanced scheduling method of distributed tasks according to claim 3, wherein if the weights of the plurality of scheduling plans in the task group are different from each other, the step of calculating the task issuing amount of each task host includes:

calculating the weight sum W_sum of the multiple scheduling plans in one task group according to a summation formula;

calculating the issuing amount Ca_i of the i-th scheduling plan according to formula 2-1, from the current task execution amount, the weight of the i-th scheduling plan, and the weight sum of the scheduling plans;

taking the smaller of the average required issuing number and Ca_i as the task host issuing amount C;

wherein m is the number of de-duplicated weights and W'_i is the i-th de-duplicated weight;

calculating the issuing amount C'_w of the i-th de-duplicated weight according to formula 2-2, from the current task execution amount, the de-duplicated weight, and the sum of the de-duplicated weights;

calculating the period sum I'_sum of the scheduling plans under the same weight, wherein I_i is the scheduling period of the i-th scheduling plan;

calculating the inverse period ratio R_i of the i-th scheduling plan under the same weight according to formula 2-3,

R_i = I'_sum ÷ I_i (2-3);

C'a_i = C'_w × (R_i ÷ R_sum) (2-5);

calculating the average required issuing number of each task host from the length of the current task queue;

taking the smaller of the average required issuing number and C'a_i as the task host issuing amount C.

6. The method for balanced scheduling of distributed tasks according to claim 1, further comprising the steps of setting a synchronization lock for a task to be executed and removing the synchronization lock after completion of the task to be executed.

7. The method for balanced scheduling of distributed tasks according to claim 6, wherein a task host sets the synchronization lock for the task to be executed by executing a setnx (key, value) command in a Redis database before executing the task.

8. A balanced dispatch center for distributed tasks, comprising:

the monitoring module is used for receiving task execution feedback information of the task host and acquiring the total task execution amount in a time period; calculating the average task execution amount in the time period; according to a preset fine adjustment coefficient, multiplying the average task execution quantity by the fine adjustment coefficient to obtain the current task execution quantity of each task group

the task scheduling module is used for calculating the task scheduling amount of each scheduling plan according to the task execution feedback information and the configuration parameters of the scheduling plan, and loading the tasks with the amount corresponding to the task scheduling amount into a task queue in a task cache;

and the task issuing module is used for calculating the issuing quantity of each task host according to the current task execution quantity and the configuration parameters of the scheduling plan in the task group, and taking out the executable tasks from the corresponding task queue in the task cache according to the issuing quantity to issue the executable tasks to the corresponding task host.

9. The balanced task scheduling center according to claim 8, wherein the task scheduling module comprises:

10. The distributed task balanced scheduling center of claim 9 wherein the task scheduling module further comprises:

11. The balanced task scheduling center according to claim 8, wherein the task issuing module comprises:

12. A distributed task balanced scheduling device, comprising a memory and a processor, wherein the memory is used for storing data and instructions, and the processor is configured according to the instructions as follows:

calculating to obtain a task scheduling amount of each scheduling plan according to task execution feedback information and configuration parameters of the scheduling plan, and loading tasks with the number corresponding to the task scheduling amount into a task queue in a task cache;

13. A system for balanced scheduling of distributed tasks, comprising:

a dispatch center as claimed in any one of claims 8-11, and

14. The system for balanced scheduling of distributed tasks according to claim 13, wherein each task host in the task host cluster comprises:

the task execution unit is used for executing the acquired task;

15. The system for balanced scheduling of distributed tasks according to claim 14, further comprising a Redis database, wherein the synchronization lock setting unit of the task host executes a setnx (key, value) command in the Redis database to set the synchronization lock for the task, and wherein the synchronization lock is cleared after the task is executed.