CN107665144B - Balanced scheduling center, method, system and device for distributed tasks - Google Patents

Info

Publication number
CN107665144B
CN107665144B CN201610615952.3A CN201610615952A CN107665144B CN 107665144 B CN107665144 B CN 107665144B CN 201610615952 A CN201610615952 A CN 201610615952A CN 107665144 B CN107665144 B CN 107665144B
Authority
CN
China
Prior art keywords
task
scheduling
amount
host
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610615952.3A
Other languages
Chinese (zh)
Other versions
CN107665144A (en)
Inventor
安伟佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201610615952.3A
Publication of CN107665144A
Application granted
Publication of CN107665144B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Multi Processors (AREA)

Abstract

The invention discloses a distributed task balanced scheduling center, a method, a system and a device, wherein the method comprises the following steps: creating a plurality of task groups according to functions or types, and configuring a corresponding scheduling strategy for each task group; customizing a plurality of scheduling plans for each task group, wherein each scheduling plan corresponds to an executable task set; monitoring the current task execution amount of each task group, and calculating to obtain the task scheduling amount of each scheduling plan according to configuration parameters; calculating to obtain the task issuing quantity of each task host according to the current task execution quantity of each task group and the configuration parameters; and issuing an executable task for the corresponding task host according to the issuing amount of each task host. The scheduling center comprises a configuration module, a task scheduling module, a task issuing module, a task cache and a monitoring module. The invention effectively improves the task execution efficiency and the utilization rate of the task host, and balances the load of the task host.

Description

Balanced scheduling center, method, system and device for distributed tasks
Technical Field
The invention relates to the technical field of data transmission, in particular to a distributed task balanced scheduling center, a method, a system and a device.
Background
With the development of computer and network information technology, some enterprise-level systems distribute a large number of tasks across different task hosts for execution. How to schedule these tasks so that they are reasonably allocated to the corresponding task hosts has become an important technical problem.
Currently, task scheduling and execution fall mainly into the following two types:
(1) Serial single-node task scheduling, in which a single machine schedules and executes the tasks. Its advantage is that it is simple to implement and easy to maintain. Its disadvantage is that when a large number of tasks must be scheduled and executed, the load on the single machine becomes so heavy that task completion and performance cannot be guaranteed.
(2) Distributed task scheduling, in which multiple machines schedule and execute tasks in parallel. Its characteristics are that multi-machine hardware resources are fully used, a large number of tasks can be carried, horizontal scaling is straightforward when the tasks exceed the load, and the shortcomings of serial single-node task scheduling are effectively overcome.
Currently, enterprise-level task scheduling is mostly distributed scheduling, and provides mechanisms such as multiple scheduling strategies and task synchronization. For example:
The lightweight Quartz scheduling framework is used mainly for scheduling tasks by time period and time point. Alternatively, ScheduledExecutor, a periodic scheduling thread pool provided by Java itself, can execute tasks at a fixed period. Alternatively, the ZooKeeper framework is used to synchronize distributed tasks and prevent repeated or out-of-order execution.
For example, the prior art implements the scheduling of distributed tasks as follows:
defining one or more executable tasks;
configuring a trigger period or an execution time point for each executable task;
starting Quartz, ScheduledExecutor or another scheduling framework, reading the scheduling configuration, and waiting to trigger task execution;
the scheduling framework triggers task execution and then enters the next task scheduling period.
The executable tasks are synchronized through ZooKeeper, which prevents repeated execution on multiple machines.
However, the prior-art scheme relies only on the scheduling framework and cannot monitor task execution, so its support for distributed multi-machine execution is poor, and task backlog and load imbalance easily arise on some hosts.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a distributed task balancing scheduling center, method, system and device, which flexibly configure and adjust scheduling policies and task issuing parameters, thereby balancing the task amount of each task host.
In order to solve the above technical problem, the present invention provides a distributed task balanced scheduling method, including:
creating a plurality of task groups according to functions or types, and configuring a corresponding scheduling strategy for each task group;
customizing a plurality of scheduling plans for each task group, wherein each scheduling plan corresponds to an executable task set;
monitoring the current task execution amount of each task group, calculating to obtain the task scheduling amount of each scheduling plan according to configuration parameters, and loading tasks with the amount corresponding to the task scheduling amount into a task queue in a task cache;
calculating to obtain the task issuing quantity of each task host according to the current task execution quantity and the configuration parameters of each task group;
and taking out executable tasks from the corresponding task queue in the task cache according to the issued quantity of each task host and issuing the executable tasks to the corresponding task host.
The scheduling policy comprises at least a grouping configuration of the task hosts; the scheduling plan comprises at least a scheduling period and a weight.
Preferably, the step of monitoring the current task execution amount of each task group includes:
receiving task execution feedback information of a task host to obtain a total task execution amount in a time period;
calculating the average task execution amount in the time period;
multiplying the average task execution amount by a preset fine-adjustment coefficient to obtain the current task execution amount of each task group [formula image GDA0002793047660000031],
where the value of the fine-adjustment coefficient is 0.8 to 1.2.
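The monitoring step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the idea that feedback arrives as per-minute counts, and the sample numbers are all hypothetical and do not come from the source.

```python
from statistics import mean


def current_execution_amount(per_interval_counts, fine_tune=1.0):
    """Average the executed-task counts of a time period and scale the
    average by the fine-adjustment coefficient."""
    if fine_tune <= 0:
        raise ValueError("fine-adjustment coefficient must be greater than 0")
    if not per_interval_counts:
        return 0.0  # no feedback received yet
    return mean(per_interval_counts) * fine_tune


# Hypothetical feedback: tasks completed by one task group in five
# one-minute samples (assumed sampling scheme).
samples = [90, 110, 100, 95, 105]
amount = current_execution_amount(samples, fine_tune=1.2)
```

With these sample values the average is 100 tasks per interval, so a coefficient of 1.2 raises the current task execution amount to about 120.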
Preferably, the step of calculating the task scheduling amount of each scheduling plan includes:
calculating, from the current task queue length in the cache, the difference $Q_e$ between the maximum task queue length and the current task queue length;
provided that the maximum scheduling amount and the maximum queue length are not exceeded, taking the larger of the current task execution amount [formula image GDA0002793047660000032] and $Q_e$ as the total scheduling amount $Q_t$ of this round;
calculating the weight sum $W_{sum}$ of the scheduling plans of each task group according to a summation formula;
calculating the scheduling amount $Q_i$ of each scheduling plan from the ratio of its weight $W_i$ to the weight sum $W_{sum}$ and the total scheduling amount $Q_t$, according to formula 1-1:
$Q_i = Q_t \times (W_i \div W_{sum})$ (1-1).
Preferably, if the weights of the scheduling plans in the task group all differ from one another, the step of calculating the task issuing amount of each task host includes:
calculating the weight sum $W_{sum}$ of the scheduling plans in the task group according to a summation formula;
calculating the issuing amount $Ca_i$ of each scheduling plan from the current task execution amount, the weight and the weight sum according to formula 2-1:
[formula 2-1: image GDA0002793047660000033]
calculating, from the current task queue length, the average required issuing amount of each task host:
[formula image GDA0002793047660000034]
taking the smaller of the average required issuing amount [formula image GDA0002793047660000035] and $Ca_i$ as the task host issuing amount $C$;
if several scheduling plans in the task group share the same weight, the step of calculating the task issuing amount $C$ of each task host includes:
calculating the sum of the deduplicated weights, obtained after removing duplicate values from the scheduling plan weights in the task group:
[formula image GDA0002793047660000036]
where $m$ is the number of deduplicated weights and $W'_i$ is the i-th deduplicated weight;
calculating the issuing amount $C'_w$ of each deduplicated weight from the current task execution amount, the deduplicated weight and the deduplicated weight sum according to formula 2-2:
[formula 2-2: image GDA0002793047660000037]
calculating the period sum $I'_{sum}$ of the scheduling plans that share the same weight:
$I'_{sum} = \sum_i I_i$
where $I_i$ is the scheduling period of the i-th scheduling plan;
calculating the inverse period ratio $R_i$ of each scheduling plan under the same weight according to formula 2-3:
$R_i = I'_{sum} \div I_i$ (2-3);
summing the inverse period ratios of the scheduling plans under the same weight according to formula 2-4 to obtain the inverse ratio sum $R_{sum}$:
$R_{sum} = \sum_i R_i$ (2-4);
obtaining the task issuing amount $C'a_i$ of each scheduling plan in the period from its inverse period ratio $R_i$, the inverse ratio sum $R_{sum}$ and the issuing amount $C'_w$ of the deduplicated weight, according to formula 2-5:
$C'a_i = C'_w \times (R_i \div R_{sum})$ (2-5);
calculating, from the current task queue length, the average required issuing amount of each task host:
[formula image GDA0002793047660000042]
taking the smaller of the average required issuing amount [formula image GDA0002793047660000043] and $C'a_i$ as the task host issuing amount $C$.
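For the same-weight case, formulas 2-3 to 2-5 are given in text form and can be sketched directly. The sketch below assumes the issuing amount $C'_w$ for the shared weight is already known, because formula 2-2 survives only as an image reference; the function name and the sample periods are hypothetical.

```python
def split_by_inverse_period(c_w, periods):
    """Split the issuing amount of one shared weight among its plans so
    that plans with shorter scheduling periods receive larger shares."""
    if any(p <= 0 for p in periods):
        raise ValueError("scheduling periods must be positive")
    i_sum = sum(periods)                        # I'_sum: period sum
    ratios = [i_sum / p for p in periods]       # formula 2-3: R_i
    r_sum = sum(ratios)                         # formula 2-4: R_sum
    return [c_w * r / r_sum for r in ratios]    # formula 2-5: C'a_i


# Hypothetical example: 120 tasks to issue for one weight, shared by two
# plans with scheduling periods of 10 and 30 time units.
shares = split_by_inverse_period(120, [10, 30])
```

With these numbers the plan that runs three times as often receives roughly three times the share (about 90 versus 30), and the shares still add up to the original 120.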
Preferably, the method for balanced scheduling of distributed tasks further includes the steps of setting a synchronization lock for the task to be executed and removing the synchronization lock after the task to be executed is completed.
Preferably, before executing the task, the task host sets the synchronization lock for the task to be executed by executing a setnx(key, value) command in the Redis database.
The invention also provides a balanced scheduling center for distributed tasks, which comprises:
a configuration module, configured to create a plurality of task groups according to functions or types and configure a corresponding scheduling strategy for each task group, and to customize a plurality of scheduling plans for each task group and divide the task groups into corresponding executable task sets according to the scheduling plans;
a task scheduling module, configured to calculate the task scheduling amount of each scheduling plan according to the task execution feedback information and the configuration parameters of the scheduling plans, and to load a number of tasks corresponding to the task scheduling amount into a task queue in a task cache;
a task issuing module, configured to calculate the issuing amount of each task host according to the current task execution amount and the configuration parameters of the scheduling plans in the task group, and to take executable tasks out of the corresponding task queue in the task cache according to the issuing amount and issue them to the corresponding task host; and
a monitoring module, configured to receive the feedback information of the task hosts, obtain the task execution amount within a set time period, and calculate the task execution amount of each task group in that time period.
Preferably, the task scheduling module includes:
a first data reading unit, configured to read the configuration parameters of the scheduling strategy and the scheduling plan and the current task execution amount obtained from the monitoring module;
a first calculation unit, configured to sequentially obtain the corresponding parameter data from the first data reading unit according to a preset scheduling algorithm and calculate the task scheduling amount of each scheduling plan according to the corresponding scheduling algorithm flow; and
a task loading unit, configured to load tasks into the task queue in the cache according to the task scheduling amount of each scheduling plan.
Preferably, the task scheduling module further includes:
a synchronization lock identifier setting unit, configured to set a synchronization lock identifier for the scheduled executable task.
Preferably, the task issuing module includes:
a second data reading unit, configured to read the current task execution amount obtained by the monitoring module and to read the scheduling strategy configured by the configuration module and the configuration parameters of the scheduling plan;
a second calculation unit, configured to sequentially obtain the corresponding parameter data from the second data reading unit according to a preset issuing algorithm and calculate the issuing amount of each task host according to the corresponding issuing algorithm flow; and
a task issuing unit, configured to take executable tasks out of the corresponding task queue in the task cache according to the issuing amount of each task host and issue them to the corresponding task host.
The invention also provides a distributed task balanced scheduling device, which comprises a memory and a processor, wherein the memory is used for storing data and instructions, and the processor is configured as follows according to the instructions:
creating a plurality of task groups according to functions or types, and configuring a corresponding scheduling strategy for each task group;
customizing a plurality of scheduling plans for each task group, wherein each scheduling plan corresponds to an executable task set;
monitoring the current task execution amount of each task group, calculating to obtain the task scheduling amount of each scheduling plan according to the task execution feedback information and the configuration parameters of the scheduling plan, and loading the tasks with the amount corresponding to the task scheduling amount into task queues in a task cache;
calculating to obtain the task issuing amount of each task host according to the current task execution amount of each task group and the configuration parameters of the scheduling plan;
and taking out executable tasks from the corresponding task queue in the task cache according to the issued quantity of each task host and issuing the executable tasks to the corresponding task host.
The invention also provides a distributed task balanced scheduling system, which comprises:
the scheduling center described above; and
a task host cluster, configured to receive the tasks issued by the scheduling center, execute the tasks, and send task execution feedback information to the scheduling center.
Preferably, each task host in the task host cluster includes:
a task obtaining unit, configured to obtain the corresponding number of tasks from the scheduling center;
a task execution unit, configured to execute the obtained tasks;
a feedback unit, configured to send task execution feedback information to the scheduling center; and
a synchronization lock setting unit, configured to set a synchronization lock for a task according to the synchronization lock identifier in the task before the task is executed, and to remove the synchronization lock after the task is completed.
Preferably, the system for balanced scheduling of distributed tasks further includes a Redis database, in which the synchronization lock setting unit of the task host executes a setnx(key, value) command to set the synchronization lock for the task, and clears the synchronization lock after the task has been executed.
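The set-then-clear locking pattern described above can be sketched as follows. To keep the example runnable without a server, it uses a small in-memory stand-in that mimics the "set only if the key does not exist" semantics of the Redis SETNX command; the stand-in class, the key naming scheme and the helper function are all hypothetical. A redis-py client exposes `setnx(name, value)` and `delete(*names)` methods with the same call shape, so it could be passed in place of the stand-in, although the source does not specify any client library.

```python
class InMemorySetnxStore:
    """Stand-in that mimics the semantics of Redis SETNX and DEL."""

    def __init__(self):
        self._data = {}

    def setnx(self, key, value):
        # Set the key only if it is absent; report whether it was set.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def delete(self, key):
        # Remove the key; return the number of keys removed.
        return 1 if self._data.pop(key, None) is not None else 0


def run_with_lock(store, task_id, host_id, action):
    """Execute action only if this host wins the lock for the task."""
    key = f"task-lock:{task_id}"      # hypothetical key naming scheme
    if not store.setnx(key, host_id):
        return False                  # another host holds the lock: skip
    try:
        action()
    finally:
        store.delete(key)             # clear the lock after execution
    return True


store = InMemorySetnxStore()
executed = []
first = run_with_lock(store, "t1", "host-a", lambda: executed.append("t1"))
```

Only the host whose `setnx` call succeeds runs the task, which matches the statement that only a task whose lock was set successfully can be executed. The sketch omits lock expiry, which the source does not discuss.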
The invention effectively improves the task execution efficiency and the utilization rate of the task host and balances the load of the task host through the algorithm of task scheduling and task issuing. And synchronous locking is realized through a Redis database, so that lock conflict caused by network jitter is avoided, and the stability of task execution is improved. The invention manages tasks according to groups, can flexibly customize a scheduling plan, supports a task execution feedback mechanism, can monitor the health conditions of task scheduling, task issuing and task execution, flexibly adjusts the issued task quantity according to the feedback information, and dynamically maintains the load balance of the task host.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent by describing embodiments of the present invention with reference to the following drawings, in which:
FIG. 1 is a schematic diagram of the principle framework of the method and system for balanced scheduling of distributed tasks according to the present invention;
FIG. 2 is a schematic diagram illustrating the structure of a scheduling policy according to the present invention;
FIG. 3 is a schematic structural diagram of the balanced scheduling system for distributed tasks according to the present invention;
FIG. 4 is a schematic structural diagram of the task scheduling module;
FIG. 5 is a schematic structural diagram of the task issuing module according to the present invention;
FIG. 6 is a general flowchart of the method for balanced scheduling of distributed tasks according to the present invention;
FIG. 7 is a flowchart of task scheduling in the balanced scheduling of distributed tasks according to the present invention;
FIG. 8 is a schematic diagram of an issue queue in the cache;
FIG. 9 is a flowchart of task issuing in the balanced scheduling of distributed tasks according to the present invention;
FIG. 10 is a schematic structural diagram of a task host according to the present invention;
FIG. 11 is a flowchart illustrating a task host executing a task according to the present invention.
Detailed Description
The present invention will be described below based on examples, but the present invention is not limited to only these examples. In the following detailed description of the present invention, certain specific details are set forth. It will be apparent to one skilled in the art that the present invention may be practiced without these specific details. Well-known methods and procedures have not been described in detail so as not to obscure the present invention. The figures are not necessarily drawn to scale.
The flowcharts and block diagrams in the figures illustrate the possible architectures, functions, and operations of the systems, methods, and apparatuses according to the embodiments of the present invention. Each block may represent a module, a program segment, or merely a code segment, which is an executable instruction for implementing a specified logical function. It should also be noted that the executable instructions that implement the specified logical functions may be recombined to create new modules and program segments. The blocks of the drawings, and the order of the blocks, are thus provided to better illustrate the processes and steps of the embodiments and should not be taken as limiting the invention itself.
To balance the load on the task hosts, the scheduling strategy and the task issuing parameters must be flexibly configurable, and the task issuing parameters must be adjusted according to the current running conditions through scheduling, issuing, monitoring and statistics mechanisms. FIG. 1 is a schematic diagram of the principle framework of the distributed task balanced scheduling method and system of the invention.
The scheduling center is responsible for task scheduling and task issuing. During task scheduling, it loads the calculated number of tasks to be executed into a task cache. During task issuing, it calculates the issuing amount of each task host through a task issuing algorithm, takes the corresponding number of tasks out of the cache, and issues them to the corresponding hosts in the task host cluster.
During task scheduling, the task scheduling strategy is read and scheduling is performed according to a scheduling algorithm.
During task issuing, the task scheduling strategy is read, the calculation parameters are obtained, and issuing is performed according to a task issuing algorithm.
When task synchronization is needed, a synchronization lock is set for an executable task through a Redis database, and only a task whose lock was set successfully can be executed.
The host packages information such as task execution results, executable task queue states and the task execution amount within a period, and feeds it back to the scheduling center. Based on this feedback information, the scheduling center dynamically adjusts the number of tasks it schedules and issues, namely the scheduling amount and the issuing amount.
Fig. 2 is a schematic diagram illustrating the structural principle of the scheduling policy described above.
Different task groups are created according to functions or types. As shown in fig. 2, there are 4 task groups, task group 1 to task group 4, and each task group is configured with its own scheduling policy, scheduling policy 1 to scheduling policy 4. The scheduling policy includes at least task execution node grouping configuration information.
Many methods can be used for task scheduling and task issuing, and the method provided by the invention is one of them. The scheduling policy shown in fig. 2 may use other methods besides the one described in the present invention, so the scheduling policy also includes a scheduling algorithm type parameter that indicates which method is used when the task group is scheduled and issued.
The task execution node grouping configuration information specifies which nodes (i.e., task hosts) can execute which task group. Other configurations are also provided, such as a timeout policy that sets how scheduling and issuing timeouts are handled. Further configurations are possible but are not described in detail here because they are not germane to the present invention.
For each task group, a plurality of scheduling plans are customized. Each scheduling plan corresponds to a set of executable tasks and mainly comprises a task type, a scheduling period, a weight, a scheduling script and the like.
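The relationship between task groups, scheduling policies and scheduling plans described above could be modelled with a few record types. The following is a minimal sketch, not the patent's data model: every class name, field name, default value and sample entry is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchedulingPlan:
    task_type: str          # kind of task the plan produces
    period_seconds: int     # scheduling period
    weight: int             # relative share of the scheduling amount
    script: str             # scheduling script that selects the tasks


@dataclass
class SchedulingPolicy:
    host_group: List[str]               # task hosts allowed to run the group
    algorithm_type: str = "balanced"    # which scheduling method to apply
    timeout_seconds: int = 60           # assumed form of a timeout policy


@dataclass
class TaskGroup:
    name: str
    policy: SchedulingPolicy
    plans: List[SchedulingPlan] = field(default_factory=list)


# Hypothetical task group with two scheduling plans.
group = TaskGroup(
    name="order-sync",
    policy=SchedulingPolicy(host_group=["host-a", "host-b"]),
    plans=[
        SchedulingPlan("full-sync", 300, 1, "load_full.sql"),
        SchedulingPlan("delta-sync", 60, 3, "load_delta.sql"),
    ],
)
weight_sum = sum(plan.weight for plan in group.plans)
```

Keeping the host grouping on the policy and the period and weight on each plan mirrors the split the text draws between per-group and per-plan configuration.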
The scheduling center calculates the scheduling amount of each scheduling plan through a scheduling algorithm and loads the tasks corresponding to each scheduling plan into the corresponding task queue in the cache. It then calculates the issuing amount of each host according to an issuing algorithm, takes that number of tasks out of the corresponding task queue, and issues them to the corresponding task host.
According to the foregoing principle and scheduling policy framework, the present invention provides a balanced scheduling system for distributed tasks, and a schematic structural diagram of an embodiment of the balanced scheduling system is shown in fig. 3.
The distributed task balanced scheduling system comprises a scheduling center 1 and a task host cluster 2, wherein the scheduling center 1 comprises:
a configuration module 11, configured to create a plurality of task groups according to functions or types, and configure a corresponding scheduling policy for each task group; customizing a plurality of scheduling plans for each task group, and dividing the task groups into corresponding executable task sets according to the scheduling plans;
the task scheduling module 12 calculates the scheduling amount of each scheduling plan according to the task execution feedback information and the configuration parameters of the scheduling plan, and loads the tasks to be executed into the corresponding task queues in the task cache 14;
the task issuing module 13 calculates an issuing amount of each task host according to the task execution feedback information and the configuration parameters of the scheduling plan in the task group, and takes out the executable task from the task queue in the task cache 14 according to the issuing amount and issues the executable task to the corresponding task host;
the task cache 14 is used for storing tasks to be executed, and comprises a plurality of task queues; and
the monitoring module 15 is configured to receive the task host feedback information, obtain a task execution amount in a set time period, and calculate an average task execution amount of each task group in the time period.
As described above, the scheduling policy at least includes task execution host grouping configuration, and in addition, may also include a task scheduling algorithm type, a timeout policy, and the like; the scheduling plan comprises a task type, a scheduling period, a weight, a scheduling script and the like.
The task scheduling module, as shown in fig. 4, includes a first data reading unit 121, a first calculating unit 122, and a task loading unit 123. The first data reading unit 121 is configured to read the average task execution amount of the task group, the scheduling policy, and the configuration parameters in the scheduling plan, which are obtained by the monitoring module 15; the first calculating unit 122 is configured to sequentially obtain corresponding parameter data from the first data reading unit 121 according to a preset scheduling algorithm, and calculate a scheduling amount of each scheduling plan according to a corresponding scheduling algorithm flow; the task loading unit 123 loads the task into the corresponding task queue in the cache according to the scheduling amount of each scheduling plan.
The task scheduling module 12 further includes a synchronization lock identifier setting unit 124, configured to set a synchronization lock identifier for a task before the task is loaded into the cache, so that the task host, on reading the identifier when it executes the task, sets a synchronization lock for the task.
Fig. 5 is a schematic structural diagram of the task issuing module according to the present invention. The task issuing module 13 includes a second data reading unit 131, a second calculating unit 132, and a task issuing unit 133. The second data reading unit 131 reads the average task execution amount of the task group obtained by the monitoring module 15, together with the scheduling policy and the relevant parameters in the scheduling plan, such as the current task queue length, the weight of the scheduling plan, the scheduling period of the scheduling plan, and the number of task hosts; the second calculating unit 132 sequentially obtains the corresponding parameter data from the second data reading unit 131 according to a preset issuing algorithm, that is, according to different formulas and sequences, and calculates the issuing amount of each task host; the task issuing unit 133 takes the corresponding number of tasks out of the corresponding task queue in the task cache according to the issuing amount of each task host and sends them to that task host.
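The last step, taking tasks out of a queue according to each host's issuing amount, can be sketched with a double-ended queue. The function name, host names and amounts are hypothetical, and the sketch assumes the issuing amounts have already been calculated; it shows only the dequeue-and-assign step.

```python
from collections import deque


def issue_tasks(queue, amounts):
    """Pop up to the issuing amount of tasks for each host, in order."""
    issued = {}
    for host, amount in amounts.items():
        count = min(amount, len(queue))   # never take more than is queued
        issued[host] = [queue.popleft() for _ in range(count)]
    return issued


# Hypothetical queue of ten cached tasks and three task hosts that are
# each entitled to four tasks in this round.
task_queue = deque(f"task-{n}" for n in range(10))
issued = issue_tasks(task_queue, {"host-a": 4, "host-b": 4, "host-c": 4})
```

Because only ten tasks are queued, the third host receives the two that remain, and the queue is empty afterwards.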
Fig. 6 is a general flowchart of the balanced scheduling method for distributed tasks according to the present invention, which specifically includes:
Step S1, creating a plurality of task groups according to functions or types, and configuring a corresponding scheduling strategy for each task group;
Step S2, customizing a plurality of scheduling plans for each task group, each scheduling plan corresponding to an executable task set;
Step S3, monitoring the task execution amount of each task group, calculating the scheduling amount Q of each scheduling plan according to the scheduling strategy, and loading the tasks to be executed into the task queue in the cache;
Step S4, calculating the issuing amount C of each task host according to the task execution amount of each task group and the scheduling strategy;
Step S5, taking the executable tasks out of the cache according to the issuing amount of each task host and issuing them to the corresponding task host.
Specifically, in step 3, the task execution amount of each task group is monitored, and a flow of calculating the task scheduling amount Q of each scheduling plan according to the scheduling policy is shown in fig. 7, which specifically includes the following steps:
step S31, reading the corresponding parameters from the scheduling policy, such as the maximum scheduling amount Q_max of the task group and the maximum issue-queue length Q_b, and using these values as thresholds during calculation;
step S32, receiving the feedback information of the task hosts from the monitoring module of the scheduling center to obtain the task execution amount for a time period; calculating the average task execution amount of the task group over that period, for example 5 minutes; and then multiplying the average task execution amount by a preset fine-tuning coefficient to obtain the current task execution amount of each task group
Figure GDA0002793047660000101
The fine-tuning coefficient is a floating-point value greater than 0. It can be set manually, serves mainly for manual fine-tuning of the calculation result, and is usually, but not necessarily, between 0.8 and 1.2.
Step S33, according to the length Q_a of the current issue queue, calculating the difference Q_e between the maximum issue-queue length Q_b and the current issue-queue length Q_a by the formula Q_e = Q_b - Q_a.
Step S34, provided that neither the maximum scheduling amount Q_max nor the maximum queue length Q_b is exceeded, taking the larger of the quantity shown in Figure GDA0002793047660000111 and Q_e as the total scheduling amount Q_t.
Step S35, calculating the weight sum W_sum of the scheduling plans:
W_sum = Σ W_i (i = 1, 2, …, n)
where W_i is the weight of the i-th scheduling plan, i is a natural number 1, 2, 3, …, and n is the number of scheduling plans in the task group.
Step S36, calculating the scheduling amount Q_i of each scheduling plan from the total scheduling amount and the ratio of that plan's weight to the weight sum, according to the formula Q_i = Q_t × (W_i ÷ W_sum).
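Steps S31 to S36 can be sketched in a few lines of Python. This is a hedged illustration rather than the patented implementation. One operand of step S34 is given only as an image in the publication, so the sketch assumes it is the current task execution amount. The function names, parameter names, and sample numbers are all hypothetical.

```python
from typing import List, Sequence


def current_execution_amount(samples: Sequence[float],
                             fine_tuning: float = 1.0) -> float:
    """S32: average the reported execution amounts, then apply the coefficient."""
    if fine_tuning <= 0:
        raise ValueError("fine-tuning coefficient must be greater than 0")
    if not samples:
        return 0.0
    return (sum(samples) / len(samples)) * fine_tuning


def scheduling_amounts(weights: Sequence[float], current_exec: float,
                       q_a: float, q_b: float, q_max: float) -> List[float]:
    """S33 to S36: split the total scheduling amount across plans by weight."""
    q_e = q_b - q_a                       # S33: room left in the issue queue
    q_t = max(current_exec, q_e)          # S34: assumed operand, see lead-in
    q_t = max(0.0, min(q_t, q_max, q_b))  # S34: stay within both thresholds
    w_sum = sum(weights)                  # S35: weight sum
    if w_sum <= 0:
        raise ValueError("weight sum must be positive")
    return [q_t * (w / w_sum) for w in weights]  # S36: Q_i = Q_t * W_i / W_sum


# Hypothetical numbers: five feedback samples, queue holding 700 of 1000 slots.
exec_now = current_execution_amount([100, 120, 80, 110, 90], fine_tuning=1.0)
amounts = scheduling_amounts([1, 1, 2], exec_now, q_a=700, q_b=1000, q_max=500)
```

With these inputs the queue has room for 300 tasks, which exceeds the current execution amount of 100 and stays below both thresholds, so the three plans receive 75, 75, and 150 tasks.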
Fig. 8 is a schematic diagram of the issue queues of one task group in the cache according to an embodiment of the present invention. The task group has 7 scheduling plans, and the scheduling amounts Q_p1 to Q_p7 of the 7 scheduling plans are obtained through the above process.
When tasks are issued, the issuing algorithm differs slightly depending on whether the weights of the scheduling plans are the same. As shown in fig. 9, when the weights of the scheduling plans in the task group differ from one another, the task issue amount C of each task host is calculated as follows:
step S41a, calculating the weight sum of the scheduling plans in the task group: W_sum = Σ W_i, summed over all scheduling plans in the task group;
Step S42a, calculating the issue amount Ca_i of each scheduling plan from the current task execution amount, the weight of the plan, and the weight sum, according to formula 2-1:
Figure GDA0002793047660000114
step S43a, calculating the average number of tasks required by each task host according to the length of the current task queue:
Figure GDA0002793047660000115
Step S44a, taking the smaller of the quantity shown in Figure GDA0002793047660000116 and Ca_i as the task host issue amount C.
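The following Python sketch walks through steps S41a to S44a. Formula 2-1 and the average-requirement formula are available only as images in the publication, so two assumptions are made and marked in the comments: Ca_i is taken to be the current task execution amount scaled by the plan's share of the weight sum, and the average requirement is taken to be the queue length divided by the number of task hosts. All names and numbers are hypothetical.

```python
from typing import Sequence


def issue_amount_distinct_weights(current_exec: float,
                                  weights: Sequence[float],
                                  plan_index: int,
                                  queue_length: int,
                                  host_count: int) -> float:
    """Issue amount C for one plan when the plan weights differ."""
    w_sum = sum(weights)                               # S41a: weight sum
    if w_sum <= 0 or host_count <= 0:
        raise ValueError("weight sum and host count must be positive")
    # S42a: ASSUMED form of formula 2-1 (the original is an image).
    ca_i = current_exec * weights[plan_index] / w_sum
    # S43a: ASSUMED form of the average requirement (the original is an image).
    avg_required = queue_length / host_count
    return min(avg_required, ca_i)                     # S44a: take the smaller


# Hypothetical numbers: the second plan holds 3/4 of the weight.
capped = issue_amount_distinct_weights(120, [1, 3], 1,
                                       queue_length=200, host_count=4)
uncapped = issue_amount_distinct_weights(120, [1, 3], 1,
                                         queue_length=400, host_count=4)
```

In the first call the queue-based average (50) is the smaller value and caps the issue amount; in the second the weight-based amount (90) is the smaller value.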
If several scheduling plans in the task group have the same weight, the task issue amount of each task host is calculated as follows:
step S41b, calculating the sum of the deduplicated weights of the scheduling plans in the task group:
Figure GDA0002793047660000117
where m is the number of deduplicated weights and W'_i is the i-th deduplicated weight.
Step S42b, calculating the issue amount C'_w of each deduplicated weight from the current task execution amount, the deduplicated weight, and the deduplicated weight sum, according to formula 2-2:
Figure GDA0002793047660000118
Step S43b, calculating the period sum I'_sum of the scheduling plans having the same weight:
I'_sum = Σ I_i
where I_i is the scheduling period of the i-th scheduling plan.
Step S44b, calculating the inverse period ratio R_i of each scheduling plan having the same weight according to formula 2-3:
R_i = I'_sum ÷ I_i    (2-3);
Step S45b, summing the inverse period ratios of the scheduling plans having the same weight according to formula 2-4 to obtain the inverse-ratio sum R_sum:
R_sum = Σ R_i    (2-4);
Step S46b, obtaining the task issue amount C'a_i of each scheduling plan in the current period from its inverse period ratio R_i, the inverse-ratio sum R_sum, and the issue amount C'_w of the deduplicated weight, according to formula 2-5:
C'a_i = C'_w × (R_i ÷ R_sum)    (2-5);
Step S47b, calculating the average number of tasks required by each task host according to the length of the current task queue:
Figure GDA0002793047660000123
Step S48b, taking the smaller of the quantity shown in Figure GDA0002793047660000124 and C'a_i as the task host issue amount C.
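The split by inverse period ratio (formulas 2-3 and 2-5) can be illustrated with a short Python sketch. The issue amount of the shared weight and the per-host average requirement are passed in as plain numbers, because their formulas appear only as images in the publication. The function and parameter names are hypothetical.

```python
from typing import List, Sequence


def issue_amounts_same_weight(c_w: float,
                              periods: Sequence[float],
                              avg_required: float) -> List[float]:
    """Split the issue amount c_w among plans that share one weight."""
    if any(p <= 0 for p in periods):
        raise ValueError("scheduling periods must be positive")
    i_sum = sum(periods)                          # S43b: period sum
    ratios = [i_sum / p for p in periods]         # S44b: formula 2-3
    r_sum = sum(ratios)                           # S45b: sum of the ratios
    shares = [c_w * (r / r_sum) for r in ratios]  # S46b: formula 2-5
    return [min(avg_required, s) for s in shares]  # S48b: take the smaller


# Hypothetical numbers: two plans with periods of 60 s and 120 s share 90 tasks.
amounts = issue_amounts_same_weight(90, [60, 120], avg_required=100)
```

Because the ratio is inverse to the period, the plan that runs every 60 s receives about twice as many tasks (roughly 60) as the plan that runs every 120 s (roughly 30).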
And obtaining the issuing quantity of each task host according to the flow, and issuing the tasks to the corresponding task hosts from the cache according to the issuing quantity.
In the present invention, the structural principle of the task host is shown in fig. 10, the task host includes a task obtaining unit 21, a task executing unit 22, a feedback unit 23, and a synchronization lock setting unit 24, where the task obtaining unit 21 obtains a corresponding number of tasks (i.e. the issued amount calculated by the scheduling center) from the scheduling center; the task execution unit 22 executes the acquired task; the feedback unit 23 sends task execution feedback information to the scheduling center; the synchronization lock setting unit 24 is configured to set a synchronization lock for the task according to a synchronization lock identifier in the task before executing the task, and remove the synchronization lock after completing the task.
The process of the task host executing the task is shown in fig. 11.
Step S1a, obtaining from the scheduling center a number of tasks equal to the issue amount. Specifically, after completing its current tasks, the task host requests tasks from the scheduling center; the scheduling center calculates the issue amount for that task host according to the issuing algorithm and sends that many tasks to the task host.
Step S2a, after obtaining the tasks, according to the synchronization lock identifier in the task, executing a setnx (key, value) command in the Redis database to set the synchronization lock for the task.
And step S3a, sequentially executing the tasks.
Step S4a, judging whether a task is completed; if so, executing step S5a; if not, returning to step S3a to continue executing the task.
And step S5a, removing the synchronization lock.
And step S6a, judging whether the feedback period is reached, if so, executing step S7a, and if not, continuing to execute the task.
And step S7a, packaging and feeding back the task execution condition to the task scheduling center.
Step S8a, judging whether all tasks are completed, if not, continuing to execute the tasks; if all the tasks are finished, ending the execution process of the task, and starting a new task acquisition and execution process.
By default, a task can be taken and executed by only one host at a time. However, if a task has not finished within one period and the next period schedules and issues it again, several hosts could execute the same task simultaneously. Therefore, a host sets a lock before executing a task and executes the task only if the lock is set successfully; if setting the lock fails, the host waits until the timeout and then returns directly.
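The locking rule can be sketched as follows. To keep the example runnable without a server, it uses a hypothetical in-memory stand-in that mimics only the two Redis operations involved, SETNX (set if the key does not exist) and DEL; a real deployment would call the corresponding commands of a Redis client instead. The stand-in is not thread-safe, and the sketch returns immediately when the lock is taken rather than waiting for a timeout. The key and host names are illustrative.

```python
from typing import Any, Callable, Dict, Tuple


class InMemoryStore:
    """Hypothetical stand-in for the Redis SETNX and DEL operations."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def setnx(self, key: str, value: str) -> bool:
        # Set the key only if it does not exist yet, as SETNX does.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def delete(self, key: str) -> int:
        # Remove the key and report how many keys were removed (0 or 1).
        return 1 if self._data.pop(key, None) is not None else 0


def run_with_lock(store: InMemoryStore, lock_key: str, host_id: str,
                  task: Callable[[], Any]) -> Tuple[bool, Any]:
    """Run the task only if this host wins the lock; always release it after."""
    if not store.setnx(lock_key, host_id):
        return (False, None)  # another host already holds the lock
    try:
        return (True, task())
    finally:
        store.delete(lock_key)  # step S5a: remove the synchronization lock


store = InMemoryStore()
first = run_with_lock(store, "lock:task-42", "host-a", lambda: "done")
store.setnx("lock:task-42", "host-b")  # simulate a second host holding the lock
second = run_with_lock(store, "lock:task-42", "host-a", lambda: "done")
```

The first call acquires the lock, runs the task, and releases the lock; the second call finds the key already set by another host and skips the task.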
Through its task scheduling and task issuing algorithms, the invention improves task execution efficiency and the utilization rate of the task hosts and balances the load across them. The synchronization lock is implemented through a Redis database, which avoids lock conflicts caused by network jitter and improves the stability of task execution. The invention manages tasks by group, allows scheduling plans to be customized flexibly, and supports a task execution feedback mechanism; it can monitor the health of task dropping, task issuing, and task execution, adjust the number of issued tasks according to the feedback information, and dynamically maintain the load balance of the task hosts.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (15)

1. A balanced scheduling method for distributed tasks comprises the following steps:
creating a plurality of task groups according to functions or types, and configuring a corresponding scheduling strategy for each task group;
customizing a plurality of scheduling plans for each task group, wherein each scheduling plan corresponds to an executable task set;
receiving task execution feedback information of a task host to obtain a total task execution amount in a time period; calculating the average task execution amount in the time period; according to a preset fine adjustment coefficient, multiplying the average task execution quantity by the fine adjustment coefficient to obtain the current task execution quantity of each task group
Figure FDA0002793047650000012
Wherein the fine tuning coefficient is a floating point coefficient larger than 0;
calculating to obtain the task scheduling amount of each scheduling plan according to the task execution feedback information and the configuration parameters of the scheduling plan, and loading the tasks with the amount corresponding to the task scheduling amount into a task queue in a task cache;
calculating to obtain the task issuing amount of each task host according to the current task execution amount of each task group and the configuration parameters of the scheduling plan;
and taking out executable tasks from the corresponding task queue in the task cache according to the issued quantity of each task host and issuing the executable tasks to the corresponding task host.
2. The balanced scheduling method of distributed tasks according to claim 1, wherein the scheduling policy comprises at least a grouping configuration of task hosts; the dispatch plan includes at least a dispatch period and a weight.
3. The balanced scheduling method of distributed tasks according to claim 2, wherein the fine tuning coefficient has a value in the range of 0.8-1.2.
4. The method for balanced scheduling of distributed tasks according to claim 3, wherein said step of calculating the task scheduling amount for each scheduling plan comprises:
calculating the difference Q_e between the length of the current task queue and the maximum task queue length according to the length of the current task queue in the cache;
provided that neither the maximum scheduling amount nor the maximum queue length is exceeded, taking the larger of the quantity shown in Figure FDA0002793047650000011 and Q_e as the total scheduling amount Q_t of the task, wherein the maximum scheduling amount is a threshold set by the task group for the scheduling amount of each scheduling plan;
calculating the weight sum W_sum of the plurality of scheduling plans of each task group according to a summation formula;
calculating the scheduling amount Q_i of the i-th scheduling plan according to formula 1-1 from the ratio of the weight W_i of the i-th scheduling plan to the weight sum W_sum of the plurality of scheduling plans and the total scheduling amount Q_t of the task:
Q_i = Q_t × (W_i ÷ W_sum)    (1-1).
5. The balanced scheduling method of distributed tasks according to claim 3, wherein if the weights of the plurality of scheduling plans in the task group are different from each other, the step of calculating the task issuing amount of each task host includes:
calculating the weight sum W_sum of the scheduling plans in one task group according to a summation formula;
calculating the issue amount Ca_i of the i-th scheduling plan according to formula 2-1 from the current task execution amount, the weight of the i-th scheduling plan, and the weight sum of the scheduling plans:
Figure FDA0002793047650000021
According to the length of the current task queue, the average required delivery amount of each task host is calculated
Figure FDA0002793047650000022
taking the smaller of the quantity shown in Figure FDA0002793047650000023 and Ca_i as the task host issue amount C;
if the plurality of scheduling plans in the task group have the same weight, the step of calculating the task issuing quantity C of each task host comprises the following steps:
calculating the sum of the deduplicated weights of the scheduling plans in the task group
Figure FDA0002793047650000024
wherein m is the number of deduplicated weights and W'_i is the i-th deduplicated weight;
calculating the issue amount C'_w of the i-th deduplicated weight according to formula 2-2 from the current task execution amount, the deduplicated weight, and the deduplicated weight sum:
Figure FDA0002793047650000025
calculating the period sum I'_sum of the scheduling plans having the same weight:
I'_sum = Σ I_i
wherein I_i is the scheduling period of the i-th scheduling plan;
calculating the inverse period ratio R_i of the i-th scheduling plan having the same weight according to formula 2-3:
R_i = I'_sum ÷ I_i    (2-3);
summing the inverse period ratios of the scheduling plans having the same weight according to formula 2-4 to obtain the inverse-ratio sum R_sum:
R_sum = Σ R_i    (2-4);
obtaining the task issue amount C'a_i of the scheduling plan in the current period according to formula 2-5 from the inverse period ratio R_i of the scheduling plan having the same weight, the inverse-ratio sum R_sum, and the issue amount C'_w of the deduplicated weight:
C'a_i = C'_w × (R_i ÷ R_sum)    (2-5);
According to the length of the current task queue, the average required issuing number of each task host is calculated
Figure FDA0002793047650000031
taking the smaller of the quantity shown in Figure FDA0002793047650000032 and C'a_i as the task host issue amount C.
6. The method for balanced scheduling of distributed tasks according to claim 1, further comprising the steps of setting a synchronization lock for a task to be executed and removing the synchronization lock after completion of the task to be executed.
7. The method for balanced scheduling of distributed tasks according to claim 6, wherein a task host sets the synchronization lock for the task to be executed by executing a setnx (key, value) command in a Redis database before executing the task.
8. A balanced dispatch center for distributed tasks, comprising:
the configuration module is used for creating a plurality of task groups according to functions or types and configuring a corresponding scheduling strategy for each task group; customizing a plurality of scheduling plans for each task group, and dividing the task groups into corresponding executable task sets according to the scheduling plans;
the monitoring module is used for receiving task execution feedback information of the task host and acquiring the total task execution amount in a time period; calculating the average task execution amount in the time period; according to a preset fine adjustment coefficient, multiplying the average task execution quantity by the fine adjustment coefficient to obtain the current task execution quantity of each task group
Figure FDA0002793047650000033
Wherein the fine tuning coefficient is a floating point coefficient larger than 0;
the task scheduling module is used for calculating the task scheduling amount of each scheduling plan according to the task execution feedback information and the configuration parameters of the scheduling plan, and loading the tasks with the amount corresponding to the task scheduling amount into a task queue in a task cache;
and the task issuing module is used for calculating the issuing quantity of each task host according to the current task execution quantity and the configuration parameters of the scheduling plan in the task group, and taking out the executable tasks from the corresponding task queue in the task cache according to the issuing quantity to issue the executable tasks to the corresponding task host.
9. The balanced task scheduling center according to claim 8, wherein the task scheduling module comprises:
the first data reading unit is used for reading the configuration parameters of the scheduling strategy and the scheduling plan and the current task execution amount obtained from the monitoring module;
the first calculation unit is used for sequentially obtaining corresponding parameter data from the first data reading unit according to a preset scheduling algorithm and calculating the task scheduling amount of each scheduling plan according to a corresponding scheduling algorithm flow; and
and the task loading unit loads the tasks into the task queue in the cache according to the task scheduling amount of each scheduling plan.
10. The distributed task balanced scheduling center of claim 9 wherein the task scheduling module further comprises:
and the synchronous lock identifier setting unit is used for setting a synchronous lock identifier for the scheduled executable task.
11. The balanced task scheduling center according to claim 8, wherein the task issuing module comprises:
the second data reading unit is used for reading the current task execution amount obtained by the monitoring module and reading the scheduling strategy configured by the configuration module and the configuration parameters of the scheduling plan;
the second calculation unit is used for sequentially obtaining corresponding parameter data from the second data reading unit according to a preset issuing algorithm and calculating to obtain issuing quantity of each task host according to a corresponding issuing algorithm process; and
and the task issuing unit is used for taking out the executable task from the corresponding task queue in the task cache according to the issuing quantity of each task host and issuing the executable task to the corresponding task host.
12. A distributed task balanced scheduling device, comprising a memory and a processor, wherein the memory is used for storing data and instructions, and the processor is configured according to the instructions as follows:
creating a plurality of task groups according to functions or types, and configuring a corresponding scheduling strategy for each task group;
customizing a plurality of scheduling plans for each task group, wherein each scheduling plan corresponds to an executable task set;
receiving task execution feedback information of a task host to obtain a total task execution amount in a time period; calculating the average task execution amount in the time period; according to a preset fine adjustment coefficient, multiplying the average task execution quantity by the fine adjustment coefficient to obtain the current task execution quantity of each task group
Figure FDA0002793047650000041
Wherein the fine tuning coefficient is a floating point coefficient larger than 0;
calculating to obtain a task scheduling amount of each scheduling plan according to task execution feedback information and configuration parameters of the scheduling plan, and loading tasks with the number corresponding to the task scheduling amount into a task queue in a task cache;
calculating to obtain the task issuing amount of each task host according to the current task execution amount of each task group and the configuration parameters of the scheduling plan;
and taking out executable tasks from the corresponding task queue in the task cache according to the issued quantity of each task host and issuing the executable tasks to the corresponding task host.
13. A system for balanced scheduling of distributed tasks, comprising:
a dispatch center as claimed in any one of claims 8-11, and
and the task host cluster is used for receiving the tasks issued by the dispatching center, executing the tasks and sending task execution feedback information to the dispatching center.
14. The system for balanced scheduling of distributed tasks according to claim 13, wherein each task host in the task host cluster comprises:
the task obtaining unit is used for obtaining the tasks with corresponding quantity from the dispatching center;
the task execution unit is used for executing the acquired task;
the feedback unit is used for sending task execution feedback information to the scheduling center; and
and the synchronous lock setting unit is used for setting a synchronous lock for the task according to a synchronous lock identifier in the task before the task is executed, and removing the synchronous lock after the task is completed.
15. The system for balanced scheduling of distributed tasks according to claim 14, further comprising a Redis database, wherein the synchronization lock setting unit of the task host executes a setnx(key, value) command in the Redis database to set the synchronization lock for the task, and wherein the synchronization lock is cleared after the task is executed.
CN201610615952.3A 2016-07-29 2016-07-29 Balanced scheduling center, method, system and device for distributed tasks Active CN107665144B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610615952.3A CN107665144B (en) 2016-07-29 2016-07-29 Balanced scheduling center, method, system and device for distributed tasks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610615952.3A CN107665144B (en) 2016-07-29 2016-07-29 Balanced scheduling center, method, system and device for distributed tasks

Publications (2)

Publication Number Publication Date
CN107665144A CN107665144A (en) 2018-02-06
CN107665144B true CN107665144B (en) 2021-02-26

Family

ID=61115746

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610615952.3A Active CN107665144B (en) 2016-07-29 2016-07-29 Balanced scheduling center, method, system and device for distributed tasks

Country Status (1)

Country Link
CN (1) CN107665144B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101038559A (en) * 2006-09-11 2007-09-19 中国工商银行股份有限公司 Batch task scheduling engine and dispatching method
CN101458634A (en) * 2008-01-22 2009-06-17 中兴通讯股份有限公司 Load equilibration scheduling method and device
CN103793272A (en) * 2013-12-27 2014-05-14 北京天融信软件有限公司 Periodical task scheduling method and periodical task scheduling system
CN103984594A (en) * 2014-05-14 2014-08-13 上海上讯信息技术股份有限公司 Task scheduling method and system based on distributed configurable weighting algorithm

Also Published As

Publication number Publication date
CN107665144A (en) 2018-02-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant