CN114077486B - MapReduce task scheduling method and system - Google Patents

Info

Publication number
CN114077486B
CN114077486B (application CN202111386374.8A)
Authority
CN
China
Prior art keywords
job
resource
task
resources
central
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111386374.8A
Other languages
Chinese (zh)
Other versions
CN114077486A (en)
Inventor
高永强
张凯丰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inner Mongolia University
Original Assignee
Inner Mongolia University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inner Mongolia University filed Critical Inner Mongolia University
Priority to CN202111386374.8A
Publication of CN114077486A
Application granted
Publication of CN114077486B

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 — Arrangements for program control, e.g. control units
    • G06F 9/06 — Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 — Multiprogramming arrangements
    • G06F 9/48 — Program initiating; program switching, e.g. by interrupt
    • G06F 9/4806 — Task transfer initiation or dispatching
    • G06F 9/4843 — Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881 — Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F 9/50 — Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 — Allocation of resources to service a request
    • G06F 9/5011 — Allocation of resources to service a request, the resources being hardware resources other than CPUs, servers and terminals
    • G06F 9/5016 — Allocation of resources to service a request, the resource being the memory
    • G06F 9/5027 — Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 9/5038 — Allocation of resources considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F 9/5044 — Allocation of resources considering hardware capabilities

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Multi Processors (AREA)

Abstract

The invention provides a MapReduce task scheduling method and system that introduce a Docker-container-based preemption mechanism, overcoming the drawback of Yarn's existing kill-based preemption, which terminates tasks outright. The container-based mechanism releases the resources a task occupies while preserving its progress; combined with a service-level-agreement-aware task policy, it lets high-priority tasks preempt the running resources of other tasks so that task completion times meet the Service Level Agreement (SLA) target. The scheduling method maintains high cluster resource utilization while also providing low latency and fast response for tasks.

Description

MapReduce task scheduling method and system
Technical Field
The invention relates to the technical field of task scheduling in heterogeneous cluster environments, in particular to a MapReduce task scheduling method and system.
Background
At present, with the development of internet technology, the volume of data that must be computed and processed in daily production and life keeps growing, and processing large-scale data with distributed computing systems has become common practice. The scheduler is a vital part of such a distributed system: a well-designed scheduling strategy can efficiently match program requirements to available cluster resources and reduce the operating costs of the data center. The most widely used distributed computing framework today is Apache's flagship project Hadoop, whose programming and computing framework is MapReduce. Hadoop abstracts its resource management part into an independent framework, Yarn, a universal resource management platform that provides computing programs such as MapReduce with the resources they need.
At present, Yarn implements three schedulers based on different scheduling strategies: a first-in first-out scheduler, a capacity scheduler and a fair scheduler. Although these three strategies improve cluster utilization and optimize cluster performance to a certain extent, scheduling jobs with different resource requirements and QoS constraints in a complex heterogeneous cluster environment remains an open challenge. By completion time, jobs can be classified into short jobs and long jobs. Short jobs generally have low-latency requirements, while long jobs can tolerate higher latency but have quality-of-service requirements. Short jobs therefore need to be scheduled immediately after they are submitted, to avoid queuing delays. For long jobs, if the cluster has free resources, the scheduler should let them use those resources, which increases the cluster's resource utilization.
In a real working environment, many long jobs and short jobs are usually scheduled together, and existing solutions either forcibly terminate running long jobs to guarantee low latency for short jobs, or forbid resource preemption entirely to improve cluster resource utilization. Such simple scheduling strategies cannot handle jobs with varied resource requirements in a complex heterogeneous environment. Our goal is to strike a trade-off between resource utilization and job queuing delay: minimize job queuing delay while improving hardware resource utilization and performance, thereby achieving the service level agreement goal.
In view of this problem, there is an urgent need to develop a new scheduling strategy to meet the needs of actual work.
Disclosure of Invention
Aiming at the problems existing in the prior art, the invention provides a MapReduce task scheduling method and system for heterogeneous Yarn cluster environments that maintain high cluster resource utilization while also providing low latency and fast response for jobs.
The MapReduce task scheduling method provided by the invention comprises the following steps:
S1: a client creates a JobSubmitter instance, computes the input splits of the job through the JobSubmitter's internal methods, copies the resources required to run the job into a distributed file system, and submits the MapReduce job to the resource scheduler;
S2: after receiving the job submission message, the resource scheduler forwards the request to the central resource scheduler, which analyzes the job's detailed information through an internal job analysis method and derives the latest deadline by which the service level agreement can still be met;
S3: the central resource scheduler adds the new task to a central task queue and reorders all tasks by deadline, from nearest to farthest;
S4: the central resource scheduler receives heartbeat information from the node resource schedulers, obtains the number of tasks assigned to each node resource scheduler, selects in turn the node with the fewest tasks, and assigns to it the task with the nearest current deadline for execution;
S5: after receiving the new task, the node resource scheduler adds it to a local task queue and reorders the queue by deadline;
S6: the node resource scheduler checks the position of the new task in the task queue, and if the new task's deadline is nearer than that of the currently executing task, the new task preempts the executing task.
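The six steps above amount to earliest-deadline-first dispatch from a central queue to the least-loaded node. A minimal sketch (all class and field names are illustrative, not from the patent):

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Task:
    deadline: float              # SLA-derived latest finish time (seconds)
    name: str = field(compare=False)

class CentralQueue:
    """Earliest-deadline-first central task queue (steps S3-S4)."""
    def __init__(self):
        self._heap = []

    def add(self, task):
        heapq.heappush(self._heap, task)   # keeps the nearest deadline on top

    def dispatch(self, node_task_counts):
        """Pick the least-loaded node and hand it the most urgent task."""
        if not self._heap:
            return None
        node = min(node_task_counts, key=node_task_counts.get)
        task = heapq.heappop(self._heap)
        node_task_counts[node] += 1
        return node, task

q = CentralQueue()
q.add(Task(deadline=120.0, name="job-a"))
q.add(Task(deadline=60.0, name="job-b"))
node, task = q.dispatch({"n1": 3, "n2": 1})
# node == "n2" (fewest tasks), task.name == "job-b" (nearest deadline)
```

Steps S5-S6 would repeat the same deadline ordering inside each node's local queue.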
Further, in step S3, the central resource scheduler obtains the total CPU resource amount C and the total memory resource amount M, derives the job share of long jobs from the number of jobs, and periodically calculates the resource share of each job in the central task queue according to a fairness principle.
Further, in step S4, after receiving a resource request of a job, the central resource scheduler analyzes whether the job can be completed before its deadline by combining the deadline constraint, the cluster's resource condition and the job's resource requirements; if it determines that the job can be completed before its deadline, the job is added to the central task queue; otherwise, the central resource scheduler refuses to execute the job.
Further, in step S4, when a job arrives, the central resource scheduler determines the current amount of cluster resources from the heartbeat information sent by each node resource scheduler and estimates the amount of resources the job requests from its execution history logs; if the job has never been executed on the cluster, the scheduler runs it on a small portion of the original data set as a pre-test set.
Further, in step S4, if the amount of resources requested by the job does not exceed the amount of resources available in the cluster, the central resource scheduler adds the job to the central task queue;
otherwise, two cases must be distinguished: in the first case, if the job, executed immediately by preempting resources directly from currently running jobs, can still be completed in time, the job is executed;
in the second case, where the job cannot meet its deadline even by preempting the resources of other jobs, the central resource scheduler refuses outright to execute it.
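The admission decision described above can be sketched as a three-way test; the resource quantities are abstracted to single numbers and all names are illustrative:

```python
def admit(job_req, cluster_free, reclaimable, can_finish_by_deadline):
    """Deadline-aware admission test (step S4): enqueue, enqueue with
    preemption, or reject. A sketch of the patent's three-way decision."""
    if job_req <= cluster_free:
        return "enqueue"                      # fits in the free resources
    if job_req <= cluster_free + reclaimable and can_finish_by_deadline:
        return "enqueue-with-preemption"      # can run by preempting others
    return "reject"                           # would miss its deadline anyway

assert admit(4, 8, 0, True) == "enqueue"
assert admit(10, 8, 4, True) == "enqueue-with-preemption"
assert admit(10, 8, 1, True) == "reject"
```

Rejecting a job that cannot meet its deadline even with preemption avoids wasting cluster resources on a guaranteed SLA miss.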
Further, in step S4, the amount of resources to preempt under the service level agreement is determined by the following scheme:
the reduce tasks start executing once W map tasks have finished. T_up denotes the upper bound on the execution time of these W map tasks and is computed from M_avg, the average execution time of a map task in job j, m, the number of map tasks in job j, and M_max, the maximum execution time of a map task in job j.
When Q jobs can finish before the time limit T_up, the amount of resources released after these jobs complete is R, computed over each job j from n_j^reduce, the number of reduce tasks of job j.
The amount of resources required for the reduce phase is E, computed from C_r, the amount of resources available in the cluster at the next time instant, and r_j^map, the amount of resources required by the map tasks of job j.
Further, in step S6, when job preemption is required, the resource share of the candidate job k to be preempted is calculated as the ratio of the resources job k actually occupies during execution to the amount it should receive under the fair resource allocation principle; the resource share requested by the job j waiting to execute is then obtained and compared with it.
Further, in step S6, if the comparison shows that preemption is warranted, the amount of resources to preempt is calculated as follows: the CPU and memory resources are first compared and divided into a main resource and a secondary resource, and the recovery amount of the secondary resource is then derived in proportion to the recovery amount of the main resource, where:
C_j and M_j respectively denote the CPU and memory resource amounts requested by job j, and C_a and M_a respectively denote the CPU and memory resource amounts that the current job k additionally occupies in the cluster;
if the CPU resources requested by job j dominate, CPU is taken as the main resource: all CPU resources additionally occupied by job k are preempted, and the memory it additionally occupies is preempted in proportion; otherwise, memory is taken as job j's main resource, all memory resources additionally occupied by job k are preempted, and the CPU resources it additionally occupies are preempted in proportion.
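The dominant-resource split described above can be sketched as follows. The patent's exact formulas are rendered as figures and are not reproduced here; this sketch only follows the prose rule (take all of the main resource job k additionally occupies, reclaim the secondary resource in proportion), and all names are illustrative:

```python
def preemption_amounts(c_j, m_j, c_a, m_a):
    """Split preemption between CPU and memory (step S6).
    c_j, m_j: CPU / memory requested by the waiting job j;
    c_a, m_a: CPU / memory that job k occupies beyond its fair share."""
    if c_j / m_j >= c_a / m_a:
        # CPU is job j's dominant demand: take all extra CPU of job k,
        # and memory scaled by job j's own CPU:memory ratio
        return c_a, min(m_a, c_a * m_j / c_j)
    else:
        # memory dominates: take all extra memory, scale the CPU part
        return min(c_a, m_a * c_j / m_j), m_a

cpu, mem = preemption_amounts(c_j=8, m_j=4, c_a=2, m_a=6)
# CPU dominant (8/4 >= 2/6): preempt both extra CPUs and 2*4/8 = 1.0 memory
```

Capping with `min` keeps the proportional share from exceeding what job k actually over-occupies.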
Further, the MapReduce task scheduling method is performed according to a scheduling policy based on a service level agreement, the steps of which comprise:
when job j arrives, analyzing the expiration date, required throughput and required resource amount of the job;
the central resource scheduler analyzes whether the current resource quantity of the cluster meets the resource demand quantity of the job j, if so, the job j is added into the central task queue;
if not, judging whether the resource quantity of the cluster can meet the resource demand quantity of the map task of the job j and whether the released resource after the execution of the map task is finished can meet the resource demand quantity of the reduce task;
if the two conditions are met, adding the job j into a central task queue, and marking the job j as high priority, so that the job j can occupy the resources of other jobs in the execution process; if the two conditions cannot be met simultaneously, the central resource scheduler refuses to execute the job j;
the central resource scheduler sorts the jobs in the central task queue according to the expiration date, and respectively traverses each job; for the job j in the central task queue, the central resource scheduler judges whether the map task of the job j is completely executed, if not, the priority of the job j is judged, if the job is a high-priority job, the job is immediately communicated with the node resource scheduler, the designated resource is preempted from the cluster to execute the map task of the job j, otherwise, the cluster is waited to generate idle resources and the map task of the job j is allocated;
the node resource scheduler reports the task execution state to the central resource scheduler through heartbeat information; if the map tasks have all been executed, the central resource scheduler determines whether the number of map tasks that have finished executing exceeds the threshold W; if it does, the reduce tasks of job j start to execute and the priority of job j is checked: if job j is a high-priority job, the job preemption is carried out together with the node resource scheduler; otherwise, job j waits for idle resources to be allocated.
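One pass of the SLA-aware policy over the central queue might look like the following sketch, which returns symbolic actions instead of communicating with node resource schedulers; all field and action names are illustrative:

```python
def schedule_round(queue, done_maps, threshold_w):
    """One traversal of the central queue in deadline order:
    jobs still in the map phase either preempt (high priority) or wait;
    jobs past the map threshold W move on to their reduce tasks."""
    actions = []
    for job in sorted(queue, key=lambda j: j["deadline"]):
        if not job["maps_done"]:
            actions.append((job["id"],
                            "preempt-and-map" if job["high_prio"] else "wait-map"))
        elif done_maps[job["id"]] >= threshold_w:
            actions.append((job["id"],
                            "preempt-and-reduce" if job["high_prio"] else "wait-reduce"))
    return actions

jobs = [{"id": "a", "deadline": 50, "maps_done": True,  "high_prio": True},
        {"id": "b", "deadline": 20, "maps_done": False, "high_prio": False}]
acts = schedule_round(jobs, {"a": 5, "b": 0}, threshold_w=3)
# b is traversed first (nearest deadline) and waits for map slots;
# a has 5 >= 3 completed maps and, being high priority, preempts for reduce
```

In the real system each action would be a message to the node resource scheduler chosen in step S4.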
The invention also provides a MapReduce task scheduling system adopting the MapReduce task scheduling method, which comprises the following steps: the distributed data center cluster comprises a center resource scheduler and a plurality of node resource schedulers;
a central task queue is maintained in the central resource scheduler, and when a new job arrives, the central resource scheduler analyzes the job characteristics to obtain the running time and the expiration date of the job;
an executing task queue and a suspended task queue, both ordered by deadline, are maintained in the node resource scheduler, which continuously reports the task information and resource usage on its node to the central resource scheduler through a heartbeat mechanism.
The invention provides a MapReduce task scheduling method for heterogeneous Yarn cluster environments. The central resource scheduler runs as a daemon on the ResourceManager. It receives task information sent by the node resource schedulers, periodically checks the current task scheduling policy and the resource availability of each working node, and obtains the resource requirements of newly arrived tasks; from this it infers which queues occupy surplus resources and which are under-allocated, calculates the amount of resources to preempt, derives an optimal resource allocation scheme for the task queues in each time period, and sends the scheduling decisions to the node resource schedulers for execution.
The scheduling method creatively introduces a Docker-container-based preemption mechanism, overcoming the drawback of Yarn's existing kill-based preemption, which terminates tasks outright. The container-based mechanism releases the resources a task occupies while preserving its progress; combined with a service-level-agreement-aware task policy, it lets high-priority tasks preempt the running resources of other tasks so that task completion times meet the Service Level Agreement (SLA) target. The scheduling method thus maintains high cluster resource utilization while also providing low latency and fast response for tasks.
Drawings
The invention will be described in more detail hereinafter on the basis of embodiments and with reference to the accompanying drawings. Wherein:
fig. 1 is a schematic flow chart of a MapReduce task scheduling method in the present invention;
fig. 2 is a system architecture diagram of a MapReduce task scheduling method in the present invention;
fig. 3 is an example deployment diagram of a MapReduce task scheduling method in the present invention.
Detailed Description
In order to clearly illustrate the inventive content of the present invention, the present invention will be described below with reference to examples.
In the description of the present invention, it should be noted that the positional or positional relationship indicated by the terms such as "upper", "lower", "horizontal", "top", "bottom", etc. are based on the positional or positional relationship shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the apparatus or element in question must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
Referring to fig. 1-3, the central resource scheduler in the present invention is a daemon running on the ResourceManager. It receives task information sent by the node resource schedulers, periodically checks the current task scheduling policy and the resource availability of each working node, and obtains the resource requirements of newly arrived tasks; from this it infers which queues occupy surplus resources and which are under-allocated, calculates the amount of resources to preempt, derives an optimal resource allocation scheme for the task queues in each time period, and sends the scheduling decisions to the node resource schedulers for execution.
The node resource scheduler is a daemon running on a worker NodeManager. It integrates Docker containers with the Yarn framework, replacing native Yarn's preemption mode of directly killing the task container. After receiving a task request, the node resource scheduler loads the task into a Docker container and configures the container according to the task's resource request. The node resource scheduler is also responsible for container suspension and container recovery, suspending or resuming containers as required.
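Container suspension and recovery of this kind can be realized with the standard `docker pause` / `docker unpause` CLI commands, which freeze a container's processes while keeping their in-memory state. A sketch of how a node resource scheduler might wrap them (the function names and the `dry_run` flag are illustrative; the patent does not specify the exact mechanism):

```python
import subprocess

def _docker(cmd, container, dry_run=False):
    """Build (and, unless dry_run, execute) a docker CLI invocation."""
    argv = ["docker", cmd, container]
    if not dry_run:
        subprocess.run(argv, check=True)  # requires a running Docker daemon
    return argv

def suspend_task(container, dry_run=False):
    # "docker pause" freezes the container's cgroup: the task keeps its
    # progress in memory while its CPU is yielded to the preempting task
    return _docker("pause", container, dry_run)

def resume_task(container, dry_run=False):
    # "docker unpause" thaws the cgroup; the task resumes where it stopped
    return _docker("unpause", container, dry_run)
```

In a real deployment the node resource scheduler would track which container backs which task and pause the victim before launching the preemptor.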
In the actual job scheduling process, the specific job to be preempted must be determined before the preemption operation; the invention therefore designs a preemptive job scheduling strategy that guarantees quality of service (QoS) to the greatest extent and satisfies the service level agreement (SLA). The idea of the strategy is to preferentially execute the job with the earliest deadline, which minimizes the number of jobs that miss their deadlines and greatly improves job execution. Specifically, after receiving a resource request of a job, the central resource scheduler analyzes whether the job can be completed before its deadline based on the deadline constraint, the cluster's resource condition and the job's resource requirements. If it determines that the job can be completed before its deadline, the job is added to the job queue; otherwise, it refuses to execute the job.
When a job j arrives, the central resource scheduler determines the current amount of cluster resources from the heartbeat information sent by each node resource scheduler and estimates the amount of resources job j requests from its execution history logs; if job j has never been executed on the cluster, the scheduler runs it on a small portion of the original data set as a pre-test set to obtain the job's performance indicators. Let r_j denote the total amount of resources job j needs, r_j^map the amount required by its map tasks, and r_j^reduce the amount required by its reduce tasks; C_r denotes the amount of resources available in the cluster at the next time instant. If the amount of resources requested by the job does not exceed the amount available in the cluster, i.e. r_j ≤ C_r, the central resource scheduler adds job j to the job queue. Otherwise, when r_j > C_r, two cases must be distinguished: in the first case, if job j, executed immediately by preempting resources directly from the currently running job k, can still be completed in time, job j is executed; in the second case, where job j cannot meet its deadline even by preempting the resources of other jobs, the central resource scheduler refuses outright to execute it.
The MapReduce task scheduling method is performed according to a scheduling strategy based on a Service Level Agreement (SLA), and the specific deployment mode based on the Service Level Agreement (SLA) comprises the following steps:
step 1: and integrating the central resource scheduler provided by the invention in the resource manager nodes in the Yarn cluster, and integrating the node resource schedulers provided by the invention by the rest NodeManager nodes.
Step 2: when a user submits a batch of jobs to the cluster for allocation, the central resource scheduler analyzes each job, extracting the size of its input data, the CPU, memory and other resources it needs, and the deadline specified by the user.
Step 3: the central resource scheduler collects node state information sent from each node resource scheduler, and counts the execution progress of the currently executing job and the utilization rate of various resources in the cluster.
Step 4: the central resource scheduler combines the cluster's currently available resources with the characteristics of the jobs to be executed and analyzes whether the current cluster resources can meet the resource demand of job j. If so, job j is added to the job queue to be executed; otherwise, it judges whether the cluster resources can meet the resource demand of job j's map tasks, and whether the resources released after the map tasks finish can meet the resource demand of the reduce tasks. If both conditions are met, job j is added to the job queue and marked as high priority.
Step 5: the central resource scheduler sorts the jobs in the job queue according to the expiration date, and traverses each job respectively. For the job j in the job queue, the central resource scheduler judges whether the map task of the job j is completely executed, if not, the priority of the job j is judged, if the job is a high-priority job, the job is immediately communicated with the node resource scheduler, the designated resource is preempted from the cluster to execute the map task of the job j, otherwise, the cluster is waited to generate idle resources and the map task of the job j is allocated;
Step 6: the node resource scheduler reports the state of task execution to the central resource scheduler via heartbeat information. If the map tasks have all been executed, the central resource scheduler determines whether the number of map tasks that have finished executing exceeds the threshold W. If it does, the reduce tasks of job j start to execute and the priority of job j is checked: if job j is a high-priority job, the preemption is carried out together with the node resource scheduler; otherwise, job j waits for idle resources to be allocated.
Based on the Service Level Agreement (SLA) scheduling policy, the scheduling method in the invention lets high-priority tasks preempt the running resources of other tasks and ensures that job completion times meet the SLA target, while maintaining high cluster resource utilization and providing low latency and fast response for jobs. The MapReduce task scheduling system balances resource utilization against job queuing delay, effectively improving hardware resource utilization and efficiency while greatly reducing job queuing delay, thereby achieving the service level agreement goal.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although the present specification is described in terms of embodiments, not every embodiment contains only one independent technical solution. This manner of description is adopted for clarity only; the specification should be taken as a whole, and the technical solutions in the embodiments may be combined as appropriate to form other embodiments apparent to those skilled in the art.

Claims (10)

1. The MapReduce task scheduling method is characterized by comprising the following steps:
s1: a client creates a JobSubmitter instance, computes the input splits of the job through an internal method of the JobSubmitter, copies the resources required to run the job into a distributed file system, and submits the MapReduce job to a resource scheduler;
s2: after receiving the job submission message, the resource scheduler forwards the request message to the central resource scheduler, and the central resource scheduler analyzes the detailed information of the job through an internal job analysis method and derives the latest deadline by which the job must finish to satisfy the service level agreement;
s3: the central resource scheduler adds the new task to a central task queue and reorders all tasks by deadline, from nearest to farthest;
s4: the central resource scheduler receives heartbeat information from the node resource schedulers, obtains the number of tasks assigned to each node resource scheduler, selects the node with the fewest tasks, and dispatches to it the task with the nearest current deadline for execution;
s5: after receiving the new task, the node resource scheduler adds it to a local task queue and reorders the queue by deadline;
s6: the node resource scheduler checks the position of the new task in the task queue, and if the deadline of the new task is nearer than that of the executing task, the new task preempts the executing task.
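Steps S3-S6 amount to earliest-deadline-first (EDF) dispatch with preemption at the node. A minimal sketch of that loop follows; the class names (`CentralScheduler`, `NodeScheduler`, `Task`) and the heap-based queues are illustrative choices, not structures prescribed by the claims:

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Task:
    deadline: float               # absolute deadline; earlier sorts first (S3)
    name: str = field(compare=False)

class NodeScheduler:
    """Holds a deadline-ordered local queue; the heap head is 'executing' (S5-S6)."""
    def __init__(self):
        self.queue = []           # min-heap keyed by deadline

    def submit(self, task):
        running = self.queue[0] if self.queue else None
        heapq.heappush(self.queue, task)
        # S6: a task with a nearer deadline becomes the new heap head,
        # i.e. it preempts the previously executing task.
        return running is not None and task.deadline < running.deadline

class CentralScheduler:
    """S4: dispatch the most urgent task to the least-loaded node."""
    def __init__(self, nodes):
        self.nodes = nodes
        self.central_queue = []   # min-heap keyed by deadline

    def submit(self, task):
        heapq.heappush(self.central_queue, task)

    def dispatch_one(self):
        # heartbeat-reported load stands in for len(queue) here
        node = min(self.nodes, key=lambda n: len(n.queue))
        task = heapq.heappop(self.central_queue)   # nearest deadline first
        preempted = node.submit(task)
        return node, task, preempted
```

Pushing a task with a nearer deadline onto a node's heap makes it the new head, which models step S6's preemption of the executing task.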
2. The MapReduce task scheduling method according to claim 1, wherein in step S2 the central resource scheduler obtains the total CPU resource amount C and the total memory resource amount M of the cluster, derives the job share of long jobs from the job amounts, and periodically calculates the resource share of each job in the central task queue according to the fairness principle.
3. The MapReduce task scheduling method according to claim 2, wherein in step S3, after receiving a resource request of a job, the central resource scheduler analyzes, in combination with the deadline constraint, the resource condition of the cluster and the resource requirement of the job, whether the job can be completed before its deadline; if the central resource scheduler determines that the job can be completed before the deadline, the job is added to the central task queue; otherwise, the central resource scheduler refuses to execute the job.
4. The MapReduce task scheduling method of claim 1, wherein in step S4, when a job arrives, the central resource scheduler determines the current amount of cluster resources from the heartbeat information sent by each node resource scheduler and estimates the amount of resources requested by the job from the history logs of previous runs of the job; if the job has never been executed on the cluster, the scheduler executes the job on a small portion of the original data set as a pre-test set.
5. The MapReduce task scheduling method of claim 4, wherein in step S4, if the amount of resources requested by the job does not exceed the amount of resources available in the cluster, the central resource scheduler adds the job to the central task queue;
otherwise two cases are distinguished: in the first case, the job directly preempts resources from currently running jobs and is executed, provided that it can still be completed in time if executed immediately;
in the second case, the central resource scheduler directly refuses to execute the job, because even if the job preempts the resources of other jobs it cannot meet its deadline.
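Claims 4 and 5 together describe an admission test: accept outright when free resources suffice, accept with preemption only when immediate execution would still meet the deadline, and reject otherwise. A sketch under the assumption that the runtime estimate (from history logs or the pre-test run) is already available; the dictionary keys and return labels are illustrative:

```python
def admit(job, free_resources, now):
    """Return one of 'accept', 'accept_preempt', 'reject'.

    job is a dict with keys:
      'demand'   - resources the job requests (from history logs or a pre-test run)
      'runtime'  - estimated running time
      'deadline' - absolute deadline
    """
    finishes_in_time = now + job['runtime'] <= job['deadline']
    if job['demand'] <= free_resources:
        return 'accept' if finishes_in_time else 'reject'
    # Not enough free resources: preemption only helps if the job,
    # started immediately, would still finish before its deadline.
    return 'accept_preempt' if finishes_in_time else 'reject'
```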
6. The MapReduce task scheduling method according to claim 4, wherein in step S4 the amount of resources to preempt under the service level agreement is determined as follows:
after the W map tasks have been executed, the reduce tasks start to execute, and T_up represents the upper time limit for executing the W map tasks, which is obtained from M_avg, the average execution time of a map task in job j, m, the number of map tasks in job j, and M_max, the maximum execution time of a map task in job j;
when Q jobs can finish before the time limit T_up, the amount of resources released after these jobs complete is R, which is obtained by summing over the jobs, where j represents a certain job and n_j^r represents the number of reduce tasks of job j;
the amount of resources required for the reduce phase is E, which is obtained from C_r, the amount of resources available in the cluster at the next time, and r_j^m, the amount of resources required by the map tasks of job j.
7. The MapReduce task scheduling method according to claim 6, wherein in step S6, when job preemption is required, the excess resource share of the job k to be preempted is calculated in advance as A_k - F_k, wherein A_k represents the resources actually occupied by job k during execution and F_k represents the amount of resources job k should obtain according to the fair resource-allocation principle; the resource share S_j requested by the job j to be executed is then acquired, and if S_j does not exceed the excess share A_k - F_k, the resources that need to be preempted can be calculated directly from S_j.
8. The MapReduce task scheduling method of claim 7, wherein in step S6, if the resource share requested by job j exceeds the resources that job k occupies beyond its fair share, the resources that need to be preempted are calculated by an algorithm comprising: first comparing the CPU resources and the memory resources and dividing them into a main resource and a secondary resource, and then reclaiming the secondary resource in proportion to the amount of the main resource reclaimed, wherein:
C_j and M_j respectively represent the CPU resource amount and the memory resource amount requested by job j, and C_a and M_a respectively represent the CPU resource amount and the memory resource amount actually additionally occupied by the current job k in the cluster;
if the comparison determines that the CPU resource requested by job j is the main resource, all CPU resources additionally occupied by job k are preempted and the memory resources additionally occupied by job k are preempted in proportion; otherwise, the memory is taken as the main resource requested by job j, all memory resources additionally occupied by job k are preempted, and the CPU resources additionally occupied by job k are preempted in proportion.
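One plausible reading of the main-resource/secondary-resource rule, using the claim's quantities C_j, M_j (requested by job j) and C_a, M_a (additionally occupied by job k): the ratio test below and the proportional formula are a reconstruction from those definitions, since the patent's own formula is not reproduced in this text:

```python
def preempt_amounts(c_j, m_j, c_a, m_a):
    """Split the preemption between CPU and memory.

    (c_j, m_j): CPU / memory requested by the arriving job j.
    (c_a, m_a): CPU / memory job k occupies beyond its fair share.
    The resource with the larger relative demand is treated as the 'main'
    resource (a reconstructed assumption): all of job k's extra main
    resource is reclaimed, and the secondary resource is reclaimed in the
    same proportion as job j's request, capped at what k actually holds.
    """
    if c_j / c_a >= m_j / m_a:            # CPU is the main resource
        cpu = c_a
        mem = min(m_a, c_a * m_j / c_j)   # proportional reclaim of memory
    else:                                 # memory is the main resource
        mem = m_a
        cpu = min(c_a, m_a * c_j / m_j)   # proportional reclaim of CPU
    return cpu, mem
```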
9. The MapReduce task scheduling method of claim 8, wherein the MapReduce task scheduling method is performed according to a scheduling policy based on the service level agreement, and the scheduling policy based on the service level agreement comprises:
when job j arrives, analyzing the expiration date, required throughput and required resource amount of the job;
the central resource scheduler analyzes whether the current resource quantity of the cluster meets the resource demand quantity of the job j, if so, the job j is added into the central task queue;
if not, judging whether the resource quantity of the cluster can meet the resource demand of the map tasks of job j and whether the resources released after the map tasks finish executing can meet the resource demand of the reduce tasks;
if the two conditions are met, adding the job j into a central task queue, and marking the job j as high priority, so that the job j can occupy the resources of other jobs in the execution process; if the two conditions cannot be met simultaneously, the central resource scheduler refuses to execute the job j;
the central resource scheduler sorts the jobs in the central task queue according to the expiration date, and respectively traverses each job; for the job j in the central task queue, the central resource scheduler judges whether the map task of the job j is completely executed, if not, the priority of the job j is judged, if the job is a high-priority job, the job is immediately communicated with the node resource scheduler, the designated resource is preempted from the cluster to execute the map task of the job j, otherwise, the cluster is waited to generate idle resources and the map task of the job j is allocated;
the node resource scheduler reports the task execution state to the central resource scheduler through the heartbeat information; the central resource scheduler judges whether the number of map tasks of job j that have finished executing exceeds the threshold W; if it does, the reduce tasks of job j start to execute and the priority of job j is judged; if job j is a high-priority job, job preemption is completed together with the node resource scheduler, otherwise the allocation of idle resources is awaited.
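The two-stage admission at the start of this policy (admit on full demand; otherwise admit as high priority only if the map-phase demand fits now and the resources released by the finished map tasks cover the reduce phase) might be sketched as follows; the dictionary keys are illustrative names, not terms from the patent:

```python
def sla_admit(cluster_free, job):
    """Admission per the two checks in the SLA scheduling policy.

    job: dict with 'demand' (total), 'map_demand', 'reduce_demand',
    and 'map_release' (resources freed when its map tasks finish).
    Returns ('accept', high_priority_flag) or ('reject', None).
    """
    if job['demand'] <= cluster_free:
        return 'accept', False            # normal priority, no preemption needed
    map_ok = job['map_demand'] <= cluster_free
    reduce_ok = job['reduce_demand'] <= job['map_release']
    if map_ok and reduce_ok:
        return 'accept', True             # high priority: may preempt other jobs
    return 'reject', None
```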
10. A MapReduce task scheduling system employing the MapReduce task scheduling method of any one of claims 1 to 9, comprising a distributed data center cluster, wherein the distributed data center cluster comprises a central resource scheduler and a plurality of node resource schedulers;
a central task queue is maintained in the central resource scheduler, and when a new job arrives, the central resource scheduler analyzes the job characteristics to obtain the running time and the expiration date of the job;
and a running task queue and a suspended task queue, each ordered by deadline, are maintained in the node resource scheduler, which continuously reports the task information and resource usage on its node to the central resource scheduler through a heartbeat mechanism.
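The heartbeat payload of the system claim could be modeled as a small record; the field names are chosen for illustration only:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Heartbeat:
    """What a node resource scheduler reports to the central scheduler."""
    node_id: str
    running: List[str]      # deadline-ordered running task queue
    suspended: List[str]    # deadline-ordered suspended (preempted) task queue
    cpu_used: float
    mem_used: float

    def task_count(self) -> int:
        # Used by the central scheduler to pick the least-loaded node (step S4).
        return len(self.running) + len(self.suspended)
```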
CN202111386374.8A 2021-11-22 2021-11-22 MapReduce task scheduling method and system Active CN114077486B (en)

Publications (2)

Publication Number Publication Date
CN114077486A (en) 2022-02-22
CN114077486B (en) 2024-03-29



Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104991830A (en) * 2015-07-10 2015-10-21 山东大学 YARN resource allocation and energy-saving scheduling method and system based on service level agreement
WO2020248226A1 (en) * 2019-06-13 2020-12-17 东北大学 Initial hadoop computation task allocation method based on load prediction
CN112395052A (en) * 2020-12-03 2021-02-23 华中科技大学 Container-based cluster resource management method and system for mixed load

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Design of a locality scheduling algorithm integrating a preemptive scheduling strategy in a Hadoop cluster environment; Wang Yuefeng; Wang Xibo; Computer Science; 2017-12-31 (Issue S1); pp. 567-570 *

