WO2013107012A1 - Task processing system and task processing method for distributed computation - Google Patents


Info

Publication number
WO2013107012A1
WO2013107012A1 · PCT/CN2012/070551
Authority
WO
WIPO (PCT)
Prior art keywords
task
layer scheduler
subtasks
scheduler
queue
Prior art date
Application number
PCT/CN2012/070551
Other languages
French (fr)
Chinese (zh)
Inventor
靳变变
刘文宇
严军
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to CN2012800001658A (published as CN102763086A)
Priority to PCT/CN2012/070551 (published as WO2013107012A1)
Publication of WO2013107012A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/48: Program initiating; program switching, e.g. by interrupt
    • G06F 9/4806: Task transfer initiation or dispatching
    • G06F 9/4843: Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Definitions

  • Embodiments of the present invention relate to the field of network communications and, more particularly, to a task processing system and task processing method for distributed computing. Background Art
  • MapReduce is a distributed computing software architecture that supports distributed processing of large amounts of data. The architecture originates from the map and reduce functions of functional programming: map processes the original data according to custom mapping rules and outputs intermediate results, and reduce merges the intermediate results according to custom reduction rules.
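As a non-limiting illustration (not part of the patent text), the map and reduce steps described above can be sketched in Python; the conventional word-count example is used:

```python
# Toy illustration of the classic MapReduce model: map emits intermediate
# (key, value) pairs, reduce merges them by key. Names are illustrative.
from collections import defaultdict

def map_phase(documents):
    """Map: emit (word, 1) pairs from each input document."""
    for doc in documents:
        for word in doc.split():
            yield word, 1

def reduce_phase(pairs):
    """Reduce: merge the intermediate pairs by key."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

result = reduce_phase(map_phase(["a b a", "b c"]))
```

Here the entire computation must pass through exactly these two steps, which is the rigidity the embodiments below aim to relax.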
  • The general architecture of MapReduce includes a scheduling node and multiple working nodes.
  • The scheduling node is responsible for task scheduling and resource management: it decomposes the tasks submitted by the user into two subtasks, map and reduce, according to the user configuration, and assigns the map and reduce subtasks to the working nodes.
  • The working nodes run the map and reduce subtasks and maintain communication with the scheduling node.
  • The embodiments of the present invention provide a task processing system and a task processing method, which can address the low processing efficiency of the existing parallel processing architecture.
  • A distributed computing task processing system includes a first layer scheduler configured to receive a request to execute a task, start or select a second layer scheduler corresponding to the task, and forward the request to the second layer scheduler.
  • the second layer scheduler is configured to decompose the task into multiple subtasks according to the logical relationship of the task when receiving the request forwarded by the first layer scheduler.
  • A distributed computing task processing method includes: when receiving a request to execute a task, the first layer scheduler starts or selects a second layer scheduler corresponding to the task; the first layer scheduler forwards the request to the second layer scheduler; and, when receiving the forwarded request, the second layer scheduler decomposes the task into multiple subtasks according to the logical relationship of the task.
  • the embodiment of the invention adopts a two-layer scheduling architecture.
  • The second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the architecture can be applied to different tasks, improving processing efficiency and scheduling flexibility.
  • FIG. 1 is a block diagram of a task processing system in accordance with an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a processing architecture of an embodiment of the present invention.
  • FIG. 3 is a flow chart of a task processing method in accordance with an embodiment of the present invention.
  • FIG. 4 is a schematic flow chart of a task processing procedure according to an embodiment of the present invention. Detailed Description
  • FIG. 1 is a block diagram of a distributed computing task processing system in accordance with an embodiment of the present invention.
  • The task processing system 10 of FIG. 1 includes a two-layer scheduler: a first layer scheduler 11 and a second layer scheduler 12.
  • the first layer scheduler 11 receives the request to perform the task, starts or selects the second layer scheduler 12 corresponding to the task, and forwards the request to the second layer scheduler 12.
  • When no suitable second layer scheduler exists in the system, the first layer scheduler 11 can start a second layer scheduler 12 corresponding to the task.
  • When suitable second layer schedulers already exist in the system, the first layer scheduler 11 can select the second layer scheduler 12 corresponding to the task from among them.
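The start-or-select behavior described above might be sketched as follows; all class, method, and attribute names here are hypothetical, chosen for illustration only:

```python
# Illustrative sketch (not the patented implementation) of a first layer
# scheduler that selects an existing second layer scheduler for a task
# type, or starts a new one when none is suitable.
class SecondLayerScheduler:
    def __init__(self, task_type):
        self.task_type = task_type

class FirstLayerScheduler:
    def __init__(self):
        self.schedulers = {}  # task type -> running second layer scheduler

    def dispatch(self, task_type, request):
        scheduler = self.schedulers.get(task_type)
        if scheduler is None:
            # No suitable scheduler exists: start one for this task type.
            scheduler = SecondLayerScheduler(task_type)
            self.schedulers[task_type] = scheduler
        # The request would then be forwarded to the chosen scheduler.
        return scheduler

front = FirstLayerScheduler()
s1 = front.dispatch("transcoding", {"id": 1})
s2 = front.dispatch("transcoding", {"id": 2})
```

The second request reuses the scheduler started for the first, mirroring the select path.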
  • the first layer scheduler is further configured to perform priority management on the task, and start or select the second layer scheduler to process the task according to the priority.
  • When receiving the request forwarded by the first layer scheduler 11, the second layer scheduler 12 decomposes the task into a plurality of subtasks according to the logical relationship of the task.
  • the embodiment of the invention adopts a two-layer scheduling architecture.
  • The second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the architecture can be applied to different tasks, improving processing efficiency and scheduling flexibility.
  • the first layer scheduler 11 of the embodiment of the present invention can accept various forms of tasks, and the form of the task is not necessarily limited to the strict two steps of map and reduce in the prior art.
  • the second layer scheduler 12 corresponds to the task, so that the first layer scheduler 11 can send different tasks to the corresponding second layer scheduler 12 for scheduling processing.
  • the second layer scheduler 12 decomposes the tasks into subtasks to process the tasks, such as scheduling the execution of each subtask. Such scheduling has greater flexibility.
  • In the prior art, task processing must proceed strictly in the order of the two steps map and reduce. If there are many processing steps, the task request must be completed through many submissions, so processing efficiency is low.
  • Embodiments of the invention do not have this limitation.
  • the embodiment of the present invention does not limit the task itself and the manner in which the task or subtask is executed.
  • The number of subtasks included in a task can exceed the prior art's two (map, reduce), for example three or more subtasks, and the subtasks are not limited to the map and reduce forms.
  • the subtasks do not have to follow a strict sequence, and may be executed in parallel, serially, or partially in parallel. In this way, even with many steps, only a small number of task requests are required, which improves processing efficiency.
  • the number of subtasks is related to specific tasks, such as transcoding, face recognition services, and so on. Depending on the logical relationship of the tasks, these tasks may have the same or different number of subtasks.
  • the logical relationship of the task may be carried in the description file of the task.
  • In the task processing system 10, specifically, the second layer scheduler 12 can receive a user-uploaded description file, for example a description file in XML (Extensible Markup Language) format, which carries the logical relationship of the task.
  • When receiving the request forwarded by the first layer scheduler 11, the second layer scheduler 12 obtains the XML-format description file corresponding to the task and decomposes the task into a plurality of subtasks according to the logical relationship carried in that description file.
  • each subtask can be further broken down into sub-tasks of smaller granularity. That is, the subtask of the embodiment of the present invention may be a multi-layer subtask, and the manner of further decomposing each subtask may be determined by the logical relationship carried in the description file. For example, subtask 1 can be decomposed into multiple subtasks 2, and subtask 2 can be further decomposed into multiple subtasks 3 and so on.
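As an illustration of decomposing a task from its description file, the following sketch parses an assumed XML layout; the patent does not fix a concrete schema, so the element and attribute names here are invented:

```python
# Sketch of reading subtasks from an XML task description file.
# The schema (<task>/<subtask name order>) is an assumption for
# illustration only; the patent leaves the format open.
import xml.etree.ElementTree as ET

DESCRIPTION = """
<task name="transcode">
  <subtask name="split" order="1"/>
  <subtask name="encode" order="2"/>
  <subtask name="merge" order="3"/>
</task>
"""

def decompose(xml_text):
    """Return subtask names in their declared execution order."""
    root = ET.fromstring(xml_text)
    subs = sorted(root.findall("subtask"), key=lambda e: int(e.get("order")))
    return [e.get("name") for e in subs]

subtasks = decompose(DESCRIPTION)
```

A multi-layer decomposition would simply nest further `<subtask>` elements and recurse.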
  • the logical relationship of the tasks may indicate the execution dependencies of the plurality of subtasks.
  • execution dependency refers to whether the execution operations of each sub-task depend on each other.
  • For example, if subtask 2 must rely on the execution result of subtask 1, subtask 2 should be executed after subtask 1 completes (i.e., subtask 1 and subtask 2 need to be executed serially). Conversely, if subtask 2 does not depend on the entire execution result of subtask 1, subtask 1 and subtask 2 may be executed in parallel or serially.
  • As a non-limiting example of execution dependencies, two or more of the plurality of subtasks may be executed serially, in parallel, or partly in parallel and partly serially, and are not limited to the two prior-art steps of map and reduce. In this way, even if a task involves many processing steps, there is no need to submit multiple task requests as in the MapReduce architecture; an embodiment of the present invention may need to submit a task request only once or a few times, improving the processing efficiency of the task.
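The dependency rule described above (two subtasks may run in parallel only when neither depends on the other's result) can be captured in a small sketch; the dependency map shown is a hypothetical example:

```python
# Illustrative check of execution dependencies between subtasks.
# deps maps a subtask to the set of subtasks whose results it needs;
# the concrete names are invented for this example.
def can_run_in_parallel(deps, a, b):
    """Two subtasks may run in parallel if neither depends on the other."""
    return b not in deps.get(a, set()) and a not in deps.get(b, set())

# subtask 2 depends on the result of subtask 1; subtask 3 depends on neither.
deps = {"subtask2": {"subtask1"}}
serial = not can_run_in_parallel(deps, "subtask1", "subtask2")
```

Under these dependencies, subtasks 1 and 2 must run serially while subtask 3 may overlap with either.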
  • the logical relationship of the task can explicitly indicate the execution dependencies between the subtasks, for example, explicitly indicating that the task is composed of subtasks 1-3 that are executed serially in succession.
  • the logical relationship of the task may implicitly indicate an execution dependency between the subtasks. For example, for a particular task, the system knows in advance that the task is composed of subtasks 1-3 that are executed serially in succession.
  • the second layer scheduler 12 is further configured to create a corresponding queue for the multiple subtasks to store the tasks included in the subtask.
  • The second layer scheduler 12 may be further configured to apply for resources for a subtask and to instruct the work unit manager on the granted resource to start a work unit, so that the work unit acquires the tasks contained in the subtask from the queue and executes them.
  • The second layer scheduler 12 is further configured to instruct the work unit to put the result of executing a task into another queue, or to output the result.
  • the second layer scheduler 12 is further configured to obtain progress information of the queue and the work unit to determine an execution progress of the task.
  • the embodiment of the present invention does not limit the specific form of the task.
  • the logical relationship setting or selection of the task may support user customization, such as receiving a user's settings or selections through a plugin mechanism.
  • the task processing system 10 of the embodiment of the present invention is applicable to a cloud computing architecture.
  • Cloud computing proposes a high-reliability, low-cost, on-demand, and resilient business model. Many systems can achieve high reliability, flexibility, and low cost by using cloud services.
  • FIG. 2 is a schematic diagram of a processing architecture of an embodiment of the present invention.
  • the processing architecture 20 of FIG. 2 is a cloud computing architecture, including the task processing system 10 of FIG.
  • the task processing system 10 of FIG. 2 can include a plurality of second layer schedulers 12.
  • The number of second-layer schedulers 12 is not limited to this example (there may be more or fewer).
  • Each second level scheduler 12 corresponds to a task to adapt or support different computing models.
  • the plurality of second layer schedulers 12 may also correspond to a task to achieve high concurrency of system scheduling.
  • If a suitable second layer scheduler 12 corresponding to the task exists among the running second layer schedulers 12, the first layer scheduler 11 can select it to process the task; if none exists, the first layer scheduler 11 can start a new suitable second layer scheduler 12 to process the task.
  • first layer scheduler 11 may be distributed to support high concurrency.
  • The first layer scheduler 11 can receive task requests sent by the web service 21, which is responsible for receiving and forwarding users' web (network) requests.
  • When the first layer scheduler 11 receives multiple task requests, it may also perform priority management on the tasks (e.g., prioritize them) and, according to priority, start or select the second layer scheduler 12 to process the tasks.
  • the first layer scheduler 11 may preferentially start or select the second layer scheduler 12 corresponding to the higher priority task.
  • the first layer scheduler 11 may implement additional functions such as priority adjustment of tasks.
  • the way of prioritization or adjustment supports user customization, such as receiving user settings through a plug-in mechanism.
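One plausible reading of the priority management described above is a priority queue inside the first layer scheduler, with the priority function standing in for a user-supplied plugin; this is an illustrative sketch, not the patented implementation:

```python
# Hypothetical sketch of first-layer priority management: pending task
# requests are ordered so the highest-priority task is dispatched first.
# The priority function is a placeholder for a user-customizable plugin.
import heapq

class PriorityDispatcher:
    def __init__(self, priority_fn):
        self.priority_fn = priority_fn
        self.pending = []
        self.counter = 0  # tie-breaker keeps insertion order stable

    def submit(self, task):
        # Negate priority so the largest priority pops first from the min-heap.
        heapq.heappush(self.pending, (-self.priority_fn(task), self.counter, task))
        self.counter += 1

    def next_task(self):
        return heapq.heappop(self.pending)[2]

d = PriorityDispatcher(lambda t: t["priority"])
d.submit({"name": "backup", "priority": 1})
d.submit({"name": "transcode", "priority": 5})
first = d.next_task()
```

Swapping in a different `priority_fn` models the plug-in customization mentioned above.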
  • the first layer scheduler 11 forwards the task request to the second layer scheduler 12 after starting or selecting the second layer scheduler 12 corresponding to the task.
  • the second layer scheduler 12 decomposes the task into multiple subtasks according to the logical relationship of the tasks, and manages the execution of the plurality of subtasks.
  • the execution of subtasks can be managed through a queue (distributed queue 22 as shown in Figure 2).
  • the distributed queue 22 may include a plurality of queues, respectively storing the tasks included in the corresponding subtasks.
  • the second layer scheduler 12 can create corresponding queues for multiple subtasks to store the sub-tasks.
  • the second layer scheduler 12 can organize the order of the queues according to the logical relationship of the tasks. For example, suppose the task consists of subtasks 1-3 (subtask 1 -> subtask 2 -> subtask 3) that are executed serially in sequence, and the second layer scheduler 12 can establish queues 1-3, respectively storing subtasks 1 -3 contains the tasks, and determines the order of queues 1-3, that is, the tasks included in the corresponding subtasks are executed in the order of queue 1 -> queue 2 -> queue 3.
  • The task execution result of subtask 1 is put into queue 2; the task execution result of subtask 2 is put into queue 3; and the task execution result of subtask 3 is output to an appropriate location, for example to the distributed storage device 24 shown in FIG. 2, or returned to the user.
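The queue ordering just described (queue 1 -> queue 2 -> queue 3, with each worker feeding the next queue and the final stage producing output) can be sketched minimally; the queue contents and stage functions are invented placeholders:

```python
# Minimal sketch of the "queue-worker" pipeline: each worker drains its
# queue and puts results into the next queue; the last stage outputs.
from collections import deque

queues = {1: deque(["raw"]), 2: deque(), 3: deque()}
output = []

def worker(stage, fn):
    while queues[stage]:
        item = queues[stage].popleft()
        result = fn(item)
        if stage + 1 in queues:
            queues[stage + 1].append(result)  # feed the next subtask's queue
        else:
            output.append(result)             # last stage: output the result

worker(1, lambda x: x + "-split")
worker(2, lambda x: x + "-encoded")
worker(3, lambda x: x + "-merged")
```

In the real architecture the queues would be the distributed queue 22 and the stage functions would be the user-developed processing programs.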
  • the second layer scheduler 12 may also apply for resources for the subtask, for example, requesting resources from the resource manager 25.
  • the resource manager 25 is responsible for satisfying the resource application and release of the scheduler 11 or 12.
  • the main functions of the resource manager 25 include resource management, resource matching, and automatic resource scaling.
  • the resource matching method can adopt a plug-in mechanism to support user customization.
  • Automatic resource scaling refers to automatically expanding or shrinking the cluster according to its load while keeping the cluster size within a configured range.
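The scaling rule might look like the following sketch; the load thresholds and size bounds are assumptions, since the patent leaves them unspecified:

```python
# Illustrative scaling decision: grow under heavy load, shrink when idle,
# and always stay within [min_nodes, max_nodes]. Thresholds are assumed.
def target_size(current, load, min_nodes=2, max_nodes=10):
    """load: a normalized utilization figure in [0, 1] (an assumption)."""
    if load > 0.8 and current < max_nodes:
        return current + 1   # expand under heavy load
    if load < 0.2 and current > min_nodes:
        return current - 1   # shrink when mostly idle
    return current

grown = target_size(4, 0.9)
shrunk = target_size(4, 0.1)
```

A real resource manager 25 would apply such a rule periodically against observed cluster load.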
  • Other implementations of the resource manager 25 can be referred to the prior art, and therefore will not be described again.
  • the resource manager 25 can also employ a distributed scheme to achieve high concurrency.
  • The second layer scheduler 12 may instruct the worker manager 26 on the granted resource to start the worker 27, so that the worker 27 retrieves the tasks contained in the subtask from the queue and executes them.
  • the worker manager 26 is responsible for the creation, deletion, and monitoring of the worker 27. There is a worker manager 26 on each node (physical or virtual machine) in the cloud computing architecture. Other implementations of the worker manager 26 can be referred to the prior art, and therefore will not be described again.
  • The worker 27 is responsible for acquiring the tasks contained in the user's subtasks from the corresponding queue in the distributed queue 22, performing preprocessing, and then calling the processing program developed by the user. After processing completes, according to the queue order determined by the second layer scheduler 12, the worker puts the execution result into another queue or outputs it. For other implementations of the worker 27, reference may be made to the prior art; details are not repeated here.
  • the second layer scheduler 12 can also implement other scheduling processes, such as task exception processing or task progress statistics.
  • The second-tier scheduler 12 can obtain progress information of the queues and workers (for example, whether or to what extent each subtask is completed, and whether or to what extent the tasks in each queue are completed) to determine the execution progress of the task. In this way, real-time query of task progress can be achieved.
  • The user can query the second layer scheduler 12 for the execution progress of the corresponding task.
  • The second layer scheduler 12 may also report the task's progress information to the first layer scheduler 11, so that the user can query the first layer scheduler 11 for the execution progress of the corresponding task, facilitating user monitoring.
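Aggregating per-queue progress into an overall task progress, as described above, could be sketched as follows; weighting every stage equally by task count is an assumption:

```python
# Illustrative progress aggregation: combine (done, total) counts from
# each subtask queue into one completion ratio for the whole task.
def task_progress(stages):
    """stages: list of (done, total) pairs, one per subtask queue."""
    done = sum(d for d, _ in stages)
    total = sum(t for _, t in stages)
    return done / total if total else 1.0

# Example: stage 1 finished, stage 2 half done, stage 3 not started.
progress = task_progress([(10, 10), (5, 10), (0, 10)])
```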
  • For the sake of clarity, three worker managers 26 and three corresponding workers 27 are illustrated in FIG. 2, but embodiments of the present invention are not limited to this specific example; the number of worker managers 26 and workers 27 may be more or fewer.
  • the cluster management software 28 is responsible for the automated deployment and basic monitoring of the clusters that handle parallel tasks.
  • the implementation manners can refer to the prior art, and therefore will not be described again.
  • the distributed queue 22, the database 23 (such as the nosql database), and the distributed storage device 24 implement the task storage, the database, and the file storage required for the processing architecture 20, and the specific implementation manners can also refer to the prior art, and therefore will not be described again.
  • database 23 can be used for information persistence storage to meet system operational needs or to implement fault tolerance.
  • The bottom layer of the processing architecture 20 supports various heterogeneous hardware, such as physical machines or virtual machines 29, and this is transparent to user applications.
  • the implementation of the physical machine or virtual machine 29 can be referred to the prior art, and therefore will not be described again.
  • the processing architecture 20 employs a "queue-worker” computing model, but embodiments of the invention are not limited thereto.
  • Other computing models can also be adopted for part of the processing architecture 20. In summary, the processing architecture 20 of the embodiment of the present invention adopts a two-layer scheduling architecture: the second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the architecture can be applied to different tasks, improving processing efficiency and scheduling flexibility.
  • multiple second-tier schedulers for different tasks can be started at the same time, further improving the concurrent performance.
  • the embodiment of the present invention provides a high-performance, flexible parallel processing architecture 20 that can support physical machines and currently popular cloud computing platforms, support large-scale clustering, support user scheduling policy configuration, and customize and support different computing models.
  • FIG. 3 is a flow chart of a distributed computing task processing method in accordance with an embodiment of the present invention.
  • the method of Fig. 3 can be performed by the task processing system 10 of Figs. 1 and 2, and therefore the method of Fig. 3 will be described below in conjunction with Figs. 1 and 2, and the repeated description will be appropriately omitted.
  • the first layer scheduler 11 starts or selects the second layer scheduler 12 corresponding to the task. For example, when there is no suitable second layer scheduler in the system, the first layer scheduler 11 can start the second layer scheduler 12 corresponding to the task. When a suitable second layer scheduler already exists in the system, the first layer scheduler 11 can select the second layer scheduler 12 corresponding to the task from among these suitable second layer schedulers.
  • the first layer scheduler 11 forwards the request to the second layer scheduler 12.
  • the second layer scheduler 12 decomposes the task into multiple subtasks according to the logical relationship of the task.
  • The embodiment of the invention adopts a two-layer scheduling architecture: the second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the method can be applied to different tasks, improving processing efficiency and scheduling flexibility.
  • As described above for the system of FIG. 1, the form of the task is not limited to the strict two steps of map and reduce: a task may be decomposed into three or more subtasks, and the logical relationship of the task may indicate the execution dependencies of those subtasks (serial, parallel, or partly parallel), so that even a task with many processing steps requires only one or a few task requests.
  • the second layer scheduler 12 may also create a corresponding queue for the plurality of subtasks to store the tasks included in the subtasks, and organize the order of the queues according to the logical relationship of the tasks.
  • When the tasks contained in a subtask are stored in the queue, the second layer scheduler 12 may also apply for resources for the subtask and instruct the worker manager on the granted resource to start a work unit, so that the work unit obtains the tasks contained in the subtask from the queue and executes them.
  • the second layer scheduler 12 may also instruct the work unit to put the result of executing the task into another queue or output the result of executing the task.
  • the second layer scheduler 12 may also obtain progress information of the queue and the work unit to determine the execution progress of the task. In this way, real-time query of task progress can be realized, which is convenient for user monitoring.
  • the first layer scheduler 11 may perform priority management on the task, and start or select the second layer scheduler 12 according to the priority to process the task.
  • Figure 4 is a schematic flow chart of the task processing procedure of one embodiment of the present invention.
  • the process of Fig. 4 can be performed by the processing architecture 20 of Fig. 2, and thus the repeated description is omitted as appropriate.
  • Assume the task consists of subtasks 1-3 executed serially in sequence (subtask 1 -> subtask 2 -> subtask 3).
  • embodiments of the present invention are not limited to this specific example, and any other type of task can similarly apply the processing of the embodiment of the present invention. Such applications are all within the scope of embodiments of the invention.
  • a web service receives a request submitted by a user to perform a task.
  • the logical relationship of tasks can be defined by the user.
  • the network service forwards the request to the first layer scheduler.
  • the first layer scheduler returns a response to the network service that successfully submits the request.
  • Step 403 is an optional step.
  • the first layer scheduler calculates the priority according to the priority calculation method and performs the sorting, and selects the task with a high priority.
  • According to the task selected in step 404, the first layer scheduler starts (if no second layer scheduler corresponding to the task's type exists at the time) or selects (if a second layer scheduler corresponding to the type is already available in the system) a suitable second layer scheduler.
  • the first layer scheduler forwards the task request to the second layer scheduler started or selected in step 405.
  • the second layer scheduler preprocesses the task according to the logical relationship of the task. Specifically, as a non-limiting example, the second layer scheduler can decompose the task into multiple subtasks (subtask 1, subtask 2, subtask 3).
  • The second layer scheduler creates queues 1-3 for subtasks 1-3 according to the "queue-worker" computing model. Optionally, the second layer scheduler may generate an initial subtask (corresponding to subtask 1) and place it in queue 1. According to the execution dependency among subtasks 1-3 ("subtask 1 -> subtask 2 -> subtask 3"), the second layer scheduler may determine the execution order of queues 1-3 as "queue 1 -> queue 2 -> queue 3".
  • The second layer scheduler finds that subtask 1 is in queue 1. For example, the second-tier scheduler can periodically check the queues for tasks; however, embodiments of the present invention are not limited to this, and the second layer scheduler may discover subtasks in the queues in other ways.
  • the second layer scheduler requests resources for the subtask 1 from the resource manager.
  • The second layer scheduler instructs the worker manager on the granted resource to start a worker to process subtask 1 in queue 1.
  • the worker manager starts the worker, and tells the worker to put the result obtained by processing the subtask 1 (corresponding to the subtask 2) into the queue 2.
  • The second layer scheduler finds that subtask 2 is in queue 2.
  • the second layer scheduler requests resources for the subtask 2 from the resource manager.
  • the second layer scheduler instructs the worker manager to start the worker to process the subtask 2 in queue 2.
  • the worker manager starts the worker, and tells the worker to put the result obtained by processing the subtask 2 (corresponding to the subtask 3) into the queue 3.
  • 418: After the worker starts, it automatically goes to queue 2 to acquire and execute the tasks contained in subtask 2. After execution completes, it puts the execution result (corresponding to subtask 3) into queue 3.
  • The second layer scheduler finds that subtask 3 is in queue 3.
  • the second layer scheduler requests resources for the subtask 3 from the resource manager.
  • the second layer scheduler instructs the worker manager to start the worker to process the subtask 3 in the queue 3.
  • the worker manager starts the worker, and tells the worker to put the result of processing the subtask 3 into a suitable location (such as putting it into a distributed storage device or returning it to the user).
  • Through steps 409-423, workers automatically take subtasks from the queues and execute them, so that the entire task is processed according to the logical relationship defined for the task.
  • Although steps 409-423 are depicted as executed serially in the embodiment of FIG. 4, embodiments of the present invention are not limited thereto.
  • The execution order of steps 409-413, steps 414-418, and steps 419-423 may be swapped or overlapped. For example, if subtask 2 in queue 2 does not depend on the execution results of all of subtask 1 in queue 1, the worker of queue 2 can work while the worker of queue 1 is working; the worker of queue 2 can be started before the worker of queue 1 has finished executing all of subtask 1.
  • The second layer scheduler can obtain the progress information of the queues or workers and judge the execution progress of the entire task. The progress may be reported to the first layer scheduler, and the user can also query the second layer scheduler directly for the task's progress in real time.
  • The embodiment of the present invention adopts a two-layer scheduling architecture: the second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that it is applicable to different tasks, improves processing efficiency and scheduling flexibility, and can meet the needs of a variety of parallel processing services.
  • the disclosed systems, devices, and methods may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of units is only a logical functional division; in actual implementation there may be other division manners. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be electrical, mechanical or otherwise.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium.
  • the technical solution of the present invention, or the part of it that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium, including several instructions used to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
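The queue-by-queue flow of steps 409-423 above can be sketched as a toy model. This is illustrative only, not the patented implementation: the "workers" are plain functions, the queues are in-process deques, and all names are invented.

```python
from collections import deque

def run_task(initial_tasks, stages):
    """stages: one processing function per subtask level; each worker puts
    its result into the next queue, mirroring subtask 1 -> queue 2 -> ..."""
    queues = [deque(initial_tasks)] + [deque() for _ in stages]
    for i, process in enumerate(stages):
        while queues[i]:                         # scheduler finds subtasks in queue i
            item = queues[i].popleft()           # worker fetches a task from queue i
            queues[i + 1].append(process(item))  # result goes into queue i+1
    return list(queues[-1])                      # final results (e.g. to storage or user)

# Example: a three-stage task processed strictly queue by queue
results = run_task([1, 2, 3],
                   stages=[lambda x: x * 10,        # subtask 1
                           lambda x: x + 1,         # subtask 2
                           lambda x: f"done:{x}"])  # subtask 3
```

The serial loop here corresponds to the depicted ordering; as the text notes, stages whose subtasks are not fully dependent could instead run overlapped.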

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the present invention provide a task processing system and a task processing method for distributed computation. The system comprises: a first level scheduler, used for receiving a request to execute a task, starting or selecting a second level scheduler corresponding to the task, and forwarding the request to the second level scheduler; and the second level scheduler, used for decomposing the task into a plurality of subtasks according to a logical relationship of the task when receiving the request forwarded by the first level scheduler. The embodiments of the invention employ a two-level scheduling framework, the second level scheduler corresponding to the task and the first level scheduler starting or selecting the second level scheduler corresponding to the task, so that the task processing system and the task processing method can be applied to different tasks, and processing efficiency and scheduling flexibility are improved.

Description

Distributed computing task processing system and task processing method

Technical Field

Embodiments of the present invention relate to the field of network communications, and more particularly, to a distributed computing task processing system and a task processing method.

Background
At present, with the development of the Internet, the demand for rapid processing of large amounts of information has become urgent, so parallel processing of data has become very important. The distributed computing environment provides an effective means for resource sharing and interoperability across different software and hardware platforms in a network environment, and has become a common architecture for parallel processing. The parallel processing system currently well known in the industry uses the MapReduce architecture. MapReduce is a distributed computing software framework that supports distributed processing of large data volumes. The architecture originates from the map and reduce functions of functional programming: map processes the original documents according to user-defined mapping rules and outputs intermediate results; reduce merges the intermediate results according to user-defined reduction rules.
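As a concrete illustration of the map and reduce functions described above, a minimal word-count sketch (not part of the original text; the mapping and reduction rules shown here are just one example of "custom rules"):

```python
from itertools import groupby

# Map: process each input document and emit (word, 1) intermediate pairs.
def map_phase(docs):
    return [(word, 1) for doc in docs for word in doc.split()]

# Reduce: merge the intermediate pairs that share a key.
def reduce_phase(pairs):
    pairs = sorted(pairs)  # the shuffle/sort step groups identical keys together
    return {key: sum(v for _, v in grp)
            for key, grp in groupby(pairs, key=lambda p: p[0])}

counts = reduce_phase(map_phase(["a b a", "b c"]))
```

In a real MapReduce deployment the map and reduce calls run on many worker nodes; this sketch only shows the data flow between the two steps.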
In a distributed computing environment, the general MapReduce architecture includes a scheduling node and multiple worker nodes. The scheduling node is responsible for task scheduling and resource management: according to the user configuration, it decomposes the tasks submitted by the user into map and reduce subtasks, and assigns the map and reduce subtasks to worker nodes. The worker nodes run the map and reduce subtasks and maintain communication with the scheduling node.

In this parallel processing architecture, a single scheduling node is responsible for both task and resource management, and task processing must strictly follow the two-step map-then-reduce order. If a job involves many processing steps, it must be completed by submitting many task requests, so processing efficiency is low and scheduling is not flexible enough.

Summary

Embodiments of the present invention provide a task processing system and a task processing method, which can solve the processing-efficiency problem of the existing parallel processing architecture.
In one aspect, a distributed computing task processing system is provided, including: a first layer scheduler, configured to receive a request to execute a task, start or select a second layer scheduler corresponding to the task, and forward the request to the second layer scheduler; and the second layer scheduler, configured to decompose the task into multiple subtasks according to the logical relationship of the task upon receiving the request forwarded by the first layer scheduler.

In another aspect, a distributed computing task processing method is provided. The method includes: upon receiving a request to execute a task, the first layer scheduler starts or selects a second layer scheduler corresponding to the task; the first layer scheduler forwards the request to the second layer scheduler; and upon receiving the request forwarded by the first layer scheduler, the second layer scheduler decomposes the task into multiple subtasks according to the logical relationship of the task.
The embodiment of the present invention adopts a two-layer scheduling architecture: the second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the system and method can be applied to different tasks, improving processing efficiency and scheduling flexibility.

Brief Description of the Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention, and a person of ordinary skill in the art may obtain other drawings from these drawings without creative effort.
FIG. 1 is a block diagram of a task processing system according to an embodiment of the present invention.

FIG. 2 is a schematic diagram of a processing architecture according to an embodiment of the present invention.

FIG. 3 is a flowchart of a task processing method according to an embodiment of the present invention.

FIG. 4 is a schematic flowchart of a task processing procedure according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
FIG. 1 is a block diagram of a distributed computing task processing system according to an embodiment of the present invention. The task processing system 10 of FIG. 1 includes a two-layer scheduler, namely a first layer scheduler 11 and a second layer scheduler 12.

The first layer scheduler 11 receives a request to execute a task, starts or selects the second layer scheduler 12 corresponding to the task, and forwards the request to the second layer scheduler 12.

For example, when there is no suitable second layer scheduler in the system, the first layer scheduler 11 may start a second layer scheduler 12 corresponding to the task. When suitable second layer schedulers already exist in the system, the first layer scheduler 11 may select the second layer scheduler 12 corresponding to the task from among them. Optionally, the first layer scheduler is further configured to perform priority management on tasks, and to start or select the second layer scheduler to process a task according to its priority.

Upon receiving the request forwarded by the first layer scheduler 11, the second layer scheduler 12 decomposes the task into multiple subtasks according to the logical relationship of the task.

The embodiment of the present invention adopts a two-layer scheduling architecture: the second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the architecture can be applied to different tasks, improving processing efficiency and scheduling flexibility.
In the existing parallel processing architecture there is only one layer of scheduling, so task processing must strictly follow the two map and reduce steps; the embodiment of the present invention has no such limitation. The first layer scheduler 11 of the embodiment of the present invention can accept tasks of various forms, and the form of a task is not limited to the strict two-step map and reduce of the prior art. The second layer scheduler 12 corresponds to the task, so that the first layer scheduler 11 can send different tasks to the corresponding second layer schedulers 12 for scheduling. The second layer scheduler 12 decomposes a task into subtasks in order to process it, for example by scheduling the execution of each subtask. Such scheduling offers greater flexibility.

In addition, in the prior art, task processing must strictly follow the map-then-reduce two-step order; if a job involves many processing steps, it must be completed by submitting many task requests, and processing efficiency is low. The embodiment of the present invention has no such limitation and does not restrict the task itself or the manner of executing tasks or subtasks. For example, the number of subtasks included in a task can exceed the two (map and reduce) of the prior art, such as three or more subtasks, and the subtasks are not limited to the form of map and reduce subtasks. In addition, the subtasks need not follow a strict sequence: they may be executed in parallel, serially, or partly in parallel and partly serially. In this way, even a job with many processing steps requires only a small number of task requests, which improves processing efficiency.

The number of subtasks depends on the specific task, for example transcoding or face recognition services. Depending on the logical relationship of the tasks, these tasks may have the same or different numbers of subtasks. Optionally, as an embodiment, the logical relationship of a task may be carried in the task's description file. For example, the task processing system 10 (specifically, for example, the second layer scheduler 12) may receive a description file uploaded by the user, such as a description file in XML (Extensible Markup Language) format, which carries the logical relationship of the task.
Optionally, upon receiving the request forwarded by the first layer scheduler 11, the second layer scheduler 12 obtains the description file in XML format corresponding to the task, and decomposes the task into multiple subtasks according to the logical relationship of the task carried in that description file. In addition, each subtask can be further decomposed into subtasks of smaller granularity. That is, the subtasks in the embodiment of the present invention may be multi-level subtasks, and the manner of further decomposition at each level can be determined by the logical relationship carried in the description file. For example, subtask 1 can be decomposed into multiple subtasks 2, and subtask 2 can be further decomposed into multiple subtasks 3, and so on.
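As an illustration of such a description file, the following sketch parses a hypothetical XML format into (subtask, dependency) pairs. The schema here is invented for the example; the text does not fix any particular XML layout:

```python
import xml.etree.ElementTree as ET

# Hypothetical description-file schema: each <subtask> element may name
# the subtask whose results it depends on via a "depends" attribute.
desc = """
<task name="transcode">
  <subtask name="subtask1"/>
  <subtask name="subtask2" depends="subtask1"/>
  <subtask name="subtask3" depends="subtask2"/>
</task>
"""

root = ET.fromstring(desc)
# The second layer scheduler could read pairs like these to decompose the task.
subtasks = [(s.get("name"), s.get("depends")) for s in root.findall("subtask")]
```

Nested `<subtask>` elements could likewise express the multi-level decomposition mentioned above (subtask 1 containing subtasks 2, and so on).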
Optionally, as an embodiment, the logical relationship of the task may indicate the execution dependencies among the multiple subtasks. An execution dependency refers to whether the execution operations of the subtasks depend on one another.

For example, if subtask 2 must depend on the execution result of subtask 1, then subtask 2 should be executed after subtask 1 (that is, subtask 1 and subtask 2 must be executed serially). On the other hand, if subtask 2 does not depend on all execution results of subtask 1, subtask 1 and subtask 2 may be executed either in parallel or serially.

A non-limiting example of an execution dependency: two or more of the multiple subtasks are executed in serial, parallel, or partly parallel and partly serial order, not limited to the two map and reduce steps of the prior art. In this way, if a task involves many processing steps, there is no need to submit many task requests as in the MapReduce architecture; the embodiment of the present invention may need to submit only one or a few task requests, thereby improving task processing efficiency.
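One way to realize such execution dependencies is to derive, from the declared dependencies, batches of subtasks that may run in parallel while the batches themselves run serially. The sketch below is illustrative only and is not taken from the text:

```python
def execution_batches(deps):
    """deps: {subtask: prerequisites}. Returns batches of subtasks that may
    run in parallel; the batches themselves must run serially, in order."""
    deps = {k: set(v) for k, v in deps.items()}
    batches = []
    while deps:
        ready = sorted(k for k, v in deps.items() if not v)  # no unmet prerequisites
        if not ready:
            raise ValueError("cyclic dependency in task description")
        batches.append(ready)
        for k in ready:
            del deps[k]
        for v in deps.values():
            v.difference_update(ready)  # mark the finished batch as satisfied
    return batches

# s2 and s3 both depend only on s1, so they form one parallel batch.
batches = execution_batches({"s1": [], "s2": ["s1"], "s3": ["s1"], "s4": ["s2", "s3"]})
```

A purely serial chain yields one subtask per batch; independent subtasks collapse into a single batch, matching the "parallel, serial, or partly parallel and partly serial" cases above.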
The logical relationship of a task may explicitly indicate the execution dependencies among subtasks, for example explicitly stating that the task consists of subtasks 1-3 executed serially in sequence. Alternatively, the logical relationship may indicate the execution dependencies implicitly; for example, for a particular task, the system knows in advance that the task consists of subtasks 1-3 executed serially in sequence.
Optionally, as another embodiment, the second layer scheduler 12 is further configured to create corresponding queues for the multiple subtasks to store the tasks included in the subtasks. When the tasks included in a subtask are stored in a queue, the second layer scheduler 12 may further apply for resources for the subtask, and instruct the work unit manager of the requested resources to start a work unit, so that the work unit fetches the tasks included in the subtask from the queue and executes them. Optionally, as another embodiment, the second layer scheduler 12 may further instruct the work unit to put the result of executing a task into another queue or to output the result.
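The queue-and-work-unit interplay described above can be sketched with two overlapped stages: the work unit of the second queue starts consuming as soon as results appear, without waiting for the first stage to finish entirely. This is a simplified in-process illustration (thread-based, with an invented end-of-stage sentinel), not the distributed implementation:

```python
import threading, queue

q1, q2, out = queue.Queue(), queue.Queue(), queue.Queue()
DONE = object()  # sentinel marking the end of a stage's input

def worker(src, dst, fn):
    # A work unit: repeatedly fetch from its queue, process, and put the
    # result into the next queue, until the sentinel arrives.
    while True:
        item = src.get()
        if item is DONE:
            dst.put(DONE)
            break
        dst.put(fn(item))

t1 = threading.Thread(target=worker, args=(q1, q2, lambda x: x * 2))  # subtask 1
t2 = threading.Thread(target=worker, args=(q2, out, lambda x: x + 1))  # subtask 2
t1.start(); t2.start()
for item in [1, 2, 3]:
    q1.put(item)
q1.put(DONE)
t1.join(); t2.join()

results = []
while True:
    item = out.get()
    if item is DONE:
        break
    results.append(item)
```

Because each stage has a single worker draining a FIFO queue, the final results preserve input order even though the two stages run concurrently.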
Further, as another embodiment, the second layer scheduler 12 may also obtain progress information of the queues and the work units to determine the execution progress of the task.

In short, the embodiment of the present invention does not limit the specific form of the task. Optionally, setting or selecting the logical relationship of a task may support user customization, for example by receiving the user's settings or selections through a plug-in mechanism. The task processing system 10 of the embodiment of the present invention is applicable to a cloud computing architecture. Cloud computing offers a highly reliable, low-cost, on-demand, elastic business model, and many systems can achieve high reliability, elasticity, and low cost by using cloud services.

FIG. 2 is a schematic diagram of a processing architecture according to an embodiment of the present invention. The processing architecture 20 of FIG. 2 is a cloud computing architecture that includes the task processing system 10 of FIG. 1. The difference from FIG. 1 is that the task processing system 10 of FIG. 2 may include multiple second layer schedulers 12. For brevity, only two second layer schedulers 12 are depicted in FIG. 2, but the number of second layer schedulers 12 is not limited by this example (there may be more or fewer). Each second layer scheduler 12 corresponds to one kind of task, in order to adapt to or support different computing models. Optionally, multiple second layer schedulers 12 may also correspond to the same kind of task, to achieve high concurrency of system scheduling. If a suitable second layer scheduler 12 corresponding to the task exists among the existing second layer schedulers 12, the first layer scheduler 11 may select that scheduler to process the task; if no suitable second layer scheduler 12 exists, the first layer scheduler 11 may start a new suitable second layer scheduler 12 to process the task.

In the processing architecture 20, the first layer scheduler 11 may be distributed, to support high concurrency. The first layer scheduler 11 may receive task requests sent by a web service 21. The web service 21 is responsible for receiving and forwarding the user's web requests; its specific implementation may follow the prior art and is therefore not described again.
Optionally, as an embodiment, when the first layer scheduler 11 receives multiple task requests, it may also perform priority management on the tasks (for example, prioritize them), and start or select second layer schedulers 12 to process the tasks according to priority. For example, the first layer scheduler 11 may preferentially start or select the second layer scheduler 12 corresponding to a higher-priority task.
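A minimal sketch of such priority management, assuming smaller numbers mean higher priority (a convention chosen here only for illustration):

```python
import heapq

# The first-layer scheduler keeps pending requests in a priority heap and
# dispatches the highest-priority task (smallest number) first.
pending = []
for prio, task in [(2, "face-recognition"), (1, "transcode"), (3, "indexing")]:
    heapq.heappush(pending, (prio, task))

dispatch_order = [heapq.heappop(pending)[1] for _ in range(len(pending))]
```

A user-supplied plug-in, as mentioned below, could replace the comparison rule used to order the heap.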
Optionally, as another embodiment, the first layer scheduler 11 may implement additional functions such as adjusting task priorities. The manner of prioritization or adjustment may support user customization, for example by receiving the user's settings through a plug-in mechanism.

After starting or selecting the second layer scheduler 12 corresponding to a task, the first layer scheduler 11 forwards the task request to that second layer scheduler 12. The second layer scheduler 12 decomposes the task into multiple subtasks according to the logical relationship of the task, and manages the execution of the subtasks. Optionally, as an embodiment, the execution of subtasks may be managed through queues (the distributed queue 22 shown in FIG. 2). The distributed queue 22 may include multiple queues, each storing the tasks included in the corresponding subtask.

Specifically, the second layer scheduler 12 may create corresponding queues for the multiple subtasks to store the tasks included in the subtasks, and may arrange the order of the queues according to the logical relationship of the task. For example, suppose a task consists of subtasks 1-3 executed serially in sequence (subtask 1 -> subtask 2 -> subtask 3). The second layer scheduler 12 can establish queues 1-3 to store the tasks included in subtasks 1-3 respectively, and determine the order of the queues, that is, execute the tasks included in the corresponding subtasks in the order queue 1 -> queue 2 -> queue 3. The execution result of subtask 1 is put into queue 2, the execution result of subtask 2 is put into queue 3, and the execution result of subtask 3 is output to a suitable location, for example to the distributed storage device 24 shown in FIG. 2, or returned to the user.

Optionally, as another embodiment, when the tasks included in a subtask are stored in the distributed queue 22, the second layer scheduler 12 may also apply for resources for the subtask, for example from the resource manager 25. The resource manager 25 is responsible for satisfying the resource applications and releases of the schedulers 11 and 12. Its main functions include resource management, resource matching, and automatic resource scaling; the resource matching method can adopt a plug-in mechanism to support user customization. Automatic resource scaling means that when the user configures the cluster size within a range, the cluster can be automatically expanded or shrunk according to the cluster load. Other implementations of the resource manager 25 may follow the prior art and are therefore not described again; for example, the resource manager 25 may also adopt a distributed scheme to achieve high concurrency.
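The automatic resource scaling rule can be sketched as follows; only the user-configured [min, max] range comes from the text, while the load thresholds and step size are invented for illustration:

```python
def target_size(current, load, min_nodes, max_nodes,
                grow_above=0.8, shrink_below=0.3):
    """Grow or shrink the cluster by one node based on load, but always
    stay within the user-configured [min_nodes, max_nodes] range."""
    if load > grow_above:
        current += 1
    elif load < shrink_below:
        current -= 1
    return max(min_nodes, min(max_nodes, current))

sizes = [target_size(5, 0.9, 2, 8),    # busy cluster: grow by one
         target_size(5, 0.1, 2, 8),    # idle cluster: shrink by one
         target_size(8, 0.95, 2, 8)]   # already at the maximum: capped
```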
After establishing queues for the subtasks and applying for resources, the second layer scheduler 12 may instruct the worker manager 26 of the requested resources to start a worker 27, so that the worker 27 fetches the tasks included in the subtask from the queue and executes them. The worker manager 26 is responsible for the creation, deletion, and monitoring of workers 27; there is a worker manager 26 on each node (physical machine or virtual machine) in the cloud computing architecture. Other implementations of the worker manager 26 may follow the prior art and are therefore not described again.

The worker 27 is responsible for fetching the tasks included in the user's subtask from the corresponding queue in the distributed queue 22, performing preprocessing, and then invoking the processing program developed by the user. After processing is complete, the worker may, in the queue order determined by the second layer scheduler 12, put the result of the executed task into another queue or output it. Other implementations of the worker 27 may follow the prior art and are therefore not described again.

In addition, the second layer scheduler 12 may implement other scheduling processing, such as task exception handling or task progress statistics. For example, the second layer scheduler 12 may obtain progress information of the queues and workers (such as whether and how far each subtask has been completed, whether and how far the subtasks in each queue have been completed, and so on) to determine the execution progress of the task. In this way, real-time queries of task progress can be supported: for example, the user can query the second layer scheduler 12 for the execution progress of the corresponding task. Alternatively, the second layer scheduler 12 may report the progress information of the task to the first layer scheduler 11, so that the user can query the first layer scheduler 11 for the execution progress of the corresponding task, which facilitates user monitoring.
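A simple way to aggregate such progress information, assuming the scheduler can observe per-queue completed/total counts (the field layout is invented here):

```python
def task_progress(queue_stats):
    """queue_stats: one (completed, total) pair per subtask queue.
    Returns the overall fraction of the task's work that is finished."""
    done = sum(c for c, _ in queue_stats)
    total = sum(t for _, t in queue_stats)
    return done / total if total else 1.0

# Queue 1 finished, queue 2 halfway, queue 3 not started: 15 of 30 units done.
progress = task_progress([(10, 10), (5, 10), (0, 10)])
```

The second layer scheduler could answer real-time queries with such a value, or report it upward to the first layer scheduler.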
For brevity, three worker managers 26 and three corresponding workers 27 are illustrated in FIG. 2, but the embodiment of the present invention is not limited to this specific example; the number of worker managers 26 and workers 27 may be larger or smaller.

The cluster management software 28 is responsible for the automated deployment and basic monitoring of the cluster that handles parallel tasks; its implementation may follow the prior art and is therefore not described again.

The distributed queue 22, the database 23 (such as a NoSQL database), and the distributed storage device 24 provide the task storage, database, and file storage required by the processing architecture 20; their specific implementations may also follow the prior art and are therefore not described again. For example, the database 23 can be used for persistent storage of information, to meet system operation needs or to implement fault tolerance.

The bottom layer of the processing architecture 20 supports various heterogeneous hardware, such as physical machines or virtual machines 29, with which user applications need not be concerned. The implementation of the physical machines or virtual machines 29 may follow the prior art and is therefore not described again.

The processing architecture 20 adopts a "queue-worker" computing model, but the embodiment of the present invention is not limited thereto; the processing architecture 20 may also adopt other computing models, for example, part of the processing architecture 20. Therefore, the processing architecture 20 of the embodiment of the present invention adopts a two-layer scheduling architecture: the second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the architecture can be applied to different tasks, improving processing efficiency and scheduling flexibility. Moreover, through the above "queue-worker" computing model, multiple second layer schedulers for different tasks can be started simultaneously, further improving concurrent performance.

In addition, the embodiment of the present invention provides a high-performance, flexible parallel processing architecture 20 that can support physical machines as well as currently popular cloud computing platforms, support large-scale clusters, support user scheduling policy configuration and customization, and support different computing models.

FIG. 3 is a flowchart of a distributed computing task processing method according to an embodiment of the present invention. The method of FIG. 3 may be performed by the task processing system 10 of FIG. 1 and FIG. 2; therefore, the method of FIG. 3 is described below with reference to FIG. 1 and FIG. 2, with repeated descriptions appropriately omitted.
301. Upon receiving a request to execute a task, the first layer scheduler 11 starts or selects the second layer scheduler 12 corresponding to the task. For example, when there is no suitable second layer scheduler in the system, the first layer scheduler 11 may start the second layer scheduler 12 corresponding to the task. When suitable second layer schedulers already exist in the system, the first layer scheduler 11 may select the second layer scheduler 12 corresponding to the task from among them.

302. The first layer scheduler 11 forwards the request to the second layer scheduler 12.

303. Upon receiving the request forwarded by the first layer scheduler, the second layer scheduler 12 decomposes the task into multiple subtasks according to the logical relationship of the task.

The embodiment of the present invention adopts a two-layer scheduling architecture: the second layer scheduler corresponds to a task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the architecture can be applied to different tasks, improving processing efficiency and scheduling flexibility.
In the existing parallel processing architecture there is only one layer of scheduling, so task processing must strictly follow the two steps map and reduce in sequence; the embodiments of the present invention have no such limitation. The first layer scheduler 11 of the embodiment of the present invention can accept tasks in various forms, which need not be limited to the strict map and reduce steps of the prior art. The second layer scheduler 12 corresponds to a task, so that the first layer scheduler 11 can send different tasks to the corresponding second layer schedulers 12 for scheduling. The second layer scheduler 12 decomposes a task into subtasks in order to process it, for example by scheduling the execution of each subtask. Such scheduling offers greater flexibility.

In addition, the prior art requires task processing to strictly follow the sequence of the two steps map and reduce. If the processing involves many steps, it must be completed by submitting many task requests, which is inefficient. The embodiments of the present invention have no such limitation and do not restrict the task itself or the manner of executing the task or its subtasks. For example, a task may contain more subtasks than the two (map and reduce) of the prior art, such as three or more subtasks, and these are not limited to the form of map and reduce subtasks. Moreover, subtasks need not follow a strict sequence: they may be executed in parallel, serially, or partly in parallel and partly serially. In this way, even processing with many steps requires only a small number of task requests, improving processing efficiency.

Optionally, as an embodiment, the logical relationship of a task may indicate the execution dependencies of the multiple subtasks. An execution dependency refers to whether the execution operations of the individual subtasks depend on one another.

For example, suppose subtask 2 must depend on the execution result of subtask 1; then subtask 2 should be executed after subtask 1 (that is, subtask 1 and subtask 2 must be executed serially). On the other hand, if subtask 2 does not depend on the entire execution result of subtask 1, subtask 1 and subtask 2 may be executed either in parallel or serially. A non-limiting example of an execution dependency may include: two or more of the multiple subtasks being executed in a serial, parallel, or partly parallel and partly serial order, without being limited to the two steps map and reduce of the prior art. In this way, if a task involves many processing steps, there is no need to submit many task requests as in the MapReduce architecture; an embodiment of the present invention may need to submit only one or a few task requests, thereby improving task processing efficiency.
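The dependency-driven ordering described above can be illustrated with a small sketch (a hypothetical illustration; the function and subtask names are assumptions, not part of the embodiment). It groups subtasks into "waves": subtasks within one wave may run in parallel, while the waves themselves run serially.

```python
# Hypothetical sketch: derive an execution order from subtask dependencies.
# "deps" maps each subtask to the set of subtasks it depends on.

def execution_waves(deps):
    """Group subtasks into waves; subtasks in one wave may run in
    parallel, and the waves themselves run serially."""
    remaining = {t: set(d) for t, d in deps.items()}
    waves = []
    while remaining:
        ready = [t for t, d in remaining.items() if not d]
        if not ready:
            raise ValueError("cyclic dependency among subtasks")
        waves.append(sorted(ready))
        for t in ready:
            del remaining[t]
        for d in remaining.values():
            d.difference_update(ready)
    return waves

# Serial chain: subtask 2 depends on subtask 1, subtask 3 on subtask 2.
serial = execution_waves({"sub1": [], "sub2": ["sub1"], "sub3": ["sub2"]})
# Mixed: subtasks 1 and 2 are independent, so they share the first wave.
mixed = execution_waves({"sub1": [], "sub2": [], "sub3": ["sub1", "sub2"]})
```

With the serial chain the sketch yields one wave per subtask, while the mixed case places the two independent subtasks into the same (parallel) wave, matching the serial/parallel/mixed orders discussed above.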
Optionally, as another embodiment, the second layer scheduler 12 may also create corresponding queues for the multiple subtasks to store the tasks contained in the subtasks, and arrange the order of the queues according to the logical relationship of the task.

Optionally, as another embodiment, when tasks contained in a subtask are stored in a queue, the second layer scheduler 12 may also apply for resources for the subtask and instruct the work unit (worker) manager of the requested resources to start a work unit, so that the work unit fetches the tasks contained in the subtask from the queue and executes them.

Optionally, as another embodiment, the second layer scheduler 12 may also instruct the work unit to put the result of executing a task into another queue, or to output the result of executing the task.
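The work-unit behavior described above — fetching a subtask's tasks from its queue, executing them, and depositing each result into another queue — can be sketched as follows. This is a hypothetical illustration using Python's standard queues and threads; the names and interface are assumptions, not the embodiment's actual API.

```python
import queue
import threading

# Hypothetical sketch of a work unit (worker): it fetches the tasks of one
# subtask from its input queue, executes them, and puts each result into
# the next queue, where the result becomes the next subtask's input.

def worker(in_q, out_q, fn):
    while True:
        task = in_q.get()
        if task is None:          # sentinel: the subtask's queue is drained
            break
        out_q.put(fn(task))       # result feeds the next subtask

queue1, queue2 = queue.Queue(), queue.Queue()
w = threading.Thread(target=worker, args=(queue1, queue2, lambda x: x * 2))
w.start()
for task in (1, 2, 3):
    queue1.put(task)              # tasks contained in subtask 1
queue1.put(None)                  # signal that no more tasks will arrive
w.join()
results = sorted(queue2.get() for _ in range(3))   # -> [2, 4, 6]
```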
Optionally, as another embodiment, the second layer scheduler 12 may also obtain progress information of the queues and work units to determine the execution progress of the task. In this way, real-time querying of task progress can be achieved, which is convenient for user monitoring.
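One simple way such progress could be derived from queue and worker information is sketched below; the counting scheme and names are invented purely for illustration and are not prescribed by the embodiment.

```python
# Hypothetical sketch: estimate overall task progress from how many task
# units are still queued, how many are being executed by workers, and how
# many are already done.

def task_progress(queued, in_flight, done):
    """Return the finished fraction of the task, or 0.0 if nothing exists yet."""
    total = queued + in_flight + done
    return done / total if total else 0.0

progress = task_progress(queued=2, in_flight=1, done=1)   # 1 of 4 units done
```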
Optionally, as another embodiment, in step 301 the first layer scheduler 11 may perform priority management on tasks, and start or select the second layer scheduler 12 to process tasks according to their priorities.
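Such priority management might look like the following sketch, in which received tasks are ranked and the highest-priority task is dispatched first. The class, method names, and tie-breaking rule are invented for illustration; the embodiment does not prescribe a particular priority calculation method.

```python
import heapq

# Hypothetical sketch of priority management in the first layer scheduler:
# tasks are kept in a heap ordered by priority, with FIFO order used as a
# tie-breaker among tasks of equal priority.

class FirstLayerScheduler:
    def __init__(self):
        self._heap = []
        self._seq = 0          # monotonically increasing tie-breaker

    def submit(self, task, priority):
        # heapq is a min-heap, so the priority is negated to make the
        # highest-priority task pop first.
        heapq.heappush(self._heap, (-priority, self._seq, task))
        self._seq += 1

    def next_task(self):
        return heapq.heappop(self._heap)[2]

sched = FirstLayerScheduler()
sched.submit("log-analysis", priority=1)
sched.submit("billing", priority=5)
sched.submit("report", priority=1)
order = [sched.next_task() for _ in range(3)]
# order -> ["billing", "log-analysis", "report"]
```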
Embodiments of the present invention are described in more detail below with reference to specific examples. FIG. 4 is a schematic flowchart of a task processing procedure according to an embodiment of the present invention. For example, the procedure of FIG. 4 may be performed by the processing architecture 20 of FIG. 2; repeated descriptions are therefore appropriately omitted.

In the example of FIG. 4, it is assumed that the task consists of subtasks 1-3 executed serially in sequence (subtask 1 -> subtask 2 -> subtask 3). However, the embodiments of the present invention are not limited to this specific example; the processing of the embodiments of the present invention can be applied similarly to any other type of task, and such applications all fall within the scope of the embodiments of the present invention.

401. A web service receives a request submitted by a user to execute a task. The logical relationship of the task may be defined by the user.

402. The web service forwards the request to the first layer scheduler.

403. The first layer scheduler returns to the web service a response indicating that the request was submitted successfully. Step 403 is optional.

404. The first layer scheduler computes priorities for the received tasks according to a priority calculation method, sorts them, and selects a task with a high priority.

405. According to the task selected in step 404, the first layer scheduler starts (if no second layer scheduler of the corresponding type exists in the system at the time) or selects (if a second layer scheduler of the corresponding type is available in the system) a suitable second layer scheduler.

406. The first layer scheduler forwards the task request to the second layer scheduler started or selected in step 405.

407. After receiving the task request, the second layer scheduler preprocesses the task according to its logical relationship. Specifically, as a non-limiting example, the second layer scheduler may decompose the task into multiple subtasks (subtask 1, subtask 2, subtask 3).

408. The second layer scheduler creates queues 1-3 for subtasks 1-3 according to the "queue-worker" computing model. Optionally, the second layer scheduler may generate an initial subtask (corresponding to subtask 1) and put it into queue 1. At this point, the second layer scheduler may arrange the execution order of queues 1-3 as "queue 1 -> queue 2 -> queue 3" according to the execution dependency "subtask 1 -> subtask 2 -> subtask 3" among subtasks 1-3.
409. The second layer scheduler discovers that there is a subtask 1 in queue 1. For example, the second layer scheduler may periodically check the queues to see whether they contain tasks. However, the embodiments of the present invention are not limited thereto; the second layer scheduler may discover subtasks in the queues in other ways.

410. The second layer scheduler applies to the resource manager for resources for subtask 1.

411. The second layer scheduler instructs the work unit (worker) manager of the requested resources to start a worker to process subtask 1 in queue 1.

412. The worker manager starts the worker and tells it to put the result of processing subtask 1 (corresponding to subtask 2) into queue 2.

413. After starting, the worker automatically goes to queue 1, fetches and executes the tasks contained in subtask 1, and upon completion puts the execution result (corresponding to subtask 2) into queue 2.

414. The second layer scheduler discovers that there is a subtask 2 in queue 2.

415. The second layer scheduler applies to the resource manager for resources for subtask 2.

416. The second layer scheduler instructs the work unit (worker) manager to start a worker to process subtask 2 in queue 2.

417. The worker manager starts the worker and tells it to put the result of processing subtask 2 (corresponding to subtask 3) into queue 3.

418. After starting, the worker automatically goes to queue 2, fetches and executes the tasks contained in subtask 2, and upon completion puts the execution result (corresponding to subtask 3) into queue 3.
419. The second layer scheduler discovers that there is a subtask 3 in queue 3.

420. The second layer scheduler applies to the resource manager for resources for subtask 3.

421. The second layer scheduler instructs the work unit (worker) manager to start a worker to process subtask 3 in queue 3.

422. The worker manager starts the worker and tells it to put the result of processing subtask 3 into a suitable location (for example, into the distributed storage device, or returned to the user).

423. After starting, the worker automatically goes to queue 3, fetches and executes the tasks contained in subtask 3, and upon completion puts the execution result into a suitable location.

Through the above steps 409-423, the workers automatically fetch subtasks and deposit subtasks, so that the whole task can be processed according to the logical relationship defined for it. In addition, although steps 409-423 are depicted in the embodiment of FIG. 4 as executed serially, the embodiments of the present invention are not limited thereto. In other embodiments, when queues 1-3 need not be processed in sequence, the execution order of steps 409-413, 414-418, and 419-423 may be interchanged or overlapped. For example, if subtask 2 in queue 2 does not depend on the execution results of all of subtask 1 in queue 1, the worker of queue 2 may work while the worker of queue 1 is working, rather than starting only after the worker of queue 1 has executed all of subtask 1.

424. The second layer scheduler may obtain the progress information of the queues or workers to judge the execution progress of the whole task.
425. If the task has finished executing, the second layer scheduler reports this to the first layer scheduler. The second layer scheduler may also directly allow the user to query the progress of the task in real time.
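The chained behavior of steps 409-423 — three queues linked as "queue 1 -> queue 2 -> queue 3", each drained by its own worker, with the last worker writing its result to an output location instead of another queue — can be sketched end-to-end as follows. This is a hypothetical illustration: the stage functions, names, and sentinel-based shutdown are invented, not taken from the embodiment.

```python
import queue
import threading

# Hypothetical end-to-end sketch of steps 409-423: each stage's worker
# fetches from its input queue, executes its subtask, and forwards the
# result; the final stage appends to an output list instead of a queue.

def make_stage(in_q, out, fn):
    def run():
        while True:
            item = in_q.get()
            if item is None:                    # sentinel: shut this stage down
                if isinstance(out, queue.Queue):
                    out.put(None)               # propagate shutdown downstream
                break
            result = fn(item)
            if isinstance(out, queue.Queue):
                out.put(result)                 # result feeds the next subtask
            else:
                out.append(result)              # final result goes to output
    return threading.Thread(target=run)

q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
final = []
stages = [make_stage(q1, q2, lambda x: x + 1),   # worker for subtask 1
          make_stage(q2, q3, lambda x: x * 10),  # worker for subtask 2
          make_stage(q3, final, str)]            # worker for subtask 3
for s in stages:
    s.start()
q1.put(4)        # the initial subtask placed in queue 1 (step 408)
q1.put(None)     # no further tasks; shutdown ripples through the chain
for s in stages:
    s.join()
# final == ["50"]
```

Because every worker blocks on its own queue, the stages here naturally overlap once multiple initial tasks are enqueued, mirroring the observation above that the worker of queue 2 may run while the worker of queue 1 is still working.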
In this way, the embodiments of the present invention adopt a two-layer scheduling architecture: the second layer scheduler corresponds to a task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task. The architecture can thus be applied to different tasks, improves processing efficiency and scheduling flexibility, and can meet the needs of a variety of parallel processing services.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention. A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, devices, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not described again here.

In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or in other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.

If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part contributing to the prior art, or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

The above are merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any person skilled in the art can readily conceive of changes or substitutions within the technical scope disclosed in the present invention, and these shall all be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A distributed computing task processing system, comprising:

a first layer scheduler, configured to receive a request to execute a task, start or select a second layer scheduler corresponding to the task, and forward the request to the second layer scheduler; and

a second layer scheduler, configured to, upon receiving the request forwarded by the first layer scheduler, decompose the task into multiple subtasks according to the logical relationship of the task.

2. The system according to claim 1, wherein the logical relationship of the task indicates execution dependencies of the multiple subtasks.

3. The system according to claim 1 or 2, wherein the second layer scheduler is further configured to create corresponding queues for the multiple subtasks to store the tasks contained in the subtasks.

4. The system according to claim 3, wherein, when the tasks contained in a subtask are stored in a queue, the second layer scheduler is further configured to apply for resources for the subtask and instruct the work unit manager of the requested resources to start a work unit, so that the work unit fetches the tasks contained in the subtask from the queue and executes them.

5. The system according to claim 4, wherein the second layer scheduler is further configured to instruct the work unit to put the result of executing a task into another queue or to output the result of executing the task.

6. The system according to claim 4 or 5, wherein the second layer scheduler is further configured to obtain progress information of the queues and the work units to determine the execution progress of the task.

7. The system according to any one of claims 2-6, wherein the execution dependencies of the multiple subtasks include: two or more of the multiple subtasks being executed in serial or parallel order.

8. The system according to any one of claims 1-7, wherein the first layer scheduler is further configured to perform priority management on the task and, according to the priority, start or select the second layer scheduler to process the task.
9. A distributed computing task processing method, comprising:

upon receiving a request to execute a task, starting or selecting, by a first layer scheduler, a second layer scheduler corresponding to the task;

forwarding, by the first layer scheduler, the request to the second layer scheduler; and

upon receiving the request forwarded by the first layer scheduler, decomposing, by the second layer scheduler, the task into multiple subtasks according to the logical relationship of the task.

10. The method according to claim 9, wherein the logical relationship of the task indicates execution dependencies of the multiple subtasks.

11. The method according to claim 9 or 10, further comprising: creating, by the second layer scheduler, corresponding queues for the multiple subtasks to store the tasks contained in the subtasks.

12. The method according to claim 11, further comprising: when the tasks contained in a subtask are stored in a queue, applying, by the second layer scheduler, for resources for the subtask, and instructing the work unit manager of the requested resources to start a work unit, so that the work unit fetches the tasks contained in the subtask from the queue and executes them.

13. The method according to claim 12, further comprising: instructing, by the second layer scheduler, the work unit to put the result of executing a task into another queue or to output the result of executing the task.

14. The method according to claim 12 or 13, further comprising: obtaining, by the second layer scheduler, progress information of the queue and the work unit to determine the execution progress of the task.

15. The method according to any one of claims 9-14, wherein the first layer scheduler further performs priority management on the task and, according to the priority, starts or selects the second layer scheduler to process the task.
PCT/CN2012/070551 2012-01-18 2012-01-18 Task processing system and task processing method for distributed computation WO2013107012A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN2012800001658A CN102763086A (en) 2012-01-18 2012-01-18 Task processing system for distributed computation and task processing method for distributed computation
PCT/CN2012/070551 WO2013107012A1 (en) 2012-01-18 2012-01-18 Task processing system and task processing method for distributed computation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2012/070551 WO2013107012A1 (en) 2012-01-18 2012-01-18 Task processing system and task processing method for distributed computation

Publications (1)

Publication Number Publication Date
WO2013107012A1 true WO2013107012A1 (en) 2013-07-25

Family

ID=47056377

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/070551 WO2013107012A1 (en) 2012-01-18 2012-01-18 Task processing system and task processing method for distributed computation

Country Status (2)

Country Link
CN (1) CN102763086A (en)
WO (1) WO2013107012A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160299791A1 (en) * 2014-01-14 2016-10-13 Tencent Technology (Shenzhen) Company Limited Method And Apparatus For Processing Computational Task

Families Citing this family (22)

Publication number Priority date Publication date Assignee Title
CN103064736B (en) * 2012-12-06 2017-02-22 华为技术有限公司 Device and method for task processing
CN103870334B (en) * 2012-12-18 2017-05-31 ***通信集团公司 A kind of method for allocating tasks and device of extensive vulnerability scanning
US9886310B2 (en) 2014-02-10 2018-02-06 International Business Machines Corporation Dynamic resource allocation in MapReduce
CN104102949B (en) * 2014-06-27 2018-01-26 北京奇艺世纪科技有限公司 A kind of distributed work flow device and its method for handling workflow
CN104035817A (en) * 2014-07-08 2014-09-10 领佰思自动化科技(上海)有限公司 Distributed parallel computing method and system for physical implementation of large scale integrated circuit
CN104123182B (en) * 2014-07-18 2015-09-30 西安交通大学 Based on the MapReduce task of client/server across data center scheduling system and method
US10007619B2 (en) * 2015-05-29 2018-06-26 Qualcomm Incorporated Multi-threaded translation and transaction re-ordering for memory management units
CN106547523B (en) * 2015-09-17 2019-08-06 北大方正集团有限公司 Progress bar progress display methods, apparatus and system
CN105653365A (en) * 2016-02-22 2016-06-08 青岛海尔智能家电科技有限公司 Task processing method and device
CN106445681B (en) * 2016-08-31 2019-11-29 东方网力科技股份有限公司 Distributed task dispatching system and method
CN107665401B (en) * 2017-09-15 2020-06-26 平安科技(深圳)有限公司 Task allocation method, terminal and computer readable storage medium
CN107818016A (en) * 2017-11-22 2018-03-20 苏州麦迪斯顿医疗科技股份有限公司 Server application design method, request event processing method and processing device
CN110569252B (en) * 2018-05-16 2023-04-07 杭州海康威视数字技术股份有限公司 Data processing system and method
CN110597613A (en) * 2018-06-12 2019-12-20 成都鼎桥通信技术有限公司 Task processing method, device, equipment and computer readable storage medium
CN111258744A (en) * 2018-11-30 2020-06-09 中兴通讯股份有限公司 Task processing method based on heterogeneous computation and software and hardware framework system
CN109885388A (en) * 2019-01-31 2019-06-14 上海赜睿信息科技有限公司 A kind of data processing method and device suitable for heterogeneous system
CN110750371A (en) * 2019-10-17 2020-02-04 北京创鑫旅程网络技术有限公司 Flow execution method, device, equipment and storage medium
CN113448692A (en) * 2020-03-25 2021-09-28 杭州海康威视数字技术股份有限公司 Distributed graph computing method, device, equipment and storage medium
CN111506409B (en) * 2020-04-20 2023-05-23 南方电网科学研究院有限责任公司 Data processing method and system
CN111708643A (en) * 2020-06-11 2020-09-25 中国工商银行股份有限公司 Batch operation method and device for distributed streaming media platform
CN112596871A (en) * 2020-12-16 2021-04-02 中国建设银行股份有限公司 Service processing method and device
CN115268800B (en) * 2022-09-29 2022-12-20 四川汉唐云分布式存储技术有限公司 Data processing method and data storage system based on calculation route redirection

Citations (3)

Publication number Priority date Publication date Assignee Title
CN1981484A (en) * 2004-08-05 2007-06-13 思科技术公司 Hierarchal scheduler with multiple scheduling lanes
CN101621460A (en) * 2008-06-30 2010-01-06 中兴通讯股份有限公司 Packet scheduling method and device
CN102185761A (en) * 2011-04-13 2011-09-14 中国人民解放军国防科学技术大学 Two-layer dynamic scheduling method facing to ensemble prediction applications

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
CN100440802C (en) * 2005-12-26 2008-12-03 北京航空航天大学 Service gridding system and method for processing operation
CN101169743A (en) * 2007-11-27 2008-04-30 南京大学 Method for implementing parallel power flow calculation based on multi-core computer in electric grid
CN101957780B (en) * 2010-08-17 2013-03-20 中国电子科技集团公司第二十八研究所 Resource state information-based grid task scheduling processor and grid task scheduling processing method
CN102110022B (en) * 2011-03-22 2013-04-10 上海交通大学 Sensor network embedded operation system based on priority scheduling

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160299791A1 (en) * 2014-01-14 2016-10-13 Tencent Technology (Shenzhen) Company Limited Method And Apparatus For Processing Computational Task
US10146588B2 (en) * 2014-01-14 2018-12-04 Tencent Technology (Shenzhen) Company Limited Method and apparatus for processing computational task having multiple subflows

Also Published As

Publication number Publication date
CN102763086A (en) 2012-10-31

Similar Documents

Publication Publication Date Title
WO2013107012A1 (en) Task processing system and task processing method for distributed computation
US11709704B2 (en) FPGA acceleration for serverless computing
Ge et al. GA-based task scheduler for the cloud computing systems
JP6373432B2 (en) Real-time optimization of computing infrastructure in virtual environment
Singh et al. Scheduling real-time security aware tasks in fog networks
US9098312B2 (en) Methods for dynamically generating an application interface for a modeled entity and devices thereof
US9262210B2 (en) Light weight workload management server integration
US20170171026A1 (en) Configuring a cloud from aggregate declarative configuration data
KR100683820B1 (en) Autonomic Service Routing Using Observed Resource Requirement for Self-Optimization
US20090241117A1 (en) Method for integrating flow orchestration and scheduling for a batch of workflows
JP2018518744A (en) Automatic scaling of resource instance groups within a compute cluster
CN108021435B (en) Cloud computing task flow scheduling method with fault tolerance capability based on deadline
CN114138486A (en) Containerized micro-service arranging method, system and medium for cloud edge heterogeneous environment
CN104506620A (en) Extensible automatic computing service platform and construction method for same
CN110661842B (en) Resource scheduling management method, electronic equipment and storage medium
JP6568238B2 (en) Hardware acceleration method and related devices
CN110569252B (en) Data processing system and method
CN104508625A (en) Abstraction models for monitoring of cloud resources
CN104503832A (en) Virtual machine scheduling system and virtual machine scheduling method with balanced equity and efficiency
CN111427675A (en) Data processing method and device and computer readable storage medium
CN112463290A (en) Method, system, apparatus and storage medium for dynamically adjusting the number of computing containers
CN107025134B (en) Database service system and method compatible with multiple databases
CN107204998B (en) Method and device for processing data
WO2022257247A1 (en) Data processing method and apparatus, and computer-readable storage medium
CN116048825A (en) Container cluster construction method and system

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 201280000165.8

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12865714

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12865714

Country of ref document: EP

Kind code of ref document: A1