WO2013107012A1 - Task processing system and task processing method for distributed computation - Google Patents


Info

Publication number
WO2013107012A1
WO2013107012A1 · PCT/CN2012/070551
Authority
WO
WIPO (PCT)
Prior art keywords
task
layer scheduler
subtasks
scheduler
queue
Prior art date
Application number
PCT/CN2012/070551
Other languages
French (fr)
Chinese (zh)
Inventor
靳变变
刘文宇
严军
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to CN2012800001658A (published as CN102763086A)
Priority to PCT/CN2012/070551 (published as WO2013107012A1)
Publication of WO2013107012A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/48: Program initiating; program switching, e.g. by interrupt
    • G06F 9/4806: Task transfer initiation or dispatching
    • G06F 9/4843: Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Definitions

  • Embodiments of the present invention relate to the field of network communications and, more particularly, to a task processing system and task processing method for distributed computing. Background Art
  • MapReduce is a distributed computing software architecture that supports distributed processing of large amounts of data. The architecture originates from the map and reduce functions of functional programming: map processes the original data according to custom mapping rules and outputs intermediate results, and reduce merges the intermediate results according to custom reduction rules.
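As a non-limiting illustration (not part of the patent text), the map and reduce steps described above can be sketched in Python; the conventional word-count example is used:

```python
# Toy illustration of the classic MapReduce model: map emits intermediate
# (key, value) pairs, reduce merges them by key. Names are illustrative.
from collections import defaultdict

def map_phase(documents):
    """Map: emit (word, 1) pairs from each input document."""
    for doc in documents:
        for word in doc.split():
            yield word, 1

def reduce_phase(pairs):
    """Reduce: merge the intermediate pairs by key."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

result = reduce_phase(map_phase(["a b a", "b c"]))
```

Here the entire computation must pass through exactly these two steps, which is the rigidity the embodiments below aim to relax.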
  • The general architecture of MapReduce includes a scheduling node and multiple working nodes.
  • The scheduling node is responsible for task scheduling and resource management: it decomposes the tasks submitted by the user into two subtasks, map and reduce, according to the user configuration, and assigns the map and reduce subtasks to the working nodes.
  • The working nodes run the map and reduce subtasks and maintain communication with the scheduling node.
  • The embodiments of the present invention provide a task processing system and a task processing method, which can address the low processing efficiency of the existing parallel processing architecture.
  • A distributed computing task processing system includes a first layer scheduler configured to receive a request to execute a task, start or select a second layer scheduler corresponding to the task, and forward the request to the second layer scheduler.
  • the second layer scheduler is configured to decompose the task into multiple subtasks according to the logical relationship of the task when receiving the request forwarded by the first layer scheduler.
  • A distributed computing task processing method includes: when receiving a request to execute a task, the first layer scheduler starts or selects a second layer scheduler corresponding to the task; the first layer scheduler forwards the request to the second layer scheduler; and, when receiving the forwarded request, the second layer scheduler decomposes the task into multiple subtasks according to the logical relationship of the task.
  • the embodiment of the invention adopts a two-layer scheduling architecture.
  • The second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the architecture can be applied to different tasks, improving processing efficiency and scheduling flexibility.
  • FIG. 1 is a block diagram of a task processing system in accordance with an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a processing architecture of an embodiment of the present invention.
  • FIG. 3 is a flow chart of a task processing method in accordance with an embodiment of the present invention.
  • FIG. 4 is a schematic flow chart of a task processing procedure according to an embodiment of the present invention. Detailed Description
  • FIG. 1 is a block diagram of a distributed computing task processing system in accordance with an embodiment of the present invention.
  • The task processing system 10 of FIG. 1 includes a two-layer scheduler: a first layer scheduler 11 and a second layer scheduler 12.
  • the first layer scheduler 11 receives the request to perform the task, starts or selects the second layer scheduler 12 corresponding to the task, and forwards the request to the second layer scheduler 12.
  • When no suitable second layer scheduler exists in the system, the first layer scheduler 11 can start a second layer scheduler 12 corresponding to the task.
  • When suitable second layer schedulers already exist in the system, the first layer scheduler 11 can select the second layer scheduler 12 corresponding to the task from among them.
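The start-or-select behavior described above might be sketched as follows; all class, method, and attribute names here are hypothetical, chosen for illustration only:

```python
# Illustrative sketch (not the patented implementation) of a first layer
# scheduler that selects an existing second layer scheduler for a task
# type, or starts a new one when none is suitable.
class SecondLayerScheduler:
    def __init__(self, task_type):
        self.task_type = task_type

class FirstLayerScheduler:
    def __init__(self):
        self.schedulers = {}  # task type -> running second layer scheduler

    def dispatch(self, task_type, request):
        scheduler = self.schedulers.get(task_type)
        if scheduler is None:
            # No suitable scheduler exists: start one for this task type.
            scheduler = SecondLayerScheduler(task_type)
            self.schedulers[task_type] = scheduler
        # The request would then be forwarded to the chosen scheduler.
        return scheduler

front = FirstLayerScheduler()
s1 = front.dispatch("transcoding", {"id": 1})
s2 = front.dispatch("transcoding", {"id": 2})
```

The second request reuses the scheduler started for the first, mirroring the select path.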
  • the first layer scheduler is further configured to perform priority management on the task, and start or select the second layer scheduler to process the task according to the priority.
  • When receiving the request forwarded by the first layer scheduler 11, the second layer scheduler 12 decomposes the task into a plurality of subtasks according to the logical relationship of the task.
  • the embodiment of the invention adopts a two-layer scheduling architecture.
  • The second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the architecture can be applied to different tasks, improving processing efficiency and scheduling flexibility.
  • the first layer scheduler 11 of the embodiment of the present invention can accept various forms of tasks, and the form of the task is not necessarily limited to the strict two steps of map and reduce in the prior art.
  • the second layer scheduler 12 corresponds to the task, so that the first layer scheduler 11 can send different tasks to the corresponding second layer scheduler 12 for scheduling processing.
  • the second layer scheduler 12 decomposes the tasks into subtasks to process the tasks, such as scheduling the execution of each subtask. Such scheduling has greater flexibility.
  • In the prior art, task processing must proceed strictly in the order of the two steps map and reduce. If there are many processing steps, the task request must be completed through many submissions, so processing efficiency is low.
  • Embodiments of the invention do not have this limitation.
  • the embodiment of the present invention does not limit the task itself and the manner in which the task or subtask is executed.
  • The number of subtasks included in a task can exceed the prior art's two (map, reduce), for example three or more subtasks, and the subtasks are not limited to the map and reduce forms.
  • the subtasks do not have to follow a strict sequence, and may be executed in parallel, serially, or partially in parallel. In this way, even with many steps, only a small number of task requests are required, which improves processing efficiency.
  • the number of subtasks is related to specific tasks, such as transcoding, face recognition services, and so on. Depending on the logical relationship of the tasks, these tasks may have the same or different number of subtasks.
  • the logical relationship of the task may be carried in the description file of the task.
  • In the task processing system 10, specifically, the second layer scheduler 12 can receive a user-uploaded description file, for example a description file in XML (Extensible Markup Language) format, which carries the logical relationship of the task.
  • When receiving the request forwarded by the first layer scheduler 11, the second layer scheduler 12 obtains the XML-format description file corresponding to the task and decomposes the task into a plurality of subtasks according to the logical relationship carried in that description file.
  • each subtask can be further broken down into sub-tasks of smaller granularity. That is, the subtask of the embodiment of the present invention may be a multi-layer subtask, and the manner of further decomposing each subtask may be determined by the logical relationship carried in the description file. For example, subtask 1 can be decomposed into multiple subtasks 2, and subtask 2 can be further decomposed into multiple subtasks 3 and so on.
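As an illustration of decomposing a task from its description file, the following sketch parses an assumed XML layout; the patent does not fix a concrete schema, so the element and attribute names here are invented:

```python
# Sketch of reading subtasks from an XML task description file.
# The schema (<task>/<subtask name order>) is an assumption for
# illustration only; the patent leaves the format open.
import xml.etree.ElementTree as ET

DESCRIPTION = """
<task name="transcode">
  <subtask name="split" order="1"/>
  <subtask name="encode" order="2"/>
  <subtask name="merge" order="3"/>
</task>
"""

def decompose(xml_text):
    """Return subtask names in their declared execution order."""
    root = ET.fromstring(xml_text)
    subs = sorted(root.findall("subtask"), key=lambda e: int(e.get("order")))
    return [e.get("name") for e in subs]

subtasks = decompose(DESCRIPTION)
```

A multi-layer decomposition would simply nest further `<subtask>` elements and recurse.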
  • the logical relationship of the tasks may indicate the execution dependencies of the plurality of subtasks.
  • execution dependency refers to whether the execution operations of each sub-task depend on each other.
  • For example, if subtask 2 must rely on the execution result of subtask 1, subtask 2 should be executed after subtask 1 completes (i.e., subtask 1 and subtask 2 need to be executed serially). Conversely, if subtask 2 does not depend on the entire execution result of subtask 1, subtask 1 and subtask 2 may be executed in parallel or serially.
  • As a non-limiting example of execution dependencies, two or more of the plurality of subtasks may be executed serially, in parallel, or partly in parallel and partly serially, and are not limited to the two prior-art steps of map and reduce. In this way, even if a task involves many processing steps, there is no need to submit multiple task requests as in the MapReduce architecture; an embodiment of the present invention may need to submit a task request only once or a few times, improving the processing efficiency of the task.
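The dependency rule described above (two subtasks may run in parallel only when neither depends on the other's result) can be captured in a small sketch; the dependency map shown is a hypothetical example:

```python
# Illustrative check of execution dependencies between subtasks.
# deps maps a subtask to the set of subtasks whose results it needs;
# the concrete names are invented for this example.
def can_run_in_parallel(deps, a, b):
    """Two subtasks may run in parallel if neither depends on the other."""
    return b not in deps.get(a, set()) and a not in deps.get(b, set())

# subtask 2 depends on the result of subtask 1; subtask 3 depends on neither.
deps = {"subtask2": {"subtask1"}}
serial = not can_run_in_parallel(deps, "subtask1", "subtask2")
```

Under these dependencies, subtasks 1 and 2 must run serially while subtask 3 may overlap with either.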
  • the logical relationship of the task can explicitly indicate the execution dependencies between the subtasks, for example, explicitly indicating that the task is composed of subtasks 1-3 that are executed serially in succession.
  • the logical relationship of the task may implicitly indicate an execution dependency between the subtasks. For example, for a particular task, the system knows in advance that the task is composed of subtasks 1-3 that are executed serially in succession.
  • the second layer scheduler 12 is further configured to create a corresponding queue for the multiple subtasks to store the tasks included in the subtask.
  • The second layer scheduler 12 may be further configured to apply for resources for a subtask and to instruct the work unit manager on the granted resource to start a work unit, so that the work unit acquires the tasks contained in the subtask from the queue and executes them.
  • The second layer scheduler 12 is further configured to instruct the work unit to put the result of executing a task into another queue, or to output the result.
  • the second layer scheduler 12 is further configured to obtain progress information of the queue and the work unit to determine an execution progress of the task.
  • the embodiment of the present invention does not limit the specific form of the task.
  • the logical relationship setting or selection of the task may support user customization, such as receiving a user's settings or selections through a plugin mechanism.
  • the task processing system 10 of the embodiment of the present invention is applicable to a cloud computing architecture.
  • Cloud computing proposes a high-reliability, low-cost, on-demand, and resilient business model. Many systems can achieve high reliability, flexibility, and low cost by using cloud services.
  • FIG. 2 is a schematic diagram of a processing architecture of an embodiment of the present invention.
  • the processing architecture 20 of FIG. 2 is a cloud computing architecture, including the task processing system 10 of FIG.
  • the task processing system 10 of FIG. 2 can include a plurality of second layer schedulers 12.
  • The number of second-layer schedulers 12 is not limited to this example (there may be more or fewer).
  • Each second level scheduler 12 corresponds to a task to adapt or support different computing models.
  • the plurality of second layer schedulers 12 may also correspond to a task to achieve high concurrency of system scheduling.
  • If a suitable second layer scheduler 12 corresponding to the task exists among the running second layer schedulers 12, the first layer scheduler 11 can select it to process the task; if none exists, the first layer scheduler 11 can start a new suitable second layer scheduler 12 to process the task.
  • first layer scheduler 11 may be distributed to support high concurrency.
  • The first layer scheduler 11 can receive task requests sent by the web service 21, which is responsible for receiving and forwarding users' web (network) requests.
  • When the first layer scheduler 11 receives multiple task requests, it may also perform priority management on the tasks (e.g., prioritize them) and, according to priority, start or select the second layer scheduler 12 to process the tasks.
  • the first layer scheduler 11 may preferentially start or select the second layer scheduler 12 corresponding to the higher priority task.
  • the first layer scheduler 11 may implement additional functions such as priority adjustment of tasks.
  • the way of prioritization or adjustment supports user customization, such as receiving user settings through a plug-in mechanism.
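One plausible reading of the priority management described above is a priority queue inside the first layer scheduler, with the priority function standing in for a user-supplied plugin; this is an illustrative sketch, not the patented implementation:

```python
# Hypothetical sketch of first-layer priority management: pending task
# requests are ordered so the highest-priority task is dispatched first.
# The priority function is a placeholder for a user-customizable plugin.
import heapq

class PriorityDispatcher:
    def __init__(self, priority_fn):
        self.priority_fn = priority_fn
        self.pending = []
        self.counter = 0  # tie-breaker keeps insertion order stable

    def submit(self, task):
        # Negate priority so the largest priority pops first from the min-heap.
        heapq.heappush(self.pending, (-self.priority_fn(task), self.counter, task))
        self.counter += 1

    def next_task(self):
        return heapq.heappop(self.pending)[2]

d = PriorityDispatcher(lambda t: t["priority"])
d.submit({"name": "backup", "priority": 1})
d.submit({"name": "transcode", "priority": 5})
first = d.next_task()
```

Swapping in a different `priority_fn` models the plug-in customization mentioned above.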
  • the first layer scheduler 11 forwards the task request to the second layer scheduler 12 after starting or selecting the second layer scheduler 12 corresponding to the task.
  • the second layer scheduler 12 decomposes the task into multiple subtasks according to the logical relationship of the tasks, and manages the execution of the plurality of subtasks.
  • the execution of subtasks can be managed through a queue (distributed queue 22 as shown in Figure 2).
  • the distributed queue 22 may include a plurality of queues, respectively storing the tasks included in the corresponding subtasks.
  • the second layer scheduler 12 can create corresponding queues for multiple subtasks to store the sub-tasks.
  • the second layer scheduler 12 can organize the order of the queues according to the logical relationship of the tasks. For example, suppose the task consists of subtasks 1-3 (subtask 1 -> subtask 2 -> subtask 3) that are executed serially in sequence, and the second layer scheduler 12 can establish queues 1-3, respectively storing subtasks 1 -3 contains the tasks, and determines the order of queues 1-3, that is, the tasks included in the corresponding subtasks are executed in the order of queue 1 -> queue 2 -> queue 3.
  • The task execution result of subtask 1 is put into queue 2; the task execution result of subtask 2 is put into queue 3; and the task execution result of subtask 3 is output to an appropriate location, for example to the distributed storage device 24 shown in FIG. 2, or returned to the user.
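The queue ordering just described (queue 1 -> queue 2 -> queue 3, with each worker feeding the next queue and the final stage producing output) can be sketched minimally; the queue contents and stage functions are invented placeholders:

```python
# Minimal sketch of the "queue-worker" pipeline: each worker drains its
# queue and puts results into the next queue; the last stage outputs.
from collections import deque

queues = {1: deque(["raw"]), 2: deque(), 3: deque()}
output = []

def worker(stage, fn):
    while queues[stage]:
        item = queues[stage].popleft()
        result = fn(item)
        if stage + 1 in queues:
            queues[stage + 1].append(result)  # feed the next subtask's queue
        else:
            output.append(result)             # last stage: output the result

worker(1, lambda x: x + "-split")
worker(2, lambda x: x + "-encoded")
worker(3, lambda x: x + "-merged")
```

In the real architecture the queues would be the distributed queue 22 and the stage functions would be the user-developed processing programs.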
  • the second layer scheduler 12 may also apply for resources for the subtask, for example, requesting resources from the resource manager 25.
  • the resource manager 25 is responsible for satisfying the resource application and release of the scheduler 11 or 12.
  • the main functions of the resource manager 25 include resource management, resource matching, and automatic resource scaling.
  • the resource matching method can adopt a plug-in mechanism to support user customization.
  • Automatic resource scaling refers to automatically expanding or shrinking the cluster according to its load while keeping the cluster size within a configured range.
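The scaling rule might look like the following sketch; the load thresholds and size bounds are assumptions, since the patent leaves them unspecified:

```python
# Illustrative scaling decision: grow under heavy load, shrink when idle,
# and always stay within [min_nodes, max_nodes]. Thresholds are assumed.
def target_size(current, load, min_nodes=2, max_nodes=10):
    """load: a normalized utilization figure in [0, 1] (an assumption)."""
    if load > 0.8 and current < max_nodes:
        return current + 1   # expand under heavy load
    if load < 0.2 and current > min_nodes:
        return current - 1   # shrink when mostly idle
    return current

grown = target_size(4, 0.9)
shrunk = target_size(4, 0.1)
```

A real resource manager 25 would apply such a rule periodically against observed cluster load.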
  • Other implementations of the resource manager 25 can be referred to the prior art, and therefore will not be described again.
  • the resource manager 25 can also employ a distributed scheme to achieve high concurrency.
  • The second layer scheduler 12 may instruct the worker manager 26 on the granted resource to start the worker 27, so that the worker 27 retrieves the tasks contained in the subtask from the queue and executes them.
  • the worker manager 26 is responsible for the creation, deletion, and monitoring of the worker 27. There is a worker manager 26 on each node (physical or virtual machine) in the cloud computing architecture. Other implementations of the worker manager 26 can be referred to the prior art, and therefore will not be described again.
  • The worker 27 is responsible for acquiring the tasks contained in the user's subtasks from the corresponding queue in the distributed queue 22, performing preprocessing, and then calling the processing program developed by the user. After processing completes, according to the queue order determined by the second layer scheduler 12, the worker puts the execution result into another queue or outputs it. For other implementations of the worker 27, reference may be made to the prior art; details are not repeated here.
  • the second layer scheduler 12 can also implement other scheduling processes, such as task exception processing or task progress statistics.
  • The second-tier scheduler 12 can obtain progress information of the queues and workers (for example, whether or to what extent each subtask is completed, and whether or to what extent the tasks in each queue are completed) to determine the execution progress of the task. In this way, real-time query of task progress can be achieved.
  • The user can query the second layer scheduler 12 for the execution progress of the corresponding task.
  • The second layer scheduler 12 may also report the task's progress information to the first layer scheduler 11, so that the user can query the first layer scheduler 11 for the execution progress of the corresponding task, facilitating user monitoring.
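Aggregating per-queue progress into an overall task progress, as described above, could be sketched as follows; weighting every stage equally by task count is an assumption:

```python
# Illustrative progress aggregation: combine (done, total) counts from
# each subtask queue into one completion ratio for the whole task.
def task_progress(stages):
    """stages: list of (done, total) pairs, one per subtask queue."""
    done = sum(d for d, _ in stages)
    total = sum(t for _, t in stages)
    return done / total if total else 1.0

# Example: stage 1 finished, stage 2 half done, stage 3 not started.
progress = task_progress([(10, 10), (5, 10), (0, 10)])
```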
  • For the sake of clarity, three worker managers 26 and three corresponding workers 27 are illustrated in FIG. 2, but embodiments of the present invention are not limited to this specific example; the number of worker managers 26 and workers 27 may be more or fewer.
  • the cluster management software 28 is responsible for the automated deployment and basic monitoring of the clusters that handle parallel tasks.
  • the implementation manners can refer to the prior art, and therefore will not be described again.
  • the distributed queue 22, the database 23 (such as the nosql database), and the distributed storage device 24 implement the task storage, the database, and the file storage required for the processing architecture 20, and the specific implementation manners can also refer to the prior art, and therefore will not be described again.
  • database 23 can be used for information persistence storage to meet system operational needs or to implement fault tolerance.
  • The bottom layer of the processing architecture 20 supports various heterogeneous hardware, such as physical machines or virtual machines 29, and this is transparent to user applications.
  • the implementation of the physical machine or virtual machine 29 can be referred to the prior art, and therefore will not be described again.
  • the processing architecture 20 employs a "queue-worker” computing model, but embodiments of the invention are not limited thereto.
  • Other computing models can also be adopted for part of the processing architecture 20. In summary, the processing architecture 20 of the embodiment of the present invention adopts a two-layer scheduling architecture: the second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the architecture can be applied to different tasks, improving processing efficiency and scheduling flexibility.
  • multiple second-tier schedulers for different tasks can be started at the same time, further improving the concurrent performance.
  • the embodiment of the present invention provides a high-performance, flexible parallel processing architecture 20 that can support physical machines and currently popular cloud computing platforms, support large-scale clustering, support user scheduling policy configuration, and customize and support different computing models.
  • FIG. 3 is a flow chart of a distributed computing task processing method in accordance with an embodiment of the present invention.
  • the method of Fig. 3 can be performed by the task processing system 10 of Figs. 1 and 2, and therefore the method of Fig. 3 will be described below in conjunction with Figs. 1 and 2, and the repeated description will be appropriately omitted.
  • the first layer scheduler 11 starts or selects the second layer scheduler 12 corresponding to the task. For example, when there is no suitable second layer scheduler in the system, the first layer scheduler 11 can start the second layer scheduler 12 corresponding to the task. When a suitable second layer scheduler already exists in the system, the first layer scheduler 11 can select the second layer scheduler 12 corresponding to the task from among these suitable second layer schedulers.
  • the first layer scheduler 11 forwards the request to the second layer scheduler 12.
  • the second layer scheduler 12 decomposes the task into multiple subtasks according to the logical relationship of the task.
  • The embodiment of the invention adopts a two-layer scheduling architecture: the second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the method can be applied to different tasks, improving processing efficiency and scheduling flexibility.
  • As described above for the system of FIG. 1, the form of the task is not limited to the strict two steps of map and reduce: a task may be decomposed into three or more subtasks, and the logical relationship of the task may indicate the execution dependencies of those subtasks (serial, parallel, or partly parallel), so that even a task with many processing steps requires only one or a few task requests.
  • the second layer scheduler 12 may also create a corresponding queue for the plurality of subtasks to store the tasks included in the subtasks, and organize the order of the queues according to the logical relationship of the tasks.
  • When the tasks contained in a subtask are stored in the queue, the second layer scheduler 12 may also apply for resources for the subtask and instruct the worker manager on the granted resource to start a work unit, so that the work unit obtains the tasks contained in the subtask from the queue and executes them.
  • the second layer scheduler 12 may also instruct the work unit to put the result of executing the task into another queue or output the result of executing the task.
  • the second layer scheduler 12 may also obtain progress information of the queue and the work unit to determine the execution progress of the task. In this way, real-time query of task progress can be realized, which is convenient for user monitoring.
  • the first layer scheduler 11 may perform priority management on the task, and start or select the second layer scheduler 12 according to the priority to process the task.
  • Figure 4 is a schematic flow chart of the task processing procedure of one embodiment of the present invention.
  • the process of Fig. 4 can be performed by the processing architecture 20 of Fig. 2, and thus the repeated description is omitted as appropriate.
  • Assume the task consists of subtasks 1-3 executed serially in sequence (subtask 1 -> subtask 2 -> subtask 3).
  • embodiments of the present invention are not limited to this specific example, and any other type of task can similarly apply the processing of the embodiment of the present invention. Such applications are all within the scope of embodiments of the invention.
  • a web service receives a request submitted by a user to perform a task.
  • the logical relationship of tasks can be defined by the user.
  • the network service forwards the request to the first layer scheduler.
  • the first layer scheduler returns a response to the network service that successfully submits the request.
  • Step 403 is an optional step.
  • the first layer scheduler calculates the priority according to the priority calculation method and performs the sorting, and selects the task with a high priority.
  • According to the task selected in step 404, the first layer scheduler starts (if no second layer scheduler corresponding to the task's type exists at the time) or selects (if a second layer scheduler corresponding to the type is already available in the system) a suitable second layer scheduler.
  • the first layer scheduler forwards the task request to the second layer scheduler started or selected in step 405.
  • the second layer scheduler preprocesses the task according to the logical relationship of the task. Specifically, as a non-limiting example, the second layer scheduler can decompose the task into multiple subtasks (subtask 1, subtask 2, subtask 3).
  • The second layer scheduler creates queues 1-3 for subtasks 1-3 according to the "queue-worker" computing model. Optionally, the second layer scheduler may generate an initial subtask (corresponding to subtask 1) and place it in queue 1. According to the execution dependency among subtasks 1-3 ("subtask 1 -> subtask 2 -> subtask 3"), the second layer scheduler may determine the execution order of queues 1-3 as "queue 1 -> queue 2 -> queue 3".
  • The second layer scheduler finds that subtask 1 is in queue 1. For example, the second-tier scheduler can periodically check the queues for tasks; however, embodiments of the present invention are not limited to this, and the second layer scheduler may discover subtasks in the queues in other ways.
  • the second layer scheduler requests resources for the subtask 1 from the resource manager.
  • The second layer scheduler instructs the worker manager on the granted resource to start a worker to process subtask 1 in queue 1.
  • the worker manager starts the worker, and tells the worker to put the result obtained by processing the subtask 1 (corresponding to the subtask 2) into the queue 2.
  • The second layer scheduler finds that subtask 2 is in queue 2.
  • the second layer scheduler requests resources for the subtask 2 from the resource manager.
  • the second layer scheduler instructs the worker manager to start the worker to process the subtask 2 in queue 2.
  • the worker manager starts the worker, and tells the worker to put the result obtained by processing the subtask 2 (corresponding to the subtask 3) into the queue 3.
  • 418: After the worker starts, it automatically goes to queue 2 to acquire and execute the tasks contained in subtask 2. After execution completes, it puts the execution result (corresponding to subtask 3) into queue 3.
  • The second layer scheduler finds that subtask 3 is in queue 3.
  • the second layer scheduler requests resources for the subtask 3 from the resource manager.
  • the second layer scheduler instructs the worker manager to start the worker to process the subtask 3 in the queue 3.
  • the worker manager starts the worker, and tells the worker to put the result of processing the subtask 3 into a suitable location (such as putting it into a distributed storage device or returning it to the user).
  • Through steps 409-423, workers automatically take subtasks from the queues and execute them, so that the entire task is processed according to the logical relationship defined for the task.
  • Although steps 409-423 are depicted as executed serially in the embodiment of FIG. 4, embodiments of the present invention are not limited thereto.
  • The execution order of steps 409-413, steps 414-418, and steps 419-423 may be swapped or overlapped. For example, if subtask 2 in queue 2 does not depend on the execution results of all of subtask 1 in queue 1, the worker of queue 2 can work while the worker of queue 1 is working; the worker of queue 2 can be started before the worker of queue 1 has finished executing all of subtask 1.
  • The second layer scheduler can obtain the progress information of the queues or workers and judge the execution progress of the entire task. The progress may be reported to the first layer scheduler, and the user can also query the second layer scheduler directly for the task's progress in real time.
  • The embodiment of the present invention adopts a two-layer scheduling architecture: the second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that it is applicable to different tasks, improves processing efficiency and scheduling flexibility, and can meet the needs of a variety of parallel processing services.
  • the disclosed systems, devices, and methods may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of units is only a logical functional division; in actual implementation there may be other division manners. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be electrical, mechanical or otherwise.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium.
  • the technical solution of the present invention, or the part of it that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium, including several instructions used to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
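The queue-by-queue flow of steps 409-423 above can be sketched as a toy model. This is illustrative only, not the patented implementation: the "workers" are plain functions, the queues are in-process deques, and all names are invented.

```python
from collections import deque

def run_task(initial_tasks, stages):
    """stages: one processing function per subtask level; each worker puts
    its result into the next queue, mirroring subtask 1 -> queue 2 -> ..."""
    queues = [deque(initial_tasks)] + [deque() for _ in stages]
    for i, process in enumerate(stages):
        while queues[i]:                         # scheduler finds subtasks in queue i
            item = queues[i].popleft()           # worker fetches a task from queue i
            queues[i + 1].append(process(item))  # result goes into queue i+1
    return list(queues[-1])                      # final results (e.g. to storage or user)

# Example: a three-stage task processed strictly queue by queue
results = run_task([1, 2, 3],
                   stages=[lambda x: x * 10,        # subtask 1
                           lambda x: x + 1,         # subtask 2
                           lambda x: f"done:{x}"])  # subtask 3
```

The serial loop here corresponds to the depicted ordering; as the text notes, stages whose subtasks are not fully dependent could instead run overlapped.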

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the present invention provide a task processing system and a task processing method for distributed computation. The system comprises: a first level scheduler, used for receiving a request to execute a task, starting or selecting a second level scheduler corresponding to the task, and forwarding the request to the second level scheduler; and the second level scheduler, used for decomposing the task into a plurality of subtasks according to a logical relationship of the task when receiving the request forwarded by the first level scheduler. The embodiments of the invention employ a two-level scheduling framework, the second level scheduler corresponding to the task and the first level scheduler starting or selecting the second level scheduler corresponding to the task, so that the task processing system and the task processing method can be applied to different tasks, and processing efficiency and scheduling flexibility are improved.

Description

Distributed computing task processing system and task processing method

Technical Field

Embodiments of the present invention relate to the field of network communications, and more particularly, to a distributed computing task processing system and a task processing method.

Background
At present, with the development of the Internet, the demand for rapid processing of large amounts of information has become urgent, so parallel processing of data has become very important. The distributed computing environment provides an effective means for resource sharing and interoperability across different software and hardware platforms in a network environment, and has become a common architecture for parallel processing. The parallel processing system currently well known in the industry uses the MapReduce architecture. MapReduce is a distributed computing software framework that supports distributed processing of large data volumes. The architecture originates from the map and reduce functions of functional programming: map processes the original documents according to user-defined mapping rules and outputs intermediate results; reduce merges the intermediate results according to user-defined reduction rules.
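As a concrete illustration of the map and reduce functions described above, a minimal word-count sketch (not part of the original text; the mapping and reduction rules shown here are just one example of "custom rules"):

```python
from itertools import groupby

# Map: process each input document and emit (word, 1) intermediate pairs.
def map_phase(docs):
    return [(word, 1) for doc in docs for word in doc.split()]

# Reduce: merge the intermediate pairs that share a key.
def reduce_phase(pairs):
    pairs = sorted(pairs)  # the shuffle/sort step groups identical keys together
    return {key: sum(v for _, v in grp)
            for key, grp in groupby(pairs, key=lambda p: p[0])}

counts = reduce_phase(map_phase(["a b a", "b c"]))
```

In a real MapReduce deployment the map and reduce calls run on many worker nodes; this sketch only shows the data flow between the two steps.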
In a distributed computing environment, the general MapReduce architecture includes a scheduling node and multiple worker nodes. The scheduling node is responsible for task scheduling and resource management: according to the user configuration, it decomposes the tasks submitted by the user into map and reduce subtasks, and assigns the map and reduce subtasks to worker nodes. The worker nodes run the map and reduce subtasks and maintain communication with the scheduling node.

In this parallel processing architecture, a single scheduling node is responsible for both task and resource management, and task processing must strictly follow the two-step map-then-reduce order. If a job involves many processing steps, it must be completed by submitting many task requests, so processing efficiency is low and scheduling is not flexible enough.

Summary

Embodiments of the present invention provide a task processing system and a task processing method, which can solve the processing-efficiency problem of the existing parallel processing architecture.
In one aspect, a distributed computing task processing system is provided, including: a first layer scheduler, configured to receive a request to execute a task, start or select a second layer scheduler corresponding to the task, and forward the request to the second layer scheduler; and the second layer scheduler, configured to decompose the task into multiple subtasks according to the logical relationship of the task upon receiving the request forwarded by the first layer scheduler.

In another aspect, a distributed computing task processing method is provided. The method includes: upon receiving a request to execute a task, the first layer scheduler starts or selects a second layer scheduler corresponding to the task; the first layer scheduler forwards the request to the second layer scheduler; and upon receiving the request forwarded by the first layer scheduler, the second layer scheduler decomposes the task into multiple subtasks according to the logical relationship of the task.
The embodiment of the present invention adopts a two-layer scheduling architecture: the second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the system and method can be applied to different tasks, improving processing efficiency and scheduling flexibility.

Brief Description of the Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention, and a person of ordinary skill in the art may obtain other drawings from these drawings without creative effort.
FIG. 1 is a block diagram of a task processing system according to an embodiment of the present invention.

FIG. 2 is a schematic diagram of a processing architecture according to an embodiment of the present invention.

FIG. 3 is a flowchart of a task processing method according to an embodiment of the present invention.

FIG. 4 is a schematic flowchart of a task processing procedure according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
FIG. 1 is a block diagram of a distributed computing task processing system according to an embodiment of the present invention. The task processing system 10 of FIG. 1 includes a two-layer scheduler, namely a first layer scheduler 11 and a second layer scheduler 12.

The first layer scheduler 11 receives a request to execute a task, starts or selects the second layer scheduler 12 corresponding to the task, and forwards the request to the second layer scheduler 12.

For example, when there is no suitable second layer scheduler in the system, the first layer scheduler 11 may start a second layer scheduler 12 corresponding to the task. When suitable second layer schedulers already exist in the system, the first layer scheduler 11 may select the second layer scheduler 12 corresponding to the task from among them. Optionally, the first layer scheduler is further configured to perform priority management on tasks, and to start or select the second layer scheduler to process a task according to its priority.

Upon receiving the request forwarded by the first layer scheduler 11, the second layer scheduler 12 decomposes the task into multiple subtasks according to the logical relationship of the task.

The embodiment of the present invention adopts a two-layer scheduling architecture: the second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the architecture can be applied to different tasks, improving processing efficiency and scheduling flexibility.
In the existing parallel processing architecture there is only one layer of scheduling, so task processing must strictly follow the two map and reduce steps; the embodiment of the present invention has no such limitation. The first layer scheduler 11 of the embodiment of the present invention can accept tasks of various forms, and the form of a task is not limited to the strict two-step map and reduce of the prior art. The second layer scheduler 12 corresponds to the task, so that the first layer scheduler 11 can send different tasks to the corresponding second layer schedulers 12 for scheduling. The second layer scheduler 12 decomposes a task into subtasks in order to process it, for example by scheduling the execution of each subtask. Such scheduling offers greater flexibility.

In addition, in the prior art, task processing must strictly follow the map-then-reduce two-step order; if a job involves many processing steps, it must be completed by submitting many task requests, and processing efficiency is low. The embodiment of the present invention has no such limitation and does not restrict the task itself or the manner of executing tasks or subtasks. For example, the number of subtasks included in a task can exceed the two (map and reduce) of the prior art, such as three or more subtasks, and the subtasks are not limited to the form of map and reduce subtasks. In addition, the subtasks need not follow a strict sequence: they may be executed in parallel, serially, or partly in parallel and partly serially. In this way, even a job with many processing steps requires only a small number of task requests, which improves processing efficiency.

The number of subtasks depends on the specific task, for example transcoding or face recognition services. Depending on the logical relationship of the tasks, these tasks may have the same or different numbers of subtasks. Optionally, as an embodiment, the logical relationship of a task may be carried in the task's description file. For example, the task processing system 10 (specifically, for example, the second layer scheduler 12) may receive a description file uploaded by the user, such as a description file in XML (Extensible Markup Language) format, which carries the logical relationship of the task.
Optionally, upon receiving the request forwarded by the first layer scheduler 11, the second layer scheduler 12 obtains the description file in XML format corresponding to the task, and decomposes the task into multiple subtasks according to the logical relationship of the task carried in that description file. In addition, each subtask can be further decomposed into subtasks of smaller granularity. That is, the subtasks in the embodiment of the present invention may be multi-level subtasks, and the manner of further decomposition at each level can be determined by the logical relationship carried in the description file. For example, subtask 1 can be decomposed into multiple subtasks 2, and subtask 2 can be further decomposed into multiple subtasks 3, and so on.
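As an illustration of such a description file, the following sketch parses a hypothetical XML format into (subtask, dependency) pairs. The schema here is invented for the example; the text does not fix any particular XML layout:

```python
import xml.etree.ElementTree as ET

# Hypothetical description-file schema: each <subtask> element may name
# the subtask whose results it depends on via a "depends" attribute.
desc = """
<task name="transcode">
  <subtask name="subtask1"/>
  <subtask name="subtask2" depends="subtask1"/>
  <subtask name="subtask3" depends="subtask2"/>
</task>
"""

root = ET.fromstring(desc)
# The second layer scheduler could read pairs like these to decompose the task.
subtasks = [(s.get("name"), s.get("depends")) for s in root.findall("subtask")]
```

Nested `<subtask>` elements could likewise express the multi-level decomposition mentioned above (subtask 1 containing subtasks 2, and so on).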
Optionally, as an embodiment, the logical relationship of the task may indicate the execution dependencies among the multiple subtasks. An execution dependency refers to whether the execution operations of the subtasks depend on one another.

For example, if subtask 2 must depend on the execution result of subtask 1, then subtask 2 should be executed after subtask 1 (that is, subtask 1 and subtask 2 must be executed serially). On the other hand, if subtask 2 does not depend on all execution results of subtask 1, subtask 1 and subtask 2 may be executed either in parallel or serially.

A non-limiting example of an execution dependency: two or more of the multiple subtasks are executed in serial, parallel, or partly parallel and partly serial order, not limited to the two map and reduce steps of the prior art. In this way, if a task involves many processing steps, there is no need to submit many task requests as in the MapReduce architecture; the embodiment of the present invention may need to submit only one or a few task requests, thereby improving task processing efficiency.
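One way to realize such execution dependencies is to derive, from the declared dependencies, batches of subtasks that may run in parallel while the batches themselves run serially. The sketch below is illustrative only and is not taken from the text:

```python
def execution_batches(deps):
    """deps: {subtask: prerequisites}. Returns batches of subtasks that may
    run in parallel; the batches themselves must run serially, in order."""
    deps = {k: set(v) for k, v in deps.items()}
    batches = []
    while deps:
        ready = sorted(k for k, v in deps.items() if not v)  # no unmet prerequisites
        if not ready:
            raise ValueError("cyclic dependency in task description")
        batches.append(ready)
        for k in ready:
            del deps[k]
        for v in deps.values():
            v.difference_update(ready)  # mark the finished batch as satisfied
    return batches

# s2 and s3 both depend only on s1, so they form one parallel batch.
batches = execution_batches({"s1": [], "s2": ["s1"], "s3": ["s1"], "s4": ["s2", "s3"]})
```

A purely serial chain yields one subtask per batch; independent subtasks collapse into a single batch, matching the "parallel, serial, or partly parallel and partly serial" cases above.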
The logical relationship of a task may explicitly indicate the execution dependencies among subtasks, for example explicitly stating that the task consists of subtasks 1-3 executed serially in sequence. Alternatively, the logical relationship may indicate the execution dependencies implicitly; for example, for a particular task, the system knows in advance that the task consists of subtasks 1-3 executed serially in sequence.
Optionally, as another embodiment, the second layer scheduler 12 is further configured to create corresponding queues for the multiple subtasks to store the tasks included in the subtasks. When the tasks included in a subtask are stored in a queue, the second layer scheduler 12 may further apply for resources for the subtask, and instruct the work unit manager of the requested resources to start a work unit, so that the work unit fetches the tasks included in the subtask from the queue and executes them. Optionally, as another embodiment, the second layer scheduler 12 may further instruct the work unit to put the result of executing a task into another queue or to output the result.
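The queue-and-work-unit interplay described above can be sketched with two overlapped stages: the work unit of the second queue starts consuming as soon as results appear, without waiting for the first stage to finish entirely. This is a simplified in-process illustration (thread-based, with an invented end-of-stage sentinel), not the distributed implementation:

```python
import threading, queue

q1, q2, out = queue.Queue(), queue.Queue(), queue.Queue()
DONE = object()  # sentinel marking the end of a stage's input

def worker(src, dst, fn):
    # A work unit: repeatedly fetch from its queue, process, and put the
    # result into the next queue, until the sentinel arrives.
    while True:
        item = src.get()
        if item is DONE:
            dst.put(DONE)
            break
        dst.put(fn(item))

t1 = threading.Thread(target=worker, args=(q1, q2, lambda x: x * 2))  # subtask 1
t2 = threading.Thread(target=worker, args=(q2, out, lambda x: x + 1))  # subtask 2
t1.start(); t2.start()
for item in [1, 2, 3]:
    q1.put(item)
q1.put(DONE)
t1.join(); t2.join()

results = []
while True:
    item = out.get()
    if item is DONE:
        break
    results.append(item)
```

Because each stage has a single worker draining a FIFO queue, the final results preserve input order even though the two stages run concurrently.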
Further, as another embodiment, the second layer scheduler 12 may also obtain progress information of the queues and the work units to determine the execution progress of the task.

In short, the embodiment of the present invention does not limit the specific form of the task. Optionally, setting or selecting the logical relationship of a task may support user customization, for example by receiving the user's settings or selections through a plug-in mechanism. The task processing system 10 of the embodiment of the present invention is applicable to a cloud computing architecture. Cloud computing offers a highly reliable, low-cost, on-demand, elastic business model, and many systems can achieve high reliability, elasticity, and low cost by using cloud services.

FIG. 2 is a schematic diagram of a processing architecture according to an embodiment of the present invention. The processing architecture 20 of FIG. 2 is a cloud computing architecture that includes the task processing system 10 of FIG. 1. The difference from FIG. 1 is that the task processing system 10 of FIG. 2 may include multiple second layer schedulers 12. For brevity, only two second layer schedulers 12 are depicted in FIG. 2, but the number of second layer schedulers 12 is not limited by this example (there may be more or fewer). Each second layer scheduler 12 corresponds to one kind of task, in order to adapt to or support different computing models. Optionally, multiple second layer schedulers 12 may also correspond to the same kind of task, to achieve high concurrency of system scheduling. If a suitable second layer scheduler 12 corresponding to the task exists among the existing second layer schedulers 12, the first layer scheduler 11 may select that scheduler to process the task; if no suitable second layer scheduler 12 exists, the first layer scheduler 11 may start a new suitable second layer scheduler 12 to process the task.

In the processing architecture 20, the first layer scheduler 11 may be distributed, to support high concurrency. The first layer scheduler 11 may receive task requests sent by a web service 21. The web service 21 is responsible for receiving and forwarding the user's web requests; its specific implementation may follow the prior art and is therefore not described again.
Optionally, as an embodiment, when the first layer scheduler 11 receives multiple task requests, it may also perform priority management on the tasks (for example, prioritize them), and start or select second layer schedulers 12 to process the tasks according to priority. For example, the first layer scheduler 11 may preferentially start or select the second layer scheduler 12 corresponding to a higher-priority task.
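A minimal sketch of such priority management, assuming smaller numbers mean higher priority (a convention chosen here only for illustration):

```python
import heapq

# The first-layer scheduler keeps pending requests in a priority heap and
# dispatches the highest-priority task (smallest number) first.
pending = []
for prio, task in [(2, "face-recognition"), (1, "transcode"), (3, "indexing")]:
    heapq.heappush(pending, (prio, task))

dispatch_order = [heapq.heappop(pending)[1] for _ in range(len(pending))]
```

A user-supplied plug-in, as mentioned below, could replace the comparison rule used to order the heap.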
Optionally, as another embodiment, the first layer scheduler 11 may implement additional functions such as adjusting task priorities. The manner of prioritization or adjustment may support user customization, for example by receiving the user's settings through a plug-in mechanism.

After starting or selecting the second layer scheduler 12 corresponding to a task, the first layer scheduler 11 forwards the task request to that second layer scheduler 12. The second layer scheduler 12 decomposes the task into multiple subtasks according to the logical relationship of the task, and manages the execution of the subtasks. Optionally, as an embodiment, the execution of subtasks may be managed through queues (the distributed queue 22 shown in FIG. 2). The distributed queue 22 may include multiple queues, each storing the tasks included in the corresponding subtask.

Specifically, the second layer scheduler 12 may create corresponding queues for the multiple subtasks to store the tasks included in the subtasks, and may arrange the order of the queues according to the logical relationship of the task. For example, suppose a task consists of subtasks 1-3 executed serially in sequence (subtask 1 -> subtask 2 -> subtask 3). The second layer scheduler 12 can establish queues 1-3 to store the tasks included in subtasks 1-3 respectively, and determine the order of the queues, that is, execute the tasks included in the corresponding subtasks in the order queue 1 -> queue 2 -> queue 3. The execution result of subtask 1 is put into queue 2, the execution result of subtask 2 is put into queue 3, and the execution result of subtask 3 is output to a suitable location, for example to the distributed storage device 24 shown in FIG. 2, or returned to the user.

Optionally, as another embodiment, when the tasks included in a subtask are stored in the distributed queue 22, the second layer scheduler 12 may also apply for resources for the subtask, for example from the resource manager 25. The resource manager 25 is responsible for satisfying the resource applications and releases of the schedulers 11 and 12. Its main functions include resource management, resource matching, and automatic resource scaling; the resource matching method can adopt a plug-in mechanism to support user customization. Automatic resource scaling means that when the user configures the cluster size within a range, the cluster can be automatically expanded or shrunk according to the cluster load. Other implementations of the resource manager 25 may follow the prior art and are therefore not described again; for example, the resource manager 25 may also adopt a distributed scheme to achieve high concurrency.
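The automatic resource scaling rule can be sketched as follows; only the user-configured [min, max] range comes from the text, while the load thresholds and step size are invented for illustration:

```python
def target_size(current, load, min_nodes, max_nodes,
                grow_above=0.8, shrink_below=0.3):
    """Grow or shrink the cluster by one node based on load, but always
    stay within the user-configured [min_nodes, max_nodes] range."""
    if load > grow_above:
        current += 1
    elif load < shrink_below:
        current -= 1
    return max(min_nodes, min(max_nodes, current))

sizes = [target_size(5, 0.9, 2, 8),    # busy cluster: grow by one
         target_size(5, 0.1, 2, 8),    # idle cluster: shrink by one
         target_size(8, 0.95, 2, 8)]   # already at the maximum: capped
```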
After establishing queues for the subtasks and applying for resources, the second layer scheduler 12 may instruct the worker manager 26 of the requested resources to start a worker 27, so that the worker 27 fetches the tasks included in the subtask from the queue and executes them. The worker manager 26 is responsible for the creation, deletion, and monitoring of workers 27; there is a worker manager 26 on each node (physical machine or virtual machine) in the cloud computing architecture. Other implementations of the worker manager 26 may follow the prior art and are therefore not described again.

The worker 27 is responsible for fetching the tasks included in the user's subtask from the corresponding queue in the distributed queue 22, performing preprocessing, and then invoking the processing program developed by the user. After processing is complete, the worker may, in the queue order determined by the second layer scheduler 12, put the result of the executed task into another queue or output it. Other implementations of the worker 27 may follow the prior art and are therefore not described again.

In addition, the second layer scheduler 12 may implement other scheduling processing, such as task exception handling or task progress statistics. For example, the second layer scheduler 12 may obtain progress information of the queues and workers (such as whether and how far each subtask has been completed, whether and how far the subtasks in each queue have been completed, and so on) to determine the execution progress of the task. In this way, real-time queries of task progress can be supported: for example, the user can query the second layer scheduler 12 for the execution progress of the corresponding task. Alternatively, the second layer scheduler 12 may report the progress information of the task to the first layer scheduler 11, so that the user can query the first layer scheduler 11 for the execution progress of the corresponding task, which facilitates user monitoring.
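A simple way to aggregate such progress information, assuming the scheduler can observe per-queue completed/total counts (the field layout is invented here):

```python
def task_progress(queue_stats):
    """queue_stats: one (completed, total) pair per subtask queue.
    Returns the overall fraction of the task's work that is finished."""
    done = sum(c for c, _ in queue_stats)
    total = sum(t for _, t in queue_stats)
    return done / total if total else 1.0

# Queue 1 finished, queue 2 halfway, queue 3 not started: 15 of 30 units done.
progress = task_progress([(10, 10), (5, 10), (0, 10)])
```

The second layer scheduler could answer real-time queries with such a value, or report it upward to the first layer scheduler.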
For brevity, three worker managers 26 and three corresponding workers 27 are illustrated in FIG. 2, but the embodiment of the present invention is not limited to this specific example; the number of worker managers 26 and workers 27 may be larger or smaller.

The cluster management software 28 is responsible for the automated deployment and basic monitoring of the cluster that handles parallel tasks; its implementation may follow the prior art and is therefore not described again.

The distributed queue 22, the database 23 (such as a NoSQL database), and the distributed storage device 24 provide the task storage, database, and file storage required by the processing architecture 20; their specific implementations may also follow the prior art and are therefore not described again. For example, the database 23 can be used for persistent storage of information, to meet system operation needs or to implement fault tolerance.

The bottom layer of the processing architecture 20 supports various heterogeneous hardware, such as physical machines or virtual machines 29, with which user applications need not be concerned. The implementation of the physical machines or virtual machines 29 may follow the prior art and is therefore not described again.

The processing architecture 20 adopts a "queue-worker" computing model, but the embodiment of the present invention is not limited thereto; the processing architecture 20 may also adopt other computing models, for example, part of the processing architecture 20. Therefore, the processing architecture 20 of the embodiment of the present invention adopts a two-layer scheduling architecture: the second layer scheduler corresponds to the task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the architecture can be applied to different tasks, improving processing efficiency and scheduling flexibility. Moreover, through the above "queue-worker" computing model, multiple second layer schedulers for different tasks can be started simultaneously, further improving concurrent performance.

In addition, the embodiment of the present invention provides a high-performance, flexible parallel processing architecture 20 that can support physical machines as well as currently popular cloud computing platforms, support large-scale clusters, support user scheduling policy configuration and customization, and support different computing models.

FIG. 3 is a flowchart of a distributed computing task processing method according to an embodiment of the present invention. The method of FIG. 3 may be performed by the task processing system 10 of FIG. 1 and FIG. 2; therefore, the method of FIG. 3 is described below with reference to FIG. 1 and FIG. 2, with repeated descriptions appropriately omitted.
301. Upon receiving a request to execute a task, the first layer scheduler 11 starts or selects the second layer scheduler 12 corresponding to the task. For example, when there is no suitable second layer scheduler in the system, the first layer scheduler 11 may start the second layer scheduler 12 corresponding to the task. When suitable second layer schedulers already exist in the system, the first layer scheduler 11 may select the second layer scheduler 12 corresponding to the task from among them.

302. The first layer scheduler 11 forwards the request to the second layer scheduler 12.

303. Upon receiving the request forwarded by the first layer scheduler, the second layer scheduler 12 decomposes the task into multiple subtasks according to the logical relationship of the task.

The embodiment of the present invention adopts a two-layer scheduling architecture: the second layer scheduler corresponds to a task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task, so that the architecture can be applied to different tasks, improving processing efficiency and scheduling flexibility.
In the existing parallel processing architecture there is only one layer of scheduling, so task processing must strictly follow the two steps map and reduce in sequence; the embodiments of the present invention have no such limitation. The first layer scheduler 11 of the embodiment of the present invention can accept tasks in various forms, which need not be limited to the strict map and reduce steps of the prior art. The second layer scheduler 12 corresponds to a task, so that the first layer scheduler 11 can send different tasks to the corresponding second layer schedulers 12 for scheduling. The second layer scheduler 12 decomposes a task into subtasks in order to process it, for example by scheduling the execution of each subtask. Such scheduling offers greater flexibility.

In addition, the prior art requires task processing to strictly follow the sequence of the two steps map and reduce. If the processing involves many steps, it must be completed by submitting many task requests, which is inefficient. The embodiments of the present invention have no such limitation and do not restrict the task itself or the manner of executing the task or its subtasks. For example, a task may contain more subtasks than the two (map and reduce) of the prior art, such as three or more subtasks, and these are not limited to the form of map and reduce subtasks. Moreover, subtasks need not follow a strict sequence: they may be executed in parallel, serially, or partly in parallel and partly serially. In this way, even processing with many steps requires only a small number of task requests, improving processing efficiency.

Optionally, as an embodiment, the logical relationship of a task may indicate the execution dependencies of the multiple subtasks. An execution dependency refers to whether the execution operations of the individual subtasks depend on one another.

For example, suppose subtask 2 must depend on the execution result of subtask 1; then subtask 2 should be executed after subtask 1 (that is, subtask 1 and subtask 2 must be executed serially). On the other hand, if subtask 2 does not depend on the entire execution result of subtask 1, subtask 1 and subtask 2 may be executed either in parallel or serially. A non-limiting example of an execution dependency may include: two or more of the multiple subtasks being executed in a serial, parallel, or partly parallel and partly serial order, without being limited to the two steps map and reduce of the prior art. In this way, if a task involves many processing steps, there is no need to submit many task requests as in the MapReduce architecture; an embodiment of the present invention may need to submit only one or a few task requests, thereby improving task processing efficiency.
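The dependency-driven ordering described above can be illustrated with a small sketch (a hypothetical illustration; the function and subtask names are assumptions, not part of the embodiment). It groups subtasks into "waves": subtasks within one wave may run in parallel, while the waves themselves run serially.

```python
# Hypothetical sketch: derive an execution order from subtask dependencies.
# "deps" maps each subtask to the set of subtasks it depends on.

def execution_waves(deps):
    """Group subtasks into waves; subtasks in one wave may run in
    parallel, and the waves themselves run serially."""
    remaining = {t: set(d) for t, d in deps.items()}
    waves = []
    while remaining:
        ready = [t for t, d in remaining.items() if not d]
        if not ready:
            raise ValueError("cyclic dependency among subtasks")
        waves.append(sorted(ready))
        for t in ready:
            del remaining[t]
        for d in remaining.values():
            d.difference_update(ready)
    return waves

# Serial chain: subtask 2 depends on subtask 1, subtask 3 on subtask 2.
serial = execution_waves({"sub1": [], "sub2": ["sub1"], "sub3": ["sub2"]})
# Mixed: subtasks 1 and 2 are independent, so they share the first wave.
mixed = execution_waves({"sub1": [], "sub2": [], "sub3": ["sub1", "sub2"]})
```

With the serial chain the sketch yields one wave per subtask, while the mixed case places the two independent subtasks into the same (parallel) wave, matching the serial/parallel/mixed orders discussed above.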
Optionally, as another embodiment, the second layer scheduler 12 may also create corresponding queues for the multiple subtasks to store the tasks contained in the subtasks, and arrange the order of the queues according to the logical relationship of the task.

Optionally, as another embodiment, when tasks contained in a subtask are stored in a queue, the second layer scheduler 12 may also apply for resources for the subtask and instruct the work unit (worker) manager of the requested resources to start a work unit, so that the work unit fetches the tasks contained in the subtask from the queue and executes them.

Optionally, as another embodiment, the second layer scheduler 12 may also instruct the work unit to put the result of executing a task into another queue, or to output the result of executing the task.
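The work-unit behavior described above — fetching a subtask's tasks from its queue, executing them, and depositing each result into another queue — can be sketched as follows. This is a hypothetical illustration using Python's standard queues and threads; the names and interface are assumptions, not the embodiment's actual API.

```python
import queue
import threading

# Hypothetical sketch of a work unit (worker): it fetches the tasks of one
# subtask from its input queue, executes them, and puts each result into
# the next queue, where the result becomes the next subtask's input.

def worker(in_q, out_q, fn):
    while True:
        task = in_q.get()
        if task is None:          # sentinel: the subtask's queue is drained
            break
        out_q.put(fn(task))       # result feeds the next subtask

queue1, queue2 = queue.Queue(), queue.Queue()
w = threading.Thread(target=worker, args=(queue1, queue2, lambda x: x * 2))
w.start()
for task in (1, 2, 3):
    queue1.put(task)              # tasks contained in subtask 1
queue1.put(None)                  # signal that no more tasks will arrive
w.join()
results = sorted(queue2.get() for _ in range(3))   # -> [2, 4, 6]
```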
Optionally, as another embodiment, the second layer scheduler 12 may also obtain progress information of the queues and work units to determine the execution progress of the task. In this way, real-time querying of task progress can be achieved, which is convenient for user monitoring.
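One simple way such progress could be derived from queue and worker information is sketched below; the counting scheme and names are invented purely for illustration and are not prescribed by the embodiment.

```python
# Hypothetical sketch: estimate overall task progress from how many task
# units are still queued, how many are being executed by workers, and how
# many are already done.

def task_progress(queued, in_flight, done):
    """Return the finished fraction of the task, or 0.0 if nothing exists yet."""
    total = queued + in_flight + done
    return done / total if total else 0.0

progress = task_progress(queued=2, in_flight=1, done=1)   # 1 of 4 units done
```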
Optionally, as another embodiment, in step 301 the first layer scheduler 11 may perform priority management on tasks, and start or select the second layer scheduler 12 to process tasks according to their priorities.
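Such priority management might look like the following sketch, in which received tasks are ranked and the highest-priority task is dispatched first. The class, method names, and tie-breaking rule are invented for illustration; the embodiment does not prescribe a particular priority calculation method.

```python
import heapq

# Hypothetical sketch of priority management in the first layer scheduler:
# tasks are kept in a heap ordered by priority, with FIFO order used as a
# tie-breaker among tasks of equal priority.

class FirstLayerScheduler:
    def __init__(self):
        self._heap = []
        self._seq = 0          # monotonically increasing tie-breaker

    def submit(self, task, priority):
        # heapq is a min-heap, so the priority is negated to make the
        # highest-priority task pop first.
        heapq.heappush(self._heap, (-priority, self._seq, task))
        self._seq += 1

    def next_task(self):
        return heapq.heappop(self._heap)[2]

sched = FirstLayerScheduler()
sched.submit("log-analysis", priority=1)
sched.submit("billing", priority=5)
sched.submit("report", priority=1)
order = [sched.next_task() for _ in range(3)]
# order -> ["billing", "log-analysis", "report"]
```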
Embodiments of the present invention are described in more detail below with reference to specific examples. FIG. 4 is a schematic flowchart of a task processing procedure according to an embodiment of the present invention. For example, the procedure of FIG. 4 may be performed by the processing architecture 20 of FIG. 2; repeated descriptions are therefore appropriately omitted.

In the example of FIG. 4, it is assumed that the task consists of subtasks 1-3 executed serially in sequence (subtask 1 -> subtask 2 -> subtask 3). However, the embodiments of the present invention are not limited to this specific example; the processing of the embodiments of the present invention can be applied similarly to any other type of task, and such applications all fall within the scope of the embodiments of the present invention.

401. A web service receives a request submitted by a user to execute a task. The logical relationship of the task may be defined by the user.

402. The web service forwards the request to the first layer scheduler.

403. The first layer scheduler returns to the web service a response indicating that the request was submitted successfully. Step 403 is optional.

404. The first layer scheduler computes priorities for the received tasks according to a priority calculation method, sorts them, and selects a task with a high priority.

405. According to the task selected in step 404, the first layer scheduler starts (if no second layer scheduler of the corresponding type exists in the system at the time) or selects (if a second layer scheduler of the corresponding type is available in the system) a suitable second layer scheduler.

406. The first layer scheduler forwards the task request to the second layer scheduler started or selected in step 405.

407. After receiving the task request, the second layer scheduler preprocesses the task according to its logical relationship. Specifically, as a non-limiting example, the second layer scheduler may decompose the task into multiple subtasks (subtask 1, subtask 2, subtask 3).

408. The second layer scheduler creates queues 1-3 for subtasks 1-3 according to the "queue-worker" computing model. Optionally, the second layer scheduler may generate an initial subtask (corresponding to subtask 1) and put it into queue 1. At this point, the second layer scheduler may arrange the execution order of queues 1-3 as "queue 1 -> queue 2 -> queue 3" according to the execution dependency "subtask 1 -> subtask 2 -> subtask 3" among subtasks 1-3.
409. The second layer scheduler discovers that there is a subtask 1 in queue 1. For example, the second layer scheduler may periodically check the queues to see whether they contain tasks. However, the embodiments of the present invention are not limited thereto; the second layer scheduler may discover subtasks in the queues in other ways.

410. The second layer scheduler applies to the resource manager for resources for subtask 1.

411. The second layer scheduler instructs the work unit (worker) manager of the requested resources to start a worker to process subtask 1 in queue 1.

412. The worker manager starts the worker and tells it to put the result of processing subtask 1 (corresponding to subtask 2) into queue 2.

413. After starting, the worker automatically goes to queue 1, fetches and executes the tasks contained in subtask 1, and upon completion puts the execution result (corresponding to subtask 2) into queue 2.

414. The second layer scheduler discovers that there is a subtask 2 in queue 2.

415. The second layer scheduler applies to the resource manager for resources for subtask 2.

416. The second layer scheduler instructs the work unit (worker) manager to start a worker to process subtask 2 in queue 2.

417. The worker manager starts the worker and tells it to put the result of processing subtask 2 (corresponding to subtask 3) into queue 3.

418. After starting, the worker automatically goes to queue 2, fetches and executes the tasks contained in subtask 2, and upon completion puts the execution result (corresponding to subtask 3) into queue 3.
419. The second layer scheduler discovers that there is a subtask 3 in queue 3.

420. The second layer scheduler applies to the resource manager for resources for subtask 3.

421. The second layer scheduler instructs the work unit (worker) manager to start a worker to process subtask 3 in queue 3.

422. The worker manager starts the worker and tells it to put the result of processing subtask 3 into a suitable location (for example, into the distributed storage device, or returned to the user).

423. After starting, the worker automatically goes to queue 3, fetches and executes the tasks contained in subtask 3, and upon completion puts the execution result into a suitable location.

Through the above steps 409-423, the workers automatically fetch subtasks and deposit subtasks, so that the whole task can be processed according to the logical relationship defined for it. In addition, although steps 409-423 are depicted in the embodiment of FIG. 4 as executed serially, the embodiments of the present invention are not limited thereto. In other embodiments, when queues 1-3 need not be processed in sequence, the execution order of steps 409-413, 414-418, and 419-423 may be interchanged or overlapped. For example, if subtask 2 in queue 2 does not depend on the execution results of all of subtask 1 in queue 1, the worker of queue 2 may work while the worker of queue 1 is working, rather than starting only after the worker of queue 1 has executed all of subtask 1.

424. The second layer scheduler may obtain the progress information of the queues or workers to judge the execution progress of the whole task.
425. If the task has finished executing, the second layer scheduler reports this to the first layer scheduler. The second layer scheduler may also directly allow the user to query the progress of the task in real time.
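The chained behavior of steps 409-423 — three queues linked as "queue 1 -> queue 2 -> queue 3", each drained by its own worker, with the last worker writing its result to an output location instead of another queue — can be sketched end-to-end as follows. This is a hypothetical illustration: the stage functions, names, and sentinel-based shutdown are invented, not taken from the embodiment.

```python
import queue
import threading

# Hypothetical end-to-end sketch of steps 409-423: each stage's worker
# fetches from its input queue, executes its subtask, and forwards the
# result; the final stage appends to an output list instead of a queue.

def make_stage(in_q, out, fn):
    def run():
        while True:
            item = in_q.get()
            if item is None:                    # sentinel: shut this stage down
                if isinstance(out, queue.Queue):
                    out.put(None)               # propagate shutdown downstream
                break
            result = fn(item)
            if isinstance(out, queue.Queue):
                out.put(result)                 # result feeds the next subtask
            else:
                out.append(result)              # final result goes to output
    return threading.Thread(target=run)

q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
final = []
stages = [make_stage(q1, q2, lambda x: x + 1),   # worker for subtask 1
          make_stage(q2, q3, lambda x: x * 10),  # worker for subtask 2
          make_stage(q3, final, str)]            # worker for subtask 3
for s in stages:
    s.start()
q1.put(4)        # the initial subtask placed in queue 1 (step 408)
q1.put(None)     # no further tasks; shutdown ripples through the chain
for s in stages:
    s.join()
# final == ["50"]
```

Because every worker blocks on its own queue, the stages here naturally overlap once multiple initial tasks are enqueued, mirroring the observation above that the worker of queue 2 may run while the worker of queue 1 is still working.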
In this way, the embodiments of the present invention adopt a two-layer scheduling architecture: the second layer scheduler corresponds to a task, and the first layer scheduler starts or selects the second layer scheduler corresponding to the task. The architecture can thus be applied to different tasks, improves processing efficiency and scheduling flexibility, and can meet the needs of a variety of parallel processing services.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention. A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, devices, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not described again here.

In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or in other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.

If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part contributing to the prior art, or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

The above are merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any person skilled in the art can readily conceive of changes or substitutions within the technical scope disclosed in the present invention, and these shall all be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A distributed computing task processing system, comprising:

a first layer scheduler, configured to receive a request to execute a task, start or select a second layer scheduler corresponding to the task, and forward the request to the second layer scheduler; and

a second layer scheduler, configured to, upon receiving the request forwarded by the first layer scheduler, decompose the task into multiple subtasks according to the logical relationship of the task.

2. The system according to claim 1, wherein the logical relationship of the task indicates execution dependencies of the multiple subtasks.

3. The system according to claim 1 or 2, wherein the second layer scheduler is further configured to create corresponding queues for the multiple subtasks to store the tasks contained in the subtasks.

4. The system according to claim 3, wherein, when the tasks contained in a subtask are stored in a queue, the second layer scheduler is further configured to apply for resources for the subtask and instruct the work unit manager of the requested resources to start a work unit, so that the work unit fetches the tasks contained in the subtask from the queue and executes them.

5. The system according to claim 4, wherein the second layer scheduler is further configured to instruct the work unit to put the result of executing a task into another queue or to output the result of executing the task.

6. The system according to claim 4 or 5, wherein the second layer scheduler is further configured to obtain progress information of the queues and the work units to determine the execution progress of the task.

7. The system according to any one of claims 2-6, wherein the execution dependencies of the multiple subtasks include: two or more of the multiple subtasks being executed in serial or parallel order.

8. The system according to any one of claims 1-7, wherein the first layer scheduler is further configured to perform priority management on the task and, according to the priority, start or select the second layer scheduler to process the task.
9. A distributed computing task processing method, comprising:

upon receiving a request to execute a task, starting or selecting, by a first layer scheduler, a second layer scheduler corresponding to the task;

forwarding, by the first layer scheduler, the request to the second layer scheduler; and

upon receiving the request forwarded by the first layer scheduler, decomposing, by the second layer scheduler, the task into multiple subtasks according to the logical relationship of the task.

10. The method according to claim 9, wherein the logical relationship of the task indicates execution dependencies of the multiple subtasks.

11. The method according to claim 9 or 10, further comprising: creating, by the second layer scheduler, corresponding queues for the multiple subtasks to store the tasks contained in the subtasks.

12. The method according to claim 11, further comprising: when the tasks contained in a subtask are stored in a queue, applying, by the second layer scheduler, for resources for the subtask, and instructing the work unit manager of the requested resources to start a work unit, so that the work unit fetches the tasks contained in the subtask from the queue and executes them.

13. The method according to claim 12, further comprising: instructing, by the second layer scheduler, the work unit to put the result of executing a task into another queue or to output the result of executing the task.

14. The method according to claim 12 or 13, further comprising: obtaining, by the second layer scheduler, progress information of the queue and the work unit to determine the execution progress of the task.

15. The method according to any one of claims 9-14, wherein the first layer scheduler further performs priority management on the task and, according to the priority, starts or selects the second layer scheduler to process the task.
PCT/CN2012/070551 2012-01-18 2012-01-18 Task processing system and task processing method for distributed computation WO2013107012A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN2012800001658A CN102763086A (en) 2012-01-18 2012-01-18 Task processing system for distributed computation and task processing method for distributed computation
PCT/CN2012/070551 WO2013107012A1 (en) 2012-01-18 2012-01-18 Task processing system and task processing method for distributed computation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2012/070551 WO2013107012A1 (en) 2012-01-18 2012-01-18 Task processing system and task processing method for distributed computation

Publications (1)

Publication Number Publication Date
WO2013107012A1 true WO2013107012A1 (en) 2013-07-25

Family

ID=47056377

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/070551 WO2013107012A1 (en) 2012-01-18 2012-01-18 Task processing system and task processing method for distributed computation

Country Status (2)

Country Link
CN (1) CN102763086A (en)
WO (1) WO2013107012A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160299791A1 (en) * 2014-01-14 2016-10-13 Tencent Technology (Shenzhen) Company Limited Method And Apparatus For Processing Computational Task

Families Citing this family (22)

Publication number Priority date Publication date Assignee Title
CN103064736B (en) * 2012-12-06 2017-02-22 华为技术有限公司 Device and method for task processing
CN103870334B (en) * 2012-12-18 2017-05-31 ***通信集团公司 A kind of method for allocating tasks and device of extensive vulnerability scanning
US9886310B2 (en) 2014-02-10 2018-02-06 International Business Machines Corporation Dynamic resource allocation in MapReduce
CN104102949B (en) * 2014-06-27 2018-01-26 北京奇艺世纪科技有限公司 A kind of distributed work flow device and its method for handling workflow
CN104035817A (en) * 2014-07-08 2014-09-10 领佰思自动化科技(上海)有限公司 Distributed parallel computing method and system for physical implementation of large scale integrated circuit
CN104123182B (en) * 2014-07-18 2015-09-30 西安交通大学 Based on the MapReduce task of client/server across data center scheduling system and method
US10007619B2 (en) * 2015-05-29 2018-06-26 Qualcomm Incorporated Multi-threaded translation and transaction re-ordering for memory management units
CN106547523B (en) * 2015-09-17 2019-08-06 北大方正集团有限公司 Progress bar progress display methods, apparatus and system
CN105653365A (en) * 2016-02-22 2016-06-08 青岛海尔智能家电科技有限公司 Task processing method and device
CN106445681B (en) * 2016-08-31 2019-11-29 东方网力科技股份有限公司 Distributed task dispatching system and method
CN107665401B (en) * 2017-09-15 2020-06-26 平安科技(深圳)有限公司 Task allocation method, terminal and computer readable storage medium
CN107818016A (en) * 2017-11-22 2018-03-20 苏州麦迪斯顿医疗科技股份有限公司 Server application design method, request event processing method and processing device
CN110569252B (en) * 2018-05-16 2023-04-07 杭州海康威视数字技术股份有限公司 Data processing system and method
CN110597613A (en) * 2018-06-12 2019-12-20 成都鼎桥通信技术有限公司 Task processing method, device, equipment and computer readable storage medium
CN111258744A (en) * 2018-11-30 2020-06-09 中兴通讯股份有限公司 Task processing method based on heterogeneous computation and software and hardware framework system
CN109885388A (en) * 2019-01-31 2019-06-14 上海赜睿信息科技有限公司 A kind of data processing method and device suitable for heterogeneous system
CN110750371A (en) * 2019-10-17 2020-02-04 北京创鑫旅程网络技术有限公司 Flow execution method, device, equipment and storage medium
CN113448692A (en) * 2020-03-25 2021-09-28 杭州海康威视数字技术股份有限公司 Distributed graph computing method, device, equipment and storage medium
CN111506409B (en) * 2020-04-20 2023-05-23 南方电网科学研究院有限责任公司 Data processing method and system
CN111708643A (en) * 2020-06-11 2020-09-25 中国工商银行股份有限公司 Batch operation method and device for distributed streaming media platform
CN112596871A (en) * 2020-12-16 2021-04-02 中国建设银行股份有限公司 Service processing method and device
CN115268800B (en) * 2022-09-29 2022-12-20 四川汉唐云分布式存储技术有限公司 Data processing method and data storage system based on calculation route redirection

Citations (3)

Publication number Priority date Publication date Assignee Title
CN1981484A (en) * 2004-08-05 2007-06-13 思科技术公司 Hierarchal scheduler with multiple scheduling lanes
CN101621460A (en) * 2008-06-30 2010-01-06 中兴通讯股份有限公司 Packet scheduling method and device
CN102185761A (en) * 2011-04-13 2011-09-14 中国人民解放军国防科学技术大学 Two-layer dynamic scheduling method facing to ensemble prediction applications

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
CN100440802C (en) * 2005-12-26 2008-12-03 北京航空航天大学 Service gridding system and method for processing operation
CN101169743A (en) * 2007-11-27 2008-04-30 南京大学 Method for implementing parallel power flow calculation based on multi-core computer in electric grid
CN101957780B (en) * 2010-08-17 2013-03-20 中国电子科技集团公司第二十八研究所 Resource state information-based grid task scheduling processor and grid task scheduling processing method
CN102110022B (en) * 2011-03-22 2013-04-10 上海交通大学 Sensor network embedded operation system based on priority scheduling

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160299791A1 (en) * 2014-01-14 2016-10-13 Tencent Technology (Shenzhen) Company Limited Method And Apparatus For Processing Computational Task
US10146588B2 (en) * 2014-01-14 2018-12-04 Tencent Technology (Shenzhen) Company Limited Method and apparatus for processing computational task having multiple subflows

Also Published As

Publication number Publication date
CN102763086A (en) 2012-10-31

Similar Documents

Publication Publication Date Title
WO2013107012A1 (en) Task processing system and task processing method for distributed computation
US11709704B2 (en) FPGA acceleration for serverless computing
Ge et al. GA-based task scheduler for the cloud computing systems
JP6373432B2 (en) Real-time optimization of computing infrastructure in virtual environment
Singh et al. Scheduling real-time security aware tasks in fog networks
US9098312B2 (en) Methods for dynamically generating an application interface for a modeled entity and devices thereof
US9262210B2 (en) Light weight workload management server integration
US20170171026A1 (en) Configuring a cloud from aggregate declarative configuration data
KR100683820B1 (en) Autonomic Service Routing Using Observed Resource Requirement for Self-Optimization
US20090241117A1 (en) Method for integrating flow orchestration and scheduling for a batch of workflows
JP2018518744A (en) Automatic scaling of resource instance groups within a compute cluster
CN108021435B (en) Cloud computing task flow scheduling method with fault tolerance capability based on deadline
CN114138486A (en) Containerized micro-service arranging method, system and medium for cloud edge heterogeneous environment
CN104506620A (en) Extensible automatic computing service platform and construction method for same
CN110661842B (en) Resource scheduling management method, electronic equipment and storage medium
JP6568238B2 (en) Hardware acceleration method and related devices
CN110569252B (en) Data processing system and method
CN104508625A (en) Abstraction models for monitoring of cloud resources
CN104503832A (en) Virtual machine scheduling system and virtual machine scheduling method with balanced equity and efficiency
CN111427675A (en) Data processing method and device and computer readable storage medium
CN112463290A (en) Method, system, apparatus and storage medium for dynamically adjusting the number of computing containers
CN107025134B (en) Database service system and method compatible with multiple databases
CN107204998B (en) Method and device for processing data
WO2022257247A1 (en) Data processing method and apparatus, and computer-readable storage medium
CN116048825A (en) Container cluster construction method and system

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 201280000165.8

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12865714

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12865714

Country of ref document: EP

Kind code of ref document: A1