CN109992373B - Resource scheduling method, information management method and device, and task deployment system

Publication number: CN109992373B
Authority: CN (China)
Prior art keywords: data, data center, task, node, identifier
Legal status: Active (granted)
Application number: CN201711487682.3A
Original language: Chinese (zh)
Other versions: CN109992373A
Inventors: Varun Saxena, Naganarasimha Ramesh Garla, Bo Zhao
Assignee: Huawei Technologies Co., Ltd.
Application filed by Huawei Technologies Co., Ltd.
Priority: CN201711487682.3A
Publication of application: CN109992373A; publication of grant: CN109992373B

Classifications

    • G06F9/4856: Task life-cycle (stopping, restarting, resuming execution), resumption being on a different machine, e.g. task migration, virtual machine migration
    • G06F9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals

Abstract

The present application discloses a resource scheduling method, an information management method and apparatus, and a task deployment system, and belongs to the technical field of big data. In the task deployment system, a computing node sends a location information request to a metadata server, where the request asks for the location of the data to be processed by a task that the computing node is to execute. The computing node receives a location information response returned by the metadata server, containing the location of that data, and then sends a task deployment request to a resource manager node, where the request includes the identifier of the task to be executed, the identifier of a data center, and the identifier of a computing node. The resource manager node determines, according to the task deployment request, the data center in which to deploy the task, thereby implementing task deployment across data centers.

Description

Resource scheduling method, information management method and device and task deployment system
Technical Field
The present application relates to the field of computer technologies, and in particular, to a resource scheduling method, an information management method and apparatus, and a task deployment system.
Background
In the industry, big data has several definitions. For example, the research firm Gartner defines it as follows: big data is high-volume, high-growth-rate, and diversified information assets that require new processing modes to enable stronger decision-making power, insight discovery, and process optimization. Big data can be applied in a variety of scenarios, such as massively parallel processing databases, data mining grids, distributed file systems, distributed databases, cloud computing platforms, the Internet, and scalable storage systems. Processing data with big data technologies increases the application value of the data.
With the rapid development of big data services, the scale of Data Center (DC) clusters keeps growing, and the resources of a DC cluster need to be scheduled in a unified way by a cluster resource scheduling system. In the related art, each DC of a DC cluster is provided with an independent distributed resource scheduling system and an independent distributed file system: each DC stores and manages local data through its distributed file system, schedules and manages local resources through its distributed resource scheduling system, and the DCs periodically perform cross-DC data synchronization so that each DC can obtain the data of other DCs to execute tasks.
In the process of implementing the present application, the inventors found that the prior art has at least the following problem: when a task initiated in one DC needs data from another DC in order to execute, the prior art periodically synchronizes the data of the other DCs to the local DC and executes the task there, so the processing delay of the task is large, and the data synchronization wastes resources.
Disclosure of Invention
To solve the problem that a DC in the prior art can only execute tasks by using local resources, embodiments of the present invention provide a resource scheduling method, an information management method and apparatus, and a task deployment system, so that a DC can execute tasks by using the resources of other DCs without synchronizing the data of those DCs to the local DC, thereby reducing task processing delay. The technical solutions are as follows:
in a first aspect, a task deployment system is provided, where the task deployment system includes a first compute node and a first resource manager node, and the first compute node and the first resource manager node belong to a first data center;
the first computing node is configured to send a first task deployment request to the first resource manager node, where the first task deployment request includes an identifier of a task, an identifier of a data center, and an identifier of a computing node in the data center, and the identifier of the data center indicates the data center in which the task is to be deployed; the first resource manager node is configured to receive the first task deployment request sent by the first computing node, and, in the case that the identifier of the data center indicates a data center other than the first data center, to send a second task deployment request to a resource manager node in the data center corresponding to that identifier, where the second task deployment request is used to deploy the task to that data center and includes the identifier of the data center, the identifier of the task, and the identifier of a computing node in the data center.
The present application provides a task deployment system comprising a computing node and a resource manager node. When the computing node is to execute a task, it sends a task deployment request to the resource manager node in the same data center; the request carries the identifier of the task to be executed and the identifiers of the data center and computing node where the data to be processed by the task is located. The resource manager node receives the task deployment request and, when the request contains the identifier of another data center, forwards it to that data center, so that the task is deployed in the data center where the required data resides. Thus, when the data required by a task is not available locally, the task can be processed remotely, achieving real-time sharing of remote data without synchronizing the data of other data centers to the local data center; the task processing delay is small, and resource waste is reduced.
The first task deployment request and the second task deployment request may be the same or different. For example, if the first task deployment request corresponds to only one task and the second task deployment request corresponds to that same task, the two requests may be the same; if the first task deployment request corresponds to two tasks while the second corresponds to only one, the two requests are different.
The first computing node may have one or more tasks to execute; the data to be processed by each task may include one or more files, and those files may be stored in the same data center or in different data centers.
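For illustration only, the following sketch shows one possible shape of such a task deployment request. The patent specifies only the three identifiers; the class and field names here are hypothetical.

    // Hypothetical sketch of a cross-DC task deployment request; the patent
    // specifies only the three identifiers, not a concrete wire format.
    import java.util.List;

    public class TaskDeploymentRequest {
        private final String taskId;        // identifier of the task to be deployed
        private final String dataCenterId;  // DC where the data to be processed resides
        private final List<String> nodeIds; // compute node(s) holding that data

        public TaskDeploymentRequest(String taskId, String dataCenterId, List<String> nodeIds) {
            this.taskId = taskId;
            this.dataCenterId = dataCenterId;
            this.nodeIds = nodeIds;
        }

        public String getTaskId()        { return taskId; }
        public String getDataCenterId()  { return dataCenterId; }
        public List<String> getNodeIds() { return nodeIds; }
    }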
In an exemplary implementation of the first aspect, the task deployment system further includes a metadata server. The identifier of the data center is an identifier of a second data center, and the identifier of the computing node in the data center is an identifier of a second computing node. The first computing node is further configured to send a location information request to the metadata server, where the request includes a data identifier and is used to request from the metadata server the data center and computing node where the data corresponding to the data identifier is located, that data being the data required for executing the task. The metadata server is configured to receive the location information request and to send a location information response to the first computing node according to it, the response including the identifier of the second data center and the identifier of the second computing node; the first computing node is configured to receive the location information response.
In this implementation, the task deployment system provides a location information query service through the metadata server: the computing nodes of each data center can obtain the location of the data required by a task by sending a request to the metadata server, which provides the basis for task deployment across data centers.
The metadata server provides interfaces for the data centers to query the location information of data. For example, the metadata server provides two application programming interfaces. One is Map<String, FileStatus> getCrossDCFileStatus(Path path), which obtains the information of the data center where the data is located, such as the identifier of the data center. The other is Map<String, LocatedFileStatus> getCrossDCLocatedFileStatus(Path path), which obtains both the information of the data center and the information of the computing node where the data is located, such as the identifier of the data center and the identifier of the computing node.
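These two interfaces can be summarized in the following sketch. It assumes Hadoop-style file system types (org.apache.hadoop.fs), which the signatures above resemble; the service name itself is not given in the patent.

    // Sketch of the metadata server's query interface, using Hadoop fs types;
    // the interface name and checked exception are assumptions.
    import java.io.IOException;
    import java.util.Map;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.LocatedFileStatus;
    import org.apache.hadoop.fs.Path;

    public interface CrossDCMetadataService {
        /** Returns, per file name, its status including the DC where it resides. */
        Map<String, FileStatus> getCrossDCFileStatus(Path path) throws IOException;

        /** Returns, per file name, its status including both the DC and the
         *  compute node(s) where the file resides. */
        Map<String, LocatedFileStatus> getCrossDCLocatedFileStatus(Path path) throws IOException;
    }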
In an exemplary implementation manner of the first aspect, the identifier of the data center is an identifier of a second data center, the identifier of a computing node in the data center is an identifier of a second computing node, and the first resource manager node is configured to receive a first resource allocation message from the second resource manager node in the second data center, where the first resource allocation message is used to indicate that the task may be deployed to the second computing node; the first computing node is configured to send a task execution request to the second computing node, where the task execution request includes an identifier of the task and a data identifier, and the data identifier is used to indicate data required for executing the task in the second computing node.
In this implementation, during task deployment across data centers, the second data center indicates through a resource allocation message that the first computing node of the first data center may deploy the task to the second computing node of the second data center; after receiving the resource allocation message, the first computing node sends a task execution request to the second computing node.
The resource allocation message may include the resources allocated to the task, for example a container identifier.
Optionally, the first computing node is further configured to receive a task execution result returned by the second computing node.
In an exemplary implementation of the first aspect, the first task deployment request includes identifiers of multiple data centers, where the identifiers indicate multiple data centers capable of deploying the task, and the first resource manager node is further configured to determine, from the multiple data centers, the data center to which the task is to be deployed according to a resource allocation policy.
The resource allocation policy may include a scheduling policy, where the scheduling policy is any one of the following: in the case that the task cannot be deployed on the first computing node, allowing the task to be deployed on any computing node with free resources in any data center; in the case that the task cannot be deployed on the first computing node, allowing the task to be preferentially deployed on other computing nodes of the first data center that are on the same rack as the first computing node; in the case that the task cannot be deployed on the first computing node, allowing the task to be preferentially deployed on other computing nodes of the first data center.
Optionally, when two or more data centers are determined by the scheduling policy, the resource allocation policy may further include selecting a data center according to at least one of the following attributes: the priority of each of the data centers, the capacity or proportion of free resources of each of the data centers, and the network bandwidth between the resource manager node of each of the data centers and the first resource manager node.
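As an illustration, the scheduling policies and tie-breaking attributes above might be encoded as follows; all names are hypothetical, and the sketch assumes that a larger value of each attribute is preferred.

    // Illustrative encoding of the scheduling policy and the DC tie-breaker;
    // nothing here is a concrete format from the patent.
    import java.util.Comparator;
    import java.util.List;

    enum SchedulingPolicy {
        ANY_DC_WITH_FREE_RESOURCES, // fall back to any DC that has idle resources
        PREFER_SAME_RACK,           // prefer nodes on the same rack in the first DC
        PREFER_LOCAL_DC             // prefer other nodes of the first DC
    }

    class DataCenterInfo {
        int priority;            // configured DC priority (assumed: larger is better)
        long freeResources;      // capacity (or proportion) of idle resources
        long bandwidthToLocalRM; // bandwidth to the first resource manager node
    }

    class DataCenterSelector {
        // When the scheduling policy leaves two or more candidate DCs,
        // break the tie by priority, then free resources, then bandwidth.
        static DataCenterInfo select(List<DataCenterInfo> candidates) {
            return candidates.stream()
                    .max(Comparator.<DataCenterInfo>comparingInt(d -> d.priority)
                            .thenComparingLong(d -> d.freeResources)
                            .thenComparingLong(d -> d.bandwidthToLocalRM))
                    .orElseThrow();
        }
    }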
Optionally, the task deployment system further includes a master node of the distributed file system. The metadata server is configured to send a location information query request, containing a data identifier, to the master node. The master node is configured to receive the location information query request, determine according to it whether the data corresponding to the data identifier exists in the data center where the master node is located, and return a location information query response to the metadata server. The response either includes at least one of the following: the data center identifier of the data center where the master node is located, and the computing node identifier of the computing node where the data corresponding to the data identifier is located; or it indicates that the data corresponding to the data identifier does not exist in the data center where the master node is located.
In a second aspect, a resource scheduling method is provided, where the method is used for a first resource manager node of a first data center, and the first data center further includes a first computing node, and the method includes: receiving a first task deployment request sent by the first computing node, wherein the first task deployment request comprises an identifier of a task, an identifier of a data center and an identifier of a computing node in the data center, and the identifier of the data center is used for indicating the data center where the task is to be deployed; and sending a second task deployment request to a resource manager node in the data center corresponding to the identifier of the data center under the condition that the identifier of the data center indicates the data center except the first data center, wherein the second task deployment request is used for deploying the task to the data center and comprises the identifier of the data center, the identifier of the task and the identifier of a computing node in the data center.
In this implementation, the resource manager node receives a task deployment request sent by a computing node of the same data center; the request carries the identifier of the task to be executed and the identifiers of the data center and computing node where the data to be processed by the task is located. When the request contains the identifier of another data center, the resource manager node forwards the task deployment request to that data center, and the task is deployed by the data center where the data required by the task resides.
In an exemplary implementation manner of the second aspect, the first task deployment request includes identifications of a plurality of data centers, where the identifications of the plurality of data centers are used to indicate a plurality of data centers capable of deploying the task, and the method further includes: determining a data center from the plurality of data centers that will deploy the task according to a resource allocation policy.
In this implementation, the resource manager node determines the data center in which the task is deployed through the resource allocation policy. On one hand, different scheduling requirements can be met by setting different resource allocation policies; on the other hand, automatic scheduling of resources can be achieved through the resource allocation policy.
The resource allocation policy can be carried in the task deployment request, so that the resource manager node can determine the policy directly from the request, which keeps the process simple.
In an exemplary implementation of the second aspect, the resource allocation policy includes a scheduling policy, where the scheduling policy is any one of the following: in the case that the task cannot be deployed on the first computing node, allowing the task to be deployed on any computing node with free resources in any data center; in the case that the task cannot be deployed on the first computing node, allowing the task to be preferentially deployed on other computing nodes of the first data center that are on the same rack as the first computing node; in the case that the task cannot be deployed on the first computing node, allowing the task to be preferentially deployed on other computing nodes of the first data center.
In this implementation, the scheduling policy can be set in various ways, meeting diverse user requirements; in all of these policies, local resources are used preferentially to guarantee task execution speed.
In an exemplary implementation of the second aspect, when two or more data centers are determined by the scheduling policy, the resource allocation policy further includes selecting a data center according to at least one of the following attributes: the priority of each of the data centers, the capacity or proportion of free resources of each of the data centers, and the network bandwidth between the resource manager node of each of the data centers and the first resource manager node.
In this implementation, when the scheduling policy alone cannot determine a specific data center, the best data center is selected to provide resources according to priority, amount of idle resources, network bandwidth, and so on, which ensures reasonable scheduling of resources.
Optionally, the resource allocation policy may further define a selection range or selection condition for resources, according to which the resources are selected when determining the data center. For example, the resources may be restricted to nodes 1-10 of data center 1, or to the computing nodes of a data center equipped with a graphics processor, large storage, or large memory.
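A minimal sketch of such a selection condition, under the assumption of a simple attribute match (none of these names appear in the patent):

    // Hypothetical encoding of a resource selection range/condition.
    import java.util.Set;

    class ResourceSelectionCondition {
        Set<String> allowedNodes; // e.g. nodes 1-10 of data center 1; null = unrestricted
        boolean requireGpu;       // only nodes with a graphics processor
        long minMemoryMb;         // only nodes with at least this much memory

        boolean matches(String nodeId, boolean hasGpu, long memoryMb) {
            return (allowedNodes == null || allowedNodes.contains(nodeId))
                    && (!requireGpu || hasGpu)
                    && memoryMb >= minMemoryMb;
        }
    }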
In an exemplary implementation manner of the second aspect, the second task deployment request is a heartbeat message between the first resource manager node and the second resource manager node, where the heartbeat message includes an identification of the data center, an identification of the task, and an identification of a computing node in the data center.
In this implementation, the task deployment request is carried in the heartbeat messages exchanged between data centers, which is simple to implement.
In an exemplary implementation manner of the second aspect, the identification of the data center is an identification of a second data center, and the identification of a computing node in the data center is an identification of a second computing node, and the method further includes: receiving a first resource allocation message from a second resource manager node in the second data center, the first resource allocation message indicating that the task may be deployed to the second compute node; and sending a task execution request to the second computing node, wherein the task execution request comprises an identifier of the task and a data identifier, and the data identifier is used for indicating data required by the second computing node for executing the task.
In this implementation, the resource manager of a data center can not only send task deployment requests to other data centers but also receive task deployment requests sent by other data centers, and allocate resources for the computing nodes of other data centers according to those requests.
In an exemplary implementation of the second aspect, the method further includes: receiving a third task deployment request sent by a third resource manager node, where the third task deployment request includes an identifier of the first data center, an identifier of a first task, and an identifier of a third computing node in the first data center, and the third resource manager node belongs to a third data center; and sending a second resource allocation message to the third resource manager node, the second resource allocation message indicating that the first task may be deployed to the third computing node.
In this implementation, the result of resource allocation is returned via a resource allocation message to the data center that sent the task deployment request, so that that data center can execute the task using the resources specified by the message.
In a third aspect, a resource scheduling method is provided, where the method is used for a first computing node of a first data center, and the first data center further includes a first resource manager node. The method includes: sending a location information request to a metadata server, where the request includes a data identifier and is used to request from the metadata server the data center and computing node where the data corresponding to the data identifier is located, that data being the data required for executing a task; receiving a location information response sent by the metadata server, where the response includes an identifier of a second data center and an identifier of a second computing node, the second computing node belonging to the second data center; sending a first task deployment request to the first resource manager node, the first task deployment request including an identifier of the task, an identifier of a data center, and an identifier of a computing node in the data center, the identifier of the data center indicating the data center in which to deploy the task; and receiving a first resource allocation message sent by the first resource manager node, the first resource allocation message indicating that the task may be deployed to the second computing node.
In an exemplary implementation of the third aspect, the method further includes: sending a task execution request to the second computing node, where the task execution request includes an identifier of the task and a data identifier, and the data identifier indicates the data required by the second computing node for executing the task.
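Putting the third aspect together, the first computing node's side of the flow might look like the sketch below; every type and method name here is hypothetical.

    // Hypothetical end-to-end flow on the first compute node (third aspect).
    class FirstComputeNode {
        MetadataClient metadata;  // talks to the metadata server
        ResourceManagerClient rm; // talks to the first resource manager node

        void deployTask(String taskId, String dataId) {
            // 1. Ask the metadata server where the required data lives.
            LocationInfo loc = metadata.requestLocation(dataId);
            // 2. Ask the local resource manager to deploy the task to that DC/node.
            rm.sendTaskDeploymentRequest(taskId, loc.dataCenterId(), loc.nodeId());
            // 3. Wait for the resource allocation message, then send the task
            //    execution request (task id + data id) to the remote compute node.
            Allocation alloc = rm.awaitResourceAllocation(taskId);
            alloc.targetNode().sendTaskExecutionRequest(taskId, dataId);
        }
    }

    record LocationInfo(String dataCenterId, String nodeId) {}
    interface MetadataClient { LocationInfo requestLocation(String dataId); }
    interface ResourceManagerClient {
        void sendTaskDeploymentRequest(String taskId, String dcId, String nodeId);
        Allocation awaitResourceAllocation(String taskId);
    }
    interface Allocation { RemoteNode targetNode(); }
    interface RemoteNode { void sendTaskExecutionRequest(String taskId, String dataId); }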
In a fourth aspect, an information management method is provided. The method includes: a metadata server receives a location information request sent by a first computing node, where the request includes a data identifier and is used to request from the metadata server the data center and computing node where the data corresponding to the data identifier is located, that data being the data required for executing a task, and the first computing node belongs to a first data center; and the metadata server sends a location information response to the first computing node according to the request, the response including the identifier of the data center and the identifier of the computing node where the data corresponding to the data identifier is located.
In an exemplary implementation of the fourth aspect, the method further includes: determining a second data center according to the data identifier, where the second data center is the data center in which the data corresponding to the data identifier is located; sending a location information query request containing the data identifier to the master node of the second data center, where that master node is the master node of the distributed file system of the second data center; and receiving a location information query response returned by the master node of the second data center, the response including the identifier of the computing node where the data corresponding to the data identifier is located. Alternatively, the method further includes: sending a location information query request containing the data identifier to the master nodes of multiple data centers, where the master node of each of the data centers is the master node of that data center's distributed file system; and receiving a location information query response returned by a first master node, where the response either includes at least one of the following information: the data center and the computing node where the data corresponding to the data identifier is located; or indicates that the data corresponding to the data identifier does not exist in the data center where the first master node is located, the first master node being the master node of the distributed file system of any one of the multiple data centers.
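The two lookup alternatives might be sketched as follows, with an empty result standing for "the data does not exist in that master node's data center"; all client types here are hypothetical.

    // Sketch of the two lookup alternatives of the fourth aspect:
    // (a) targeted query to the master node of the DC known to hold the data;
    // (b) broadcast query to the master nodes of all DCs.
    import java.util.List;
    import java.util.Optional;

    class MetadataLookup {
        // (a) Targeted lookup: the DC is resolved from the metadata server's
        //     own records, so only that DC's master node is asked.
        Optional<String> lookupNode(String dataId, MasterNodeClient masterOfKnownDc) {
            return masterOfKnownDc.queryLocation(dataId); // compute node identifier
        }

        // (b) Broadcast lookup: ask every DC's master; keep the first positive answer.
        Optional<String> broadcastLookup(String dataId, List<MasterNodeClient> masters) {
            return masters.stream()
                    .map(m -> m.queryLocation(dataId))
                    .filter(Optional::isPresent)
                    .map(Optional::get)
                    .findFirst();
        }
    }

    interface MasterNodeClient {
        // Empty result: the data does not exist in this master node's DC.
        Optional<String> queryLocation(String dataId);
    }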
In an exemplary implementation manner of the fourth aspect, the method further includes: receiving a data synchronization message sent by a master node of a third data center, the data synchronization message being used for indicating first data changed by data operation in the third data center, the data operation including at least one of: creating data, deleting data and synchronizing data to other data centers, wherein a main node of the third data center is a main node of a distributed file system of the third data center; and recording the data center corresponding to the changed first data according to the data synchronization message.
In a fifth aspect, an embodiment of the present invention provides a resource scheduling apparatus, where the resource scheduling apparatus includes units, such as a receiving unit and a transmitting unit, configured to implement the method provided in any one of the possible implementations of the second aspect.
In a sixth aspect, an embodiment of the present invention provides a resource scheduling apparatus, where the resource scheduling apparatus includes units, such as a transmitting unit and a receiving unit, configured to implement the method provided in any one of the possible implementations of the third aspect.
In a seventh aspect, an embodiment of the present invention provides an information management apparatus, where the information management apparatus includes units, such as a receiving unit and a sending unit, configured to implement the method provided in any one of the possible implementations of the fourth aspect.
In an eighth aspect, an embodiment of the present invention provides a resource scheduling apparatus, where the apparatus includes: a processor, a memory, and a communication interface. The processor, the memory, and the communication interface are coupled by a bus; the memory is used to store software programs, and the processor, by running or executing the software programs in the memory, can perform the method provided in any one of the possible implementations of the second aspect through the communication interface.
In a ninth aspect, an embodiment of the present invention provides a resource scheduling apparatus, where the apparatus includes: a processor, a memory, and a communication interface. The processor, the memory, and the communication interface are coupled by a bus; the memory is used to store software programs, and the processor, by running or executing the software programs in the memory, can perform the method provided in any one of the possible implementations of the third aspect through the communication interface.
In a tenth aspect, an embodiment of the present invention provides an information management apparatus, where the apparatus includes: a processor, a memory, and a communication interface. The processor, the memory, and the communication interface are coupled by a bus; the memory is used to store software programs, and the processor, by running or executing the software programs in the memory, can perform the method provided in any one of the possible implementations of the fourth aspect through the communication interface.
In an eleventh aspect, an embodiment of the present invention further provides a computer-readable medium for storing program code for execution by a resource scheduling apparatus, where the program code includes instructions for executing the method provided in any one of the possible implementations of the second aspect.
In a twelfth aspect, an embodiment of the present invention further provides a computer-readable medium for storing a program code for execution by a resource scheduling apparatus, where the program code includes instructions for executing the method provided in any one of the possible implementation manners in the third aspect.
In a thirteenth aspect, an embodiment of the present invention further provides a computer-readable medium for storing a program code for execution by an information management apparatus, where the program code includes instructions for executing the method provided in any one of the possible implementation manners in the fourth aspect.
Drawings
Fig. 1 is a schematic structural diagram of a DC cluster according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an apparatus according to an embodiment of the present invention;
Figs. 3A-3C are flowcharts of a resource scheduling method according to an embodiment of the present invention;
Figs. 4A-4B are flowcharts of a method for a DC to obtain location information of data according to an embodiment of the present invention;
Fig. 5 is a flowchart of another method for a DC to obtain location information of data according to an embodiment of the present invention;
Fig. 6 is a flowchart of another method for a DC to obtain location information of data according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a resource scheduling apparatus according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of another resource scheduling apparatus according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of an information management apparatus according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
Fig. 1 is a schematic structural diagram of a DC cluster according to an embodiment of the present invention. Referring to Fig. 1, the DC cluster includes at least two DCs; the cluster shown in Fig. 1 takes three DCs as an example, and in other embodiments the cluster may include two DCs, or four or more DCs.
As shown in fig. 1, each DC includes a Resource Manager (RM) node 40 and a compute node 20.
The computing nodes 20 are also called data nodes (DataNodes) or storage nodes. A computing node 20 stores data (files); one file may comprise multiple blocks, and the blocks may be stored on the same or on different computing nodes 20. A computing node 20 also runs application programs to execute tasks and provides the resources required for executing tasks, including but not limited to CPU, memory, network interfaces, and GPUs.
Each application on a compute node 20 includes an Application Master (AM), a software module that obtains resources for the application and allocates them to the tasks the application executes. A job (Job) executed by an application may be split into multiple tasks; the tasks may be executed using resources provided by different compute nodes, and the compute nodes providing those resources may belong to the same DC or to different DCs.
Each compute node 20 is provided with a Resource Agent (RA) (not shown), the resource and task manager within the compute node. On one hand, the RA periodically reports to the RM node 40 of its DC the resource usage and running state of each container (Container), for example the usage of each resource dimension in the container. A container is a resource abstraction that encapsulates multi-dimensional resources (CPU, storage, network interface, GPU, and so on) on a compute node; of course, resources may also be encapsulated in other forms, such as virtual machines. On the other hand, the RA receives requests from the AM within the compute node 20 to start or stop containers.
The RM node 40 includes an Application Manager (ASM) (not shown), a DC Scheduler 41, and a Local Scheduler 42, and is mainly responsible for the resource scheduling and management of the entire DC. The ASM manages the applications in the whole DC, including application submission, negotiating resources with the DC scheduler to start an AM, monitoring the AM's running state, and restarting the AM if it fails. Both the DC scheduler 41 and the local scheduler 42 are software modules created by the RM node 40. The DC scheduler 41 includes a Cross-DC Communicator 411 and a Cross-DC Scheduler 412. The cross-DC scheduler 412 receives and processes task deployment requests (used to acquire resources for tasks) from the AMs in the compute nodes 20 of the DC, and determines whether to schedule a task deployment request locally or distribute it to another DC for processing. The cross-DC communicator 411 exchanges heartbeat messages with the cross-DC communicators 411 of other DCs. A heartbeat message is information transmitted periodically between DCs; it confirms whether the connected DC is alive, and it also synchronizes the resource usage messages and resource allocation messages of the DCs. A resource usage message indicates the usage of resources in a DC; for example, it may include used resources, allocated resources, and available resources. A resource allocation message indicates the resources allocated to a task. In an implementation, two RM nodes 40 may be deployed in each DC, the two RM nodes 40 being in a master-slave relationship.
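For illustration, a heartbeat payload carrying the liveness, resource usage, resource allocation, and piggybacked task deployment information described above might look like this sketch; the field names are not from the patent.

    // Hypothetical shape of the cross-DC heartbeat payload.
    import java.util.List;
    import java.util.Map;

    class CrossDCHeartbeat {
        String sourceDcId;                        // DC sending the heartbeat (liveness)
        long timestamp;                           // periodic transmission time
        Map<String, Long> resourceUsage;          // e.g. "used", "allocated", "available"
        List<ResourceAllocation> allocations;     // resource allocation messages
        List<TaskDeploymentItem> deploymentItems; // piggybacked task deployment requests
    }

    class ResourceAllocation {
        String taskId;      // task the resources were allocated to
        String containerId; // container granted to the task
    }

    class TaskDeploymentItem {
        String taskId;
        String targetDcId;          // DC where the task is to be deployed
        List<String> targetNodeIds; // compute nodes holding the required data
    }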
Each DC also includes a name node (NameNode) 10, also referred to as a master node; the master node 10 stores and manages the metadata of the files stored on the compute nodes 20 of its DC. In an implementation, two master nodes 10 may be deployed in each DC, the two being in a master-slave relationship. The RM node 40 and the master node 10 may be implemented on the same hardware, or on two sets of hardware independent of each other.
In the embodiment of the present invention, a computing node 20 provides the resources required for executing tasks while its resources and tasks are managed through the deployed RA; the computing nodes 20 and the RM node 40 of each DC thus form a distributed scheduling system, which may be implemented using the Hadoop Yet Another Resource Negotiator (YARN) architecture. The computing nodes 20 also store data; the computing nodes 20 and the master node 10 of each DC thus form a Distributed File System (DFS), which may be implemented using the Hadoop Distributed File System (HDFS) architecture.
Further, the DC cluster includes a metadata (Metadata Service) server 30. The metadata server 30 is connected to the master node 10 of each DC in the cluster, so it can obtain the metadata of the files in each DC from that DC's master node 10 and store it; the metadata of a file includes its name and location information, among other things. A master node 10 can obtain the metadata of the files in its own DC; when it needs the metadata of other DCs, it obtains it through the metadata server 30. The metadata server 30 may be a stand-alone server, or it may be integrated into a node of one DC in the cluster. In addition, since the DCs of a cluster may be distributed at different locations around the world, the metadata server 30 may also be a global (Global) DC metadata server. In an implementation, two metadata servers 30 may be deployed per DC cluster, the two being in a master-slave relationship.
In the embodiment of the present invention, the metadata server 30, the master nodes 10, and the RM nodes 40 may all use the Quorum Journal Manager (QJM) mechanism to implement master-slave synchronization. In addition, the metadata server 30 may use the ZooKeeper Failover Controller (ZKFC) mechanism to ensure high availability.
It should be noted that in the present invention a computing node serves simultaneously as a data node of the distributed file system and as a computing node of the distributed scheduling system; in other implementations, the two roles may also be deployed independently as two separate nodes.
In the prior art, since the distributed file systems of the DCs in a DC cluster are deployed independently, when a DC executes a task it can only determine whether the data required for executing the task is stored locally; the specific location of that data in other DCs is unknown.
The present application implements cross-DC deployment of tasks through a task deployment system formed by the resource manager nodes, the computing nodes, the metadata server, and the master nodes, thereby solving the problems in the prior art; the detailed functions of each node and of the metadata server in the task deployment system are described below.
The following describes an apparatus for implementing the embodiment of the present invention with reference to a specific hardware structure.
Fig. 2 shows a block diagram of an apparatus 140 according to an embodiment of the present invention. The apparatus may be a resource scheduling apparatus or an information management apparatus, where the resource scheduling apparatus may be the aforementioned resource manager node or computing node, and the information management apparatus may be the aforementioned metadata server. Referring to Fig. 2, the apparatus 140 may include a processor 31 with one or more cores, a memory 32 comprising one or more computer-readable storage media, and a communication interface 33; the processor 31 may be connected to the memory 32 and the communication interface 33 by a bus. Those skilled in the art will appreciate that the configuration shown in Fig. 2 does not constitute a limitation of the apparatus 140: it may include more or fewer components than shown, combine some components, or arrange the components differently. Wherein:
the processor 31 is the control center of the apparatus 140. It connects the various parts of the apparatus 140 through various interfaces and lines, and performs the functions of the apparatus 140 and processes data by running or executing the software programs stored in the memory 32 and calling the data stored in the memory 32, thereby monitoring the apparatus 140 as a whole. Optionally, the processor 31 may include one or more processing units, which may be a Central Processing Unit (CPU), a Network Processor (NP), or the like.
The memory 32 may be used to store various data, such as configuration parameters, and computer instructions executable by the processor 31. The memory 32 may include high-speed random access memory and may also include non-volatile memory, such as at least one disk, flash memory, or another non-volatile solid-state memory device. Accordingly, the memory 32 may also include a memory controller to provide the processor 31 with access to the memory 32.
The communication interface 33 is connected with the other devices in the DC cluster in a wired or wireless manner to communicate with them for data transmission.
When the resource scheduling apparatus is a resource manager node, the processor 31 is configured to receive a first task deployment request sent by the first computing node through the communication interface 33, where the first task deployment request includes an identifier of a task, an identifier of a data center, and an identifier of a computing node in the data center, and the identifier of the data center is used to indicate the data center where the task is to be deployed; and in the case that the identifier of the data center indicates a data center other than the first data center, sending a second task deployment request to a resource manager node in the data center corresponding to the identifier of the data center through a communication interface 33, where the second task deployment request is used for deploying the task to the data center, and the second task deployment request includes the identifier of the data center, the identifier of the task, and an identifier of a computing node in the data center.
In this way, the resource manager node receives a task deployment request sent by a computing node of its data center; the request carries the identifier of the task to be executed and the identifiers of the data center and computing node where the data to be processed by the task is located. When the request contains the identifier of another data center, the resource manager node forwards the task deployment request to that data center, and the task is deployed by the data center where the data required by the task resides.
When the resource scheduling device is a resource manager node, the processor 31 is further configured to perform steps performed by the resource manager node in the following resource scheduling method.
When the resource scheduling device is a computing node, the processor 31 is configured to send a location information request to the metadata server through the communication interface 33, where the location information request includes a data identifier, the location information request is used to request the metadata server for information of a data center and a computing node where data corresponding to the data identifier is located, and the data corresponding to the data identifier is data required to execute the task; receiving, through a communication interface 33, a location information response sent by the metadata server, where the location information response includes an identifier of a second data center and an identifier of a second computing node, and the second computing node belongs to the second data center; sending a first task deployment request to the first resource manager node through the communication interface 33, the first task deployment request including an identification of a task, an identification of a data center, and an identification of a compute node in the data center, the identification of the data center indicating the data center to deploy the task; receiving, via a communication interface 33, a first resource allocation message sent by the first resource manager node, the first resource allocation message indicating that the task may be deployed to the second computing node.
In this way, the computing node sends a task deployment request to the resource manager node of the same data center; the request carries the identifier of the task to be executed and the identifiers of the data center and computing node where the data to be processed by the task is located. When the request contains the identifier of another data center, the task deployment request is forwarded to that data center, and the task is deployed by the data center where the data required by the task resides.
When the resource scheduling device is a computing node, the processor 31 is further configured to execute steps performed by the computing node in the following resource scheduling method.
When the information management apparatus is a metadata server, the processor 31 is configured to receive, through the communication interface 33, a location information request sent by a first computing node, where the location information request includes a data identifier, and the location information request is used to request, from the metadata server, information of a data center and a computing node where data corresponding to the data identifier is located, where the data corresponding to the data identifier is data required to execute the task, and the first computing node belongs to the first data center; and sending a position information response to the first computing node through a communication interface 33 according to the position information request, wherein the position information response comprises the identification of the data center where the data corresponding to the data identification is located and the identification of the computing node.
In this implementation, the task deployment system provides a location information query service through the metadata server: the computing nodes of each data center can obtain the location of the data required by a task by sending a request to the metadata server, which provides the basis for task deployment across data centers.
When the information management apparatus is a metadata server, the processor 31 is further configured to execute the steps performed by the metadata server in the resource scheduling method described later.
That the processor 31 sends and receives messages or requests through the communication interface 33 means that the processor 31 sends control instructions to the communication interface 33, so that the communication interface 33 sends and receives the messages or requests.
Figs. 3A to 3C are flowcharts of a resource scheduling method according to an embodiment of the present invention. The method is implemented based on the DC cluster shown in Fig. 1; Figs. 3B and 3C show only the interaction between the nodes (devices) in the DC cluster. Referring to Figs. 3A to 3C, the method includes the following steps:
200: the RM node of the first DC receives an application submitted by a client and allocates resources for the application in the compute nodes of the first DC to run the application.
Step 200 is performed by the ASM in the RM node. After receiving an application submitted by an application client (Client for short), the ASM negotiates with the DC scheduler in the RM node to allocate resources to run the application and to create an AM for it.
201: the compute node of the first DC determines the task to be performed by the application.
Each application program includes an AM. The AM divides the job executed by the application into multiple tasks, which may run in parallel or serially; the AM is the scheduling manager of the tasks the application currently needs to execute. After splitting the job into multiple tasks, the AM needs to acquire resources for each task in order to execute it; the specific process is described later.
The computing node in step 201 is a computing node running the application program.
202: the compute node of the first DC sends a location information request to a metadata server.
The location information request includes a data identifier; it is used to request from the metadata server the information of the DC and compute node where the data corresponding to the data identifier is located, that data being the data required for executing the task.
Accordingly, the metadata server receives the location information request sent by the compute node of the first DC. As shown in Figs. 3B and 3C, DC1 is the first DC.
In the embodiment of the present invention, the computing node of the first DC may have one or more tasks to execute, and the data to be processed by each task may include one or more files, which may be stored in the same DC or in different DCs. The data identifier in the location information request may be a file name, such as file x, which may be specified by the user via an interface parameter when the application is submitted in step 200.
In the embodiment of the invention, the metadata server provides interfaces for the DCs to query the location information of data. For example, the metadata server provides two application programming interfaces (APIs) for the DCs to obtain the location information of data.
One interface is Map<String, FileStatus> getCrossDCFileStatus(Path path), which obtains the information of the DC where the data is located; its return type is Map<String, FileStatus>. In this interface, String is a file name; FileStatus is an object containing file attributes such as the number of replicas, the DC where the file is located, the file creation time, and the file state; Map is a collection, and one Map may contain multiple files with their corresponding attributes; the Path parameter is used to indicate whether a DC is specified, or which DC is specified.
The other interface is Map<String, LocatedFileStatus> getCrossDCLocatedFileStatus(Path path), which obtains both the information of the DC where the data is located and the information of the compute node; its return type is Map<String, LocatedFileStatus>. In this interface, LocatedFileStatus differs from FileStatus in that it can also include the information of the compute node where the file is located.
In practical application, a compute node can directly use the second interface to query the DC and compute node where the data is located; it can also first query the DC with the first interface and then query the compute node where the file is located with the second interface. The second approach avoids the risk that a flood of location information requests querying the compute-node information of data at the same time brings down the system.
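A usage sketch of the two-step approach, reusing the CrossDCMetadataService interface sketched earlier; the connection step is left abstract because the patent does not specify it.

    // Illustrative two-step lookup (the patent's lighter-weight second approach).
    import java.util.Map;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.LocatedFileStatus;
    import org.apache.hadoop.fs.Path;

    public class LocationLookupExample {
        public static void main(String[] args) throws Exception {
            CrossDCMetadataService metadata = connect();
            Path file = new Path("/opt/data/x");

            // Step 1 (coarse): which DC holds the file?
            Map<String, FileStatus> byDc = metadata.getCrossDCFileStatus(file);
            byDc.forEach((name, status) -> System.out.println(name + " in DC: " + status));

            // Step 2 (fine): which compute node(s) hold the file?
            Map<String, LocatedFileStatus> byNode = metadata.getCrossDCLocatedFileStatus(file);
            byNode.forEach((name, status) -> System.out.println(name + " on node: " + status));
        }

        private static CrossDCMetadataService connect() {
            // Placeholder: connection details are not specified in the patent.
            throw new UnsupportedOperationException("connection details not specified");
        }
    }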
203: the metadata server determines location information of data required to be processed by a task to be executed by a compute node of the first DC according to the location information request.
The location information of the data includes the information of the DC where the data is located and the information of the compute node; these may be the identifier of the DC and the identifier of the compute node. For example, if the DC information is DC1 and the compute node information is N3, the data is located on compute node N3 in DC1.
When the metadata server determines the location information of the data, there are two cases:
in the first case, the metadata server has stored therein the complete location information of the data, including information of the DC where the data is located and information of the compute node. In the second case, the metadata server does not store the location information of the data, or does not store the complete location information of the data, such as only the information of the DC where the data is located, and the information of the computing node where the data is not stored. When the information stored in the metadata server is insufficient to answer the location information request, then further location information for the data needs to be requested from the respective DCs.
In either case, the location information already stored in the metadata server can have been obtained in two ways: in the first way, each DC actively reports it (i.e., uploads it to the metadata server); in the second way, the metadata server requests the location information of the data from each DC and stores what it receives.
The detailed process of the first way is explained with the flowcharts provided in fig. 4A and 4B, where fig. 4B shows only the interaction between devices. Referring to fig. 4A and 4B, the flow includes:
2031: the distributed file system client connects to the master node of the DC's distributed file system, creating metadata for the data.
The distributed file system client is a client that accesses the distributed file system in the DC. It is usually a Linux machine on which distributed file system client software is installed; through this software, the client can log in under a distributed file system management role and thereby carry out steps 2031 and 2032.
In the application, the distributed file system client and the application client may be two independent machines, or may be implemented by installing different client programs in one machine.
The metadata of the data in step 2031 includes the location information of the data. For example, the metadata written by the distributed file system client in fig. 4B includes the location information of the data: /opt/data/x.
2032: the distributed file system client writes the data, i.e., the file, to the computing nodes of the DC.
2033: the computing nodes of the DCs write the data to other DCs via a data migration Tool (Replication Tool), e.g., a first DC writes the data to a second DC connected to the first DC.
The data migration tool can be started manually through the distributed file system client, or invoked and started through a designed calling interface.
It should be noted that, the above-mentioned data migration process is an optional step, and whether to perform data migration may be configured in advance.
In this embodiment, the data is synchronized to the other DCs after it is created, which keeps data sharing across the DCs up to date.
For example, the data migration tool in fig. 4B migrates the data from DC1 to DC2; the path is /opt/data/x in DC1 and /opt/dc2data/x in DC2.
2034: the master node of the DC reports a data synchronization message for the data to the metadata server, the data synchronization message including the path before the data synchronization and the path after the data synchronization. From this data synchronization message, the metadata server obtains the DC part of the data's location information.
For example, the data is migrated (i.e., copied) from DC1 to DC2, with a path of /opt/data/x in DC1 and a path of /opt/dc2data/x in DC2. From the pre-synchronization path and the post-synchronization path in the synchronization message, the metadata server can record that the DCs where file x is located include DC1 and DC2.
The master node of the DC provides a PUT REST API interface, which is configured to send a data synchronization message for each metadata operation to the metadata server. The data synchronization message is used to indicate the data that is changed by a data operation in the third DC; besides synchronizing data to other DCs, it may also indicate operations such as creating data and deleting data.
In addition, the data synchronization message generated when data is synchronized can also be uploaded to the metadata server directly by the data migration tool.
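As a sketch of such an upload (the endpoint path and the JSON field names are assumptions, not part of this embodiment), the PUT of a data synchronization message could look as follows in Java 11+:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class SyncMessageSender {
        public static void main(String[] args) throws Exception {
            // Hypothetical REST endpoint on the metadata server.
            URI endpoint = URI.create("http://metadata-server:8080/api/v1/sync");

            // Pre- and post-synchronization paths for file x, matching the
            // example above (migration from DC1 to DC2).
            String body = "{\"operation\":\"SYNC\","
                    + "\"prePath\":\"dc1:/opt/data/x\","
                    + "\"postPath\":\"dc2:/opt/dc2data/x\"}";

            HttpRequest request = HttpRequest.newBuilder(endpoint)
                    .header("Content-Type", "application/json")
                    .PUT(HttpRequest.BodyPublishers.ofString(body))
                    .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("metadata server replied: " + response.statusCode());
        }
    }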
The detailed process of the second mode is described below by the flowcharts provided in fig. 4B, 5 and 6, and referring to fig. 4B, 5 and 6, the second mode includes two cases: in the first case, the metadata server knows the DC where the data is located, and the flow is shown in fig. 4B and 5:
2035: the metadata server determines information of the DC where data to be processed by the compute node of the first DC is located according to the location information request.
2036: the metadata server sends a location information query request to the master node of the DC corresponding to the determined DC information.
In an embodiment of the present invention, the data required to be processed by the computing node of the first DC to execute the task may be one or more files, and the location information query request includes a data identifier, which may be a name of a file.
2037: the master node of the DC corresponding to the determined DC information returns a location information query response to the metadata server, where the response includes the information of the computing node, such as the identifier of the computing node.
In the second case, the metadata server does not know the DC where the data is located, and the flow is as shown in fig. 6:
2038: the metadata server sends a location information query request to the master node of each DC.
In an embodiment of the present invention, the data required to be processed by the computing node of the first DC to execute the task may include one or more files, and the location information query request includes the data identifier.
2039: the master node of each DC returns a location information query response to the metadata server. The response includes the information of the DC where the data corresponding to the data identifier is located, such as the identifier of the DC; or it includes both the information of that DC and the information of the computing node, such as the identifier of the DC and the identifier of the computing node; or it indicates that no data corresponding to the data identifier exists in that DC.
In both of the above cases, the metadata server may send the request to the master node of the DC through the REST API interface.
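A compact sketch of the fan-out in the second case, where the metadata server does not know the DC holding the data and therefore asks every DC master node (DcMasterClient and its method names are illustrative, not a real API):

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class LocationResolver {

        // Hypothetical client for one DC's master node.
        public interface DcMasterClient {
            String dcId();
            // Compute nodes holding 'fileName', or an empty list if the DC
            // does not hold the file.
            List<String> queryNodes(String fileName);
        }

        // Ask every DC master node and merge the answers into a map of
        // DC identifier -> identifiers of the computing nodes.
        public Map<String, List<String>> resolve(List<DcMasterClient> masters,
                                                 String fileName) {
            Map<String, List<String>> locations = new HashMap<>();
            for (DcMasterClient master : masters) {
                List<String> nodes = master.queryNodes(fileName);
                if (!nodes.isEmpty()) {
                    locations.put(master.dcId(), nodes);
                }
            }
            return locations; // empty map: no DC holds the file
        }
    }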
Through the above process, the metadata server obtains the location information of the data.
In the embodiment of the present invention, the metadata server may adopt a QJM (Quorum Journal Manager) mechanism to implement information synchronization between the primary metadata server and the backup metadata server. For example, the primary metadata server records each metadata operation to an edit file on a journal node, and the backup metadata server can read this information at any time. Both the primary and backup metadata servers implement a periodic checkpoint mechanism: after the records in memory are flushed to the image file in the file system, a checkpoint operation is performed. The backup metadata server can periodically load the edit file into memory, ensuring a fast start during an active-standby switchover.
204: the metadata server sends a location information response to the computing node of the first DC, where the location information response includes the information of the DC and of the computing node where the data to be processed by the task to be executed by the computing node of the first DC is located.
Accordingly, the computing node of the first DC receives the location information response sent by the metadata server.
In the embodiment of the present invention, the metadata server returns a corresponding return value according to the interface type called by the first DC, where the return value may adopt a JavaScript Object Notation (JSON) format.
For example, in step 204 of fig. 3B, the location information response may carry the location information of file x: DC1={B1@N1,N4, B2@...}, DC2={B1@N3,N5, B2@...}. This location information indicates that block B1 of file x is located on nodes N1 and N4 of DC1 and on nodes N3 and N5 of DC2, and so on.
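Rendered in the JSON format mentioned above, such a return value might look like the following (the field names are assumptions for illustration):

    {
      "file": "x",
      "locations": {
        "DC1": { "B1": ["N1", "N4"], "B2": ["..."] },
        "DC2": { "B1": ["N3", "N5"], "B2": ["..."] }
      }
    }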
205: the compute nodes of the first DC send first task deployment requests to RM nodes of the first DC, the first task deployment requests including an identification of tasks to be performed by the compute nodes of the first DC, an identification of the DC, and an identification of the compute nodes.
Accordingly, the RM node of the first DC receives the first task deployment request sent by the compute nodes of the first DC.
The first task deployment request may carry the identifier of at least one task, and the identifier of each task may correspond to the identifiers of multiple DCs and computing nodes. When the identifier of one task corresponds to the identifiers of multiple DCs, one of those DCs must be selected according to the resource allocation policy to deploy the task.
Optionally, the first task deployment request may further include a resource allocation policy, which may be specified by the user through an interface parameter when the job is submitted. The resource allocation policy is used to determine, from the DCs corresponding to the identifier of the task to be executed, the DC that deploys the task; that is, the data to be processed by the task may be stored in multiple DCs at the same time, and one of those DCs is selected to deploy the task using the resource allocation policy. The resource allocation policy may include a scheduling policy, which may be any one of the following:
1) Idle DC (RELAX_DC) policy: in the event that the task cannot be deployed on a computing node of the first DC, the task is allowed to be deployed on any computing node of a DC that has free resources. In other words, if the resource localization of the task deployment request cannot be satisfied, the task deployment request is allowed to be scheduled to any DC that has idle resources.
In this embodiment, resource localization refers to allocating resources for the task to be executed on the computing node that sent the task deployment request. When resource localization cannot be satisfied, the task may be scheduled to any DC with free resources, not limited to the DC where the computing node that needs to acquire the resources is located.
Optionally, a timeout may be set in the policy; when the timeout is exceeded and the resource localization of the task deployment request still cannot be satisfied, the task deployment request is allowed to be dispatched to any DC that has idle resources.
2) Local rack first (LOCAL_RACK_FIRST) policy: in the event that the task cannot be deployed on a computing node of the first DC, the task is preferentially deployed on other computing nodes of the first DC that are on the same rack as the first computing node. In other words, if the resource localization of the task deployment request cannot be satisfied, the task is preferentially scheduled to other computing nodes on the same rack of the same DC.
In an embodiment of the present invention, each DC includes a plurality of racks, each rack has a plurality of compute nodes disposed thereon, and each compute node may have a plurality of hard disks.
In this policy, when resource localization cannot be satisfied, the task is preferentially scheduled to other computing nodes on the same rack of the same DC; only when those nodes cannot accommodate the task is it allowed to be scheduled to computing nodes on other racks of the same DC, or to other DCs.
3) Local DC first (DC_LOCAL) policy: in the event that the task cannot be deployed on a computing node of the first DC, the task is preferentially deployed on other computing nodes of the first DC. In other words, if the resource localization of the task deployment request cannot be satisfied, the task is preferentially scheduled to other computing nodes of the same DC.
In this policy, when resource localization cannot be satisfied, the task is preferentially scheduled to other computing nodes of the same DC; only when those nodes cannot accommodate the task is it allowed to be scheduled to other DCs.
In other embodiments, the resource localization may also refer to allocating resources for the task at a computing node of the rack where the computing node that needs to acquire the resources is located, or allocating resources for the task at a computing node of the DC where the computing node that needs to acquire the resources is located.
Further, to facilitate resource localization across DCs, localization information may be specified in the first task deployment request by extending the AM protocol of HADOOP. For example, a new field in the AM protocol carries the localization information: node indicates that only the local computing node may be selected, rack indicates that all computing nodes of the rack may be selected, and any indicates that any computing node in the DC may be a target of selection.
Second, the AM protocol can be extended to allow the scheduling policy, e.g., DC_LOCAL, to be specified in the first task deployment request through a string expression.
In addition, the resource allocation policy may also define a selection range or selection conditions for the resources, according to which the DC is chosen. For example, the resource selection range may be nodes 1-10 of DC1, and the selection condition may be the computing nodes of a DC equipped with GPUs or large memory, and the like.
The resource allocation policy may further include other rules. For example, when two or more DCs are determined using the scheduling policy, a DC may be selected according to at least one of the following attributes: the priority of the two or more DCs, the capacity or proportion of the free resources of each of the two or more DCs, and the network bandwidth between the RM node of each of the two or more DCs and the first RM node. The other rules may be carried in the first task deployment request as part of the resource allocation policy, or may be predefined.
When the resource allocation policy is not included in the first task deployment request, the RM node may employ a default resource allocation policy for processing.
For example, in step 205 of fig. 3B, the first task deployment request may carry the following information: task1: N3@DC1-P:0, N8@DC2-P:0; task2: N1@DC3-P:0, Policy=RELAX_DC. This information indicates that the request asks for resources for two tasks, task1 and task2; the location information of the data to be processed by task1 is node N3 of DC1 and node N8 of DC2; the location information of the data to be processed by task2 is node N1 of DC3; the resource allocation policy of task2 is designated as the idle DC policy; and the priorities of DC1, DC2 and DC3 are all P:0.
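The following Java sketch shows one way such a request could be encoded. The enum constants mirror the policy names above; the class, field names, and the timeout field are assumptions for illustration, not the format defined by this embodiment.

    // Illustrative encoding of a task deployment request.
    public class TaskDeploymentRequest {

        public enum SchedulingPolicy { RELAX_DC, LOCAL_RACK_FIRST, DC_LOCAL }

        final String taskId;             // e.g. "task2"
        final String[] candidateNodes;   // e.g. {"N1@DC3"}
        final int priority;              // e.g. 0
        final SchedulingPolicy policy;   // null -> RM uses its default policy
        final long relaxTimeoutMillis;   // optional wait before relaxing locality

        public TaskDeploymentRequest(String taskId, String[] candidateNodes,
                                     int priority, SchedulingPolicy policy,
                                     long relaxTimeoutMillis) {
            this.taskId = taskId;
            this.candidateNodes = candidateNodes;
            this.priority = priority;
            this.policy = policy;
            this.relaxTimeoutMillis = relaxTimeoutMillis;
        }
    }

For instance, task2 above would map to new TaskDeploymentRequest("task2", new String[]{"N1@DC3"}, 0, TaskDeploymentRequest.SchedulingPolicy.RELAX_DC, 30_000L), with the timeout value chosen by the submitter.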
206: the RM node of the first DC determines, according to the first task deployment request, the DC for deploying the task in the request, that is, the DC corresponding to the task to be executed, so that resources are allocated to the task through that DC. When it is determined that the local DC allocates the resources to the task to be executed, the process is shown in step 207; when it is determined that the resources are allocated through another DC, i.e., the identifier of the DC indicates a DC other than the first DC, the process refers to step 208 and the subsequent steps.
Since the first task deployment request may simultaneously request allocation of resources for two or more tasks, step 207 and step 208 may be performed simultaneously.
Wherein the first task deployment request includes an identification of a plurality of DCs, the identification of the plurality of DCs being used to indicate a plurality of DCs capable of deploying the task, the method further comprising:
determining a DC from the plurality of DCs to deploy the task according to a resource allocation policy.
In the embodiment of the present invention, the RM node of the first DC determines the processing mode of the first task deployment request as follows:
and the RM node of the first DC screens out the DC meeting the requirement according to a scheduling strategy in the resource allocation strategies.
When two or more satisfactory DCs are screened out, the RM node of the first DC sorts them according to the priority of the DCs, the capacity or proportion of their free resources, the network bandwidth between each screened-out DC and the first DC, and the like, and selects the top-ranked DC as the DC for processing the first task deployment request.
When the two or more top-ranked DCs have the same priority, the same capacity or proportion of free resources, or the same network bandwidth, one of them is selected at random as the DC for processing the first task deployment request, or the DC that responds first is selected. Here, responding first means being the DC corresponding to the first heartbeat message received during the allocation process.
Of course, in other implementations, when two or more satisfactory DCs are screened out, one of them may be selected at random directly as the DC for processing the first task deployment request, or the DC that responds first may be selected directly.
Finally determining a DC according to the above-mentioned manner, and when the DC is the first DC, performing step 207; when the DC is other than the first DC, step 208 is performed.
In the above process, if the DCs are to be sorted by the size of their available resources, the RM node of the first DC needs to first acquire a resource usage message from each DC, where the resource usage message indicates the usage of resources in that DC and is transmitted through the heartbeat messages between DCs.
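A compact Java sketch of this filter-then-sort selection (the candidate type and its fields are illustrative, assembled from heartbeat messages as described above):

    import java.util.Comparator;
    import java.util.List;
    import java.util.Optional;

    // Illustrative per-DC view built from heartbeat messages.
    class DcCandidate {
        String dcId;
        int priority;              // smaller value = higher priority (assumed)
        double freeResourceRatio;  // proportion of idle resources
        double bandwidthToFirstDc; // network bandwidth to the first DC
        boolean matchesPolicy;     // result of the scheduling-policy filter
    }

    class DcSelector {
        // Filter by scheduling policy, then order by priority, free
        // resources, and bandwidth; remaining ties could be broken at
        // random or by the first heartbeat received.
        Optional<DcCandidate> select(List<DcCandidate> candidates) {
            return candidates.stream()
                    .filter(c -> c.matchesPolicy)
                    .min(Comparator.comparingInt((DcCandidate c) -> c.priority)
                            .thenComparingDouble(c -> -c.freeResourceRatio)
                            .thenComparingDouble(c -> -c.bandwidthToFirstDc));
        }
    }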
207: the RM node of the first DC allocates resources for the task to be executed according to the first task deployment request.
In the embodiment of the present invention, when the first task deployment request is to be processed locally, it may be sent directly to the local scheduler for processing. The local scheduler allocates resources for the task corresponding to the first task deployment request; the resources may be a container, in which case the task can only use the resources described by the container.
208: the RM node of the first DC sends a second task deployment request to the RM node of a second DC, the second task deployment request for deploying the task to the second DC, the second task deployment request comprising an identification of the second DC, an identification of the task, and an identification of a compute node in the second DC.
Accordingly, the RM node of the second DC receives the second task deployment request sent by the RM node of the first DC. The second task deployment request may be the same as or different from the first task deployment request. For example, the first task deployment request corresponds to two tasks, and the second task deployment request corresponds to only one task, or the first task deployment request carries a resource allocation policy, and the second task deployment request does not carry a resource allocation policy.
The second DC is the DC determined in step 206. There may be one, two, or more second DCs; when there are multiple second DCs, each of them performs the method flow of steps 208 to 210. As shown in fig. 3B and 3C, both DC2 and DC3 are second DCs.
In an embodiment of the present invention, the second task deployment request is a heartbeat message between the first RM node and the second RM node, where the heartbeat message includes an identifier of the DC, an identifier of the task, and an identifier of a computing node in the DC. The RM node of the first DC may send the second task deployment request to the RM node of the second DC via a heartbeat message.
209: the RM node of the second DC schedules the resources requested by the second task deployment request.
When receiving a heartbeat message, the cross-DC scheduler in the RM node of the second DC checks whether the heartbeat message carries a task deployment request; if it does (i.e., the second task deployment request), the scheduler dispatches the request to the local scheduler for processing. The local scheduler allocates resources for the task to be executed according to the task deployment request, and generates a resource allocation message according to the allocated resources, where the resource allocation message is used to indicate that the task can be deployed to the second computing node. The resource allocation message may carry the resources allocated to the task, such as a container identifier.
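A minimal sketch of this heartbeat handling (the types, method names, and message shape are assumptions for illustration):

    // Illustrative cross-DC scheduler in the second DC's RM node.
    class CrossDcScheduler {

        interface LocalScheduler {
            // Allocates a container for the task on the given node and
            // returns the container identifier.
            String allocateContainer(String taskId, String nodeId);
        }

        private final LocalScheduler localScheduler;

        CrossDcScheduler(LocalScheduler localScheduler) {
            this.localScheduler = localScheduler;
        }

        // Returns a resource allocation message, or null when the
        // heartbeat carries no embedded task deployment request.
        String onHeartbeat(String taskId, String nodeId,
                           boolean hasDeploymentRequest) {
            if (!hasDeploymentRequest) {
                return null; // plain liveness heartbeat
            }
            String containerId = localScheduler.allocateContainer(taskId, nodeId);
            return "ALLOCATED " + containerId + "@" + nodeId;
        }
    }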
210: the RM node of the second DC sends a resource allocation message to the RM node of the first DC.
Accordingly, the RM node of the first DC receives the resource allocation message transmitted by the RM node of the second DC.
In the embodiment of the present invention, the resource allocation message is transmitted through a heartbeat message.
For example, the RM node of DC2 in fig. 3C sends a resource allocation message to the RM node of DC1, the resource allocation message including N8@DC2, which indicates where the resources allocated to the computing node of DC1 are located.
211: the RM node of the first DC sends resource allocation messages to the compute nodes of the first DC.
Accordingly, the compute node of the first DC receives the resource allocation message sent by the RM node of the first DC.
212: a computing node of the first DC sends a task execution request to a computing node of the second DC, where the task execution request includes the identifier of the task and the data identifier, and the data identifier is used to indicate the data needed by the second computing node to execute the task.
Accordingly, a computing node of a second DC receives a task execution request sent by a computing node of the first DC.
Optionally, the task execution request may further include a container identifier.
In the embodiment of the present invention, the computing node of the first DC requests, according to the resource allocation message, the computing node where the resources are located to execute the task to be executed; the computing node where the resources are located is the computing node holding the resources allocated to the task. That is, the computing node of the first DC requests the computing node of the second DC to start the allocated container to execute the task.
213: the computing node of the second DC executes the task corresponding to the task execution request using the resources allocated to that task, and generates a task execution result.
The task execution result is the result obtained by the computing node from executing the task.
During execution of the task, the RM node of the second DC sends running state information to the RM node of the first DC, where the running state information is the running state of the container executing the task; the RM node of the first DC receives the running state information sent by the RM node of the second DC.
214: the compute nodes of the second DC return task execution results to the compute nodes of the first DC.
Accordingly, the computing nodes of the first DC receive the task execution results sent by the computing nodes of the second DC.
An embodiment of the present invention provides a resource scheduling apparatus, which is applied to a first resource manager node of a first data center, where the first data center further includes a first computing node, and referring to fig. 7, the resource scheduling apparatus includes:
the resource scheduling means may be implemented as a dedicated hardware circuit, or a combination of hardware and software, forming all or part of a resource manager node. The resource scheduling apparatus includes: a receiving unit 301 and a transmitting unit 302. The receiving unit 301 is configured to receive a first task deployment request sent by the first computing node, where the first task deployment request includes an identifier of a task, an identifier of a data center, and an identifier of a computing node in the data center, and the identifier of the data center is used to indicate the data center in which the task is to be deployed; a sending unit 302, configured to send, to a resource manager node in a data center corresponding to an identifier of the data center, a second task deployment request when the identifier of the data center indicates a data center other than the first data center, where the second task deployment request is used to deploy the task to the data center, and the second task deployment request includes the identifier of the data center, the identifier of the task, and an identifier of a computing node in the data center.
Optionally, the first task deployment request includes identifiers of a plurality of data centers, where the identifiers of the plurality of data centers are used to indicate a plurality of data centers capable of deploying the task, and the apparatus further includes a determining unit 303 configured to determine, according to a resource allocation policy, a data center from the plurality of data centers to deploy the task.
Optionally, the resource allocation policy includes a scheduling policy, and the scheduling policy is any one of the following:
in the case that the task cannot be deployed on the first computing node, allowing the task to be deployed on any computing node of the data center with free resources;
in the event that the task cannot be deployed on the first computing node, allowing the task to be preferentially deployed on other computing nodes of the first data center that are on the same rack as the first computing node;
in the event that the task cannot be deployed on the first computing node, allowing the task to be preferentially deployed on other computing nodes of the first datacenter.
Optionally, the resource allocation policy further includes, when two or more data centers are determined by using the scheduling policy, selecting a data center according to attribute information of at least one of the following data centers: a priority of two or more data centers, a capacity or proportion of free resources of each of the two or more data centers, a size of a network bandwidth between a resource manager node of each of the two or more data centers and the first resource manager node.
Optionally, the second task deployment request is a heartbeat message between the first resource manager node and the second resource manager node, where the heartbeat message includes an identifier of the data center, an identifier of the task, and an identifier of a computing node in the data center.
Optionally, the identification of the data center is an identification of a second data center, the identification of a computing node in the data center is an identification of a second computing node,
the receiving unit 301 is further configured to receive a first resource allocation message from a second resource manager node in the second data center, where the first resource allocation message is used to indicate that the task may be deployed to the second computing node;
the sending unit 302 is further configured to send a task execution request to the second computing node, where the task execution request includes an identifier of the task and a data identifier, and the data identifier is used to indicate data required by the second computing node to execute the task.
Optionally, the receiving unit 301 is further configured to receive a third task deployment request sent by a third resource manager node, where the third task deployment request includes an identifier of the first data center, an identifier of the first task, and an identifier of a third computing node in the first data center, and the third resource manager node belongs to the third data center;
the sending unit 302 is further configured to send a second resource allocation message to the third resource manager node, where the second resource allocation message is used to indicate that the first task may be deployed to the third computing node.
For related details, refer to the method embodiments described with reference to fig. 3A-6.
It should be noted that the determining unit 303 may be implemented by a processor, or by a processor executing program instructions in a memory. The receiving unit 301 and the sending unit 302 may be implemented by a communication interface, or by a communication interface in combination with a processor.
An embodiment of the present invention provides a resource scheduling apparatus, which is applied to a first computing node of a first data center, where the first data center further includes a first resource manager node, and referring to fig. 8, the resource scheduling apparatus includes:
the resource scheduling means may be implemented as dedicated hardware circuitry, or a combination of hardware and software, forming all or part of a compute node. The resource scheduling apparatus includes: a transmitting unit 401 and a receiving unit 402. The sending unit 401 is configured to send a location information request to a metadata server, where the location information request includes a data identifier, the location information request is used to request, from the metadata server, information of a data center and a computing node where data corresponding to the data identifier is located, and the data corresponding to the data identifier is data required for executing the task; a receiving unit 402, configured to receive a location information response sent by the metadata server, where the location information response includes an identifier of a second data center and an identifier of a second computing node, and the second computing node belongs to the second data center; the sending unit 401 is further configured to send a first task deployment request to the first resource manager node, where the first task deployment request includes an identifier of a task, an identifier of a data center, and an identifier of a computing node in the data center, and the identifier of the data center is used to indicate the data center in which the task is to be deployed; the receiving unit 402 is further configured to receive a first resource allocation message sent by the first resource manager node, where the first resource allocation message is used to indicate that the task may be deployed to the second computing node.
Optionally, the sending unit 401 is further configured to send a task execution request to the second computing node, where the task execution request includes an identifier of the task and a data identifier, and the data identifier is used to indicate data required by the second computing node to execute the task.
For related details, refer to the method embodiments described with reference to fig. 3A-6.
It should be noted that the receiving unit 402 and the sending unit 401 may be implemented by a communication interface, or by a communication interface in combination with a processor.
An embodiment of the present invention provides a resource scheduling apparatus, where the resource scheduling apparatus may be the metadata server, and referring to fig. 9, the resource scheduling apparatus includes:
the resource scheduling means may be implemented as a dedicated hardware circuit, or a combination of hardware and software, which may be implemented as all or part of the metadata server. The resource scheduling apparatus includes: a receiving unit 501 and a transmitting unit 502. The receiving unit 501 is configured to receive a location information request sent by a first computing node, where the location information request includes a data identifier, the location information request is used to request, from the metadata server, information of a data center and a computing node where data corresponding to the data identifier is located, the data corresponding to the data identifier is data required for executing the task, and the first computing node belongs to the first data center; a sending unit 502, configured to send, according to the location information request, a location information response to the first computing node, where the location information response includes an identifier of a data center where data corresponding to the data identifier is located and an identifier of the computing node.
Optionally, the apparatus further comprises: the determining unit 503 is configured to determine a second data center according to the data identifier, where the second data center is a data center where data corresponding to the data identifier is located; the sending unit 502 is further configured to send a location information query request to a master node of the second data center, where the location information query request includes a data identifier, and the master node of the second data center is a master node of a distributed file system of the second data center; the receiving unit 501 is further configured to receive a location information query response returned by the master node of the second data center, where the location information query response includes an identifier of a computing node where data corresponding to the data identifier is located; or, the sending unit 502 is further configured to send a location information query request to a master node of a plurality of data centers, where the location information query request includes a data identifier, and the master node of each of the plurality of data centers is a master node of a distributed file system of each data center; the receiving unit 501 is further configured to receive a location information query response returned by the first host node, where the location information query response includes at least one of the following information: the data center and the computing node where the data corresponding to the data identifier is located, or the location information query response is used to indicate that the data corresponding to the data identifier does not exist in the data center where the first master node is located, where the first master node is a master node of a distributed file system of any one of the multiple data centers.
Optionally, the receiving unit 501 is further configured to receive a data synchronization message sent by a master node of a third data center, where the data synchronization message is used to indicate first data changed by a data operation in the third data center, and the data operation includes at least one of: creating data, deleting data and synchronizing data to other data centers, wherein a main node of the third data center is a main node of a distributed file system of the third data center;
the device further comprises: the processing unit 504 is configured to record, according to the data synchronization message, a data center to which the first data corresponds after being changed.
For related details, refer to the method embodiments described with reference to fig. 3A-6.
It should be noted that the determining unit 503 and the processing unit 504 may be implemented by a processor or a processor executing program instructions in a memory. The receiving unit 501 and the sending unit 502 may be implemented by a communication interface or a communication interface in combination with a processor.
The above embodiments may be implemented entirely or partially by software, hardware, or a combination thereof. When implemented using a software program, they may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the invention are produced, in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another by wire (e.g., coaxial cable, twisted pair, optical fiber) or wirelessly (e.g., infrared, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that incorporates one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium, or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (34)

1. A task deployment system is characterized by comprising a first computing node and a first resource manager node, wherein the first computing node and the first resource manager node belong to a first data center;
the first computing node is configured to send a first task deployment request to the first resource manager node, where the first task deployment request includes an identifier of a task, an identifier of a data center, and an identifier of a computing node in the data center, and the identifier of the data center is used to indicate the data center in which the task is to be deployed;
the first resource manager node is configured to receive a first task deployment request sent by the first computing node; sending a second task deployment request to a resource manager node in a data center corresponding to the identification of the data center other than the first data center, in the case that the identification of the data center indicates the data center other than the first data center, the second task deployment request being used for deploying the task to the data center other than the first data center, the second task deployment request including the identification of the data center other than the first data center, the identification of the task, and the identification of a computing node in the data center other than the first data center.
2. The task deployment system of claim 1, further comprising a metadata server; the identity of the data center is the identity of a second data center, the identity of a compute node in the data center is the identity of a second compute node,
the first computing node is further configured to send a location information request to the metadata server, where the location information request includes a data identifier, the location information request is used to request, from the metadata server, information of a data center and a computing node where data corresponding to the data identifier is located, and the data corresponding to the data identifier is data required for executing the task;
the metadata server is used for receiving the position information request; sending a location information response to the first computing node according to the location information request, wherein the location information response comprises the identifier of the second data center and the identifier of the second computing node;
the first computing node is configured to receive the location information response.
3. The task deployment system of claim 1, wherein the identification of the data center is an identification of a second data center, the identification of a computing node in the data center is an identification of a second computing node, the first resource manager node is configured to receive a first resource allocation message from the second resource manager node in the second data center, the first resource allocation message is configured to indicate that the task can be deployed to the second computing node;
the first computing node is configured to send a task execution request to the second computing node, where the task execution request includes an identifier of the task and a data identifier, and the data identifier is used to indicate data required for executing the task in the second computing node.
4. The task deployment system according to any one of claims 1 to 3, wherein the first task deployment request includes an identification of a plurality of data centers, the identification of the plurality of data centers being used to indicate a plurality of data centers that can deploy the task, the first resource management node being further used to determine, from the plurality of data centers, a data center that will deploy the task according to a resource allocation policy.
5. A method for resource scheduling, the method being used for a first resource manager node of a first data center, the first data center further including a first compute node, the method comprising:
receiving a first task deployment request sent by the first computing node, wherein the first task deployment request comprises an identifier of a task, an identifier of a data center and an identifier of a computing node in the data center, and the identifier of the data center is used for indicating the data center where the task is to be deployed;
sending a second task deployment request to a resource manager node in a data center corresponding to the identification of the data center other than the first data center, in the case that the identification of the data center indicates the data center other than the first data center, the second task deployment request being used for deploying the task to the data center other than the first data center, the second task deployment request including the identification of the data center other than the first data center, the identification of the task, and the identification of a computing node in the data center other than the first data center.
6. The method of claim 5, wherein the first task deployment request includes an identification of a plurality of data centers indicating a plurality of data centers that can deploy the task, the method further comprising:
determining a data center from the plurality of data centers that will deploy the task according to a resource allocation policy.
7. The method of claim 6, wherein the resource allocation policy comprises a scheduling policy, and wherein the scheduling policy is any one of:
in the case that the task cannot be deployed on the first computing node, allowing the task to be deployed on any computing node of the data center with free resources;
in the event that the task cannot be deployed on the first computing node, allowing the task to be preferentially deployed on other computing nodes of the first data center that are on the same rack as the first computing node;
in the event that the task cannot be deployed on the first computing node, allowing the task to be preferentially deployed on other computing nodes of the first datacenter.
8. The method of claim 7, wherein the resource allocation policy further comprises selecting a data center according to attribute information of at least one of the following data centers if two or more data centers are determined using the scheduling policy: a priority of two or more data centers, a capacity or proportion of free resources of each of the two or more data centers, a size of a network bandwidth between a resource manager node of each of the two or more data centers and the first resource manager node.
9. The method according to any of claims 5-8, wherein the second task deployment request is a heartbeat message between the first resource manager node and a second resource manager node, the heartbeat message including an identification of the data center, an identification of the task, and an identification of a computing node in the data center.
10. The method of any of claims 5-8, wherein the identification of the data center is an identification of a second data center, wherein the identification of a computing node in the data center is an identification of a second computing node, and wherein the method further comprises:
receiving a first resource allocation message from a second resource manager node in the second data center, the first resource allocation message indicating that the task may be deployed to the second compute node;
and sending a task execution request to the second computing node, wherein the task execution request comprises an identifier of the task and a data identifier, and the data identifier is used for indicating data required by the second computing node for executing the task.
11. The method according to any one of claims 5-8, further comprising:
receiving a third task deployment request sent by a third resource manager node, where the third task deployment request includes an identifier of the first data center, an identifier of the first task, and an identifier of a third computing node in the first data center, and the third resource manager node belongs to the third data center;
sending a second resource allocation message to the third resource manager node, the second resource allocation message indicating that the first task may be deployed to the third computing node.
12. A method for resource scheduling, the method being used in a first compute node of a first data center, the first data center further including a first resource manager node, the method comprising:
sending a position information request to a metadata server, wherein the position information request comprises a data identifier, the position information request is used for requesting the metadata server for information of a data center and a computing node where data corresponding to the data identifier are located, and the data corresponding to the data identifier is data required for executing a task;
receiving a location information response sent by the metadata server, wherein the location information response comprises an identifier of a second data center and an identifier of a second computing node, and the second computing node belongs to the second data center;
sending a first task deployment request to the first resource manager node, the first task deployment request including an identification of a task, an identification of a data center, and an identification of a compute node in the data center, the identification of the data center indicating the data center to deploy the task;
receiving a first resource allocation message sent by the first resource manager node, the first resource allocation message indicating that the task may be deployed to the second computing node.
13. The method of claim 12, further comprising:
and sending a task execution request to the second computing node, wherein the task execution request comprises an identifier of the task and a data identifier, and the data identifier is used for indicating data required by the second computing node for executing the task.
14. An information management method, characterized in that the method comprises:
the method comprises the steps that a metadata server receives a position information request sent by a first computing node, wherein the position information request comprises a data identifier, the position information request is used for requesting information of a data center and a computing node where data corresponding to the data identifier are located from the metadata server, the data corresponding to the data identifier are data required by task execution, and the first computing node belongs to a first data center;
and sending a position information response to the first computing node according to the position information request, wherein the position information response comprises the identification of the data center where the data corresponding to the data identification is located and the identification of the computing node.
15. The method of claim 14, further comprising:
determining a second data center according to the data identifier, wherein the second data center is a data center where data corresponding to the data identifier is located;
sending a position information query request to a master node of the second data center, wherein the position information query request comprises a data identifier, and the master node of the second data center is a master node of a distributed file system of the second data center;
receiving a position information query response returned by the main node of the second data center, wherein the position information query response comprises an identifier of a computing node where data corresponding to the data identifier is located;
alternatively, the method further comprises:
sending a position information query request to main nodes of a plurality of data centers, wherein the position information query request comprises a data identifier, and the main node of each data center in the plurality of data centers is the main node of a distributed file system of each data center;
receiving a location information query response returned by the first host node, wherein the location information query response comprises at least one of the following information: the data center and the computing node where the data corresponding to the data identifier is located, or the location information query response is used to indicate that the data corresponding to the data identifier does not exist in the data center where the first master node is located, where the first master node is a master node of a distributed file system of any one of the multiple data centers.
16. The method according to claim 14 or 15, characterized in that the method further comprises:
receiving a data synchronization message sent by a master node of a third data center, the data synchronization message being used for indicating first data changed by data operation in the third data center, the data operation including at least one of: creating data, deleting data and synchronizing data to other data centers, wherein a main node of the third data center is a main node of a distributed file system of the third data center;
and recording the data center corresponding to the changed first data according to the data synchronization message.
17. An apparatus for resource scheduling, applied to a first resource manager node of a first data center, the first data center further including a first computing node, the apparatus comprising:
a receiving unit, configured to receive a first task deployment request sent by the first computing node, where the first task deployment request includes an identifier of a task, an identifier of a data center, and an identifier of a computing node in the data center, and the identifier of the data center is used to indicate the data center in which the task is to be deployed;
a sending unit, configured to send, to a resource manager node in a data center corresponding to an identifier of a data center other than the first data center, a second task deployment request when the identifier of the data center indicates the data center other than the first data center, where the second task deployment request is used to deploy the task to the data center other than the first data center, and the second task deployment request includes the identifier of the data center other than the first data center, the identifier of the task, and an identifier of a computing node in the data center other than the first data center.
18. The apparatus according to claim 17, wherein the first task deployment request includes an identification of a plurality of data centers, the identification of the plurality of data centers indicating a plurality of data centers that can deploy the task, and the apparatus further comprises a determining unit configured to determine, from the plurality of data centers, a data center to deploy the task according to a resource allocation policy.
19. The apparatus of claim 18, wherein the resource allocation policy comprises a scheduling policy, and wherein the scheduling policy is any one of:
in the case that the task cannot be deployed on the first computing node, allowing the task to be deployed on any computing node of the data center with free resources;
in the event that the task cannot be deployed on the first computing node, allowing the task to be preferentially deployed on other computing nodes of the first data center that are on the same rack as the first computing node;
in the event that the task cannot be deployed on the first computing node, allowing the task to be preferentially deployed on other computing nodes of the first datacenter.
20. The apparatus of claim 19, wherein the resource allocation policy further comprises, in the case that two or more data centers are determined by using the scheduling policy, selecting a data center according to attribute information of at least one of the following data centers: a priority of two or more data centers, a capacity or proportion of free resources of each of the two or more data centers, a size of a network bandwidth between a resource manager node of each of the two or more data centers and the first resource manager node.
21. The apparatus of any of claims 17-20, wherein the second task deployment request is a heartbeat message between the first resource manager node and a second resource manager node, the heartbeat message including an identification of the data center, an identification of the task, and an identification of a compute node in the data center.
22. The apparatus of any of claims 17-20, wherein the identification of the data center is an identification of a second data center, wherein the identification of a computing node in the data center is an identification of a second computing node,
the receiving unit is further configured to receive a first resource allocation message from a second resource manager node in the second data center, where the first resource allocation message is used to indicate that the task may be deployed to the second computing node;
the sending unit is further configured to send a task execution request to the second computing node, where the task execution request includes an identifier of the task and a data identifier, and the data identifier is used to indicate data required for executing the task in the second computing node.
23. The apparatus according to any of claims 17-20, wherein the receiving unit is further configured to receive a third task deployment request sent by a third resource manager node, where the third task deployment request includes an identifier of the first data center, an identifier of the first task, and an identifier of a third computing node in the first data center, and the third resource manager node belongs to a third data center;
the sending unit is further configured to send a second resource allocation message to the third resource manager node, where the second resource allocation message is used to indicate that the first task may be deployed to the third computing node.
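Claims 22 and 23 describe the two sides of the same cross-data-center exchange: a resource manager either receives an allocation for a task it forwarded, or grants an allocation for a task forwarded to it. A minimal sketch of the granting side follows; the class, method, and identifier names are assumptions made for illustration.

```python
# Sketch of the granting resource manager in claim 23: it receives a task
# deployment request from another data center's RM and answers with a
# resource allocation message. All names here are illustrative assumptions.

class ResourceManager:
    def __init__(self, dc_id: str, free_nodes: set):
        self.dc_id = dc_id
        self.free_nodes = free_nodes  # node ids with spare capacity

    def handle_task_deployment(self, task_id: str, dc_id: str, node_id: str):
        """Handle a deployment request from a remote RM and reply with an
        allocation message indicating whether the task may be deployed."""
        assert dc_id == self.dc_id, "request routed to the wrong data center"
        granted = node_id in self.free_nodes
        if granted:
            self.free_nodes.discard(node_id)  # reserve the node for the task
        return {"task_id": task_id, "node_id": node_id, "granted": granted}

# Example: a remote RM grants node "dc2-n7" to task "t42".
rm2 = ResourceManager("dc2", {"dc2-n7", "dc2-n9"})
print(rm2.handle_task_deployment("t42", "dc2", "dc2-n7"))
```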
24. An apparatus for scheduling resources, applied to a first computing node of a first data center, the first data center further including a first resource manager node therein, the apparatus comprising:
a sending unit, configured to send a location information request to a metadata server, where the location information request includes a data identifier and is used to request, from the metadata server, information about the data center and the computing node where the data corresponding to the data identifier is located, and the data corresponding to the data identifier is data required for executing a task;
a receiving unit, configured to receive a location information response sent by the metadata server, where the location information response includes an identifier of a second data center and an identifier of a second computing node, and the second computing node belongs to the second data center;
the sending unit is further configured to send a first task deployment request to the first resource manager node, where the first task deployment request includes an identifier of a task, an identifier of a data center, and an identifier of a computing node in the data center, and the identifier of the data center is used to indicate the data center in which the task is to be deployed;
the receiving unit is further configured to receive a first resource allocation message sent by the first resource manager node, where the first resource allocation message is used to indicate that the task may be deployed to the second computing node.
25. The apparatus of claim 24, wherein the sending unit is further configured to send a task execution request to the second computing node, and wherein the task execution request includes an identifier of the task and a data identifier, and wherein the data identifier is used to indicate data required by the second computing node to execute the task.
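Read together, claims 24 and 25 give the compute node a three-step flow: locate the data, request deployment, then start execution on the granted node. The sketch below wires those steps with in-memory stubs; all class, method, and identifier names are assumptions for illustration, not the patent's implementation.

```python
# Sketch of the compute-node flow in claims 24-25, with stubs standing in
# for the metadata server and resource manager. Every name is an assumption.

class MetadataServerStub:
    """Answers location information requests (claim 24, first exchange)."""
    def __init__(self, placement):
        self.placement = placement  # data_id -> (dc_id, node_id)

    def locate(self, data_id):
        return self.placement.get(data_id)

class ResourceManagerStub:
    """Answers task deployment requests with an allocation message."""
    def request_deployment(self, task_id, dc_id, node_id):
        return {"task_id": task_id, "node_id": node_id, "granted": True}

def deploy_task(task_id, data_id, metadata_server, first_rm, run_on_node):
    # 1. Ask the metadata server where the task's input data lives.
    loc = metadata_server.locate(data_id)
    if loc is None:
        raise LookupError(f"no data center holds {data_id}")
    dc_id, node_id = loc
    # 2. Send the first task deployment request to the first RM and wait
    #    for the first resource allocation message.
    alloc = first_rm.request_deployment(task_id, dc_id, node_id)
    # 3. Claim 25: on success, send the task execution request, carrying the
    #    task identifier and the data identifier, to the chosen node.
    if alloc["granted"]:
        run_on_node(node_id, task_id, data_id)

meta = MetadataServerStub({"blk-9": ("dc2", "dc2-n7")})
deploy_task("t42", "blk-9", meta, ResourceManagerStub(),
            lambda n, t, d: print(f"execute {t} on {n} with {d}"))
```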
26. An information management apparatus applied to a metadata server, the apparatus comprising:
a receiving unit, configured to receive a location information request sent by a first computing node, where the location information request includes a data identifier, and the location information request is used to request, from the metadata server, information of a data center and a computing node where data corresponding to the data identifier is located, where the data corresponding to the data identifier is data required for executing a task, and the first computing node belongs to a first data center;
a sending unit, configured to send a location information response to the first computing node according to the location information request, where the location information response includes an identifier of the data center where the data corresponding to the data identifier is located and an identifier of the computing node.
27. The apparatus of claim 26, further comprising: a determining unit, configured to determine a second data center according to the data identifier, where the second data center is the data center where the data corresponding to the data identifier is located;
the sending unit is further configured to send a location information query request to the master node of the second data center, where the location information query request includes a data identifier, and the master node of the second data center is a master node of a distributed file system of the second data center;
the receiving unit is further configured to receive a location information query response returned by the master node of the second data center, where the location information query response includes an identifier of a computing node where data corresponding to the data identifier is located;
or, the sending unit is further configured to send a location information query request to the master nodes of a plurality of data centers, where the location information query request includes the data identifier, and the master node of each of the plurality of data centers is the master node of the distributed file system of that data center;
the receiving unit is further configured to receive a location information query response returned by a first master node, where the location information query response includes at least one of the following information: the data center and the computing node where the data corresponding to the data identifier is located; or the location information query response is used to indicate that the data corresponding to the data identifier does not exist in the data center where the first master node is located, where the first master node is the master node of the distributed file system of any one of the plurality of data centers.
28. The apparatus according to claim 26 or 27, wherein the receiving unit is further configured to receive a data synchronization message sent by a master node of a third data center, where the data synchronization message is used to indicate first data changed by a data operation in the third data center, and the data operation includes at least one of: creating data, deleting data, and synchronizing data to other data centers, where the master node of the third data center is the master node of the distributed file system of the third data center;
the apparatus further comprises a processing unit, configured to record, according to the data synchronization message, the data center corresponding to the changed first data.
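Claims 26 through 28 make the metadata server the bridge between task placement and data placement: it answers location requests, delegates to per-data-center distributed-file-system master nodes, and keeps its data-to-data-center map current from synchronization messages. The compact sketch below, under assumed names, shows both the fast path and the fan-out variant of claim 27 together with claim 28's record-keeping.

```python
# Sketch of the metadata-server behaviour in claims 26-28. All class and
# method names are illustrative assumptions, not the patent's API.
from typing import Dict, Optional, Tuple

class MasterNodeStub:
    """Stands in for a data center's distributed-file-system master node."""
    def __init__(self, blocks: Dict[str, str]):
        self.blocks = blocks  # data_id -> node_id within this data center

    def query(self, data_id: str) -> Optional[str]:
        return self.blocks.get(data_id)

class MetadataServer:
    def __init__(self, masters: Dict[str, MasterNodeStub]):
        self.masters = masters                # dc_id -> master node
        self.data_to_dc: Dict[str, str] = {}  # claim 28's recorded placement

    def handle_location_request(self, data_id: str) -> Optional[Tuple[str, str]]:
        # Claim 27, first variant: the data center is already known.
        dc_id = self.data_to_dc.get(data_id)
        if dc_id is not None:
            node_id = self.masters[dc_id].query(data_id)
            if node_id is not None:
                return dc_id, node_id
        # Claim 27, second variant: query every data center's master node.
        for dc_id, master in self.masters.items():
            node_id = master.query(data_id)
            if node_id is not None:
                return dc_id, node_id
        return None  # the data exists in no data center

    def handle_data_sync(self, dc_id: str, data_id: str, op: str) -> None:
        # Claim 28: track which data center holds data changed by an operation.
        if op in ("create", "sync"):
            self.data_to_dc[data_id] = dc_id
        elif op == "delete" and self.data_to_dc.get(data_id) == dc_id:
            del self.data_to_dc[data_id]

# Example: a sync message teaches the server where "blk-9" lives.
ms = MetadataServer({"dc2": MasterNodeStub({"blk-9": "dc2-n7"})})
ms.handle_data_sync("dc2", "blk-9", "create")
print(ms.handle_location_request("blk-9"))  # -> ('dc2', 'dc2-n7')
```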
29. An apparatus for scheduling resources, the apparatus comprising: a processor, a memory, and a communication interface, where the processor, the memory, and the communication interface are coupled by a bus; the memory is configured to store a software program; and the processor is configured to run or execute the software program in the memory and to communicate through the communication interface, so as to perform the method according to any of claims 5-11.
30. An apparatus for scheduling resources, the apparatus comprising: a processor, a memory, and a communication interface, where the processor, the memory, and the communication interface are coupled by a bus; the memory is configured to store a software program; and the processor is configured to run or execute the software program in the memory and to communicate through the communication interface, so as to perform the method according to claim 12 or 13.
31. An information management apparatus, the apparatus comprising: a processor, a memory, and a communication interface, where the processor, the memory, and the communication interface are coupled by a bus; the memory is configured to store a software program; and the processor is configured to run or execute the software program in the memory and to communicate through the communication interface, so as to perform the method according to any of claims 14-16.
32. A computer-readable storage medium comprising instructions that, when executed on a computer, cause the computer to perform the method of any of claims 5-11.
33. A computer-readable storage medium comprising instructions that, when executed on a computer, cause the computer to perform the method of claim 12 or 13.
34. A computer-readable storage medium comprising instructions that, when executed on a computer, cause the computer to perform the method of any one of claims 14-16.
CN201711487682.3A 2017-12-29 2017-12-29 Resource scheduling method, information management method and device and task deployment system Active CN109992373B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711487682.3A CN109992373B (en) 2017-12-29 2017-12-29 Resource scheduling method, information management method and device and task deployment system

Publications (2)

Publication Number Publication Date
CN109992373A CN109992373A (en) 2019-07-09
CN109992373B true CN109992373B (en) 2021-04-09

Family

ID=67111407

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711487682.3A Active CN109992373B (en) 2017-12-29 2017-12-29 Resource scheduling method, information management method and device and task deployment system

Country Status (1)

Country Link
CN (1) CN109992373B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340613B (en) * 2020-02-26 2023-10-03 中国邮政储蓄银行股份有限公司 Job processing method, job processing system and storage medium
CN112085378B (en) * 2020-09-04 2023-02-03 中国平安财产保险股份有限公司 Resource allocation method, device, computer equipment and storage medium
CN112383878B (en) * 2020-09-27 2021-07-30 中国信息通信研究院 Collaborative computing method and electronic device
CN112130983A (en) * 2020-10-27 2020-12-25 上海商汤临港智能科技有限公司 Task processing method, device, equipment, system and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2008355092A1 (en) * 2008-04-21 2009-10-29 Adaptive Computing Enterprises, Inc. System and method for managing energy consumption in a compute environment

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101167054A (en) * 2005-05-27 2008-04-23 国际商业机器公司 Methods and apparatus for selective workload off-loading across multiple data centers
CN102445978A (en) * 2010-10-12 2012-05-09 深圳市金蝶中间件有限公司 Method and device for managing data center
CN101997929A (en) * 2010-11-29 2011-03-30 北京卓微天成科技咨询有限公司 Data access method, device and system for cloud storage
CN102426542A (en) * 2011-10-28 2012-04-25 中国科学院计算技术研究所 Resource management system for data center and operation calling method
CN102567851A (en) * 2011-12-29 2012-07-11 武汉理工大学 Safely-sensed scientific workflow data layout method under cloud computing environment
CN102739785A (en) * 2012-06-20 2012-10-17 东南大学 Method for scheduling cloud computing tasks based on network bandwidth estimation
CN104104655A (en) * 2013-04-07 2014-10-15 华为技术有限公司 Resource release method, device and system
CN103530182A (en) * 2013-10-22 2014-01-22 海南大学 Working scheduling method and device
CN104683161A (en) * 2015-03-18 2015-06-03 杭州华三通信技术有限公司 Network management method and device based on SaaS (software as a service)
CN106921977A (en) * 2015-12-26 2017-07-04 华为技术有限公司 A kind of service quality planing method, apparatus and system based on Business Stream
CN107291746A (en) * 2016-03-31 2017-10-24 阿里巴巴集团控股有限公司 A kind of method and apparatus for storing and reading data
CN106201698A (en) * 2016-07-15 2016-12-07 北京金山安全软件有限公司 Method and device for managing application program and electronic equipment
CN106648464A (en) * 2016-12-22 2017-05-10 柏域信息科技(上海)有限公司 Multi-node mixed block cache data read-writing method and system based on cloud storage

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Data placement strategy in a MapReduce cluster environment; Xun Yaling et al.; Journal of Software (软件学报); 2015-02-02; Vol. 26, No. 8; full text *

Also Published As

Publication number Publication date
CN109992373A (en) 2019-07-09

Similar Documents

Publication Publication Date Title
US11226847B2 (en) Implementing an application manifest in a node-specific manner using an intent-based orchestrator
CN107066319B (en) Multi-dimensional scheduling system for heterogeneous resources
US9971823B2 (en) Dynamic replica failure detection and healing
US9999030B2 (en) Resource provisioning method
CN109992373B (en) Resource scheduling method, information management method and device and task deployment system
US9053167B1 (en) Storage device selection for database partition replicas
JP6190389B2 (en) Method and system for performing computations in a distributed computing environment
JP6185486B2 (en) A method for performing load balancing in a distributed computing environment
US8645745B2 (en) Distributed job scheduling in a multi-nodal environment
WO2020001320A1 (en) Resource allocation method, device, and apparatus
US10177994B2 (en) Fault tolerant federation of computing clusters
US20160275123A1 (en) Pipeline execution of multiple map-reduce jobs
CN110941481A (en) Resource scheduling method, device and system
WO2012068867A1 (en) Virtual machine management system and using method thereof
WO2015176636A1 (en) Distributed database service management system
CN103797462A (en) Method, system, and device for creating virtual machine
US9092272B2 (en) Preparing parallel tasks to use a synchronization register
US11467874B2 (en) System and method for resource management
CN113382077B (en) Micro-service scheduling method, micro-service scheduling device, computer equipment and storage medium
CN113839814B (en) Decentralized Kubernetes cluster federal implementation method and system
CN111343219B (en) Computing service cloud platform
KR20190028210A (en) Cloud service method and system for deployment of artificial intelligence application using container
US10725819B2 (en) System and method for scheduling and allocating data storage
CN116881012A (en) Container application vertical capacity expansion method, device, equipment and readable storage medium
CN116954816A (en) Container cluster control method, device, equipment and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant