CN110166507B - Multi-resource scheduling method and device - Google Patents

Multi-resource scheduling method and device

Info

Publication number
CN110166507B
CN110166507B (application CN201810145714.XA)
Authority
CN
China
Prior art keywords
task
resources
resource
port number
gpu
Prior art date
Legal status
Active
Application number
CN201810145714.XA
Other languages
Chinese (zh)
Other versions
CN110166507A (en)
Inventor
鲁楠
王永亮
王科
何云龙
张志强
Current Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201810145714.XA
Publication of CN110166507A
Application granted
Publication of CN110166507B
Legal status: Active

Classifications

    • G06F9/5016: Allocation of resources, e.g. of the central processing unit [CPU], to service a request, the resource being the memory
    • G06F9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1008: Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H04L67/1012: Server selection for load balancing based on compliance of requirements or conditions with available server resources
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

Embodiments of the invention disclose a multi-resource scheduling method and device in the field of computer technology. The method comprises: receiving a task request, where the task request includes network application parameters and resource application parameters; allocating resources and a run server for the task according to the network application parameters, the resource application parameters, and the current cluster available-resource monitoring information; and, when the container on the run server is started in host network mode, setting the container's network according to the network application parameters and setting the resources allocated to the task as the container's available resources. Through these steps, resources can be allocated to tasks dynamically, scheduling across multiple network cards is supported, and free switching between different network cards is realized.

Description

Multi-resource scheduling method and device
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for scheduling multiple resources.
Background
With the development of artificial intelligence, deep learning tools such as TensorFlow and Caffe are increasingly widely used. The service performance of these tools is constrained by a variety of resources, such as network, memory, CPU, and GPU.
Existing machine learning platforms mainly rely on YARN (a Hadoop resource manager) or Kubernetes (a container cluster management system) for resource scheduling, and mainly support scheduling of CPU and memory resources. In implementing the present invention, the inventors found at least the following problems in the prior art:
First, scheduling across multiple network cards is not supported. Most existing machine learning platforms assume a single network; when multiple network cards are present, existing scheduling methods cannot adapt to the networks or switch freely between them.
Second, scheduling of GPU resources is not supported; or, even where it is supported, mixed scheduling of GPU and CPU resources performs poorly, making it difficult to maximize GPU and CPU utilization.
Disclosure of Invention
In view of the above, the present invention provides a multi-resource scheduling method and apparatus that can dynamically allocate resources to tasks, support scheduling across multiple network cards, and realize free switching between different network cards. Furthermore, the invention supports scheduling of GPU resources and improves resource utilization under mixed GPU/CPU scheduling.
To achieve the above object, according to one aspect of the present invention, there is provided a multi-resource scheduling method.
The multi-resource scheduling method of the invention comprises: receiving a task request, the task request including network application parameters and resource application parameters; allocating resources and a run server for the task according to the network application parameters, the resource application parameters, and the current cluster available-resource monitoring information; and, when the container on the run server is started in host network mode, setting the container's network according to the network application parameters and setting the resources allocated to the task as the container's available resources.
Optionally, the task request further includes a task type, and the method further comprises: in the case that the task type belongs to a port-dependent type, assigning a port number to the task.
Optionally, the step of assigning a port number to the task includes: querying the distributed storage system with an initial port number; if the query shows the initial port number is not under lease, assigning the initial port number to the task; if the query shows the initial port number is under lease, updating the port number and querying the distributed storage system with the updated port number; and, if the query shows the updated port number is not under lease, assigning the updated port number to the task.
Optionally, the resource application parameters include the type of resource applied for; the applied-for resource types include at least one of: GPU, CPU, and memory.
Optionally, the step of allocating resources and a run server for the task according to the task request and the current cluster available-resource monitoring information includes: allocating resources to the task according to the max-min fairness principle; and then determining the task's run server according to the network application parameters and the resources allocated to the task.
Optionally, the step of determining the task's run server according to the network application parameters and the resources allocated to the task includes: screening a first candidate server set out of the cluster according to the network application parameters; if the resources allocated to the task include GPU resources, screening a GPU server set out of the first candidate server set according to the GPU resources allocated to the task; and then screening the task's run server out of the GPU server set according to the other resources allocated to the task.
Optionally, the step of determining the task's run server according to the network application parameters and the resources allocated to the task further includes: if the resources allocated to the task include CPU resources but no GPU resources, screening a CPU server set out of the first candidate server set according to the CPU resources allocated to the task, and then screening the task's run server out of the CPU server set according to the other resources allocated to the task; and, if the screened CPU server set is empty, screening a GPU server set out of the first candidate server set according to the CPU resources allocated to the task, and then screening the task's run server out of the GPU server set according to the other resources allocated to the task.
To achieve the above object, according to another aspect of the present invention, there is provided a multi-resource scheduling apparatus.
The multi-resource scheduling apparatus of the invention comprises: a receiving module for receiving a task request, the task request including network application parameters and resource application parameters; an allocation module for allocating resources and a run server for the task according to the network application parameters, the resource application parameters, and the current cluster available-resource monitoring information; and a setting module for, when the container on the run server is started in host network mode, setting the container's network according to the network application parameters and setting the resources allocated to the task as the container's available resources.
Optionally, the task request further includes a task type, and the allocation module is further configured to assign a port number to the task if the task type belongs to a port-dependent type.
Optionally, the allocation module assigning a port number to the task includes: the allocation module querying the distributed storage system with an initial port number; if the query shows the initial port number is not under lease, the allocation module assigning the initial port number to the task; if the query shows the initial port number is under lease, the allocation module updating the port number and querying the distributed storage system with the updated port number; and, if the query shows the updated port number is not under lease, the allocation module assigning the updated port number to the task.
Optionally, the resource application parameters include the type of resource applied for; the applied-for resource types include at least one of: GPU, CPU, and memory.
Optionally, the allocation module allocating resources and a run server for the task according to the task request and the current cluster available-resource monitoring information includes: the allocation module allocating resources to the task according to the max-min fairness principle, and then determining the task's run server according to the network application parameters and the resources allocated to the task.
Optionally, the allocation module determining the task's run server according to the network application parameters and the resources allocated to the task includes: the allocation module screening a first candidate server set out of the cluster according to the network application parameters; if the resources allocated to the task include GPU resources, the allocation module screening a GPU server set out of the first candidate server set according to the GPU resources allocated to the task, and then screening the task's run server out of the GPU server set according to the other resources allocated to the task.
Optionally, the allocation module determining the task's run server according to the network application parameters and the resources allocated to the task further includes: if the resources allocated to the task include CPU resources but no GPU resources, the allocation module screening a CPU server set out of the first candidate server set according to the CPU resources allocated to the task, and then screening the task's run server out of the CPU server set according to the other resources allocated to the task; and, if the screened CPU server set is empty, the allocation module screening a GPU server set out of the first candidate server set according to the CPU resources allocated to the task, and then screening the task's run server out of the GPU server set according to the other resources allocated to the task.
To achieve the above object, according to still another aspect of the present invention, there is provided an electronic apparatus.
The electronic device of the present invention includes: one or more processors; and a storage means for storing one or more programs; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the multi-resource scheduling method of the present invention.
To achieve the above object, according to still another aspect of the present invention, a computer-readable medium is provided.
The computer readable medium of the present invention has stored thereon a computer program which when executed by a processor implements the multi-resource scheduling method of the present invention.
One embodiment of the above invention has the following advantages or benefits: resources and a run server are allocated to the task according to the received task request and the current cluster available-resource monitoring information; when the container on the run server is started in host network mode, the container's network is set according to the network application parameters in the task request, and the resources allocated to the task are set as the container's available resources. Scheduling across multiple network cards can thus be supported, realizing free switching between different network cards.
Further effects of the above-described non-conventional alternatives are described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main flow of a multi-resource scheduling method according to one embodiment of the present invention;
FIG. 2 is a schematic diagram of the main flow of a multi-resource scheduling method according to another embodiment of the present invention;
FIG. 3 is a schematic diagram of the main modules of a multi-resource scheduling apparatus according to an embodiment of the present invention;
FIG. 4 is a network architecture diagram of a container according to an embodiment of the invention;
FIG. 5 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;
fig. 6 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It is noted that embodiments of the invention and features of the embodiments may be combined with each other without conflict.
Before describing embodiments of the present invention in detail, some technical terms related to the embodiments of the present invention will be described first.
Docker: an open-source application container engine that lets developers package their applications and dependencies into a portable container and publish it to any popular Linux machine; it can also implement virtualization. Containers use a sandbox mechanism entirely and have no interfaces to one another.
Host network mode: a network mode of the Docker container. In this mode, the Docker container and the host share a network namespace (Network Namespace): the container does not virtualize its own network card and IP but directly uses the host's network card and IP.
TensorFlow: a second-generation artificial intelligence learning system developed by Google, whose name derives from its operating principle: Tensor means an N-dimensional array, Flow means computation based on a dataflow graph, and TensorFlow describes the computation process in which tensors flow from one end of the dataflow graph to the other. TensorFlow is a system that transmits complex data structures into an artificial neural network for analysis and processing.
Caffe: a deep learning framework for computing CNN (convolutional neural network) related algorithms.
Etcd: a distributed, consistent KV (key-value) storage system, used for shared configuration and service discovery.
Fig. 1 is a schematic diagram of a main flow of a multi-resource scheduling method according to an embodiment of the present invention. As shown in fig. 1, the multi-resource scheduling method according to the embodiment of the present invention includes:
step S101, receiving a task request: the task request includes: network application parameters, resource application parameters.
The task request may be a resource request of any kind of task, such as a machine learning task. The resource application parameters include the type of resource applied for; the applied-for resource types include at least one of: GPU, CPU, and memory. For example, a task request may carry the information "net=ib, cpu core=2, gpu core=1", indicating that the network applied for is an IB (InfiniBand) network and that the task applies for two CPU cores and one GPU core.
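To make the request format concrete, the following is a minimal Python sketch of such a task request; the dataclass and its field names are illustrative assumptions, not the patent's concrete format.

    from dataclasses import dataclass

    # Hypothetical shape of a task request; field names are illustrative.
    @dataclass
    class TaskRequest:
        net: str        # requested network type, e.g. "ib" or "eth"
        cpu_cores: int  # number of CPU cores applied for
        gpu_cores: int  # number of GPU cores applied for
        memory_gb: int  # memory applied for, in GB (0 if none requested)

    # The example above, "net=ib, cpu core=2, gpu core=1":
    request = TaskRequest(net="ib", cpu_cores=2, gpu_cores=1, memory_gb=0)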
Step S102: allocate resources and a run server for the task according to the network application parameters, the resource application parameters, and the current cluster available-resource monitoring information.
For example, the resources allocated to the task might be "two CPU cores, two GPU cores", and the run server might be the server numbered "worker-0".
Step S103: when the container on the run server is started in host network mode, set the container's network according to the network application parameters and set the resources allocated to the task as the container's available resources.
For example, when the network application parameter is "net=ib", the IB network card on the host serves as the container's network resource, and the host's IP under the IB card (e.g., 192.18.177.12) serves as the container's IP. When the network application parameter is "net=eth", the Ethernet card on the host serves as the container's network resource, and the host's IP under the Ethernet card (e.g., 192.17.155.13) serves as the container's IP.
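As an illustration, the sketch below starts a task container in host network mode with the Python Docker SDK (docker-py) and hands the chosen card's host IP to the task through an environment variable. The resolve_host_ip() helper (sketched in the Fig. 4 discussion below), the TASK_IP convention, and the CPU/GPU pinning details are assumptions, not the patent's concrete implementation.

    import docker  # Python Docker SDK (docker-py), assumed available on the run server

    def start_task_container(image, net_param, cpu_ids, gpu_ids):
        host_ip = resolve_host_ip(net_param)  # hypothetical helper: host IP of the chosen card
        client = docker.from_env()
        return client.containers.run(
            image,
            detach=True,
            network_mode="host",               # share the host's network namespace
            environment={"TASK_IP": host_ip},  # the task binds to the chosen card's IP
            cpuset_cpus=",".join(map(str, cpu_ids)),      # pin the allocated CPU cores
            device_requests=[docker.types.DeviceRequest(  # expose the allocated GPUs
                device_ids=[str(g) for g in gpu_ids],
                capabilities=[["gpu"]],
            )] if gpu_ids else None,
        )

Because the container runs in host network mode, no port mapping is involved; the container simply uses whichever of the host's cards the scheduler selected.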
Through the above steps, the embodiment of the invention can dynamically allocate resources to the task and support scheduling across multiple network cards, realizing free switching between different network cards; the task can thus run under either the Ethernet or the IB network.
Fig. 2 is a schematic diagram of a main flow of a multi-resource scheduling method according to another embodiment of the present invention. As shown in fig. 2, the multi-resource scheduling method according to the embodiment of the present invention includes:
Step S201: receive a task request, where the task request includes a task type, network application parameters, and resource application parameters.
Task types can be classified as port-dependent or non-port-dependent. For example, a task-type parameter value of "true" indicates the port-dependent type, and a value of "false" indicates the non-port-dependent type. In a concrete implementation, TensorFlow tasks can be classified as port-dependent and Caffe tasks as non-port-dependent.
The resource application parameters include the type of resource applied for; the applied-for resource types include at least one of: GPU, CPU, and memory. For example, a task request may carry the information "type=true, net=ib, cpu core=2, gpu core=1", indicating that the task type is port-dependent, the network applied for is an IB (InfiniBand) network, and the task applies for two CPU cores and one GPU core.
Step S202: judge from the task request whether the task type is the port-dependent type. If it is, execute step S203; if it is the non-port-dependent type, execute step S204.
Step S203: assign a port number to the task.
In an alternative embodiment, step S203 may include: steps a to d.
Step a: query the distributed storage system with the initial port number.
The distributed storage system may be Etcd. In this step, if the query shows the initial port number is not under lease, execute step b; if the query shows it is under lease, execute step c.
Step b: assign the initial port number to the task.
Step c: update the port number and query the distributed storage system with the updated port number.
In this step, if the query shows the updated port number is not under lease, execute step d; if it is still under lease, execute step c again until a port number not under lease is found. A port number under lease is occupied by another task whose lock has not yet been released; a port number not under lease is not occupied by any other task, i.e., it is idle. By managing port-number resources with a lease mechanism, the TTL (Time To Live, i.e., the task's running time) is renewed automatically while the task runs, ensuring that the port resource is held for exactly the task's running period; when the task finishes, the TTL expires and the port resource is released.
The port number may be updated, for example, by adding a fixed stride each time. If the initial port number is 6060 and the stride is 1, the first update yields 6061. The distributed storage system is then queried with port number 6061: if 6061 is not under lease, step d is executed, i.e., port 6061 is assigned to the task; if 6061 is under lease, step c is executed again, i.e., the port number is updated to 6062 and the distributed storage system is queried with 6062.
Step d: assign the updated port number to the task.
In this embodiment, step S203 dynamically allocates port-number resources to the task, ensuring the task's exclusive use of the port number and keeping the port-number resource's life cycle synchronized with the task's life cycle. A sketch of this lease-based probing follows.
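A minimal sketch of steps a to d, assuming the python-etcd3 client and an illustrative key prefix, TTL, and stride (the patent specifies only that the distributed storage system may be Etcd):

    import etcd3  # python-etcd3 client, an assumed choice

    def allocate_port(task_id, start_port=6060, stride=1, ttl_seconds=60):
        client = etcd3.client()
        port = start_port
        while True:
            value, _ = client.get(f"/ports/{port}")  # steps a / c: query Etcd
            if value is None:                        # not under lease: the port is idle
                lease = client.lease(ttl_seconds)    # lease expires when the task ends
                client.put(f"/ports/{port}", task_id, lease=lease)
                return port, lease                   # steps b / d: assign the port
            port += stride                           # under lease: try the next port

A production version would wrap the get/put pair in an Etcd transaction to rule out races between concurrent schedulers, and would keep refreshing the lease while the task runs so that, as described above, the port is released exactly when the task finishes.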
Step S204: allocate resources to the task according to the resource application parameters and the current cluster available-resource monitoring information.
For example, after obtaining the resource application parameters and the cluster's current available-resource monitoring information, resources may be allocated to the task by a preset allocation rule, such as the max-min fairness rule. Under max-min fairness in a multi-resource environment, a task's dominant resource is allocated first, followed by its non-dominant resources.
When the resource application parameters in the task request include a GPU, the GPU can directly be taken as the dominant resource. When they do not include a GPU but include CPU and memory, the dominant resource can be selected by the ratio of the amount applied for to the cluster's available amount, and allocation then proceeds.
For example, suppose the cluster's currently available resources are 9 CPU cores and 18 GB of memory, task A's request is for 1 CPU core and 4 GB of memory, and task B's request is for 3 CPU cores and 4 GB of memory. In task A's request the CPU share is 11% and the memory share is 22%, while in task B's request the CPU share is 33% and the memory share is 22%. Hence task A's dominant resource is memory and task B's is the CPU. Accordingly, for task A's request memory is allocated first and then CPU; for task B's request CPU is allocated first and then memory.
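The dominant-resource choice in this example can be reproduced in a few lines; this sketch merely restates the arithmetic above:

    # Currently available cluster resources and the two requests from the example.
    cluster = {"cpu": 9, "mem_gb": 18}
    requests = {
        "task_a": {"cpu": 1, "mem_gb": 4},  # shares: CPU 1/9 = 11%, memory 4/18 = 22%
        "task_b": {"cpu": 3, "mem_gb": 4},  # shares: CPU 3/9 = 33%, memory 4/18 = 22%
    }

    def dominant_resource(request, cluster):
        # The dominant resource is the one with the largest share of the
        # cluster's available amount.
        return max(request, key=lambda r: request[r] / cluster[r])

    for task, req in requests.items():
        print(task, "->", dominant_resource(req, cluster))
    # task_a -> mem_gb  (memory is allocated first)
    # task_b -> cpu     (CPU is allocated first)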
Step S205: determine the task's run server according to the network application parameters and all the resources allocated to the task.
In an alternative embodiment, step S205 includes:
Step A: screen a first candidate server set out of the cluster according to the network application parameters.
For example, suppose a cluster contains two servers, one configured with only an Ethernet card and the other with both an Ethernet card and an IB card. If the network application parameter is "net=ib", only servers configured with both cards are screened in, i.e., the first candidate server set is built from the servers carrying both cards; if the parameter is "net=eth", the first candidate server set is built from all servers in the cluster.
Step B: if the resources allocated to the task include GPU resources, screen a GPU server set out of the first candidate server set according to the GPU resources allocated to the task; then screen the task's run server out of the GPU server set according to the other resources allocated to the task. Generally, a GPU server is configured with both GPUs and CPUs, whereas a CPU server is configured with CPUs but no GPUs.
Further, step S205 may further include:
Step C: if the resources allocated to the task include CPU resources but no GPU resources, screen a CPU server set out of the first candidate server set according to the CPU resources allocated to the task, and then screen the task's run server out of the CPU server set according to the other resources allocated to the task; if the screened CPU server set is empty, screen a GPU server set out of the first candidate server set according to the CPU resources allocated to the task, and then screen the task's run server out of the GPU server set according to the other resources allocated to the task.
In this embodiment, when the allocated resources include GPU resources, step B preferentially looks for a server that can satisfy the GPU requirement. Step C avoids, as far as possible, the problem of a task that applies only for CPU resources running on a GPU machine and thereby adding to the GPU machine's load; it raises, as far as possible, the probability that a task applying for CPU but not GPU resources runs on a CPU machine, improving resource utilization under mixed GPU/CPU scheduling. A sketch of this selection logic follows.
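A minimal sketch of steps A to C under an assumed server model (the Server fields and pick_run_server() are illustrative, not the patent's data structures):

    from dataclasses import dataclass

    @dataclass
    class Server:
        name: str
        nics: tuple       # network cards, e.g. ("eth",) or ("eth", "ib")
        total_gpu: int    # 0 for a CPU server
        free_gpu: int
        free_cpu: int
        free_mem_gb: int

    def pick_run_server(servers, net, alloc):
        # Step A: keep only servers carrying the requested network card.
        candidates = [s for s in servers if net in s.nics]
        if alloc.get("gpu", 0) > 0:
            # Step B: GPU tasks go to servers with enough free GPUs.
            pool = [s for s in candidates if s.free_gpu >= alloc["gpu"]]
        else:
            # Step C: prefer CPU servers (no GPUs at all) for CPU tasks...
            pool = [s for s in candidates if s.total_gpu == 0]
            if not any(s.free_cpu >= alloc.get("cpu", 0) for s in pool):
                # ...falling back to GPU servers only when no CPU server fits.
                pool = [s for s in candidates if s.total_gpu > 0]
        # Finally, check the other allocated resources (CPU, memory).
        for s in pool:
            if s.free_cpu >= alloc.get("cpu", 0) and s.free_mem_gb >= alloc.get("mem_gb", 0):
                return s
        return None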
Step S206: when the container on the run server is started in host network mode, set the container's network according to the network application parameters and set the resources allocated to the task as the container's available resources.
Through the above steps, the embodiment of the invention can dynamically allocate resources to tasks and support multi-network-card scheduling. Specifically, it realizes free switching between different network cards, so that a task can run under either the Ethernet or the IB network. It can also dynamically allocate port-number resources to a task, ensuring the task's exclusive use of those port numbers and synchronizing the port-number resource's life cycle with the task's. In addition, it supports scheduling of GPU resources and improves resource utilization under mixed GPU/CPU scheduling.
Fig. 3 is a schematic diagram of main modules of a multi-resource scheduling apparatus according to an embodiment of the present invention. As shown in fig. 3, the multi-resource scheduling apparatus 300 according to the embodiment of the present invention includes: a receiving module 301, an allocating module 302, and a setting module 303.
The receiving module 301 is configured to receive a task request.
The task request may be a resource request of any kind of task, such as a machine learning task, and may include network application parameters and resource application parameters. The resource application parameters include the type of resource applied for; the applied-for resource types include at least one of: GPU, CPU, and memory.
For example, a task request may carry the information "net=ib, cpu core=2, gpu core=1", indicating that the network applied for is an IB (InfiniBand) network, the applied-for resource types are CPU and GPU, the CPU amount applied for is two CPU cores, and the GPU amount applied for is one GPU core.
The allocation module 302 is configured to allocate resources and a run server for the task according to the network application parameters, the resource application parameters, and the current cluster available-resource monitoring information.
The run server holds the resources allocated to the task and is used to run the task. For example, the resources allocated to the task might be "two CPU cores, two GPU cores", and the run server might be the server numbered "worker-0".
In an alternative embodiment, the allocation module 302 allocates resources and a run server for the task according to the task request and the cluster's current available resources, which specifically includes:
1) The allocation module 302 allocates resources for the task according to the resource application parameters and the current cluster available resource monitoring information.
For example, after obtaining the resource application parameters and the cluster's current available-resource monitoring information, the allocation module 302 may allocate resources to the task by a preset allocation rule, such as the max-min fairness rule. Under max-min fairness in a multi-resource environment, a task's dominant resource is allocated first, followed by its non-dominant resources.
2) The allocation module 302 determines the task's run server according to the network application parameters and all the resources allocated to the task, which specifically includes:
21) The allocation module 302 screens a first candidate server set out of the cluster according to the network application parameters.
For example, suppose a cluster contains two servers, one configured with only an Ethernet card and the other with both an Ethernet card and an IB card. If the network application parameter is "net=ib", the allocation module 302 screens in the servers configured with both cards and builds the first candidate server set from them; if the parameter is "net=eth", the allocation module 302 builds the first candidate server set from all servers in the cluster.
22) If the resources allocated to the task include GPU resources, the allocation module 302 screens a GPU server set out of the first candidate server set according to the GPU resources allocated to the task; the allocation module 302 then screens the task's run server out of the GPU server set according to the other resources allocated to the task.
23) If the resources allocated to the task include CPU resources but no GPU resources, the allocation module 302 screens a CPU server set out of the first candidate server set according to the CPU resources allocated to the task, and then screens the task's run server out of the CPU server set according to the other resources allocated to the task. If the screened CPU server set is empty, the allocation module 302 screens a GPU server set out of the first candidate server set according to the CPU resources allocated to the task, and then screens the task's run server out of the GPU server set according to the other resources allocated to the task.
Further, the task request may also include a task type. The allocation module 302 is further configured to assign a port number to the task if the task type belongs to a port-dependent type.
In an alternative embodiment, the allocation module 302 assigning a port number to the task includes: the allocation module 302 querying the distributed storage system with the initial port number; if the query shows the initial port number is not under lease, the allocation module 302 assigning the initial port number to the task; if the query shows the initial port number is under lease, the allocation module 302 updating the port number and querying the distributed storage system with the updated port number; and, if the query shows the updated port number is not under lease, the allocation module 302 assigning the updated port number to the task. The distributed storage system may be Etcd.
The setting module 303 is configured to, when the container on the run server is started in host network mode, set the container's network according to the network application parameters and set the resources allocated to the task as the container's available resources.
For example, when the network application parameter is "net=ib", the IB network card on the host serves as the container's network resource, and the host's IP under the IB card (e.g., 192.18.177.12) serves as the container's IP. When the network application parameter is "net=eth", the Ethernet card on the host serves as the container's network resource, and the host's IP under the Ethernet card (e.g., 192.17.155.13) serves as the container's IP.
The apparatus of the embodiment of the invention can dynamically allocate resources to tasks and support multi-network-card scheduling. Specifically, it realizes free switching between different network cards, so that a task can run under either the Ethernet or the IB network. It can also dynamically allocate port-number resources to a task, ensuring the task's exclusive use of those port numbers and synchronizing the port-number resource's life cycle with the task's. In addition, it supports scheduling of GPU resources and improves resource utilization under mixed GPU/CPU scheduling.
Fig. 4 is a network architecture diagram of a container according to an embodiment of the invention. As shown in Fig. 4, the nodes are servers in the cluster, and containers are the smallest running units in the cluster. Typically, a container is virtualized by Docker, so the network card corresponding to the container is Docker's docker0 bridge. In the prior art, communication between Docker containers on different servers is mostly implemented with third-party bridging techniques (e.g., Calico and Flannel) and virtual routing. However, because third-party bridging relies on the local server's Ethernet card, free switching between network cards is difficult to realize. In the embodiment of the invention, Docker's network mode is configured as host network mode, and the container's network resources are dynamically designated according to the network application parameters when the container is started, realizing free switching among multiple network cards. A sketch of resolving the chosen card's host IP follows.
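As a companion to the container-start sketch earlier, the host IP handed to the container can be resolved from the requested network type. This sketch assumes psutil and conventional interface-name prefixes (ib0..., eth0...); both are assumptions about the host, not part of the patent:

    import socket

    import psutil  # assumed available; enumerates the host's network cards

    def resolve_host_ip(net_param):
        # Map the network application parameter to an interface-name prefix;
        # the naming scheme is an assumption about the host.
        prefix = {"ib": "ib", "eth": "eth"}[net_param]
        for name, addrs in psutil.net_if_addrs().items():
            if name.startswith(prefix):
                for addr in addrs:
                    if addr.family == socket.AF_INET:  # first IPv4 address wins
                        return addr.address
        raise LookupError(f"no '{net_param}' interface with an IPv4 address")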
Fig. 5 illustrates an exemplary system architecture 500 to which the multi-resource scheduling method or multi-resource scheduling apparatus of embodiments of the present invention may be applied.
As shown in fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505. The network 504 is used as a medium to provide communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A user may interact with the server 505 via the network 504 using the terminal devices 501, 502, 503 to receive or send messages or the like. Various communication client applications, such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc., may be installed on the terminal devices 501, 502, 503.
The terminal devices 501, 502, 503 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 505 may be a server providing various services, for example a scheduling management server supporting task requests submitted by users through the terminal devices 501, 502, 503. The scheduling management server may analyze and process received data such as task requests and feed the processing result (for example, the resources and run node allocated to the task) back to the terminal device.
It should be noted that, the multi-resource scheduling method provided in the embodiment of the present invention is generally executed by the server 505, and accordingly, the multi-resource scheduling device is generally disposed in the server 505.
It should be understood that the number of terminal devices, networks and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Fig. 6 shows a schematic diagram of a computer system 600 suitable for use in implementing an electronic device of an embodiment of the invention. The computer system shown in fig. 6 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. The drive 610 is also connected to the I/O interface 605 as needed. Removable media 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed so that a computer program read therefrom can be installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611. The above-described functions defined in the system of the present invention are performed when the computer program is executed by a Central Processing Unit (CPU) 601.
The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules involved in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor, for example described as: a processor including a receiving module, an allocation module, and a setting module. The names of these modules do not in any way limit the modules themselves; for example, the receiving module may also be described as "a module that receives a task request".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments or may exist alone without being fitted into that apparatus. The computer-readable medium carries one or more programs which, when executed by one such apparatus, cause the apparatus to: receive a task request, the task request including network application parameters and resource application parameters; allocate resources and a run server for the task according to the network application parameters, the resource application parameters, and the current cluster available-resource monitoring information; and, when the container on the run server is started in host network mode, set the container's network according to the network application parameters and set the resources allocated to the task as the container's available resources.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (12)

1. A multi-resource scheduling method, the method comprising:
receiving a task request, the task request including network application parameters, resource application parameters, and a task type;
allocating resources and a run server for the task according to the network application parameters, the resource application parameters, and current cluster available-resource monitoring information;
when a container on the run server is started in host network mode, setting the container's network according to the network application parameters, and setting the resources allocated to the task as available resources of the container;
assigning a port number to the task if the task type belongs to a port-dependent type, which specifically includes: querying the distributed storage system with an initial port number; if the query shows the initial port number is not under lease, assigning the initial port number to the task; if the query shows the initial port number is under lease, updating the port number and querying the distributed storage system with the updated port number; and, if the query shows the updated port number is not under lease, assigning the updated port number to the task.
2. The method of claim 1, wherein the resource application parameters include the type of resource applied for; the applied-for resource types include at least one of: GPU, CPU, and memory.
3. The method of claim 2, wherein the step of allocating resources and a run server for the task based on the task request and current cluster available-resource monitoring information comprises:
allocating resources to the task according to the max-min fairness principle; and then determining the task's run server according to the network application parameters and the resources allocated to the task.
4. The method of claim 3, wherein the step of determining the task's run server based on the network application parameters and the resources allocated to the task comprises:
screening a first candidate server set out of the cluster according to the network application parameters; if the resources allocated to the task include GPU resources, screening a GPU server set out of the first candidate server set according to the GPU resources allocated to the task; and then screening the task's run server out of the GPU server set according to the other resources allocated to the task.
5. The method of claim 4, wherein the step of determining the task's run server based on the network application parameters and the resources allocated to the task further comprises:
if the resources allocated to the task include CPU resources and do not include GPU resources, screening a CPU server set out of the first candidate server set according to the CPU resources allocated to the task, and then screening the task's run server out of the CPU server set according to the other resources allocated to the task; and, if the screened CPU server set is empty, screening a GPU server set out of the first candidate server set according to the CPU resources allocated to the task, and then screening the task's run server out of the GPU server set according to the other resources allocated to the task.
6. A multi-resource scheduling apparatus, comprising:
a receiving module for receiving a task request, the task request including network application parameters, resource application parameters, and a task type;
an allocation module for allocating resources and a run server for the task according to the network application parameters, the resource application parameters, and current cluster available-resource monitoring information, and further for assigning a port number to the task if the task type belongs to a port-dependent type, which specifically includes: querying the distributed storage system with an initial port number; if the query shows the initial port number is not under lease, the allocation module assigning the initial port number to the task; if the query shows the initial port number is under lease, the allocation module updating the port number and querying the distributed storage system with the updated port number; and, if the query shows the updated port number is not under lease, the allocation module assigning the updated port number to the task;
and a setting module for, when the container on the run server is started in host network mode, setting the container's network according to the network application parameters and setting the resources allocated to the task as available resources of the container.
7. The apparatus of claim 6, wherein the resource application parameters include the type of resource applied for; the applied-for resource types include at least one of: GPU, CPU, and memory.
8. The apparatus of claim 7, wherein the allocation module allocating resources and a run server for the task based on the task request and current cluster available-resource monitoring information comprises:
the allocation module allocating resources to the task according to the max-min fairness principle, and then determining the task's run server according to the network application parameters and the resources allocated to the task.
9. The apparatus of claim 8, wherein the allocation module determining the task's run server based on the network application parameters and the resources allocated to the task comprises:
the allocation module screening a first candidate server set out of the cluster according to the network application parameters; if the resources allocated to the task include GPU resources, the allocation module screening a GPU server set out of the first candidate server set according to the GPU resources allocated to the task; and then the allocation module screening the task's run server out of the GPU server set according to the other resources allocated to the task.
10. The apparatus of claim 9, wherein the allocation module determining the task's run server based on the network application parameters and the resources allocated to the task further comprises:
if the resources allocated to the task include CPU resources and do not include GPU resources, the allocation module screening a CPU server set out of the first candidate server set according to the CPU resources allocated to the task, and then screening the task's run server out of the CPU server set according to the other resources allocated to the task; and, if the screened CPU server set is empty, the allocation module screening a GPU server set out of the first candidate server set according to the CPU resources allocated to the task, and then screening the task's run server out of the GPU server set according to the other resources allocated to the task.
11. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
which, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1 to 5.
12. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1 to 5.
CN201810145714.XA, filed 2018-02-12 (priority date 2018-02-12): Multi-resource scheduling method and device. Active. Granted as CN110166507B (en).

Priority Applications (1)

CN201810145714.XA (priority date 2018-02-12, filing date 2018-02-12): Multi-resource scheduling method and device, granted as CN110166507B

Publications (2)

Publication Number: Publication Date
CN110166507A (en): 2019-08-23
CN110166507B (en): 2023-06-23

Family

ID=67635200

Family Applications (1)

CN201810145714.XA (Active): Multi-resource scheduling method and device, granted as CN110166507B (en)

Country Status (1)

CN: CN110166507B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110554905A (en) * 2019-08-28 2019-12-10 北京奇艺世纪科技有限公司 Starting method and device of container
CN110704177B (en) * 2019-09-04 2022-06-10 金蝶软件(中国)有限公司 Computing task processing method and device, computer equipment and storage medium
CN110995780A (en) * 2019-10-30 2020-04-10 北京文渊佳科技有限公司 API calling method and device, storage medium and electronic equipment
CN111597036A (en) * 2020-04-15 2020-08-28 中国人民财产保险股份有限公司 Server resource configuration method and device
CN111813541B (en) * 2020-06-12 2024-04-09 北京火山引擎科技有限公司 Task scheduling method, device, medium and equipment
CN112860440A (en) * 2021-03-12 2021-05-28 云知声智能科技股份有限公司 Method and device for allocating cluster computing resources, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106970822A (en) * 2017-02-20 2017-07-21 阿里巴巴集团控股有限公司 A kind of container creation method and device
CN107135257A (en) * 2017-04-28 2017-09-05 东方网力科技股份有限公司 Task is distributed in a kind of node cluster method, node and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201403375A (en) * 2012-04-20 2014-01-16 歐樂岡科技公司 Secure zone for secure purchases


Also Published As

Publication number Publication date
CN110166507A (en) 2019-08-23


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant