CN114706690B - Method and system for sharing GPU (graphics processing Unit) by Kubernetes container - Google Patents


Info

Publication number
CN114706690B
CN114706690B (application CN202210627516.3A)
Authority
CN
China
Prior art keywords
gpu
pod
shared
kubernetes
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210627516.3A
Other languages
Chinese (zh)
Other versions
CN114706690A (en)
Inventor
Xue Shaoning
Lin Wei
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Software Technology Co Ltd
Original Assignee
Inspur Communication Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Communication Technology Co Ltd
Priority to CN202210627516.3A
Publication of CN114706690A
Application granted
Publication of CN114706690B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 Task transfer initiation or dispatching
    • G06F 9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F 9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a method and a system for sharing a GPU (graphics processing unit) among Kubernetes containers, belonging to the technical field of computers and comprising the following steps: mounting GPUs on the worker nodes, and acquiring the reported GPU resources of each worker node; deploying a GPU extension scheduler; deploying a GPU sharing device plugin component for each worker node, and acquiring the worker nodes with a shared-GPU requirement; and constructing a pod based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resources in the pod. According to the invention, GPU extension scheduling parameters are added on the basis of the original scheduling system in Kubernetes to form the GPU extension scheduler, and the GPU sharing device plugin component realizes scheduling and binding at the GPU-memory level when a pod uses GPU devices on Kubernetes, thereby solving the problem that Kubernetes cannot share a GPU across pods.

Description

Method and system for sharing GPU (graphics processing Unit) by Kubernetes container
Technical Field
The invention relates to the technical field of computers, in particular to a method and a system for sharing a GPU among Kubernetes containers.
Background
As is well known, Kubernetes, abbreviated K8s, is an open-source system for managing containerized applications across multiple hosts in a cloud platform; it aims to make deploying containerized applications simple and efficient, and provides mechanisms for application deployment, planning, updating, and maintenance.
In Kubernetes, the pod is the smallest schedulable unit of K8s; a pod contains one pause container and several service containers. However, a GPU cannot be shared among pods: a user can only use a physical GPU device on a node exclusively, and the GPU cannot be split and shared, but can only be occupied by a single pod alone. This is unfriendly to users who want to use GPU sharing to improve GPU utilization in the cluster, and causes serious resource waste, especially in model-development and model-inference scenarios.
In view of the above limitation that a GPU cannot be shared among pods in Kubernetes, a corresponding solution needs to be proposed.
Disclosure of Invention
The invention provides a method and a system for sharing a GPU among Kubernetes containers, which overcome the defect in the prior art that Kubernetes cannot share a GPU across pods.
In a first aspect, the present invention provides a Kubernetes container GPU sharing method, including:
determining worker nodes in the container orchestration engine Kubernetes, mounting graphics processing units (GPUs) on the worker nodes, and acquiring the reported GPU resources of each worker node;
deploying a GPU extension scheduler in the Kubernetes, wherein the GPU extension scheduler comprises GPU sharing policy configuration parameters;
deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources;
and calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resources in the pod.
According to the Kubernetes container GPU sharing method provided by the invention, determining the worker nodes in the container orchestration engine Kubernetes, mounting GPUs on the worker nodes, and acquiring the reported GPU resources of each worker node includes:
deploying a Kubernetes cluster, and acquiring the worker nodes whose container runtime is determined to support the GPU in the Kubernetes cluster;
and acquiring the GPU count and the GPU memory of the worker nodes by using the graphics-card viewing command, and confirming that the worker nodes have mounted GPU devices.
According to the Kubernetes container GPU sharing method provided by the invention, deploying a GPU extension scheduler in the Kubernetes, the GPU extension scheduler comprising GPU sharing policy configuration parameters, includes:
adding GPU memory parameters and GPU device parameters to the original scheduler in the Kubernetes to form the GPU extension scheduler;
and controlling the global scheduler in the Kubernetes to acquire the memory of each single GPU on each worker node, and controlling the GPU extension scheduler to record the GPU allocation result into the annotation (Annotation) field of the pod.
According to the Kubernetes container GPU sharing method provided by the invention, deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources includes:
deploying the GPU sharing device plugin component for each worker node based on a daemon controller (DaemonSet);
and adding a shared-GPU node label to the worker nodes determined to have the shared-GPU requirement, and obtaining the shared GPU resources.
According to the Kubernetes container GPU sharing method provided by the invention, after deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources, the method further includes:
reporting the GPU count and the GPU memory of the worker nodes to the node agent Kubelet;
and reporting, by the Kubelet, the GPU count and the GPU memory of the worker nodes to the Kubernetes API Server.
According to the Kubernetes container GPU sharing method provided by the invention, calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resource information in the pod includes:
calling the Kubernetes cluster to construct a pod, and adding the GPU memory parameters and GPU device parameters of the GPU extension scheduler to the resources field of the pod specified in the pod deployment script;
binding the worker node and the pod based on the GPU extension scheduler;
and controlling the Kubelet to receive the association event of the bound worker node and pod, creating a pod entity on the worker node by the Kubelet, and delivering the shared GPU resource information in the pod entity.
According to the Kubernetes container GPU sharing method provided by the invention, binding the worker node and the pod based on the GPU extension scheduler includes:
acquiring the GPU corresponding to the worker node based on the binpack resource scheduling policy, saving the device address information of the GPU in an annotation of the pod, saving the GPU memory applied for by the pod, together with the application timestamp, in annotations of the pod, and calling the Kubernetes API Server;
and binding the worker node and the pod by using the Kubernetes API Server.
In a second aspect, the present invention further provides a Kubernetes container GPU sharing system, including:
an acquisition module, configured to determine worker nodes in Kubernetes, mount GPUs on the worker nodes, and acquire the reported GPU resources of each worker node;
a deployment module, configured to deploy a GPU extension scheduler in the Kubernetes, where the GPU extension scheduler comprises GPU sharing policy configuration parameters;
a determining module, configured to deploy a GPU sharing device plugin component for each worker node, acquire the worker nodes with a shared-GPU requirement, and determine the shared GPU resources;
and a calling module, configured to call the Kubernetes, construct a pod based on the worker nodes with the shared-GPU requirement, and deliver the shared GPU resources in the pod.
In a third aspect, the present invention further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements any of the Kubernetes container GPU sharing methods described above.
In a fourth aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the Kubernetes container GPU sharing method as described in any of the above.
According to the Kubernetes container GPU sharing method and system, GPU extension scheduling parameters are added on the basis of the original scheduling system of Kubernetes to form the GPU extension scheduler, and the GPU sharing device plugin component achieves scheduling and binding at the video-memory level when a pod uses GPU devices on Kubernetes, thereby solving the problem that Kubernetes cannot share a GPU across pods.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of the Kubernetes container GPU sharing method provided by the invention;
FIG. 2 is a schematic structural diagram of the Kubernetes container GPU sharing system provided by the present invention;
fig. 3 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Aiming at the problem that a GPU cannot be reasonably shared and allocated on the Kubernetes platform, especially in the model-development and model-inference scenarios of machine learning, the invention provides a Kubernetes container GPU sharing method. Fig. 1 is a schematic flow chart of the Kubernetes container GPU sharing method provided by the invention; as shown in fig. 1, the method comprises the following steps:
Step 100: determining worker nodes in Kubernetes, mounting GPUs on the worker nodes, and acquiring the reported GPU resources of each worker node;
Step 200: deploying a GPU extension scheduler in the Kubernetes, wherein the GPU extension scheduler comprises GPU sharing policy configuration parameters;
Step 300: deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources;
Step 400: calling the Kubernetes, constructing a pod based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resources in the pod.
The core idea of the invention is based on Kubernetes technology: using Kubernetes extended resources, the scheduler extender, and the device plugin mechanism, the Kubernetes platform is made to support sharing and allocating a GPU among pods.
Specifically, in a Kubernetes cluster, resource scheduling by GPU memory and GPU count is added on the basis of the original scheduling system. Through the Kubernetes extended-scheduling technique and an added GPU extension scheduler for GPU devices (the GPU Share Scheduler Extender), the GPU devices on the nodes are scheduled and managed in a unified manner; and a GPU sharing device plugin component (the GPU Share Device Plugin), based on the Kubernetes Device Plugin mechanism, is added, so that scheduling and binding at the GPU-memory level are achieved when a pod on Kubernetes uses a GPU device.
The workflow is as follows: deploy a Kubernetes cluster, and configure the container runtime in the cluster to support running on a GPU; mount the GPU devices successfully; deploy the GPU extension scheduler (GPU Share Scheduler Extender); modify the Kubernetes default scheduler, adding extension parameters to the default scheduler by adding a policy configuration file parameter to the scheduler parameters; deploy the GPU Share Device Plugin component for each node in DaemonSet mode, and configure the component's operating permissions; add a gpushare node label to the nodes that need to use the shared GPU; deploy the kubectl extension kubectl-inspect-gpushare and grant it execute permission; call the Kubernetes cluster to create a pod, adding the self-defined GPU fields inspur.com/gpu-mem and inspur.com/gpu-count to the resources field of the pod specified in the pod deployment script, as sketched below; and after the pod is successfully created, enter the pod and run the relevant command to check the GPU information.
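For illustration only (the patent does not prescribe a client language), the following Go sketch uses the Kubernetes client-go library to create such a pod, carrying the inspur.com/gpu-mem and inspur.com/gpu-count extended resources in its resource limits. The kubeconfig path, namespace, container image, node-label value, and the convention that gpu-mem is counted in GiB are assumptions, not part of the claimed method.

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from a local kubeconfig (path is an assumption).
	config, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(config)

	// A pod requesting 3 GiB of GPU memory on 1 GPU device through the
	// extended resources described in this embodiment.
	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "gpu-share-demo"},
		Spec: corev1.PodSpec{
			NodeSelector: map[string]string{"gpushare": "true"}, // shared-GPU node label (assumed value)
			Containers: []corev1.Container{{
				Name:  "cuda",
				Image: "nvidia/cuda:11.4.3-base-ubuntu20.04",
				Resources: corev1.ResourceRequirements{
					Limits: corev1.ResourceList{
						corev1.ResourceName("inspur.com/gpu-mem"):   resource.MustParse("3"),
						corev1.ResourceName("inspur.com/gpu-count"): resource.MustParse("1"),
					},
				},
			}},
		},
	}
	created, err := clientset.CoreV1().Pods("default").Create(context.TODO(), pod, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("created pod:", created.Name)
}
```

In practice the same request is usually written as a YAML deployment script; Go is used here only so that all sketches in this description share one language.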
In particular, the method comprises GPU resource discovery and reporting, GPU resource management and scheduling, GPU resource count splitting, and GPU video-memory allocation.
According to the invention, GPU extension scheduling parameters are added on the basis of the original scheduling system in Kubernetes to form the GPU extension scheduler, and the GPU sharing device plugin component realizes scheduling and binding at the GPU-memory level when a pod uses GPU devices on Kubernetes, thereby solving the problem that Kubernetes cannot share a GPU across pods.
Based on the above embodiment, determining the worker nodes in the container orchestration engine Kubernetes, mounting GPUs on the worker nodes, and acquiring the reported GPU resources of each worker node includes:
deploying a Kubernetes cluster, and acquiring the worker nodes whose container runtime is determined to support the GPU in the Kubernetes cluster;
and acquiring the GPU count and the GPU memory of the worker nodes by using the graphics-card viewing command, and confirming that the worker nodes have mounted GPU devices.
Specifically, a Kubernetes cluster is deployed first, and the container runtime in the cluster is configured to support a GPU runtime, for example nvidia-docker2, which supports NVIDIA graphics cards.
Then, whether the GPU devices of the Kubernetes nodes are mounted successfully is checked; for example, an NVIDIA graphics card can be checked through the graphics-card viewing command nvidia-smi.
Through this initial Kubernetes cluster setup, the invention obtains the current state of GPU resources for the GPU devices already mounted on the existing nodes, which facilitates the subsequent allocation calculations.
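As a minimal sketch of this check, the Go helper below shells out to nvidia-smi to read the index, name, and total memory of each GPU; the query flags are standard nvidia-smi options, while wrapping them in Go is merely illustrative.

```go
package main

import (
	"fmt"
	"os/exec"
)

// Query per-GPU identity and total memory, as done when verifying that a
// node has successfully mounted its GPU devices.
func main() {
	out, err := exec.Command("nvidia-smi",
		"--query-gpu=index,name,memory.total",
		"--format=csv,noheader").Output()
	if err != nil {
		fmt.Println("nvidia-smi failed; GPU may not be mounted:", err)
		return
	}
	fmt.Print(string(out)) // e.g. "0, Tesla T4, 15360 MiB" per device
}
```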
Based on any of the above embodiments, deploying the GPU extension scheduler in the Kubernetes, the GPU extension scheduler comprising GPU sharing policy configuration parameters, includes:
adding GPU memory parameters and GPU device parameters to the original scheduler in the Kubernetes to form the GPU extension scheduler;
and controlling the global scheduler in the Kubernetes to acquire the memory of each single GPU on each worker node, and controlling the GPU extension scheduler to record the GPU allocation result into the annotation (Annotation) field of the pod.
Specifically, the Kubernetes default scheduler is modified, and extension parameters are added to the default scheduler by adding a policy configuration file parameter to the scheduler parameters. Two new extended resources are added to the scheduler in Kubernetes: the first is gpu-mem, corresponding to GPU video memory; the second is gpu-count, corresponding to the number of GPU devices.
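A hedged sketch of such a policy configuration follows: the Go program below emits a scheduler policy file that registers an HTTP extender managing the two extended resources. The URL, port, and resource names are assumptions; the field names mirror the legacy kube-scheduler Policy schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// managedResource and extender mirror the "extenders" entries of the legacy
// kube-scheduler v1 Policy file.
type managedResource struct {
	Name               string `json:"name"`
	IgnoredByScheduler bool   `json:"ignoredByScheduler"`
}

type extender struct {
	URLPrefix        string            `json:"urlPrefix"`
	FilterVerb       string            `json:"filterVerb"`
	BindVerb         string            `json:"bindVerb"`
	EnableHTTPS      bool              `json:"enableHttps"`
	NodeCacheCapable bool              `json:"nodeCacheCapable"`
	ManagedResources []managedResource `json:"managedResources"`
}

func main() {
	policy := map[string]interface{}{
		"kind":       "Policy",
		"apiVersion": "v1",
		"extenders": []extender{{
			URLPrefix:        "http://127.0.0.1:32766/gpushare-scheduler", // assumed endpoint
			FilterVerb:       "filter",
			BindVerb:         "bind",
			EnableHTTPS:      false,
			NodeCacheCapable: true,
			ManagedResources: []managedResource{
				{Name: "inspur.com/gpu-mem"},
				{Name: "inspur.com/gpu-count"},
			},
		}},
	}
	out, _ := json.MarshalIndent(policy, "", "  ")
	fmt.Println(string(out)) // written to the file passed via --policy-config-file
}
```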
Here, the invention uses the Extender mechanism of the Kubernetes scheduler and adds a new extension scheduler, the GPU Share Scheduler Extender. Its function is, when the global scheduler of Kubernetes performs filtering and binding, to judge whether a single GPU device on a node can provide enough GPU memory, and to record the GPU allocation result into the Pod's annotation for subsequent filtering at binding time.
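The simplified Go sketch below illustrates the filter side of such an extender: a node passes only if at least one single GPU still has enough free memory. The request/response shapes are simplified stand-ins for the real scheduler-extender wire format, and the per-node GPU cache is assumed to be maintained elsewhere (for example, rebuilt from the annotations of pods already bound to the node).

```go
package main

import (
	"encoding/json"
	"net/http"
)

// Per-GPU bookkeeping for one node: total and already-allocated GPU memory
// (GiB) of each physical device.
type nodeGPUs struct {
	Total     []int64
	Allocated []int64
}

var cache = map[string]*nodeGPUs{} // node name -> GPU state (assumed filled elsewhere)

// filter keeps only nodes where a *single* GPU can still hold the requested
// memory, a check the default scheduler's total-sum accounting cannot make.
func filter(w http.ResponseWriter, r *http.Request) {
	var args struct {
		Pod       json.RawMessage `json:"pod"`
		NodeNames []string        `json:"nodenames"`
	}
	if err := json.NewDecoder(r.Body).Decode(&args); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	requested := int64(3) // in practice, parsed from the pod's inspur.com/gpu-mem request

	fit := []string{}
	for _, name := range args.NodeNames {
		if gpus, ok := cache[name]; ok {
			for i := range gpus.Total {
				if gpus.Total[i]-gpus.Allocated[i] >= requested {
					fit = append(fit, name)
					break
				}
			}
		}
	}
	json.NewEncoder(w).Encode(map[string]interface{}{"nodenames": fit})
}

func main() {
	http.HandleFunc("/gpushare-scheduler/filter", filter)
	http.ListenAndServe(":32766", nil)
}
```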
Note that, like a Label, an Annotation in Kubernetes is also defined in the form of key/value pairs. The difference is that a Label has strict naming conventions, defines the metadata of Kubernetes objects, and is used by Label Selectors, whereas an Annotation is additional information defined arbitrarily by the user to facilitate lookups by external tools. In many cases, Kubernetes modules themselves attach special information to resource objects through Annotations. Generally, the information recorded in Annotations includes: build, release, and Docker image information, such as timestamps, release IDs, PR numbers, image hash values, and Docker registry addresses; address information of repositories such as log, monitoring, and analysis repositories; debugging-tool information, such as tool names and version numbers; and team contact information, such as phone numbers, owner names, and websites.
According to the invention, the default scheduler of Kubernetes is modified and GPU sharing parameters are added to form the GPU extension scheduler, so that it can be dedicated to GPU resource scheduling without changing the native structure of Kubernetes.
Based on any of the above embodiments, deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources includes:
deploying the GPU sharing device plugin component for each worker node based on a daemon controller (DaemonSet);
and adding a shared-GPU node label to the worker nodes determined to have the shared-GPU requirement, and obtaining the shared GPU resources.
After deploying the GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources, the method further includes:
reporting the GPU count and the GPU memory of the worker nodes to the node agent Kubelet;
and reporting, by the Kubelet, the GPU count and the GPU memory of the worker nodes to the Kubernetes API Server.
Specifically, the GPU Share Device Plugin component is deployed for each node in DaemonSet mode, and the component's operating permissions are configured at the same time, ensuring that each node can report and deliver resources; a gpushare node label is then added to the nodes that need to use the shared GPU.
Here, a DaemonSet manages only Pod objects; through the NodeAffinity scheduler and the Toleration scheduler, it ensures that exactly one Pod replica runs on each eligible node. When a new node dynamically joins the cluster, the Pod in the DaemonSet is also added on the newly joined node, and deleting a DaemonSet cascades the deletion of all the Pods it created.
Further, the kubectl extension kubectl-inspect-gpushare is deployed and granted execute permission, and the shared GPU data of the nodes can be viewed through the kubectl inspect gpushare command. Here, the Device Plugin mechanism of Kubernetes is used, and the GPU Share Device Plugin is added. Its main function is to query resource information such as the number of GPU devices and the GPU memory on the node, and to report this information to the Kubelet through ListAndWatch(); the Kubelet in turn reports it to the Kubernetes API Server.
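A minimal Go sketch of the reporting side is given below, using the kubelet device-plugin API: each GiB of GPU memory is advertised to the Kubelet as one virtual device, so that the node's allocatable inspur.com/gpu-mem equals its total GPU memory. The one-device-per-GiB split and the struct layout are assumptions consistent with the resource-count splitting described above, not the patent's verbatim implementation.

```go
package main

import (
	"fmt"
	"time"

	pluginapi "k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1"
)

// gpuShareServer advertises GPU memory as virtual devices.
type gpuShareServer struct {
	gpuMemGiB int // total GPU memory on this node, in GiB
}

// ListAndWatch reports one healthy virtual device per GiB of GPU memory;
// Kubelet forwards the totals to the API Server as allocatable resources.
func (s *gpuShareServer) ListAndWatch(_ *pluginapi.Empty, stream pluginapi.DevicePlugin_ListAndWatchServer) error {
	devs := make([]*pluginapi.Device, 0, s.gpuMemGiB)
	for i := 0; i < s.gpuMemGiB; i++ {
		devs = append(devs, &pluginapi.Device{
			ID:     fmt.Sprintf("gpu-mem-%d", i),
			Health: pluginapi.Healthy,
		})
	}
	if err := stream.Send(&pluginapi.ListAndWatchResponse{Devices: devs}); err != nil {
		return err
	}
	for {
		time.Sleep(10 * time.Second) // a real plugin re-sends here on health changes
	}
}

func main() {
	s := &gpuShareServer{gpuMemGiB: 16}
	fmt.Println("advertising", s.gpuMemGiB, "virtual gpu-mem devices")
	// Registration with the Kubelet's device-plugin gRPC socket is omitted.
}
```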
It should be further noted that kubectl, the Kubernetes command-line tool (CLI), is an essential management tool for Kubernetes users and administrators; kubectl provides a large number of subcommands, which facilitates managing the various functions of a Kubernetes cluster.
Kubelet is an agent component on Kubernetes worker nodes, running on each node. Kubelet is the primary service on the worker node: it periodically receives new or modified Pod specifications from the kube-apiserver component and ensures that the Pods and their containers run according to the desired specifications. At the same time, the component serves as the monitoring component of the worker node and reports the running condition of the host to kube-apiserver.
The Kubernetes API Server provides HTTP REST interfaces for adding, deleting, modifying, querying, and watching the various Kubernetes resource objects (Pod, RC, Service, and the like), and is the data bus and data center of the whole system. The functions of the Kubernetes API Server are: providing the REST API for cluster management (including authentication and authorization, data validation, and cluster state changes); serving as the hub for data interaction and communication between the other modules (the other modules query or modify data through the API Server, and only the API Server operates etcd directly); serving as the entry point for resource quota control; and providing a complete cluster security mechanism.
The invention deploys the GPU Share Device Plugin component for each node in DaemonSet mode, ensuring that each node can report and deliver resources.
Based on any of the above embodiments, calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resource information in the pod includes:
calling the Kubernetes cluster to construct a pod, and adding the GPU memory parameters and GPU device parameters of the GPU extension scheduler to the resources field of the pod specified in the pod deployment script;
binding the worker node and the pod based on the GPU extension scheduler;
and controlling the Kubelet to receive the association event of the bound worker node and pod, creating a pod entity on the worker node by the Kubelet, and delivering the shared GPU resource information in the pod entity.
Binding the worker node and the pod based on the GPU extension scheduler includes:
acquiring the GPU corresponding to the worker node based on the binpack resource scheduling policy, saving the device address information of the GPU in an annotation of the pod, saving the GPU memory applied for by the pod, together with the application timestamp, in annotations of the pod, and calling the Kubernetes API Server;
and binding the worker node and the pod by using the Kubernetes API Server.
Specifically, the Kubernetes cluster is called to create a pod, and the self-defined GPU fields inspur.com/gpu-mem and inspur.com/gpu-count are added to the resources field of the pod specified in the pod deployment script. After the pod is successfully created, the pod is entered and the relevant command is run to check the GPU information; for example, an NVIDIA graphics card can be checked through the nvidia-smi command.
It can be understood that, at pod creation time, the filter of the GPU Share Scheduler Extender is invoked over HTTP after the Kubernetes scheduler completes all default filters. This is because, when the default scheduler computes extended resources, it can only determine whether the total amount of free resources meets the requirement; it cannot judge whether a single device does. Whether a single device has available resources therefore needs to be checked by the GPU Share Scheduler Extender, achieving more precise scheduling.
When the Kubernetes scheduler finds a node that meets the requirements, it delegates the binding of the node and the Pod to the GPU Share Scheduler Extender. The binding is performed in two steps:
firstly, the GPU Share Scheduler Extender finds GPU equipment in the node according to the bipack rule, records the ID of the GPU equipment and stores the GPU _ ID in the indices of the position. Meanwhile, the GPU memory of the POD application is saved in the GPU _ MEM _ estimate annotation of the GPU _ MEM _ POD of the POD. If no GPU is found during binding, no binding is performed at this time. The default scheduler will reschedule after an expiration timeout.
Second, the Kubernetes API is called to bind the pod and the node, as sketched below.
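The two steps can be sketched in Go with client-go as follows: the tightest-fitting GPU is chosen under the binpack rule, the decision is patched into the pod's annotations, and the pod is then bound to the node through the API server. The annotation keys GPU_ID and GPU_MEM_POD follow the names used in this description, while GPU_MEM_ASSUME_TIME and the exact key strings and bookkeeping are assumptions.

```go
package gpushare

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// nodeGPUs holds per-GPU totals and allocations (GiB), as in the filter sketch.
type nodeGPUs struct {
	Total     []int64
	Allocated []int64
}

// bindPod selects a GPU by binpack, records the decision on the pod, and
// binds the pod to the node through the Kubernetes API Server.
func bindPod(cs kubernetes.Interface, pod *corev1.Pod, node string, gpus nodeGPUs, requested int64) error {
	chosen, best := -1, int64(1)<<62
	for i := range gpus.Total {
		free := gpus.Total[i] - gpus.Allocated[i]
		if free >= requested && free < best { // binpack: tightest fit wins
			chosen, best = i, free
		}
	}
	if chosen < 0 { // no single GPU fits: skip binding, default scheduler retries later
		return fmt.Errorf("no single GPU can provide %d GiB", requested)
	}

	// Step one: stamp the allocation decision into the pod's annotations.
	patch := fmt.Sprintf(
		`{"metadata":{"annotations":{"GPU_ID":"%d","GPU_MEM_POD":"%d","GPU_MEM_ASSUME_TIME":"%d"}}}`,
		chosen, requested, time.Now().UnixNano())
	if _, err := cs.CoreV1().Pods(pod.Namespace).Patch(context.TODO(), pod.Name,
		types.StrategicMergePatchType, []byte(patch), metav1.PatchOptions{}); err != nil {
		return err
	}

	// Step two: ask the API Server to bind the pod to the node.
	return cs.CoreV1().Pods(pod.Namespace).Bind(context.TODO(), &corev1.Binding{
		ObjectMeta: metav1.ObjectMeta{Name: pod.Name, Namespace: pod.Namespace},
		Target:     corev1.ObjectReference{Kind: "Node", Name: node},
	}, metav1.CreateOptions{})
}
```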
When the Kubelet receives the event that the Pod has been bound to the node, the Kubelet creates a real Pod entity on the node. During this process, the GPU Share Device Plugin obtains from the Kubernetes API Server, ordered by timestamp, the information of all pending GPU Share pods on the node that have not yet been assigned; it marks the GPU_MEM_ASSIGNED annotation of the pod whose requested GPU memory amount matches as true, and converts the GPU information in the pod annotation into an environment variable that is returned to the Kubelet for actually creating the pod.
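This hand-off can be sketched as the device plugin's Allocate step, shown below; lookupAssignedGPU is a hypothetical helper standing in for the timestamp-ordered pod lookup just described, and NVIDIA_VISIBLE_DEVICES is the conventional environment variable read by the NVIDIA container runtime.

```go
package gpushare

import (
	"context"

	pluginapi "k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1"
)

type gpuShareServer struct{} // see the ListAndWatch sketch above

// lookupAssignedGPU is an assumed helper: it queries the API Server for the
// oldest pending gpushare pod on this node (by its application-timestamp
// annotation), marks its GPU_MEM_ASSIGNED annotation "true", and returns
// the GPU_ID the extender recorded for it.
func (s *gpuShareServer) lookupAssignedGPU() string { return "0" }

// Allocate converts the scheduler's per-pod GPU decision into an environment
// variable for the container that the Kubelet is about to create.
func (s *gpuShareServer) Allocate(ctx context.Context, req *pluginapi.AllocateRequest) (*pluginapi.AllocateResponse, error) {
	resp := &pluginapi.AllocateResponse{}
	for range req.ContainerRequests {
		resp.ContainerResponses = append(resp.ContainerResponses,
			&pluginapi.ContainerAllocateResponse{
				Envs: map[string]string{"NVIDIA_VISIBLE_DEVICES": s.lookupAssignedGPU()},
			})
	}
	return resp, nil
}
```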
It should be noted that the invention is not limited to implementing shared allocation of GPU resources for containers deployed on a virtual machine platform; it also covers implementing shared allocation of GPU resources for containers deployed on a physical machine platform.
The Kubernetes container GPU sharing system provided by the present invention is described below; the Kubernetes container GPU sharing system described below and the Kubernetes container GPU sharing method described above may be referred to correspondingly.
Fig. 2 is a schematic structural diagram of the Kubernetes container GPU sharing system provided by the present invention; as shown in fig. 2, it includes: an acquisition module 21, a deployment module 22, a determining module 23, and a calling module 24, wherein:
the acquisition module 21 is configured to determine worker nodes in Kubernetes, mount GPUs on the worker nodes, and acquire the reported GPU resources of each worker node; the deployment module 22 is configured to deploy a GPU extension scheduler in the Kubernetes, where the GPU extension scheduler comprises GPU sharing policy configuration parameters; the determining module 23 is configured to deploy a GPU sharing device plugin component for each worker node, acquire the worker nodes with the shared-GPU requirement, and determine the shared GPU resources; and the calling module 24 is configured to call the Kubernetes, construct a pod based on the worker nodes with the shared-GPU requirement, and deliver the shared GPU resources in the pod.
According to the invention, GPU extension scheduling parameters are added on the basis of the original scheduling system in Kubernetes to form the GPU extension scheduler, and the GPU sharing device plugin component realizes scheduling and binding at the GPU-memory level when a pod uses GPU devices on Kubernetes, thereby solving the problem that Kubernetes cannot share a GPU across pods.
Fig. 3 illustrates a schematic physical structure diagram of an electronic device, which, as shown in fig. 3, may include: a processor 310, a communication interface 320, a memory 330, and a communication bus 340, wherein the processor 310, the communication interface 320, and the memory 330 communicate with each other via the communication bus 340. The processor 310 may call logic instructions in the memory 330 to perform a Kubernetes container GPU sharing method, comprising: determining worker nodes in the container orchestration engine Kubernetes, mounting graphics processing units (GPUs) on the worker nodes, and acquiring the reported GPU resources of each worker node; deploying a GPU extension scheduler in the Kubernetes, wherein the GPU extension scheduler comprises GPU sharing policy configuration parameters; deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources; and calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resources in the pod.
In addition, the logic instructions in the memory 330 may be implemented in the form of software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program, the computer program being storable on a non-transitory computer-readable storage medium; when the computer program is executed by a processor, it can execute the Kubernetes container GPU sharing method provided by the above methods, the method comprising: determining worker nodes in the container orchestration engine Kubernetes, mounting graphics processing units (GPUs) on the worker nodes, and acquiring the reported GPU resources of each worker node; deploying a GPU extension scheduler in the Kubernetes, wherein the GPU extension scheduler comprises GPU sharing policy configuration parameters; deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources; and calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resources in the pod.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the Kubernetes container GPU sharing method provided by the above methods, the method comprising: determining worker nodes in the container orchestration engine Kubernetes, mounting graphics processing units (GPUs) on the worker nodes, and acquiring the reported GPU resources of each worker node; deploying a GPU extension scheduler in the Kubernetes, wherein the GPU extension scheduler comprises GPU sharing policy configuration parameters; deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources; and calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resources in the pod. The above-described apparatus embodiments are merely illustrative; the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A Kubernetes container GPU sharing method, characterized by comprising the following steps:
determining worker nodes in a container orchestration engine Kubernetes, mounting graphics processing units (GPUs) on the worker nodes, and acquiring the reported GPU resources of each worker node;
deploying a GPU extension scheduler in the Kubernetes, wherein the GPU extension scheduler comprises GPU sharing policy configuration parameters;
deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources;
calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resources in the pod;
wherein deploying the GPU extension scheduler in the Kubernetes, the GPU extension scheduler comprising GPU sharing policy configuration parameters, includes:
adding GPU memory parameters and GPU device parameters to the original scheduler in the Kubernetes to form the GPU extension scheduler;
and controlling the global scheduler in the Kubernetes to acquire the memory of each single GPU on each worker node, and controlling the GPU extension scheduler to record the GPU allocation result into the annotation (Annotation) field of the pod.
2. The Kubernetes container GPU sharing method according to claim 1, wherein determining the worker nodes in the container orchestration engine Kubernetes, mounting GPUs on the worker nodes, and acquiring the reported GPU resources of each worker node includes:
deploying a Kubernetes cluster, and acquiring the worker nodes whose container runtime is determined to support the GPU in the Kubernetes cluster;
and acquiring the GPU count and the GPU memory of the worker nodes by using the graphics-card viewing command, and confirming that the worker nodes have mounted GPU devices.
3. The Kubernetes container GPU sharing method according to claim 1, wherein deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources comprises:
deploying the GPU sharing device plugin component for each worker node based on a daemon controller (DaemonSet);
and adding a shared-GPU node label to the worker nodes determined to have the shared-GPU requirement, and obtaining the shared GPU resources.
4. The Kubernetes container GPU sharing method according to claim 3, wherein after deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources, the method further comprises:
reporting the GPU count and the GPU memory of the worker nodes to the node agent Kubelet;
and reporting, by the Kubelet, the GPU count and the GPU memory of the worker nodes to the Kubernetes API Server.
5. The Kubernetes container GPU sharing method according to claim 1, wherein calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resource information in the pod comprises:
calling the Kubernetes cluster to construct a pod, and adding the GPU memory parameters and GPU device parameters of the GPU extension scheduler to the resources field of the pod specified in the pod deployment script;
binding the worker node and the pod based on the GPU extension scheduler;
and controlling the Kubelet to receive the association event of the bound worker node and pod, creating a pod entity on the worker node by the Kubelet, and delivering the shared GPU resource information in the pod entity.
6. The Kubernetes container GPU sharing method according to claim 5, wherein binding the worker node and the pod based on the GPU extension scheduler comprises:
acquiring the GPU corresponding to the worker node based on the binpack resource scheduling policy, saving the device address information of the GPU in an annotation of the pod, saving the GPU memory applied for by the pod, together with the application timestamp, in annotations of the pod, and calling the Kubernetes API Server;
and binding the worker node and the pod by using the Kubernetes API Server.
7. A Kubernetes container GPU sharing system, comprising:
an acquisition module, configured to determine worker nodes in Kubernetes, mount GPUs on the worker nodes, and acquire the reported GPU resources of each worker node;
a deployment module, configured to deploy a GPU extension scheduler in the Kubernetes, where the GPU extension scheduler comprises GPU sharing policy configuration parameters;
a determining module, configured to deploy a GPU sharing device plugin component for each worker node, acquire the worker nodes with the shared-GPU requirement, and determine the shared GPU resources;
a calling module, configured to call the Kubernetes, construct a pod based on the worker nodes with the shared-GPU requirement, and deliver the shared GPU resources in the pod;
wherein the deployment module is specifically configured to:
add GPU memory parameters and GPU device parameters to the original scheduler in the Kubernetes to form the GPU extension scheduler;
and control the global scheduler in the Kubernetes to acquire the memory of each single GPU on each worker node, and control the GPU extension scheduler to record the GPU allocation result into the annotation (Annotation) field of the pod.
8. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the Kubernetes container GPU sharing method of any of claims 1 to 6.
9. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the Kubernetes container GPU sharing method of any of claims 1 to 6.
CN202210627516.3A 2022-06-06 2022-06-06 Method and system for sharing GPU (graphics processing Unit) by Kubernetes container Active CN114706690B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210627516.3A CN114706690B (en) 2022-06-06 2022-06-06 Method and system for sharing GPU (graphics processing Unit) by Kubernetes container

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210627516.3A CN114706690B (en) 2022-06-06 2022-06-06 Method and system for sharing GPU (graphics processing Unit) by Kubernetes container

Publications (2)

Publication Number Publication Date
CN114706690A CN114706690A (en) 2022-07-05
CN114706690B true CN114706690B (en) 2022-09-16

Family

ID=82177814

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210627516.3A Active CN114706690B (en) 2022-06-06 2022-06-06 Method and system for sharing GPU (graphics processing Unit) by Kubernetes container

Country Status (1)

Country Link
CN (1) CN114706690B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116089009A (en) * 2023-02-01 2023-05-09 华院计算技术(上海)股份有限公司 GPU resource management method, system, equipment and storage medium
CN116258622A (en) * 2023-02-16 2023-06-13 青软创新科技集团股份有限公司 GPU distribution method and device based on container, electronic equipment and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111506404A (en) * 2020-04-07 2020-08-07 上海德拓信息技术股份有限公司 Kubernetes-based shared GPU (graphics processing Unit) scheduling method
CN111913794A (en) * 2020-08-04 2020-11-10 北京百度网讯科技有限公司 Method and device for sharing GPU, electronic equipment and readable storage medium
CN112231049A (en) * 2020-09-28 2021-01-15 苏州浪潮智能科技有限公司 Computing equipment sharing method, device, equipment and storage medium based on kubernets
CN112559164A (en) * 2019-09-25 2021-03-26 中兴通讯股份有限公司 Resource sharing method and device
WO2021203805A1 (en) * 2020-04-08 2021-10-14 苏州浪潮智能科技有限公司 Gpu-shared dispatching and single-machine multi-card methods, systems and devices

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11693698B2 (en) * 2018-11-23 2023-07-04 Netapp, Inc. System and method for infrastructure scaling
JP7385156B2 (en) * 2020-04-16 2023-11-22 日本電信電話株式会社 Scheduling method, scheduler, GPU cluster system and program
CN112463375A (en) * 2020-11-26 2021-03-09 广州橙行智动汽车科技有限公司 Data processing method and device
CN113204428B (en) * 2021-05-28 2023-01-20 北京市商汤科技开发有限公司 Resource scheduling method, device, electronic equipment and computer readable storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559164A (en) * 2019-09-25 2021-03-26 中兴通讯股份有限公司 Resource sharing method and device
CN111506404A (en) * 2020-04-07 2020-08-07 上海德拓信息技术股份有限公司 Kubernetes-based shared GPU (graphics processing Unit) scheduling method
WO2021203805A1 (en) * 2020-04-08 2021-10-14 苏州浪潮智能科技有限公司 Gpu-shared dispatching and single-machine multi-card methods, systems and devices
CN111913794A (en) * 2020-08-04 2020-11-10 北京百度网讯科技有限公司 Method and device for sharing GPU, electronic equipment and readable storage medium
EP3876100A2 (en) * 2020-08-04 2021-09-08 Beijing Baidu Netcom Science And Technology Co. Ltd. Method and apparatus for sharing gpu, electronic device and readable storage medium
CN112231049A (en) * 2020-09-28 2021-01-15 苏州浪潮智能科技有限公司 Computing equipment sharing method, device, equipment and storage medium based on kubernets
WO2022062650A1 (en) * 2020-09-28 2022-03-31 苏州浪潮智能科技有限公司 Computing device sharing method and apparatus based on kubernetes, and device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Fine-Grained Allocation Algorithms for Heterogeneous Resources in Data Centers; Tang Xiaochun et al.; Journal of Northwestern Polytechnical University (西北工业大学学报); 2020-06-15 (No. 03); full text *

Also Published As

Publication number Publication date
CN114706690A (en) 2022-07-05

Similar Documents

Publication Publication Date Title
WO2020253347A1 (en) Container cluster management method, device and system
CN114706690B (en) Method and system for sharing GPU (graphics processing Unit) by Kubernetes container
US10620927B2 (en) Method, arrangement, computer program product and data processing program for deploying a software service
US10594800B2 (en) Platform runtime abstraction
CN113296792B (en) Storage method, device, equipment, storage medium and system
EP4033349A1 (en) Method and apparatus for generating mirror image file, and computer-readable storage medium
CN111475227A (en) Business plug-in loading implementation method and device and terminal equipment
CN113094028A (en) Windows desktop program development framework, method and related components
CN110532058B (en) Management method, device and equipment of container cluster service and readable storage medium
Dragoicea et al. Integrating HLA and service-oriented architecture in a simulation framework
CN114006815A (en) Automatic deployment method and device for cloud platform nodes, nodes and storage medium
CN115037757B (en) Multi-cluster service management system
CN113010385B (en) Task state updating method, device, equipment and medium
CN115237455A (en) Application management method and related equipment
CN109257256A (en) Apparatus monitoring method, device, computer equipment and storage medium
CN115485677A (en) Secure data replication in a distributed data storage environment
CN113485830A (en) Micro-service automatic capacity expansion method for power grid monitoring system
CN113741912A (en) Model management system, method, device and equipment
CN112422308A (en) Method and device for realizing operation and maintenance monitoring
CN117251297B (en) Equipment distribution method, electronic equipment and storage medium
CN112100283B (en) Linux platform based time-sharing multiplexing method for android virtual machine
Chandra Effective memory utilization using custom scheduler in kubernetes
CN113050979B (en) Installation configuration method and device for installing operating system, and installation method and device
CN110321335B (en) Modeling data downloading method and device, electronic equipment and computer storage medium
CN113703798A (en) Distributed service updating method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20230613

Address after: 250100 building S02, Inspur Science Park, No. 1036, Inspur Road, high tech Zone, Jinan, Shandong

Patentee after: Inspur Software Technology Co.,Ltd.

Address before: 266107 No. 2, Xiangtan Road, Danshan Industrial Park, Chengyang District, Qingdao, Shandong

Patentee before: Inspur Communication Technology Co.,Ltd.