CN114706690B - Method and system for sharing GPU (graphics processing Unit) by Kubernetes container - Google Patents


Info

Publication number
CN114706690B
CN114706690B (application CN202210627516.3A)
Authority
CN
China
Prior art keywords
gpu
pod
shared
kubernetes
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210627516.3A
Other languages
Chinese (zh)
Other versions
CN114706690A (en)
Inventor
Xue Shaoning
Lin Wei
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Software Technology Co Ltd
Original Assignee
Inspur Communication Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Communication Technology Co Ltd
Priority to CN202210627516.3A
Publication of CN114706690A
Application granted
Publication of CN114706690B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 Task transfer initiation or dispatching
    • G06F 9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F 9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a method and a system for sharing a GPU (graphics processing unit) among Kubernetes containers, belonging to the technical field of computers and comprising the following steps: mounting GPUs on the worker nodes, and acquiring the reported GPU resources of each worker node; deploying a GPU extension scheduler; deploying a GPU sharing device plugin component for each worker node, and acquiring the worker nodes with a shared-GPU requirement; and constructing a pod based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resources in the pod. According to the invention, GPU extension scheduling parameters are added on the basis of the original scheduling system in Kubernetes to form the GPU extension scheduler, and the GPU sharing device plugin component realizes scheduling and binding at the GPU-memory level when a pod uses GPU devices on Kubernetes, thereby solving the problem that Kubernetes cannot share a GPU across pods.

Description

Method and system for sharing GPU (graphics processing Unit) by Kubernetes container
Technical Field
The invention relates to the technical field of computers, in particular to a method and a system for sharing a GPU among Kubernetes containers.
Background
As is well known, Kubernetes, abbreviated K8s, is an open-source system for managing containerized applications across multiple hosts in a cloud platform; it aims to make deploying containerized applications simple and efficient, and provides mechanisms for application deployment, planning, updating, and maintenance.
In Kubernetes, the pod is the smallest schedulable unit of K8s; a pod contains one pause container and several service containers. However, a GPU cannot be shared among pods: a user can only use a physical GPU device on a node exclusively, and the GPU cannot be split and shared, but can only be occupied by a single pod alone. This is unfriendly to users who want to use GPU sharing to improve GPU utilization in the cluster, and causes serious resource waste, especially in model-development and model-inference scenarios.
In view of the above limitation that a GPU cannot be shared among pods in Kubernetes, a corresponding solution needs to be proposed.
Disclosure of Invention
The invention provides a method and a system for sharing a GPU among Kubernetes containers, which overcome the defect in the prior art that Kubernetes cannot share a GPU across pods.
In a first aspect, the present invention provides a Kubernetes container GPU sharing method, including:
determining worker nodes in the container orchestration engine Kubernetes, mounting graphics processing units (GPUs) on the worker nodes, and acquiring the reported GPU resources of each worker node;
deploying a GPU extension scheduler in the Kubernetes, wherein the GPU extension scheduler comprises GPU sharing policy configuration parameters;
deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources;
and calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resources in the pod.
According to the Kubernetes container GPU sharing method provided by the invention, determining the worker nodes in the container orchestration engine Kubernetes, mounting GPUs on the worker nodes, and acquiring the reported GPU resources of each worker node includes:
deploying a Kubernetes cluster, and acquiring the worker nodes whose container runtime is determined to support the GPU in the Kubernetes cluster;
and acquiring the GPU count and the GPU memory of the worker nodes by using the graphics-card viewing command, and confirming that the worker nodes have mounted GPU devices.
According to the Kubernetes container GPU sharing method provided by the invention, deploying a GPU extension scheduler in the Kubernetes, the GPU extension scheduler comprising GPU sharing policy configuration parameters, includes:
adding GPU memory parameters and GPU device parameters to the original scheduler in the Kubernetes to form the GPU extension scheduler;
and controlling the global scheduler in the Kubernetes to acquire the memory of each single GPU on each worker node, and controlling the GPU extension scheduler to record the GPU allocation result into the annotation (Annotation) field of the pod.
According to the Kubernetes container GPU sharing method provided by the invention, deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources includes:
deploying the GPU sharing device plugin component for each worker node based on a daemon controller (DaemonSet);
and adding a shared-GPU node label to the worker nodes determined to have the shared-GPU requirement, and obtaining the shared GPU resources.
According to the Kubernetes container GPU sharing method provided by the invention, after deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources, the method further includes:
reporting the GPU count and the GPU memory of the worker nodes to the node agent Kubelet;
and reporting, by the Kubelet, the GPU count and the GPU memory of the worker nodes to the Kubernetes API Server.
According to the Kubernetes container GPU sharing method provided by the invention, calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resource information in the pod includes:
calling the Kubernetes cluster to construct a pod, and adding the GPU memory parameters and GPU device parameters of the GPU extension scheduler to the resources field of the pod specified in the pod deployment script;
binding the worker node and the pod based on the GPU extension scheduler;
and controlling the Kubelet to receive the association event of the bound worker node and pod, creating a pod entity on the worker node by the Kubelet, and delivering the shared GPU resource information in the pod entity.
According to the Kubernetes container GPU sharing method provided by the invention, binding the worker node and the pod based on the GPU extension scheduler includes:
acquiring the GPU corresponding to the worker node based on the binpack resource scheduling policy, saving the device address information of the GPU in an annotation of the pod, saving the GPU memory applied for by the pod, together with the application timestamp, in annotations of the pod, and calling the Kubernetes API Server;
and binding the worker node and the pod by using the Kubernetes API Server.
In a second aspect, the present invention further provides a Kubernetes container GPU sharing system, including:
an acquisition module, configured to determine worker nodes in Kubernetes, mount GPUs on the worker nodes, and acquire the reported GPU resources of each worker node;
a deployment module, configured to deploy a GPU extension scheduler in the Kubernetes, where the GPU extension scheduler comprises GPU sharing policy configuration parameters;
a determining module, configured to deploy a GPU sharing device plugin component for each worker node, acquire the worker nodes with a shared-GPU requirement, and determine the shared GPU resources;
and a calling module, configured to call the Kubernetes, construct a pod based on the worker nodes with the shared-GPU requirement, and deliver the shared GPU resources in the pod.
In a third aspect, the present invention further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements any of the Kubernetes container GPU sharing methods described above.
In a fourth aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the Kubernetes container GPU sharing method as described in any of the above.
According to the Kubernetes container GPU sharing method and system, GPU extension scheduling parameters are added on the basis of the original scheduling system of Kubernetes to form the GPU extension scheduler, and the GPU sharing device plugin component achieves scheduling and binding at the video-memory level when a pod uses GPU devices on Kubernetes, thereby solving the problem that Kubernetes cannot share a GPU across pods.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of the Kubernetes container GPU sharing method provided by the invention;
FIG. 2 is a schematic structural diagram of the Kubernetes container GPU sharing system provided by the present invention;
fig. 3 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Aiming at the problem that a GPU cannot be reasonably shared and allocated on the Kubernetes platform, especially in the model-development and model-inference scenarios of machine learning, the invention provides a Kubernetes container GPU sharing method. Fig. 1 is a schematic flow chart of the Kubernetes container GPU sharing method provided by the invention; as shown in fig. 1, the method comprises the following steps:
Step 100: determining worker nodes in Kubernetes, mounting GPUs on the worker nodes, and acquiring the reported GPU resources of each worker node;
Step 200: deploying a GPU extension scheduler in the Kubernetes, wherein the GPU extension scheduler comprises GPU sharing policy configuration parameters;
Step 300: deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources;
Step 400: calling the Kubernetes, constructing a pod based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resources in the pod.
The core idea of the invention is based on Kubernetes technology: using Kubernetes extended resources, the scheduler extender, and the device plugin mechanism, the Kubernetes platform is made to support sharing and allocating a GPU among pods.
Specifically, in a Kubernetes cluster, resource scheduling by GPU memory and GPU count is added on the basis of the original scheduling system. Through the Kubernetes extended-scheduling technique and an added GPU extension scheduler for GPU devices (the GPU Share Scheduler Extender), the GPU devices on the nodes are scheduled and managed in a unified manner; and a GPU sharing device plugin component (the GPU Share Device Plugin), based on the Kubernetes Device Plugin mechanism, is added, so that scheduling and binding at the GPU-memory level are achieved when a pod on Kubernetes uses a GPU device.
The workflow is as follows: deploy a Kubernetes cluster, and configure the container runtime in the cluster to support running on a GPU; mount the GPU devices successfully; deploy the GPU extension scheduler (GPU Share Scheduler Extender); modify the Kubernetes default scheduler, adding extension parameters to the default scheduler by adding a policy configuration file parameter to the scheduler parameters; deploy the GPU Share Device Plugin component for each node in DaemonSet mode, and configure the component's operating permissions; add a gpushare node label to the nodes that need to use the shared GPU; deploy the kubectl extension kubectl-inspect-gpushare and grant it execute permission; call the Kubernetes cluster to create a pod, adding the self-defined GPU fields inspur.com/gpu-mem and inspur.com/gpu-count to the resources field of the pod specified in the pod deployment script, as sketched below; and after the pod is successfully created, enter the pod and run the relevant command to check the GPU information.
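For illustration only (the patent does not prescribe a client language), the following Go sketch uses the Kubernetes client-go library to create such a pod, carrying the inspur.com/gpu-mem and inspur.com/gpu-count extended resources in its resource limits. The kubeconfig path, namespace, container image, node-label value, and the convention that gpu-mem is counted in GiB are assumptions, not part of the claimed method.

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from a local kubeconfig (path is an assumption).
	config, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(config)

	// A pod requesting 3 GiB of GPU memory on 1 GPU device through the
	// extended resources described in this embodiment.
	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "gpu-share-demo"},
		Spec: corev1.PodSpec{
			NodeSelector: map[string]string{"gpushare": "true"}, // shared-GPU node label (assumed value)
			Containers: []corev1.Container{{
				Name:  "cuda",
				Image: "nvidia/cuda:11.4.3-base-ubuntu20.04",
				Resources: corev1.ResourceRequirements{
					Limits: corev1.ResourceList{
						corev1.ResourceName("inspur.com/gpu-mem"):   resource.MustParse("3"),
						corev1.ResourceName("inspur.com/gpu-count"): resource.MustParse("1"),
					},
				},
			}},
		},
	}
	created, err := clientset.CoreV1().Pods("default").Create(context.TODO(), pod, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("created pod:", created.Name)
}
```

In practice the same request is usually written as a YAML deployment script; Go is used here only so that all sketches in this description share one language.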
In particular, the method comprises GPU resource discovery and reporting, GPU resource management and scheduling, GPU resource count splitting, and GPU video-memory allocation.
According to the invention, GPU extension scheduling parameters are added on the basis of the original scheduling system in Kubernetes to form the GPU extension scheduler, and the GPU sharing device plugin component realizes scheduling and binding at the GPU-memory level when a pod uses GPU devices on Kubernetes, thereby solving the problem that Kubernetes cannot share a GPU across pods.
Based on the above embodiment, determining the worker nodes in the container orchestration engine Kubernetes, mounting GPUs on the worker nodes, and acquiring the reported GPU resources of each worker node includes:
deploying a Kubernetes cluster, and acquiring the worker nodes whose container runtime is determined to support the GPU in the Kubernetes cluster;
and acquiring the GPU count and the GPU memory of the worker nodes by using the graphics-card viewing command, and confirming that the worker nodes have mounted GPU devices.
Specifically, a Kubernetes cluster is deployed first, and the container runtime in the cluster is configured to support a GPU runtime, for example nvidia-docker2, which supports NVIDIA graphics cards.
Then, whether the GPU devices of the Kubernetes nodes are mounted successfully is checked; for example, an NVIDIA graphics card can be checked through the graphics-card viewing command nvidia-smi.
Through this initial Kubernetes cluster setup, the invention obtains the current state of GPU resources for the GPU devices already mounted on the existing nodes, which facilitates the subsequent allocation calculations.
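As a minimal sketch of this check, the Go helper below shells out to nvidia-smi to read the index, name, and total memory of each GPU; the query flags are standard nvidia-smi options, while wrapping them in Go is merely illustrative.

```go
package main

import (
	"fmt"
	"os/exec"
)

// Query per-GPU identity and total memory, as done when verifying that a
// node has successfully mounted its GPU devices.
func main() {
	out, err := exec.Command("nvidia-smi",
		"--query-gpu=index,name,memory.total",
		"--format=csv,noheader").Output()
	if err != nil {
		fmt.Println("nvidia-smi failed; GPU may not be mounted:", err)
		return
	}
	fmt.Print(string(out)) // e.g. "0, Tesla T4, 15360 MiB" per device
}
```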
Based on any of the above embodiments, deploying the GPU extension scheduler in the Kubernetes, the GPU extension scheduler comprising GPU sharing policy configuration parameters, includes:
adding GPU memory parameters and GPU device parameters to the original scheduler in the Kubernetes to form the GPU extension scheduler;
and controlling the global scheduler in the Kubernetes to acquire the memory of each single GPU on each worker node, and controlling the GPU extension scheduler to record the GPU allocation result into the annotation (Annotation) field of the pod.
Specifically, the Kubernetes default scheduler is modified, and extension parameters are added to the default scheduler by adding a policy configuration file parameter to the scheduler parameters. Two new extended resources are added to the scheduler in Kubernetes: the first is gpu-mem, corresponding to GPU video memory; the second is gpu-count, corresponding to the number of GPU devices.
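A hedged sketch of such a policy configuration follows: the Go program below emits a scheduler policy file that registers an HTTP extender managing the two extended resources. The URL, port, and resource names are assumptions; the field names mirror the legacy kube-scheduler Policy schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// managedResource and extender mirror the "extenders" entries of the legacy
// kube-scheduler v1 Policy file.
type managedResource struct {
	Name               string `json:"name"`
	IgnoredByScheduler bool   `json:"ignoredByScheduler"`
}

type extender struct {
	URLPrefix        string            `json:"urlPrefix"`
	FilterVerb       string            `json:"filterVerb"`
	BindVerb         string            `json:"bindVerb"`
	EnableHTTPS      bool              `json:"enableHttps"`
	NodeCacheCapable bool              `json:"nodeCacheCapable"`
	ManagedResources []managedResource `json:"managedResources"`
}

func main() {
	policy := map[string]interface{}{
		"kind":       "Policy",
		"apiVersion": "v1",
		"extenders": []extender{{
			URLPrefix:        "http://127.0.0.1:32766/gpushare-scheduler", // assumed endpoint
			FilterVerb:       "filter",
			BindVerb:         "bind",
			EnableHTTPS:      false,
			NodeCacheCapable: true,
			ManagedResources: []managedResource{
				{Name: "inspur.com/gpu-mem"},
				{Name: "inspur.com/gpu-count"},
			},
		}},
	}
	out, _ := json.MarshalIndent(policy, "", "  ")
	fmt.Println(string(out)) // written to the file passed via --policy-config-file
}
```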
Here, the invention uses the Extender mechanism of the Kubernetes scheduler and adds a new extension scheduler, the GPU Share Scheduler Extender. Its function is, when the global scheduler of Kubernetes performs filtering and binding, to judge whether a single GPU device on a node can provide enough GPU memory, and to record the GPU allocation result into the Pod's annotation for subsequent filtering at binding time.
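The simplified Go sketch below illustrates the filter side of such an extender: a node passes only if at least one single GPU still has enough free memory. The request/response shapes are simplified stand-ins for the real scheduler-extender wire format, and the per-node GPU cache is assumed to be maintained elsewhere (for example, rebuilt from the annotations of pods already bound to the node).

```go
package main

import (
	"encoding/json"
	"net/http"
)

// Per-GPU bookkeeping for one node: total and already-allocated GPU memory
// (GiB) of each physical device.
type nodeGPUs struct {
	Total     []int64
	Allocated []int64
}

var cache = map[string]*nodeGPUs{} // node name -> GPU state (assumed filled elsewhere)

// filter keeps only nodes where a *single* GPU can still hold the requested
// memory, a check the default scheduler's total-sum accounting cannot make.
func filter(w http.ResponseWriter, r *http.Request) {
	var args struct {
		Pod       json.RawMessage `json:"pod"`
		NodeNames []string        `json:"nodenames"`
	}
	if err := json.NewDecoder(r.Body).Decode(&args); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	requested := int64(3) // in practice, parsed from the pod's inspur.com/gpu-mem request

	fit := []string{}
	for _, name := range args.NodeNames {
		if gpus, ok := cache[name]; ok {
			for i := range gpus.Total {
				if gpus.Total[i]-gpus.Allocated[i] >= requested {
					fit = append(fit, name)
					break
				}
			}
		}
	}
	json.NewEncoder(w).Encode(map[string]interface{}{"nodenames": fit})
}

func main() {
	http.HandleFunc("/gpushare-scheduler/filter", filter)
	http.ListenAndServe(":32766", nil)
}
```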
Note that, like a Label, an Annotation in Kubernetes is also defined in the form of key/value pairs. The difference is that a Label has strict naming conventions, defines the metadata of Kubernetes objects, and is used by Label Selectors, whereas an Annotation is additional information defined arbitrarily by the user to facilitate lookups by external tools. In many cases, Kubernetes modules themselves attach special information to resource objects through Annotations. Generally, the information recorded in Annotations includes: build, release, and Docker image information, such as timestamps, release IDs, PR numbers, image hash values, and Docker registry addresses; address information of repositories such as log, monitoring, and analysis repositories; debugging-tool information, such as tool names and version numbers; and team contact information, such as phone numbers, owner names, and websites.
According to the invention, the default scheduler of Kubernetes is modified and GPU sharing parameters are added to form the GPU extension scheduler, so that it can be dedicated to GPU resource scheduling without changing the native structure of Kubernetes.
Based on any of the above embodiments, deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources includes:
deploying the GPU sharing device plugin component for each worker node based on a daemon controller (DaemonSet);
and adding a shared-GPU node label to the worker nodes determined to have the shared-GPU requirement, and obtaining the shared GPU resources.
After deploying the GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources, the method further includes:
reporting the GPU count and the GPU memory of the worker nodes to the node agent Kubelet;
and reporting, by the Kubelet, the GPU count and the GPU memory of the worker nodes to the Kubernetes API Server.
Specifically, the GPU Share Device Plugin component is deployed for each node in DaemonSet mode, and the component's operating permissions are configured at the same time, ensuring that each node can report and deliver resources; a gpushare node label is then added to the nodes that need to use the shared GPU.
Here, a DaemonSet manages only Pod objects; through the NodeAffinity scheduler and the Toleration scheduler, it ensures that exactly one Pod replica runs on each eligible node. When a new node dynamically joins the cluster, the Pod in the DaemonSet is also added on the newly joined node, and deleting a DaemonSet cascades the deletion of all the Pods it created.
Further, the kubectl extension kubectl-inspect-gpushare is deployed and granted execute permission, and the shared GPU data of the nodes can be viewed through the kubectl inspect gpushare command. Here, the Device Plugin mechanism of Kubernetes is used, and the GPU Share Device Plugin is added. Its main function is to query resource information such as the number of GPU devices and the GPU memory on the node, and to report this information to the Kubelet through ListAndWatch(); the Kubelet in turn reports it to the Kubernetes API Server.
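A minimal Go sketch of the reporting side is given below, using the kubelet device-plugin API: each GiB of GPU memory is advertised to the Kubelet as one virtual device, so that the node's allocatable inspur.com/gpu-mem equals its total GPU memory. The one-device-per-GiB split and the struct layout are assumptions consistent with the resource-count splitting described above, not the patent's verbatim implementation.

```go
package main

import (
	"fmt"
	"time"

	pluginapi "k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1"
)

// gpuShareServer advertises GPU memory as virtual devices.
type gpuShareServer struct {
	gpuMemGiB int // total GPU memory on this node, in GiB
}

// ListAndWatch reports one healthy virtual device per GiB of GPU memory;
// Kubelet forwards the totals to the API Server as allocatable resources.
func (s *gpuShareServer) ListAndWatch(_ *pluginapi.Empty, stream pluginapi.DevicePlugin_ListAndWatchServer) error {
	devs := make([]*pluginapi.Device, 0, s.gpuMemGiB)
	for i := 0; i < s.gpuMemGiB; i++ {
		devs = append(devs, &pluginapi.Device{
			ID:     fmt.Sprintf("gpu-mem-%d", i),
			Health: pluginapi.Healthy,
		})
	}
	if err := stream.Send(&pluginapi.ListAndWatchResponse{Devices: devs}); err != nil {
		return err
	}
	for {
		time.Sleep(10 * time.Second) // a real plugin re-sends here on health changes
	}
}

func main() {
	s := &gpuShareServer{gpuMemGiB: 16}
	fmt.Println("advertising", s.gpuMemGiB, "virtual gpu-mem devices")
	// Registration with the Kubelet's device-plugin gRPC socket is omitted.
}
```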
It should be further noted that kubectl, the Kubernetes command-line tool (CLI), is an essential management tool for Kubernetes users and administrators; kubectl provides a large number of subcommands, which facilitates managing the various functions of a Kubernetes cluster.
Kubelet is an agent component on Kubernetes worker nodes, running on each node. Kubelet is the primary service on the worker node: it periodically receives new or modified Pod specifications from the kube-apiserver component and ensures that the Pods and their containers run according to the desired specifications. At the same time, the component serves as the monitoring component of the worker node and reports the running condition of the host to kube-apiserver.
The Kubernetes API Server provides HTTP REST interfaces for adding, deleting, modifying, querying, and watching the various Kubernetes resource objects (Pod, RC, Service, and the like), and is the data bus and data center of the whole system. The functions of the Kubernetes API Server are: providing the REST API for cluster management (including authentication and authorization, data validation, and cluster state changes); serving as the hub for data interaction and communication between the other modules (the other modules query or modify data through the API Server, and only the API Server operates etcd directly); serving as the entry point for resource quota control; and providing a complete cluster security mechanism.
The invention deploys the GPU Share Device Plugin component for each node in DaemonSet mode, ensuring that each node can report and deliver resources.
Based on any of the above embodiments, calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resource information in the pod includes:
calling the Kubernetes cluster to construct a pod, and adding the GPU memory parameters and GPU device parameters of the GPU extension scheduler to the resources field of the pod specified in the pod deployment script;
binding the worker node and the pod based on the GPU extension scheduler;
and controlling the Kubelet to receive the association event of the bound worker node and pod, creating a pod entity on the worker node by the Kubelet, and delivering the shared GPU resource information in the pod entity.
Binding the worker node and the pod based on the GPU extension scheduler includes:
acquiring the GPU corresponding to the worker node based on the binpack resource scheduling policy, saving the device address information of the GPU in an annotation of the pod, saving the GPU memory applied for by the pod, together with the application timestamp, in annotations of the pod, and calling the Kubernetes API Server;
and binding the worker node and the pod by using the Kubernetes API Server.
Specifically, the Kubernetes cluster is called to create a pod, and the self-defined GPU fields inspur.com/gpu-mem and inspur.com/gpu-count are added to the resources field of the pod specified in the pod deployment script. After the pod is successfully created, the pod is entered and the relevant command is run to check the GPU information; for example, an NVIDIA graphics card can be checked through the nvidia-smi command.
It can be understood that, at pod creation time, the filter of the GPU Share Scheduler Extender is invoked over HTTP after the Kubernetes scheduler completes all default filters. This is because, when the default scheduler computes extended resources, it can only determine whether the total amount of free resources meets the requirement; it cannot judge whether a single device does. Whether a single device has available resources therefore needs to be checked by the GPU Share Scheduler Extender, achieving more precise scheduling.
When the Kubernetes scheduler finds a node that meets the requirements, it delegates the binding of the node and the Pod to the GPU Share Scheduler Extender. The binding is performed in two steps:
firstly, the GPU Share Scheduler Extender finds GPU equipment in the node according to the bipack rule, records the ID of the GPU equipment and stores the GPU _ ID in the indices of the position. Meanwhile, the GPU memory of the POD application is saved in the GPU _ MEM _ estimate annotation of the GPU _ MEM _ POD of the POD. If no GPU is found during binding, no binding is performed at this time. The default scheduler will reschedule after an expiration timeout.
Second, the Kubernetes API is called to bind the pod and the node, as sketched below.
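The two steps can be sketched in Go with client-go as follows: the tightest-fitting GPU is chosen under the binpack rule, the decision is patched into the pod's annotations, and the pod is then bound to the node through the API server. The annotation keys GPU_ID and GPU_MEM_POD follow the names used in this description, while GPU_MEM_ASSUME_TIME and the exact key strings and bookkeeping are assumptions.

```go
package gpushare

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// nodeGPUs holds per-GPU totals and allocations (GiB), as in the filter sketch.
type nodeGPUs struct {
	Total     []int64
	Allocated []int64
}

// bindPod selects a GPU by binpack, records the decision on the pod, and
// binds the pod to the node through the Kubernetes API Server.
func bindPod(cs kubernetes.Interface, pod *corev1.Pod, node string, gpus nodeGPUs, requested int64) error {
	chosen, best := -1, int64(1)<<62
	for i := range gpus.Total {
		free := gpus.Total[i] - gpus.Allocated[i]
		if free >= requested && free < best { // binpack: tightest fit wins
			chosen, best = i, free
		}
	}
	if chosen < 0 { // no single GPU fits: skip binding, default scheduler retries later
		return fmt.Errorf("no single GPU can provide %d GiB", requested)
	}

	// Step one: stamp the allocation decision into the pod's annotations.
	patch := fmt.Sprintf(
		`{"metadata":{"annotations":{"GPU_ID":"%d","GPU_MEM_POD":"%d","GPU_MEM_ASSUME_TIME":"%d"}}}`,
		chosen, requested, time.Now().UnixNano())
	if _, err := cs.CoreV1().Pods(pod.Namespace).Patch(context.TODO(), pod.Name,
		types.StrategicMergePatchType, []byte(patch), metav1.PatchOptions{}); err != nil {
		return err
	}

	// Step two: ask the API Server to bind the pod to the node.
	return cs.CoreV1().Pods(pod.Namespace).Bind(context.TODO(), &corev1.Binding{
		ObjectMeta: metav1.ObjectMeta{Name: pod.Name, Namespace: pod.Namespace},
		Target:     corev1.ObjectReference{Kind: "Node", Name: node},
	}, metav1.CreateOptions{})
}
```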
When the Kubelet receives the event that the Pod has been bound to the node, the Kubelet creates a real Pod entity on the node. During this process, the GPU Share Device Plugin obtains from the Kubernetes API Server, ordered by timestamp, the information of all pending GPU Share pods on the node that have not yet been assigned; it marks the GPU_MEM_ASSIGNED annotation of the pod whose requested GPU memory amount matches as true, and converts the GPU information in the pod annotation into an environment variable that is returned to the Kubelet for actually creating the pod.
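This hand-off can be sketched as the device plugin's Allocate step, shown below; lookupAssignedGPU is a hypothetical helper standing in for the timestamp-ordered pod lookup just described, and NVIDIA_VISIBLE_DEVICES is the conventional environment variable read by the NVIDIA container runtime.

```go
package gpushare

import (
	"context"

	pluginapi "k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1"
)

type gpuShareServer struct{} // see the ListAndWatch sketch above

// lookupAssignedGPU is an assumed helper: it queries the API Server for the
// oldest pending gpushare pod on this node (by its application-timestamp
// annotation), marks its GPU_MEM_ASSIGNED annotation "true", and returns
// the GPU_ID the extender recorded for it.
func (s *gpuShareServer) lookupAssignedGPU() string { return "0" }

// Allocate converts the scheduler's per-pod GPU decision into an environment
// variable for the container that the Kubelet is about to create.
func (s *gpuShareServer) Allocate(ctx context.Context, req *pluginapi.AllocateRequest) (*pluginapi.AllocateResponse, error) {
	resp := &pluginapi.AllocateResponse{}
	for range req.ContainerRequests {
		resp.ContainerResponses = append(resp.ContainerResponses,
			&pluginapi.ContainerAllocateResponse{
				Envs: map[string]string{"NVIDIA_VISIBLE_DEVICES": s.lookupAssignedGPU()},
			})
	}
	return resp, nil
}
```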
It should be noted that the invention is not limited to implementing shared allocation of GPU resources for containers deployed on a virtual machine platform; it also covers implementing shared allocation of GPU resources for containers deployed on a physical machine platform.
The Kubernetes container GPU sharing system provided by the present invention is described below; the Kubernetes container GPU sharing system described below and the Kubernetes container GPU sharing method described above may be referred to correspondingly.
Fig. 2 is a schematic structural diagram of the Kubernetes container GPU sharing system provided by the present invention; as shown in fig. 2, it includes: an acquisition module 21, a deployment module 22, a determining module 23, and a calling module 24, wherein:
the acquisition module 21 is configured to determine worker nodes in Kubernetes, mount GPUs on the worker nodes, and acquire the reported GPU resources of each worker node; the deployment module 22 is configured to deploy a GPU extension scheduler in the Kubernetes, where the GPU extension scheduler comprises GPU sharing policy configuration parameters; the determining module 23 is configured to deploy a GPU sharing device plugin component for each worker node, acquire the worker nodes with the shared-GPU requirement, and determine the shared GPU resources; and the calling module 24 is configured to call the Kubernetes, construct a pod based on the worker nodes with the shared-GPU requirement, and deliver the shared GPU resources in the pod.
According to the invention, GPU extension scheduling parameters are added on the basis of the original scheduling system in Kubernetes to form the GPU extension scheduler, and the GPU sharing device plugin component realizes scheduling and binding at the GPU-memory level when a pod uses GPU devices on Kubernetes, thereby solving the problem that Kubernetes cannot share a GPU across pods.
Fig. 3 illustrates a schematic physical structure diagram of an electronic device, which, as shown in fig. 3, may include: a processor 310, a communication interface 320, a memory 330, and a communication bus 340, wherein the processor 310, the communication interface 320, and the memory 330 communicate with each other via the communication bus 340. The processor 310 may call logic instructions in the memory 330 to perform a Kubernetes container GPU sharing method, comprising: determining worker nodes in the container orchestration engine Kubernetes, mounting graphics processing units (GPUs) on the worker nodes, and acquiring the reported GPU resources of each worker node; deploying a GPU extension scheduler in the Kubernetes, wherein the GPU extension scheduler comprises GPU sharing policy configuration parameters; deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources; and calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resources in the pod.
In addition, the logic instructions in the memory 330 may be implemented in the form of software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program, the computer program being storable on a non-transitory computer-readable storage medium; when the computer program is executed by a processor, it can execute the Kubernetes container GPU sharing method provided by the above methods, the method comprising: determining worker nodes in the container orchestration engine Kubernetes, mounting graphics processing units (GPUs) on the worker nodes, and acquiring the reported GPU resources of each worker node; deploying a GPU extension scheduler in the Kubernetes, wherein the GPU extension scheduler comprises GPU sharing policy configuration parameters; deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources; and calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resources in the pod.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the Kubernetes container GPU sharing method provided by the above methods, the method comprising: determining worker nodes in the container orchestration engine Kubernetes, mounting graphics processing units (GPUs) on the worker nodes, and acquiring the reported GPU resources of each worker node; deploying a GPU extension scheduler in the Kubernetes, wherein the GPU extension scheduler comprises GPU sharing policy configuration parameters; deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources; and calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resources in the pod. The above-described apparatus embodiments are merely illustrative; the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A Kubernetes container GPU sharing method, characterized by comprising the following steps:
determining worker nodes in a container orchestration engine Kubernetes, mounting graphics processing units (GPUs) on the worker nodes, and acquiring the reported GPU resources of each worker node;
deploying a GPU extension scheduler in the Kubernetes, wherein the GPU extension scheduler comprises GPU sharing policy configuration parameters;
deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources;
calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resources in the pod;
wherein deploying the GPU extension scheduler in the Kubernetes, the GPU extension scheduler comprising GPU sharing policy configuration parameters, includes:
adding GPU memory parameters and GPU device parameters to the original scheduler in the Kubernetes to form the GPU extension scheduler;
and controlling the global scheduler in the Kubernetes to acquire the memory of each single GPU on each worker node, and controlling the GPU extension scheduler to record the GPU allocation result into the annotation (Annotation) field of the pod.
2. The Kubernetes container GPU sharing method according to claim 1, wherein determining the worker nodes in the container orchestration engine Kubernetes, mounting GPUs on the worker nodes, and acquiring the reported GPU resources of each worker node includes:
deploying a Kubernetes cluster, and acquiring the worker nodes whose container runtime is determined to support the GPU in the Kubernetes cluster;
and acquiring the GPU count and the GPU memory of the worker nodes by using the graphics-card viewing command, and confirming that the worker nodes have mounted GPU devices.
3. The Kubernetes container GPU sharing method according to claim 1, wherein deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources comprises:
deploying the GPU sharing device plugin component for each worker node based on a daemon controller (DaemonSet);
and adding a shared-GPU node label to the worker nodes determined to have the shared-GPU requirement, and obtaining the shared GPU resources.
4. The Kubernetes container GPU sharing method according to claim 3, wherein after deploying a GPU sharing device plugin component for each worker node, acquiring the worker nodes with a shared-GPU requirement, and determining the shared GPU resources, the method further comprises:
reporting the GPU count and the GPU memory of the worker nodes to the node agent Kubelet;
and reporting, by the Kubelet, the GPU count and the GPU memory of the worker nodes to the Kubernetes API Server.
5. The Kubernetes container GPU sharing method according to claim 1, wherein calling the Kubernetes, constructing a container group (pod) based on the worker nodes with the shared-GPU requirement, and delivering the shared GPU resource information in the pod comprises:
calling the Kubernetes cluster to construct a pod, and adding the GPU memory parameters and GPU device parameters of the GPU extension scheduler to the resources field of the pod specified in the pod deployment script;
binding the worker node and the pod based on the GPU extension scheduler;
and controlling the Kubelet to receive the association event of the bound worker node and pod, creating a pod entity on the worker node by the Kubelet, and delivering the shared GPU resource information in the pod entity.
6. The Kubernetes container GPU sharing method according to claim 5, wherein binding the worker node and the pod based on the GPU extension scheduler comprises:
acquiring the GPU corresponding to the worker node based on the binpack resource scheduling policy, saving the device address information of the GPU in an annotation of the pod, saving the GPU memory applied for by the pod, together with the application timestamp, in annotations of the pod, and calling the Kubernetes API Server;
and binding the worker node and the pod by using the Kubernetes API Server.
7. A Kubernetes container GPU sharing system, comprising:
an acquisition module, configured to determine worker nodes in Kubernetes, mount GPUs on the worker nodes, and acquire the reported GPU resources of each worker node;
a deployment module, configured to deploy a GPU extension scheduler in the Kubernetes, where the GPU extension scheduler comprises GPU sharing policy configuration parameters;
a determining module, configured to deploy a GPU sharing device plugin component for each worker node, acquire the worker nodes with the shared-GPU requirement, and determine the shared GPU resources;
a calling module, configured to call the Kubernetes, construct a pod based on the worker nodes with the shared-GPU requirement, and deliver the shared GPU resources in the pod;
wherein the deployment module is specifically configured to:
add GPU memory parameters and GPU device parameters to the original scheduler in the Kubernetes to form the GPU extension scheduler;
and control the global scheduler in the Kubernetes to acquire the memory of each single GPU on each worker node, and control the GPU extension scheduler to record the GPU allocation result into the annotation (Annotation) field of the pod.
8. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the Kubernetes container GPU sharing method of any of claims 1 to 6.
9. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the Kubernetes container GPU sharing method of any of claims 1 to 6.
CN202210627516.3A 2022-06-06 2022-06-06 Method and system for sharing GPU (graphics processing Unit) by Kubernetes container Active CN114706690B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210627516.3A CN114706690B (en) 2022-06-06 2022-06-06 Method and system for sharing GPU (graphics processing Unit) by Kubernetes container

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210627516.3A CN114706690B (en) 2022-06-06 2022-06-06 Method and system for sharing GPU (graphics processing Unit) by Kubernetes container

Publications (2)

Publication Number Publication Date
CN114706690A CN114706690A (en) 2022-07-05
CN114706690B true CN114706690B (en) 2022-09-16

Family

ID=82177814

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210627516.3A Active CN114706690B (en) 2022-06-06 2022-06-06 Method and system for sharing GPU (graphics processing Unit) by Kubernetes container

Country Status (1)

Country Link
CN (1) CN114706690B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116089009A (en) * 2023-02-01 2023-05-09 华院计算技术(上海)股份有限公司 GPU resource management method, system, equipment and storage medium
CN116258622A (en) * 2023-02-16 2023-06-13 青软创新科技集团股份有限公司 GPU distribution method and device based on container, electronic equipment and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111506404A (en) * 2020-04-07 2020-08-07 上海德拓信息技术股份有限公司 Kubernetes-based shared GPU (graphics processing Unit) scheduling method
CN111913794A (en) * 2020-08-04 2020-11-10 北京百度网讯科技有限公司 Method and device for sharing GPU, electronic equipment and readable storage medium
CN112231049A (en) * 2020-09-28 2021-01-15 苏州浪潮智能科技有限公司 Computing equipment sharing method, device, equipment and storage medium based on kubernets
CN112559164A (en) * 2019-09-25 2021-03-26 中兴通讯股份有限公司 Resource sharing method and device
WO2021203805A1 (en) * 2020-04-08 2021-10-14 苏州浪潮智能科技有限公司 Gpu-shared dispatching and single-machine multi-card methods, systems and devices

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11693698B2 (en) * 2018-11-23 2023-07-04 Netapp, Inc. System and method for infrastructure scaling
JP7385156B2 (en) * 2020-04-16 2023-11-22 日本電信電話株式会社 Scheduling method, scheduler, GPU cluster system and program
CN112463375A (en) * 2020-11-26 2021-03-09 广州橙行智动汽车科技有限公司 Data processing method and device
CN113204428B (en) * 2021-05-28 2023-01-20 北京市商汤科技开发有限公司 Resource scheduling method, device, electronic equipment and computer readable storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559164A (en) * 2019-09-25 2021-03-26 中兴通讯股份有限公司 Resource sharing method and device
CN111506404A (en) * 2020-04-07 2020-08-07 上海德拓信息技术股份有限公司 Kubernetes-based shared GPU (graphics processing Unit) scheduling method
WO2021203805A1 (en) * 2020-04-08 2021-10-14 苏州浪潮智能科技有限公司 Gpu-shared dispatching and single-machine multi-card methods, systems and devices
CN111913794A (en) * 2020-08-04 2020-11-10 北京百度网讯科技有限公司 Method and device for sharing GPU, electronic equipment and readable storage medium
EP3876100A2 (en) * 2020-08-04 2021-09-08 Beijing Baidu Netcom Science And Technology Co. Ltd. Method and apparatus for sharing gpu, electronic device and readable storage medium
CN112231049A (en) * 2020-09-28 2021-01-15 苏州浪潮智能科技有限公司 Computing equipment sharing method, device, equipment and storage medium based on kubernets
WO2022062650A1 (en) * 2020-09-28 2022-03-31 苏州浪潮智能科技有限公司 Computing device sharing method and apparatus based on kubernetes, and device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Fine-Grained Allocation Algorithms for Heterogeneous Resources in Data Centers; Tang Xiaochun et al.; Journal of Northwestern Polytechnical University (西北工业大学学报); 2020-06-15 (No. 03); full text *

Also Published As

Publication number Publication date
CN114706690A (en) 2022-07-05

Similar Documents

Publication Publication Date Title
WO2020253347A1 (en) Container cluster management method, device and system
CN114706690B (en) Method and system for sharing GPU (graphics processing Unit) by Kubernetes container
US10620927B2 (en) Method, arrangement, computer program product and data processing program for deploying a software service
US10594800B2 (en) Platform runtime abstraction
CN113296792B (en) Storage method, device, equipment, storage medium and system
EP4033349A1 (en) Method and apparatus for generating mirror image file, and computer-readable storage medium
CN111475227A (en) Business plug-in loading implementation method and device and terminal equipment
CN113094028A (en) Windows desktop program development framework, method and related components
CN110532058B (en) Management method, device and equipment of container cluster service and readable storage medium
Dragoicea et al. Integrating HLA and service-oriented architecture in a simulation framework
CN114006815A (en) Automatic deployment method and device for cloud platform nodes, nodes and storage medium
CN115037757B (en) Multi-cluster service management system
CN113010385B (en) Task state updating method, device, equipment and medium
CN115237455A (en) Application management method and related equipment
CN109257256A (en) Apparatus monitoring method, device, computer equipment and storage medium
CN115485677A (en) Secure data replication in a distributed data storage environment
CN113485830A (en) Micro-service automatic capacity expansion method for power grid monitoring system
CN113741912A (en) Model management system, method, device and equipment
CN112422308A (en) Method and device for realizing operation and maintenance monitoring
CN117251297B (en) Equipment distribution method, electronic equipment and storage medium
CN112100283B (en) Linux platform based time-sharing multiplexing method for android virtual machine
Chandra Effective memory utilization using custom scheduler in kubernetes
CN113050979B (en) Installation configuration method and device for installing operating system, and installation method and device
CN110321335B (en) Modeling data downloading method and device, electronic equipment and computer storage medium
CN113703798A (en) Distributed service updating method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20230613

Address after: 250100 building S02, Inspur Science Park, No. 1036, Inspur Road, high tech Zone, Jinan, Shandong

Patentee after: Inspur Software Technology Co.,Ltd.

Address before: 266107 No. 2, Xiangtan Road, Danshan Industrial Park, Chengyang District, Qingdao, Shandong

Patentee before: Inspur Communication Technology Co.,Ltd.