CN112162864A - Cloud resource allocation method and device and storage medium - Google Patents

Cloud resource allocation method and device and storage medium

Info

Publication number
CN112162864A
CN112162864A (application CN202011158622.9A; granted as CN112162864B)
Authority
CN
China
Prior art keywords
cloud host
cloud
utilization rate
resources
resource utilization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011158622.9A
Other languages
Chinese (zh)
Other versions
CN112162864B (en)
Inventor
兰天 (Lan Tian)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Big Data Technologies Co Ltd
Original Assignee
New H3C Big Data Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Big Data Technologies Co Ltd
Priority to CN202011158622.9A
Publication of CN112162864A
Application granted
Publication of CN112162864B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5066Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The disclosure provides a cloud resource allocation method, apparatus, and storage medium for improving the utilization of cloud resources. In the method, a GPU acceleration policy and an associated threshold are configured, and the cloud operating system is triggered to add GPU resources to a cloud host when the host's CPU resource utilization becomes too high, improving the overall computing performance of the cloud host. By scheduling and allocating GPU and CPU resources rationally and effectively, the disclosure improves the elastic scaling capability of the cloud operating system's cloud resources, saves physical resources while still meeting user requirements, and increases the utilization of cloud resources.

Description

Cloud resource allocation method and device and storage medium
Technical Field
The present disclosure relates to the field of cloud computing technologies, and in particular to a cloud resource allocation method, apparatus, and storage medium.
Background
The elastic scaling function of cloud resources automatically adjusts computing resources according to a user's services and policies, meeting the user's changing requirements as business volume fluctuates. Currently, every cloud vendor supports elastic scaling based on physical resources such as the CPU and memory. The CPU and the GPU are today's mainstream processors, and these physical resources are relatively expensive; if the two kinds of resources cannot be integrated and allocated rationally and effectively, cloud resources are wasted and user experience suffers.
Disclosure of Invention
In view of this, the present disclosure provides a cloud resource allocation method, apparatus, and storage medium for improving the utilization of cloud resources.
Based on an embodiment of the present disclosure, a cloud resource allocation method is provided, including:
monitoring the CPU resource utilization of a first cloud host allocated to a user, and triggering execution of a first elastic scaling policy for allocating GPU resources when the CPU resource utilization of the first cloud host is determined to exceed a first threshold; and
allocating GPU resources to the first cloud host according to the first elastic scaling policy.
Further, the method includes: determining whether the CPU resource utilization of the first cloud host exceeds a second threshold, the second threshold being greater than the first threshold; triggering execution of a second elastic scaling policy for expanding the cloud hosts when the CPU resource utilization of the first cloud host exceeds the second threshold; and expanding a new cloud host for the user according to the second elastic scaling policy.
Further, the method includes: if, when the CPU resource utilization of the first cloud host exceeds the first threshold and execution of the first elastic scaling policy for allocating GPU resources is triggered, no GPU resource is found to be available in the GPU resource pool, directly triggering execution of the second elastic scaling policy for expanding the cloud hosts.
Further, the method includes: after GPU resources are allocated to the first cloud host, injecting a Compute Unified Device Architecture (CUDA) component package into the first cloud host.
Further, the method includes: triggering execution of an elastic scaling policy for recovering GPU resources when the CPU resource utilization of the first cloud host is determined to be below the first threshold by more than a preset margin.
Based on another aspect of the present disclosure, a cloud resource allocation apparatus is further provided, including:
a monitoring module, configured to monitor the CPU resource utilization of a first cloud host allocated to a user, and to trigger execution of a first elastic scaling policy for allocating GPU resources when the CPU resource utilization of the first cloud host is determined to exceed a first threshold; and
an elastic module, configured to allocate GPU resources to the first cloud host according to the first elastic scaling policy.
Further, the monitoring module is configured to determine whether the CPU resource utilization of the first cloud host exceeds a second threshold, the second threshold being greater than the first threshold, and to trigger execution of a second elastic scaling policy for expanding the cloud hosts when the CPU resource utilization of the first cloud host is determined to exceed the second threshold;
and the elastic module is further configured to expand a new cloud host for the user according to the second elastic scaling policy.
Further, the monitoring module is configured to, when the CPU resource utilization of the first cloud host is determined to exceed the first threshold and execution of the first elastic scaling policy for allocating GPU resources is triggered, directly trigger execution of the second elastic scaling policy for expanding the cloud hosts if no GPU resource is found to be available in the GPU resource pool.
Further, the monitoring module is configured to trigger execution of an elastic scaling policy for recovering GPU resources when the CPU resource utilization of the first cloud host is determined to be below the first threshold by more than a preset margin.
In the method, a GPU acceleration policy and an associated threshold are configured, and the cloud operating system is triggered to add GPU resources to a cloud host when the host's CPU resource utilization becomes too high, improving the overall computing performance of the cloud host. By scheduling and allocating GPU and CPU resources rationally and effectively, the disclosure improves the elastic scaling capability of the cloud operating system's cloud resources, saves physical resources while still meeting user requirements, and improves cloud resource scheduling capability and utilization.
Drawings
To illustrate the embodiments of the present disclosure or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present disclosure, and those skilled in the art can derive other drawings from them.
Fig. 1 is a flowchart illustrating steps of a cloud resource allocation method according to an embodiment of the present disclosure;
fig. 2 is a schematic process diagram of a cloud resource allocation method according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a cloud resource allocation apparatus according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a cloud resource allocation device according to an embodiment of the present disclosure.
Detailed Description
The terminology used in the embodiments of the present disclosure is for describing particular embodiments only and is not intended to limit the embodiments. As used in the embodiments of the present disclosure, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The term "and/or," if used in this disclosure, covers any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information in the embodiments of the present disclosure, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information and, similarly, second information may be referred to as first information, without departing from the scope of the embodiments of the present disclosure. Depending on the context, the word "if" as used here may be interpreted as "upon," "when," or "in response to determining."
Fig. 1 is a flowchart of the steps of a cloud resource allocation method according to an embodiment of the present disclosure. The method is applied to a cloud operating system (the system running on the cloud management platform is called the cloud operating system) in a cloud computing scenario. By setting a GPU acceleration threshold and a GPU acceleration policy, the method triggers the cloud operating system to add GPU resources to a cloud host when the host's CPU resource utilization is too high, improving the overall computing performance of the cloud host. By scheduling and allocating GPU and CPU resources appropriately, the method improves the elastic scaling capability of the cloud operating system's cloud resources, saves physical resources while meeting user requirements, and improves cloud resource scheduling capability. The method comprises the following steps:
Step 101: monitor the CPU resource utilization of a first cloud host allocated to a user, and trigger execution of a first elastic scaling policy for allocating GPU resources when the CPU resource utilization of the first cloud host is determined to exceed a first threshold.
Before this step is executed, a first threshold must be preset for the first cloud host, and an elastic scaling policy for allocating GPU resources must be created for the user through the elastic scaling management platform (a sub-platform of the cloud management platform). The first threshold is used to trigger execution of the first elastic scaling policy, which allocates GPU resources to the cloud host.
The cloud operating system monitors the CPU resource utilization of the cloud host in real time. When the utilization exceeds the first threshold, the computing resources of the first cloud host are under pressure. To keep the service running normally, execution of the first elastic scaling policy is triggered and GPU resources are added to the first cloud host. The GPU's strong parallel computing capability improves the processing performance of the user's cloud host, so the user's business requirements can be met without adding cloud host instances.
Step 102: allocate GPU resources to the first cloud host according to the first elastic scaling policy.
In an embodiment of the present disclosure, the method further includes: triggering execution of an elastic scaling policy for recovering GPU resources when the CPU resource utilization of the first cloud host is determined to be below the first threshold by more than a preset margin.
How GPU resources are allocated to the cloud host may be configured on the elastic scaling management platform; for example, GPU resources may be added or recovered in a fixed proportion to the cloud host's existing CPU resources. This disclosure does not limit the specific rule.
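The monitoring, proportional allocation, and recovery logic described above can be sketched as follows. This is an illustrative sketch only, not the patented implementation: the default threshold values, the margin, the proportional sizing rule, and all class and method names are assumptions introduced for illustration.

```python
class GpuScalingMonitor:
    """Illustrative sketch of the first-threshold / recovery logic."""

    def __init__(self, first_threshold=0.80, recovery_margin=0.20,
                 gpu_per_cpu_ratio=0.5):
        self.first_threshold = first_threshold      # triggers policy 1 (allocate GPU)
        self.recovery_margin = recovery_margin      # the "preset margin" below the threshold
        self.gpu_per_cpu_ratio = gpu_per_cpu_ratio  # assumed proportional sizing rule

    def decide(self, cpu_utilization, cpu_count, has_gpu):
        """Return an action for one sample of CPU utilization (0.0-1.0)."""
        if cpu_utilization > self.first_threshold and not has_gpu:
            # Policy 1: add GPU resources in proportion to existing CPU resources.
            gpus = max(1, int(cpu_count * self.gpu_per_cpu_ratio))
            return ("allocate_gpu", gpus)
        if has_gpu and cpu_utilization < self.first_threshold - self.recovery_margin:
            # Recovery: utilization is below the threshold by more than the margin.
            return ("recover_gpu", None)
        return ("none", None)
```

With the assumed defaults, a host at 85% CPU utilization with 8 vCPUs and no GPU yields `("allocate_gpu", 4)`, while a GPU-equipped host that has dropped to 55% yields `("recover_gpu", None)`; the margin keeps the policy from oscillating around the threshold.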
In an embodiment of the present disclosure, if the CPU resource utilization of the cloud host continues to rise beyond a preset threshold after GPU resources have been allocated, the method further includes expanding a new cloud host for the user to meet the user's business requirements. Based on this, the method further comprises:
Step 103: determine whether the CPU resource utilization of the first cloud host exceeds a second threshold, the second threshold being greater than the first threshold (for example, a first threshold of 80% and a second threshold of 90%); when the CPU resource utilization of the first cloud host is determined to exceed the second threshold, trigger execution of a second elastic scaling policy for expanding the cloud hosts.
Step 104: expand a new cloud host for the user according to the second elastic scaling policy.
By expanding a new cloud host for the user, part of the services on the first cloud host can be migrated to the newly expanded host, or newly arriving services can be directed to it. This relieves the resource pressure on the first cloud host, safeguards the quality of its services, and avoids, as far as possible, slow or even interrupted service responses caused by a shortage of cloud resources.
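The two-threshold escalation above can be pictured as a single decision function. This is a hedged sketch: the 80%/90% values follow the example in the text, and the function name, the return labels, and the choice to check the higher threshold first are assumptions.

```python
def scaling_action(cpu_utilization, has_gpu,
                   first_threshold=0.80, second_threshold=0.90):
    """Map one CPU-utilization sample to one of the two policies.

    Illustrative sketch: checking the higher (second) threshold first
    ensures host expansion takes precedence once GPU acceleration alone
    can no longer absorb the load.
    """
    assert second_threshold > first_threshold
    if cpu_utilization > second_threshold:
        return "expand_cloud_host"   # policy 2: add a new cloud host instance
    if cpu_utilization > first_threshold and not has_gpu:
        return "allocate_gpu"        # policy 1: attach GPU resources first
    return "no_action"
```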
To clearly disclose the technical content and effects of the solutions of the present disclosure, a description is given below with reference to the accompanying drawings. Fig. 2 is a schematic process diagram of a cloud resource allocation method according to an embodiment of the present disclosure. In general, the elastic scaling process of the cloud operating system is as follows: when cloud host 1 reaches a CPU utilization threshold (e.g., 90%), the elastic scaling management platform automatically expands one or more cloud host instances, such as "cloud host 2" and "cloud host 3".
In the implementation of the embodiment shown in fig. 2, which combines GPU acceleration to improve cloud host performance, a "GPU acceleration" option is added to the elastic scaling management platform and policy 1 is configured.
Step 1: when the CPU utilization of the cloud host reaches the first threshold, for example 80% (the threshold is set by the user so that GPU acceleration is triggered before the cloud host is expanded), execution of GPU acceleration policy 1 is triggered.
Step 2: GPU resources are automatically allocated to cloud host 1 from the GPU resource pool according to the resource load of cloud host 1 (provided the GPU resource pool has capacity).
After GPU resources are added to cloud host 1, a Compute Unified Device Architecture (CUDA) component package is injected into cloud host 1 through cloud-init. CUDA is a general-purpose parallel computing architecture, released by NVIDIA, that enables the GPU to solve complex computing problems.
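The cloud-init injection step can be pictured as building a user-data document that installs a CUDA component package on first boot. This is a hypothetical sketch, not the disclosure's implementation: the package name, the use of the `#cloud-config` format, and the verification command are all assumptions.

```python
def cuda_user_data(package="cuda-toolkit"):
    """Build a cloud-init #cloud-config user-data string that installs a
    CUDA package on first boot. Hypothetical example; the package name
    is an assumption."""
    return "\n".join([
        "#cloud-config",
        "packages:",
        f"  - {package}",       # CUDA component package to install
        "runcmd:",
        "  - [ nvidia-smi ]",   # verify the GPU is visible after install
    ])
```

In practice the cloud operating system would pass this string as the instance's user data when (re)configuring cloud host 1, and cloud-init inside the guest would perform the installation.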
Step 3: after GPU resources have been added, if the business load of cloud host 1 keeps growing and its CPU utilization reaches the second threshold, for example 90%, execution of the cloud host expansion policy is triggered: one or more cloud hosts are expanded for the user, and the expansion operation allocates CPU resources to the newly expanded hosts from the CPU resource pool.
In some cases, no GPU resource may be available in the GPU resource pool at the moment the first policy fires. That is, when the CPU resource utilization of the first cloud host is determined to exceed the first threshold and execution of the first elastic scaling policy for allocating GPU resources is triggered, but no GPU resource is found to be available in the GPU resource pool, the second elastic scaling policy for expanding the cloud hosts may be triggered directly, so that a new cloud host is expanded for the user.
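This fallback can be sketched as follows — an illustrative fragment in which the pool counter and the two policy callables are assumptions standing in for the real allocation and expansion operations.

```python
def handle_first_threshold(gpu_pool_free, allocate_gpu, expand_host):
    """When policy 1 fires but the GPU pool is exhausted, fall straight
    through to policy 2 (expand a new cloud host).

    Illustrative sketch: gpu_pool_free is the number of unallocated GPUs;
    allocate_gpu and expand_host are callables performing the two policies.
    """
    if gpu_pool_free > 0:
        return allocate_gpu()
    return expand_host()   # no GPU available: directly expand a cloud host
```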
It should be recognized that embodiments of the present disclosure can be realized and implemented by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer readable memory. The method may be implemented in a computer program using standard programming techniques, including a non-transitory computer readable storage medium configured with the computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose.
Further, operations of processes described by the present disclosure may be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes described in this disclosure (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) executing collectively on one or more processors, by hardware, or combinations thereof. The computer program includes a plurality of instructions executable by one or more processors.
Further, the methods provided by the present disclosure may be implemented on any type of suitably connected computing platform, including but not limited to personal computers, minicomputers, mainframe computers, workstations, networked or distributed computing environments, and separate or integrated computer platforms, or platforms in communication with charged particle tools or other imaging devices. Aspects of the disclosure may be embodied in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform — such as a hard disk, an optical read/write storage medium, RAM, or ROM — such that it can be read by a programmable computer and, when read, configures and operates the computer to perform the procedures described herein. Further, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. The invention described in this disclosure includes these and other types of non-transitory computer-readable storage media when such media contain instructions or programs that implement the steps described above in conjunction with a microprocessor or other data processor. The disclosure also includes the computer itself when programmed according to the methods and techniques described in this disclosure.
Fig. 3 is a schematic structural diagram of a cloud resource allocation apparatus according to an embodiment of the present disclosure. Each functional module in the apparatus may be implemented as a software module, a hardware unit, or a combination of the two, and the functions of the modules correspond to the steps of the method provided by the embodiments of the disclosure. The apparatus 300 comprises a monitoring module 310 and an elastic module 320.
The monitoring module 310 is configured to monitor the CPU resource utilization of a first cloud host allocated to a user, and to trigger execution of a first elastic scaling policy for allocating GPU resources when the CPU resource utilization of the first cloud host is determined to exceed a first threshold.
The elastic module 320 is configured to allocate GPU resources to the first cloud host according to the first elastic scaling policy.
In an embodiment of the present disclosure, the monitoring module 310 may further be configured to determine whether the CPU resource utilization of the first cloud host exceeds a second threshold, the second threshold being greater than the first threshold, and to trigger execution of a second elastic scaling policy for expanding the cloud hosts when the CPU resource utilization of the first cloud host is determined to exceed the second threshold. The elastic module 320 is further configured to expand a new cloud host for the user according to the second elastic scaling policy.
In an embodiment of the present disclosure, the monitoring module 310 may further be configured to, when the CPU resource utilization of the first cloud host is determined to exceed the first threshold and execution of the first elastic scaling policy for allocating GPU resources is triggered, directly trigger execution of the second elastic scaling policy for expanding the cloud hosts if no GPU resource is found to be available in the GPU resource pool.
In an embodiment of the present disclosure, the monitoring module 310 may further be configured to trigger execution of an elastic scaling policy for recovering GPU resources when the CPU resource utilization of the first cloud host is determined to be below the first threshold by more than a preset margin.
Fig. 4 is a schematic structural diagram of a cloud resource allocation device according to an embodiment of the present disclosure. The device 400 includes a processor 410 such as a central processing unit (CPU), an internal bus 420, a network interface 440, and a computer-readable storage medium 430. The processor 410 and the computer-readable storage medium 430 communicate with each other through the internal bus 420. The computer-readable storage medium 430 stores a computer program, provided by the present disclosure, that implements the cloud resource allocation method; when the program is executed by the processor 410, it implements the functions of the steps of the method provided by the present disclosure.
The machine-readable storage medium may include random access memory (RAM) and may also include non-volatile memory (NVM), such as at least one disk memory. Additionally, the machine-readable storage medium may be at least one memory device located remotely from the aforementioned processor. The processor may be a general-purpose processor, including a central processing unit (CPU) or a network processor (NP); it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The device provided by the embodiments of the present disclosure is based on the same technical concept as the method provided by the embodiments of the present disclosure, and has the same beneficial effects as the method it adopts, runs, or implements.
The above description is only an example of the present disclosure and is not intended to limit the present disclosure. Various modifications and variations of this disclosure will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.

Claims (10)

1. A cloud resource allocation method, the method comprising:
monitoring the CPU resource utilization of a first cloud host allocated to a user, and triggering execution of a first elastic scaling policy for allocating GPU resources when the CPU resource utilization of the first cloud host is determined to exceed a first threshold; and
allocating GPU resources to the first cloud host according to the first elastic scaling policy.
2. The method of claim 1, further comprising:
determining whether the CPU resource utilization of the first cloud host exceeds a second threshold, the second threshold being greater than the first threshold, and triggering execution of a second elastic scaling policy for expanding the cloud hosts when the CPU resource utilization of the first cloud host exceeds the second threshold; and
expanding a new cloud host for the user according to the second elastic scaling policy.
3. The method of claim 1, further comprising:
directly triggering execution of a second elastic scaling policy for expanding the cloud hosts if no GPU resource is found to be available in the GPU resource pool when the CPU resource utilization of the first cloud host exceeds the first threshold and execution of the first elastic scaling policy for allocating GPU resources is triggered.
4. The method of claim 1, further comprising:
injecting a Compute Unified Device Architecture component package into the first cloud host after GPU resources are allocated to the first cloud host.
5. The method of claim 1, further comprising:
triggering execution of an elastic scaling policy for recovering GPU resources when the CPU resource utilization of the first cloud host is determined to be below the first threshold by more than a preset margin.
6. An apparatus for cloud resource allocation, the apparatus comprising:
the monitoring module is used for monitoring the CPU resource utilization rate of the first cloud host distributed for the user, and triggering and executing a first elastic expansion strategy for distributing GPU resources when the CPU resource utilization rate of the first cloud host is judged to exceed a first threshold value;
and the elastic module is used for allocating GPU resources to the first cloud host according to the first elastic expansion strategy.
7. The apparatus of claim 6,
the monitoring module is further configured to determine whether the CPU resource utilization rate of the first cloud host exceeds a second threshold, where the second threshold is greater than the first threshold, and when it is determined that the CPU resource utilization rate of the first cloud host exceeds the second threshold, trigger execution of a second elastic scaling policy for expanding the cloud host;
the elastic module is further configured to expand a new cloud host for the user according to the second elastic scaling policy.
8. The apparatus of claim 6,
the monitoring module is further configured to, when it is determined that the CPU resource utilization rate of the first cloud host exceeds the first threshold and execution of the first elastic scaling policy for allocating GPU resources is triggered, directly trigger execution of the second elastic scaling policy for expanding cloud hosts if no GPU resource is found to be available in the GPU resource pool.
9. The apparatus of claim 6,
the monitoring module is further configured to trigger execution of an elastic scaling policy for reclaiming GPU resources when it is determined that the CPU resource utilization rate of the first cloud host is lower than the first threshold and the difference from the first threshold exceeds a preset margin.
10. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method steps of any one of claims 1 to 5.
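Viewed as executable logic, the threshold-driven policies in the claims above (allocate a GPU above a first threshold, inject the CUDA component package, scale out a new cloud host above a second threshold or when the GPU pool is empty, reclaim GPUs when utilization drops below the first threshold by more than a preset margin) can be sketched as follows. This is a hypothetical illustration only: the class names, threshold values, and the in-memory `GpuPool` are assumptions made for exposition, not the patent's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class CloudHost:
    name: str
    cpu_utilization: float          # fraction in [0, 1]
    gpus: List[str] = field(default_factory=list)
    cuda_installed: bool = False    # CUDA component package injected?

@dataclass
class GpuPool:
    free_gpus: List[str] = field(default_factory=list)

    def allocate(self) -> Optional[str]:
        # Returns a free GPU, or None when the pool is exhausted (claim 3 case).
        return self.free_gpus.pop() if self.free_gpus else None

    def reclaim(self, gpu: str) -> None:
        self.free_gpus.append(gpu)

def scale(host: CloudHost, pool: GpuPool, hosts: List[CloudHost],
          t1: float = 0.7, t2: float = 0.9, margin: float = 0.2) -> str:
    """Apply the elastic scaling policies described in the claims.

    - above t2: second policy, expand a new cloud host
    - above t1: first policy, allocate a GPU and inject the CUDA package;
      if the pool is empty, fall directly back to expanding a cloud host
    - below t1 by more than `margin`: reclaim the host's GPUs
    """
    u = host.cpu_utilization
    if u > t2:
        hosts.append(CloudHost(name=f"{host.name}-scaled", cpu_utilization=0.0))
        return "scale_out"
    if u > t1:
        gpu = pool.allocate()
        if gpu is None:  # no GPU available: fall back to scale-out (claim 3)
            hosts.append(CloudHost(name=f"{host.name}-scaled", cpu_utilization=0.0))
            return "scale_out_fallback"
        host.gpus.append(gpu)
        host.cuda_installed = True  # inject CUDA component package (claim 4)
        return "gpu_allocated"
    if u < t1 - margin and host.gpus:  # reclaim policy (claim 5)
        while host.gpus:
            pool.reclaim(host.gpus.pop())
        return "gpu_reclaimed"
    return "no_action"
```

The monitoring module of claims 6 to 9 would call `scale` periodically with fresh utilization samples; the preset margin keeps a host hovering just below the first threshold from repeatedly acquiring and releasing GPUs.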
CN202011158622.9A 2020-10-26 2020-10-26 Cloud resource allocation method, device and storage medium Active CN112162864B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011158622.9A CN112162864B (en) 2020-10-26 2020-10-26 Cloud resource allocation method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011158622.9A CN112162864B (en) 2020-10-26 2020-10-26 Cloud resource allocation method, device and storage medium

Publications (2)

Publication Number Publication Date
CN112162864A true CN112162864A (en) 2021-01-01
CN112162864B CN112162864B (en) 2023-06-09

Family

ID=73864662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011158622.9A Active CN112162864B (en) 2020-10-26 2020-10-26 Cloud resource allocation method, device and storage medium

Country Status (1)

Country Link
CN (1) CN112162864B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114138449A (en) * 2021-12-14 2022-03-04 Henan Children's Hospital (Zhengzhou Children's Hospital) Rehabilitation training system based on virtual reality

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778080A (en) * 2014-01-14 2015-07-15 中兴通讯股份有限公司 Job scheduling processing method and device based on coprocessor
CN104954478A (en) * 2015-06-23 2015-09-30 普元信息技术股份有限公司 System and method for realizing automatic longitudinal scaling of server in cloud computing platform
US20160055612A1 (en) * 2014-08-25 2016-02-25 Intel Corporation Adaptive scheduling for task assignment among heterogeneous processor cores
CN106254459A (en) * 2016-05-13 2016-12-21 江苏云途腾科技有限责任公司 Resource elastic allocation strategy and apparatus for cloud platform users
CN107688495A (en) * 2017-06-22 2018-02-13 平安科技(深圳)有限公司 Method and apparatus for scheduling a processor
CN107743611A (en) * 2015-04-29 2018-02-27 微软技术许可有限责任公司 Optimal allocation of dynamic cloud computing platform resources
CN110308990A (en) * 2012-08-23 2019-10-08 亚马逊技术有限公司 Computer-implemented method and computing system for scaling computing resources
CN111158852A (en) * 2019-12-14 2020-05-15 苏州浪潮智能科技有限公司 Training resource dynamic allocation method, system, terminal and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
You Yongkang (尤永康): "Private Cloud Architecture Design and Practice" (《私有云架构设计与实践》), 31 December 2019, Shanghai Jiao Tong University Press *

Also Published As

Publication number Publication date
CN112162864B (en) 2023-06-09

Similar Documents

Publication Publication Date Title
CN108881495B (en) Resource allocation method, device, computer equipment and storage medium
CN110532197B (en) Memory recovery method and device, electronic equipment and storage medium
US20170017511A1 (en) Method for memory management in virtual machines, and corresponding system and computer program product
US20070169125A1 (en) Task scheduling policy for limited memory systems
US9547510B2 (en) Tracking guest memory characteristics for memory scheduling
CN108205474B (en) Memory management method, terminal device, computer apparatus, and readable storage medium
CN113467933B (en) Distributed file system thread pool optimization method, system, terminal and storage medium
CN112650575B (en) Resource scheduling method, device and cloud service system
CN112486642B (en) Resource scheduling method, device, electronic equipment and computer readable storage medium
CN112667380A (en) Multiprocessor task scheduling method, device and storage medium
CN110471769B (en) Resource management method and device for virtual machine
CN112162864B (en) Cloud resource allocation method, device and storage medium
CN111679914B (en) Memory management method, system, computer equipment and storage medium
CN112463361A (en) Method and equipment for distributing elastic resources of distributed computation
CN108897603B (en) Memory resource management method and device
CN107391254B (en) Intelligent terminal, resource allocation method thereof and computer-readable storage medium
CN115587049A (en) Memory recovery method and device, electronic equipment and storage medium
CN111090627B (en) Log storage method and device based on pooling, computer equipment and storage medium
CN112363828B (en) Memory fragment management method and device, vehicle-mounted system and vehicle
CN114157717A (en) Micro-service dynamic current limiting system and method
CN108804225B (en) Virtual machine load regulation and control method and device
KR20210157246A (en) Method and Device for managing resource dynamically in a embedded system
CN111352710A (en) Process management method and device, computing equipment and storage medium
CN111752851B (en) Memory recycling method and device
CN112732449B (en) Video memory resource allocation method, device and equipment based on GPU virtualization technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant