CN112162864A - Cloud resource allocation method and device and storage medium - Google Patents

Cloud resource allocation method and device and storage medium

Info

Publication number
CN112162864A
CN112162864A (application CN202011158622.9A; granted as CN112162864B)
Authority
CN
China
Prior art keywords
cloud host
cloud
utilization rate
resources
resource utilization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011158622.9A
Other languages
Chinese (zh)
Other versions
CN112162864B (en)
Inventor
兰天 (Lan Tian)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Big Data Technologies Co Ltd
Original Assignee
New H3C Big Data Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Big Data Technologies Co Ltd
Priority to CN202011158622.9A
Publication of CN112162864A
Application granted
Publication of CN112162864B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5066Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The disclosure provides a cloud resource allocation method, apparatus, and storage medium for improving the utilization of cloud resources. In the method, a GPU acceleration policy and an associated threshold are configured, and the cloud operating system is triggered to add GPU resources to a cloud host when the host's CPU resource utilization becomes too high, improving the overall computing performance of the cloud host. By scheduling and allocating GPU and CPU resources rationally and effectively, the disclosure improves the elastic scaling capability of the cloud operating system's cloud resources, saves physical resources while still meeting user requirements, and increases the utilization of cloud resources.

Description

Cloud resource allocation method and device and storage medium
Technical Field
The present disclosure relates to the field of cloud computing technologies, and in particular to a cloud resource allocation method, apparatus, and storage medium.
Background
The elastic scaling function of cloud resources automatically adjusts computing resources according to a user's services and policies, meeting the user's changing requirements as business volume fluctuates. Currently, every cloud vendor supports elastic scaling based on physical resources such as the CPU and memory. The CPU and the GPU are today's mainstream processors, and these physical resources are relatively expensive; if the two kinds of resources cannot be integrated and allocated rationally and effectively, cloud resources are wasted and user experience suffers.
Disclosure of Invention
In view of this, the present disclosure provides a cloud resource allocation method, apparatus, and storage medium for improving the utilization of cloud resources.
Based on an embodiment of the present disclosure, a cloud resource allocation method is provided, including:
monitoring the CPU resource utilization of a first cloud host allocated to a user, and triggering execution of a first elastic scaling policy for allocating GPU resources when the CPU resource utilization of the first cloud host is determined to exceed a first threshold; and
allocating GPU resources to the first cloud host according to the first elastic scaling policy.
Further, the method includes: determining whether the CPU resource utilization of the first cloud host exceeds a second threshold, the second threshold being greater than the first threshold; triggering execution of a second elastic scaling policy for expanding the cloud hosts when the CPU resource utilization of the first cloud host exceeds the second threshold; and expanding a new cloud host for the user according to the second elastic scaling policy.
Further, the method includes: if, when the CPU resource utilization of the first cloud host exceeds the first threshold and execution of the first elastic scaling policy for allocating GPU resources is triggered, no GPU resource is found to be available in the GPU resource pool, directly triggering execution of the second elastic scaling policy for expanding the cloud hosts.
Further, the method includes: after GPU resources are allocated to the first cloud host, injecting a Compute Unified Device Architecture (CUDA) component package into the first cloud host.
Further, the method includes: triggering execution of an elastic scaling policy for recovering GPU resources when the CPU resource utilization of the first cloud host is determined to be below the first threshold by more than a preset margin.
Based on another aspect of the present disclosure, a cloud resource allocation apparatus is further provided, including:
a monitoring module, configured to monitor the CPU resource utilization of a first cloud host allocated to a user, and to trigger execution of a first elastic scaling policy for allocating GPU resources when the CPU resource utilization of the first cloud host is determined to exceed a first threshold; and
an elastic module, configured to allocate GPU resources to the first cloud host according to the first elastic scaling policy.
Further, the monitoring module is configured to determine whether the CPU resource utilization of the first cloud host exceeds a second threshold, the second threshold being greater than the first threshold, and to trigger execution of a second elastic scaling policy for expanding the cloud hosts when the CPU resource utilization of the first cloud host is determined to exceed the second threshold;
and the elastic module is further configured to expand a new cloud host for the user according to the second elastic scaling policy.
Further, the monitoring module is configured to, when the CPU resource utilization of the first cloud host is determined to exceed the first threshold and execution of the first elastic scaling policy for allocating GPU resources is triggered, directly trigger execution of the second elastic scaling policy for expanding the cloud hosts if no GPU resource is found to be available in the GPU resource pool.
Further, the monitoring module is configured to trigger execution of an elastic scaling policy for recovering GPU resources when the CPU resource utilization of the first cloud host is determined to be below the first threshold by more than a preset margin.
In the method, a GPU acceleration policy and an associated threshold are configured, and the cloud operating system is triggered to add GPU resources to a cloud host when the host's CPU resource utilization becomes too high, improving the overall computing performance of the cloud host. By scheduling and allocating GPU and CPU resources rationally and effectively, the disclosure improves the elastic scaling capability of the cloud operating system's cloud resources, saves physical resources while still meeting user requirements, and improves cloud resource scheduling capability and utilization.
Drawings
To illustrate the embodiments of the present disclosure or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present disclosure, and those skilled in the art can derive other drawings from them.
Fig. 1 is a flowchart illustrating steps of a cloud resource allocation method according to an embodiment of the present disclosure;
fig. 2 is a schematic process diagram of a cloud resource allocation method according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a cloud resource allocation apparatus according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a cloud resource allocation device according to an embodiment of the present disclosure.
Detailed Description
The terminology used in the embodiments of the present disclosure is for describing particular embodiments only and is not intended to limit the embodiments. As used in the embodiments of the present disclosure, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The term "and/or," if used in this disclosure, covers any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information in the embodiments of the present disclosure, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information and, similarly, second information may be referred to as first information, without departing from the scope of the embodiments of the present disclosure. Depending on the context, the word "if" as used here may be interpreted as "upon," "when," or "in response to determining."
Fig. 1 is a flowchart of the steps of a cloud resource allocation method according to an embodiment of the present disclosure. The method is applied to a cloud operating system (the system running on the cloud management platform is called the cloud operating system) in a cloud computing scenario. By setting a GPU acceleration threshold and a GPU acceleration policy, the method triggers the cloud operating system to add GPU resources to a cloud host when the host's CPU resource utilization is too high, improving the overall computing performance of the cloud host. By scheduling and allocating GPU and CPU resources appropriately, the method improves the elastic scaling capability of the cloud operating system's cloud resources, saves physical resources while meeting user requirements, and improves cloud resource scheduling capability. The method comprises the following steps:
Step 101: monitor the CPU resource utilization of a first cloud host allocated to a user, and trigger execution of a first elastic scaling policy for allocating GPU resources when the CPU resource utilization of the first cloud host is determined to exceed a first threshold.
Before this step is executed, a first threshold must be preset for the first cloud host, and an elastic scaling policy for allocating GPU resources must be created for the user through the elastic scaling management platform (a sub-platform of the cloud management platform). The first threshold is used to trigger execution of the first elastic scaling policy, which allocates GPU resources to the cloud host.
The cloud operating system monitors the CPU resource utilization of the cloud host in real time. When the utilization exceeds the first threshold, the computing resources of the first cloud host are under pressure. To keep the service running normally, execution of the first elastic scaling policy is triggered and GPU resources are added to the first cloud host. The GPU's strong parallel computing capability improves the processing performance of the user's cloud host, so the user's business requirements can be met without adding cloud host instances.
Step 102: allocate GPU resources to the first cloud host according to the first elastic scaling policy.
In an embodiment of the present disclosure, the method further includes: triggering execution of an elastic scaling policy for recovering GPU resources when the CPU resource utilization of the first cloud host is determined to be below the first threshold by more than a preset margin.
How GPU resources are allocated to the cloud host may be configured on the elastic scaling management platform; for example, GPU resources may be added or recovered in a fixed proportion to the cloud host's existing CPU resources. This disclosure does not limit the specific rule.
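The monitoring, proportional allocation, and recovery logic described above can be sketched as follows. This is an illustrative sketch only, not the patented implementation: the default threshold values, the margin, the proportional sizing rule, and all class and method names are assumptions introduced for illustration.

```python
class GpuScalingMonitor:
    """Illustrative sketch of the first-threshold / recovery logic."""

    def __init__(self, first_threshold=0.80, recovery_margin=0.20,
                 gpu_per_cpu_ratio=0.5):
        self.first_threshold = first_threshold      # triggers policy 1 (allocate GPU)
        self.recovery_margin = recovery_margin      # the "preset margin" below the threshold
        self.gpu_per_cpu_ratio = gpu_per_cpu_ratio  # assumed proportional sizing rule

    def decide(self, cpu_utilization, cpu_count, has_gpu):
        """Return an action for one sample of CPU utilization (0.0-1.0)."""
        if cpu_utilization > self.first_threshold and not has_gpu:
            # Policy 1: add GPU resources in proportion to existing CPU resources.
            gpus = max(1, int(cpu_count * self.gpu_per_cpu_ratio))
            return ("allocate_gpu", gpus)
        if has_gpu and cpu_utilization < self.first_threshold - self.recovery_margin:
            # Recovery: utilization is below the threshold by more than the margin.
            return ("recover_gpu", None)
        return ("none", None)
```

With the assumed defaults, a host at 85% CPU utilization with 8 vCPUs and no GPU yields `("allocate_gpu", 4)`, while a GPU-equipped host that has dropped to 55% yields `("recover_gpu", None)`; the margin keeps the policy from oscillating around the threshold.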
In an embodiment of the present disclosure, if the CPU resource utilization of the cloud host continues to rise beyond a preset threshold after GPU resources have been allocated, the method further includes expanding a new cloud host for the user to meet the user's business requirements. Based on this, the method further comprises:
Step 103: determine whether the CPU resource utilization of the first cloud host exceeds a second threshold, the second threshold being greater than the first threshold (for example, a first threshold of 80% and a second threshold of 90%); when the CPU resource utilization of the first cloud host is determined to exceed the second threshold, trigger execution of a second elastic scaling policy for expanding the cloud hosts.
Step 104: expand a new cloud host for the user according to the second elastic scaling policy.
By expanding a new cloud host for the user, part of the services on the first cloud host can be migrated to the newly expanded host, or newly arriving services can be directed to it. This relieves the resource pressure on the first cloud host, safeguards the quality of its services, and avoids, as far as possible, slow or even interrupted service responses caused by a shortage of cloud resources.
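The two-threshold escalation above can be pictured as a single decision function. This is a hedged sketch: the 80%/90% values follow the example in the text, and the function name, the return labels, and the choice to check the higher threshold first are assumptions.

```python
def scaling_action(cpu_utilization, has_gpu,
                   first_threshold=0.80, second_threshold=0.90):
    """Map one CPU-utilization sample to one of the two policies.

    Illustrative sketch: checking the higher (second) threshold first
    ensures host expansion takes precedence once GPU acceleration alone
    can no longer absorb the load.
    """
    assert second_threshold > first_threshold
    if cpu_utilization > second_threshold:
        return "expand_cloud_host"   # policy 2: add a new cloud host instance
    if cpu_utilization > first_threshold and not has_gpu:
        return "allocate_gpu"        # policy 1: attach GPU resources first
    return "no_action"
```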
To clearly disclose the technical content and effects of the solutions of the present disclosure, a description is given below with reference to the accompanying drawings. Fig. 2 is a schematic process diagram of a cloud resource allocation method according to an embodiment of the present disclosure. In general, the elastic scaling process of the cloud operating system is as follows: when cloud host 1 reaches a CPU utilization threshold (e.g., 90%), the elastic scaling management platform automatically expands one or more cloud host instances, such as "cloud host 2" and "cloud host 3".
In the implementation of the embodiment shown in fig. 2, which combines GPU acceleration to improve cloud host performance, a "GPU acceleration" option is added to the elastic scaling management platform and policy 1 is configured.
Step 1: when the CPU utilization of the cloud host reaches the first threshold, for example 80% (the threshold is set by the user so that GPU acceleration is triggered before the cloud host is expanded), execution of GPU acceleration policy 1 is triggered.
Step 2: GPU resources are automatically allocated to cloud host 1 from the GPU resource pool according to the resource load of cloud host 1 (provided the GPU resource pool has capacity).
After GPU resources are added to cloud host 1, a Compute Unified Device Architecture (CUDA) component package is injected into cloud host 1 through cloud-init. CUDA is a general-purpose parallel computing architecture, released by NVIDIA, that enables the GPU to solve complex computing problems.
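The cloud-init injection step can be pictured as building a user-data document that installs a CUDA component package on first boot. This is a hypothetical sketch, not the disclosure's implementation: the package name, the use of the `#cloud-config` format, and the verification command are all assumptions.

```python
def cuda_user_data(package="cuda-toolkit"):
    """Build a cloud-init #cloud-config user-data string that installs a
    CUDA package on first boot. Hypothetical example; the package name
    is an assumption."""
    return "\n".join([
        "#cloud-config",
        "packages:",
        f"  - {package}",       # CUDA component package to install
        "runcmd:",
        "  - [ nvidia-smi ]",   # verify the GPU is visible after install
    ])
```

In practice the cloud operating system would pass this string as the instance's user data when (re)configuring cloud host 1, and cloud-init inside the guest would perform the installation.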
Step 3: after GPU resources have been added, if the business load of cloud host 1 keeps growing and its CPU utilization reaches the second threshold, for example 90%, execution of the cloud host expansion policy is triggered: one or more cloud hosts are expanded for the user, and the expansion operation allocates CPU resources to the newly expanded hosts from the CPU resource pool.
In some cases, no GPU resource may be available in the GPU resource pool at the moment the first policy fires. That is, when the CPU resource utilization of the first cloud host is determined to exceed the first threshold and execution of the first elastic scaling policy for allocating GPU resources is triggered, but no GPU resource is found to be available in the GPU resource pool, the second elastic scaling policy for expanding the cloud hosts may be triggered directly, so that a new cloud host is expanded for the user.
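This fallback can be sketched as follows — an illustrative fragment in which the pool counter and the two policy callables are assumptions standing in for the real allocation and expansion operations.

```python
def handle_first_threshold(gpu_pool_free, allocate_gpu, expand_host):
    """When policy 1 fires but the GPU pool is exhausted, fall straight
    through to policy 2 (expand a new cloud host).

    Illustrative sketch: gpu_pool_free is the number of unallocated GPUs;
    allocate_gpu and expand_host are callables performing the two policies.
    """
    if gpu_pool_free > 0:
        return allocate_gpu()
    return expand_host()   # no GPU available: directly expand a cloud host
```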
It should be recognized that embodiments of the present disclosure can be realized and implemented by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer readable memory. The method may be implemented in a computer program using standard programming techniques, including a non-transitory computer readable storage medium configured with the computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose.
Further, operations of processes described by the present disclosure may be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes described in this disclosure (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) executing collectively on one or more processors, by hardware, or combinations thereof. The computer program includes a plurality of instructions executable by one or more processors.
Further, the methods provided by the present disclosure may be implemented on any type of suitably connected computing platform, including but not limited to personal computers, minicomputers, mainframe computers, workstations, networked or distributed computing environments, and separate or integrated computer platforms, or platforms in communication with charged particle tools or other imaging devices. Aspects of the disclosure may be embodied in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform — such as a hard disk, an optical read/write storage medium, RAM, or ROM — such that it can be read by a programmable computer and, when read, configures and operates the computer to perform the procedures described herein. Further, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. The invention described in this disclosure includes these and other types of non-transitory computer-readable storage media when such media contain instructions or programs that implement the steps described above in conjunction with a microprocessor or other data processor. The disclosure also includes the computer itself when programmed according to the methods and techniques described in this disclosure.
Fig. 3 is a schematic structural diagram of a cloud resource allocation apparatus according to an embodiment of the present disclosure. Each functional module in the apparatus may be implemented as a software module, a hardware unit, or a combination of the two, and the functions of the modules correspond to the steps of the method provided by the embodiments of the disclosure. The apparatus 300 comprises a monitoring module 310 and an elastic module 320.
The monitoring module 310 is configured to monitor the CPU resource utilization of a first cloud host allocated to a user, and to trigger execution of a first elastic scaling policy for allocating GPU resources when the CPU resource utilization of the first cloud host is determined to exceed a first threshold.
The elastic module 320 is configured to allocate GPU resources to the first cloud host according to the first elastic scaling policy.
In an embodiment of the present disclosure, the monitoring module 310 may further be configured to determine whether the CPU resource utilization of the first cloud host exceeds a second threshold, the second threshold being greater than the first threshold, and to trigger execution of a second elastic scaling policy for expanding the cloud hosts when the CPU resource utilization of the first cloud host is determined to exceed the second threshold. The elastic module 320 is further configured to expand a new cloud host for the user according to the second elastic scaling policy.
In an embodiment of the present disclosure, the monitoring module 310 may further be configured to, when the CPU resource utilization of the first cloud host is determined to exceed the first threshold and execution of the first elastic scaling policy for allocating GPU resources is triggered, directly trigger execution of the second elastic scaling policy for expanding the cloud hosts if no GPU resource is found to be available in the GPU resource pool.
In an embodiment of the present disclosure, the monitoring module 310 may further be configured to trigger execution of an elastic scaling policy for recovering GPU resources when the CPU resource utilization of the first cloud host is determined to be below the first threshold by more than a preset margin.
Fig. 4 is a schematic structural diagram of a cloud resource allocation device according to an embodiment of the present disclosure. The device 400 includes a processor 410 such as a central processing unit (CPU), an internal bus 420, a network interface 440, and a computer-readable storage medium 430. The processor 410 and the computer-readable storage medium 430 communicate with each other through the internal bus 420. The computer-readable storage medium 430 stores a computer program, provided by the present disclosure, that implements the cloud resource allocation method; when the program is executed by the processor 410, it implements the functions of the steps of the method provided by the present disclosure.
The machine-readable storage medium may include random access memory (RAM) and may also include non-volatile memory (NVM), such as at least one disk memory. Additionally, the machine-readable storage medium may be at least one memory device located remotely from the aforementioned processor. The processor may be a general-purpose processor, including a central processing unit (CPU) or a network processor (NP); it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The device provided by the embodiments of the present disclosure is based on the same technical concept as the method provided by the embodiments of the present disclosure, and has the same beneficial effects as the method it adopts, runs, or implements.
The above description is only an example of the present disclosure and is not intended to limit the present disclosure. Various modifications and variations of this disclosure will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.

Claims (10)

1. A cloud resource allocation method, the method comprising:
monitoring the CPU resource utilization of a first cloud host allocated to a user, and triggering execution of a first elastic scaling policy for allocating GPU resources when the CPU resource utilization of the first cloud host is determined to exceed a first threshold; and
allocating GPU resources to the first cloud host according to the first elastic scaling policy.
2. The method of claim 1, further comprising:
determining whether the CPU resource utilization of the first cloud host exceeds a second threshold, the second threshold being greater than the first threshold, and triggering execution of a second elastic scaling policy for expanding the cloud hosts when the CPU resource utilization of the first cloud host exceeds the second threshold; and
expanding a new cloud host for the user according to the second elastic scaling policy.
3. The method of claim 1, further comprising:
directly triggering execution of a second elastic scaling policy for expanding the cloud hosts if no GPU resource is found to be available in the GPU resource pool when the CPU resource utilization of the first cloud host exceeds the first threshold and execution of the first elastic scaling policy for allocating GPU resources is triggered.
4. The method of claim 1, further comprising:
injecting a Compute Unified Device Architecture component package into the first cloud host after GPU resources are allocated to the first cloud host.
5. The method of claim 1, further comprising:
triggering execution of an elastic scaling policy for recovering GPU resources when the CPU resource utilization of the first cloud host is determined to be below the first threshold by more than a preset margin.
6. An apparatus for cloud resource allocation, the apparatus comprising:
the monitoring module is used for monitoring the CPU resource utilization rate of the first cloud host distributed for the user, and triggering and executing a first elastic expansion strategy for distributing GPU resources when the CPU resource utilization rate of the first cloud host is judged to exceed a first threshold value;
and the elastic module is used for allocating GPU resources to the first cloud host according to the first elastic expansion strategy.
7. The apparatus of claim 6,
the monitoring module is further configured to determine whether the CPU resource utilization rate of the first cloud host exceeds a second threshold, where the second threshold is greater than the first threshold, and when it is determined that the CPU resource utilization rate of the first cloud host exceeds the second threshold, trigger execution of a second elastic scaling policy for expanding the cloud host;
the elastic module is further configured to expand a new cloud host for the user according to the second elastic scaling policy.
8. The apparatus of claim 6,
the monitoring module is further configured to, when it is determined that the CPU resource utilization rate of the first cloud host exceeds the first threshold and execution of the first elastic scaling policy for allocating GPU resources is triggered, directly trigger execution of the second elastic scaling policy for expanding cloud hosts if no GPU resource is found to be available in the GPU resource pool.
9. The apparatus of claim 6,
the monitoring module is further configured to trigger execution of an elastic scaling policy for reclaiming GPU resources when it is determined that the CPU resource utilization rate of the first cloud host is lower than the first threshold and the difference from the first threshold exceeds a preset margin.
10. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method steps of any one of claims 1 to 5.
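Viewed as executable logic, the threshold-driven policies in the claims above (allocate a GPU above a first threshold, inject the CUDA component package, scale out a new cloud host above a second threshold or when the GPU pool is empty, reclaim GPUs when utilization drops below the first threshold by more than a preset margin) can be sketched as follows. This is a hypothetical illustration only: the class names, threshold values, and the in-memory `GpuPool` are assumptions made for exposition, not the patent's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class CloudHost:
    name: str
    cpu_utilization: float          # fraction in [0, 1]
    gpus: List[str] = field(default_factory=list)
    cuda_installed: bool = False    # CUDA component package injected?

@dataclass
class GpuPool:
    free_gpus: List[str] = field(default_factory=list)

    def allocate(self) -> Optional[str]:
        # Returns a free GPU, or None when the pool is exhausted (claim 3 case).
        return self.free_gpus.pop() if self.free_gpus else None

    def reclaim(self, gpu: str) -> None:
        self.free_gpus.append(gpu)

def scale(host: CloudHost, pool: GpuPool, hosts: List[CloudHost],
          t1: float = 0.7, t2: float = 0.9, margin: float = 0.2) -> str:
    """Apply the elastic scaling policies described in the claims.

    - above t2: second policy, expand a new cloud host
    - above t1: first policy, allocate a GPU and inject the CUDA package;
      if the pool is empty, fall directly back to expanding a cloud host
    - below t1 by more than `margin`: reclaim the host's GPUs
    """
    u = host.cpu_utilization
    if u > t2:
        hosts.append(CloudHost(name=f"{host.name}-scaled", cpu_utilization=0.0))
        return "scale_out"
    if u > t1:
        gpu = pool.allocate()
        if gpu is None:  # no GPU available: fall back to scale-out (claim 3)
            hosts.append(CloudHost(name=f"{host.name}-scaled", cpu_utilization=0.0))
            return "scale_out_fallback"
        host.gpus.append(gpu)
        host.cuda_installed = True  # inject CUDA component package (claim 4)
        return "gpu_allocated"
    if u < t1 - margin and host.gpus:  # reclaim policy (claim 5)
        while host.gpus:
            pool.reclaim(host.gpus.pop())
        return "gpu_reclaimed"
    return "no_action"
```

The monitoring module of claims 6 to 9 would call `scale` periodically with fresh utilization samples; the preset margin keeps a host hovering just below the first threshold from repeatedly acquiring and releasing GPUs.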
CN202011158622.9A 2020-10-26 2020-10-26 Cloud resource allocation method, device and storage medium Active CN112162864B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011158622.9A CN112162864B (en) 2020-10-26 2020-10-26 Cloud resource allocation method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011158622.9A CN112162864B (en) 2020-10-26 2020-10-26 Cloud resource allocation method, device and storage medium

Publications (2)

Publication Number Publication Date
CN112162864A true CN112162864A (en) 2021-01-01
CN112162864B CN112162864B (en) 2023-06-09

Family

ID=73864662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011158622.9A Active CN112162864B (en) 2020-10-26 2020-10-26 Cloud resource allocation method, device and storage medium

Country Status (1)

Country Link
CN (1) CN112162864B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114138449A (en) * 2021-12-14 2022-03-04 Henan Children's Hospital (Zhengzhou Children's Hospital) Rehabilitation training system based on virtual reality

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778080A (en) * 2014-01-14 2015-07-15 中兴通讯股份有限公司 Job scheduling processing method and device based on coprocessor
CN104954478A (en) * 2015-06-23 2015-09-30 普元信息技术股份有限公司 System and method for realizing automatic longitudinal scaling of server in cloud computing platform
US20160055612A1 (en) * 2014-08-25 2016-02-25 Intel Corporation Adaptive scheduling for task assignment among heterogeneous processor cores
CN106254459A (en) * 2016-05-13 2016-12-21 江苏云途腾科技有限责任公司 Resource elastic allocation strategy and apparatus for cloud platform users
CN107688495A (en) * 2017-06-22 2018-02-13 平安科技(深圳)有限公司 Method and apparatus for scheduling a processor
CN107743611A (en) * 2015-04-29 2018-02-27 微软技术许可有限责任公司 Optimal allocation of dynamic cloud computing platform resources
CN110308990A (en) * 2012-08-23 2019-10-08 亚马逊技术有限公司 Computer-implemented method and computing system for scaling computing resources
CN111158852A (en) * 2019-12-14 2020-05-15 苏州浪潮智能科技有限公司 Training resource dynamic allocation method, system, terminal and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
You Yongkang (尤永康): "Private Cloud Architecture Design and Practice" (《私有云架构设计与实践》), 31 December 2019, Shanghai Jiao Tong University Press *

Also Published As

Publication number Publication date
CN112162864B (en) 2023-06-09

Similar Documents

Publication Publication Date Title
CN108881495B (en) Resource allocation method, device, computer equipment and storage medium
CN110532197B (en) Memory recovery method and device, electronic equipment and storage medium
US20170017511A1 (en) Method for memory management in virtual machines, and corresponding system and computer program product
US20070169125A1 (en) Task scheduling policy for limited memory systems
US9547510B2 (en) Tracking guest memory characteristics for memory scheduling
CN108205474B (en) Memory management method, terminal device, computer apparatus, and readable storage medium
CN113467933B (en) Distributed file system thread pool optimization method, system, terminal and storage medium
CN112650575B (en) Resource scheduling method, device and cloud service system
CN112486642B (en) Resource scheduling method, device, electronic equipment and computer readable storage medium
CN112667380A (en) Multiprocessor task scheduling method, device and storage medium
CN110471769B (en) Resource management method and device for virtual machine
CN112162864B (en) Cloud resource allocation method, device and storage medium
CN111679914B (en) Memory management method, system, computer equipment and storage medium
CN112463361A (en) Method and equipment for distributing elastic resources of distributed computation
CN108897603B (en) Memory resource management method and device
CN107391254B (en) Intelligent terminal, resource allocation method thereof and computer-readable storage medium
CN115587049A (en) Memory recovery method and device, electronic equipment and storage medium
CN111090627B (en) Log storage method and device based on pooling, computer equipment and storage medium
CN112363828B (en) Memory fragment management method and device, vehicle-mounted system and vehicle
CN114157717A (en) Micro-service dynamic current limiting system and method
CN108804225B (en) Virtual machine load regulation and control method and device
KR20210157246A (en) Method and Device for managing resource dynamically in a embedded system
CN111352710A (en) Process management method and device, computing equipment and storage medium
CN111752851B (en) Memory recycling method and device
CN112732449B (en) Video memory resource allocation method, device and equipment based on GPU virtualization technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant