CN115729671A - Resource scheduling method and related device - Google Patents

Resource scheduling method and related device

Info

Publication number
CN115729671A
CN115729671A
Authority
CN
China
Prior art keywords
resource
target
node
utilization rate
target node
Prior art date
Legal status
Pending
Application number
CN202211433837.6A
Other languages
Chinese (zh)
Inventor
周星
高伟
周明伟
Current Assignee
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202211433837.6A priority Critical patent/CN115729671A/en
Publication of CN115729671A publication Critical patent/CN115729671A/en

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiment of the application discloses a resource scheduling method and a related device. The method comprises: acquiring a first resource utilization rate of each node, where a node provides resources to the container groups deployed on it according to their scheduling requests; screening the nodes against a first utilization threshold to obtain at least one target node, where the first resource utilization rate of each target node is less than or equal to the first utilization threshold; for each target node, applying a scoring mechanism matched to the priority of the target container group and scoring the target node according to its second resource utilization rate and resource allocation rate; and performing resource scheduling according to the priority of the target container group and the scores of the target nodes corresponding to that priority. Load is balanced among the nodes, and the running stability of high-priority tasks is improved.

Description

Resource scheduling method and related device
Technical Field
The present application relates to the field of resource processing technologies, and in particular, to a resource scheduling method and a related apparatus.
Background
The container cloud is a cloud service product realized by deploying container services on cluster servers through Docker technology, and can be regarded as a lightweight Linux cloud server. At present, many container cloud platforms provide application operation platforms through technologies such as Docker and Kubernetes, so that application environment resources can be rapidly deployed, elastically scaled, and dynamically adjusted. Kubernetes is a management system that orchestrates and schedules containers for automatic deployment, scaling, and management of applications. A Kubernetes cluster comprises a plurality of nodes, the smallest computing hardware units in Kubernetes; each node provides the resources an application program needs to run, such as computing resources and memory resources.
The running of each application program can be a task, and when the tasks under different scenes are realized through the cluster server, cluster scheduling needs to be carried out to schedule corresponding resources to execute the tasks, so that efficient resource scheduling is a problem to be considered in the field of cluster scheduling. The resource scheduling system may allocate an appropriate physical node based on the resource request of the task to allocate a corresponding resource to perform the task.
In the related art, the first way is static scheduling, and the requested resource is compared with the node allocable resource to determine whether the node has enough resources to accommodate the container group of the deployment task, which may result in low utilization of cluster resources. The second way is dynamic scheduling, and one case of dynamic scheduling is to consider only the historical resource utilization rate, which results in poor stability of service operation in the resource scheduling process. In another case of dynamic scheduling, unused resources that have been allocated are reclaimed for rescheduling, but the algorithm for calculating the recoverable resources is complex, and the recoverable resources are difficult to calculate for a long period of time when there is a jitter in the use of the resources or a tidal phenomenon on the order of days.
Disclosure of Invention
The embodiment of the application provides a resource scheduling method and a related device, which are used for improving the utilization rate of cluster resources and improving the stability of task operation.
In a first aspect, an embodiment of the present application provides a resource scheduling method, including:
acquiring a first resource utilization rate of each node; the node is used for providing resources to the container group according to the scheduling request of the container group deployed in the node;
for each target node, applying a scoring mechanism matched with the priority of the target container group, and scoring the target node according to the second resource utilization rate and the resource allocation rate of the target node; the target node is obtained by screening each node according to a first utilization rate threshold, and the first resource utilization rate of the target node is smaller than or equal to the first utilization rate threshold;
and realizing resource scheduling according to the priority of the target container group and the scores of all target nodes corresponding to the priority.
In a second aspect, an embodiment of the present application provides a resource scheduling apparatus, including:
the acquisition module is used for acquiring the first resource utilization rate of each node; the node is used for providing resources to the container group according to the scheduling request of the container group deployed in the node;
the scoring module is used for applying a scoring mechanism matched with the priority of the target container group aiming at each target node and scoring the target node according to the second resource utilization rate and the resource distribution rate of the target node; the target node is obtained by screening each node according to a first utilization rate threshold, and the first resource utilization rate of the target node is smaller than or equal to the first utilization rate threshold;
and the scheduling module is used for realizing resource scheduling according to the priority of the target container group and the scores of all target nodes corresponding to the priority.
In a third aspect, an embodiment of the present application provides a resource scheduling system, where the resource scheduling system includes a scheduler and at least one component, and the scheduler is configured to process data monitored by the at least one component to implement any of the steps of the method described above.
In a fourth aspect, an embodiment of the present application provides a resource scheduling apparatus, which includes a memory, a processor, a computer program stored in the memory and executable on the processor, and the resource scheduling system of the third aspect, the resource scheduling system being disposed on the processor, wherein the processor implements the steps of any one of the methods when executing the computer program.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium, on which computer program instructions are stored, which when executed by a processor implement the steps of any one of the above methods.
In a sixth aspect, an embodiment of the present application provides a computer program product, which includes a computer program, and when being executed by a processor, the computer program implements the steps of any one of the methods described above.
The embodiment of the application has the following beneficial effects:
firstly, the nodes are screened and high-load nodes are filtered out, obtaining target nodes whose first resource utilization rate is less than or equal to the first utilization threshold. Tasks are divided into multiple priorities, that is, the corresponding target container groups are divided into multiple priorities, and each target node is scored by jointly considering its second resource utilization rate and resource allocation rate. Resource scheduling is then performed according to the priority of the target container group and the score of each target node corresponding to that priority. Load-balanced dynamic scheduling of tasks with different priorities improves cluster resource utilization while balancing load among the nodes as far as possible, and greatly improves the running stability of high-priority tasks.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required to be used in the embodiments of the present application will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.
Fig. 1 is a schematic diagram of a framework for resource scheduling according to an embodiment of the present application;
fig. 2 is a schematic flowchart of a resource scheduling method according to an embodiment of the present application;
fig. 3 is a schematic diagram of a resource before scheduling according to an embodiment of the present application;
fig. 4 is a schematic diagram of a resource scheduling process according to an embodiment of the present application;
fig. 5 is a schematic diagram illustrating node resource overselling according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a resource scheduling apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a resource scheduling apparatus according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
For convenience of understanding, terms referred to in the embodiments of the present application are explained below:
(1) Representational State Transfer (REST), an architectural style in which the representation layer refers to the concrete form in which resources are presented.
(2) Application Programming Interface (API), a set of rules for defining the interconnection and communication of applications or devices, is a mechanism that enables one application or service to access resources in another application or service.
Any number of elements in the drawings are by way of example and not by way of limitation, and any nomenclature is used solely for differentiation and not by way of limitation.
The container cloud is a cloud service product realized by deploying container services on cluster servers through Docker technology, and can be regarded as a lightweight Linux cloud server. Configuration and networking are selected as needed; the system can serve as a virtual host, a cloud server, or a cluster server, and is suitable for building web sites, running application programs, configuring load balancing, and building service clusters. The customer does not need to handle the maintenance of the complicated underlying servers: after installing the provided image, the container can be operated, and the customer pays only for the container resources actually used. Therefore, using the container cloud can minimize capital and operating costs while meeting functional requirements.
At present, many container cloud platforms provide application operation platforms through technologies such as Docker and Kubernetes, achieving operation-and-maintenance automation, rapid application deployment, elastic and dynamic adjustment of application environment resources, and improved research-and-development efficiency. Kubernetes is a management system that orchestrates and schedules containers for automatic deployment, scaling, and management of applications. A Kubernetes cluster comprises a plurality of nodes, the smallest computing hardware units in Kubernetes; each node provides the resources an application program needs to run, such as computing resources and memory resources. In Kubernetes, one or more containers are packaged in a container group (pod) and run on a node. Kubernetes provides a built-in load balancer and a scalable automatic resource-scheduling capability: a scheduler collects and analyzes the resources occupied by nodes and, based on the analysis result, schedules resources according to the resources a pod requests, so that a newly created container group is allocated to an available node for deployment. A pod's resource specification includes two configurations, request and limit. The limit is the maximum Central Processing Unit (CPU) and memory specification a single pod may use; the request is the lowest resource amount the system can allocate to a single pod. The request influences scheduling: the scheduling component computes with the allocatable resources of each node and the requests of the pods already allocated, and when the allocated requests equal or approach the allocatable resources, no further pods are scheduled onto the node.
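The request-based admission check just described can be sketched as follows; the `can_schedule` helper and the resource figures are illustrative assumptions, not part of the patent:

```python
def can_schedule(pod_request, node_allocatable, node_allocated):
    # A pod fits only if every requested resource fits within the node's
    # remaining allocatable amount (allocatable minus the sum of requests
    # already placed on the node).
    for resource, requested in pod_request.items():
        remaining = node_allocatable[resource] - node_allocated.get(resource, 0.0)
        if requested > remaining:
            return False
    return True

allocatable = {"cpu": 4.0, "memory_gb": 16.0}  # capacity reported by kubelet
allocated = {"cpu": 3.5, "memory_gb": 8.0}     # sum of existing pod requests

print(can_schedule({"cpu": 0.4, "memory_gb": 2.0}, allocatable, allocated))  # True
print(can_schedule({"cpu": 1.0, "memory_gb": 2.0}, allocatable, allocated))  # False
```

Note that the check compares requests, not actual usage, which is exactly why the static scheme discussed next can strand cluster capacity.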
Resources may include CPU resources, memory resources, and other resources, among others.
It can be seen that Kubernetes uses static scheduling, which schedules according to the resources a container requests: the resources requested by a pod are compared with the resources a node can allocate to determine whether the node has enough resources to accommodate the pod. The greatest advantages of static scheduling are that it is simple and efficient and that cluster resource management is convenient. In an actual environment, however, a task has a certain subjectivity and blindness when selecting a container specification; to ensure stability, a task may apply for resources far larger than its actual usage, so the resource utilization rate of the task's container is very low, and if such tasks occupy a large proportion, cluster resource utilization is easily left low.
To address the low actual utilization of cluster resources, the Kubernetes community mainly adopts approaches such as compressing pod-allocated resources and AutoScale elastic scaling. When a pod creation request arrives, the container cloud platform automatically modifies the request value of the pod's requested resources according to a compression ratio. In this way the requests of the whole cluster are compressed, and after compression the cluster can create more pods, improving the actual utilization of cluster resources. However, the request value can only be compressed when the pod is created or rebuilt, for example when a service is deployed or upgraded, not during service operation; and since the actual load pattern of each pod differs, determining the compression ratio is itself a difficult problem. AutoScale elastic scaling mainly comprises horizontal pod autoscaling (HPA) and vertical pod autoscaling (VPA). HPA dynamically increases or decreases the number of service pods according to the service's actual resource-usage monitoring data, so that each pod's actual resource usage approaches the value of its request. VPA instead dynamically adjusts the request and limit values of the pod according to the pod's actual resource-usage monitoring data, with the same aim. However, HPA reduces the number of pods when a task is idle, and when task load fluctuates a new pod must be created and started to expand capacity, which is time-consuming and intolerable for some tasks. When VPA modifies pod resources, the pod is rebuilt, interrupting the application task.
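The request-compression idea can be shown with a minimal sketch; the helper name and the compression ratio are hypothetical, chosen only for illustration:

```python
def compress_requests(pod_request, ratio):
    # At pod-creation time, scale every requested resource down by a
    # cluster-wide compression ratio so that more pods fit per node.
    # The pod's actual usage is unchanged; only the scheduler's view shrinks.
    return {res: val * ratio for res, val in pod_request.items()}

original = {"cpu": 2.0, "memory_gb": 4.0}
print(compress_requests(original, ratio=0.5))  # {'cpu': 1.0, 'memory_gb': 2.0}
```

The sketch also makes the drawback visible: `ratio` is a single static number applied at creation, even though each pod's real load pattern differs.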
In summary, the Kubernetes community scheme has a certain problem in solving load optimization of cluster resources, and a dynamic scheduling scheme needs to be implemented according to a specific task scene, so that the actual utilization rate of the cluster resources is improved.
One dynamic scheduling scheme in the related art oversells node resources according to the node's historical resource utilization rate. However, tidal phenomena may occur in pod resource usage: after pods are dynamically scheduled according to historical node utilization, a peak in pod resource usage on the node may leave other pods with insufficient resources, affecting the stability of service operation. Another scheme recovers allocated-but-unused resources among the various resource types of node components and reuses them for scheduling. However, the algorithm for calculating recyclable resources is complex, and when resource usage jitters or exhibits a day-level tidal pattern, recyclable resources are difficult to calculate over a long time period. In addition, recoverable resources and schedulable resources are distinguished during scheduling, and services of different quality grades use different resource scheduling, so the scheduler operates on two resource views and the scheduling algorithm is relatively complex.
Therefore, for scenarios with low cluster resource utilization, the embodiment of the application provides a simple, node-load-aware dynamic scheduling method. Pods are dynamically scheduled according to the historical load information of the nodes, increasing the number of pods deployed per node and improving cluster resource utilization. In addition, task priorities are defined, and the resources of the pods corresponding to high-priority tasks are guaranteed, ensuring the running stability of high-priority tasks.
After introducing the design concept of the embodiment of the present application, some simple descriptions are provided below for application scenarios to which the technical solution of the embodiment of the present application can be applied, and it should be noted that the application scenarios described below are only used for describing the embodiment of the present application and are not limited. In specific implementation, the technical scheme provided by the embodiment of the application can be flexibly applied according to actual needs.
To further explain the technical solutions provided by the embodiments of the present application, the following detailed description is made with reference to the accompanying drawings and the specific embodiments. Although the embodiments of the present application provide method steps as shown in the following embodiments or figures, more or fewer steps may be included in the method based on conventional or non-inventive efforts. In steps where no necessary causal relationship exists logically, the order of execution of the steps is not limited to that provided by the embodiments of the present application.
Referring to fig. 1, a framework for resource scheduling provided in an embodiment of the present application is described below.
The Kube-apiserver is a Kubernetes native core component that provides the Kubernetes REST API (an API designed according to REST architectural principles) and is responsible for storing all kinds of Kubernetes information.
Kubelet is a Kubernetes native node agent component and is responsible for maintaining the total amount of node resources and allocable amount, creating containers and the like.
The Node exporter is a Node monitoring and collecting component and collects the actual use information of the Node resources in real time.
Prometheus is an open source monitoring component and is responsible for storing acquired information and performing statistical aggregation on monitoring data according to query conditions.
The Node-annotator is a self-developed component responsible for pulling monitoring data from Prometheus, periodically synchronizing the real load information of each node (such as CPU usage and memory usage) into the node's annotation field, and dynamically calculating the node resource oversell coefficient according to the node's allocated resource amount and actual utilization rate.
The Scheduler is the Kubernetes native scheduler, which schedules resources according to the request resources of pods and the allocatable resources of nodes.
The Extern-scheduler is a self-developed extended scheduler that periodically obtains node load information from the node's annotation field by monitoring the kube-apiserver, and schedules according to the real load of the node.
When pod resource usage on a node jitters or resource tides occur, pod resources compete with each other, and some pods may be evicted. To address this problem, different priorities are defined for tasks so as to provide service-quality assurance at different task levels. The priority describes the order in which tasks' pod resource requests are satisfied and the order in which pods are evicted when node resources are tight. Priority is realized through the Kubernetes PriorityClass resource and includes three levels: Prod (high), Mid (medium), and Low (low). Prod-priority tasks have the highest resource guarantee, and during node eviction such pods are evicted last; they are generally online high-priority tasks that tolerate neither interruption nor delayed resource response. Low-priority tasks have the lowest resource guarantee: when resources are short their requests cannot be guaranteed, and during node eviction such pods are evicted first. They are typically delay-insensitive tasks, such as offline tasks that can tolerate short interruptions or pauses.
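The eviction ordering implied by the three PriorityClass levels can be sketched as follows; the pod names and the tie-breaking behavior are illustrative assumptions:

```python
PRIORITY_RANK = {"Low": 0, "Mid": 1, "Prod": 2}

def eviction_order(pods):
    # Under node resource pressure, Low-priority pods are evicted first
    # and Prod-priority pods last (stable sort keeps submission order
    # within a level).
    return sorted(pods, key=lambda p: PRIORITY_RANK[p["priority"]])

pods = [
    {"name": "web-frontend", "priority": "Prod"},
    {"name": "batch-etl", "priority": "Low"},
    {"name": "report-gen", "priority": "Mid"},
]
print([p["name"] for p in eviction_order(pods)])
# ['batch-etl', 'report-gen', 'web-frontend']
```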
With reference to fig. 2 and fig. 1, the following describes a technical solution provided by an embodiment of the present application. The embodiment of the application provides a resource scheduling method, which is applied to a container cloud platform and at least comprises the following steps:
s201, obtaining a first resource utilization rate of each node.
The node is used for providing resources to the container group according to the scheduling request of the container group deployed at the node.
S202, aiming at each target node, a scoring mechanism matched with the priority of the target container group is applied, and the target node is scored according to the second resource utilization rate and the resource distribution rate of the target node.
The target node is obtained by screening each node according to the first utilization rate threshold, and the first resource utilization rate of the target node is smaller than or equal to the first utilization rate threshold.
And S203, realizing resource scheduling according to the priority of the target container group and the scores of all target nodes corresponding to the priority.
In the above embodiment, the nodes are first screened and high-load nodes are filtered out, obtaining target nodes whose first resource utilization rate is less than or equal to the first utilization threshold. Tasks are divided into multiple priorities, that is, the corresponding target container groups are divided into multiple priorities, and each target node is scored by jointly considering its second resource utilization rate and resource allocation rate. Resource scheduling is then performed according to the priority of the target container group and the scores of the target nodes corresponding to that priority. Load-balanced dynamic scheduling of tasks with different priorities improves cluster resource utilization while balancing load among the nodes as far as possible, and greatly improves the running stability of high-priority tasks.
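Step S203 can be sketched as selecting the highest-scoring target node under the scoring mechanism matched to the pod's priority; the field names and score values below are hypothetical:

```python
def pick_node(target_nodes, priority):
    # Choose the target node with the highest score for this priority:
    # the first target score applies to high (Prod) priority pods, the
    # second target score to medium/low priority pods.
    key = "first_target_score" if priority == "Prod" else "second_target_score"
    return max(target_nodes, key=lambda n: n[key])

targets = [
    {"name": "node-1", "first_target_score": 0.75, "second_target_score": 0.45},
    {"name": "node-2", "first_target_score": 0.50, "second_target_score": 0.70},
]
print(pick_node(targets, "Prod")["name"])  # node-1
print(pick_node(targets, "Low")["name"])   # node-2
```

The same pool of target nodes yields different winners per priority, which is how high-priority pods end up on lightly allocated nodes while low-priority pods are packed onto busier ones.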
Referring to S201, a container group is already deployed on a node, and the task of the container group uses resources during execution, where the resources generally include CPU resources and memory resources. The first resource utilization rate comprises the CPU utilization rate and the memory utilization rate determined by the container groups deployed on the node in the course of using resources. In practice, the resources may also include other resources; CPU resources and memory resources are only illustrative, not limiting.
Illustratively, the Extern-scheduler extended dynamic scheduler acquires node load information from the node information annotation field by monitoring changes in node information, and performs balanced scheduling of tasks with different priorities according to the node load information.
Taking a node as an example, a process of obtaining a first resource utilization rate of the node is described:
the Node-annotor mainly completes synchronization of real load information of the Node, and respectively calculates, by monitoring data of the monitoring component Prometheus, an average utilization rate of a CPU (for example, 0.33142) in the past 5 minutes, an average utilization rate of a CPU (for example, 0.33495) in the past 1 hour, a maximum utilization rate of a CPU (for example, 0.33295) in the past 24 hours, an average utilization rate of a memory (for example, 0.3401) in the past 5 minutes, an average utilization rate of a memory (for example, 0.3461) in the past 1 hour, and a maximum utilization rate of a memory (for example, 0.3525) in the past 24 hours. The node annotator component updates these data into the announcement field in the node.
In addition, the weight of each utilization rate is configured in advance, for example, the weight of the average utilization rate of the 5-minute CPU is 0.2, the weight of the average utilization rate of the 1-hour CPU is 0.3, and the weight of the maximum utilization rate of the 24-hour CPU is 0.5. The weight of the average memory utilization for 5 minutes is 0.2, the weight of the average memory utilization for 1 hour is 0.3, and the weight of the maximum memory utilization for 24 hours is 0.5.
The utilization rates are multiplied by their weights and summed, giving a CPU utilization rate of 0.33142 × 0.2 + 0.33495 × 0.3 + 0.33295 × 0.5 = 0.333244 and a memory utilization rate of 0.3401 × 0.2 + 0.3461 × 0.3 + 0.3525 × 0.5 = 0.3481.
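The weighted sums above can be reproduced directly from the example figures and weights:

```python
# Per-window samples from the Node-annotator example above:
cpu_samples = [0.33142, 0.33495, 0.33295]  # 5-min avg, 1-h avg, 24-h max
mem_samples = [0.3401, 0.3461, 0.3525]
weights = [0.2, 0.3, 0.5]                  # preconfigured per-window weights

cpu_util = sum(s * w for s, w in zip(cpu_samples, weights))
mem_util = sum(s * w for s, w in zip(mem_samples, weights))
print(round(cpu_util, 6))  # 0.333244
print(round(mem_util, 4))  # 0.3481
```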
The CPU utilization rate and memory utilization rate of each node are obtained by applying the above method. The first utilization threshold includes a first CPU utilization threshold and a first memory utilization threshold, and the nodes are screened against both. Specifically, nodes whose CPU utilization exceeds the first CPU utilization threshold are deleted, and nodes whose memory utilization exceeds the first memory utilization threshold are deleted. With this design, the CPU utilization of each target node obtained after screening is less than or equal to the first CPU utilization threshold, and its memory utilization is less than or equal to the first memory utilization threshold.
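The two-threshold screening step can be sketched as follows; node names, utilization figures, and the 0.8 thresholds are hypothetical:

```python
def filter_nodes(nodes, cpu_threshold, mem_threshold):
    # Keep only nodes whose weighted CPU utilization AND memory
    # utilization are both at or below the first-utilization thresholds.
    return [n for n in nodes
            if n["cpu_util"] <= cpu_threshold and n["mem_util"] <= mem_threshold]

nodes = [
    {"name": "node-1", "cpu_util": 0.33, "mem_util": 0.35},
    {"name": "node-2", "cpu_util": 0.85, "mem_util": 0.40},  # CPU too high
    {"name": "node-3", "cpu_util": 0.50, "mem_util": 0.90},  # memory too high
]
targets = filter_nodes(nodes, cpu_threshold=0.8, mem_threshold=0.8)
print([n["name"] for n in targets])  # ['node-1']
```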
Referring to S202, for each target node, a scoring mechanism matching the priority of the target container group is applied, and the target node is scored according to the second resource utilization rate and the resource allocated rate of the target node.
The target container group is a container group to be dispatched to the node, and the priority of the target container group is also the priority of the task executed by the target container group. Different tasks have different priorities and are divided into a high-priority container group, a medium-priority container group and a low-priority container group according to a preset priority determination rule.
Due to the fact that the scheduling principles of the target container groups with different priorities are different, the matching scoring mechanisms are different. If the priority of the target container group is high, the scoring mechanism matched with the priority of the target container group is a first scoring mechanism; and if the priority of the target container group is medium or low, the scoring mechanism of the priority matching of the target container group is a second scoring mechanism.
In the first scoring mechanism, the second resource utilization rate is negatively correlated with the score, that is, the lower the second resource utilization rate, the higher the score; the resource allocation rate is also negatively correlated with the score, that is, the lower the resource allocation rate, the higher the score. In the second scoring mechanism, the second resource utilization rate is likewise negatively correlated with the score, but the resource allocation rate is positively correlated with the score, that is, the lower the resource allocation rate, the lower the score. The difference between the two mechanisms is that a high-priority target container group can be scheduled to a node with a low resource allocation rate, while, in order to reserve resources for high-priority target container groups, medium- and low-priority groups are scheduled to nodes with a high resource allocation rate.
Due to the different scoring mechanisms, the same target node has two scores, a first target score and a second target score. The first target score is the score of the target node in the case that the target container group has high priority, and the second target score is the score of the target node in the case that the target container group has medium or low priority.
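A minimal sketch of the two scoring mechanisms, assuming equal 0.5/0.5 weights for the two factors (the patent leaves the weights configurable; these functions and values are illustrative, not the patented formulas):

```python
def score_high_priority(utilization, allocated_rate):
    # First mechanism (Prod): lower utilization AND lower allocated rate
    # both raise the score, spreading high-priority pods onto idle nodes.
    return 0.5 * (1 - utilization) + 0.5 * (1 - allocated_rate)

def score_low_priority(utilization, allocated_rate):
    # Second mechanism (Mid/Low): lower utilization raises the score,
    # but a HIGHER allocated rate raises it, packing these pods onto
    # already-allocated nodes to keep idle nodes free for Prod pods.
    return 0.5 * (1 - utilization) + 0.5 * allocated_rate

# Two nodes with equal real load but different allocation levels:
node_a = (0.3, 0.2)   # lightly allocated
node_b = (0.3, 0.7)   # heavily allocated
print(score_high_priority(*node_a) > score_high_priority(*node_b))  # True
print(score_low_priority(*node_a) < score_low_priority(*node_b))    # True
```

A Prod pod therefore prefers the lightly allocated node, while a Mid/Low pod prefers the heavily allocated one, matching the reservation behavior described above.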
With the scoring mechanisms determined, the scoring process is described below, taking one target node as an example.
The first target score determination process:
(1) Apply the first scoring mechanism and score the target node according to its second resource utilization rate to obtain a first score.
Different weights can be set for the CPU utilization rate and the memory utilization rate, and the second resource utilization rate is obtained by weighting them. The lower the second resource utilization rate, the higher the score of the target node; in this step, the first score is obtained according to this rule.
(2) Score the target node according to its resource allocation rate to obtain a second score.
The lower the resource allocation rate, the higher the score of the target node; in this step, the second score is obtained according to this rule.
(3) Weight the first score and the second score to obtain the first target score of the target node for the case where the target container group has high priority.
The first score and the second score are weighted according to their preset weights to obtain the first target score of the target node for the case where the target container group has high priority.
The second target score determination process:
(1) Apply the second scoring mechanism and score the target node according to its second resource utilization rate to obtain a third score.
The lower the second resource utilization rate, the higher the score of the target node; in this step, the third score is obtained according to this rule.
(2) Score the target node according to its resource allocation rate to obtain a fourth score.
The higher the resource allocation rate, the higher the score of the target node; in this step, the fourth score is obtained according to this rule.
(3) Weight the third score and the fourth score to obtain the second target score of the target node for the case where the target container group has medium or low priority.
The third score and the fourth score are weighted according to their preset weights to obtain the second target score of the target node for the case where the target container group has medium or low priority.
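The two priority-matched scoring mechanisms above can be sketched as follows. This is a hypothetical illustration: the linear 0-100 scoring functions and the equal 0.5/0.5 weights are assumptions, since the embodiment leaves the concrete formulas and preset weights open.

```python
def weighted_utilization(cpu_util, mem_util, cpu_weight=0.5, mem_weight=0.5):
    """Second resource utilization rate: weighted CPU and memory utilization."""
    return cpu_util * cpu_weight + mem_util * mem_weight

def first_target_score(util, allocated_rate, w_util=0.5, w_alloc=0.5):
    """First scoring mechanism (high priority): utilization and allocated
    rate are both negatively correlated with the score."""
    first_score = (1.0 - util) * 100             # lower utilization -> higher
    second_score = (1.0 - allocated_rate) * 100  # lower allocated rate -> higher
    return first_score * w_util + second_score * w_alloc

def second_target_score(util, allocated_rate, w_util=0.5, w_alloc=0.5):
    """Second scoring mechanism (medium/low priority): utilization negatively
    correlated, allocated rate positively correlated (pack onto busier nodes)."""
    third_score = (1.0 - util) * 100
    fourth_score = allocated_rate * 100          # higher allocated rate -> higher
    return third_score * w_util + fourth_score * w_alloc
```

Under these assumed formulas, a nearly empty node outranks a fuller one for a high-priority group, while the ordering flips for medium- and low-priority groups, which reserves the emptier nodes.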
Referring to S203, the resource scheduling process is described separately for the different priorities of the target container group.
In the first case, the priority of the target container group is high. The score of the target node applied in this case is the first target score.
A. When the second resource utilization rates of N target nodes are less than or equal to the second utilization rate threshold and their resource allocation rates are less than or equal to the allocation rate threshold, the target container group is scheduled to the target node with the highest first target score among the N target nodes; N is an integer greater than or equal to 1 and less than or equal to the total number of target nodes.
The lower the second resource utilization rate and the resource allocation rate, the more suitable the target node is for the target container group. Therefore, if N of all the target nodes have a second resource utilization rate less than or equal to the second utilization rate threshold and a resource allocation rate less than or equal to the allocation rate threshold, the target container group is scheduled to the target node with the highest first target score among those N target nodes.
B. When the resource allocation rate of every target node is greater than the allocation rate threshold, the target container group is scheduled to the target node with the lowest second resource utilization rate among all target nodes.
If the resource allocation rate of every target node is greater than the allocation rate threshold, the resource allocation rates of all target nodes are very high, so scheduling need not follow the first target score. Instead, the target node with the lowest second resource utilization rate is selected from all target nodes, and the target container group is scheduled to it.
C. When the second resource utilization rate of every target node is greater than the second utilization rate threshold, the low-priority container groups on the first reference target node are deleted, and the target container group is scheduled to the first reference target node; the first reference target node is the target node with the highest second target score among the target nodes hosting low-priority container groups.
If the second resource utilization rate of every target node is greater than the second utilization rate threshold, all target nodes are heavily loaded. Among the target nodes hosting low-priority container groups, the one with the highest second target score is taken as the first reference target node. The low-priority container groups on the first reference target node are deleted, and the target container group is scheduled to it. In this way, resources are set aside for high-priority task scheduling, and the resources of high-priority container groups are guaranteed first.
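The three branches A, B, and C for a high-priority target container group can be sketched as follows. The Node structure, field names, and threshold values are illustrative assumptions; the two scores are assumed to have been computed beforehand by the matched scoring mechanisms.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    util: float                  # second resource utilization rate
    allocated_rate: float        # resource allocation rate
    first_target_score: float
    second_target_score: float
    has_low_priority_pods: bool = False

def schedule_high_priority(nodes, util_threshold, alloc_threshold):
    """Return (chosen node, evict_low_priority flag) for a high-priority group."""
    # Case A: nodes under both thresholds -> highest first target score.
    fits = [n for n in nodes
            if n.util <= util_threshold and n.allocated_rate <= alloc_threshold]
    if fits:
        return max(fits, key=lambda n: n.first_target_score), False
    # Case B: every allocation rate over threshold -> lowest utilization wins.
    if all(n.allocated_rate > alloc_threshold for n in nodes):
        return min(nodes, key=lambda n: n.util), False
    # Case C: every utilization over threshold -> evict low-priority groups
    # on the reference node (highest second target score) and place there.
    if all(n.util > util_threshold for n in nodes):
        candidates = [n for n in nodes if n.has_low_priority_pods]
        if candidates:
            return max(candidates, key=lambda n: n.second_target_score), True
    return None, False
```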
In the second case, the priority of the target container group is medium or low. The score of the target node applied in this case is the second target score.
A. When the second resource utilization rates of M target nodes are less than or equal to the second utilization rate threshold and their resource allocation rates are less than or equal to the allocation rate threshold, the target container group is scheduled to the target node with the highest second target score among the M target nodes; M is an integer greater than or equal to 1 and less than or equal to the total number of target nodes.
For this example, refer to step A in the first case, which is not repeated here.
B. When the second resource utilization rate of every target node is greater than the second utilization rate threshold: for a medium-priority target container group, the low-priority container groups on the second reference target node are deleted and the medium-priority target container group is scheduled to the second reference target node; a low-priority target container group is controlled to wait for scheduling. The second reference target node is the target node with the highest second target score among the target nodes hosting low-priority container groups.
For this example, refer to step C in the first case, which is not repeated here. In addition, no allocable resources are available for a low-priority target container group, so it keeps waiting to be scheduled.
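The second case can likewise be sketched. The dictionary-based node representation, field names, and thresholds are assumptions; the point illustrated is that a medium-priority group may trigger eviction of low-priority groups, while a low-priority group that cannot fit simply waits.

```python
def schedule_medium_low(nodes, priority, util_threshold, alloc_threshold):
    """priority is 'medium' or 'low'; return (node, evict flag) or (None, False)."""
    # Case A: nodes under both thresholds -> highest second target score.
    fits = [n for n in nodes
            if n["util"] <= util_threshold and n["alloc"] <= alloc_threshold]
    if fits:
        return max(fits, key=lambda n: n["second_score"]), False
    # Case B: every utilization over threshold.
    if all(n["util"] > util_threshold for n in nodes):
        if priority == "medium":
            # Evict low-priority groups on the second reference target node.
            candidates = [n for n in nodes if n["has_low_prio"]]
            if candidates:
                return max(candidates, key=lambda n: n["second_score"]), True
        # A low-priority target container group waits for scheduling.
    return None, False
```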
In a specific example, fig. 3 shows a schematic diagram before resource scheduling, and fig. 4 shows a schematic diagram of the resource scheduling process. After multi-priority load-balanced scheduling completes, resource usage jitter or resource tides may occur as pods run on a node; the node load may then become high and resource preemption may occur between pods, affecting their stable operation. Through the priority eviction strategy, pods are migrated to nodes with low load, balancing the load across nodes.
The Descheduler monitors kube-apiserver to obtain node load information and pod information. When a node's load is at a high water level, the eviction strategy is applied: low-priority task pods are evicted first, then medium-priority task pods, and high-priority task pods last. Evicted pods are dispatched to low-water-level nodes by the dynamic scheduler, reducing the high load on the node. Meanwhile, because of the eviction priority order, high-priority task pods are evicted as little as possible, ensuring the stability and resource usage of high-priority tasks.
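The priority-ordered eviction above can be sketched with a simple utilization model. The water-level thresholds, the per-pod utilization map, and the greedy stop condition are illustrative assumptions, not the Descheduler's actual implementation.

```python
PRIORITY_ORDER = {"low": 0, "medium": 1, "high": 2}

def pods_to_evict(pods, node_util, high_watermark, target_util, pod_util):
    """Evict pods in ascending priority order until node utilization drops
    to the target level; pod_util maps pod name -> its utilization share."""
    if node_util <= high_watermark:
        return []  # node is not at a high water level
    evicted = []
    for pod in sorted(pods, key=lambda p: PRIORITY_ORDER[p["priority"]]):
        if node_util <= target_util:
            break
        evicted.append(pod["name"])
        node_util -= pod_util[pod["name"]]
    return evicted
```

Because pods are sorted by ascending priority, a high-priority pod is only evicted once every lower-priority pod has already been removed.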
In the resource scheduling process, both the relationship between the second resource utilization rate and the second utilization rate threshold and the relationship between the resource allocation rate and the allocation rate threshold are considered. Next, the calculation of the resource allocation rate is described, taking one target node as an example.
The actual total resource amount of the target node is determined based on the resource over-selling coefficient of the target node, and the resource allocation rate is determined from the actual total resource amount and the allocated resources.
Different resource types have different over-selling coefficients. In the embodiment of the application, the resource over-selling coefficients include a CPU resource over-selling coefficient and a memory resource over-selling coefficient, and the actual total resources include an actual total amount of CPU resources and an actual total amount of memory resources.
For any node, in scenarios where the real resource usage of container groups is far lower than their requested resources, a common problem arises: the real load of the node is relatively low, but the node's allocable resources are already occupied by the container groups' resource requests, so new container groups cannot be scheduled to the node. By over-selling the node's resources, the resources that are allocated but unused can be offered as allocable resources for scheduling new container groups.
After the Node-annotator dynamically calculates the node resource over-selling coefficient from the node's real historical load information, the Kubelet over-sells the node's resources according to this coefficient. A schematic diagram of node resource over-selling is shown in fig. 5.
By modifying the Kubelet logic, when the Kubelet periodically reports and updates the node's allocable resources in the Node status, the actual total resource amount and the allocable resource amount in the Node status are modified according to the node resource over-selling coefficient in the node's annotation field. The actual total resources are calculated as follows:
R_r = F × T
where r denotes the resource type (CPU or memory), F the corresponding resource over-selling coefficient, and T the initial total resource amount.
Illustratively, the actual total amount of CPU resources is determined based on the CPU resource over-sell factor and the initial total amount of CPU resources of the target node. For example, the CPU resource over-selling coefficient is 1.5, the initial total CPU resource amount of the target node is 100 cores, and the actual total CPU resource amount is 100 cores × 1.5=150 cores.
Illustratively, the actual total amount of memory resources is determined from the memory resource over-selling coefficient and the initial total amount of memory resources of the target node. For example, if the memory resource over-selling coefficient is 1.3 and the initial total memory of the target node is 1G, the actual total memory is 1G × 1.3 = 1.3G.
And the calculation of the amount of allocable resources of a node is defined as follows:
U_r = F × T − A
where r denotes the resource type (CPU or memory), F the corresponding resource over-selling coefficient, T the initial total resource amount, and A the allocated resource amount.
As above, referring to fig. 5, which shows a schematic diagram of node resource over-selling: the node resource over-selling coefficient virtually enlarges the total node resources and thereby enlarges the allocable amount. The effect is equivalent to adding the node's allocated-but-unused resources back into the allocable pool for reuse, improving the real utilization rate of node resources.
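The two formulas above (R_r = F × T and U_r = F × T − A) translate directly into small helpers; the figures used below mirror the CPU example in the text (over-selling coefficient 1.5, initial total 100 cores, 30 cores allocated).

```python
def oversold_total(oversell_factor, initial_total):
    """R_r = F * T: virtually enlarged total for one resource type."""
    return oversell_factor * initial_total

def allocable(oversell_factor, initial_total, allocated):
    """U_r = F * T - A: allocable amount after over-selling."""
    return oversell_factor * initial_total - allocated
```

With the earlier figures, oversold_total(1.5, 100) gives 150 cores and allocable(1.5, 100, 30) gives 120 allocable cores.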
Determining the allocated rate of the first resource according to the actual total amount of the CPU resource and the allocated amount of the CPU resource; determining the allocated rate of the second resource according to the actual total amount of the memory resources and the allocated amount of the memory resources; and weighting the first resource allocation rate and the second resource allocation rate to obtain the resource allocation rate.
For example, if the actual total amount of CPU resources is 150 cores and 30 cores are allocated, the first resource allocated rate is 0.2; if the actual total amount of memory resources is 1.3G and 0.3G is allocated, the second resource allocated rate is about 0.23. With a weight of, for example, 0.4 for the first resource allocated rate and 0.6 for the second, the weighted resource allocated rate is 0.2 × 0.4 + 0.23 × 0.6 ≈ 0.218.
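The weighted allocation-rate computation can be reproduced as follows. The 0.4/0.6 weights are the example values from the text, not fixed by the embodiment; with the example figures (150-core actual total, 30 cores allocated, 1.3G actual memory, 0.3G allocated) the weighted rate works out to roughly 0.218.

```python
def resource_allocated_rate(cpu_total, cpu_allocated, mem_total, mem_allocated,
                            cpu_weight=0.4, mem_weight=0.6):
    cpu_rate = cpu_allocated / cpu_total   # first resource allocated rate
    mem_rate = mem_allocated / mem_total   # second resource allocated rate
    return cpu_rate * cpu_weight + mem_rate * mem_weight
```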
In the above process, the resource over-selling coefficient is determined as follows:
aiming at the first resource, determining the allocation rate of the first resource according to the allocated first resource amount of the current target node and the initial first resource total amount; determining a resource over-selling coefficient of the first resource according to the allocation rate of the first resource, the resource utilization rate of the first resource and the weight corresponding to the resource utilization rate; the first resource is a CPU resource or a memory resource.
For example, taking the CPU resource: the allocated CPU resource amount is 50 cores and the initial total CPU resource amount is 100 cores, so the CPU resource allocation rate is 0.5. The resource over-selling coefficient of the CPU resource is then determined from the CPU resource allocation rate of 0.5, the resource utilization rate of the CPU resource, and the weight corresponding to the resource utilization rate (see below).
Continuing with the CPU resource example, the resource over-sell coefficient can be determined by the following formula:
F = 1 + k × (a − U)
wherein F is the resource over-selling coefficient, a is the current resource allocation rate, and k is a scaling coefficient (0.8 in the example below); U is the weighted historical resource utilization:
U = use_1 × weight_1 + use_2 × weight_2 + … + use_h × weight_h
where use_i is the resource utilization rate of the ith preset time period, weight_i is the weight of the resource utilization rate of the ith preset time period, and h is the number of preset time periods.
For example, with a current resource allocation rate a of 0.5: the resource utilization rate for i=1 (5 minutes) is 0.33142 with weight 0.2; for i=2 (1 hour) it is 0.33495 with weight 0.3; for i=3 (24 hours) it is 0.3525 with weight 0.5. Then U = 0.33142 × 0.2 + 0.33495 × 0.3 + 0.3525 × 0.5 ≈ 0.343, and with k = 0.8, F = 1 + 0.8 × (0.5 − 0.343) ≈ 1.13.
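A sketch of this over-selling coefficient calculation. Note that the original formula image is not reproduced in this text, so the exact form, including the scaling coefficient k (0.8 here), is inferred from the surviving worked numbers and should be treated as an assumption.

```python
def oversell_coefficient(alloc_rate, utilizations, weights, k=0.8):
    """F = 1 + k * (a - U), where U is the weighted historical utilization
    over h preset time periods (e.g. 5 minutes, 1 hour, 24 hours)."""
    u = sum(use * w for use, w in zip(utilizations, weights))
    return 1.0 + k * (alloc_rate - u)
```

With the example's values (allocation rate 0.5, utilizations 0.33142/0.33495/0.3525 weighted 0.2/0.3/0.5), the coefficient comes out to about 1.13.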
In this embodiment, on the basis of dynamically over-selling node resources according to real historical load information, tasks are further divided into multiple priorities, and tasks of different priorities are dynamically scheduled with balanced load by the extended scheduler according to node load conditions, improving cluster resource utilization and balancing the load among nodes as far as possible. Meanwhile, because of the priority eviction strategy, pods of high-priority tasks are evicted as little as possible, greatly improving the running stability of high-priority tasks.
As shown in fig. 6, based on the same inventive concept as the resource scheduling method, the embodiment of the present application further provides a resource scheduling apparatus, where the apparatus is applied to a container cloud platform and includes an obtaining module 61, a scoring module 62, and a scheduling module 63.
An obtaining module 61, configured to obtain a first resource utilization rate of each node; the node is used for providing resources for the container group according to the scheduling request of the container group deployed at the node;
a scoring module 62, configured to apply, for each target node, a scoring mechanism matched with the priority of the target container group, and score the target node according to the second resource utilization rate and the resource allocation rate of the target node; the target node is obtained by screening each node according to a first utilization rate threshold, and the first resource utilization rate of the target node is smaller than or equal to the first utilization rate threshold;
and the scheduling module 63 is configured to implement resource scheduling according to the priority of the target container group and the score of each target node corresponding to the priority.
In one possible design, the first resource utilization rate includes a CPU utilization rate and a memory utilization rate determined from the resource usage of the container groups already deployed on the node; the first utilization threshold includes a first CPU utilization rate threshold and a first memory utilization rate threshold.
In one possible design, the system further includes a screening module configured to:
deleting nodes of which the CPU utilization rate is greater than a first CPU utilization rate threshold value in each node;
and deleting the nodes of which the memory utilization rate is greater than the first memory utilization rate threshold value in each node.
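A minimal sketch of this screening step, assuming nodes are represented as dictionaries of utilization values; a node exceeding either first threshold is removed from candidacy.

```python
def screen_nodes(nodes, cpu_threshold, mem_threshold):
    """Keep only nodes whose CPU and memory utilization are both at or
    below the first utilization thresholds."""
    return [n for n in nodes
            if n["cpu_util"] <= cpu_threshold and n["mem_util"] <= mem_threshold]
```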
In one possible design, if the priority of the target container group is high, the scoring mechanism of the priority matching of the target container group is the first scoring mechanism; if the priority of the target container group is medium or low, the scoring mechanism matched with the priority of the target container group is a second scoring mechanism;
in the first scoring mechanism, the utilization rate of the second resource is negatively related to scoring, and the allocated rate of the resource is negatively related to scoring; in a second scoring mechanism, the utilization rate of a second resource is negatively correlated with the score, and the allocated rate of the resource is positively correlated with the score; the second resource utilization rate is a utilization rate obtained by weighting the CPU utilization rate and the memory utilization rate.
In one possible design, scoring module 62 is specifically configured to:
grading the target node according to the second resource utilization rate of the target node by applying a first grading mechanism to obtain a first score; grading the target node according to the resource allocation rate to obtain a second score; weighting the first score and the second score to obtain a first target score of the target node under the condition that the priority of the target container group is high;
applying a second scoring mechanism, and scoring the target node according to the second resource utilization rate of the target node to obtain a third score; scoring the target node according to the resource allocation rate to obtain a fourth score; and weighting the third score and the fourth score to obtain a second target score of the target node under the condition that the priority of the target container group is medium or low.
In one possible design, the scheduling module 63 is specifically configured to:
under the condition that the second resource utilization rate of the N target nodes is smaller than or equal to a second utilization rate threshold value and the resource allocation rate is smaller than or equal to an allocation rate threshold value, scheduling the target container group to a target node with the highest first target score in the N target nodes; wherein N is an integer greater than or equal to 1 and less than or equal to the total number of the target nodes;
under the condition that the resource allocation rate of each target node is greater than the allocation rate threshold, scheduling the target container group to a target node with the lowest second resource utilization rate in each target node;
under the condition that the second resource utilization rate of each target node is greater than the second utilization rate threshold value, deleting the container group with low priority in the first reference target node, and scheduling the target container group to the first reference target node; the first reference target node is a target node with the highest first target score in the target nodes of the container groups with low priorities.
In one possible design, the scheduling module 63 is specifically configured to:
under the condition that the second resource utilization rate of the M target nodes is smaller than or equal to a second utilization rate threshold value and the resource allocation rate is smaller than or equal to an allocation rate threshold value, scheduling the target container group to a target node with the highest second target score in the M target nodes; wherein M is an integer greater than or equal to 1 and less than or equal to the total number of the target nodes;
under the condition that the second resource utilization rate of each target node is greater than the second utilization rate threshold value, deleting the container group with low priority in the second reference target node, and scheduling the target container group to the second reference target node; and the second reference target node is the target node with the highest second target score in the target nodes of the container groups with low priorities.
In one possible design, the method further includes determining the allocated rate of resources by:
for each target node, determining the actual total amount of resources of the target node based on the resource over-sale coefficient of the target node;
and determining the allocated rate of the resources according to the actual total amount of the resources and the allocated resources.
In one possible design, the resource over-selling coefficient comprises a CPU resource over-selling coefficient and a memory resource over-selling coefficient, and the actual resource total amount comprises an actual CPU resource total amount and an actual memory resource total amount; the determining module is specifically configured to:
determining the actual total amount of the CPU resource according to the CPU resource over-selling coefficient and the initial total amount of the CPU resource of the target node; and determining the actual total amount of the memory resources according to the memory resource over-selling coefficient and the initial total amount of the memory resources of the target node.
In one possible design, the determining module is specifically configured to:
determining the allocated rate of the first resource according to the actual total amount of the CPU resource and the allocated amount of the CPU resource; determining the allocated rate of the second resource according to the actual total amount of the memory resources and the allocated amount of the memory resources;
and weighting the first resource allocation rate and the second resource allocation rate to obtain the resource allocation rate.
In one possible design, the determining module is specifically configured to:
aiming at the first resource, determining the allocation rate of the first resource according to the allocated first resource amount of the current target node and the initial first resource total amount;
determining a resource over-selling coefficient of the first resource according to the allocation rate of the first resource, the resource utilization rate of the first resource and the weight corresponding to the resource utilization rate;
the first resource is a CPU resource or a memory resource.
In one possible design, the determining module is specifically configured to determine the resource over-selling coefficient by the following formula:
Figure BDA0003946097900000191
wherein, F is a resource over-selling coefficient, and a is the current resource allocation rate;
Figure BDA0003946097900000192
for the resource utilization rate, use _ i is the resource utilization rate of the ith preset time period, weight _ i is the weight of the resource utilization rate of the ith preset time period, and h is the number of the preset time periods.
The resource scheduling apparatus and the resource scheduling method provided by the embodiments of the application are based on the same inventive concept, can obtain the same beneficial effects, and are not described again.
Based on the same inventive concept as the resource scheduling method, the embodiment of the application further provides a resource scheduling system. The resource scheduling system is deployed on the container cloud in the form of software and includes a scheduler and at least one component; the scheduler processes the data monitored by the at least one component to implement the steps of the resource scheduling method of the above embodiment. The functions of the scheduler and components can be seen in fig. 1 and are not detailed here.
Based on the same inventive concept as the resource scheduling method, the embodiment of the present application further provides a resource scheduling device, which may be a cloud server, for example, a container cloud. As shown in fig. 7, the resource scheduling device may include a processor 701 and a memory 702, and the resource scheduling system may be disposed on the processor 701.
The Processor 701 may be a general-purpose Processor, such as a Central Processing Unit (CPU), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, a discrete Gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logic blocks disclosed in the embodiments of the present Application. The general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in a processor.
Memory 702, which is a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The Memory may include at least one type of storage medium, and may include, for example, a flash Memory, a hard disk, a multimedia card, a card-type Memory, a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Programmable Read Only Memory (PROM), a Read Only Memory (ROM), a charged Erasable Programmable Read Only Memory (EEPROM), a magnetic Memory, a magnetic disk, an optical disk, and so on. The memory is any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to such. The memory 702 in the embodiments of the present application may also be circuitry or any other device capable of performing a storage function for storing program instructions and/or data.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; the computer storage media may be any available media or data storage device that can be accessed by a computer, including but not limited to: a mobile storage device, a Random Access Memory (RAM), a magnetic Memory (e.g., a flexible disk, a hard disk, a magnetic tape, a magneto-optical disk (MO), etc.), an optical Memory (e.g., a CD, a DVD, a BD, an HVD, etc.), and a semiconductor Memory (e.g., a ROM, an EPROM, an EEPROM, a nonvolatile Memory (NAND FLASH), a Solid State Disk (SSD)) and various media that can store program codes.
Alternatively, the integrated unit described above may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a separate product. Based on such understanding, the technical solutions of the embodiments of the present application or portions thereof that contribute to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods of the embodiments of the present application. And the aforementioned storage medium includes: various media that can store program codes include a removable Memory device, a Random Access Memory (RAM), a magnetic Memory (e.g., a flexible disk, a hard disk, a magnetic tape, a magneto-optical disk (MO), etc.), an optical Memory (e.g., a CD, a DVD, a BD, an HVD, etc.), and a semiconductor Memory (e.g., a ROM, an EPROM, an EEPROM, a nonvolatile Memory (NAND FLASH), a Solid State Disk (SSD)).
The above embodiments are only used to describe the technical solutions of the present application in detail, but the above embodiments are only used to help understanding the method of the embodiments of the present application, and should not be construed as limiting the embodiments of the present application. Modifications and substitutions that may be readily apparent to those skilled in the art are intended to be included within the scope of the embodiments of the present application.

Claims (12)

1. A method for scheduling resources, comprising:
acquiring a first resource utilization rate of each node; wherein the node is configured to provide resources to the container group according to a scheduling request of the container group deployed at the node;
for each target node, applying a scoring mechanism matched with the priority of a target container group, and scoring the target node according to the second resource utilization rate and the resource allocation rate of the target node; the target node is obtained by screening each node according to a first utilization rate threshold, and the first resource utilization rate of the target node is smaller than or equal to the first utilization rate threshold;
and realizing resource scheduling according to the priority of the target container group and the scores of all the target nodes corresponding to the priority.
2. The resource scheduling method according to claim 1, wherein the first resource utilization rate includes a CPU utilization rate and a memory utilization rate determined from the resource usage of the container groups that the node has deployed; the first utilization threshold includes a first CPU utilization rate threshold and a first memory utilization rate threshold.
3. The method according to claim 2, wherein the screening the nodes according to the first utilization threshold comprises:
deleting nodes of which the CPU utilization rate is greater than the first CPU utilization rate threshold value in each node;
and deleting the nodes of which the memory utilization rate is greater than the first memory utilization rate threshold value in each node.
4. The method according to claim 1, wherein if the priority of the target container group is high, the scoring mechanism matched with the priority of the target container group is a first scoring mechanism; and if the priority of the target container group is medium or low, the scoring mechanism matched with the priority of the target container group is a second scoring mechanism;
in the first scoring mechanism, the second resource utilization rate is negatively correlated with the score, and the resource allocated rate is negatively correlated with the score; in the second scoring mechanism, the second resource utilization rate is negatively correlated with the score, and the resource allocated rate is positively correlated with the score; and the second resource utilization rate is obtained by weighting the CPU utilization rate and the memory utilization rate.
5. The method according to claim 4, wherein applying a scoring mechanism matching with the priority of the target container group to score the target node according to the second resource utilization rate and the resource allocation rate of the target node comprises:
applying the first scoring mechanism to score the target node according to the second resource utilization rate of the target node to obtain a first score; scoring the target node according to the resource allocation rate to obtain a second score; weighting the first score and the second score to obtain a first target score of the target node under the condition that the priority of the target container group is high;
applying the second scoring mechanism to score the target node according to the second resource utilization rate of the target node to obtain a third score; scoring the target node according to the resource allocation rate to obtain a fourth score; and weighting the third score and the fourth score to obtain a second target score of the target node under the condition that the priority of the target container group is medium or low.
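The two scoring mechanisms of claims 4–5 can be sketched as a single function; the intuition is that high-priority workloads are spread across lightly loaded nodes, while medium/low-priority workloads are packed onto already-allocated nodes. The equal weights and the linear `1 - x` scoring are assumptions for illustration only.

```python
def score_node(second_util, alloc_rate, priority, w_util=0.5, w_alloc=0.5):
    """Score one target node.

    First mechanism (high priority): both the second resource
    utilization rate and the resource allocated rate are negatively
    correlated with the score.
    Second mechanism (medium/low priority): utilization is negatively
    correlated, but the allocated rate is positively correlated.
    """
    util_score = 1.0 - second_util       # lower utilization -> higher score
    if priority == "high":
        alloc_score = 1.0 - alloc_rate   # prefer lightly allocated nodes
    else:
        alloc_score = alloc_rate         # prefer already-allocated nodes
    # weighting the two sub-scores yields the (first or second) target score
    return w_util * util_score + w_alloc * alloc_score
```

For example, with equal weights a high-priority group scores a node with 30% utilization and 20% allocation higher than one with 30% utilization and 80% allocation, while a medium-priority group ranks them the other way around.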
6. The method according to claim 5, wherein if the priority of the target container group is high, implementing resource scheduling according to the priority of the target container group and the score of each target node corresponding to the priority comprises:
under the condition that the second resource utilization rates of N target nodes are less than or equal to a second utilization rate threshold and their resource allocation rates are less than or equal to an allocation rate threshold, scheduling the target container group to the target node with the highest first target score among the N target nodes; wherein N is an integer greater than or equal to 1 and less than or equal to the total number of the target nodes;
under the condition that the resource allocation rate of each target node is greater than the allocation rate threshold, scheduling the target container group to a target node with the lowest second resource utilization rate in each target node;
under the condition that the second resource utilization rate of each target node is greater than the second utilization rate threshold value, deleting the container group with low priority in the first reference target node, and scheduling the target container group to the first reference target node; the first reference target node is a target node with the highest first target score in the target nodes of the container groups with low priorities.
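The three-way decision of claim 6 for a high-priority container group can be sketched as follows. The dictionary fields and the eviction marker are hypothetical names; the sketch also assumes at least one target node hosts a low-priority container group when the eviction branch is reached.

```python
def schedule_high_priority(targets, util_threshold, alloc_threshold):
    """Claim 6 sketch: pick a node for a high-priority container group.

    1. If some nodes are under both thresholds, take the highest
       first-target-score node among them.
    2. If every node's allocated rate exceeds the threshold, take the
       node with the lowest second resource utilization rate.
    3. If every node's utilization exceeds the threshold, evict a
       low-priority container group from the best-scoring node that
       hosts one (the first reference target node), then place there.
    """
    ok = [n for n in targets
          if n["util"] <= util_threshold and n["alloc"] <= alloc_threshold]
    if ok:
        return max(ok, key=lambda n: n["score"]), None
    if all(n["alloc"] > alloc_threshold for n in targets):
        return min(targets, key=lambda n: n["util"]), None
    with_low = [n for n in targets if n["has_low_priority_pods"]]
    victim = max(with_low, key=lambda n: n["score"])
    return victim, "evict_low_priority"
```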
7. The method according to claim 5, wherein if the priority of the target container group is medium or low, implementing resource scheduling according to the priority of the target container group and the score of each target node comprises:
under the condition that the second resource utilization rate of M target nodes is smaller than or equal to a second utilization rate threshold value and the resource allocation rate is smaller than or equal to an allocation rate threshold value, scheduling the target container group to a target node with the highest second target score in the M target nodes; wherein M is an integer greater than or equal to 1 and less than or equal to the total number of the target nodes;
under the condition that the second resource utilization rate of each target node is greater than a second utilization rate threshold, deleting a low-priority container group in a second reference target node aiming at the target container group with the medium priority, and scheduling the target container group with the medium priority to the second reference target node; and controlling the target container group with low priority to wait for scheduling, wherein the second reference target node is a target node with the highest second target score in the target nodes of each container group with low priority.
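Claim 7's behavior differs from claim 6 in the overloaded case: a medium-priority group may still evict a low-priority group, while a low-priority group simply waits. A hedged sketch, with all field names assumed:

```python
def schedule_med_low(targets, pod_priority, util_threshold, alloc_threshold):
    """Claim 7 sketch: place a medium- or low-priority container group.

    Under-threshold nodes: take the highest second target score.
    All nodes over the utilization threshold: a medium-priority group
    evicts a low-priority group from the best-scoring node hosting one
    (the second reference target node); a low-priority group waits.
    """
    ok = [n for n in targets
          if n["util"] <= util_threshold and n["alloc"] <= alloc_threshold]
    if ok:
        return max(ok, key=lambda n: n["score"])
    if pod_priority == "medium":
        with_low = [n for n in targets if n["has_low_priority_pods"]]
        return max(with_low, key=lambda n: n["score"])  # evict, then place
    return None  # low priority: wait for scheduling
```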
8. The method according to any of claims 1 to 7, wherein the resource allocated rate is determined by:
for each target node, determining the actual total amount of resources of the target node based on the resource over-selling coefficient of the target node;
and determining the allocated rate of the resources according to the actual total amount of the resources and the allocated resources.
9. The method according to claim 8, wherein the resource over-selling coefficient comprises a CPU resource over-selling coefficient and a memory resource over-selling coefficient, and the actual total amount of resources comprises an actual total amount of CPU resources and an actual total amount of memory resources; and the determining the actual total amount of resources of the target node based on the resource over-selling coefficient of the target node comprises:
determining the actual total amount of the CPU resources according to the CPU resource over-selling coefficient and the initial total amount of the CPU resources of the target node; and determining the actual total amount of the memory resources according to the memory resource over-selling coefficient and the initial total amount of the memory resources of the target node.
10. The method according to claim 9, wherein the determining the allocated rate of the resources according to the actual total amount of the resources and the allocated resources comprises:
determining a first resource allocation rate according to the actual total amount of CPU resources and the allocated amount of CPU resources; and determining a second resource allocation rate according to the actual total amount of memory resources and the allocated amount of memory resources;
and weighting the first resource allocation rate and the second resource allocation rate to obtain the resource allocation rate.
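The allocated-rate computation of claims 8–10 can be worked through numerically. The actual total is the nominal capacity scaled by the per-resource over-selling coefficient, and the two per-resource rates are then weighted together; the equal weights and field names here are illustrative assumptions.

```python
def allocated_rate(node, w_cpu=0.5, w_mem=0.5):
    """Claims 8-10 sketch: resource allocated rate of a target node."""
    # actual totals scale nominal capacity by the over-selling coefficients
    actual_cpu = node["cpu_total"] * node["cpu_oversell"]
    actual_mem = node["mem_total"] * node["mem_oversell"]
    cpu_rate = node["cpu_allocated"] / actual_cpu   # first resource allocation rate
    mem_rate = node["mem_allocated"] / actual_mem   # second resource allocation rate
    return w_cpu * cpu_rate + w_mem * mem_rate

node = {"cpu_total": 8,  "cpu_oversell": 2.0, "cpu_allocated": 8,
        "mem_total": 16, "mem_oversell": 1.5, "mem_allocated": 12}
# cpu_rate = 8 / (8 * 2.0) = 0.5; mem_rate = 12 / (16 * 1.5) = 0.5
# weighted resource allocated rate = 0.5
```

Note that over-selling lets the allocated rate stay below 1 even when the nominal capacity (8 CPUs here) is fully allocated.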
11. A resource scheduling apparatus, comprising:
the acquisition module is used for acquiring the first resource utilization rate of each node; wherein the node is configured to provide resources to the group of containers according to a scheduling request of the group of containers deployed at the node;
the scoring module is used for applying a scoring mechanism matched with the priority of the target container group aiming at each target node and scoring the target node according to the second resource utilization rate and the resource allocation rate of the target node; the target node is obtained by screening each node according to a first utilization rate threshold, and the first resource utilization rate of the target node is smaller than or equal to the first utilization rate threshold;
and the scheduling module is used for realizing resource scheduling according to the priority of the target container group and the scores of all the target nodes corresponding to the priority.
12. A resource scheduling system comprising a scheduler and at least one component, the scheduler being configured to process data monitored by the at least one component to implement the steps of the method of any one of claims 1 to 10.
CN202211433837.6A 2022-11-16 2022-11-16 Resource scheduling method and related device Pending CN115729671A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211433837.6A CN115729671A (en) 2022-11-16 2022-11-16 Resource scheduling method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211433837.6A CN115729671A (en) 2022-11-16 2022-11-16 Resource scheduling method and related device

Publications (1)

Publication Number Publication Date
CN115729671A true CN115729671A (en) 2023-03-03

Family

ID=85295992

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211433837.6A Pending CN115729671A (en) 2022-11-16 2022-11-16 Resource scheduling method and related device

Country Status (1)

Country Link
CN (1) CN115729671A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116610425A (en) * 2023-03-25 2023-08-18 北京科乐园网络科技有限公司 Resource scheduling method, device, equipment and computer readable storage medium
CN117971505A (en) * 2024-03-29 2024-05-03 苏州元脑智能科技有限公司 Method and device for deploying container application
CN117971505B (en) * 2024-03-29 2024-06-07 苏州元脑智能科技有限公司 Method and device for deploying container application
CN118158222A (en) * 2024-05-11 2024-06-07 中移(苏州)软件技术有限公司 Load balancer deployment method, device, electronic equipment, storage medium and product

Similar Documents

Publication Publication Date Title
CN112199194B (en) Resource scheduling method, device, equipment and storage medium based on container cluster
CN115729671A (en) Resource scheduling method and related device
US9043787B2 (en) System and method for automated assignment of virtual machines and physical machines to hosts
US10623481B2 (en) Balancing resources in distributed computing environments
US9396008B2 (en) System and method for continuous optimization of computing systems with automated assignment of virtual machines and physical machines to hosts
US9152443B2 (en) System and method for automated assignment of virtual machines and physical machines to hosts with right-sizing
US20190317826A1 (en) Methods and systems for estimating time remaining and right sizing usable capacities of resources of a distributed computing system
US10789102B2 (en) Resource provisioning in computing systems
CN111966500B (en) Resource scheduling method and device, electronic equipment and storage medium
US11693698B2 (en) System and method for infrastructure scaling
US20180198855A1 (en) Method and apparatus for scheduling calculation tasks among clusters
US20140019964A1 (en) System and method for automated assignment of virtual machines and physical machines to hosts using interval analysis
US11502972B2 (en) Capacity optimization in an automated resource-exchange system
CN109783237A (en) A kind of resource allocation method and device
WO2016205978A1 (en) Techniques for virtual machine migration
CN111861412A (en) Completion time optimization-oriented scientific workflow scheduling method and system
CN114356543A (en) Kubernetes-based multi-tenant machine learning task resource scheduling method
Buyya et al. Cost-efficient orchestration of containers in clouds: a vision, architectural elements, and future directions
WO2018128998A1 (en) Secure intelligent networked architecture with dynamic feedback
CN110086726A (en) A method of automatically switching Kubernetes host node
CN114625500A (en) Method and application for scheduling micro-service application based on topology perception in cloud environment
CN107203256B (en) Energy-saving distribution method and device under network function virtualization scene
CN113760549B (en) Pod deployment method and device
Wu et al. ABP scheduler: Speeding up service spread in docker swarm
CN116157778A (en) System and method for hybrid centralized and distributed scheduling on shared physical hosts

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination