CN112799817A - Micro-service resource scheduling system and method - Google Patents

Micro-service resource scheduling system and method

Info

Publication number
CN112799817A
Authority
CN
China
Prior art keywords
micro
service
resource
parameters
resources
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110143249.8A
Other languages
Chinese (zh)
Inventor
刘磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CN202110143249.8A
Publication of CN112799817A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 Task transfer initiation or dispatching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F 9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The embodiment of the invention provides a micro-service resource scheduling system and method. The system comprises: control logic for allocating idle resources to a newly added micro-service to be deployed so as to trial-run the newly added micro-service; a parameter acquisition module for acquiring running-state parameters during the trial run of the newly added micro-service and preprocessing a plurality of those parameters according to a predetermined rule to obtain first state input data for the trial run; and a pre-trained first neural network model with which the invention predicts the resource cliff and optimal resource allocation region of the newly added micro-service, the optimal resource allocation region thereby providing a more accurate reference for the newly added micro-service's resource allocation.

Description

Micro-service resource scheduling system and method
Technical Field
The present invention relates to the field of server resource scheduling, in particular to quality-of-service (QoS)-driven resource scheduling oriented to micro-services, and more particularly to a micro-service resource scheduling system and method.
Background
As cloud computing enters a new era, cloud services are shifting from monolithic designs to large numbers of loosely coupled micro-services that collectively serve end users. Micro-services have grown rapidly since 2018 and now dominate modern cloud environments. Most cloud providers, including Alibaba, Amazon, Facebook, and Google, deploy micro-services to improve the scalability, functionality, and reliability of their cloud systems.
To improve cost efficiency, multiple micro-services typically run on one server, so runtime resource scheduling becomes the key to guaranteeing their quality of service (QoS). However, as server resources such as cores, cache, and bandwidth grow in number and micro-services become increasingly diverse, the scheduling exploration space expands rapidly, and conventional schedulers cannot keep up with rapidly changing service requirements. In addition, although modern servers have more core and memory resources than ever before, these resources are not fully utilized in current cloud environments. For example, Google's data centers showed a CPU utilization of about 45% to 53% and a memory utilization between 25% and 77% over 25 days, whereas an Alibaba cluster showed a CPU utilization of 18% to 40% and a memory utilization of 42% to 60% over 12 hours, indicating that a large amount of resources remains unexploited.
Some previous studies designed clustering methods that distribute the last-level cache or memory bandwidth among single-threaded applications. However, they are not suitable for micro-services, which typically have many concurrent threads and strict quality-of-service (QoS) constraints. Furthermore, these studies rely on performance models, which introduce high scheduling overhead at runtime and porting difficulties, and designing an accurate model is itself a challenging task. The most advanced current research uses either a heuristic algorithm that increases or decreases one resource at a time and observes the performance change (representative work: PARTIES, proposed by Shuang Chen, which partitions resources such as cache, main memory, I/O, network, and disk bandwidth), or a relatively simple learning-based algorithm such as Bayesian optimization (representative work: CLITE, proposed by Tirthak Patel, which feeds the runtime parameters and QoS of the micro-services into a Bayesian optimizer to predict the scheduling policy). These existing methods cannot handle diverse user needs in a timely manner on modern devices with ever more parallel computing units and complex memory hierarchies. Taking the scheduling of 5 micro-services as an example, the average scheduling overhead of PARTIES and CLITE exceeds 20 seconds under certain QoS constraints, which is too high for delay-sensitive tasks and may cause losses for the corresponding users; for financial micro-services, for instance, excessive delay may cause significant financial loss. The current resource scheduling technology for micro-services is therefore not ideal, and the prior art needs improvement.
Disclosure of Invention
Therefore, the present invention aims to overcome the above drawbacks of the prior art and to provide a micro-service resource scheduling system and method.
The object of the invention is achieved by the following technical solutions:
according to a first aspect of the present invention, there is provided a micro service resource scheduling system, comprising: the control logic is used for allocating idle resources for the newly-added microservices to be deployed so as to perform trial run on the newly-added microservices; the parameter acquisition module is used for acquiring operation state parameters during the trial operation of the newly added micro-service and preprocessing a plurality of parameters in the operation state parameters according to a preset rule to obtain first state input data during the trial operation of the newly added micro-service; the pre-trained first neural network model is used for processing first state input data of the newly-added micro service in trial operation to predict a resource cliff and an optimal resource allocation area of the newly-added micro service, wherein the first neural network model is obtained by training by using operating state parameters of one or more micro services under different resource allocation schemes and the resource cliff and the optimal resource allocation area of the corresponding micro service as sample data.
In some embodiments of the invention, the first neural network model is pre-trained in the following manner: acquiring running state parameters of one or more micro services under different resource allocation schemes, and preprocessing a plurality of parameters according to a preset rule to obtain a plurality of pieces of first state input data for training; acquiring a resource cliff and an optimal resource allocation area of the corresponding micro service marked for each piece of first state input data for training as training labels; and performing multi-round training on the first neural network model by using the plurality of pieces of first state input data for training and the corresponding training labels until convergence, so as to obtain the pre-trained first neural network model.
In some embodiments of the present invention, the plurality of parameters preprocessed to obtain the first state input data for the trial run of the newly added micro-service, or to obtain each piece of first state input data for training, comprise a system state parameter and a first set of micro-service state parameters.
In some embodiments of the invention, the system status parameter is a status parameter of a currently scheduled server system, including: one or more of a CPU utilization parameter, a CPU core frequency parameter, an instruction number per clock cycle, a cache miss parameter, a memory bandwidth parameter, and a memory utilization parameter.
In some embodiments of the present invention, the first set of micro-service state parameters are state parameters of the newly added micro-service itself during the trial run, including: a virtual memory occupancy parameter, a physical memory occupancy parameter, the number of allocated CPU cores, the number of allocated LLC ways, and the size of the currently occupied LLC.
In some embodiments of the invention, the resource cliffs of the newly added micro-service include a CPU core cliff and an LLC cliff.
In some embodiments of the present invention, the optimal resource allocation region includes CPU core resource allocation under a CPU core priority condition, LLC allocation under a CPU core priority condition, CPU core resource allocation under a cache priority condition, and LLC allocation under a cache priority condition.
In some embodiments of the present invention, the parameter acquisition module is further configured to acquire running-state parameters during the trial run and preprocess a plurality of them according to a predetermined rule to obtain second state input data for the trial run. The micro-service resource scheduling system further includes a pre-trained second neural network model for, in response to invocation by the control logic when current system resources cannot meet the resource requirements of the newly added micro-service, processing the second state input data to predict a resource deprivation scheme for one or more neighbor micro-services on the same server under which those micro-services still meet their own quality-of-service requirements; the second neural network model is trained using, as training samples, running-state parameters of one or more micro-services under different resource allocation schemes and the corresponding resources that can be deprived from those micro-services while they still meet their own quality-of-service requirements. The control logic is further configured to deprive the one or more neighbor micro-services of resources and allocate them to the newly added micro-service when the resources already allocated to the newly added micro-service, together with the deprivable resources of the one or more neighbor micro-services determined according to the resource deprivation scheme, can meet its resource requirements.
In some embodiments of the invention, the second neural network model is pre-trained in the following manner: acquiring running state parameters of one or more micro services under different resource allocation schemes, and preprocessing a plurality of parameters according to a preset rule to obtain a plurality of pieces of second state input data for training; acquiring a resource deprivation scheme of each corresponding micro service marked by each piece of second state input data for training under the condition that the corresponding micro service meets the self service quality requirement as a training label; and performing multi-round training on a second neural network model by using the plurality of pieces of second state input data for training and the corresponding training labels until convergence, so as to obtain the pre-trained second neural network model.
In some embodiments of the present invention, the plurality of parameters preprocessed to obtain the second state input data, or to obtain each piece of second state input data for training, comprise a system state parameter and a second set of micro-service state parameters.
In some embodiments of the invention, the second set of micro-service state parameters includes: a virtual memory occupancy parameter, a physical memory occupancy parameter, the number of allocated CPU cores, the number of allocated LLC ways, the size of the currently occupied LLC, the expected QoS degradation percentage, the number of CPU cores occupied by neighbor micro-services on the same server, the number of LLC ways occupied by neighbor micro-services on the same server, and the memory bandwidth occupied by neighbor micro-services on the same server.
In some embodiments of the invention, the resource deprivation scheme comprises: the target number of CPU cores, the target number of LLC ways, the target number of CPU cores under the CPU-core-priority condition, the target number of LLC ways under the CPU-core-priority condition, the target number of CPU cores under the cache-priority condition, and the target number of LLC ways under the cache-priority condition.
In some embodiments of the present invention, the parameter acquisition module is further configured to acquire running-state parameters during the trial run and preprocess a plurality of them according to a predetermined rule to obtain second shadow state input data for the trial run. The micro-service resource scheduling system further comprises a pre-trained second shadow neural network model for, in response to invocation by the control logic when any target micro-service needs more resources and neither idle resources nor the deprivable resources of one or more neighbor micro-services can meet its resource demand, processing the second shadow state input data to predict, for a plurality of resource sharing schemes, the QoS degradation percentages of a specific neighbor micro-service and of the target micro-service after sharing resources; the second shadow neural network model is trained using, as training samples, the running-state parameters of one or more micro-services under the plurality of resource sharing schemes and the corresponding QoS degradation percentages. The control logic is further configured to select an acceptable resource sharing scheme from the plurality of schemes, preferring in order those involving fewer micro-services and smaller QoS degradation percentages, and to configure shared resources for the specific neighbor micro-service and the target micro-service according to the accepted scheme; a sketch of this preference order follows below.
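For illustration only, the preference order just described can be sketched as follows; the data structure, field names, and acceptability threshold are assumptions for this sketch, not the patented implementation:

```python
# Sketch: pick a resource sharing scheme, preferring schemes that involve
# fewer micro-services and then a smaller predicted QoS degradation.
from typing import List, NamedTuple, Optional

class SharingScheme(NamedTuple):
    shared_cores: int            # CPU cores shared with the target micro-service
    shared_llc_ways: int         # LLC ways shared with the target micro-service
    affected_services: List[str] # neighbor micro-services touched by this scheme
    predicted_qos_drop: float    # QoS degradation percentage from the model

def pick_sharing_scheme(schemes: List[SharingScheme],
                        max_qos_drop: float = 5.0) -> Optional[SharingScheme]:
    # Sort by (number of affected micro-services, predicted QoS drop) and
    # return the first scheme whose degradation is acceptable.
    for scheme in sorted(schemes, key=lambda s: (len(s.affected_services),
                                                 s.predicted_qos_drop)):
        if scheme.predicted_qos_drop <= max_qos_drop:
            return scheme
    return None  # no acceptable sharing scheme exists
```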
In some embodiments of the invention, the second shadow neural network model is pre-trained in the following manner: acquiring running-state parameters of one or more micro-services under multiple resource sharing schemes and preprocessing the parameters according to a predetermined rule to obtain a plurality of pieces of second shadow state input data for training; acquiring the corresponding QoS degradation percentage labeled for each piece of second shadow state input data for training as a training label; and performing multiple rounds of training on the second shadow neural network model with the pieces of second shadow state input data for training and the corresponding training labels until convergence, so as to obtain the pre-trained second shadow neural network model.
In some embodiments of the invention, the plurality of parameters preprocessed to obtain the second shadow state input data, or to obtain each piece of second shadow state input data for training, comprise system state parameters and a second set of micro-service shadow state parameters.
In some embodiments of the invention, the second set of micro-service shadow state parameters includes: a virtual memory occupancy parameter, a physical memory occupancy parameter, the number of allocated CPU cores, the number of allocated LLC ways, the size of the currently occupied LLC, the number of shared CPU cores, the number of shared LLC ways, the number of CPU cores occupied by neighbor micro-services on the same server, the number of LLC ways occupied by neighbor micro-services on the same server, and the memory bandwidth occupied by neighbor micro-services on the same server.
In some embodiments of the present invention, the micro-service resource scheduling system further includes a third neural network model trained in advance, wherein the control logic is further configured to invoke the parameter obtaining module when the control logic detects that a QoS violation occurs for a specific micro-service; the parameter obtaining module is further used for responding to the calling of the control logic to obtain the running state parameters of the specific micro service when the QoS violation occurs, and preprocessing a plurality of parameters in the running state parameters according to a preset rule to obtain third state data when the QoS violation occurs; the pre-trained third neural network model is used for processing the third state data when the QoS violation occurs so as to predict resources needing to be added to the specific micro service; and the control logic is further configured to add the resource to the specific micro service if the currently available resource satisfies the resource that needs to be added to the specific micro service.
In some embodiments of the present invention, the control logic is further configured to invoke the parameter acquisition module when it detects that a specific micro-service has been over-allocated resources. The parameter acquisition module is further configured, in response to that invocation, to acquire the running-state parameters of the specific micro-service at the time the over-allocation is detected and to preprocess a plurality of them according to a predetermined rule to obtain third state data for that moment. The pre-trained third neural network model is further configured to process this third state data to predict the excess resources of the specific micro-service, and the control logic is further configured to adjust the resource configuration of the specific micro-service based on the prediction so as to reclaim the excess resources from it.
In some embodiments of the present invention, the plurality of parameters for obtaining the third state data through preprocessing include a system state parameter and a third set of micro service state parameters, where the third set of micro service state parameters includes a number of allocated CPU cores, a number of allocated LLC ways, a currently occupied LLC size parameter, and a micro service response delay.
In some embodiments of the present invention, the third neural network model may be a reinforcement learning model, trained in advance as follows: acquiring a pre-collected training set in which each sample comprises initial third state data, an action, next third state data, and a reward value; and performing multiple rounds of training on the third neural network model with this training set until convergence to obtain the pre-trained third neural network model. The initial third state data are the running-state parameters when a QoS violation or resource over-allocation occurs for the micro-service; the action is a resource-addition or resource-reclamation action; the next third state data are the running-state parameters collected after applying that action to the initial third state; and the reward value is a score of the action.
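A minimal sketch of how such a training sample might be organized follows; the field layouts and the reward shaping are assumptions made for illustration, as the patent only fixes the four components of a sample:

```python
# Hypothetical layout of one reinforcement-learning sample for the third
# neural network model: (state, action, next_state, reward).
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Transition:
    state: List[float]      # third state data at the QoS violation or
                            # resource over-allocation event
    action: Tuple[int, int] # assumed encoding: (delta CPU cores, delta LLC ways);
                            # positive values add resources, negative reclaim them
    next_state: List[float] # running-state parameters after applying the action
    reward: float           # score of the action

def reward_fn(qos_met: bool, cores_used: int, llc_ways_used: int) -> float:
    # Assumed shaping: reward restoring QoS, penalize resource consumption.
    return (1.0 if qos_met else -1.0) - 0.01 * (cores_used + llc_ways_used)
```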
According to a second aspect of the present invention, there is provided a method for scheduling micro service resources, implemented by a micro service resource scheduling system according to the first aspect, including: allocating idle resources for newly-added micro services to be deployed so as to perform trial run on the newly-added micro services; acquiring running state parameters during the trial running period, and preprocessing a plurality of parameters according to a preset rule to obtain first state input data during the trial running of the newly-added micro service; and processing first state input data of the newly added micro service during trial operation through a pre-trained first neural network model to predict a resource cliff and an optimal resource allocation area of the newly added micro service.
In some embodiments of the present invention, the method further includes: when current system resources cannot meet the resource requirements of the newly added micro-service, processing the second state input data from its trial run with the pre-trained second neural network model to predict a resource deprivation scheme under which one or more neighbor micro-services on the same server still meet their own quality-of-service requirements; and, when the resources already allocated to the newly added micro-service during the trial run, together with the deprivable resources of the one or more neighbor micro-services determined by the resource deprivation scheme, can meet its resource requirements, depriving the one or more neighbor micro-services of those resources and allocating them to the newly added micro-service through the control logic.
In some embodiments of the present invention, the method further includes: when any target micro-service needs more resources and neither idle resources nor the deprivable resources of one or more neighbor micro-services can meet its resource demand, processing second shadow state input data with the pre-trained second shadow neural network model to predict, for a plurality of resource sharing schemes, the QoS degradation percentages of a specific neighbor micro-service and of the target micro-service after sharing resources; selecting, by the control logic, an acceptable resource sharing scheme from the plurality of schemes, preferring in order those involving fewer micro-services and smaller QoS degradation percentages; and configuring shared resources for the specific neighbor micro-service and the target micro-service according to the accepted scheme.
In some embodiments of the present invention, the method for scheduling micro service resources further includes: when the control logic detects that the QoS violation occurs to a specific micro-service, processing third state data when the QoS violation occurs through a pre-trained third neural network model to predict resources needing to be added to the specific micro-service; and in the case that the current available resource meets the resource needing to be added to the specific micro service, adding the resource to the specific micro service through the control logic.
In some embodiments of the present invention, the method further includes: when the control logic detects that a specific micro-service has been over-allocated resources, processing, with the pre-trained third neural network model, the third state data obtained when the over-allocation is detected, to predict the excess resources of the specific micro-service; and adjusting, by the control logic, the resource configuration of the specific micro-service based on the predicted excess resources so as to reclaim them.
According to a third aspect of the present invention, there is provided an electronic apparatus comprising: one or more processors; and a memory, wherein the memory is to store one or more executable instructions; the one or more processors are configured to implement the steps of the method of the second aspect via execution of the one or more executable instructions.
Compared with the prior art, the invention has the advantages that:
the method comprises the steps that a first neural network model is trained in advance to learn the relation between a resource cliff of the micro service and an optimal resource allocation area of the micro service related to the QoS requirement of the micro service, wherein the running state parameters (such as instruction number (IPC) of each clock cycle, Cache Miss (Cache Miss), memory occupation and the like) are learned by the first neural network model, so that when the new micro service needs to be deployed, the resource cliff and the optimal resource allocation area of the new micro service are predicted through the first neural network model trained in advance, and therefore more accurate reference is provided for resource allocation of the new micro service through the optimal resource allocation area; the scheduling speed of the resource scheduling is 3.2-5.5 times that of the prior art, so that the resource scheduling is realized more quickly, and the user experience is improved; compared to the latest studies, the present invention supports higher loadings (up to 60%); according to the method, the resource cliff of the micro service is prevented from being dropped by the micro service in the subsequent resource allocation process according to the predicted resource cliff of the micro service, so that the QoS fluctuation of the micro service is better avoided, and the user experience is improved;
through the second neural network model, the invention can further obtain needed resources for the newly added micro-service from neighbor micro-services that already hold allocated resources, so it supports higher load; and because the QoS degradation percentage is taken into account, QoS fluctuation that depriving neighbors of resources might cause is better avoided, improving user experience;
the invention can efficiently process the complex resource sharing condition through the second shadow neural network model without generating a large amount of scheduling overhead;
the invention can correct under-allocation or over-allocation of resources through the third neural network model, making better use of server resources while guaranteeing quality of service.
Drawings
Embodiments of the invention are further described below with reference to the accompanying drawings, in which:
FIG. 1 is a block diagram of a micro-service resource scheduling system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an operating architecture of a micro service resource scheduling system according to an embodiment of the present invention;
FIG. 3 is a schematic workflow diagram of a micro-service resource scheduling system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail by embodiments with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As mentioned in the background section, existing research relies on performance models, which introduce high scheduling overhead at runtime and porting difficulties, and designing an accurate model is itself a challenging task. In addition, the prior art cannot detect the resource cliff, so quality-of-service fluctuation caused by the resource cliff cannot be avoided during resource allocation, and the prior art struggles to find the optimal resource allocation region accurately and quickly. To remedy these technical defects, avoid the resource cliff problem, and respond quickly to changes in the current running-state parameters, the present invention predicts the resource cliff and the optimal resource allocation region with a pre-trained first neural network model, so that the micro-service resource scheduling system of the present invention can automatically detect and avoid resource cliffs and allocate to each micro-service the resources required to meet its quality-of-service (QoS) requirements according to the optimal resource allocation region.
Before describing embodiments of the present invention in detail, some of the terms used therein will be explained as follows:
the micro service is that the traditional single application is split into services, and the services are decomposed into mutually independent micro services from the horizontal direction or the longitudinal direction according to business and function requirements. Each micro-service can independently run a plurality of instances, and the micro-services are independent from each other logically as much as possible.
The number of LLC ways refers to the number of ways in the last-level cache (LLC).
The number of CPU cores refers to the number of cores of a central processing unit (CPU).
Referring to FIG. 1, according to an embodiment of the present invention, there is provided a micro-service resource scheduling system comprising: control logic, a parameter acquisition module, and a pre-trained first neural network model. Preferably, the micro-service resource scheduling system further comprises at least one of a pre-trained second neural network model, a pre-trained second shadow neural network model, and a pre-trained third neural network model.
According to one embodiment of the invention, the control logic is connected to the parameter acquisition module, the pre-trained first neural network model, the pre-trained second neural network model, the pre-trained second shadow neural network model, and the pre-trained third neural network model. The control logic allocates idle resources to a newly added micro-service to be deployed so as to trial-run it, invokes the pre-trained neural network models to complete the corresponding prediction tasks, and executes resource scheduling according to the prediction results to meet the resource requirements of the corresponding micro-services. Mainstream server operating systems include Windows Server, NetWare, Unix, and Linux; the invention may be deployed on these server operating systems to schedule resources for one or more micro-services that need to be deployed, or are already deployed, on them. Taking Linux as an example, the kernel space of the current Linux operating system lacks support for machine learning libraries, so the control logic can be placed in the user space of the operating system and implemented in Python and C. To allocate LLC resources, the control logic may employ Intel Cache Allocation Technology (CAT), which controls the cache allocation and supports dynamic adjustment. To allocate CPU core resources, the control logic may use the Linux taskset utility to bind the micro-service to particular cores.
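As an illustrative sketch of how these two mechanisms can be driven from user space (the CLOS number, core list, way mask, and PID below are placeholder assumptions, not values from the patent; the intel-cmt-cat `pqos` tool and `taskset` must be installed):

```python
# Sketch: applying a resource allocation decision on Linux with taskset and
# Intel CAT via the pqos utility.
import subprocess
from typing import List

def allocate_cpu_cores(pid: int, cores: List[int]) -> None:
    core_list = ",".join(str(c) for c in cores)
    # Pin all threads of the micro-service process to the chosen cores.
    subprocess.run(["taskset", "-a", "-cp", core_list, str(pid)], check=True)

def allocate_llc_ways(clos: int, way_mask: int, pid: int) -> None:
    # Define the LLC way bitmask for one class of service (Intel CAT)...
    subprocess.run(["pqos", "-e", f"llc:{clos}={hex(way_mask)}"], check=True)
    # ...and associate the micro-service's PID with that class (OS interface).
    subprocess.run(["pqos", "-I", "-a", f"pid:{clos}={pid}"], check=True)

# Example (hypothetical PID): give PID 4321 three cores and four LLC ways.
# allocate_cpu_cores(4321, [2, 3, 4])
# allocate_llc_ways(1, 0b1111, 4321)
```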
According to an embodiment of the present invention, the parameter acquisition module is configured to acquire the running-state parameters of each running micro-service and/or preprocess them according to a predetermined rule. The module can capture online performance parameters of the micro-services through the PQoS tool and the Performance Monitoring Unit (PMU). It acquires the running-state parameters during the trial run and preprocesses a plurality of them according to the predetermined rule to obtain the first state input data for the trial run of the newly added micro-service.
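By way of illustration only, one common way to read PMU counters on Linux is the perf tool; the patent names PQoS and the PMU rather than perf, so the event list, sampling window, and parsing below are assumptions for this sketch:

```python
# Sketch: sampling PMU counters for one micro-service process with Linux perf.
import subprocess
from typing import Dict

def sample_pmu(pid: int, seconds: float = 1.0) -> Dict[str, float]:
    # With -x',' perf prints one machine-readable counter per line on stderr.
    result = subprocess.run(
        ["perf", "stat", "-x", ",",
         "-e", "instructions,cycles,cache-misses",
         "-p", str(pid), "--", "sleep", str(seconds)],
        capture_output=True, text=True, check=True)
    counters: Dict[str, float] = {}
    for line in result.stderr.splitlines():
        fields = line.split(",")
        # Skip "<not counted>" and malformed lines; keep numeric values only.
        if len(fields) >= 3 and fields[0].replace(".", "").isdigit():
            counters[fields[2]] = float(fields[0])
    return counters

# IPC = counters["instructions"] / counters["cycles"], one of the model inputs.
```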
According to one embodiment of the invention, the first neural network model is trained in advance to process the first state input data from the trial run of the newly added micro-service and predict its resource cliff and optimal resource allocation region. The model may be trained using, as sample data, the running-state parameters of one or more micro-services under different resource allocation schemes together with the resource cliff and optimal resource allocation region of the corresponding micro-service. According to one embodiment of the invention, the first neural network model is pre-trained as follows: acquiring the running-state parameters of one or more micro-services under different resource allocation schemes and preprocessing a plurality of them according to a predetermined rule to obtain a plurality of pieces of first state input data for training; acquiring, as training labels, the resource cliff and optimal resource allocation region labeled for each piece of first state input data for training; and performing multiple rounds of training on the first neural network model with these training inputs and their labels until convergence, thereby obtaining the pre-trained first neural network model. Preferably, the first neural network model may be a multi-layer perceptron, preferably with three layers, each layer being a set of nonlinear functions. The invention trains the first neural network model on an offline database composed of running-state parameters collected from one or more micro-services operating under different resource allocation schemes, until the model converges. During training, hyperparameter values can be tuned with a Bayesian optimization algorithm: the search range of the learning rate is set to 0.0001 to 0.1, the cross-validation loss serves as the objective function of the Bayesian optimization method, and after repeated exploration the learning rate that minimizes the cross-validation loss is selected as the final result. The loss function of the first neural network model is set to the MSE. Suppose the prediction is s = {s_1, s_2, ..., s_n} and the true values corresponding to the labels are y = {y_1, y_2, ..., y_n}. The loss function is:

L(s, y) = (1/n) · Σ_{t=1}^{n} (s_t - y_t)^2

where s_t denotes the t-th prediction and y_t the true value corresponding to the t-th label; that is, each prediction is subtracted from its label's true value and the squares are summed.
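A minimal training sketch consistent with the description above, assuming PyTorch as the framework and illustrative hidden-layer widths (the patent fixes neither; the 11-input/6-output shape matches the example vectors given later in this description):

```python
# Sketch: three-layer MLP trained with MSE, as described above (PyTorch assumed).
import torch
import torch.nn as nn

n_inputs, n_outputs = 11, 6   # 11 state parameters in, 6 predictions out

model = nn.Sequential(        # three fully connected layers with nonlinearities
    nn.Linear(n_inputs, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, n_outputs))

loss_fn = nn.MSELoss()        # L = (1/n) * sum_t (s_t - y_t)^2
# The learning rate would be tuned by Bayesian optimization in [0.0001, 0.1];
# a fixed value is used here purely for illustration.
opt = torch.optim.Adam(model.parameters(), lr=0.001)

def train_step(x: torch.Tensor, y: torch.Tensor) -> float:
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
    return loss.item()
```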
According to one embodiment of the invention, the one or more micro-services are, for example, one or more of Img-dnn, Masstree, Memcached, MongoDB, Moses, Nginx, Specjbb, Sphinx, Xapian, Login, and Advertising. The inventors ran experiments and collected training data based on one or more such micro-services, but it should be understood that, as the technology advances or user needs differ, users may in practice train the model with pre-collected running-state parameters of any type of micro-service following the idea of the present invention; the invention therefore places no limit on the type of micro-service.

The inventors found through experiments that a resource cliff phenomenon (RCliff for short) exists in the resource scheduling space, which not only hurts exploration efficiency and makes it difficult to converge to the optimal schedule, but also causes serious QoS fluctuation. The RCliff phenomenon means that, under certain resource allocations, depriving an application of even one core or one way of last-level cache degrades its performance in a cliff-like manner: removing only a small amount of resources causes a significant QoS drop. For example, for a micro-service already allocated 6 CPU cores and 10 LLC ways, depriving it of just 1 LLC way increases the response delay dramatically from 34 milliseconds to 4644 milliseconds. Resource cliffs comprise cache cliffs and core cliffs. The root cause of the cache cliff is locality; the root cause of the core cliff is queuing theory: when the request arrival rate exceeds what the available cores can serve, service requests pile up and delay rises sharply. The resource cliff can be labeled from response-delay experiments that measure how the micro-service's resource allocation affects its response delay. A resource cliff is the minimum resource requirement of the micro-service under the current operating conditions; going below it causes a significant QoS drop. Because different micro-services have different characteristics, a uniform criterion for a "significant QoS drop" cannot be given, and users need to define one according to their actual situation; generally, when removing a small amount of resources causes a large increase in response delay, that counts as a significant QoS drop. In practice, the optimal resource allocation region of a micro-service is labeled, from the response delays measured under different resource allocation schemes, as the optimal allocation satisfying the QoS requirement.

According to an embodiment of the present invention, the plurality of parameters preprocessed to obtain the first state input data for the trial run of the newly added micro-service, or to obtain each piece of first state input data for training, comprise a system state parameter and a first set of micro-service state parameters. Preferably, here and elsewhere in the invention, the system state parameters are state parameters of the currently scheduled server system, including one or more of: a CPU utilization parameter, a CPU core frequency parameter, the number of instructions per clock cycle, a cache miss parameter, a memory bandwidth parameter, and a memory utilization parameter.
Preferably, the first set of micro-service state parameters are state parameters of the newly added micro-service itself during the trial run, including: a virtual memory occupancy parameter, a physical memory occupancy parameter, the number of allocated CPU cores, the number of allocated LLC ways, and the size of the currently occupied LLC. The resource cliffs of the newly added micro-service comprise a CPU core cliff and an LLC cliff. When adjusting a micro-service's resources, the control logic avoids adjustments that would trap the micro-service on, or drop it off, a resource cliff and thereby cause QoS fluctuation. Being trapped on a resource cliff means the amount of some resource allocated to the micro-service is exactly equal to the amount defined by the resource cliff, for example the number of CPU cores or LLC ways exactly equals the cliff value. Dropping off a resource cliff means the amount of some allocated resource is below the amount defined by the resource cliff. The optimal resource allocation region comprises the CPU core allocation under the CPU-core-priority condition, the LLC allocation under the CPU-core-priority condition, the CPU core allocation under the cache-priority condition, and the LLC allocation under the cache-priority condition. According to an embodiment of the present invention, the cache miss parameter may be either the cache miss rate or the cache miss volume. The CPU core cliff is the minimum CPU core resource allocated to the newly added micro-service, and the LLC cliff is the minimum LLC resource allocated to it; values below either cliff cause the newly added micro-service's performance to degrade in a cliff-like manner. The optimal resource allocation region thus contains two resource allocation schemes: a first scheme under the CPU-core-priority condition (the CPU core allocation and LLC allocation under that condition) and a second scheme under the cache-priority condition (the CPU core allocation and LLC allocation under that condition). The control logic selects the final scheme according to the availability of CPU cores and LLC: for example, when both CPU core resources and LLC resources are sufficient, the control logic randomly selects one scheme from the optimal resource allocation region; when CPU core resources are more plentiful relative to LLC resources, it adopts the first scheme; otherwise it adopts the second scheme (see the sketch below). It should be noted that resource allocation in the present invention is mainly described with the number of CPU cores and the number of LLC ways as allocation targets, but this does not mean the invention applies only to those two resources; based on the idea of the invention, CPU cores and memory bandwidth, or LLC ways and memory bandwidth, may also serve as allocation targets, and the invention places no limitation on this.
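The selection rule just described can be sketched as follows; this is a simplified illustration in which the dictionary keys and the feasibility test are assumptions, and the cliff values are assumed to be expressed in the same units as the allocations:

```python
# Sketch: choose between the two schemes in the optimal resource allocation
# region, never allocating at or below the predicted resource cliff.
import random
from typing import Dict, Optional, Tuple

def choose_allocation(pred: Dict[str, float], free_cores: int,
                      free_llc_ways: int) -> Optional[Tuple[float, float]]:
    # pred holds the six outputs of the first neural network model.
    core_first  = (pred["cores_core_first"],  pred["llc_core_first"])
    cache_first = (pred["cores_cache_first"], pred["llc_cache_first"])

    def feasible(scheme: Tuple[float, float]) -> bool:
        cores, ways = scheme
        # Enough free resources, and strictly above both cliffs.
        return (cores <= free_cores and ways <= free_llc_ways
                and cores > pred["core_cliff"] and ways > pred["llc_cliff"])

    candidates = [s for s in (core_first, cache_first) if feasible(s)]
    if len(candidates) == 2:
        return random.choice(candidates)  # both resources are sufficient
    # None means idle resources do not suffice: fall through to deprivation.
    return candidates[0] if candidates else None
```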
According to one embodiment of the present invention, preprocessing is referred to in several places in this document and is described here once for simplicity. Preprocessing organizes the corresponding parameters in a specific order and normalizes them; the normalization may be max-min normalization (a sketch follows below). Preferably, the control logic also saves the resource cliff of each micro-service and, when adjusting a micro-service's resources, avoids adjustments that would bring its resources close to or below the resource cliff. Preferably, the control logic can also accept a user's resource allocation configuration for a given micro-service and allow that micro-service's resources to equal or approach the predicted cliff values during resource adjustment. In this way users can, as needed, configure resource allocations that sit on the resource cliff yet still meet QoS requirements, utilizing server resources more fully. The technical solution of this embodiment can achieve at least the following beneficial technical effects: the first neural network model is trained in advance to learn the relation between a micro-service's running-state parameters (such as instructions per clock cycle (IPC), cache misses, and memory occupancy) and the micro-service's resource cliff and QoS-related optimal resource allocation region, so that when a new micro-service needs to be deployed, its resource cliff and optimal resource allocation region are predicted by the pre-trained first neural network model, and the optimal region provides a more accurate reference for its resource allocation; scheduling is 3.2 to 5.5 times faster than the prior art, so resources are scheduled sooner and user experience improves; compared with the latest studies, higher load is supported (up to 60%); and using the predicted resource cliff keeps the micro-service from dropping off the cliff in subsequent resource allocation, better avoiding QoS fluctuation and improving user experience.
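A minimal sketch of the preprocessing described at the start of this paragraph; the parameter order and the min/max bounds are assumptions that would be fixed once before training:

```python
# Sketch: organize raw parameters in a fixed order, then max-min normalize.
from typing import Dict, List

PARAM_ORDER = [  # assumed order; must match the order used during training
    "cpu_util", "core_freq", "ipc", "cache_miss", "llc_occupancy",
    "mem_bw", "mem_util", "virt_mem", "phys_mem",
    "alloc_cores", "alloc_llc_ways",
]

def preprocess(raw: Dict[str, float],
               lo: Dict[str, float], hi: Dict[str, float]) -> List[float]:
    # Max-min normalization: (x - min) / (max - min), clipped to [0, 1].
    out = []
    for name in PARAM_ORDER:
        span = hi[name] - lo[name] or 1.0  # guard against a zero-width range
        out.append(min(max((raw[name] - lo[name]) / span, 0.0), 1.0))
    return out
```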
To further illustrate the form of the inputs and outputs of the first neural network model, an example is shown here with a sample organized in the specific order but not yet normalized. For instance, the input is x = (CPU utilization parameter, CPU core frequency parameter, instructions per clock cycle, cache miss parameter, memory bandwidth parameter, memory utilization, virtual memory occupancy parameter, physical memory occupancy parameter, number of allocated CPU cores, number of allocated LLC ways, currently occupied LLC size parameter). It should be appreciated that the order of these parameters may be defined by the user as desired before the first neural network model is trained; for example, it could also be defined as x = (CPU utilization parameter, CPU core frequency parameter, instructions per clock cycle, cache miss parameter, currently occupied LLC size parameter, memory bandwidth parameter, memory utilization, virtual memory occupancy parameter, physical memory occupancy parameter, number of allocated CPU cores, number of allocated LLC ways). Using the latter order, suppose x = (152.9, 1853.6, 0.55, 10179, 4550, 11711.9, 1.1, 1602012, 1363149, 3, 2); the running-state parameters obtained from the current trial run of the new micro-service are then: CPU utilization 152.9%, CPU core frequency 1853.6 MHz, 0.55 instructions per clock cycle, cache misses 10179 KB, currently occupied LLC size 4550 KB, memory bandwidth 11711.9 MB/s, memory utilization 1.1%, virtual memory occupancy 1602012 KB, physical memory occupancy 1363149 KB, 3 allocated CPU cores, and 2 allocated LLC ways. Correspondingly, the schematic output of the first neural network model is y = (CPU core cliff, LLC cliff, CPU core allocation under the CPU-core-priority condition, LLC allocation under the CPU-core-priority condition, CPU core allocation under the cache-priority condition, LLC allocation under the cache-priority condition). For example, y = (2, 4.5, 5, 6.75, 4, 9) represents: CPU core cliff 2, LLC cliff 4.5 MB, CPU core allocation under the CPU-core-priority condition 5, LLC allocation under the CPU-core-priority condition 6.75 MB, CPU core allocation under the cache-priority condition 4, LLC allocation under the cache-priority condition 9 MB. The values of y need not be normalized, so the desired prediction is obtained directly at inference time. In experiments, the invention used hundreds of millions of data samples covering various resource allocations of commonly used micro-services as training data.
According to an embodiment of the present invention, the parameter acquisition module is further configured to acquire running-state parameters during the trial run and preprocess a plurality of them according to a predetermined rule to obtain second state input data for the trial run. The micro-service resource scheduling system further includes a pre-trained second neural network model which, in response to invocation by the control logic when current system resources cannot meet the resource requirements of the newly added micro-service, processes the second state input data to predict a resource deprivation scheme for one or more neighbor micro-services on the same server under which those micro-services still meet their own quality-of-service requirements. The control logic is further configured to deprive the one or more neighbor micro-services of resources and allocate them to the newly added micro-service when the resources already allocated to the newly added micro-service, together with the deprivable resources of the one or more neighbor micro-services determined according to the resource deprivation scheme, can meet its resource requirements.
According to an embodiment of the present invention, the second neural network model may be trained using, as training samples, the running-state parameters of one or more micro-services under different resource allocation schemes and the deprivable resources of the corresponding micro-services under which they still meet their own quality-of-service requirements. The second neural network model is pre-trained as follows: acquiring the running-state parameters of one or more micro-services under different resource allocation schemes and preprocessing them according to a predetermined rule to obtain a plurality of pieces of second state input data for training; acquiring, as a training label for each piece of second state input data, the resource deprivation scheme of the corresponding micro-service under which it meets its own quality-of-service requirement; and performing multiple rounds of training on the second neural network model with these training inputs and their labels until convergence, thereby obtaining the pre-trained second neural network model. Preferably, the second neural network model may be a multi-layer perceptron, preferably with three layers, each layer being a set of nonlinear functions. The invention trains the second neural network model on an offline database composed of running-state parameters collected from one or more micro-services operating under different resource allocation schemes, until the model converges. During training, hyperparameter values can be tuned with a Bayesian optimization algorithm: the search range of the learning rate is set to 0.0001 to 0.1, the cross-validation loss serves as the objective function of the Bayesian optimization method, and after repeated exploration the learning rate that minimizes the cross-validation loss is selected as the final result. The loss function of the second neural network model may be set to a modified MSE. Suppose the prediction is s = {s_1, s_2, ..., s_n} and the true values corresponding to the labels are y = {y_1, y_2, ..., y_n}; when no feasible allocation can be found, the label is marked 0. With c a small constant close to 0, the modified loss masks the zero labels:

L(s, y) = (1/n) · Σ_{t=1}^{n} 1(y_t > c) · (s_t - y_t)^2

where s_t denotes the t-th prediction, y_t the true value corresponding to the t-th label, and 1(·) the indicator function. The purpose of modifying the MSE loss function is to keep labels y_t = 0 from back-propagating.
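Sketched in code, a masked loss of this kind might look as follows (PyTorch and the value of the constant c are assumptions for this sketch):

```python
# Sketch: modified MSE that skips back-propagation for zero labels
# (labels are 0 when no feasible deprivation scheme exists).
import torch

def masked_mse(pred: torch.Tensor, target: torch.Tensor,
               c: float = 1e-6) -> torch.Tensor:
    mask = (target > c).float()    # 1 for real labels, 0 for "infeasible"
    n = mask.sum().clamp(min=1.0)  # avoid dividing by zero
    return ((pred - target) ** 2 * mask).sum() / n
```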
According to an embodiment of the present invention, the plurality of parameters preprocessed to obtain the second state input data, or to obtain each piece of second state input data for training, comprise a system state parameter and a second set of micro-service state parameters.
According to an embodiment of the invention, the second set of micro-service state parameters includes: a virtual memory occupancy parameter, a physical memory occupancy parameter, the number of allocated CPU cores, the number of allocated LLC ways, the size of the currently occupied LLC, the expected QoS degradation percentage, the number of CPU cores occupied by neighbor micro-services on the same server, the number of LLC ways occupied by neighbor micro-services on the same server, and the memory bandwidth occupied by neighbor micro-services on the same server.
According to one embodiment of the invention, a resource deprivation scheme comprises: the target number of CPU cores, the target number of LLC ways, the target number of CPU cores under the CPU-core-priority condition, the target number of LLC ways under the CPU-core-priority condition, the target number of CPU cores under the cache-priority condition, and the target number of LLC ways under the cache-priority condition. In this encoding, the resource deprivation scheme directly gives the resources remaining after deprivation. The target number of CPU cores and the target number of LLC ways correspond to the default sub-scheme for depriving CPU cores and LLC; the targets under the CPU-core-priority condition correspond to the sub-scheme that tends to deprive more CPU cores; and the targets under the cache-priority condition correspond to the sub-scheme that tends to deprive more LLC ways. Because there are one or more such sub-schemes, the control logic may select the desired one according to user-defined rules; for example, it may select, from the sub-schemes, the one that deprives the least resources while still making up the resources the newly added micro-service requires.
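For illustration, a selection rule of this kind could look like the following sketch; the field names and the feasibility test are assumptions, not the patented logic:

```python
# Sketch: pick the deprivation sub-scheme that frees enough resources for the
# new micro-service while taking as little as possible from its neighbors.
from typing import Dict, List, Optional

def pick_deprivation(schemes: List[Dict[str, int]], need_cores: int,
                     need_llc_ways: int) -> Optional[Dict[str, int]]:
    # Each sub-scheme lists how many cores / LLC ways can be taken (assumed
    # already converted from "target remaining" to "deprivable" amounts).
    feasible = [s for s in schemes
                if s["cores"] >= need_cores and s["llc_ways"] >= need_llc_ways]
    if not feasible:
        return None  # fall through to resource sharing (second shadow model)
    # Least-deprived first: minimize total resources taken from neighbors.
    return min(feasible, key=lambda s: s["cores"] + s["llc_ways"])
```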
The technical scheme of this embodiment can at least realize the following beneficial technical effects: through the second neural network model, the invention can further obtain the needed resources for the newly added micro service from neighbor micro services that already have allocated resources, so the invention supports a higher load; and because it takes the QoS degradation percentage into account, it can better avoid the QoS fluctuation that depriving neighbor micro services of resources might cause, improving user experience.
To further illustrate the form of the inputs and outputs of the second neural network model, according to an example of the present invention, it is shown here by a sample organized in a particular order but not yet normalized. For example, the input is x = (CPU utilization parameter, CPU core frequency parameter, instructions per clock cycle, cache miss parameter, currently occupied LLC size parameter, memory bandwidth parameter, memory utilization, virtual memory occupation parameter, physical memory occupation parameter, number of allocated CPU cores, number of allocated LLC ways, expected QoS degradation percentage, number of CPU cores occupied by neighbor micro services located in the same server, number of LLC ways occupied by neighbor micro services located in the same server, memory bandwidth occupied by neighbor micro services located in the same server). Let a certain x be (246.6, 2540.5, 0.38, 11859, 6912, 23475.5, 0.03, 13421772.8, 1258291.2, 10, 3, 0.2, 18, 10, 11286.2). The meaning of the first 11 parameters in x can be found in the description of the first neural network model; the remaining elements, from the 12th parameter onward, represent: expected QoS degradation percentage: 20%; number of CPU cores occupied by neighbor micro services in the same server: 18; number of LLC ways occupied by neighbor micro services in the same server: 10; memory bandwidth occupied by neighbor micro services in the same server: 11286.2 MB/s. Correspondingly, the schematic output of the second neural network model is y = (target number of CPU cores, target number of LLC ways, target number of CPU cores under the CPU-core-priority condition, target number of LLC ways under the CPU-core-priority condition, target number of CPU cores under the cache-priority condition, target number of LLC ways under the cache-priority condition). For example, assume a certain y is (8, 2, 7, 2, 0, 0), which indicates: target number of CPU cores: 8; target number of LLC ways: 2; target number of CPU cores under the CPU-core-priority condition: 7; target number of LLC ways under the CPU-core-priority condition: 2; target number of CPU cores under the cache-priority condition: 0; target number of LLC ways under the cache-priority condition: 0. Since no suitable resource deprivation policy was found for the cache-priority condition, the corresponding target numbers are set to 0.
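To make the sample layout concrete, the sketch below assembles the 15-element input vector in the order listed above (the field names are invented for illustration; normalization is omitted, as in the example):

```python
MODEL_B_FIELDS = [
    "cpu_util", "cpu_core_freq", "ipc", "cache_misses", "llc_occupancy",
    "mem_bandwidth", "mem_util", "virt_mem", "phys_mem", "alloc_cores",
    "alloc_llc_ways", "expected_qos_drop", "neighbor_cores",
    "neighbor_llc_ways", "neighbor_mem_bandwidth",
]

def build_model_b_input(sample: dict) -> list:
    """Order the raw running-state parameters into the 15-element vector x
    expected by the second neural network model (before normalization)."""
    return [float(sample[f]) for f in MODEL_B_FIELDS]

x = build_model_b_input({
    "cpu_util": 246.6, "cpu_core_freq": 2540.5, "ipc": 0.38,
    "cache_misses": 11859, "llc_occupancy": 6912, "mem_bandwidth": 23475.5,
    "mem_util": 0.03, "virt_mem": 13421772.8, "phys_mem": 1258291.2,
    "alloc_cores": 10, "alloc_llc_ways": 3, "expected_qos_drop": 0.2,
    "neighbor_cores": 18, "neighbor_llc_ways": 10,
    "neighbor_mem_bandwidth": 11286.2,
})
assert len(x) == 15
```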
According to an alternative embodiment of the invention, the resource deprivation scheme comprises: the number of deprived CPU cores, the number of deprived LLC ways, the number of deprived CPU cores under the CPU-core-priority condition, the number of deprived LLC ways under the CPU-core-priority condition, the number of deprived CPU cores under the cache-priority condition, and the number of deprived LLC ways under the cache-priority condition. The numbers of deprived CPU cores and deprived LLC ways correspond to the sub-scheme for depriving CPU cores and LLC in the default state; the numbers under the CPU-core-priority condition correspond to the sub-scheme that tends to deprive more CPU cores; and the numbers under the cache-priority condition correspond to the sub-scheme that tends to deprive more LLC ways. The resource deprivation scheme of the previous embodiment directly predicts the resources remaining after deprivation, while this embodiment predicts how many resources are to be deprived. Both variants can realize the predicted resource deprivation scheme, and a user can select a specific implementation as needed and configure the training data accordingly. It should be appreciated that in the former variant the deprivable resource equals the allocated resource minus the target resource predicted in the resource deprivation scheme, whereas in this embodiment the value in the resource deprivation scheme is the predicted deprivable resource itself.
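The relation between the two variants reduces to a one-line conversion (a sketch; the identifiers are illustrative):

```python
def deprivable_from_targets(alloc, targets):
    """Former variant: the model predicts resources *remaining* after deprivation,
    so the deprivable amount per dimension is allocated minus target."""
    return tuple(a - t for a, t in zip(alloc, targets))

# alloc = (10 cores, 3 ways), predicted targets = (8 cores, 2 ways)
assert deprivable_from_targets((10, 3), (8, 2)) == (2, 1)
# In the alternative variant the model outputs (2, 1) directly.
```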
According to an embodiment of the present invention, the parameter obtaining module is further configured to obtain the operating state parameters during the trial run and preprocess a plurality of parameters therein according to a predetermined rule to obtain second shadow state input data during the trial run. The micro service resource scheduling system further includes a second shadow neural network model trained in advance which, in response to a call from the control logic, when any target micro service needs to obtain more resources and the idle resources plus the deprivable resources of one or more neighbor micro services cannot meet the resource requirement of the target micro service, processes the second shadow state input data to predict the QoS degradation percentage after a specific neighbor micro service shares resources with the target micro service under multiple resource sharing schemes. The control logic is further configured to select an acceptable scheme from the multiple resource sharing schemes, preferring schemes that involve fewer micro services and yield a smaller QoS degradation percentage, and to configure shared resources for the specific neighbor micro service and the target micro service according to the acceptable resource sharing scheme. The second shadow neural network model may be obtained by training, using as training samples the operating state parameters of one or more micro services under multiple resource sharing schemes and the corresponding QoS degradation percentages. Preferably, the second shadow neural network model is pre-trained in the following manner: acquiring running state parameters of one or more micro services under multiple resource sharing schemes, and preprocessing them according to a predetermined rule to obtain a plurality of pieces of second shadow state input data for training; acquiring the QoS degradation percentage marked for each piece of second shadow state input data for training as a training label; and performing multiple rounds of training on the second shadow neural network model with the plurality of pieces of second shadow state input data for training and the corresponding training labels until convergence, thereby obtaining the pre-trained second shadow neural network model. The hyperparameter and loss function settings of the second shadow neural network model may follow those of the first neural network model and are not repeated here. Preferably, the plurality of parameters preprocessed to obtain the second shadow state input data, or each piece of the second shadow state input data for training, include system state parameters and a second group of micro service shadow state parameters. Preferably, the second group of micro service shadow state parameters includes: virtual memory occupation parameters, physical memory occupation parameters, the number of allocated CPU cores, the number of allocated LLC ways, the currently occupied LLC size parameter, the number of shared CPU cores, the number of shared LLC ways, the number of CPU cores occupied by neighbor micro services located in the same server, the number of LLC ways occupied by neighbor micro services located in the same server, and the memory bandwidth occupied by neighbor micro services located in the same server.
If the available resources of a server are too small while the upper-layer scheduler still wants to increase the load on this server (i.e., deploy the new micro service), and it is difficult to meet the resource requirements of the new micro service by depriving the neighbor micro services of resources (e.g., all neighbor micro services deployed on the same server are close to their resource cliffs), the present invention enables resource sharing. The control logic implements resource sharing based on the cooperative operation of the first neural network model and the second shadow neural network model. The control logic first estimates, based on the optimal resource allocation region predicted by the first neural network model, how many resources the newly added micro service needs beyond the currently allocated resources; then it calls the second shadow neural network model to predict the QoS degradation percentage of each of the one or more neighbor micro services subjected to resource sharing, preferentially selects a resource sharing scheme according to these QoS degradation percentages, and performs resource allocation according to the selected scheme. In practice, to minimize adverse effects, the sharing constraint parameter may be configured so that a resource is shared among at most three micro services. The technical scheme of this embodiment can at least realize the following beneficial technical effects: the invention can efficiently handle complex resource sharing situations through the second shadow neural network model without incurring a large amount of scheduling overhead.
To further illustrate the form of the inputs and outputs of the second shadow neural network model, according to an example of the present invention, it is shown here by a sample organized in a particular order but not yet normalized. For example, the input is x = (CPU utilization parameter, CPU core frequency parameter, instructions per clock cycle, cache miss parameter, currently occupied LLC size parameter, memory bandwidth parameter, memory utilization, virtual memory occupation parameter, physical memory occupation parameter, number of allocated CPU cores, number of allocated LLC ways, number of shared CPU cores, number of shared LLC ways, number of CPU cores occupied by neighbor micro services located in the same server, number of LLC ways occupied by neighbor micro services located in the same server, memory bandwidth occupied by neighbor micro services located in the same server). Let a certain x be (320, 2643.1, 0.76, 104018, 6912, 16391.5, 2.099, 3809748.0, 2726297.6, 10, 8, 9, 7, 12, 8, 29913.2). The meaning of the first 11 parameters and the last 3 parameters in x can be found in the description of the second neural network model; the 12th and 13th parameters respectively represent: number of shared CPU cores: 9; number of shared LLC ways: 7. Correspondingly, the schematic output of the second shadow neural network model is y = (QoS degradation percentage). For example, assume a certain y is (0.128), indicating an expected QoS degradation percentage of 12.8%.
According to an embodiment of the present invention, the micro service resource scheduling system further includes a third neural network model trained in advance, wherein the control logic is further configured to invoke the parameter obtaining module when it detects that a QoS violation occurs for a specific micro service; the parameter obtaining module is further configured to, in response to the call from the control logic, obtain the running state parameters of the specific micro service when the QoS violation occurs and preprocess a plurality of parameters therein according to a predetermined rule to obtain third state data at the time of the QoS violation; the pre-trained third neural network model is used to process the third state data at the time of the QoS violation to predict the resources that need to be added to the specific micro service; and the control logic is further configured to add the resources to the specific micro service if the currently available resources satisfy the resources that need to be added. Preferably, the control logic is further configured to invoke the parameter obtaining module when it detects that a specific micro service is over-allocated; the parameter obtaining module is further configured to, in response to the call from the control logic, obtain the running state parameters of the specific micro service when resource over-allocation is detected and preprocess a plurality of parameters therein according to a predetermined rule to obtain third state data at the time of the detected over-allocation; the pre-trained third neural network model is further configured to process this third state data to predict the excess resources of the specific micro service; and the control logic is further configured to adjust the resource configuration of the specific micro service based on the predicted excess resources so as to reclaim them. Preferably, the plurality of parameters preprocessed to obtain the third state data include system state parameters and a third group of micro service state parameters, where the third group of micro service state parameters includes the number of allocated CPU cores, the number of allocated LLC ways, the currently occupied LLC size parameter, and the micro service response delay. According to an embodiment of the present invention, the third neural network model may adopt a reinforcement learning model and is trained in advance in the following manner: acquiring a pre-collected training set, wherein each sample comprises initial third state data, an action, next third state data, and a reward value; and performing multiple rounds of training on the third neural network model with the pre-collected training set until convergence, thereby obtaining the pre-trained third neural network model. The initial third state data are the running state parameters when a QoS violation or resource over-allocation occurs for the micro service; the action is a resource addition action or a resource reclamation action; the next third state data are the running state parameters acquired after applying the resource addition or reclamation action to the initial third state data; and the reward value is a score for the action.
The loss function of the third neural network model may be set to the MSE loss. In the invention, all neural network models can use an error-threshold method to judge convergence: a small error threshold is preset, and the model is judged to have converged when its error falls below the threshold. In addition, a maximum number of iterations can be set, stopping training once it exceeds 100,000 steps. In operation, the control logic may be configured to monitor the QoS status of each micro service every second. If a QoS violation occurs for a micro service, the third neural network model is called to predict how many resources need to be allocated to meet the ideal QoS requirement, and resources are added for that micro service; if the reserved resource space of a micro service is found to be too large (i.e., resources are wasted), the third neural network model is called to predict the excess, and resource reclamation is then performed. The reinforcement learning model is an online learning model used to dynamically handle erroneous predictions of the first neural network model, environmental changes, and unpredictable situations. After the first neural network model gives the initial prediction result, the resources of each micro service are dynamically adjusted by the third neural network model. The key component of the reinforcement learning model is a deep reinforcement neural network composed of two networks with the same structure: a policy network and a target network, each of which may employ a three-layer multi-layer perceptron. The reinforcement learning model performs offline learning and online learning by collecting historical records and runtime information, and its actions start from the predicted optimal resource allocation region output by the first or second neural network model, which greatly reduces the time overhead of exploring the scheduling space. The reinforcement learning model dynamically adjusts micro service resources in response to erroneous predictions, environmental changes, and unpredictable situations arising in the first and second neural network models. Offline learning trains on offline data collected in earlier stages; online learning puts historical data collected during prediction into an experience pool for real-time training. In operation, the invention monitors the QoS state of each micro service every second: if a QoS violation is detected, the third neural network model is invoked to allocate more resources to meet the ideal QoS requirement; if a micro service has reserved too many resources, the third neural network model is invoked to reclaim the redundant resources. The micro service resource scheduling system can set a QoS violation delay threshold for a micro service to judge whether a QoS violation occurs; for example, if the QoS violation delay threshold of a micro service is set to 50 ms, a QoS violation is considered to occur whenever the response delay of the micro service exceeds 50 ms.
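As a minimal illustration of the experience pool and the periodic policy-to-target synchronization described above (a sketch under assumed NumPy weight arrays, not the patent's implementation):

```python
import random
from collections import deque

import numpy as np

class ExperiencePool:
    """Replay buffer of (state, action, reward, next_state) tuples used for
    the online learning of the reinforcement learning model."""
    def __init__(self, capacity=10000):
        self.buf = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.buf.append((state, action, reward, next_state))

    def sample(self, batch_size=32):
        return random.sample(self.buf, min(batch_size, len(self.buf)))

def maybe_sync_target(policy_weights, target_weights, step, every_n=1000):
    """Copy policy-network weights into the target network every `every_n`
    steps; both networks share the same three-layer MLP structure."""
    if step % every_n == 0:
        return [np.copy(w) for w in policy_weights]
    return target_weights
```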
A delay ratio threshold can also be set in the micro service resource scheduling system to judge whether resource over-allocation has occurred: the response delay ratio is defined as the ratio of the current response delay to the QoS violation delay threshold, and if the response delay ratio falls below the delay ratio threshold, the micro service is regarded as over-allocated. Assuming the threshold is set to 80%, the micro service is considered over-allocated whenever its response delay ratio is below that value, as classified in the sketch below. It should be understood that other ways of determining whether a QoS violation or resource over-allocation exists are known in the art; the above is merely exemplary, and the present invention is not limited in this respect. The technical scheme of this embodiment can at least realize the following beneficial technical effects: the invention can correct under-allocation or over-allocation of resources through the third neural network model, better utilize server resources, and guarantee service quality. Based on the third neural network model, the invention can efficiently correct erroneous predictions, environmental changes, and unpredictable situations in the first or second neural network model; in practice, the third neural network reaches ideal results with only small calibrations, and its performance is superior to heuristic methods.
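Putting the two thresholds together (50 ms and 80% are the example values from the text; the function name is illustrative):

```python
def classify_microservice(latency_ms, qos_delay_threshold_ms=50.0, over_alloc_ratio=0.8):
    """Classify a micro service from its current response delay: above the QoS
    violation delay threshold -> violation; response-delay ratio below the
    delay ratio threshold -> over-allocated; otherwise healthy."""
    if latency_ms > qos_delay_threshold_ms:
        return "qos_violation"            # call Model C to add resources
    if latency_ms / qos_delay_threshold_ms < over_alloc_ratio:
        return "over_allocated"           # call Model C to reclaim resources
    return "ok"

assert classify_microservice(55.0) == "qos_violation"
assert classify_microservice(30.0) == "over_allocated"   # 30/50 = 0.6 < 0.8
assert classify_microservice(45.0) == "ok"
```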
Preferably, the micro service resource scheduling system is configured to preset an action amplitude range for the third neural network model. For example, setting the action amplitude range to [-3, 3] means that the magnitude by which each adjustment action changes a resource is limited to [-3, 3]; taking the number of CPU cores or LLC ways as an example, each action decreases or increases the count by at most 3. There are thus 7 × 7 = 49 action strategies. In actual operation, a mapping can be predefined in which each action strategy is represented by a value from 0 to 48 and each value denotes an action function (Action_Function) <m, n>, with m ∈ [-3, 3] the change in CPU cores and n ∈ [-3, 3] the change in the number of LLC ways. For example, 0 represents <-3, -3>, i.e., the allocation is reduced by 3 CPU cores and 3 LLC ways; 1 represents <-3, -2>, i.e., the allocation is reduced by 3 CPU cores and 2 LLC ways; …; 48 represents <3, 3>, i.e., 3 CPU cores and 3 LLC ways are added. The correspondence is not unique and can be set in advance as needed; the present invention does not limit it (one possible mapping is sketched below). The technical scheme of this embodiment can at least realize the following beneficial technical effects: the present invention can handle the following situation: the environment or user requirements change, making the predictions of the first or second neural network model inaccurate or wrong, so that some micro services have excess or insufficient resources; the third neural network model then predicts the resources to be reclaimed or added, realizing dynamic adjustment of micro service resources and allocating appropriate resources to each micro service. Unlike the prior art, which continuously adjusts a micro service's resources in search of its optimal allocation scheme, the present invention reallocates resources only when a QoS violation or resource over-allocation is detected, thereby avoiding unnecessary scheduling actions and reducing QoS fluctuation of the micro services.
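One possible index-to-action mapping, matching the 0 → <-3,-3>, 1 → <-3,-2>, 48 → <3,3> example above (the ordering is only a convention, as the text notes):

```python
AMPLITUDE = 3  # action amplitude range [-3, 3] -> 7 values per dimension

def decode_action(index):
    """Map an action index in 0..48 to <m, n>, where m is the CPU-core delta
    and n is the LLC-way delta."""
    side = 2 * AMPLITUDE + 1                 # 7 actions per dimension
    assert 0 <= index < side * side          # 49 action strategies in total
    m = index // side - AMPLITUDE
    n = index % side - AMPLITUDE
    return m, n

assert decode_action(0) == (-3, -3)
assert decode_action(1) == (-3, -2)
assert decode_action(48) == (3, 3)
```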
According to one embodiment of the invention, referring to FIG. 2, the following terms are replaced with the following abbreviations or labels for simplicity: OSML: the control logic; Model A: the first neural network model; Model B: the second neural network model; Model B': the second shadow neural network model; Model C: the third neural network model; OAA: the optimal resource allocation region; RCliff: the resource cliff. OSML samples the running state parameters online and calls Model A, Model B, Model B' or Model C to complete the resource scheduling of the server's hardware resources. Model C comprises a policy network and a target network and can be trained online through an experience pool. The machine learning models of the present invention (Model A, Model B, Model B', Model C) may be based on TensorFlow (version 1.13.0-rc0) and may run on a CPU or GPU.
According to an embodiment of the present invention, referring to fig. 3, an exemplary workflow of the system of the present invention can be illustrated using the following 4 algorithms:
for deploying newly added services:
K1: the control logic judges whether a newly added micro service arrives; if so, it calls Algorithm 1; if not, this step is repeated;
Algorithm 1:
A1: Model A predicts the OAA; go to step A2;
A2: judge whether the available resources meet the allocation requirement; if so, go to step A3; if not, go to step A4;
A3: allocate resources to the micro service according to the OAA;
A4: call Model B to predict the minimum resources that each micro service can be stripped of under the allowed QoS degradation; go to step A5;
A5: judge whether the available resources plus the deprivable resources meet the allocation requirement; if so, go to step A3; otherwise, go to step A6;
A6: if the micro service must be run, call Algorithm 4;
for QoS violation detection:
K2: the control logic judges whether a micro service QoS violation is detected; if so, it calls Algorithm 2; if not, this step is repeated;
Algorithm 2 (for raising a micro service's resource allocation up to the OAA):
C1: Model C predicts a resource addition action;
C2: judge whether the available resources satisfy the resource addition action; if so, go to step C3; if not, call Algorithm 4 to attempt resource sharing;
C3: add resources to the micro service experiencing the QoS violation;
for detecting resource over-allocation:
K3: the control logic judges whether micro service resource over-allocation is detected; if so, it calls Algorithm 3; if not, this step is repeated;
Algorithm 3 (for reclaiming excess resources):
C4: Model C predicts a resource reclamation action; go to step C5;
C5: judge whether QoS is still satisfied after the excess resources are reclaimed; if so, go to step C6; otherwise, go to step C7;
C6: execute the resource reclamation action;
C7: cancel the resource reclamation action.
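A sketch of the top-level dispatch implied by steps K1 to K3 (all callables are illustrative stand-ins supplied by the caller):

```python
import time

def osml_dispatch_loop(detectors, algorithms, interval_s=1.0):
    """detectors/algorithms map the three event kinds to caller-supplied
    callables: a detector returns a micro service handle or None; the matching
    handler (Algorithm 1, 2 or 3) processes it. State is polled every second."""
    while True:
        for event in ("new_service", "qos_violation", "over_allocation"):
            ms = detectors[event]()
            if ms is not None:
                algorithms[event](ms)
        time.sleep(interval_s)
```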
These 4 algorithms are further illustrated below by way of a schematic algorithm example:
Algorithm 1: using machine learning for resource allocation, selecting only one policy in the OAA
Line 1: mapping the upcoming microservices to idle resources, and capturing runtime parameters within 2 seconds (default) of the upcoming microservices;
line 2: these parameters are used as input for model a;
line 3: model A outputs: (1) the OAA meeting the target quality of service; (2) the RCliff under the current environment;
line 4: if the free resources satisfy OAA, then:
line 5: allocating resources using a resource allocation scheme in the OAA;
line 6: ending the current selection statement;
line 7: if the free resources are insufficient:
line 8: calculating the difference between the free resource and its OAA, i.e. < + cores, + LLC way > (the number of resources still in short supply in order to satisfy QoS);
line 9: calculating the difference value between the idle resource and RCliff, namely < + cores ', + LLC way' > (needs to be used carefully to avoid entering RCliff);
line 10: for each running microserver, performing:
line 11: if the program can accept a certain QoS degradation:
lines 12, 13: inferring, using model B, the minimum-resource schemes (B-Points) to which the program can be stripped under the allowed QoS degradation;
line 14: ending the current selection statement;
line 15: ending the current loop statement;
line 16: recording the B-Point corresponding to the QoS reduction of each micro service;
line 17: finding, according to B-Points, the solution most suitable for meeting OAA/RCliff, involving at most 3 micro services;
line 18: if the solution can meet the requirements of OAA or RCliff:
line 19: adjusting resource allocation according to the OAA;
line 20: otherwise:
line 21: if the resources are not shared, the micro service program cannot be run on the server;
line 22: ending the current selection statement;
line 23: the model B calling is finished;
line 24: and reporting the result to a superior scheduler, and calling an algorithm 4 for sharing if necessary.
Line 17 of Algorithm 1 allocates resources for the upcoming micro service according to the neighbors' B-Points; an exemplary workflow is as follows (a code sketch follows this list):
sorting all the micro-services from big to small according to the sum of the number of resources (CPU core number and LLC path number) available in the B-Points;
for the numbers of cores and LLC ways required by the upcoming micro service, set in two passes to < + cores, + LLC way > and < + cores ', + LLC way' > respectively, perform:
for the first three microservices, when the number of cores or LLC ways required for the upcoming microservice is greater than 0, perform:
selecting one of all schemes in the micro-service B-Points to ensure that the sum of the required core number and the LLC path number is minimum after resources are allocated to the upcoming micro-service;
ending the current loop statement (selecting allocation scheme from the first three micro-services);
if the resource requirement of the upcoming micro service has been met, then:
return the resource scheduling scheme (success);
ending the current selection statement;
ending the current loop statement (the number of resources needed is set according to OAA and RCliff, respectively);
and returning the scheduling failure.
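A minimal sketch of this selection workflow under an assumed data layout (each neighbor's B-Points are given as deprivable (cores, ways) options; the ranking and tie-breaking details are illustrative):

```python
def allocate_from_b_points(b_points, demands):
    """b_points: {ms_id: [(cores, ways), ...]} deprivable options per neighbor.
    demands: shortfalls tried in order, e.g. the OAA-based <+cores, +LLC way>
    then the RCliff-based <+cores', +LLC way'>. At most three neighbors are used."""
    ranked = sorted(b_points, key=lambda k: max(c + w for c, w in b_points[k]),
                    reverse=True)                     # richest B-Points first
    for need_c, need_w in demands:
        plan, rem_c, rem_w = {}, need_c, need_w
        for ms_id in ranked[:3]:                      # involve at most 3 micro services
            if rem_c <= 0 and rem_w <= 0:
                break
            # pick the option leaving the smallest remaining shortfall
            best = min(b_points[ms_id],
                       key=lambda cw: max(rem_c - cw[0], 0) + max(rem_w - cw[1], 0))
            plan[ms_id] = best
            rem_c, rem_w = rem_c - best[0], rem_w - best[1]
        if rem_c <= 0 and rem_w <= 0:
            return plan                               # scheduling scheme found
    return None                                       # scheduling failure
```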
Algorithm 2: handling resource starvation
Line 1: for each micro-service of the allocated resources, performing:
line 2: if its QoS is not met (higher latency), then:
line 3: forwarding the current operating state parameters to the model C;
line 4: model C outputs a resource addition action through the Action_Fun function;
line 5: returning the output of model C (< cores +, LLC Ways + >) to the central controller of OSML;
line 6: if the current idle resources satisfy < cores +, LLC Ways + >:
line 7: OSML allocates the resources and jumps to line 2;
line 8: otherwise:
line 9: calling algorithm 4 to share resources with other programs;
line 10: ending the current selection statement (allocating resources or invoking algorithm 4);
line 11: end the current selection statement (add resources to the program that has a QoS violation);
line 12: the current loop statement is ended.
Algorithm 3: handling situations of over-allocation of resources
Line 1: for each micro-service program of the allocated resources, executing:
lines 2, 3: if the resource allocation is excessive:
line 4: forwarding the current operating state parameters to the model C;
line 5: the model C outputs a resource recovery action;
line 6: returning the output of model C (< cores-, LLC Ways- >) to the central controller of OSML;
line 7: OSML recycling resources;
line 8: if the QoS violation occurs after the redundant resources of the program are recycled, then:
line 9: OSML cancels the resource recovery action;
line 10: end the current selection statement (handling statement that a QoS violation occurred);
line 11: ending the current selection statement (resource reclamation operation);
line 12: the current loop statement is ended.
Algorithm 4: handling resource sharing between micro services
Line 1: noting that OSML attempts to allocate resources across RCliffs.
Line 2: the model A deduces the amount of resources needed by the new micro service program besides the currently allocated resources, and the new micro service program shares the resources with the neighbor micro service program;
line 3: for each potential sharable resource neighbor microserver, performing:
line 4: executing a resource sharing strategy on the neighbor micro service program;
line 5: predicting, with model B', the QoS degradation incurred by the neighbor micro service program after sharing resources;
line 6: ending the current loop statement;
line 7: if OSML can accept the QoS degradation of the neighbor micro service program, then:
line 8: OSML carries out resource sharing operation;
line 9: otherwise:
line 10: the OSML transfers the micro service program to another node;
line 11: the current selection statement is ended.
In Algorithm 4, + core and + LLC way refer to how many more resources this micro service needs to meet its QoS requirement. For example, if the OAA predicted for the incoming micro service is 10 cores and 10 ways but the system has only 8 cores and 8 ways of spare resources, then + core and + LLC way are 2 and 2, respectively. For each running micro service that may be deprived of resources, nine deprivation conditions [(0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)] are analyzed, where <u, v> denotes the numbers of cores and ways in the deprivation policy; the analysis determines how large a QoS degradation percentage each condition causes the neighbors and, if it is acceptable, how many deprivable resources it can provide. The resource sharing scheme involving fewer micro services is selected based on the number of micro services involved; if several schemes involve the same number of micro services, the scheme with the smaller QoS drop is selected, as sketched below.
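Enumerating the nine deprivation conditions and applying the stated preference order (fewer involved micro services first, then smaller QoS drop) might look as follows; the 15% tolerance is a placeholder assumption, not from the patent:

```python
from itertools import product

def deprivation_candidates(max_cores=2, max_ways=2):
    """Enumerate the nine deprivation conditions <u, v> from the example:
    u cores and v LLC ways taken from a running neighbor micro service."""
    return [(u, v) for u, v in product(range(max_cores + 1), range(max_ways + 1))]

def select_sharing_scheme(schemes, drop_limit=0.15):
    """schemes: [(involved_ms_ids, predicted_qos_drops)]. Prefer the scheme that
    involves fewer micro services; among equals, the smaller worst-case drop."""
    acceptable = [s for s in schemes if max(s[1]) <= drop_limit]
    if not acceptable:
        return None
    return min(acceptable, key=lambda s: (len(s[0]), max(s[1])))

print(deprivation_candidates())   # [(0,0), (0,1), (0,2), (1,0), ..., (2,2)]
```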
According to an aspect of the present invention, there is also provided a method for scheduling micro service resources based on the foregoing micro service resource scheduling system, including: allocating idle resources for newly-added micro services to be deployed so as to perform trial run on the newly-added micro services; acquiring running state parameters during the trial running period, and preprocessing a plurality of parameters according to a preset rule to obtain first state input data during the trial running of the newly-added micro service; and processing first state input data of the newly added micro service during trial operation through a pre-trained first neural network model to predict a resource cliff and an optimal resource allocation area of the newly added micro service. According to an embodiment of the present invention, the method for scheduling microservice resources further includes: when the current system resources cannot meet the resource requirements of the newly added micro services, processing second state input data in trial operation through a pre-trained second neural network model to predict deprivable resources of one or more neighbor micro services on the same server under the condition of meeting the self service quality requirements; when the allocated resources of the newly added microservice and the deprivable resources of one or more neighbor microservices can meet the resource requirement of the newly added microservices, the resources of the one or more neighbor microservices are deprived and allocated to the newly added microservices through the control logic. According to an embodiment of the present invention, the method for scheduling microservice resources further includes: when any target micro service needs to acquire more resources and the deprived resources of the idle resources and one or more neighbor micro services cannot meet the resource requirement of the target micro service, second shadow state input data are processed through a second shadow neural network model trained in advance to predict the QoS reduction percentage of a specific neighbor micro service and the target micro service after the resources are shared by a plurality of resource sharing schemes, an acceptable resource sharing scheme is selected by a control logic from the plurality of resource sharing schemes in turn by taking fewer micro services and fewer QoS reduction percentages as preferential selection conditions, and shared resources are configured for the specific neighbor micro service and the target micro service according to the acceptable resource sharing scheme. According to an embodiment of the present invention, the method for scheduling micro service resources further includes: when the control logic detects that the QoS violation occurs to a specific micro-service, processing third state data when the QoS violation occurs through a pre-trained third neural network model to predict resources needing to be added to the specific micro-service; and in the case that the current available resource meets the resource needing to be added to the specific micro service, adding the resource to the specific micro service through the control logic. 
According to an embodiment of the present invention, the method for scheduling microservice resources further includes: when the control logic detects that the specific micro service has the condition of excessive resource allocation, processing third state data when the excessive resource allocation is detected by a pre-trained third neural network model to predict the excessive resource of the specific micro service; the resource configuration of the particular microservice is adjusted by control logic based on the predicted excess resources of the particular microservice to reclaim excess resources from the particular microservice. Some technical details of the method for scheduling microservice resources have been described in the foregoing microservice resource scheduling system, and reference may be made to the foregoing embodiments, which are not described herein again.
Generally, the invention adopts a technique in which multiple machine learning models (the first neural network model, the second neural network model, the second shadow neural network model, and the third neural network model) cooperate: through the first neural network model, the second neural network model, and the second shadow neural network model, it avoids resource cliffs and quickly provides an optimal solution for resource allocation and adjustment. The invention also utilizes a DQN model (the third neural network model) to manage resource allocation dynamically and perform reallocation (dynamic adjustment) in the event of QoS violations and resource over-allocation. The invention thereby avoids the QoS degradation often caused by resource cliff problems in previous schedulers and avoids unnecessary scheduling actions. Furthermore, since each model is lightweight and its functionality is well defined, it is easy to locate and debug problems when they occur.
It should be noted that, although the steps are described in a specific order, the steps are not necessarily performed in the specific order, and in fact, some of the steps may be performed concurrently or even in a changed order as long as the required functions are achieved.
The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.
The computer readable storage medium may be a tangible device that retains and stores instructions for use by an instruction execution device. The computer readable storage medium may include, for example, but is not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (27)

1. A micro-service resource scheduling system, comprising:
the control logic is used for allocating idle resources for the newly-added microservices to be deployed so as to perform trial run on the newly-added microservices;
the parameter acquisition module is used for acquiring operation state parameters during the trial operation of the newly added micro-service and preprocessing a plurality of parameters in the operation state parameters according to a preset rule to obtain first state input data during the trial operation of the newly added micro-service;
the pre-trained first neural network model is used for processing first state input data of the newly-added micro service in trial operation to predict a resource cliff and an optimal resource allocation area of the newly-added micro service, wherein the first neural network model is obtained by training by using operating state parameters of one or more micro services under different resource allocation schemes and the resource cliff and the optimal resource allocation area of the corresponding micro service as sample data.
2. The micro-service resource scheduling system of claim 1, wherein the first neural network model is pre-trained in the following manner:
acquiring running state parameters of one or more micro services under different resource allocation schemes, and preprocessing a plurality of parameters according to a preset rule to obtain a plurality of pieces of first state input data for training;
acquiring a resource cliff and an optimal resource allocation area of the corresponding micro service marked for each piece of first state input data for training as training labels;
and performing multi-round training on the first neural network model by using the plurality of pieces of first state input data for training and the corresponding training labels until convergence, so as to obtain the pre-trained first neural network model.
3. The micro-service resource scheduling system of claim 2, wherein the plurality of parameters for preprocessing the first state input data for commissioning a new micro-service or each of the plurality of first state input data for preprocessing the training comprises a system state parameter and a first set of micro-service state parameters.
4. The micro-service resource scheduling system of claim 3, wherein the system state parameter is a state parameter of a currently scheduled server system, comprising: one or more of a CPU utilization parameter, a CPU core frequency parameter, an instruction number per clock cycle, a cache miss parameter, a memory bandwidth parameter, and a memory utilization parameter.
5. The micro-service resource scheduling system of claim 3, wherein the first set of micro-service state parameters are state parameters of the newly added micro-service itself at commissioning, comprising: virtual memory occupation parameters, physical memory occupation parameters, the number of distributed CPU cores, the number of distributed LLC lines and the size parameters of the currently occupied LLC.
6. The micro-service resource scheduling system of claim 1, wherein the resource cliffs for the newly added micro-service comprise CPU core cliffs and LLC cliffs.
7. The micro-service resource scheduling system of claim 1, wherein the optimal resource allocation region comprises CPU core resource allocation under CPU core priority condition, LLC resource allocation under CPU core priority condition, CPU core resource allocation under cache priority condition, LLC allocation under cache priority condition.
8. The micro-service resource scheduling system of any one of claims 1 to 7, wherein the parameter obtaining module is further configured to obtain the operating state parameters during the commissioning and preprocess a plurality of parameters thereof according to a predetermined rule to obtain the second state input data during the commissioning,
the micro-service resource scheduling system also comprises a pre-trained second neural network model, which is used for responding to the calling of a control logic and processing second state input data in trial operation when the current system resource cannot meet the resource requirement of a newly-added micro-service so as to predict a resource deprivation scheme of one or more neighbor micro-services positioned on the same server under the condition of meeting the self service quality requirement, wherein the second neural network model is obtained by taking the running state parameters of one or more micro-services under different resource allocation schemes and the deprivable resources of the corresponding micro-services under the condition of meeting the self service quality requirement as training samples for training;
the control logic is further configured to deprive and allocate resources of the one or more neighboring microservices to the newly added microservices when the allocated resources of the newly added microservices and the deprivable resources of the one or more neighboring microservices determined according to the resource deprivation scheme can meet the resource requirements of the newly added microservices.
9. The micro-service resource scheduling system of claim 8, wherein the second neural network model is pre-trained in the following manner:
acquiring running state parameters of one or more micro services under different resource allocation schemes, and preprocessing a plurality of parameters according to a preset rule to obtain a plurality of pieces of second state input data for training;
acquiring a resource deprivation scheme of each corresponding micro service marked by each piece of second state input data for training under the condition that the corresponding micro service meets the self service quality requirement as a training label;
and performing multi-round training on a second neural network model by using the plurality of pieces of second state input data for training and the corresponding training labels until convergence, so as to obtain the pre-trained second neural network model.
10. The micro-service resource scheduling system of claim 9, wherein the plurality of parameters for preprocessing the second state input data or each of the plurality of second state input data for preprocessing training comprises a system state parameter and a second set of micro-service state parameters.
11. The micro-service resource scheduling system of claim 10, wherein the second set of micro-service state parameters comprises: virtual memory occupation parameters, physical memory occupation parameters, the number of distributed CPU cores, the number of distributed LLC paths and the size parameter of the currently occupied LLC, the expected QoS reduction percentage, the number of CPU cores occupied by the neighbor micro-services located in the same server, the number of LLC paths occupied by the neighbor micro-services located in the same server, and the memory bandwidth occupied by the neighbor micro-services located in the same server.
12. The micro-service resource scheduling system of claim 8, wherein the resource deprivation scheme comprises: the target number of CPU cores, the target number of LLC paths, the target number of CPU cores under the CPU core priority condition, the target number of LLC paths under the CPU core priority condition, the target number of CPU cores under the cache priority condition and the target number of LLC paths under the cache priority condition.
13. The micro-service resource scheduling system of claim 8, wherein the parameter obtaining module is further configured to obtain the running state parameters during the trial running, and preprocess a plurality of parameters therein according to a predetermined rule to obtain second shadow state input data during the trial running;
the micro-service resource scheduling system further comprises a pre-trained second shadow neural network model, wherein the pre-trained second shadow neural network model is used for responding to the calling of the control logic and, when any target micro-service needs to obtain more resources and the resource demand of the target micro-service cannot be met by the idle resources and the deprivable resources of one or more neighbor micro-services, processing the second shadow state input data to predict the QoS degradation percentage of a specific neighbor micro-service and the target micro-service after resource sharing under a plurality of resource sharing schemes, and the second shadow neural network model is obtained by training with the running state parameters of one or more micro-services under the plurality of resource sharing schemes and the corresponding QoS degradation percentages as training samples;
the control logic is further configured to select an acceptable resource sharing scheme from the plurality of resource sharing schemes, with fewer involved micro-services and a smaller QoS degradation percentage as the preference conditions, and configure the shared resources for the specific neighbor micro-service and the target micro-service according to the acceptable resource sharing scheme.
14. The micro-service resource scheduling system of claim 13, wherein the second shadow neural network model is pre-trained in the following manner:
acquiring running state parameters of one or more micro services under multiple resource sharing schemes, and preprocessing the parameters according to a preset rule to obtain a plurality of pieces of second shadow state input data for training;
acquiring the corresponding QoS reduction percentage of each second shadow state input data mark for training as a training label;
and performing multi-round training on a second shadow neural network model by using the plurality of pieces of second shadow state input data for training and corresponding training labels until convergence, so as to obtain the pre-trained second shadow neural network model.
15. The micro-service resource scheduling system of claim 14, wherein the plurality of parameters for preprocessing to obtain the second shadow state input data or each of the plurality of second shadow state input data for preprocessing to obtain training comprises system state parameters and a second set of micro-service shadow state parameters.
16. The micro-service resource scheduling system of claim 15, wherein the second set of micro-service shadow state parameters comprises: virtual memory occupation parameters, physical memory occupation parameters, the number of allocated CPU cores, the number of allocated LLC lines and the size parameter of the currently occupied LLC, the number of shared CPU cores, the number of shared LLC lines, the number of CPU cores occupied by the neighbor micro-service located in the same server, the number of LLC lines occupied by the neighbor micro-service located in the same server, and the memory bandwidth occupied by the neighbor micro-service located in the same server.
17. The micro service resource scheduling system according to any one of claims 1 to 7, further comprising a third neural network model trained in advance, wherein,
the control logic is also used for calling the parameter acquisition module when the control logic detects that the QoS violation occurs in the specific micro service;
the parameter obtaining module is further used for responding to the calling of the control logic to obtain the running state parameters of the specific micro service when the QoS violation occurs, and preprocessing a plurality of parameters in the running state parameters according to a preset rule to obtain third state data when the QoS violation occurs;
the pre-trained third neural network model is used for processing the third state data when the QoS violation occurs so as to predict resources needing to be added to the specific micro service; and the number of the first and second electrodes,
the control logic is further configured to add the resource to the particular micro service if the current available resource satisfies the resource that needs to be added to the particular micro service.
18. The micro-service resource scheduling system of claim 17, wherein the control logic is further configured to invoke the parameter obtaining module upon detecting that there is an over-allocation of resources for a particular micro-service,
the parameter obtaining module is further used for responding to the calling of the control logic to obtain the running state parameters of the specific micro service when the resource over-distribution is detected, and preprocessing a plurality of parameters in the running state parameters according to a preset rule to obtain third state data when the resource over-distribution is detected,
the pre-trained third neural network model is further used for processing third state data when the resource overactivity is distributed so as to predict the excess resource of the specific micro service;
the control logic is further configured to adjust the resource configuration of the particular microservice based on the predicted excess resources for the particular microservice to reclaim the excess resources from the particular microservice.
19. The micro-service resource scheduling system of claim 18, wherein the plurality of parameters for obtaining the third status data after preprocessing comprises system status parameters and a third set of micro-service status parameters, wherein the third set of micro-service status parameters comprises a number of allocated CPU cores, a number of allocated LLC ways, a currently occupied LLC size parameter, and micro-service response delay.
20. The micro-service resource scheduling system according to claim 18 or 19, wherein the third neural network model may be pre-trained using a reinforcement learning model in the following manner:
acquiring a pre-acquired training set, wherein each sample in the pre-acquired training set comprises initial third state data, an action, next third state data and a reward value; and performing multiple rounds of training on the third neural network model by using the pre-acquired training set until convergence to obtain the pre-trained third neural network model, wherein the initial third state data is the running state parameter when a QoS violation or resource over-allocation occurs for the micro-service, the action is a resource addition action or a resource reclamation action, the next third state data is the running state parameter acquired after applying the resource addition action or resource reclamation action based on the initial third state data, and the reward value is a score for the action.
21. A micro service resource scheduling method implemented based on the micro service resource scheduling system of any one of claims 1 to 20, comprising:
allocating idle resources for newly-added micro services to be deployed so as to perform trial run on the newly-added micro services;
acquiring running state parameters during the trial running period, and preprocessing a plurality of parameters according to a preset rule to obtain first state input data during the trial running of the newly-added micro service;
and processing first state input data of the newly added micro service during trial operation through a pre-trained first neural network model to predict a resource cliff and an optimal resource allocation area of the newly added micro service.
22. The method of claim 21, further comprising:
when the current system resources cannot meet the resource requirements of the newly-added micro services, processing second state input data in the trial run of the newly-added micro services through a pre-trained second neural network model to predict a resource deprivation scheme of one or more neighbor micro services on the same server under the condition of meeting the self service quality requirements;
when the resources which are allocated to the newly-increased micro service in the commissioning process and the deprivable resources of one or more neighbor micro services determined according to the resource deprivation scheme can meet the resource requirement of the newly-increased micro service, the resources of the one or more neighbor micro services are deprived and allocated to the newly-increased micro service through the control logic.
23. The method of claim 22, further comprising:
when any target micro service needs to acquire more resources and the deprived resources of the idle resources and one or more neighbor micro services cannot meet the resource requirement of the target micro service, second shadow state input data is processed through a second shadow neural network model trained in advance to predict the QoS reduction percentage of the specific neighbor micro service and the target micro service after the resources are shared by a plurality of resource sharing schemes,
selecting, by the control logic, an acceptable resource sharing scheme according to the priority conditions of fewer involved micro-services and a smaller QoS degradation percentage, and configuring shared resources for the specific neighbor micro-service and the target micro-service according to the acceptable resource sharing scheme.
24. The method of claim 23, further comprising:
when the control logic detects that a QoS violation has occurred for a specific micro-service, processing the third state data at the time of the QoS violation through a pre-trained third neural network model to predict the resources that need to be added for the specific micro-service;
and, in the case that the currently available resources can cover the resources to be added, adding the resources to the specific micro-service through the control logic.
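Claim 24's violation-handling path reduces to: observe the state at the violation, query the third model for the amount to add, and apply it only if it fits within the currently available resources. A sketch, in which `predict_addition` and the control-logic methods are hypothetical names:

```python
def handle_qos_violation(control, svc_id, third_model, state_at_violation):
    to_add = third_model.predict_addition(state_at_violation)
    if control.available() >= to_add:  # current resources suffice
        control.grant(svc_id, to_add)  # scale the micro-service up
        return True
    return False  # otherwise fall back, e.g. to deprivation or sharing
```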
25. The method of claim 24, further comprising:
when the control logic detects that a specific micro-service has been allocated excessive resources, processing the third state data at the time the excessive allocation is detected through the pre-trained third neural network model to predict the excess resources of the specific micro-service;
and adjusting, through the control logic, the resource configuration of the specific micro-service based on its predicted excess resources, so as to reclaim the excess resources from the specific micro-service.
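Claim 25 mirrors claim 24: the same third model, queried on the state observed when over-provisioning is detected, yields an amount to reclaim rather than to add. A sketch under the same assumptions (`predict_excess` and the control-logic object are hypothetical):

```python
def handle_over_provisioning(control, svc_id, third_model, state_at_detection):
    excess = third_model.predict_excess(state_at_detection)
    if excess > 0:
        # Shrink the configuration and return the excess to the idle pool.
        control.reclaim(svc_id, excess)
    return excess
```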
26. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of any one of claims 21 to 25.
27. An electronic device, comprising:
one or more processors; and
a memory, wherein the memory is configured to store one or more executable instructions;
the one or more processors are configured to implement the steps of the method of any one of claims 21 to 25 via execution of the one or more executable instructions.
CN202110143249.8A 2021-02-02 2021-02-02 Micro-service resource scheduling system and method Pending CN112799817A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110143249.8A CN112799817A (en) 2021-02-02 2021-02-02 Micro-service resource scheduling system and method

Publications (1)

Publication Number Publication Date
CN112799817A 2021-05-14

Family

ID=75813676

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110143249.8A Pending CN112799817A (en) 2021-02-02 2021-02-02 Micro-service resource scheduling system and method

Country Status (1)

Country Link
CN (1) CN112799817A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108845960A (en) * 2013-10-23 2018-11-20 华为技术有限公司 A kind of memory resource optimization method and device
CN108595301A (en) * 2018-03-26 2018-09-28 中国科学院计算技术研究所 A kind of server energy consumption prediction technique and system based on machine learning
CN109165729A (en) * 2018-08-22 2019-01-08 中科物栖(北京)科技有限责任公司 The dispatching method and system of neural network
CN111444009A (en) * 2019-11-15 2020-07-24 北京邮电大学 Resource allocation method and device based on deep reinforcement learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LEI LIU: "QoS-Aware Resource Scheduling for Microservices: A Multi-Model Collaborative Learning-based Approach", RESEARCHGATE, pages 4-6 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113296951A (en) * 2021-05-31 2021-08-24 阿里巴巴新加坡控股有限公司 Resource allocation scheme determination method and equipment
WO2022257301A1 (en) * 2021-06-08 2022-12-15 苏州浪潮智能科技有限公司 Method, system and apparatus for configuring computing resources of service
CN113254213B (en) * 2021-06-08 2021-10-15 苏州浪潮智能科技有限公司 Service computing resource allocation method, system and device
CN113254213A (en) * 2021-06-08 2021-08-13 苏州浪潮智能科技有限公司 Service computing resource allocation method, system and device
CN114205419A (en) * 2021-12-14 2022-03-18 上海交通大学 Data center request scheduling system and method oriented to micro-service multi-dimensional disturbance characteristics
CN114615338A (en) * 2022-04-11 2022-06-10 河海大学 Micro-service deployment method and device based on layer sharing in edge environment
CN114615338B (en) * 2022-04-11 2023-07-18 河海大学 Micro-service deployment method and device based on layer sharing in edge environment
CN115037749A (en) * 2022-06-08 2022-09-09 山东省计算中心(国家超级计算济南中心) Performance-aware intelligent multi-resource cooperative scheduling method and system for large-scale micro-service
CN115037749B (en) * 2022-06-08 2023-07-28 山东省计算中心(国家超级计算济南中心) Large-scale micro-service intelligent multi-resource collaborative scheduling method and system
CN116820784A (en) * 2023-08-30 2023-09-29 杭州谐云科技有限公司 GPU real-time scheduling method and system for reasoning task QoS
CN116820784B (en) * 2023-08-30 2023-11-07 杭州谐云科技有限公司 GPU real-time scheduling method and system for reasoning task QoS
CN117692503A (en) * 2024-02-04 2024-03-12 国网湖北省电力有限公司信息通信公司 Combined optimization method and system for dynamic microservice graph deployment and probability request routing
CN117692503B (en) * 2024-02-04 2024-04-26 国网湖北省电力有限公司信息通信公司 Combined optimization method and system for dynamic microservice graph deployment and probability request routing

Similar Documents

Publication Publication Date Title
CN112799817A (en) Micro-service resource scheduling system and method
US11301307B2 (en) Predictive analysis for migration schedulers
CN109324875B (en) Data center server power consumption management and optimization method based on reinforcement learning
CN101414269B (en) Priority based throttling for power/performance quality of service
CN110231976B (en) Load prediction-based edge computing platform container deployment method and system
CN113806018B (en) Kubernetes cluster resource mixed scheduling method based on neural network and distributed cache
CN108965014A (en) The service chaining backup method and system of QoS perception
CN104679594B (en) A kind of middleware distributed computing method
CN110262897B (en) Hadoop calculation task initial allocation method based on load prediction
CN109783225B (en) Tenant priority management method and system of multi-tenant big data platform
CN107864211B (en) Cluster resource dispatching method and system
CN113867959A (en) Training task resource scheduling method, device, equipment and medium
CN111381928B (en) Virtual machine migration method, cloud computing management platform and storage medium
TW202133055A (en) Method for establishing system resource prediction and resource management model through multi-layer correlations
CN110677499A (en) Cloud resource management application system
CA3189144A1 (en) Power aware scheduling
CN115543577B (en) Covariate-based Kubernetes resource scheduling optimization method, storage medium and device
CN115934344A (en) Heterogeneous distributed reinforcement learning calculation method, system and storage medium
CN111767145A (en) Container scheduling system, method, device and equipment
CN111309483A (en) Management method, device, equipment and storage medium of server cluster
CN112398917A (en) Real-time task scheduling method and device for multi-station fusion architecture
CN109408230B (en) Docker container deployment method and system based on energy consumption optimization
CN114978913B (en) Cross-domain deployment method and system for service function chains based on cut chains
CN114296872A (en) Scheduling method and device for container cluster management system
CN116450328A (en) Memory allocation method, memory allocation device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination