CN115499511A - Micro-service active scaling method based on spatio-temporal graph neural network load prediction - Google Patents
- Publication number: CN115499511A (application CN202211442766.6A)
- Authority: CN (China)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1004—Server selection for load balancing
- H04L67/1008—Server selection for load balancing based on parameters of servers, e.g. available memory or workload
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention relates to the field of cloud computing and discloses a micro-service active scaling method based on spatio-temporal graph neural network load prediction. A spatio-temporal graph neural network is introduced to predict the workload, which better captures the spatial relations between different micro-services in a micro-service scenario and therefore yields more accurate predictions. Based on accurate workload prediction, micro-service scaling decisions can better balance the computing resources occupied by the micro-services against the quality of service they provide.
Description
Technical Field
The invention relates to the field of cloud computing, and in particular to a micro-service active scaling method based on spatio-temporal graph neural network load prediction.
Background
With the rapid development of network services, the services provided by network application providers have become increasingly complex and feature-rich, while the services themselves expand and iterate rapidly. Under this trend, the micro-service architecture arose. In a micro-service architecture, the whole network application is divided into a number of mutually independent micro-services, each of which obtains the information it needs only by calling other micro-services through network requests. Compared with a traditional network application, the micro-service architecture achieves application modularization and offers higher scalability, fault tolerance and maintainability. In a cloud data center, allocating more computing resources to a micro-service yields better service quality, but allocating too many computing resources leads to low resource utilization and thus wastes resources. Therefore, the data center must provide an efficient elastic scaling scheme for micro-services, so as to meet their service-quality requirements while improving the utilization of computing resources as much as possible to reduce operating costs. Dynamic resource scheduling of micro-services is therefore of great interest to academia and industry, and automatic elastic scaling is one concrete realization of it. Current elastic scaling schemes fall into two main categories: threshold-based reactive algorithms and prediction-based proactive algorithms. Reactive algorithms such as SmartVM can only react after a workload change has occurred; they therefore lag behind, are prone to jitter when the workload changes rapidly, and scale frequently, causing unnecessary overhead.
Proactive algorithms such as HANSEL rely more on the accuracy of workload prediction. Existing prediction algorithms are mainly based on regression theory or conventional neural networks; they can only predict from the historical time series of each micro-service's workload and cannot reflect the spatial relations among micro-services. A model that simultaneously captures the temporal and spatial relations of the micro-service workload is therefore needed for prediction, with elastic scaling of the micro-services then performed based on that workload prediction.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a micro-service active scaling method based on spatio-temporal graph neural network load prediction, which, by scaling and scheduling the computing resources occupied by the micro-services, guarantees the service quality of the micro-services while improving the utilization of computing resources as much as possible to reduce the operating cost of the cloud computing center.
In order to solve the technical problem, the invention adopts the following technical scheme:
A micro-service active scaling method based on spatio-temporal graph neural network load prediction comprises the following steps:
step one, modeling a micro service architecture:
the whole micro-service architecture comprises N micro-services, the set of micro-services being S = {s_1, s_2, …, s_N}; the i-th micro-service s_i is represented by s_i = (w_i, c_i, q_i), wherein w_i represents the workload of micro-service s_i, c_i represents the computing resources of s_i, and q_i represents the quality of service of s_i; fixed calling relations exist among the micro-services, forming the set E = {e_ij}, wherein the calling relation e_ij represents that micro-service s_i calls micro-service s_j; when the workload of s_i changes, the calling relation e_ij causes the workload of s_j to change as well;
wherein the attributes of the i-th micro-service may also be expressed as s_i = (d_i, w_i, c_i, q_i), i.e. with one additional attribute d_i, where d_i represents the identification of micro-service s_i;
step two, predicting the working load of the micro-service:
constructing and training a spatio-temporal graph neural network consisting of a GAT network and a GRU network, denoted the GAT-GRU network;
in the GAT-GRU network, the input comprises the input data X ∈ R^(T×N×F) and the set of calling relations E, where T represents the length of the time series of the input data, N represents the number of micro-services, and F represents the number of features of the micro-service workload; the features of the micro-service workload comprise the CPU occupancy and the memory occupancy of each micro-service. The input data are first processed by a GAT layer GAT-1; the hidden state output by GAT-1 is then input into a GRU layer; the hidden state output by the GRU layer is processed by another GAT layer GAT-2 to serve as the input of the GRU layer at the next time step; finally, the hidden states output by the GRU at all time steps are merged and processed by a prediction layer, which outputs the required prediction data Y ∈ R^(T'×N×F), where T' represents the time-series length of the predicted data;
step three, scaling decision of micro-service level:
adopting a DDPG model to decide whether each micro-service is scaled based on the prediction of the micro-service workload;
the environment state of the DDPG model comprises the resource occupation condition and the service quality condition of each micro service obtained from the prediction data; the resource occupation condition comprises the CPU occupancy rate, the memory occupancy rate and the number of the work copies of the microservice; the quality of service condition comprises an average request response time of the microservice;
the action set of the DDPG model comprises, for each micro-service, capacity reduction, maintenance or capacity expansion; when the action value is larger than 1, capacity expansion is performed, the number of working copies being the action value rounded down; when the action value is smaller than -1, capacity reduction is performed, the number of working copies being the action value rounded up; when the action value is between -1 and 1, the number of working copies of the micro-service is kept unchanged;
the reward of the DDPG model is the reciprocal of the weighted average of each micro-service's average CPU occupancy, average memory occupancy and normalized request response time.
Further, the prediction layer is formed by connecting several fully-connected layers in series.
Compared with the prior art, the invention has the beneficial technical effects that:
Because a spatio-temporal graph neural network is introduced to predict the workload, the spatial relations between different micro-services in a micro-service scenario are better captured, so that more accurate predictions can be made. Based on accurate workload prediction, micro-service scaling decisions can better balance the computing resources occupied by the micro-services against the quality of service they provide. Moreover, because the invention performs proactive scaling based on prediction, it can respond to workload changes in advance, preventing the collapse of service quality or the waste of resources that occur when the system cannot respond in time to large changes in request volume.
Drawings
FIG. 1 is a block diagram of a GAT-GRU network for microservice workload prediction in accordance with the present invention;
fig. 2 is an internal work flow diagram of a GRU network;
FIG. 3 is a flowchart illustrating micro-service resource scheduling according to an embodiment of the present invention.
Detailed Description
A preferred embodiment of the present invention will be described in detail with reference to the accompanying drawings.
The implementation of the invention is based on the combination of a spatio-temporal graph neural network and a DDPG (Deep Deterministic Policy Gradient) model.
The spatio-temporal graph neural network is composed of a GRU (Gated Recurrent Unit) network and a GAT (Graph Attention Network).
The transformation performed by the GRU network is given by:

r_t = σ(W_r · [h_{t−1}, x_t] + b_r)    (1)
z_t = σ(W_z · [h_{t−1}, x_t] + b_z)    (2)
h̃_t = tanh(W_h · [r_t ⊙ h_{t−1}, x_t] + b_h)    (3)
h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t    (4)

where ⊙ denotes element-wise (Hadamard) multiplication; r_t, z_t and h̃_t represent the reset gate, the update gate and the cell gate of the GRU network, respectively; W_r, W_z and W_h are the parameters of the reset gate, the update gate and the cell gate, and b_r, b_z and b_h are their corresponding biases, all of which are learned continuously during training; σ is the sigmoid function and tanh is the hyperbolic tangent function; x_t and h_t are respectively the input and output of the GRU network at time t.
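As an illustrative sketch (not part of the patent), the GRU transformation above can be implemented directly in NumPy as a single-step cell; all weight names and sizes here are assumed:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W_r, W_z, W_h, b_r, b_z, b_h):
    """One GRU time step; the gates act on the concatenation [h_{t-1}, x_t]."""
    hx = np.concatenate([h_prev, x_t])
    r_t = sigmoid(W_r @ hx + b_r)                    # reset gate
    z_t = sigmoid(W_z @ hx + b_z)                    # update gate
    h_tilde = np.tanh(W_h @ np.concatenate([r_t * h_prev, x_t]) + b_h)  # cell gate
    return (1.0 - z_t) * h_prev + z_t * h_tilde      # new hidden state

# Tiny example: input size 2 (CPU and memory occupancy), hidden size 3.
rng = np.random.default_rng(0)
in_dim, hid = 2, 3
W_r = rng.standard_normal((hid, hid + in_dim))
W_z = rng.standard_normal((hid, hid + in_dim))
W_h = rng.standard_normal((hid, hid + in_dim))
b_r = b_z = b_h = np.zeros(hid)

h = np.zeros(hid)
for x_t in np.array([[0.2, 0.5], [0.4, 0.6]]):       # two time steps
    h = gru_step(x_t, h, W_r, W_z, W_h, b_r, b_z, b_h)
print(h.shape)  # (3,)
```

Because h_t is a convex combination of h_{t−1} and a tanh output, every component of the hidden state stays within (−1, 1).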
The internal workflow of the GRU network is shown in fig. 2.
The transformation performed by the GAT network is given by:

e_ij = LeakyReLU(aᵀ [W h_i ‖ W h_j])    (5)
α_ij = exp(e_ij) / Σ_{k∈N_i} exp(e_ik)    (6)
h′_i = σ( Σ_{j∈N_i} α_ij W h_j )    (7)

where H = {h_1, h_2, …, h_N} represents the input data of the GAT network, h_i being the feature vector of node i in the GAT network; node j is a neighbor of node i, and h_j is the feature vector of node j; h_1, h_2, …, h_N each have length F. H′ = {h′_1, h′_2, …, h′_N} represents the output data of the GAT network, h′_i being the feature vector of node i output after graph-attention aggregation, of length F′. σ is a non-linear function, LeakyReLU is the activation function, W is a weight matrix of shape F′×F, N_i represents the set of neighbor nodes of node i, the symbol ‖ denotes vector concatenation, and a is a weight vector of length 2F′, aᵀ being its transpose. The coefficient α_ij finally obtained from formula (6) is the attention coefficient, which represents the extent to which node i is affected by its neighbor node j. When a multi-head attention mechanism is adopted, the input data are processed simultaneously by several identical GAT networks, and the average or the concatenation of their outputs is taken as the output.
When the GAT network is applied to the micro-service workload prediction in the present invention, the nodes of the GAT network are the micro-services in the present invention.
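As an illustrative sketch (not from the patent), the GAT transformation above can be written as a single-head attention layer in NumPy, with micro-services as nodes; σ is taken to be tanh here, and all shapes are toy values:

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def gat_layer(H, adj, W, a):
    """Single-head graph attention: score each neighbour pair with
    LeakyReLU(a^T [W h_i || W h_j]), softmax over neighbours, then
    aggregate the transformed neighbour features (sigma = tanh)."""
    N = H.shape[0]
    WH = H @ W.T                                     # (N, F')
    H_out = np.zeros_like(WH)
    for i in range(N):
        nbrs = np.where(adj[i] > 0)[0]               # neighbour set N_i (self-loops included)
        e = np.array([leaky_relu(a @ np.concatenate([WH[i], WH[j]])) for j in nbrs])
        alpha = np.exp(e - e.max()); alpha /= alpha.sum()   # attention coefficients alpha_ij
        H_out[i] = np.tanh(alpha @ WH[nbrs])         # graph-attention aggregation
    return H_out

# 3 micro-services, 2 workload features, F' = 4; adjacency from the call relations E.
rng = np.random.default_rng(1)
H = rng.random((3, 2))
adj = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]])
W = rng.standard_normal((4, 2))
a = rng.standard_normal(8)                           # length 2F'
H_out = gat_layer(H, adj, W, a)
print(H_out.shape)  # (3, 4)
```

A multi-head version would simply run several such layers with independent W and a, then average or concatenate their outputs.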
The above two networks are used together to predict the workload of the micro-services.
The DDPG model is a reinforcement learning algorithm used to make the micro-service scaling decisions.
The DDPG model comprises four networks: the actor network μ, the critic network Q, the target actor network μ′ and the target critic network Q′. The actor network μ converts the input environment state into an action value; the critic network Q scores, in the corresponding environment state, the action value provided by the actor network; and the target networks μ′ and Q′ prevent the μ network and the Q network, respectively, from fluctuating too much during training. The main workflow of the DDPG model is as follows:
(2) Initialize the parameters θ_Q of the Q network and θ_μ of the μ network; let the target parameters θ_{Q′} and θ_Q take the same value, and the parameters θ_{μ′} and θ_μ take the same value;
(3) Initializing a memory cache;
(4) For each round:
(5) Initialize a normally distributed random variable with mean 0;
(16) End the time step; if the state is not the final state and the time does not exceed the range, return to step (7) and execute the next time step;
(17) End the round; return to step (4) and enter the next round.
In a scenario where micro-services are deployed in a cloud computing center, computing resources must be provided to many micro-services at the same time. Generally, the more computing resources a micro-service obtains, the stronger its capacity to provide services and the better its service quality can be guaranteed. However, the cloud computing center cannot allocate computing resources to micro-services without limit merely to improve their service quality, as this would lead to an unlimited increase in its operating cost. Therefore, when the workload of a micro-service is high, more computing resources should be allocated to it to guarantee its service quality; when the micro-service is relatively idle, some computing resources should be reclaimed to prevent the waste caused by low resource utilization.
Therefore, an active resource scheduling method is needed, which predicts the future workload of the micro-services by monitoring their workload data in real time, and then decides whether to scale each micro-service according to the predicted workload. The specific method is described as follows:
(1) Modeling the micro-service architecture. Let the whole micro-service architecture contain N micro-services, and use S to represent the set of micro-services, so that S = {s_1, s_2, …, s_N}. For the i-th micro-service s_i, use (d_i, w_i, c_i, q_i) to denote its attributes, where d_i represents the identification of micro-service s_i; w_i represents the workload of s_i; c_i represents the computing resources of s_i; and q_i represents the quality of service of s_i. Besides this, fixed calling relations also exist between the micro-services; they are determined at micro-service design time and are expressed as the set E = {e_ij}, where e_ij represents the calling relation from micro-service s_i to micro-service s_j. Due to the existence of this calling relation e_ij, when the workload of s_i changes, it also exerts a certain impact on s_j.
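The architecture model of step (1) maps naturally onto a small data structure. The sketch below (all names assumed, not from the patent) stores each micro-service's attributes (d_i, w_i, c_i, q_i) and the call-relation set E as a set of (caller, callee) pairs:

```python
from dataclasses import dataclass, field

@dataclass
class Microservice:
    ident: str            # d_i: identification
    workload: float       # w_i: workload
    resources: float      # c_i: computing resources
    qos: float            # q_i: quality of service (e.g. avg response time)

@dataclass
class MicroserviceArchitecture:
    services: dict = field(default_factory=dict)   # id -> Microservice
    calls: set = field(default_factory=set)        # E: set of (caller, callee) pairs

    def add_call(self, i, j):
        """Record the fixed calling relation e_ij (s_i calls s_j)."""
        self.calls.add((i, j))

    def downstream(self, i):
        """Services whose workload is affected when s_i's workload changes."""
        return {j for (a, j) in self.calls if a == i}

# Hypothetical three-service architecture.
arch = MicroserviceArchitecture()
for name in ("gateway", "orders", "billing"):
    arch.services[name] = Microservice(name, 0.0, 1.0, 0.0)
arch.add_call("gateway", "orders")
arch.add_call("orders", "billing")
print(arch.downstream("gateway"))  # {'orders'}
```

The `downstream` query captures exactly the propagation described in the text: a workload change in s_i affects the services it calls.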
(2) Predicting the micro-service workload. A model combining a graph attention network and a recurrent neural network is adopted to predict the micro-service workload. Specifically, a spatio-temporal graph neural network combining a GAT network and a GRU network is constructed and trained to perform workload prediction; it is referred to in the invention as the GAT-GRU network.
In the GAT-GRU network constructed by the invention, the input data of the network are X ∈ R^(T×N×F) and E, where T represents the length of the time series of the input data, N represents the number of micro-services, and F represents the number of features of the micro-service workload; here the matrix of each time step corresponds to the input data H of the GAT network described above, and E represents the set of calling relations of the entire micro-service architecture. The CPU occupancy and the memory occupancy of each micro-service are mainly considered, so F = 2. The input data are first processed by a GAT layer, and the hidden state it outputs is input into the GRU layer. The hidden state output by the GRU layer is then processed by another GAT layer to serve as the hidden input of the GRU at the next time step. Finally, the hidden outputs of the GRU at all time steps are merged and processed by a prediction layer, which outputs the required prediction data Y ∈ R^(T'×N×F), where T' represents the time-series length of the predicted data; here the matrix of each time step corresponds to the output data H′ of the GAT network described above. The prediction layer is formed by connecting several fully-connected layers in series. The structure of the whole network is shown in fig. 1.
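Purely to illustrate the tensor shapes flowing through this pipeline (X ∈ R^(T×N×F) in, Y ∈ R^(T'×N×F) out), the structure can be sketched with stand-in linear maps in place of the trained GAT and GRU layers; nothing here is the patent's actual network:

```python
import numpy as np

T, N, F, H, T_pred = 8, 4, 2, 16, 2   # history length, services, features, hidden size, horizon
rng = np.random.default_rng(42)
X = rng.random((T, N, F))             # CPU and memory occupancy per service per time step

W_gat1 = rng.standard_normal((F, H)) * 0.1          # stand-in for GAT-1
W_gru  = rng.standard_normal((2 * H, H)) * 0.1      # stand-in for the GRU cell
W_gat2 = rng.standard_normal((H, H)) * 0.1          # stand-in for GAT-2
W_pred = rng.standard_normal((T * H, T_pred * F)) * 0.1  # stand-in prediction layer

h = np.zeros((N, H))
hidden_states = []
for t in range(T):
    g = np.tanh(X[t] @ W_gat1)                      # GAT-1 over the service graph
    h = np.tanh(np.concatenate([h, g], axis=1) @ W_gru)  # GRU step
    hidden_states.append(h)
    h = np.tanh(h @ W_gat2)                         # GAT-2 feeds the next GRU step

merged = np.stack(hidden_states).transpose(1, 0, 2).reshape(N, T * H)  # merge all steps
Y = (merged @ W_pred).reshape(N, T_pred, F).transpose(1, 0, 2)         # prediction layer
print(Y.shape)  # (2, 4, 2)
```

The shapes confirm the description: eight historical steps of per-service CPU/memory features produce a two-step forecast for all four services.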
(3) Scaling decision at the micro-service level. A DDPG model is employed to decide, based on the prediction of the micro-service workload, whether each micro-service scales. The environment state comprises the resource occupation and the service quality of each micro-service; the resource occupation specifically comprises the CPU occupancy, the memory occupancy and the number of working copies; the service quality specifically comprises the average request response time. The action set comprises three choices for each micro-service: capacity reduction, maintenance or capacity expansion. Capacity expansion is performed when the action value is larger than 1, the number of working copies being the action value rounded down; capacity reduction is performed when the action value is smaller than -1, the number of working copies being the action value rounded up; when the action value is between -1 and 1, the number of working copies of the micro-service is kept unchanged. The reward is the reciprocal of the weighted average of each micro-service's average CPU occupancy, average memory occupancy and normalized request response time.
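The action-to-replica mapping and the reward described above can be sketched as two small functions. The patent's text is ambiguous about whether the action value is an absolute copy count or a delta; a delta is assumed here, and the reward weights are assumed equal — both are illustrative choices, not the patent's specification:

```python
import math

def apply_action(copies, action):
    """One reading of the scaling rule: expand by floor(a) when a > 1,
    shrink by |ceil(a)| when a < -1, otherwise maintain the copy count.
    Never drop below one working copy."""
    if action > 1:
        return copies + math.floor(action)          # capacity expansion, round down
    if action < -1:
        return max(1, copies + math.ceil(action))   # capacity reduction, round up
    return copies                                   # maintain

def reward(cpu_avg, mem_avg, resp_norm, w=(1/3, 1/3, 1/3)):
    """Reciprocal of the weighted average of mean CPU occupancy, mean memory
    occupancy and normalized response time (equal weights assumed)."""
    return 1.0 / (w[0] * cpu_avg + w[1] * mem_avg + w[2] * resp_norm)

print(apply_action(3, 2.7))   # 5  (expand by floor(2.7) = 2)
print(apply_action(3, -1.4))  # 2  (shrink by |ceil(-1.4)| = 1)
print(apply_action(3, 0.5))   # 3  (maintain)
print(reward(0.5, 0.5, 0.5))  # 2.0
```

The dead zone between -1 and 1 prevents the agent from scaling on every small fluctuation, which matches the jitter-avoidance motivation in the background section.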
Examples
The method is deployed on the micro-service architecture of a cloud computing center, acquiring the information of all micro-service nodes in the architecture and the calling relations between the micro-services. Then, at regular intervals, the micro-service active scaling method according to the invention performs horizontal scaling control on each micro-service, i.e. controls the number of working copies of each micro-service. The specific implementation is shown in fig. 3 and mainly includes the following steps:
monitoring the occupation condition of micro service resources: the method mainly monitors the number of the micro-service working copies, the CPU occupancy rate and the memory occupancy rate of all the micro-service working copies;
Predicting the micro-service workload: the resource occupation of all micro-services at the next moment is predicted from their resource occupation over the past period, specifically using the GAT-GRU network described above. The GAT-GRU network is then continually trained on the actual micro-service workload data collected at the next moment, so that its prediction capability keeps improving;
Performing the micro-service-level scaling decision: after the micro-service workload at the next moment has been predicted, the predicted workload data are input into the DDPG model to obtain the output action values, and for each micro-service a capacity-expansion operation, a capacity-reduction operation or no operation is determined according to its action value. The networks are then trained and updated according to the DDPG algorithm. After a period of training, the DDPG model reaches a relatively stable state and provides good scaling decisions;
performing horizontal scaling on all micro-services in the micro-service architecture according to scaling decisions: the number of the micro-service work copies is directly controlled by the micro-service work copy controller, so that the scaling decision is applied to the micro-service architecture.
The circles in FIG. 3 represent working copies of the microservice, the circles within a box represent working copies of the same microservice, the unfilled circles represent working copies in a normal operating state, the black filled circles represent working copies in an initialized state, and the dotted filled circles represent working copies in a destroyed state. The microservice architecture of FIG. 3 contains four microservices, which the present invention (1) monitors for resource usage data; (2) predict their future workload; (3) making a horizontal scaling decision for each micro-service; (4) and applying the horizontal scaling decision to the micro-service architecture through the work copy controller, and adjusting the quantity of the work copies of each micro-service.
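The four steps of the embodiment form a simple monitor-predict-decide-act control loop. The sketch below wires them together with stub functions (purely illustrative; the real system would call the trained GAT-GRU and DDPG models and a Kubernetes-style copy controller):

```python
def control_step(metrics_history, predict, decide, scale):
    """One horizontal-scaling cycle: monitor -> predict -> decide -> act."""
    forecast = predict(metrics_history)          # GAT-GRU workload prediction
    actions = decide(forecast)                   # DDPG per-service action values
    for svc, action in actions.items():
        scale(svc, action)                       # working-copy controller applies it

# Stub wiring for a single hypothetical service.
log = []
control_step(
    metrics_history=[{"orders": (0.7, 0.6)}],                  # (cpu, mem) samples
    predict=lambda h: {"orders": (0.9, 0.8)},                  # stands in for GAT-GRU
    decide=lambda f: {"orders": 1.6},                          # stands in for DDPG
    scale=lambda svc, a: log.append((svc, "expand" if a > 1 else "keep")),
)
print(log)  # [('orders', 'expand')]
```

Running this cycle at each interval reproduces the periodic horizontal-scaling behaviour shown in fig. 3.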
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein, and any reference signs in the claims are not intended to be construed as limiting the claim concerned.
Furthermore, it should be understood that although the present description refers to embodiments, not every embodiment may contain only a single embodiment, and such description is for clarity only, and those skilled in the art should integrate the description, and the embodiments may be combined as appropriate to form other embodiments understood by those skilled in the art.
Claims (1)
1. A micro-service active scaling method based on spatio-temporal graph neural network load prediction, comprising the following steps:
step one, modeling a micro service architecture:
the whole micro-service architecture comprises N micro-services, the set of micro-services being S = {s_1, s_2, …, s_N}; the i-th micro-service s_i is represented by s_i = (w_i, c_i, q_i), wherein w_i represents the workload of micro-service s_i, c_i represents the computing resources of s_i, and q_i represents the quality of service of s_i; fixed calling relations exist among the micro-services, forming the set E = {e_ij}, wherein the calling relation e_ij represents that micro-service s_i calls micro-service s_j; when the workload of s_i changes, the calling relation e_ij causes the workload of s_j to change as well;
step two, predicting the working load of the micro-service:
constructing and training a spatio-temporal graph neural network consisting of a GAT network and a GRU network, denoted the GAT-GRU network;
in the GAT-GRU network, the input comprises the input data X ∈ R^(T×N×F) and the set of calling relations E, where T represents the length of the time series of the input data, N represents the number of micro-services, and F represents the number of features of the micro-service workload; the features of the micro-service workload comprise the CPU occupancy and the memory occupancy of each micro-service. The input data are first processed by a GAT layer GAT-1; the hidden state output by GAT-1 is then input into a GRU layer; the hidden state output by the GRU layer is processed by another GAT layer GAT-2 to serve as the input of the GRU layer at the next time step; finally, the hidden states output by the GRU at all time steps are merged and processed by a prediction layer, which outputs the required prediction data Y ∈ R^(T'×N×F), where T' represents the time-series length of the predicted data;
step three, scaling decision of micro-service level:
adopting a DDPG model to decide whether each micro-service is scaled based on the prediction of the micro-service workload;
the environment state of the DDPG model comprises the resource occupation condition and the service quality condition of each micro service obtained from the prediction data; the resource occupation condition comprises the CPU occupancy rate, the memory occupancy rate and the number of the work copies of the microservice; the quality of service condition comprises an average request response time of the microservice;
the action set of the DDPG model comprises, for each micro-service, capacity reduction, maintenance or capacity expansion; when the action value is larger than 1, capacity expansion is performed, the number of working copies being the action value rounded down; when the action value is smaller than -1, capacity reduction is performed, the number of working copies being the action value rounded up; when the action value is between -1 and 1, the number of working copies of the micro-service is kept unchanged;
the reward of the DDPG model is the reciprocal of the weighted average of each micro-service's average CPU occupancy, average memory occupancy and normalized request response time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211442766.6A CN115499511B (en) | 2022-11-18 | 2022-11-18 | Micro-service active scaling method based on space-time diagram neural network load prediction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115499511A true CN115499511A (en) | 2022-12-20 |
CN115499511B CN115499511B (en) | 2023-03-24 |
Family
ID=85116144
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211442766.6A Active CN115499511B (en) | 2022-11-18 | 2022-11-18 | Micro-service active scaling method based on space-time diagram neural network load prediction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115499511B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116257363A (en) * | 2023-05-12 | 2023-06-13 | 中国科学技术大学先进技术研究院 | Resource scheduling method, device, equipment and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190379605A1 (en) * | 2018-06-08 | 2019-12-12 | Cisco Technology, Inc. | Inferring device load and availability in a network by observing weak signal network based metrics |
CN112199150A (en) * | 2020-08-13 | 2021-01-08 | 北京航空航天大学 | Online application dynamic capacity expansion and contraction method based on micro-service calling dependency perception |
US20210266358A1 (en) * | 2020-02-24 | 2021-08-26 | Netapp, Inc. | Quality of service (qos) settings of volumes in a distributed storage system |
CN114020326A (en) * | 2021-11-04 | 2022-02-08 | 砺剑防务技术(新疆)有限公司 | Micro-service response time prediction method and system based on graph neural network |
WO2022167840A1 (en) * | 2021-02-04 | 2022-08-11 | Telefonaktiebolaget Lm Ericsson (Publ) | Profiling workloads using graph based neural networks in a cloud native environment |
CN115037749A (en) * | 2022-06-08 | 2022-09-09 | 山东省计算中心(国家超级计算济南中心) | Performance-aware intelligent multi-resource cooperative scheduling method and system for large-scale micro-service |
- 2022
- 2022-11-18: CN CN202211442766.6A patent/CN115499511B/en (active)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190379605A1 (en) * | 2018-06-08 | 2019-12-12 | Cisco Technology, Inc. | Inferring device load and availability in a network by observing weak signal network based metrics |
US20210266358A1 (en) * | 2020-02-24 | 2021-08-26 | Netapp, Inc. | Quality of service (qos) settings of volumes in a distributed storage system |
CN112199150A (en) * | 2020-08-13 | 2021-01-08 | 北京航空航天大学 | Online application dynamic capacity expansion and contraction method based on micro-service calling dependency perception |
WO2022167840A1 (en) * | 2021-02-04 | 2022-08-11 | Telefonaktiebolaget Lm Ericsson (Publ) | Profiling workloads using graph based neural networks in a cloud native environment |
CN114020326A (en) * | 2021-11-04 | 2022-02-08 | 砺剑防务技术(新疆)有限公司 | Micro-service response time prediction method and system based on graph neural network |
CN115037749A (en) * | 2022-06-08 | 2022-09-09 | 山东省计算中心(国家超级计算济南中心) | Performance-aware intelligent multi-resource cooperative scheduling method and system for large-scale micro-service |
Non-Patent Citations (1)
Title |
---|
Geng Desheng (耿德胜), "Container-level elastic resource provisioning method for micro-service architecture", Information & Computer (Theoretical Edition) *
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116257363A (en) * | 2023-05-12 | 2023-06-13 | 中国科学技术大学先进技术研究院 | Resource scheduling method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN115499511B (en) | 2023-03-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Liu et al. | A hierarchical framework of cloud resource allocation and power management using deep reinforcement learning | |
CN111835827A (en) | Internet of things edge computing task unloading method and system | |
CN110780938B (en) | Computing task unloading method based on differential evolution in mobile cloud environment | |
CN113852432B (en) | Spectrum Prediction Sensing Method Based on RCS-GRU Model | |
CN115686846B (en) | Container cluster online deployment method integrating graph neural network and reinforcement learning in edge calculation | |
CN115499511B (en) | Micro-service active scaling method based on space-time diagram neural network load prediction | |
CN116126534A (en) | Cloud resource dynamic expansion method and system | |
CN113902116A (en) | Deep learning model-oriented reasoning batch processing optimization method and system | |
Bian et al. | Neural task scheduling with reinforcement learning for fog computing systems | |
Qazi et al. | Towards quantum computing algorithms for datacenter workload predictions | |
WO2023272726A1 (en) | Cloud server cluster load scheduling method and system, terminal, and storage medium | |
Chai et al. | A computation offloading algorithm based on multi-objective evolutionary optimization in mobile edge computing | |
da Silva et al. | Online machine learning for auto-scaling in the edge computing | |
CN113553149A (en) | Cloud server cluster load scheduling method, system, terminal and storage medium | |
CN116009990A (en) | Cloud edge collaborative element reinforcement learning computing unloading method based on wide attention mechanism | |
CN115883371A (en) | Virtual network function placement method based on learning optimization method in edge-cloud collaborative system | |
CN115934349A (en) | Resource scheduling method, device, equipment and computer readable storage medium | |
Liu et al. | Hidden markov model based spot price prediction for cloud computing | |
CN113157344B (en) | DRL-based energy consumption perception task unloading method in mobile edge computing environment | |
Nguyen et al. | Reinforcement learning for maintenance decision-making of multi-state component systems with imperfect maintenance | |
Damaševičius et al. | Short time prediction of cloud server round-trip time using a hybrid neuro-fuzzy network | |
Kumaran et al. | Deep Reinforcement Learning algorithms for Low Latency Edge Computing Systems | |
Jananee et al. | Allocation of cloud resources based on prediction and performing auto-scaling of workload | |
CN111917854B (en) | Cooperation type migration decision method and system facing MCC | |
Lee | System-agnostic meta-learning for MDP-based dynamic scheduling via descriptive policy |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |