CN114816699A - Data center job scheduling method and system based on temperature prediction - Google Patents

Publication number
CN114816699A
Authority
CN
China
Prior art keywords
server
scheduling
scheduled
cabinet
job
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210372549.8A
Other languages
Chinese (zh)
Inventor
杨美红
陈泳杰
王继彬
郭莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qilu University of Technology
Shandong Computer Science Center National Super Computing Center in Jinan
Original Assignee
Shandong Computer Science Center National Super Computing Center in Jinan
Application filed by Shandong Computer Science Center National Super Computing Center in Jinan
Priority to CN202210372549.8A
Publication of CN114816699A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/4893 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues taking into account power or heat criteria
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/329 Power saving characterised by the action undertaken by task scheduling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3058 Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data center job scheduling method and system based on temperature prediction. The method comprises the following steps: acquiring relevant parameters of the data center cabinets, relevant parameters of the servers in each cabinet, the size of the resources required by the jobs to be scheduled in a job queue, and relevant parameters of the cooling equipment; preprocessing the acquired data and performing feature screening on the preprocessed data; predicting the temperature of each cabinet over a set future time period based on a trained machine learning model and the screened features, and selecting the cabinet with the lowest temperature; performing initial scheduling and optimized scheduling of the jobs to be scheduled across the servers of the lowest-temperature cabinet, and selecting an optimal mapping scheme between the servers and the jobs through multiple iterations; and scheduling the jobs according to the optimal mapping scheme.

Description

Data center job scheduling method and system based on temperature prediction
Technical Field
The invention relates to the technical field of data center job scheduling, in particular to a data center job scheduling method and system based on temperature prediction.
Background
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
With the rapid development of data centers, their computing performance has improved greatly and their computing scale keeps expanding. However, large-scale computing cluster systems consume more and more energy. The energy consumption of a data center is divided mainly into two parts: computing energy consumption and cooling energy consumption. Computing energy consumption mainly comprises the energy consumed by server systems, network systems, and the like. Cooling energy consumption is mainly the energy consumed by cooling facilities such as air conditioners.
There are two main reasons for the high energy consumption of data centers. The first is low resource utilization: depending on the job scheduling strategy, a large number of jobs may be spread across many servers, leaving the servers of the data center in a low-load state, so that overall resource utilization is low and the computing energy consumption of the data center increases. The second is the hot spot problem: at present, data centers mainly adopt mixed-load scheduling, and a server cannot know the type and size of a job in advance, so hot spots occur in some cabinets; the set temperature of the air conditioner is then lowered, which increases the cooling energy consumption of the data center.
To address these problems, existing energy-saving techniques work at two levels, hardware and software. At the hardware level, dynamic voltage and frequency scaling (DVFS) is mainly adopted: frequency and voltage are adjusted dynamically according to the computing demands of the programs running on the chip, thereby saving energy; however, if the scheduler does not know the tasks accurately, it cannot guarantee that a given frequency will reduce energy consumption. At the software level, sensing and node allocation techniques are mainly adopted: the prior art either assigns jobs to server nodes with low temperature to avoid hot spots, or concentrates jobs on one or more servers to improve resource utilization. However, the prior art does not consider temperature and resource utilization together; that is, it does not select a server that has both a low temperature and suitable resources for job scheduling so as to minimize the energy consumption of the data center.
Disclosure of Invention
To overcome the deficiencies of the prior art, the invention provides a data center job scheduling method and system based on temperature prediction. The temperature of the cabinets in the data center is predicted through machine learning, and the cabinet with the lowest temperature is selected. On this basis, servers and jobs are mapped by the algorithm provided by the invention. The aim is to solve the problems of one-sided optimization and low energy-saving efficiency in the related art and, ultimately, to reduce the overall energy consumption of the data center.
In a first aspect, the invention provides a data center job scheduling method based on temperature prediction;
the data center job scheduling method based on temperature prediction comprises the following steps:
acquiring relevant parameters of a data center cabinet, relevant parameters of a server in the cabinet, the size of resources required by jobs to be scheduled in a job queue and relevant parameters of cooling equipment;
preprocessing the acquired data, and screening the characteristics of the preprocessed data; predicting the temperature of the cabinet in a set time period in the future based on the trained machine learning model and the characteristics obtained by screening, and selecting the cabinet with the lowest temperature;
performing initial scheduling and optimized scheduling on the jobs to be scheduled in a plurality of servers of the cabinet with the lowest temperature, and selecting an optimal mapping scheme between the servers and the jobs to be scheduled through multiple iterations; and realizing the scheduling of the job to be scheduled according to the optimal mapping scheme.
In a second aspect, the present invention provides a data center job scheduling system based on temperature prediction;
data center job scheduling system based on temperature prediction includes:
a resource monitoring module configured to: acquiring relevant parameters of a data center cabinet, relevant parameters of a server in the cabinet, the size of resources required by jobs to be scheduled in a job queue and relevant parameters of cooling equipment;
a temperature prediction module configured to: preprocessing the acquired data, and screening the characteristics of the preprocessed data; predicting the temperature of the cabinet in a set time period in the future based on the trained machine learning model and the characteristics obtained by screening, and selecting the cabinet with the lowest temperature;
a job scheduling module configured to: performing initial scheduling and optimized scheduling on the jobs to be scheduled in a plurality of servers of the cabinet with the lowest temperature, and selecting an optimal mapping scheme between the servers and the jobs to be scheduled through multiple iterations; and realizing the scheduling of the job to be scheduled according to the optimal mapping scheme.
Compared with the prior art, the invention has the beneficial effects that:
According to the method and the system, data on the cabinets and servers in the data center, the resource requirements of the jobs in the job queue, and the relevant parameters of the air conditioning equipment in the data center are collected; a model is trained on part of the collected data using a set machine learning method; and the temperature of each cabinet over a future period of time is predicted. The cabinet with the lowest predicted temperature is selected for random (initial) scheduling and optimized scheduling, finally yielding the optimized scheduling scheme. Deploying jobs in the cabinet with the lowest average temperature alleviates the hot spot problem in the data center and reduces the energy consumption of its air conditioning. Meanwhile, multiple jobs are assigned to one or more servers subject to the resource utilization threshold, and idle servers are shut down or put to sleep, which addresses the low resource utilization of the data center and reduces its computing energy consumption. Overall, the energy consumption of the data center is reduced.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate exemplary embodiments of the invention and, together with the description, serve to explain the invention and not to limit it.
FIG. 1 is a flowchart of a data center job scheduling method based on temperature prediction according to an embodiment of the present invention;
FIG. 2 is an architecture diagram of a data center job scheduling method based on temperature prediction according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a resource monitoring module according to an embodiment of the present invention;
FIG. 4 is a flowchart of the job scheduling operation according to an embodiment of the present invention.
Detailed Description
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and it should be understood that the terms "comprises" and "comprising", and any variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiments and features of the embodiments of the present invention may be combined with each other without conflict.
All data used in the embodiments are obtained and applied lawfully, in compliance with laws and regulations and with user consent.
Embodiment One
The embodiment provides a data center job scheduling method based on temperature prediction;
As shown in FIG. 1, a data center job scheduling method based on temperature prediction includes:
S100: acquiring relevant parameters of a data center cabinet, relevant parameters of a server in the cabinet, the size of resources required by jobs to be scheduled in a job queue and relevant parameters of cooling equipment;
S200: preprocessing the acquired data, and screening the characteristics of the preprocessed data; predicting the temperature of the cabinet in a set time period in the future based on the trained machine learning model and the characteristics obtained by screening, and selecting the cabinet with the lowest temperature;
S300: performing initial scheduling and optimized scheduling on the jobs to be scheduled in a plurality of servers of the cabinet with the lowest temperature, and selecting an optimal mapping scheme between the servers and the jobs to be scheduled through multiple iterations; and realizing the scheduling of the job to be scheduled according to the optimal mapping scheme.
Further, the data center cabinet related parameters include: the power $P_{Cab_n}$ of cabinet $Cab_n$. The cabinet list is $Cab = \{Cab_1, Cab_2, \ldots, Cab_n\}$.
Further, the relevant parameters of the servers in the cabinet include: the available CPU size $C_{S_i}^{avail}$ of server $S_i$ and the available memory size $M_{S_i}^{avail}$ of server $S_i$. The server list is $S = \{S_1, S_2, \ldots, S_n\}$.
Further, the size of the resources required by the jobs to be scheduled in the job queue includes: the CPU size $C_{J_j}^{S_i}$ required by job $J_j$ in server $S_i$ and the memory size $M_{J_j}^{S_i}$ required by job $J_j$ in server $S_i$. The job list is $J = \{J_1, J_2, \ldots, J_n\}$.
Further, the relevant parameters of the cooling equipment include: the air inlet temperature $T_{in}$ of the air conditioning equipment, the air outlet temperature $T_{out}$ of the air conditioning equipment, the inlet water temperature $T_{in}^{w}$ of the air conditioner water cooling system, the return water temperature $T_{ret}^{w}$ of the air conditioner water cooling system, the air inlet humidity $H_{in}$ of the air conditioning equipment, and the air outlet humidity $H_{out}$ of the air conditioning equipment.
Further, preprocessing the acquired data includes:
deleting incomplete data and null data, and merging the cleaned data.
Illustratively, preprocessing the acquired data includes:
filtering out records with a value of 0 and incomplete records, and merging the fields using time as the primary key.
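The preprocessing step can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the field names (`time`, `cabinet_power`, `supply_air_temp`) and the function `merge_and_clean` are hypothetical and do not come from the source, which specifies only that zero-valued and incomplete data are filtered and that fields are merged on time.

```python
from collections import defaultdict


def merge_and_clean(sources, required_fields):
    """Merge records from several monitoring sources on their timestamp and
    drop rows that are incomplete or that contain a 0 reading."""
    merged = defaultdict(dict)
    for records in sources:
        for rec in records:
            ts = rec["time"]  # time acts as the primary key
            for key, value in rec.items():
                if key != "time":
                    merged[ts][key] = value

    cleaned = []
    for ts in sorted(merged):
        row = merged[ts]
        values = [row.get(field) for field in required_fields]
        # Skip rows with a missing field (incomplete) or a 0 value.
        if any(v is None or v == 0 for v in values):
            continue
        cleaned.append({"time": ts, **{f: row[f] for f in required_fields}})
    return cleaned


# Hypothetical sample data, not taken from the source.
cabinet = [
    {"time": "2021-12-01 00:00", "cabinet_power": 4.36},
    {"time": "2021-12-01 00:05", "cabinet_power": 0},     # zero reading
    {"time": "2021-12-01 00:10", "cabinet_power": 4.76},
]
cooling = [
    {"time": "2021-12-01 00:00", "supply_air_temp": 18.23},
    {"time": "2021-12-01 00:05", "supply_air_temp": 19.6},
    # no 00:10 record, so that merged row is incomplete
]
rows = merge_and_clean([cabinet, cooling], ["cabinet_power", "supply_air_temp"])
```

Only the 00:00 row survives here: the 00:05 row has a zero power reading and the 00:10 row lacks a cooling record.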
Further, performing feature screening on the preprocessed data includes:
selecting, from the preprocessed data, the feature values related to the cabinet temperature, specifically: the power $P_{Cab_n}$ of cabinet $Cab_n$, the air inlet temperature $T_{in}$ of the air conditioning equipment, the air outlet temperature $T_{out}$ of the air conditioning equipment, the inlet water temperature $T_{in}^{w}$ of the air conditioner water cooling system, the return water temperature $T_{ret}^{w}$ of the air conditioner water cooling system, the air inlet humidity $H_{in}$ of the air conditioning equipment, and the air outlet humidity $H_{out}$ of the air conditioning equipment.
Sample values of the selected features are shown in Table 1:
Table 1. Feature values related to cabinet temperature
Time | Cabinet power | Return air humidity | Return air temperature | Outlet water temperature | Inlet water temperature | Supply air humidity | Supply air temperature
2021/12/1 0:00:00 | 4.36 | 32.83 | 21.02 | 18.25 | 16.6 | 42.96 | 18.23
2021/12/1 0:05:00 | 4.78 | 23.14 | 31.2 | 20.3 | 16.5 | 39.57 | 19.6
2021/12/1 0:10:00 | 4.76 | 22.94 | 31.23 | 20.27 | 16.5 | 39.54 | 19.6
Further, the temperature of each cabinet over a set future time period is predicted based on the trained machine learning model and the screened features, and the cabinet with the lowest temperature is selected. The training process of the machine learning model comprises the following steps:
constructing a machine learning prediction model;
constructing a training set; the training set includes data for a plurality of cabinets, and the data for each cabinet includes a plurality of feature values over a historical time period for which the cabinet temperature at the current time point is known;
the feature values in the training set include: the power $P_{Cab_n}$ of cabinet $Cab_n$, the air inlet temperature $T_{in}$ of the air conditioning equipment, the air outlet temperature $T_{out}$ of the air conditioning equipment, the inlet water temperature $T_{in}^{w}$ of the air conditioner water cooling system, the return water temperature $T_{ret}^{w}$ of the air conditioner water cooling system, the air inlet humidity $H_{in}$ of the air conditioning equipment, and the air outlet humidity $H_{out}$ of the air conditioning equipment;
inputting the training set into the machine learning model and training the model, and stopping training when the loss function value of the model no longer decreases or the number of iterations reaches the set number, to obtain the trained machine learning model.
A random forest model is selected as the machine learning prediction model.
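One way to picture the training-set construction is as a sliding window over the time-ordered samples of one cabinet. The sketch below is an assumption-laden illustration: the function `build_training_set`, the field names, and the `window` and `horizon` parameters are hypothetical, since the source does not state how historical feature values are paired with known temperatures. The resulting `X` and `y` lists are in the shape a random forest regressor (for example, scikit-learn's `RandomForestRegressor`) would accept.

```python
def build_training_set(series, feature_names, target_name, window, horizon):
    """Turn a time-ordered list of records into (X, y) pairs.

    X[k]: feature values of `window` consecutive samples, flattened.
    y[k]: target value `horizon` samples after the end of that window.
    """
    X, y = [], []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        history = series[start:start + window]
        features = [rec[name] for rec in history for name in feature_names]
        target = series[start + window + horizon - 1][target_name]
        X.append(features)
        y.append(target)
    return X, y


# Hypothetical samples: values are made up for illustration only.
samples = [
    {"cabinet_power": 4.0 + k, "supply_air_temp": 18.0 + k, "cabinet_temp": 24.0 + k}
    for k in range(5)
]
X, y = build_training_set(
    samples,
    feature_names=["cabinet_power", "supply_air_temp"],
    target_name="cabinet_temp",
    window=2,
    horizon=1,
)
```

With five samples, a window of two, and a horizon of one, three training pairs result, each with four feature values.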
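Model training: the feature values are used to train the machine learning model, which then predicts the temperature of each cabinet in the data center over a future period of time. The prediction is expressed as formula (1):
$$\hat{T}_{Cab_n} = f\left(P_{Cab_n}, T_{in}, T_{out}, T_{in}^{w}, T_{ret}^{w}, H_{in}, H_{out}\right) \tag{1}$$
where $f$ denotes the trained model and $\hat{T}_{Cab_n}$ denotes the predicted temperature of cabinet $Cab_n$.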
Further, the cabinet $Cab_{min}$ with the lowest temperature is selected using formula (2):
$$Cab_{min} = \arg\min_{Cab_n \in Cab} \hat{T}_{Cab_n} \tag{2}$$
where $\hat{T}_{Cab_n}$ represents the predicted temperature of cabinet $Cab_n$, and $Cab_{min}$ represents the cabinet with the lowest predicted temperature.
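Selecting the lowest-temperature cabinet amounts to an argmin over the predicted temperatures. A minimal sketch, assuming the predictions are held in a dictionary keyed by cabinet name (the function name and the sample values are hypothetical):

```python
def coolest_cabinet(predicted_temps):
    """Return the name of the cabinet with the lowest predicted temperature."""
    if not predicted_temps:
        raise ValueError("no cabinet predictions available")
    return min(predicted_temps, key=predicted_temps.get)


# Hypothetical predicted temperatures per cabinet.
predictions = {"Cab1": 27.4, "Cab2": 24.9, "Cab3": 26.1}
target = coolest_cabinet(predictions)
```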
Further, the jobs to be scheduled are initially scheduled and optimally scheduled across a plurality of servers of the cabinet with the lowest temperature, and an optimal mapping scheme between the servers and the jobs to be scheduled is selected through multiple iterations; the jobs are then scheduled according to the optimal mapping scheme. The specific process comprises the following steps:
initial scheduling: randomly mapping the jobs $J = \{J_1, J_2, \ldots, J_n\}$ in the job queue onto the servers $S = \{S_1, S_2, \ldots, S_n\}$ in the lowest-temperature cabinet;
optimized scheduling: further optimizing the result of the initial scheduling by calculating the remaining resource size $R_{S_i}$ of each server and sorting the servers in ascending order of $R_{S_i}$; then, following the sorted order, determining whether the server with the least remaining resources can satisfy the requirements of the jobs on the server with the most remaining resources while staying within the set resource utilization threshold k; if so, the jobs are reassigned; otherwise, the traversal continues.
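The initial scheduling step is a random mapping of jobs onto servers. The sketch below illustrates that step only; the function name `initial_schedule` and the optional seeded generator are assumptions made for reproducibility, and no capacity check is performed at this stage because the source describes none for the random mapping itself.

```python
import random


def initial_schedule(jobs, servers, rng=None):
    """Randomly map each job onto one server of the chosen cabinet."""
    if rng is None:
        rng = random.Random()
    mapping = {server: [] for server in servers}
    for job in jobs:
        mapping[rng.choice(servers)].append(job)
    return mapping


# Hypothetical job and server names.
jobs = ["J1", "J2", "J3", "J4", "J5"]
servers = ["S1", "S2", "S3"]
mapping = initial_schedule(jobs, servers, random.Random(42))
```

Passing a seeded `random.Random` makes a run repeatable, which helps when comparing several iterations of initial plus optimized scheduling.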
Further, as shown in FIG. 4, the optimized scheduling specifically comprises the following steps:
Step 1): according to the result of the initial scheduling, calculate the remaining resource size $R_{S_i}$ of each server $S_i$ in the current scheduling scheme, and sort the obtained $R_{S_i}$ values in ascending order; execute step 2);
Step 2): traverse the corresponding server nodes according to the sorted remaining resource sizes, select the server node currently ranked first and the server node currently ranked last, and execute step 3);
Step 3): determine whether the remaining resources $R_{S_i}$ of the first-ranked server node satisfy the CPU size $C_{J_j}^{S_i}$ and the memory size $M_{J_j}^{S_i}$ required by job $J_j$ in the last-ranked server node; if satisfied, execute step 4); if not, execute step 7);
Step 4): reassign the jobs in the last-ranked server node to the first-ranked server node, remove the last-ranked server node from the queue, update the position information of the server queue, and execute step 5);
Step 5): determine whether the first-ranked server still has remaining resources while satisfying the resource utilization threshold k; if so, continue with step 4); otherwise, execute step 6);
Step 6): delete the first-ranked server node from the server queue, update the ordering information of the server queue, and return to step 2);
Step 7): determine whether the traversal of the server node queue has ended; if not, execute step 6); if so, obtain the final mapping relation and finish the scheduling.
Further, step 1) may be replaced by:
determining, according to the result of the initial scheduling, whether the resource utilization of each server $S_i$ is greater than or equal to the set resource utilization threshold k; if so, performing the initial scheduling again; if not, calculating the remaining resource size $R_{S_i}$ of each server $S_i$ in the current scheduling scheme and sorting the obtained $R_{S_i}$ values in ascending order.
Further, the remaining resource size $R_{S_i}$ of each server $S_i$ in the current scheduling scheme is calculated as shown in formula (3):
$$R_{S_i} = \left(C_{S_i}^{total} - \sum_{J_j \in S_i} C_{J_j}^{S_i},\; M_{S_i}^{total} - \sum_{J_j \in S_i} M_{J_j}^{S_i}\right) \tag{3}$$
where $C_{S_i}^{total}$ represents the total amount of CPU in server $S_i$, and $M_{S_i}^{total}$ represents the total amount of memory in server $S_i$.
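The remaining-resource calculation can be illustrated as total capacity minus the demands of the jobs currently placed on a server. The sketch below is hedged: the dictionary layout and function names are hypothetical, and `remaining_score`, which collapses CPU and memory into one number for sorting, is purely an assumption, because the source does not say how the two quantities are combined when servers are ordered.

```python
def remaining_resources(server, assigned_jobs):
    """Return (cpu_left, mem_left): server totals minus the demands of its jobs."""
    cpu_left = server["cpu_total"] - sum(job["cpu"] for job in assigned_jobs)
    mem_left = server["mem_total"] - sum(job["mem"] for job in assigned_jobs)
    return cpu_left, mem_left


def remaining_score(server, assigned_jobs):
    """ASSUMPTION: scalar sort key, the sum of the free CPU fraction and the
    free memory fraction. Not specified by the source."""
    cpu_left, mem_left = remaining_resources(server, assigned_jobs)
    return cpu_left / server["cpu_total"] + mem_left / server["mem_total"]


# Hypothetical capacities (cores, GB) and placements.
servers = {
    "S1": {"cpu_total": 16, "mem_total": 64},
    "S2": {"cpu_total": 16, "mem_total": 64},
}
placement = {
    "S1": [{"cpu": 4, "mem": 16}, {"cpu": 2, "mem": 8}],
    "S2": [{"cpu": 12, "mem": 48}],
}
left = {name: remaining_resources(servers[name], placement[name]) for name in servers}
# Ascending order: the most heavily loaded server comes first.
ascending = sorted(servers, key=lambda n: remaining_score(servers[n], placement[n]))
```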
Further, selecting the optimal mapping scheme between the servers and the jobs to be scheduled specifically comprises the following steps:
according to the optimized scheduling process, obtaining the optimized mapping relation between servers and jobs, and calculating the computing energy consumption $E_{S_i}$ of server $S_i$ under the current mapping relation, as shown in formula (4):
$$E_{S_i} = E_{S_i}^{cpu} + E_{S_i}^{mem} \tag{4}$$
where $E_{S_i}^{cpu}$ represents the energy consumption of the CPU in server $S_i$, and $E_{S_i}^{mem}$ represents the energy consumption of the memory in server $S_i$;
performing multiple iterations of the initial scheduling and the optimized scheduling, and finally selecting the mapping relation with the lowest computing energy consumption as the optimal mapping scheme.
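The iterate-and-keep-the-best selection can be sketched as a simple loop. Everything about the energy model here is an assumption: the source states only that computing energy is the sum of CPU and memory energy, so the linear cost coefficients (`idle_cost`, `cost_per_core`, `cost_per_gb`), the treatment of empty servers as powered down, and the stand-in `random_schedule` (used in place of initial plus optimized scheduling) are hypothetical.

```python
import random


def computing_energy(mapping, servers, jobs):
    """Sum CPU and memory energy over all servers that host at least one job.
    ASSUMPTION: a hypothetical linear cost model, not taken from the source."""
    total = 0.0
    for name, assigned in mapping.items():
        if not assigned:
            continue  # idle servers are assumed to be shut down or asleep
        spec = servers[name]
        cpu_used = sum(jobs[j]["cpu"] for j in assigned)
        mem_used = sum(jobs[j]["mem"] for j in assigned)
        e_cpu = spec["idle_cost"] + spec["cost_per_core"] * cpu_used
        e_mem = spec["cost_per_gb"] * mem_used
        total += e_cpu + e_mem
    return total


def random_schedule(jobs, servers, rng):
    """Stand-in for one round of initial plus optimized scheduling."""
    names = list(servers)
    mapping = {name: [] for name in names}
    for job in jobs:
        mapping[rng.choice(names)].append(job)
    return mapping


def best_mapping(jobs, servers, schedule_once, iterations, seed=0):
    """Run the scheduler several times and keep the lowest-energy mapping."""
    rng = random.Random(seed)
    best, best_energy = None, float("inf")
    for _ in range(iterations):
        candidate = schedule_once(jobs, servers, rng)
        energy = computing_energy(candidate, servers, jobs)
        if energy < best_energy:
            best, best_energy = candidate, energy
    return best, best_energy


servers = {
    name: {"idle_cost": 50, "cost_per_core": 10, "cost_per_gb": 0.5}
    for name in ("S1", "S2")
}
jobs = {"J1": {"cpu": 2, "mem": 4}, "J2": {"cpu": 2, "mem": 4}}
best, best_energy = best_mapping(jobs, servers, random_schedule, iterations=50)
```

Under this toy model, placing both jobs on one server costs 94.0 while spreading them over two costs 144.0, which is why consolidation lowers the computed energy.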
Illustratively, referring to FIG. 3, a resource monitoring process specifically comprises the following steps:
Step S321: according to the result of the random mapping, determine whether the resource utilization of each server $S_i$ is greater than or equal to the set resource utilization threshold k.
Step S322: if so, perform the initial scheduling again.
Step S323: if not, calculate the remaining resources $R_{S_i}$ of each server according to the formula and sort them in ascending order.
Step S324: traverse the sorted list, initializing i = 0 and j = n, where i and j each denote a server index.
Step S325: determine whether the remaining resources $R_{S_i}$ of server $S_i$ satisfy the resources required by the jobs of server $S_j$.
Step S326: if so, calculate the resource utilization of server $S_i$ after the jobs in $S_j$ are assigned to $S_i$.
Step S3261: then determine whether the resource utilization of $S_i$ exceeds the set resource utilization threshold k.
Step S3262: if so, let i = i + 1 and return to step S324.
Step S3263: if not, reassign the jobs in $S_j$ to $S_i$.
Step S3264: let j = j - 1 and return to step S325.
Step S327: if not (that is, if the check in step S325 fails), determine whether the traversal of the mapping relation between servers and jobs is complete.
Step S328: if not, let i = i + 1 and return to step S324.
Step S329: if so, obtain the final mapping and end.
Step S330: final scheduling scheme: determine the scheduling scheme according to the two preceding stages, calculate the computing energy consumption $E_{S_i}$ under the resulting mapping relation, and, over a plurality of iterations, select the mapping relation with the minimum $E_{S_i}$ for scheduling.
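The traversal in steps S323 to S329 reads naturally as a two-pointer pass over the sorted server list. The following is a sketch under stated assumptions rather than the patented procedure: the function `consolidate`, the data layout, the scalar sort key (sum of free CPU and memory fractions), and the use of the larger of CPU and memory utilization for the threshold check are all hypothetical choices. Indices are 0-based, so the tail pointer starts at n - 1.

```python
def consolidate(mapping, servers, jobs, k):
    """Move jobs from lightly loaded servers onto heavily loaded ones.

    mapping: {server: [job names]}, servers/jobs: {name: {"cpu", "mem"}},
    k: utilization threshold in (0, 1]. Returns a new mapping.
    """
    mapping = {name: list(assigned) for name, assigned in mapping.items()}

    def used(name):
        cpu = sum(jobs[j]["cpu"] for j in mapping[name])
        mem = sum(jobs[j]["mem"] for j in mapping[name])
        return cpu, mem

    def free_fraction(name):
        # ASSUMPTION: scalar sort key; the source does not define one.
        cpu, mem = used(name)
        spec = servers[name]
        return (spec["cpu"] - cpu) / spec["cpu"] + (spec["mem"] - mem) / spec["mem"]

    order = sorted(mapping, key=free_fraction)      # S323: ascending
    i, j = 0, len(order) - 1                        # S324
    while i < j:
        dst, src = order[i], order[j]
        dst_cpu, dst_mem = used(dst)
        src_cpu, src_mem = used(src)
        cap = servers[dst]
        cpu_after, mem_after = dst_cpu + src_cpu, dst_mem + src_mem
        fits = cpu_after <= cap["cpu"] and mem_after <= cap["mem"]  # S325
        if not fits:
            i += 1                                  # S327/S328
            continue
        utilization = max(cpu_after / cap["cpu"], mem_after / cap["mem"])  # S326
        if utilization > k:                         # S3261/S3262
            i += 1
        else:                                       # S3263/S3264
            mapping[dst].extend(mapping[src])
            mapping[src] = []
            j -= 1
    return mapping


# Hypothetical capacities, job demands, and starting placement.
servers = {name: {"cpu": 16, "mem": 64} for name in ("A", "B", "C")}
jobs = {
    "j1": {"cpu": 8, "mem": 16},
    "j2": {"cpu": 4, "mem": 8},
    "j3": {"cpu": 2, "mem": 4},
}
start = {"A": ["j1"], "B": ["j2"], "C": ["j3"]}
packed = consolidate(start, servers, jobs, k=0.9)
limited = consolidate(start, servers, jobs, k=0.7)
```

With k = 0.9 all three jobs end up on server A, leaving B and C empty; with k = 0.7 the second move would push A to 87.5% CPU utilization, so job j2 stays on B.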
The embodiment of the invention seeks protection for the proposed data center job scheduling method based on temperature prediction, which has the following effects. Machine-learning-based cabinet temperature prediction is performed on the monitored data to obtain the temperature of each cabinet of the data center over a future period of time, and the cabinet with the lowest temperature is selected for job scheduling; this avoids hot spots in the data center and excessive lowering of the air conditioner temperature, and so reduces the cooling energy consumption of the data center. Performing the initial scheduling and the optimized scheduling within the lowest-temperature cabinet avoids low resource utilization on some of its servers, improves server resource utilization, and reduces the computing energy consumption of the data center. The overall energy consumption of the data center is thereby reduced.
Embodiment Two
The embodiment provides a data center job scheduling system based on temperature prediction;
As shown in FIG. 2, the data center job scheduling system based on temperature prediction includes:
a resource monitoring module configured to: acquiring relevant parameters of a data center cabinet, relevant parameters of a server in the cabinet, the size of resources required by jobs to be scheduled in a job queue and relevant parameters of cooling equipment;
a temperature prediction module configured to: preprocessing the acquired data, and screening the characteristics of the preprocessed data; predicting the temperature of the cabinet in a set time period in the future based on the trained machine learning model and the characteristics obtained by screening, and selecting the cabinet with the lowest temperature;
a job scheduling module configured to: performing initial scheduling and optimized scheduling on the jobs to be scheduled in a plurality of servers of the cabinet with the lowest temperature, and selecting an optimal mapping scheme between the servers and the jobs to be scheduled through multiple iterations; and realizing the scheduling of the job to be scheduled according to the optimal mapping scheme.
It should be noted that the resource monitoring module, the temperature prediction module, and the job scheduling module correspond to steps S100 to S300 in the first embodiment; the examples and application scenarios implemented by the modules are the same as those of the corresponding steps, but are not limited to what is disclosed in the first embodiment. It should also be noted that the modules described above, as part of a system, may be implemented in a computer system, for example as a set of computer-executable instructions.
In the foregoing embodiments, the descriptions of the embodiments have different emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The proposed system can be implemented in other ways. The system embodiment described above is merely illustrative: the division into modules is merely a logical division, and other divisions are possible in actual implementation; for example, multiple modules may be combined or integrated into another system, or some features may be omitted or not executed.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A data center job scheduling method based on temperature prediction, characterized by comprising the following steps:
acquiring relevant parameters of a data center cabinet, relevant parameters of a server in the cabinet, the size of resources required by jobs to be scheduled in a job queue and relevant parameters of cooling equipment;
preprocessing the acquired data, and screening the characteristics of the preprocessed data; predicting the temperature of the cabinet in a set time period in the future based on the trained machine learning model and the characteristics obtained by screening, and selecting the cabinet with the lowest temperature;
performing initial scheduling and optimized scheduling on the jobs to be scheduled in a plurality of servers of the cabinet with the lowest temperature, and selecting an optimal mapping scheme between the servers and the jobs to be scheduled through multiple iterations; and realizing the scheduling of the job to be scheduled according to the optimal mapping scheme.
2. The data center job scheduling method based on temperature prediction according to claim 1, wherein performing initial scheduling and optimized scheduling of the jobs to be scheduled on the servers of the lowest-temperature cabinet, selecting an optimal mapping scheme between the servers and the jobs to be scheduled through multiple iterations, and scheduling the jobs according to the optimal mapping scheme specifically comprises:
initial scheduling: randomly mapping the jobs $J = \{J_1, J_2, \ldots, J_n\}$ in the job queue onto the servers $S = \{S_1, S_2, \ldots, S_n\}$ in the lowest-temperature cabinet;
optimized scheduling: further optimizing the result of the initial scheduling by calculating the remaining resource size (Figure FDA0003589401440000011) of each server S and sorting the servers in ascending order of that size; then, following the sorted order, judging whether the server with the least remaining resources can satisfy the requirements of the jobs on the server with the most remaining resources while staying within the set resource utilization threshold k; if so, reallocating the jobs; otherwise, continuing the traversal.
3. The data center job scheduling method based on temperature prediction according to claim 1 or 2, wherein the optimized scheduling specifically comprises:
step 1): according to the result of the initial scheduling, calculating the remaining resource size (Figure FDA0003589401440000021) of each server $S_i$ in the current scheduling scheme, and sorting the obtained values (Figure FDA0003589401440000022) in ascending order; executing step 2);
step 2): traversing the server nodes in the order of their sorted remaining resource sizes, selecting the server node currently ranked first and the server node currently ranked last, and executing step 3);
step 3): judging whether the remaining resource size (Figure FDA0003589401440000023) of the server node currently ranked first satisfies the CPU size (Figure FDA0003589401440000024) and the memory size (Figure FDA0003589401440000025) required by job $J_j$ on the server node ranked last; if so, executing step 4); if not, executing step 7);
step 4): assigning the jobs on the server node ranked last to the server node ranked first, removing the server node ranked last from the queue, updating the position information of the server queue, and executing step 5);
step 5): judging whether the server ranked first still has remaining resources while satisfying the resource utilization threshold k; if so, continuing to execute step 4); otherwise, executing step 6);
step 6): deleting the server node currently ranked first from the server queue, updating the ranking information of the server queue, and returning to step 2);
step 7): judging whether the traversal of the server node queue has finished; if not, executing step 6); if so, obtaining the final mapping relation and ending the scheduling.
4. The method for scheduling data center jobs according to claim 3, wherein step 1) is replaced with:
judging, according to the result of the initial scheduling, whether the resource utilization of each server $S_i$ is greater than or equal to the set resource utilization threshold k; if so, performing the initial scheduling again; if not, calculating the remaining resource size (Figure FDA0003589401440000026) of each server $S_i$ in the current scheduling scheme and sorting the obtained values (Figure FDA0003589401440000027) in ascending order.
5. The method according to claim 4, wherein the remaining resource size (Figure FDA0003589401440000031) of each server $S_i$ in the current scheduling scheme is calculated as follows:
Figure FDA0003589401440000032
wherein (Figure FDA0003589401440000033) represents the total amount of CPU in server $S_i$; (Figure FDA0003589401440000034) represents the total amount of memory in server $S_i$; and α and β represent weights.
6. The method according to claim 1, wherein selecting the optimal mapping scheme between the servers and the jobs to be scheduled specifically comprises:
obtaining the optimized mapping relation between servers and jobs from the optimized scheduling process, and calculating the computing energy consumption (Figure FDA0003589401440000035) of server $S_i$ under the current mapping relation:
Figure FDA0003589401440000036
wherein (Figure FDA0003589401440000037) represents the energy consumption of the CPU in server $S_i$, and (Figure FDA0003589401440000038) represents the energy consumption of the memory in server $S_i$;
performing multiple iterations of the initial scheduling and the optimized scheduling, and finally selecting the mapping relation with the lowest value of (Figure FDA0003589401440000039) as the optimal mapping scheme.
7. The method for scheduling data center jobs based on temperature prediction according to claim 1, wherein performing feature screening on the preprocessed data comprises:
selecting, from the preprocessed data, feature values related to the cabinet temperature, the feature values comprising: the quantity (Figure FDA00035894014400000310) of cabinet $Cab_n$, the air inlet temperature $T_{in}$ of the air-conditioning equipment, the air outlet temperature $T_{out}$ of the air-conditioning equipment, the inlet water temperature (Figure FDA00035894014400000311) of the air-conditioner water cooling system, the return water temperature (Figure FDA00035894014400000312) of the air-conditioner water cooling system, the air inlet humidity $H_{in}$ of the air-conditioning equipment, and the air outlet humidity $H_{out}$ of the air-conditioning equipment.
8. A data center job scheduling system based on temperature prediction, characterized by comprising:
a resource monitoring module configured to acquire relevant parameters of the data center cabinets, relevant parameters of the servers in each cabinet, the size of the resources required by the jobs to be scheduled in the job queue, and relevant parameters of the cooling equipment;
a temperature prediction module configured to preprocess the acquired data and perform feature screening on the preprocessed data, to predict the temperature of each cabinet over a set future time period based on a trained machine learning model and the screened features, and to select the cabinet with the lowest predicted temperature;
a job scheduling module configured to perform initial scheduling and optimized scheduling of the jobs to be scheduled on the servers of the lowest-temperature cabinet, to select an optimal mapping scheme between the servers and the jobs to be scheduled through multiple iterations, and to schedule the jobs according to the optimal mapping scheme.
9. The system according to claim 8, wherein performing initial scheduling and optimized scheduling of the jobs to be scheduled on the servers of the lowest-temperature cabinet, selecting an optimal mapping scheme between the servers and the jobs to be scheduled through multiple iterations, and scheduling the jobs according to the optimal mapping scheme specifically comprises:
initial scheduling: randomly mapping the jobs $J = \{J_1, J_2, \ldots, J_n\}$ in the job queue onto the servers $S = \{S_1, S_2, \ldots, S_n\}$ in the lowest-temperature cabinet;
optimized scheduling: further optimizing the result of the initial scheduling by calculating the remaining resource size (Figure FDA0003589401440000041) of each server S and sorting the servers in ascending order of that size; then, following the sorted order, judging whether the server with the least remaining resources can satisfy the requirements of the jobs on the server with the most remaining resources while staying within the set resource utilization threshold k; if so, reallocating the jobs; otherwise, continuing the traversal.
10. The data center job scheduling system based on temperature prediction according to claim 8, wherein the optimized scheduling specifically comprises:
step 1): according to the result of the initial scheduling, calculating the remaining resource size (Figure FDA0003589401440000042) of each server $S_i$ in the current scheduling scheme, and sorting the obtained values (Figure FDA0003589401440000043) in ascending order; executing step 2);
step 2): traversing the server nodes in the order of their sorted remaining resource sizes, selecting the server node currently ranked first and the server node currently ranked last, and executing step 3);
step 3): judging whether the remaining resource size (Figure FDA0003589401440000051) of the server node currently ranked first satisfies the CPU size (Figure FDA0003589401440000052) and the memory size (Figure FDA0003589401440000053) required by job $J_j$ on the server node ranked last; if so, executing step 4); if not, executing step 7);
step 4): assigning the jobs on the server node ranked last to the server node ranked first, removing the server node ranked last from the queue, updating the position information of the server queue, and executing step 5);
step 5): judging whether the server ranked first still has remaining resources while satisfying the resource utilization threshold k; if so, continuing to execute step 4); otherwise, executing step 6);
step 6): deleting the server node currently ranked first from the server queue, updating the ranking information of the server queue, and returning to step 2);
step 7): judging whether the traversal of the server node queue has finished; if not, executing step 6); if so, obtaining the final mapping relation and ending the scheduling.
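The initial scheduling, optimized scheduling, and iterative selection recited in the claims above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: the remaining-resource formula is assumed to be a weighted sum of free CPU and free memory (the actual formula is given only as an image), the energy model is a hypothetical linear one, the initial mapping picks randomly among servers that still fit the job instead of re-running the random mapping, and all class and function names are invented for the example.

```python
import copy
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    name: str
    cpu: float   # CPU demand of the job
    mem: float   # memory demand of the job


@dataclass
class Server:
    name: str
    cpu_total: float
    mem_total: float
    jobs: List[Job] = field(default_factory=list)

    def cpu_used(self) -> float:
        return sum(j.cpu for j in self.jobs)

    def mem_used(self) -> float:
        return sum(j.mem for j in self.jobs)

    def remaining(self, alpha: float, beta: float) -> float:
        # ASSUMED form: weighted sum of free CPU and free memory.
        return (alpha * (self.cpu_total - self.cpu_used())
                + beta * (self.mem_total - self.mem_used()))

    def fits(self, job: Job, k: float) -> bool:
        # The job fits if CPU and memory utilization both stay within k.
        return (self.cpu_used() + job.cpu <= k * self.cpu_total
                and self.mem_used() + job.mem <= k * self.mem_total)


def initial_schedule(jobs, servers, k, rng):
    """Randomly map each job to a server that can still hold it."""
    for job in jobs:
        candidates = [s for s in servers if s.fits(job, k)]
        if not candidates:
            raise RuntimeError(f"no server can host {job.name}")
        rng.choice(candidates).jobs.append(job)


def optimize(servers, k, alpha=0.5, beta=0.5):
    """Move jobs from the emptiest server onto the fullest one."""
    def key(s):
        return s.remaining(alpha, beta)

    queue = sorted(servers, key=key)      # ascending remaining resources
    while len(queue) > 1:
        first, last = queue[0], queue[-1]
        for job in list(last.jobs):
            if first.fits(job, k):
                last.jobs.remove(job)
                first.jobs.append(job)
        if last.jobs:
            queue.pop(0)                  # first cannot absorb all jobs of last
        else:
            queue.pop()                   # last is drained, drop it from the queue
        queue.sort(key=key)


def energy(servers, idle=100.0, cpu_w=2.0, mem_w=0.5):
    # HYPOTHETICAL linear model: only servers that host jobs consume energy.
    return sum(idle + cpu_w * s.cpu_used() + mem_w * s.mem_used()
               for s in servers if s.jobs)


def best_mapping(jobs, servers, k=0.8, iterations=20, seed=0):
    """Repeat initial + optimized scheduling, keep the lowest-energy result."""
    rng = random.Random(seed)
    best, best_e = None, float("inf")
    for _ in range(iterations):
        trial = copy.deepcopy(servers)
        initial_schedule(jobs, trial, k, rng)
        optimize(trial, k)
        e = energy(trial)
        if e < best_e:
            best, best_e = trial, e
    return best, best_e
```

Calling `best_mapping(jobs, servers)` with a list of `Job` objects and empty `Server` objects returns the consolidated server list and its estimated energy. Because each pass through the loop in `optimize` removes one server from the queue, the traversal always terminates.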
CN202210372549.8A 2022-04-11 2022-04-11 Data center job scheduling method and system based on temperature prediction Pending CN114816699A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210372549.8A CN114816699A (en) 2022-04-11 2022-04-11 Data center job scheduling method and system based on temperature prediction

Publications (1)

Publication Number Publication Date
CN114816699A true CN114816699A (en) 2022-07-29

Family

ID=82534527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210372549.8A Pending CN114816699A (en) 2022-04-11 2022-04-11 Data center job scheduling method and system based on temperature prediction

Country Status (1)

Country Link
CN (1) CN114816699A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115220900A (en) * 2022-09-19 2022-10-21 山东省计算中心(国家超级计算济南中心) Energy-saving scheduling method and system based on operation power consumption prediction
CN115220900B (en) * 2022-09-19 2022-12-13 山东省计算中心(国家超级计算济南中心) Energy-saving scheduling method and system based on operation power consumption prediction
CN115685941A (en) * 2022-11-04 2023-02-03 中国电子工程设计院有限公司 Machine room operation regulation and control method and device based on cabinet hot spot temperature prediction
CN117707741A (en) * 2024-02-05 2024-03-15 山东省计算中心(国家超级计算济南中心) Energy consumption balanced scheduling method and system based on spatial position
CN117707741B (en) * 2024-02-05 2024-05-24 山东省计算中心(国家超级计算济南中心) Energy consumption balanced scheduling method and system based on spatial position

Similar Documents

Publication Publication Date Title
CN114816699A (en) Data center job scheduling method and system based on temperature prediction
CN110096349B (en) Job scheduling method based on cluster node load state prediction
CN109800066B (en) Energy-saving scheduling method and system for data center
CN104698843B (en) A kind of data center's energy-saving control method based on Model Predictive Control
Wallace et al. A data driven scheduling approach for power management on hpc systems
Rajabzadeh et al. Energy-aware framework with Markov chain-based parallel simulated annealing algorithm for dynamic management of virtual machines in cloud data centers
KR20160005367A (en) Power-aware thread scheduling and dynamic use of processors
CN107861796B (en) Virtual machine scheduling method supporting energy consumption optimization of cloud data center
Al-Daoud et al. Power-aware linear programming based scheduling for heterogeneous computer clusters
CN115220900B (en) Energy-saving scheduling method and system based on operation power consumption prediction
CN110362392A (en) A kind of ETL method for scheduling task, system, equipment and storage medium
Hao et al. An adaptive algorithm for scheduling parallel jobs in meteorological Cloud
KR101770736B1 (en) Method for reducing power consumption of system software using query scheduling of application and apparatus for reducing power consumption using said method
CN103455131B (en) A kind of based on method for scheduling task energy consumption minimized in the embedded system of probability
Hao et al. Adaptive energy-aware scheduling method in a meteorological cloud
Jiang et al. An energy-aware virtual machine migration strategy based on three-way decisions
Peng et al. Energy-aware scheduling of workflow using a heuristic method on green cloud
Escobar et al. Energy‐aware load balancing of parallel evolutionary algorithms with heavy fitness functions in heterogeneous CPU‐GPU architectures
Than et al. Energy-saving resource allocation in cloud data centers
Tian et al. Modeling and analyzing power management policies in server farms using stochastic petri nets
CN109614210B (en) Storm big data energy-saving scheduling method based on energy consumption perception
Barone et al. An approach to forecast queue time in adaptive scheduling: how to mediate system efficiency and users satisfaction
Song et al. A deep reinforcement learning-based task scheduling algorithm for energy efficiency in data centers
CN117251044A (en) Cloud server dynamic energy consumption management method and system based on ARIMA technology
Ohmura et al. Toward building a digital twin of job scheduling and power management on an hpc system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230328

Address after: 250014 No. 19, ASTRI Road, Lixia District, Shandong, Ji'nan

Applicant after: SHANDONG COMPUTER SCIENCE CENTER(NATIONAL SUPERCOMPUTER CENTER IN JINAN)

Applicant after: Qilu University of Technology (Shandong Academy of Sciences)

Address before: 250014 No. 19, ASTRI Road, Lixia District, Shandong, Ji'nan

Applicant before: SHANDONG COMPUTER SCIENCE CENTER(NATIONAL SUPERCOMPUTER CENTER IN JINAN)
