CN107861606A - A heterogeneous multi-core power capping method coordinating DVFS and task mapping - Google Patents
- Publication number
- CN107861606A
- Authority
- CN
- China
- Prior art keywords
- cpu
- gpu
- power consumption
- time
- power
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/324—Power saving characterised by the action undertaken by lowering clock frequency
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/3296—Power saving characterised by the action undertaken by lowering the supply or operating voltage
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Power Sources (AREA)
Abstract
The present invention discloses a heterogeneous multi-core power capping method that coordinates DVFS and task mapping. First, a script is implemented for the heterogeneous system that separately measures compute-node power, CPU power, and GPU power once a program has finished executing, and the selected parallel benchmark programs are modified to record the execution time of each kernel. Then, with the CPU and GPU set to different frequencies, the application is run on the CPU only and on the GPU only to obtain detailed profiling information: total execution time, per-kernel execution time, compute-node power, CPU power, and GPU power. From this profiling information a prediction model is built, consisting of an execution-time model and a power model. Finally, the prediction model is used to compute the system power and execution time for each combination of CPU frequency, GPU frequency, and task allocation scheme; the results are entered into a configuration table, and an improved greedy algorithm searches the table for the optimal configuration. The invention improves system performance while keeping system power within the budget.
Description
Technical field
The invention belongs to the field of computer architecture, and specifically relates to a heterogeneous multi-core power capping method that coordinates DVFS and task mapping.
Background technology
After years of continuous research and development, advanced architectures typified by multi-core processors have gradually replaced single-core processors as the main route to higher processor performance. Compared with homogeneous multi-core processors, heterogeneous multi-core platforms can achieve better performance. Power capping is a technique that keeps the power consumption of a heterogeneous system below a preset level. Power consumption and heat dissipation limit further performance gains on heterogeneous multi-core systems. Modern processors can tolerate only a certain level of power consumption before being damaged, so a system needs a way to enforce an upper bound on processor power. The most common power budgeting technique today is dynamic voltage and frequency scaling (DVFS), which runs hardware components at different frequencies and therefore at different power levels. When DVFS is used to limit the power of a heterogeneous system, however, load imbalance can arise between the CPU and the GPU. Decomposing a parallel program into tasks that can execute concurrently and mapping each task to the most suitable processor makes full use of the system's computing capability and improves performance, but such mapping schemes usually do not take system power into account. The present invention proposes a scheme that combines DVFS with task mapping to improve system performance while keeping system power within the budget.
Summary of the invention
The present invention proposes a heterogeneous multi-core power capping method that coordinates DVFS and task mapping, improving system performance while keeping system power within the budget. First, a script is implemented for the heterogeneous system that separately measures compute-node power, CPU power, and GPU power once a program has finished executing, and the selected parallel benchmark programs are modified to record the execution time of each kernel. Then, with the CPU and GPU set to different frequencies, the application is run on the CPU only and on the GPU only to obtain detailed profiling information: total execution time, per-kernel execution time, compute-node power, CPU power, and GPU power. From this profiling information a prediction model is designed, consisting of an execution-time model and a power model. Finally, the prediction model is used to compute the system power and execution time for each combination of CPU frequency, GPU frequency, and task allocation scheme, and the results are entered into a configuration table. An improved greedy algorithm then searches for the optimal configuration (CPU frequency, GPU frequency, task mapping table).
To achieve the above objective, the present invention adopts the following technical solution.
A heterogeneous multi-core power capping method coordinating DVFS and task mapping, which combines DVFS with task mapping to keep system power below the budgeted power while pursuing system performance, comprises the following steps:
Step 1: implement measurement of the application's total execution time, CPU power, and GPU power after the application finishes executing.
Step 2: modify the selected parallel benchmark programs to obtain the execution time of each kernel in the program.
Step 3: execute the application on the CPU only and on the GPU only, at each selectable CPU (or GPU) frequency, and obtain detailed profiling information: total program execution time, execution time of each kernel in the program, total system power, CPU power, and GPU power.
Step 4: design a prediction model that predicts the power and execution time of different task mapping schemes at different CPU and GPU frequencies. The inputs of the prediction model are the CPU frequency, the GPU frequency, and the task mapping scheme. Task mapping here means mapping each kernel as a whole onto either the CPU or the GPU, rather than splitting a kernel in some proportion so that it runs on the CPU and GPU simultaneously. The outputs of the prediction model are the predicted total program execution time and the predicted system power. The prediction model consists of an execution-time model and a power model.
Step 4.1: execution-time model
The application's total execution time can be obtained from the execution time of each kernel in the program and the corresponding data transfer time. The total program execution time is given by formula ①,
where $f_{cpu}$ and $f_{gpu}$ denote the CPU frequency and GPU frequency, respectively, and $T_i(f_{cpu}, f_{gpu})$ denotes the execution time of the i-th kernel plus the data transfer time it requires, given by formula ②.
The first part of formula ② is the execution time and the second part is the data transfer time. The kernel execution times were obtained in Step 2. H2D and D2H denote the cost of transferring data from host to device and from device to host, respectively. Required equals 1 or 0 and indicates whether kernel k needs data d. The data may already be on the device, in which case no transfer is needed, so OnDevice indicates whether the data is on the device. The data size is denoted by size.
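Formulas ① and ② can be written out from the definitions above. The following is a reconstruction under stated assumptions, not the patent's own typesetting: the summation in ① follows directly from the text, while the arrangement of the transfer term in ②, the symbol $t^{exec}_i$ for the profiled kernel time, the set $D$ of data objects, and the per-byte cost $c_i$ (taken as H2D when kernel $i$ runs on the GPU and D2H when it runs on the CPU) are assumptions. The kernel index is written $i$ throughout, where the text uses $k$ inside Required.

```latex
% (1) total time is the sum over kernels
T(f_{cpu}, f_{gpu}) = \sum_{i} T_i(f_{cpu}, f_{gpu})

% (2) kernel time = execution time + transfers for data that is
%     required but not yet resident where the kernel runs (assumed form)
T_i(f_{cpu}, f_{gpu}) = t^{exec}_i(f_{cpu}, f_{gpu})
  + \sum_{d \in D} \mathrm{Required}(i, d)\,
    \bigl(1 - \mathrm{OnDevice}(d)\bigr)\,
    \mathrm{size}(d)\, c_i
```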
Step 4.2: power model
System power is the sum of three parts: idle power, CPU power $P_{cpu}$, and GPU power $P_{gpu}$. System power is given by formula ③.
$P = P_{idle}(f_{cpu}, f_{gpu}) + P_{cpu}(f_{cpu}, f_{gpu}) + P_{gpu}(f_{cpu}, f_{gpu})$ ③
where $P_{idle}$ denotes idle power, which depends on the CPU and GPU frequencies but not on the task mapping. It can be obtained by subtracting the CPU power from the total power measured while executing the application only on the CPU, as given by formula ④.
The measured quantities in formula ④ denote, respectively, the total system power, the CPU power, and the GPU power obtained when the application runs only on the CPU; all of them depend on the frequency settings.
During execution, because of data dependencies between kernels, the CPU and GPU are not always busy, so CPU power and GPU power vary with the given task mapping scheme. To account for this, we assume that power is proportional to the ratio of busy time to total execution time; CPU power and GPU power can then be expressed by formulas ⑤ and ⑥, respectively.
Here $\lambda_{cpu}$ and $\lambda_{gpu}$ denote the busy-time ratios of the CPU and the GPU, respectively. The GPU-only terms in these formulas denote the CPU power and GPU power measured when the application runs only on the GPU, and the remaining two terms denote estimates of the maximum dynamic power of the CPU and the GPU, respectively.
Because $\lambda_{cpu}$ and $\lambda_{gpu}$ are the busy-time ratios of the CPU and the GPU, they can be obtained from the profiling information measured in Step 3. During program execution, the busy time of each device is defined as the sum of the actual execution times of the kernels on that device. $\lambda_{cpu}$ and $\lambda_{gpu}$ are given by formulas ⑦ and ⑧, respectively.
$t_{cpu}$ denotes the CPU busy time, $t_{gpu}$ denotes the GPU busy time, and $t_{total}$ denotes the application's total execution time.
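Formulas ④ to ⑧ can be sketched from the descriptions above. This is a hedged reconstruction: ⑦ and ⑧ follow directly from the definitions of busy time and total time; ④ follows the literal wording (total power minus CPU power in the CPU-only run), although the text also defines a GPU power term for that run whose role is not stated; ⑤ and ⑥ are the simplest forms consistent with the stated proportionality assumption. The superscript $cpu\text{-}run$ and the hatted symbols $\hat{P}^{max}$ are naming assumptions, and how the maximum dynamic power estimates are derived from the CPU-only and GPU-only measurements is not recoverable from the text.

```latex
% (4) idle power from the CPU-only profiling run (literal reading)
P_{idle}(f_{cpu}, f_{gpu}) = P^{cpu\text{-}run}_{total} - P^{cpu\text{-}run}_{cpu}

% (5), (6) device power scales with the busy-time ratio (assumed form)
P_{cpu}(f_{cpu}, f_{gpu}) = \lambda_{cpu}\, \hat{P}^{max}_{cpu}
P_{gpu}(f_{cpu}, f_{gpu}) = \lambda_{gpu}\, \hat{P}^{max}_{gpu}

% (7), (8) busy-time ratios
\lambda_{cpu} = \frac{t_{cpu}}{t_{total}}
\lambda_{gpu} = \frac{t_{gpu}}{t_{total}}
```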
Step 5: build the configuration parameter table from the prediction model.
Using the execution-time model and the power model, compute the execution time and power under different CPU frequencies, GPU frequencies, and task mapping schemes, and enter them into the configuration parameter table.
Step 6: search the configuration parameter table for the optimal parameter set using an improved greedy algorithm. The search has two stages. First, for a given CPU frequency and GPU frequency, a greedy algorithm finds the task mapping scheme with the shortest execution time, and the predicted system power under this configuration is obtained from the task mapping, the CPU frequency, the GPU frequency, and the power model. Then the CPU and GPU frequencies are changed and the predicted system power is recomputed as in the first stage, finally yielding the optimal configuration that stays within the budgeted power.
Step 6.1: given a CPU frequency and a GPU frequency, search for the task mapping scheme with the shortest execution time and derive the predicted system power from the power model.
Step 6.2: following the selectable CPU and GPU frequency settings, change the CPU frequency and GPU frequency and repeat Step 6.1 to obtain the mapping scheme with the best execution time for each frequency combination, and compute the system power for that mapping scheme. Given the budgeted power, this yields the optimal frequency selection and mapping scheme that stay within the budget.
Compared with the prior art, the present invention has the following advantages:
DVFS and task mapping are combined so that together they keep system power within the budget while preserving system performance. Most existing power capping techniques rely on dynamic voltage and frequency scaling, because device frequency has the greatest influence on system power, but they do not consider the load imbalance that changing the CPU and GPU frequencies can cause. Decomposing a parallel program into tasks that can execute concurrently and mapping each task to the most suitable processor makes full use of the system's computing capability and improves performance, but such mapping schemes usually do not take system power into account. The present invention therefore combines the two optimization strategies, DVFS and task mapping, to improve system performance while limiting system power to a given budget level.
Brief description of the drawings
To make the purpose and scheme of the present invention easier to understand, the invention is further described below with reference to the figures.
Fig. 1 is the architecture diagram of the CPU-GPU heterogeneous multi-core system. The system is simulated with gem5-gpu; a 4-core CPU and a GPU composed of 8 CUs are integrated on the same chip.
Fig. 2 is a schematic of the power capping scheme based on detailed profiling information in the present invention.
Fig. 3 is a schematic of the fine-grained synchronization between the CPU and GPU under conventional task data partitioning.
Fig. 4 is a schematic of the CPU and GPU busy time under the kernel-level task partitioning used in the present invention, and of the CPU and GPU idle time caused by waiting for task data.
Embodiment
The present invention is further described below with reference to the accompanying drawings.
Fig. 1 shows the heterogeneous multi-core system built with the gem5-gpu simulator. It simulates a heterogeneous architecture in which a 4-core CPU and a GPU composed of 8 CUs are integrated on the same chip. In gem5-gpu the simulated architecture can be changed flexibly through configuration files, and gem5-gpu supports DVFS.
In the heterogeneous multi-core system simulated with gem5-gpu, the present invention implements a power capping method that combines DVFS with task mapping, comprising the following specific steps:
Step 1: implement measurement of the compute node's total execution time, CPU power, and GPU power after the application finishes executing.
In gem5-gpu, when an application finishes executing, a file stats.txt containing all program execution information is generated automatically, and it includes the program execution time. The McPAT module measures CPU power independently, and the GPUWattch module measures GPU power independently; by configuring the McPAT and GPUWattch modules in gem5-gpu, the CPU power and GPU power can be obtained after the program finishes executing.
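As an illustration of how the execution time could be pulled out of the simulator's statistics file, the sketch below scans the file text for the simulated-seconds entry. The stat names `sim_seconds` (older gem5 releases) and `simSeconds` (newer releases) and the "name value # comment" layout are assumptions about the stats format, the helper `read_sim_seconds` is hypothetical, and the sample text is made up for the example.

```python
import re

def read_sim_seconds(stats_text):
    """Return the simulated time in seconds from the text of a gem5 stats file."""
    # Assumed stat names: sim_seconds (older gem5) or simSeconds (newer gem5).
    pattern = re.compile(r"\s*(?:sim_seconds|simSeconds)\s+([0-9.eE+-]+)")
    for line in stats_text.splitlines():
        match = pattern.match(line)
        if match:
            return float(match.group(1))
    raise ValueError("no simulated-seconds entry found")

# Made-up excerpt in the usual "name value # comment" layout.
sample = """
---------- Begin Simulation Statistics ----------
sim_seconds                                  0.012345    # Number of seconds simulated
sim_ticks                                 12345000000    # Number of ticks simulated
"""

print(read_sim_seconds(sample))
```

In practice the text would be read from the statistics file of each profiling run, once per frequency setting.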
Step 2: modify the selected parallel benchmark programs to obtain the execution time of each kernel in the program.
OpenCL programs can execute on different devices, including CPUs and GPUs. The benchmarks used in the present invention are the OpenCL version of the NAS Parallel Benchmarks. Each benchmark has different characteristics: some programs contain more than 60 kernels, while others have only two. The test programs are rewritten to collect the execution time of each kernel.
Step 3: execute the application on the CPU only and on the GPU only, at each selectable CPU (or GPU) frequency, and obtain detailed profiling information as the input of the profiling-based power capping scheme.
Fig. 2 shows the flow of the profiling-based power capping strategy. CPU Profile Runs and GPU Profile Runs denote the profiling information obtained by executing the application only on the CPU and only on the GPU, respectively. This profiling information is used to build the prediction model of Step 4, consisting of the execution-time model and the power model, shown as Time model and Power model in Fig. 2. The inputs of the prediction model are the CPU frequency, the GPU frequency, and the task mapping scheme; the outputs are the corresponding predicted execution time and predicted system power. The different inputs and their outputs form a configuration table. Based on this table, the improved greedy algorithm searches for the optimal mapping scheme and frequency setting as described in Step 6, shown as Distribute parallel tasks and set device frequencies in Fig. 2.
Step 4: build the prediction model, consisting of an execution-time model and a power model, to predict the application's execution time and the system power in the heterogeneous multi-core environment. The profiling information obtained in Step 3 by running the application on the CPU only and on the GPU only at different frequency settings is the basis of the prediction model. Formulas ① to ⑧ show that the profiling information, which includes each kernel's execution information, CPU power, and GPU power, is sufficient to predict the program's execution time and power under different CPU frequencies, GPU frequencies, and task mapping schemes.
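The prediction model described in this step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patent's implementation: kernels are assumed to run one after another, a kernel's data is assumed to follow the device of the previous kernel (starting on the CPU), a single hypothetical `cost_per_byte` stands in for both transfer directions, and device power is taken as busy-time ratio times an assumed maximum dynamic power. The names `Kernel` and `predict` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Kernel:
    name: str
    t_cpu: float     # profiled time on the CPU at the chosen CPU frequency (s)
    t_gpu: float     # profiled time on the GPU at the chosen GPU frequency (s)
    data_bytes: int  # bytes the kernel needs as input

def predict(kernels, mapping, p_idle, p_cpu_max, p_gpu_max, cost_per_byte=1e-9):
    """Predict (total time, system power) for one mapping at one frequency pair."""
    total = busy_cpu = busy_gpu = 0.0
    location = "cpu"  # assumption: data initially resides with the host
    for kernel, dev in zip(kernels, mapping):
        exec_t = kernel.t_cpu if dev == "cpu" else kernel.t_gpu
        # A transfer is charged only when the data is not where the kernel runs.
        transfer = kernel.data_bytes * cost_per_byte if dev != location else 0.0
        location = dev
        total += exec_t + transfer
        if dev == "cpu":
            busy_cpu += exec_t
        else:
            busy_gpu += exec_t
    lam_cpu = busy_cpu / total  # CPU busy-time ratio
    lam_gpu = busy_gpu / total  # GPU busy-time ratio
    power = p_idle + lam_cpu * p_cpu_max + lam_gpu * p_gpu_max
    return total, power

kernels = [Kernel("k1", 4.0, 1.0, 1_000_000), Kernel("k2", 1.0, 3.0, 1_000_000)]
print(predict(kernels, ["gpu", "cpu"], p_idle=10.0, p_cpu_max=20.0, p_gpu_max=40.0))
```

Mapping each kernel to its faster device shortens the run but adds two transfers, which is the trade-off the time model is meant to capture.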
Task mapping in the present invention means mapping any kernel in the program, as a whole, onto either the CPU or the GPU. This differs from the conventional approach of distributing a fixed proportion of the task data to the CPU and the GPU. Under proportional data distribution, a kernel executes the same code on the CPU and the GPU simultaneously; the data the kernel needs is divided between the CPU and the GPU in proportion, and the two devices process their shares concurrently. Because each kernel must synchronize across the CPU and GPU when it finishes, a fixed data distribution ratio can produce a great deal of CPU and GPU idle time during kernel execution. Fig. 3 illustrates this fine-grained synchronization between the CPU and GPU under task data partitioning, with a fixed data distribution ratio α. For kernel1 the GPU is more efficient and the CPU is slower, so the GPU finishes its share before the CPU and then sits idle waiting for the CPU to finish. Conversely, for kernel2 the CPU is faster, so the CPU finishes before the GPU and sits idle waiting for the GPU. Fig. 3 thus shows that the CPU-GPU synchronization required by task data partitioning produces a great deal of idle time on both devices during execution. Assigning each kernel as a whole directly to the CPU or the GPU removes the need for this synchronization, but in exchange the data transfer time becomes longer. Fig. 4 shows the CPU and GPU execution process when each kernel is mapped as a whole: no CPU-GPU synchronization is needed, but data is transferred between the CPU and GPU more frequently.
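The idle time caused by a fixed data split can be made concrete with a small worked example. It assumes that a device's time scales linearly with its share of the data and that α is the fraction given to the GPU; both are assumptions for illustration, as are the function name and the timing numbers.

```python
def idle_time_data_split(t_cpu_full, t_gpu_full, alpha):
    """Return (kernel time, CPU idle, GPU idle) when a fraction alpha of the
    data goes to the GPU and both devices synchronize at the end of the kernel."""
    t_gpu = alpha * t_gpu_full        # GPU time for its share of the data
    t_cpu = (1 - alpha) * t_cpu_full  # CPU time for the remaining share
    kernel_time = max(t_cpu, t_gpu)   # the slower device sets the kernel time
    return kernel_time, kernel_time - t_cpu, kernel_time - t_gpu

# kernel1 favors the GPU, kernel2 favors the CPU; the same alpha is used for both.
print(idle_time_data_split(8.0, 2.0, 0.5))  # GPU finishes early and waits
print(idle_time_data_split(2.0, 8.0, 0.5))  # CPU finishes early and waits
```

With one fixed ratio, whichever device suits a kernel better ends up waiting, which matches the behavior described for kernel1 and kernel2.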
Step 5: build the configuration parameter table from the prediction model.
Using the execution-time model and the power model from Step 4, the execution time and power under every possible CPU frequency, GPU frequency, and task mapping scheme can be computed and stored in the configuration parameter table.
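Building the table amounts to evaluating the prediction model over every frequency pair and mapping. The sketch below shows one way to do that; `build_config_table` and `toy_predict` are hypothetical names, and the toy model's coefficients are invented purely so the example runs.

```python
from itertools import product

def build_config_table(cpu_freqs, gpu_freqs, n_kernels, predict):
    """Evaluate predict(f_cpu, f_gpu, mapping) for every configuration."""
    table = []
    for f_cpu, f_gpu in product(cpu_freqs, gpu_freqs):
        # Each kernel is mapped as a whole to either the CPU or the GPU.
        for mapping in product(("cpu", "gpu"), repeat=n_kernels):
            time, power = predict(f_cpu, f_gpu, mapping)
            table.append({"f_cpu": f_cpu, "f_gpu": f_gpu,
                          "mapping": mapping, "time": time, "power": power})
    return table

def toy_predict(f_cpu, f_gpu, mapping):
    # Stand-in model with invented coefficients, for illustration only.
    time = sum(2.0 / f_cpu if dev == "cpu" else 1.0 / f_gpu for dev in mapping)
    power = 10.0 + 5.0 * f_cpu + 8.0 * f_gpu
    return time, power

table = build_config_table([1.0, 2.0], [0.5, 1.0], 2, toy_predict)
print(len(table), table[0])
```

The number of mappings doubles with every kernel, so full enumeration is only practical for programs with few kernels; for programs with dozens of kernels a search heuristic such as the greedy step is needed instead.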
Step 6: search the configuration parameter table for the optimal parameter set using an improved greedy algorithm.
Step 6.1: given a CPU frequency and a GPU frequency, search for the task mapping scheme with the shortest execution time and derive the predicted system power from the power model, as shown in Algorithm 1.
Step 6.2: following the selectable CPU and GPU frequency settings, change the CPU frequency and GPU frequency and repeat Step 6.1 to obtain the mapping scheme with the best execution time for each frequency combination, and compute the system power for that mapping scheme. Given the budgeted power, this yields the optimal frequency selection and mapping scheme that stay within the budget. This step is shown in Algorithm 2.
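The two-stage search can be sketched as follows. This is a plain greedy illustration under stated assumptions, not the patent's Algorithm 1 or Algorithm 2, whose details (including what makes the greedy algorithm "improved") are not given in the text. Kernel cost is modeled as work divided by frequency, a fixed hypothetical `transfer` penalty is charged whenever consecutive kernels run on different devices, and the power function is an invented stand-in.

```python
from itertools import product

def greedy_mapping(kernels, f_cpu, f_gpu, transfer=0.5):
    """Stage 1: for each kernel in order, pick the device that adds less time."""
    mapping, total, prev = [], 0.0, "cpu"  # assumption: data starts on the CPU
    for work_cpu, work_gpu in kernels:
        cost = {}
        for dev, exec_t in (("cpu", work_cpu / f_cpu), ("gpu", work_gpu / f_gpu)):
            cost[dev] = exec_t + (transfer if dev != prev else 0.0)
        prev = min(cost, key=cost.get)
        mapping.append(prev)
        total += cost[prev]
    return mapping, total

def search(kernels, cpu_freqs, gpu_freqs, power_of, budget):
    """Stage 2: sweep frequency pairs and keep the fastest one within budget."""
    best = None
    for f_cpu, f_gpu in product(cpu_freqs, gpu_freqs):
        mapping, time = greedy_mapping(kernels, f_cpu, f_gpu)
        power = power_of(f_cpu, f_gpu, mapping)
        if power <= budget and (best is None or time < best["time"]):
            best = {"f_cpu": f_cpu, "f_gpu": f_gpu, "mapping": mapping,
                    "time": time, "power": power}
    return best

# (work on CPU, work on GPU) per kernel; invented numbers.
kernels = [(4.0, 1.0), (1.0, 4.0)]
toy_power = lambda f_cpu, f_gpu, mapping: 10.0 + 10.0 * f_cpu + 15.0 * f_gpu
best = search(kernels, [1.0, 2.0], [1.0, 2.0], toy_power, budget=48.0)
print(best)
```

In this toy run the fastest frequency pair exceeds the budget, so the search settles on a slower pair whose predicted power stays under the cap.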
Claims (2)
- 1. A heterogeneous multi-core power capping method coordinating DVFS and task mapping, characterized by comprising the following steps: Step 1: implement measurement of the compute node's total execution time, CPU power, and GPU power after the application finishes executing; Step 2: modify the selected parallel benchmark programs to obtain the execution time of each kernel in the program; Step 3: execute the application on the CPU only and on the GPU only, at each selectable CPU or GPU frequency, and obtain detailed profiling information, including total program execution time, execution time of each kernel in the program, total system power, CPU power, and GPU power; Step 4: design a prediction model that predicts the power and execution time of different task mapping schemes at different CPU and GPU frequencies, wherein the inputs of the prediction model are the CPU frequency, the GPU frequency, and the task mapping scheme, the task mapping means mapping each kernel as a whole onto either the CPU or the GPU rather than splitting a kernel in some proportion so that it runs on the CPU and GPU simultaneously, and the outputs of the prediction model are the predicted total program execution time and the predicted system power; Step 5: using the execution-time model and the power model, compute the execution time and power under different CPU frequencies, GPU frequencies, and task mapping schemes, and enter them into a configuration parameter table; Step 6: search the configuration parameter table for the optimal parameter set using an improved greedy algorithm, in two stages: first, for a given CPU frequency and GPU frequency, a greedy algorithm finds the task mapping scheme with the shortest execution time, and the predicted system power under this configuration is obtained from the task mapping, the CPU frequency, the GPU frequency, and the power model; then the CPU and GPU frequencies are changed and the predicted system power is recomputed as in the first stage, finally yielding the optimal configuration that stays within the budgeted power.
- 2. The heterogeneous multi-core power capping method coordinating DVFS and task mapping as claimed in claim 1, characterized in that the prediction model of Step 4 comprises an execution-time model and a power model. Step 4.1: execution-time model. The application's total execution time can be obtained from the execution time of each kernel in the program and the corresponding data transfer time. The total program execution time is given by formula ①, where $f_{cpu}$ and $f_{gpu}$ denote the CPU frequency and GPU frequency, respectively, and $T_i(f_{cpu}, f_{gpu})$ denotes the execution time of the i-th kernel plus the data transfer time it requires, given by formula ②. The first part of formula ② is the execution time and the second part is the data transfer time. The kernel execution times were obtained in Step 2. H2D and D2H denote the cost of transferring data from host to device and from device to host, respectively. Required equals 1 or 0 and indicates whether kernel k needs data d. The data may already be on the device, in which case no transfer is needed, so OnDevice indicates whether the data is on the device. The data size is denoted by size. Step 4.2: power model. System power is the sum of three parts: idle power, CPU power $P_{cpu}$, and GPU power $P_{gpu}$. System power is given by formula ③: $P = P_{idle}(f_{cpu}, f_{gpu}) + P_{cpu}(f_{cpu}, f_{gpu}) + P_{gpu}(f_{cpu}, f_{gpu})$ ③, where $P_{idle}$ denotes idle power, which depends on the CPU and GPU frequencies but not on the task mapping. It can be obtained by subtracting the CPU power from the total power measured while executing the application only on the CPU, as given by formula ④. The measured quantities in formula ④ denote, respectively, the total system power, the CPU power, and the GPU power obtained when the application runs only on the CPU; all of them depend on the frequency settings. During execution, because of data dependencies between kernels, the CPU and GPU are not always busy, so CPU power and GPU power vary with the given task mapping scheme. To account for this, power is assumed to be proportional to the ratio of busy time to total execution time; CPU power and GPU power can then be expressed by formulas ⑤ and ⑥, respectively. Here $\lambda_{cpu}$ and $\lambda_{gpu}$ denote the busy-time ratios of the CPU and the GPU, respectively. The GPU-only terms in these formulas denote the CPU power and GPU power measured when the application runs only on the GPU, and the remaining two terms denote estimates of the maximum dynamic power of the CPU and the GPU, respectively. Because $\lambda_{cpu}$ and $\lambda_{gpu}$ are the busy-time ratios of the CPU and the GPU, they can be obtained from the profiling information measured in Step 3. During program execution, the busy time of each device is defined as the sum of the actual execution times of the kernels on that device. $\lambda_{cpu}$ and $\lambda_{gpu}$ are given by formulas ⑦ and ⑧, respectively. $t_{cpu}$ denotes the CPU busy time, $t_{gpu}$ denotes the GPU busy time, and $t_{total}$ denotes the application's total execution time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711163506.4A CN107861606A (en) | 2017-11-21 | 2017-11-21 | A heterogeneous multi-core power capping method coordinating DVFS and task mapping |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711163506.4A CN107861606A (en) | 2017-11-21 | 2017-11-21 | A heterogeneous multi-core power capping method coordinating DVFS and task mapping |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107861606A true CN107861606A (en) | 2018-03-30 |
Family
ID=61703284
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711163506.4A Pending CN107861606A (en) | 2017-11-21 | 2017-11-21 | A heterogeneous multi-core power capping method coordinating DVFS and task mapping |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107861606A (en) |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109246151A (en) * | 2018-11-05 | 2019-01-18 | 国家电网有限公司 | A kind of transmission line of electricity video intelligent inspection analysis dispatching method |
CN109542596A (en) * | 2018-10-22 | 2019-03-29 | 西安交通大学 | A kind of Scheduling Framework based on OpenCL kernel tasks |
CN109753134A (en) * | 2018-12-24 | 2019-05-14 | 四川大学 | A kind of GPU inside energy consumption control system and method based on overall situation decoupling |
CN110262887A (en) * | 2019-06-26 | 2019-09-20 | 北京邮电大学 | CPU-FPGA method for scheduling task and device based on feature identification |
CN110287032A (en) * | 2019-07-02 | 2019-09-27 | 南京理工大学 | A kind of optimised power consumption dispatching method of YoloV3-Tiny in multicore system on chip |
CN110308784A (en) * | 2019-04-30 | 2019-10-08 | 东莞恒创智能科技有限公司 | CPU, GPU based on Nvidia TX2 combine frequency modulation energy-saving optimization method |
CN111221640A (en) * | 2020-01-09 | 2020-06-02 | 黔南民族师范学院 | GPU-CPU (graphics processing unit-central processing unit) cooperative energy-saving method |
CN111522420A (en) * | 2019-01-17 | 2020-08-11 | 电子科技大学 | Multi-core chip dynamic thermal management method based on power budget |
CN111914000A (en) * | 2020-06-22 | 2020-11-10 | 华南理工大学 | Server power capping method and system based on power consumption prediction model |
CN112363842A (en) * | 2020-11-27 | 2021-02-12 | Oppo(重庆)智能科技有限公司 | Frequency adjusting method and device for graphic processor, electronic equipment and storage medium |
WO2021042373A1 (en) * | 2019-09-06 | 2021-03-11 | 阿里巴巴集团控股有限公司 | Data processing and task scheduling method, device and system, and storage medium |
WO2021128084A1 (en) * | 2019-12-25 | 2021-07-01 | 阿里巴巴集团控股有限公司 | Data processing, acquisition, model training and power consumption control methods, system and device |
CN113311934A (en) * | 2021-04-09 | 2021-08-27 | 北京航空航天大学 | Dynamic power consumption adjusting method and system for multi-core heterogeneous domain controller |
CN113434034A (en) * | 2021-07-08 | 2021-09-24 | 北京华恒盛世科技有限公司 | Large-scale cluster energy-saving method for adjusting CPU frequency of calculation task by utilizing deep learning |
CN114880108A (en) * | 2021-12-15 | 2022-08-09 | 中国科学院深圳先进技术研究院 | Performance analysis method and equipment based on CPU-GPU heterogeneous architecture and storage medium |
CN114895773A (en) * | 2022-04-08 | 2022-08-12 | 中山大学 | Energy consumption optimization method, system and device of heterogeneous multi-core processor and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB201108716D0 (en) * | 2010-05-25 | 2011-07-06 | Nvidia Corp | System and method for power optimization |
CN102650957A (en) * | 2012-04-09 | 2012-08-29 | 武汉理工大学 | Self-adaptive energy-saving scheduling method in homogeneous cluster system based on dynamic voltage regulation technology |
US20120324250A1 (en) * | 2011-06-14 | 2012-12-20 | Utah State University | Architecturally Homogeneous Power-Performance Heterogeneous Multicore Processor |
CN103235640A (en) * | 2013-01-08 | 2013-08-07 | 北京邮电大学 | DVFS-based energy-saving dispatching method for large-scale parallel tasks |
CN104657219A (en) * | 2015-02-27 | 2015-05-27 | 西安交通大学 | Dynamic application thread count adjustment method for heterogeneous many-core systems |
CN106681453A (en) * | 2016-11-24 | 2017-05-17 | 电子科技大学 | Dynamic thermal management method for high-performance multi-core microprocessors |
2017-11-21: CN application CN201711163506.4A (patent/CN107861606A/en), status: active, Pending
Non-Patent Citations (3)
Title |
---|
OMER ERDIL ALBAYRAK, ISMAIL AKTURK, OZCAN OZTURK: "Effective Kernel Mapping for OpenCL Applications in Heterogeneous Platforms", 《2012 41ST INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS》 * |
TOSHIYA KOMODA; SHINGO HAYASHI; TAKASHI NAKADA; SHINOBU MIWA; HIRO: "Power capping of CPU-GPU heterogeneous systems through coordinating DVFS and task", 《2013 IEEE 31ST INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD)》 * |
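邱晓杰, 安虹, 陈俊仕, 迟孟贤, 金旭: "Energy efficiency optimization scheme for multi-core processors under power constraints" (功耗受限情况下多核处理器能效优化方案), 《计算机工程》 (Computer Engineering) * |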
Cited By (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109542596A (en) * | 2018-10-22 | 2019-03-29 | 西安交通大学 | A kind of Scheduling Framework based on OpenCL kernel tasks |
CN109542596B (en) * | 2018-10-22 | 2023-09-12 | 西安交通大学 | Scheduling method based on OpenCL kernel task |
CN109246151B (en) * | 2018-11-05 | 2021-03-30 | 国家电网有限公司 | Intelligent video inspection analysis scheduling method for power transmission line |
CN109246151A (en) * | 2018-11-05 | 2019-01-18 | 国家电网有限公司 | A kind of transmission line of electricity video intelligent inspection analysis dispatching method |
CN109753134A (en) * | 2018-12-24 | 2019-05-14 | 四川大学 | A kind of GPU inside energy consumption control system and method based on overall situation decoupling |
CN109753134B (en) * | 2018-12-24 | 2022-04-15 | 四川大学 | Global decoupling-based GPU internal energy consumption control system and method |
CN111522420A (en) * | 2019-01-17 | 2020-08-11 | 电子科技大学 | Multi-core chip dynamic thermal management method based on power budget |
CN111522420B (en) * | 2019-01-17 | 2023-03-14 | 电子科技大学 | Multi-core chip dynamic thermal management method based on power budget |
CN110308784A (en) * | 2019-04-30 | 2019-10-08 | 东莞恒创智能科技有限公司 | CPU, GPU based on Nvidia TX2 combine frequency modulation energy-saving optimization method |
CN110262887A (en) * | 2019-06-26 | 2019-09-20 | 北京邮电大学 | CPU-FPGA method for scheduling task and device based on feature identification |
CN110262887B (en) * | 2019-06-26 | 2022-04-01 | 北京邮电大学 | CPU-FPGA task scheduling method and device based on feature recognition |
CN110287032A (en) * | 2019-07-02 | 2019-09-27 | 南京理工大学 | A kind of optimised power consumption dispatching method of YoloV3-Tiny in multicore system on chip |
CN110287032B (en) * | 2019-07-02 | 2022-09-20 | 南京理工大学 | Power consumption optimization scheduling method of YoloV3-Tiny on multi-core system on chip |
WO2021042373A1 (en) * | 2019-09-06 | 2021-03-11 | 阿里巴巴集团控股有限公司 | Data processing and task scheduling method, device and system, and storage medium |
CN113748398A (en) * | 2019-09-06 | 2021-12-03 | 阿里巴巴集团控股有限公司 | Data processing and task scheduling method, device, system and storage medium |
WO2021128084A1 (en) * | 2019-12-25 | 2021-07-01 | 阿里巴巴集团控股有限公司 | Data processing, acquisition, model training and power consumption control methods, system and device |
CN111221640A (en) * | 2020-01-09 | 2020-06-02 | 黔南民族师范学院 | GPU-CPU (graphics processing unit-central processing unit) cooperative energy-saving method |
CN111221640B (en) * | 2020-01-09 | 2023-10-17 | 黔南民族师范学院 | GPU-CPU cooperative energy saving method |
CN111914000A (en) * | 2020-06-22 | 2020-11-10 | 华南理工大学 | Server power capping method and system based on power consumption prediction model |
CN111914000B (en) * | 2020-06-22 | 2024-03-26 | 华南理工大学 | Server power capping method and system based on power consumption prediction model |
CN112363842A (en) * | 2020-11-27 | 2021-02-12 | Oppo(重庆)智能科技有限公司 | Frequency adjusting method and device for graphic processor, electronic equipment and storage medium |
CN113311934A (en) * | 2021-04-09 | 2021-08-27 | 北京航空航天大学 | Dynamic power consumption adjusting method and system for multi-core heterogeneous domain controller |
CN113434034A (en) * | 2021-07-08 | 2021-09-24 | 北京华恒盛世科技有限公司 | Large-scale cluster energy-saving method for adjusting CPU frequency of calculation task by utilizing deep learning |
CN114880108A (en) * | 2021-12-15 | 2022-08-09 | 中国科学院深圳先进技术研究院 | Performance analysis method and equipment based on CPU-GPU heterogeneous architecture and storage medium |
WO2023108800A1 (en) * | 2021-12-15 | 2023-06-22 | 中国科学院深圳先进技术研究院 | Performance analysis method based on cpu-gpu heterogeneous architecture, and device and storage medium |
CN114895773A (en) * | 2022-04-08 | 2022-08-12 | 中山大学 | Energy consumption optimization method, system and device of heterogeneous multi-core processor and storage medium |
CN114895773B (en) * | 2022-04-08 | 2024-02-13 | 中山大学 | Energy consumption optimization method, system and device for heterogeneous multi-core processor and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107861606A (en) | A heterogeneous multi-core power capping method by coordinating DVFS and task mapping | |
KR20210082210A (en) | Creating an Integrated Circuit Floor Plan Using Neural Networks | |
CN106339351B (en) | A kind of SGD algorithm optimization system and method | |
US9239734B2 (en) | Scheduling method and system, computing grid, and corresponding computer-program product | |
CN104216783A (en) | Method for automatically managing and controlling virtual GPU (Graphics Processing Unit) resource in cloud gaming | |
Zidenberg et al. | Multiamdahl: How should i divide my heterogenous chip? | |
CN104657219A (en) | Dynamic application thread count adjustment method for heterogeneous many-core systems | |
CN102360313A (en) | Performance acceleration method of heterogeneous multi-core computing platform on chip | |
CN114895773B (en) | Energy consumption optimization method, system and device for heterogeneous multi-core processor and storage medium | |
Sayadi et al. | Scheduling multithreaded applications onto heterogeneous composite cores architecture | |
CN115619005A (en) | Intelligent power utilization network resource optimal configuration method and system | |
Xia et al. | Voltage, throughput, power, reliability, and multicore scaling | |
CN103645989B (en) | Device and method for analyzing test resource required by test case during test | |
CN106649067B (en) | A kind of performance and energy consumption prediction technique and device | |
CN103049310B (en) | A kind of multi-core simulation parallel acceleration method based on sampling | |
Qian et al. | Elasticai-creator: Optimizing neural networks for time-series-analysis for on-device machine learning in iot systems | |
CN105426247A (en) | HLA federate planning and scheduling method | |
CN104090813B (en) | A kind of method for analyzing and modeling of the virtual machine CPU usage of cloud data center | |
Ni et al. | Online performance and power prediction for edge TPU via comprehensive characterization | |
CN107451022A (en) | A kind of method and system for automatically adjusting linpack performance tests | |
Sundaresan et al. | Veerbench-an intelligent computing framework for workload characterisation in multi-core heterogeneous architectures | |
US8521464B2 (en) | Accelerating automatic test pattern generation in a multi-core computing environment via speculatively scheduled sequential multi-level parameter value optimization | |
Ou et al. | Container Power Consumption Prediction Based on GBRT-PL for Edge Servers in Smart City | |
Yao et al. | EALI: Energy-aware layer-level scheduling for convolutional neural network inference services on GPUs | |
Hajiamini et al. | A fast heuristic for improving the energy efficiency of asymmetric VFI-Based manycore systems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180330 |