CN107861606A - A heterogeneous multi-core power capping method that coordinates DVFS and task mapping


Info

Publication number
CN107861606A
Authority
CN
China
Prior art keywords
cpu
gpu
power consumption
time
power
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711163506.4A
Other languages
Chinese (zh)
Inventor
方娟
汪梦萱
马傲男
程妍瑾
常泽清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN201711163506.4A
Publication of CN107861606A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00: Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26: Power supply means, e.g. regulation thereof
    • G06F1/32: Means for saving power
    • G06F1/3203: Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234: Power saving characterised by the action undertaken
    • G06F1/324: Power saving characterised by the action undertaken by lowering clock frequency
    • G06F1/3296: Power saving characterised by the action undertaken by lowering the supply or operating voltage

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Power Sources (AREA)

Abstract

The present invention discloses a heterogeneous multi-core power capping method that coordinates DVFS and task mapping. First, a script is implemented for the heterogeneous system that separately measures compute-node power, CPU power, and GPU power after a program finishes executing; the selected parallel benchmark programs are then modified to obtain the execution time of each kernel function. Next, with the CPU and GPU set to different frequencies, the application is run only on the CPU and only on the GPU to obtain detailed profiling information, including total execution time, per-kernel execution time, compute-node power, CPU power, and GPU power. Based on this profiling information, a prediction model is designed, comprising an execution-time model and a power model. Finally, using the prediction model, the system power and execution time under different CPU frequencies, GPU frequencies, and task allocation schemes are computed and entered into a configuration table, and an improved greedy algorithm searches for the optimal configuration. The invention improves system performance while keeping system power within the budget.

Description

A heterogeneous multi-core power capping method that coordinates DVFS and task mapping
Technical field
The invention belongs to the field of computer architecture, and specifically relates to a heterogeneous multi-core power capping method that coordinates DVFS and task mapping.
Background
Through continuous research and development in recent years, advanced architectures represented by multi-core processors have gradually replaced single-core processors as the main route to higher processor performance. Compared with homogeneous multi-core processors, heterogeneous multi-core platforms can achieve better performance. Power capping is a technique that limits the power of a heterogeneous system to a preset level. Power consumption and heat dissipation limit further gains in heterogeneous multi-core performance. The architecture of modern processors lets them withstand the effects of power consumption only up to a certain level, so a system must be able to enforce an upper limit on processor power. The most common power budgeting technique today runs hardware components at different frequencies, and therefore at different power levels; its core idea is dynamic voltage and frequency scaling (DVFS). When DVFS is used to limit the power of a heterogeneous system, load imbalance can arise between the CPU and the GPU. Decomposing a parallel program into tasks that can execute simultaneously and mapping each task to the most suitable processor makes full use of the system's computing capability and improves performance, but such mapping schemes usually do not consider system power. This document proposes a scheme that combines DVFS and task mapping to improve system performance while keeping system power within the budget.
Summary of the invention
The invention proposes a heterogeneous multi-core power capping method that coordinates DVFS and task mapping, keeping system power within the budget while improving system performance. First, a script is implemented for the heterogeneous system that separately measures compute-node power, CPU power, and GPU power after a program finishes executing; the selected parallel benchmark programs are then modified to obtain the execution time of each kernel function. Next, with the CPU and GPU set to different frequencies, the application is run only on the CPU and only on the GPU to obtain detailed profiling information, including total execution time, per-kernel execution time, compute-node power, CPU power, and GPU power. Based on this profiling information, a prediction model is designed, comprising an execution-time model and a power model. Finally, using the prediction model, the system power and execution time under different CPU frequencies, GPU frequencies, and task allocation schemes are computed and entered into a configuration table. An improved greedy algorithm then searches for the optimal configuration (CPU frequency, GPU frequency, task mapping table).
To achieve the above object, the invention adopts the following technical solution.
A heterogeneous multi-core power capping method that coordinates DVFS and task mapping combines DVFS with task mapping to keep system power within the budgeted power while pursuing system performance. It comprises the following steps:
Step 1: implement measurement of the application's total execution time, CPU power, and GPU power after the application finishes executing.
Step 2: modify the selected parallel benchmark programs to obtain the execution time of each kernel function in the program.
Step 3: execute the application only on the CPU and only on the GPU, setting each of the selectable CPU (GPU) frequencies in turn, to obtain detailed profiling information, including the total program execution time, the execution time of each kernel function in the program, the total system power, the CPU power, and the GPU power.
Step 4: design a prediction model that predicts the power and execution time of different task mapping schemes under different CPU and GPU frequencies. The inputs of the prediction model are the CPU frequency, the GPU frequency, and the task mapping scheme. Task mapping here means mapping each kernel function as a whole onto either the CPU or the GPU, rather than splitting a kernel function in some proportion so that it executes on the CPU and GPU simultaneously. The outputs of the prediction model are the predicted total program execution time and the predicted system power. The prediction model comprises an execution-time model and a power model.
Step 4.1: execution-time model
The total execution time of the application can be obtained from the execution time of each kernel in the program and the corresponding data transfer time. The total program execution time is given by formula ①.
Here $f_{cpu}$ and $f_{gpu}$ denote the CPU frequency and the GPU frequency, and $T_i(f_{cpu}, f_{gpu})$ denotes the execution time of the i-th kernel function plus its required data transfer time, given by formula ②.
The first part of the formula is the execution time and the second part is the data transfer time. The kernel execution times were obtained in step 2. H2D and D2H denote the data transfer cost from host to device and from device to host, respectively. Required equals 1 or 0 and indicates whether kernel k needs data d. The data may already reside on the device, in which case no transfer is needed, so OnDevice indicates whether the data is on the device. The data size is denoted by size.
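The execution-time model described above can be sketched as follows. This is a minimal illustration, not the patent's formula ②, whose exact form is not reproduced in this text. It keeps the stated structure (total time is the sum over kernels of execution time plus transfer time) and assumes that a transfer costs the data size times a per-byte cost, that a kernel mapped to the GPU pays H2D for each required item not yet on the device, and that a kernel mapped to the CPU pays D2H for each required item currently on the device. The names `Kernel`, `predict_total_time`, `h2d_cost`, and `d2h_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    name: str
    exec_time: dict   # "cpu"/"gpu" -> measured execution time (s) at the chosen frequencies
    required: tuple   # names of the data items this kernel needs (Required = 1)


def predict_total_time(kernels, mapping, data_size, h2d_cost, d2h_cost):
    """Sum, over all kernels, execution time plus data transfer time.

    mapping[i] is "cpu" or "gpu" for kernel i; data_size maps a data name to bytes.
    The transfer rules are assumptions made for this sketch.
    """
    on_device = set()   # data currently resident on the GPU (OnDevice = 1)
    total = 0.0
    for kernel, device in zip(kernels, mapping):
        transfer = 0.0
        for d in kernel.required:
            if device == "gpu" and d not in on_device:
                transfer += data_size[d] * h2d_cost   # host -> device copy
                on_device.add(d)
            elif device == "cpu" and d in on_device:
                transfer += data_size[d] * d2h_cost   # device -> host copy
                on_device.discard(d)
        total += kernel.exec_time[device] + transfer
    return total


# Hypothetical two-kernel program sharing one data item "a".
kernels = [
    Kernel("k1", {"cpu": 4.0, "gpu": 1.0}, ("a",)),
    Kernel("k2", {"cpu": 1.0, "gpu": 3.0}, ("a",)),
]
sizes = {"a": 100}
t = predict_total_time(kernels, ["gpu", "cpu"], sizes, h2d_cost=0.01, d2h_cost=0.01)
```

In this toy case the mixed mapping pays two transfers but is still faster than running both kernels on the CPU, which is the trade-off the model is meant to expose.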
Step 4.2: power model
System power can be expressed as three parts: the idle power, the CPU power $P_{cpu}$, and the GPU power $P_{gpu}$. System power is given by formula ③.
$P = P_{idle}(f_{cpu}, f_{gpu}) + P_{cpu}(f_{cpu}, f_{gpu}) + P_{gpu}(f_{cpu}, f_{gpu})$ ③
Here $P_{idle}$ is the idle power, which depends on the CPU frequency and the GPU frequency but not on the task mapping. It can be obtained by subtracting the CPU power from the total power measured when the application is executed only on the CPU, as given by formula ④.
The three measured quantities are, respectively, the total system power, the CPU power, and the GPU power obtained when the application is executed only on the CPU; all depend on the frequency settings.
While the application executes, the CPU and GPU are not always busy because of data dependences between kernel functions, so the CPU power and GPU power vary with the given task mapping scheme. In view of this, we assume that power is proportional to the ratio of busy time to total execution time, so the CPU and GPU power can be expressed by formula ⑤ and formula ⑥, respectively.
Here $\lambda_{cpu}$ and $\lambda_{gpu}$ denote the CPU and GPU busy-time ratios, respectively. Two further measured quantities are the CPU power and the GPU power obtained when the application is executed only on the GPU, and two estimators represent the maximum dynamic power of the CPU and of the GPU, respectively.
Because $\lambda_{cpu}$ and $\lambda_{gpu}$ are the CPU and GPU busy-time ratios, they can be obtained from the profiling information measured in step 3. During program execution, the busy time of each device is defined as the sum of the actual execution times of the kernel functions that run on it. $\lambda_{cpu}$ and $\lambda_{gpu}$ are given by formula ⑦ and formula ⑧, respectively.
$t_{cpu}$ is the CPU busy time, $t_{gpu}$ is the GPU busy time, and $t_{total}$ is the total execution time of the application.
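A small sketch of the power model may help. It takes the busy-time ratios to be $\lambda_{cpu} = t_{cpu}/t_{total}$ and $\lambda_{gpu} = t_{gpu}/t_{total}$, as the definitions above suggest. Formulas ④ to ⑥ are not reproduced in this text, so the sketch assumes that each device's power is its busy-time ratio multiplied by an estimate of its maximum dynamic power, and that the idle power is supplied as a number. The function names and the numeric values are hypothetical.

```python
def busy_ratios(kernel_times, mapping, t_total):
    """Return (lambda_cpu, lambda_gpu).

    kernel_times[i] is the actual execution time of kernel i on the device
    it is mapped to; t_total is the total execution time including transfers.
    """
    t_cpu = sum(t for t, dev in zip(kernel_times, mapping) if dev == "cpu")
    t_gpu = sum(t for t, dev in zip(kernel_times, mapping) if dev == "gpu")
    return t_cpu / t_total, t_gpu / t_total


def predict_system_power(p_idle, p_cpu_max, p_gpu_max, lam_cpu, lam_gpu):
    """System power as idle + CPU + GPU power (structure of formula 3).

    Scaling each device's maximum dynamic power by its busy-time ratio
    is an assumption of this sketch, not the patent's exact formulas.
    """
    p_cpu = lam_cpu * p_cpu_max
    p_gpu = lam_gpu * p_gpu_max
    return p_idle + p_cpu + p_gpu


# Hypothetical numbers: three kernels, 5 s total including 1 s of transfers.
lam_cpu, lam_gpu = busy_ratios([2.0, 1.0, 1.0], ["cpu", "gpu", "gpu"], t_total=5.0)
power = predict_system_power(p_idle=10.0, p_cpu_max=20.0, p_gpu_max=50.0,
                             lam_cpu=lam_cpu, lam_gpu=lam_gpu)
```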
Step 5: build the configuration parameter table from the prediction model.
Using the execution-time model and the power model, compute the execution time and power under different CPU frequencies, GPU frequencies, and task mapping schemes, and enter them into the configuration parameter table.
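Building the configuration table can be sketched as an enumeration over frequency pairs and mappings. The sketch below assumes the two models are available as callables `predict_time` and `predict_power` (hypothetical names) and uses toy stand-ins for them. Exhaustive enumeration grows as 2^n in the number of kernels, so it is only practical for small programs.

```python
from itertools import product


def build_config_table(cpu_freqs, gpu_freqs, num_kernels, predict_time, predict_power):
    """Return one row per (f_cpu, f_gpu, mapping) with predicted time and power."""
    table = []
    for f_cpu, f_gpu in product(cpu_freqs, gpu_freqs):
        # Each kernel is mapped as a whole to either the CPU or the GPU.
        for mapping in product(("cpu", "gpu"), repeat=num_kernels):
            table.append({
                "f_cpu": f_cpu,
                "f_gpu": f_gpu,
                "mapping": mapping,
                "time": predict_time(f_cpu, f_gpu, mapping),
                "power": predict_power(f_cpu, f_gpu, mapping),
            })
    return table


def toy_time(f_cpu, f_gpu, mapping):
    # Hypothetical model: every kernel is one unit of work, time = work / frequency.
    return sum(1.0 / (f_cpu if dev == "cpu" else f_gpu) for dev in mapping)


def toy_power(f_cpu, f_gpu, mapping):
    # Hypothetical model: power grows linearly with both frequencies.
    return 5.0 + f_cpu + f_gpu


table = build_config_table([1.0, 2.0], [1.0, 2.0], 2, toy_time, toy_power)
```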
Step 6: search the configuration parameter table for the optimal parameter set using the improved greedy algorithm. The search has two stages. First, for a given CPU frequency and GPU frequency, a greedy search finds the task mapping scheme with the shortest execution time; from this task mapping, the CPU frequency, the GPU frequency, and the power model, the predicted system power under this configuration is obtained. Then the CPU and GPU frequencies are changed and the predicted system power is recomputed as in the previous stage, finally yielding the optimal configuration that stays within the budgeted power.
Step 6.1: given a CPU frequency and a GPU frequency, search for the task mapping scheme with the shortest execution time, and derive the predicted system power from the power model.
Step 6.2: following the selectable CPU and GPU frequency settings, change the CPU and GPU frequencies and repeat step 6.1 to obtain the mapping scheme with the best execution time for each frequency combination, and compute the system power for that mapping scheme. Given the power budget, this yields the optimal frequency selection and mapping scheme that stay within the budgeted power.
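The two-stage search of steps 6.1 and 6.2 can be sketched as below. The details of the patent's improved greedy algorithm are not given in this text, so the sketch substitutes a plain greedy pass that fixes one kernel at a time, followed by a sweep over frequency pairs that keeps the fastest configuration within the budget. The model callables, the `WORK` table, and all numbers are hypothetical.

```python
from itertools import product


def greedy_mapping(num_kernels, predict_time, f_cpu, f_gpu):
    """Step 6.1 (plain greedy stand-in): place each kernel on the device
    that gives the shorter predicted time for the mapping prefix so far."""
    mapping = []
    for _ in range(num_kernels):
        best_dev = min(
            ("cpu", "gpu"),
            key=lambda dev: predict_time(f_cpu, f_gpu, tuple(mapping) + (dev,)),
        )
        mapping.append(best_dev)
    return tuple(mapping)


def search_best_config(cpu_freqs, gpu_freqs, num_kernels,
                       predict_time, predict_power, budget):
    """Step 6.2: sweep frequency pairs, keep the fastest one within budget."""
    best = None
    for f_cpu, f_gpu in product(cpu_freqs, gpu_freqs):
        mapping = greedy_mapping(num_kernels, predict_time, f_cpu, f_gpu)
        power = predict_power(f_cpu, f_gpu, mapping)
        if power > budget:
            continue   # violates the power cap
        time = predict_time(f_cpu, f_gpu, mapping)
        if best is None or time < best["time"]:
            best = {"f_cpu": f_cpu, "f_gpu": f_gpu, "mapping": mapping,
                    "time": time, "power": power}
    return best


# Hypothetical per-kernel work on each device (time at frequency 1.0).
WORK = [{"cpu": 4.0, "gpu": 1.0}, {"cpu": 1.0, "gpu": 3.0}]


def toy_time(f_cpu, f_gpu, mapping):
    freq = {"cpu": f_cpu, "gpu": f_gpu}
    return sum(w[dev] / freq[dev] for w, dev in zip(WORK, mapping))


def toy_power(f_cpu, f_gpu, mapping):
    return 5.0 + 2.0 * f_cpu + 3.0 * f_gpu


best = search_best_config([1.0, 2.0], [1.0, 2.0], len(WORK),
                          toy_time, toy_power, budget=12.0)
```

With these toy models, raising the GPU frequency would exceed the 12.0 budget, so the search settles on the higher CPU frequency with the first kernel on the GPU and the second on the CPU.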
Compared with the prior art, the invention has the following advantages:
DVFS and task mapping are combined to jointly keep system power within the budget while preserving system performance. Most existing power capping techniques rely on dynamic voltage and frequency scaling alone, because device frequency has the largest effect on system power, but they do not account for the load imbalance that changing the CPU and GPU frequencies can cause. Decomposing a parallel program into tasks that can execute simultaneously and mapping each task to the most suitable processor makes full use of the system's computing capability and improves performance, but such mapping schemes usually ignore system power. The invention therefore combines the two optimization strategies, DVFS and task mapping, to keep system power within a given budget while improving system performance.
Brief description of the drawings
To make the purpose and scheme of the invention easier to understand, the invention is further described below with reference to the figures.
Fig. 1 is the architecture diagram of the CPU-GPU heterogeneous multi-core system. The system is built in simulation with gem5-gpu; a 4-core CPU and a GPU composed of 8 CUs are integrated on the same chip.
Fig. 2 is a schematic of the power capping scheme design based on detailed profiling information in the invention.
Fig. 3 is a schematic of the fine-grained synchronization between CPU and GPU under traditional task data partitioning.
Fig. 4 is a schematic of the CPU and GPU busy times under the kernel-level task partitioning used in the invention, and of the CPU and GPU idle times caused by waiting for task data.
Detailed description
The invention is further described below with reference to the drawings.
Fig. 1 shows the heterogeneous multi-core system built with the gem5-gpu simulator. It simulates a heterogeneous architecture in which a 4-core CPU and a GPU composed of 8 CUs are integrated on the same chip. In gem5-gpu the simulated architecture can be changed flexibly through configuration files, and gem5-gpu supports DVFS.
In the heterogeneous multi-core system simulated with gem5-gpu, the invention implements a power capping method that combines DVFS and task mapping, with the following specific steps:
Step 1: implement measurement of the total compute-node time, CPU power, and GPU power after the application finishes executing.
In gem5-gpu, when an application finishes executing, a file stat.txt containing all program execution information is generated automatically, and it includes the program execution time. The McPAT module can measure CPU power independently and the GPUWattch module can measure GPU power independently, so by configuring the McPAT and GPUWattch modules in gem5-gpu, CPU power and GPU power can be obtained after the program finishes executing.
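Reading the execution time back from the statistics file can be sketched as follows. The sketch assumes the file uses the usual gem5 layout of one `name value # comment` entry per line and that the simulated time is reported under the name `sim_seconds`; both are assumptions about the simulator version rather than details given in this document. The helper `read_stat` and the sample text are hypothetical.

```python
def read_stat(stats_text, name):
    """Return the value of the first statistic called `name`, or None."""
    for line in stats_text.splitlines():
        parts = line.split()
        # A statistics line starts with the stat name followed by its value.
        if len(parts) >= 2 and parts[0] == name:
            return float(parts[1])
    return None


# Hypothetical excerpt in gem5 statistics format.
sample = """
---------- Begin Simulation Statistics ----------
sim_seconds                                  0.002500    # Number of seconds simulated
sim_ticks                                  2500000000    # Number of ticks simulated
---------- End Simulation Statistics   ----------
"""

exec_time = read_stat(sample, "sim_seconds")
```

In practice the text would come from reading the generated file, and the same helper could collect any per-run value needed for profiling.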
Step 2: modify the selected parallel benchmark programs to obtain the execution time of each kernel function in the program.
OpenCL programs can execute on different devices, including CPUs and GPUs. The benchmarks used in the invention are the OpenCL version of the NAS Parallel Benchmarks suite. Each benchmark has different characteristics: some programs contain more than 60 kernels, while others have only two. The test programs are rewritten to collect the execution time of each kernel.
Step 3: execute the application only on the CPU and only on the GPU, setting each of the selectable CPU (GPU) frequencies in turn, to obtain detailed profiling information that serves as the input of the profiling-based power capping scheme.
Fig. 2 shows the flow of the profiling-based power capping strategy. CPU Profile Runs and GPU Profile Runs denote the profiling information obtained after executing the application only on the CPU and only on the GPU, respectively. From this information the prediction model of step 4 is built, comprising the time model and the power model, shown as Time model and Power model in Fig. 2. The inputs of the prediction model are the CPU frequency, the GPU frequency, and the task mapping scheme, and the outputs are the corresponding predicted execution time and predicted system power. The different inputs and outputs form a configuration table, over which the improved greedy algorithm of step 6 searches for the optimal mapping scheme and frequency settings, shown as Distribute parallel tasks and set device frequencies in Fig. 2.
Step 4: build the prediction model, comprising an execution-time model and a power model, to predict the execution time and system power of an application in the heterogeneous multi-core environment. The profiling information obtained in step 3 by executing the application only on the CPU and only on the GPU under different frequency settings is the basis of the prediction model. Formulas ① to ⑧ show that program profiling information, comprising per-kernel execution information, CPU power, and GPU power, suffices to predict the program's execution time and power under different CPU frequencies, GPU frequencies, and task mapping schemes.
The task mapping used in the invention treats each kernel in the program as a whole and maps it to either the CPU or the GPU. This differs from the traditional approach, which distributes a fixed proportion of each task's data to the CPU and the GPU. Under proportional data distribution, a kernel executes the same code on the CPU and the GPU at the same time: the data the kernel needs is divided between the two devices in proportion, and each device processes its share concurrently. Because the CPU and GPU must synchronize after every kernel finishes, a fixed data distribution ratio can produce considerable CPU and GPU idle time during kernel execution. Fig. 3 illustrates this fine-grained synchronization between CPU and GPU under task data partitioning. The fixed data distribution ratio is α. For kernel1 the GPU is more efficient and the CPU is slower, so the GPU finishes its share before the CPU and then sits idle, waiting for the CPU to finish. Conversely, for kernel2 the CPU is faster, so it finishes before the GPU and sits idle until the GPU completes. Fig. 3 thus shows that the CPU-GPU synchronization required by task data partitioning causes both devices to accumulate substantial idle time during execution.
Mapping each kernel as a whole directly to the CPU or the GPU removes the need for this synchronization, but in exchange the data transfer time can be longer. Fig. 4 shows CPU and GPU execution under whole-kernel mapping: no CPU-GPU synchronization is needed, but data is transferred between the CPU and GPU more frequently.
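The idle time caused by a fixed data distribution ratio can be shown with a short calculation. The sketch assumes that a device's time on a kernel scales linearly with its share of the data and that both devices synchronize at the end of every kernel; the kernel timings and the function name `split_idle_times` are hypothetical.

```python
def split_idle_times(kernels, alpha):
    """Simulate proportional data splitting with per-kernel synchronization.

    kernels: list of (cpu_full_time, gpu_full_time), the time each device
             would need to process the whole kernel alone.
    alpha:   fixed fraction of every kernel's data given to the CPU.
    Returns (total_time, cpu_idle, gpu_idle).
    """
    total = cpu_idle = gpu_idle = 0.0
    for cpu_full, gpu_full in kernels:
        t_cpu = alpha * cpu_full            # CPU processes its share
        t_gpu = (1.0 - alpha) * gpu_full    # GPU processes the rest
        step = max(t_cpu, t_gpu)            # kernel ends when the slower device finishes
        total += step
        cpu_idle += step - t_cpu
        gpu_idle += step - t_gpu
    return total, cpu_idle, gpu_idle


# kernel1 favours the GPU, kernel2 favours the CPU (hypothetical timings).
kernels = [(8.0, 2.0), (2.0, 8.0)]
total, cpu_idle, gpu_idle = split_idle_times(kernels, alpha=0.5)

# Whole-kernel mapping: each kernel runs on its faster device, with no
# per-kernel synchronization; data transfer time is not included here.
whole_kernel_time = sum(min(c, g) for c, g in kernels)
```

With these numbers the split run takes 8.0 time units and each device idles for 3.0 of them, while whole-kernel mapping needs 4.0 units of compute before transfer costs are added.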
Step 5: build the configuration parameter table from the prediction model.
Using the time prediction model and the power prediction model of step 4, the execution time and power under all possible CPU frequencies, GPU frequencies, and task mapping schemes can be computed and stored in the configuration parameter table.
Step 6: search the configuration parameter table for the optimal parameter set using the improved greedy algorithm.
Step 6.1: given a CPU frequency and a GPU frequency, search for the task mapping scheme with the shortest execution time, and derive the predicted system power from the power model, as shown in Algorithm 1.
Step 6.2: following the selectable CPU and GPU frequency settings, change the CPU and GPU frequencies and repeat step 6.1 to obtain the mapping scheme with the best execution time for each frequency combination, and compute the system power for that mapping scheme. Given the power budget, this yields the optimal frequency selection and mapping scheme that stay within the budgeted power. This step is shown in Algorithm 2.

Claims (2)

  1. A heterogeneous multi-core power capping method that coordinates DVFS and task mapping, characterized by comprising the following steps:
    Step 1: implement measurement of the total compute-node time, CPU power, and GPU power after the application finishes executing;
    Step 2: modify the selected parallel benchmark programs to obtain the execution time of each kernel function in the program;
    Step 3: execute the application only on the CPU and only on the GPU, setting each of the selectable CPU or GPU frequencies in turn, to obtain detailed profiling information, including the total program execution time, the execution time of each kernel function in the program, the total system power, the CPU power, and the GPU power;
    Step 4: design a prediction model that predicts the power and execution time of different task mapping schemes under different CPU and GPU frequencies; wherein the inputs of the prediction model are the CPU frequency, the GPU frequency, and the task mapping scheme, the task mapping meaning that each kernel function is mapped as a whole onto either the CPU or the GPU, rather than being split in some proportion so that it executes on the CPU and GPU simultaneously; the outputs of the prediction model are the predicted total program execution time and the predicted system power;
    Step 5: using the execution-time model and the power model, compute the execution time and power under different CPU frequencies, GPU frequencies, and task mapping schemes, and enter them into the configuration parameter table;
    Step 6: search the configuration parameter table for the optimal parameter set using the improved greedy algorithm, in two stages: first, for a given CPU frequency and GPU frequency, a greedy search finds the task mapping scheme with the shortest execution time, and from this task mapping, the CPU frequency, the GPU frequency, and the power model, the predicted system power under this configuration is obtained; then the CPU and GPU frequencies are changed and the predicted system power is recomputed as in the previous stage, finally yielding the optimal configuration that stays within the budgeted power.
  2. The heterogeneous multi-core power capping method that coordinates DVFS and task mapping according to claim 1, characterized in that the prediction model of step 4 comprises an execution-time model and a power model:
    Step 4.1: execution-time model
    The total execution time of the application can be obtained from the execution time of each kernel in the program and the corresponding data transfer time. The total program execution time is given by formula ①.
    Here $f_{cpu}$ and $f_{gpu}$ denote the CPU frequency and the GPU frequency, and $T_i(f_{cpu}, f_{gpu})$ denotes the execution time of the i-th kernel function plus its required data transfer time, given by formula ②.
    The first part of the formula is the execution time and the second part is the data transfer time. The kernel execution times were obtained in step 2. H2D and D2H denote the data transfer cost from host to device and from device to host, respectively. Required equals 1 or 0 and indicates whether kernel k needs data d. The data may already reside on the device, in which case no transfer is needed, so OnDevice indicates whether the data is on the device. The data size is denoted by size.
    Step 4.2: power model
    System power can be expressed as three parts: the idle power, the CPU power $P_{cpu}$, and the GPU power $P_{gpu}$. System power is given by formula ③.
    $P = P_{idle}(f_{cpu}, f_{gpu}) + P_{cpu}(f_{cpu}, f_{gpu}) + P_{gpu}(f_{cpu}, f_{gpu})$ ③
    Here $P_{idle}$ is the idle power, which depends on the CPU frequency and the GPU frequency but not on the task mapping. It can be obtained by subtracting the CPU power from the total power measured when the application is executed only on the CPU, as given by formula ④.
    The three measured quantities are, respectively, the total system power, the CPU power, and the GPU power obtained when the application is executed only on the CPU; all depend on the frequency settings.
    While the application executes, the CPU and GPU are not always busy because of data dependences between kernel functions, so the CPU power and GPU power vary with the given task mapping scheme. In view of this, it is assumed that power is proportional to the ratio of busy time to total execution time, so the CPU and GPU power can be expressed by formula ⑤ and formula ⑥, respectively.
    Here $\lambda_{cpu}$ and $\lambda_{gpu}$ denote the CPU and GPU busy-time ratios, respectively. Two further measured quantities are the CPU power and the GPU power obtained when the application is executed only on the GPU, and two estimators represent the maximum dynamic power of the CPU and of the GPU, respectively.
    Because $\lambda_{cpu}$ and $\lambda_{gpu}$ are the CPU and GPU busy-time ratios, they can be obtained from the profiling information measured in step 3. During program execution, the busy time of each device is defined as the sum of the actual execution times of the kernel functions that run on it. $\lambda_{cpu}$ and $\lambda_{gpu}$ are given by formula ⑦ and formula ⑧, respectively.
    $t_{cpu}$ is the CPU busy time, $t_{gpu}$ is the GPU busy time, and $t_{total}$ is the total execution time of the application.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201711163506.4A, published as CN107861606A (en) | 2017-11-21 | 2017-11-21 | A heterogeneous multi-core power capping method that coordinates DVFS and task mapping

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201711163506.4A, published as CN107861606A (en) | 2017-11-21 | 2017-11-21 | A heterogeneous multi-core power capping method that coordinates DVFS and task mapping

Publications (1)

Publication Number | Publication Date
CN107861606A | 2018-03-30

Family

ID=61703284

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201711163506.4A | Pending | CN107861606A (en) | 2017-11-21 | 2017-11-21 | A heterogeneous multi-core power capping method that coordinates DVFS and task mapping

Country Status (1)

Country | Link
CN (1) | CN107861606A (en)


Citations (6)

Publication number Priority date Publication date Assignee Title
GB201108716D0 (en) * 2010-05-25 2011-07-06 Nvidia Corp System and method for power optimization
CN102650957A (en) * 2012-04-09 2012-08-29 武汉理工大学 Self-adaptive energy-saving dispatching method in isomorphic cluster system based on dynamic voltage regulation technology
US20120324250A1 (en) * 2011-06-14 2012-12-20 Utah State University Architecturally Homogeneous Power-Performance Heterogeneous Multicore Processor
CN103235640A (en) * 2013-01-08 2013-08-07 北京邮电大学 DVFS-based energy-saving dispatching method for large-scale parallel tasks
CN104657219A (en) * 2015-02-27 2015-05-27 西安交通大学 Application program thread count dynamic regulating method used under isomerous many-core system
CN106681453A (en) * 2016-11-24 2017-05-17 电子科技大学 Dynamic heat treatment method of high-performance multi-core microprocessor

Patent Citations (6)

Publication number Priority date Publication date Assignee Title
GB201108716D0 (en) * 2010-05-25 2011-07-06 Nvidia Corp System and method for power optimization
US20120324250A1 (en) * 2011-06-14 2012-12-20 Utah State University Architecturally Homogeneous Power-Performance Heterogeneous Multicore Processor
CN102650957A (en) * 2012-04-09 2012-08-29 武汉理工大学 Self-adaptive energy-saving dispatching method in isomorphic cluster system based on dynamic voltage regulation technology
CN103235640A (en) * 2013-01-08 2013-08-07 北京邮电大学 DVFS-based energy-saving dispatching method for large-scale parallel tasks
CN104657219A (en) * 2015-02-27 2015-05-27 西安交通大学 Application program thread count dynamic regulating method used under isomerous many-core system
CN106681453A (en) * 2016-11-24 2017-05-17 电子科技大学 Dynamic heat treatment method of high-performance multi-core microprocessor

Non-Patent Citations (3)

Title
OMER ERDIL ALBAYRAK; ISMAIL AKTURK; OZCAN OZTURK: "Effective Kernel Mapping for OpenCL Applications in Heterogeneous Platforms", 《2012 41ST INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS》 *
TOSHIYA KOMODA; SHINGO HAYASHI; TAKASHI NAKADA; SHINOBU MIWA; HIRO: "Power capping of CPU-GPU heterogeneous systems through coordinating DVFS and task", 《2013 IEEE 31ST INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD)》 *
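邱晓杰; 安虹; 陈俊仕; 迟孟贤; 金旭: "Energy-efficiency optimization scheme for multi-core processors under power constraints" (功耗受限情况下多核处理器能效优化方案), 《计算机工程》 (Computer Engineering) *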

Cited By (27)

Publication number Priority date Publication date Assignee Title
CN109542596A (en) * 2018-10-22 2019-03-29 西安交通大学 A kind of Scheduling Framework based on OpenCL kernel tasks
CN109542596B (en) * 2018-10-22 2023-09-12 西安交通大学 Scheduling method based on OpenCL kernel task
CN109246151B (en) * 2018-11-05 2021-03-30 国家电网有限公司 Intelligent video inspection analysis scheduling method for power transmission line
CN109246151A (en) * 2018-11-05 2019-01-18 国家电网有限公司 A kind of transmission line of electricity video intelligent inspection analysis dispatching method
CN109753134A (en) * 2018-12-24 2019-05-14 四川大学 A kind of GPU inside energy consumption control system and method based on overall situation decoupling
CN109753134B (en) * 2018-12-24 2022-04-15 四川大学 Global decoupling-based GPU internal energy consumption control system and method
CN111522420A (en) * 2019-01-17 2020-08-11 电子科技大学 Multi-core chip dynamic thermal management method based on power budget
CN111522420B (en) * 2019-01-17 2023-03-14 电子科技大学 Multi-core chip dynamic thermal management method based on power budget
CN110308784A (en) * 2019-04-30 2019-10-08 东莞恒创智能科技有限公司 CPU, GPU based on Nvidia TX2 combine frequency modulation energy-saving optimization method
CN110262887A (en) * 2019-06-26 2019-09-20 北京邮电大学 CPU-FPGA method for scheduling task and device based on feature identification
CN110262887B (en) * 2019-06-26 2022-04-01 北京邮电大学 CPU-FPGA task scheduling method and device based on feature recognition
CN110287032A (en) * 2019-07-02 2019-09-27 南京理工大学 A kind of optimised power consumption dispatching method of YoloV3-Tiny in multicore system on chip
CN110287032B (en) * 2019-07-02 2022-09-20 南京理工大学 Power consumption optimization scheduling method of YoloV3-Tiny on multi-core system on chip
WO2021042373A1 (en) * 2019-09-06 2021-03-11 阿里巴巴集团控股有限公司 Data processing and task scheduling method, device and system, and storage medium
CN113748398A (en) * 2019-09-06 2021-12-03 阿里巴巴集团控股有限公司 Data processing and task scheduling method, device, system and storage medium
WO2021128084A1 (en) * 2019-12-25 2021-07-01 阿里巴巴集团控股有限公司 Data processing, acquisition, model training and power consumption control methods, system and device
CN111221640A (en) * 2020-01-09 2020-06-02 黔南民族师范学院 GPU-CPU (graphics processing unit-central processing unit) cooperative energy-saving method
CN111221640B (en) * 2020-01-09 2023-10-17 黔南民族师范学院 GPU-CPU cooperative energy saving method
CN111914000A (en) * 2020-06-22 2020-11-10 华南理工大学 Server power capping method and system based on power consumption prediction model
CN111914000B (en) * 2020-06-22 2024-03-26 华南理工大学 Server power capping method and system based on power consumption prediction model
CN112363842A (en) * 2020-11-27 2021-02-12 Oppo(重庆)智能科技有限公司 Frequency adjusting method and device for graphic processor, electronic equipment and storage medium
CN113311934A (en) * 2021-04-09 2021-08-27 北京航空航天大学 Dynamic power consumption adjusting method and system for multi-core heterogeneous domain controller
CN113434034A (en) * 2021-07-08 2021-09-24 北京华恒盛世科技有限公司 Large-scale cluster energy-saving method for adjusting CPU frequency of calculation task by utilizing deep learning
CN114880108A (en) * 2021-12-15 2022-08-09 中国科学院深圳先进技术研究院 Performance analysis method and equipment based on CPU-GPU heterogeneous architecture and storage medium
WO2023108800A1 (en) * 2021-12-15 2023-06-22 中国科学院深圳先进技术研究院 Performance analysis method based on cpu-gpu heterogeneous architecture, and device and storage medium
CN114895773A (en) * 2022-04-08 2022-08-12 中山大学 Energy consumption optimization method, system and device of heterogeneous multi-core processor and storage medium
CN114895773B (en) * 2022-04-08 2024-02-13 中山大学 Energy consumption optimization method, system and device for heterogeneous multi-core processor and storage medium

Similar Documents

Publication Publication Date Title
CN107861606A (en) A kind of heterogeneous polynuclear power cap method by coordinating DVFS and duty mapping
KR20210082210A (en) Creating an Integrated Circuit Floor Plan Using Neural Networks
CN106339351B (en) A kind of SGD algorithm optimization system and method
US9239734B2 (en) Scheduling method and system, computing grid, and corresponding computer-program product
CN104216783A (en) Method for automatically managing and controlling virtual GPU (Graphics Processing Unit) resource in cloud gaming
Zidenberg et al. Multiamdahl: How should i divide my heterogenous chip?
CN104657219A (en) Application program thread count dynamic regulating method used under isomerous many-core system
CN102360313A (en) Performance acceleration method of heterogeneous multi-core computing platform on chip
CN114895773B (en) Energy consumption optimization method, system and device for heterogeneous multi-core processor and storage medium
Sayadi et al. Scheduling multithreaded applications onto heterogeneous composite cores architecture
CN115619005A (en) Intelligent power utilization network resource optimal configuration method and system
Xia et al. Voltage, throughput, power, reliability, and multicore scaling
CN103645989B (en) Device and method for analyzing test resource required by test case during test
CN106649067B (en) A kind of performance and energy consumption prediction technique and device
CN103049310B (en) A kind of multi-core simulation parallel acceleration method based on sampling
Qian et al. Elasticai-creator: Optimizing neural networks for time-series-analysis for on-device machine learning in iot systems
CN105426247A (en) HLA federate planning and scheduling method
CN104090813B (en) A kind of method for analyzing and modeling of the virtual machine CPU usage of cloud data center
Ni et al. Online performance and power prediction for edge TPU via comprehensive characterization
CN107451022A (en) A kind of method and system for automatically adjusting linpack performance tests
Sundaresan et al. Veerbench-an intelligent computing framework for workload characterisation in multi-core heterogeneous architectures
US8521464B2 (en) Accelerating automatic test pattern generation in a multi-core computing environment via speculatively scheduled sequential multi-level parameter value optimization
Ou et al. Container Power Consumption Prediction Based on GBRT-PL for Edge Servers in Smart City
Yao et al. EALI: Energy-aware layer-level scheduling for convolutional neural network inference services on GPUs
Hajiamini et al. A fast heuristic for improving the energy efficiency of asymmetric VFI-Based manycore systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 2018-03-30)