CN106991030B - Online learning-based system power consumption optimization lightweight method

Info

Publication number: CN106991030B
Application number: CN201710116452.XA
Authority: CN
Inventors: 王翔; 李林; 王维克; 杜培; 李明哲
Original assignee: Beihang University
Current assignee: Beihang University
Priority date: 2017-03-01
Filing date: 2017-03-01
Publication date: 2020-04-14
Anticipated expiration: 2037-03-01
Also published as: CN106991030A

Abstract

A lightweight method for optimizing system power consumption based on online learning comprises the following steps: 1, compiling a program into object code; 2, starting a monitoring module to monitor hardware events; 3, normalizing the monitored events; 4, establishing a system power consumption model; 5, designing different optimization modes; 6, designing a value function module; 7, writing the power consumption model, the penalty factor and the value function module into an agent module (Agent); 8, designing a software timer that starts steps 3 and 7 at regular intervals; 9, executing the program, then step 7 and step 3, and updating the Agent; 10, setting convergence; 11, returning to step 2 according to the result of the Agent module and repeating from step 3 until the program finishes running. Through these steps, temperature, performance and power consumption are considered jointly, and a lightweight machine learning algorithm searches the available optimization space, so that power consumption is reduced while reasonable performance requirements are met, which addresses the problem that the working time of embedded devices and similar equipment is limited by the battery.

Description

Online learning-based system power consumption optimization lightweight method

Technical Field

The invention provides a lightweight method for optimizing system power consumption based on online learning. It relates to the technical field of embedded system power consumption optimization, and in particular to a method that combines embedded system power consumption optimization with a machine learning algorithm.

Background

Embedded devices are increasingly used in daily life. The growing number of embedded terminals and their ever wider online interconnection make the power consumption of embedded systems a problem that designers must face. In addition, energy shortages and environmental concerns are drawing more and more attention to processor power consumption, and low power consumption has become an important index for embedded processors and for electronic devices of every kind. Overall, low-power processor design faces the following challenges. First, dynamic power consumption is proportional to the square of the supply voltage, so lowering the voltage significantly reduces dynamic power consumption; however, as the supply voltage keeps falling, leakage power consumption rises sharply and the stability and performance of the system degrade considerably. Second, with the advent of multi-core technology, the growth of power density has slowed greatly, but overall power consumption continues to grow.

Power consumption is one of the most basic electrical characteristics of a processor. One of the most important reasons is that, as frequency increases, rising power consumption is accompanied by changes in thermal characteristics, which severely constrain the materials and packaging of the processor. The power consumption of a CMOS (Complementary Metal Oxide Semiconductor) circuit in an SoC (System on Chip) has two parts: the first is static power consumption, caused mainly by static current and leakage current; the second is dynamic power consumption, caused mainly by the transient short-circuit current and load current that arise when signals in the circuit switch, and it is the main source of power consumption in an SoC. Handling dynamic power consumption well is therefore key to reducing overall SoC power consumption. Because the power consumption of the underlying hardware is driven by software, many existing low-power designs cannot reduce the power consumption of the system as a whole, so combining power management and optimization techniques across multiple layers is gradually becoming an important means of controlling the power consumption of embedded systems.

Research on low-power, secure embedded processor chips is still at an early stage; many problems remain unsolved and a complete theoretical framework is lacking. Hardware relies on the software running on it to perform its information-processing function. Software itself does not consume power, but the data accesses and instruction execution it triggers cause the hardware to consume power. Therefore, to reduce power consumption, the optimization must consider embedded hardware and software together. Embedded systems are widely used in handheld devices and mobile products, so designers must consider in every detail how to reduce power consumption and extend battery life as far as possible. Some current low-power design methods use a DC-DC voltage conversion circuit, where the environment is suitable, to improve power efficiency and reduce system power consumption; in CMOS circuit design, a lower VDD and as low a clock frequency as possible are used, together with a cache to minimize memory reads and with sleep modes. Ultra-low-power design may even allow embedded systems to operate without batteries, relying only on energy harvested from the environment. In addition, power management is now an important factor affecting integrated circuit packaging and heat dissipation. The price, size, weight and reliability of embedded systems all depend on power consumption.

Approaches to analyzing microprocessor power consumption fall into two categories: the block-level method and the instruction-set method. In the block-level method, the microprocessor is treated as a set of running modules, and the characteristics of each module influence the power consumption of the microprocessor. In the instruction-set method, the power consumption of the microprocessor is apportioned evenly across the functions it performs; the results obtained with this method require a minimum of instruction-stream and real-time data. Traditional power consumption optimization is designed independently at each layer. Because the power consumption of the underlying hardware is driven by software, many existing single-layer low-power designs cannot reduce the power consumption of the system as a whole, so combining power management and optimization techniques across multiple layers is gradually becoming an important means of controlling the power consumption of embedded systems.

In summary, the following problems still exist in the current power consumption optimization method for the embedded system:

(1) Common system power consumption optimization is a low-power design carried out at a single layer. Because power consumption arises when software drives the hardware, a single-layer low-power design does not account for factors at other layers that affect power consumption, so its ability to reduce overall system power consumption is limited;

(2) To establish a multi-level system power consumption model, the mutual influence between performance and power consumption must be considered, and the model must also suit different application environments; a robust system power consumption model built across multiple levels is lacking;

(3) Common machine learning algorithms suffer from convergence and overfitting problems and are not well suited to system power consumption optimization: the algorithm itself adds to the system load, and frequent CPU computation and I/O calls increase dynamic power consumption, so an overall reduction in power consumption is difficult to achieve.

Disclosure of Invention

1. Objects of the invention

Aiming at the problems above, the invention provides a lightweight method for optimizing system power consumption based on online learning. The method mathematically models uncertain factors, such as PVT (process, voltage and temperature) variation, that affect static and dynamic power consumption. By analyzing the optimization space that exists among power consumption, performance and temperature, it optimizes power consumption with a value function and a machine learning algorithm, greatly reduces the system load generated by the algorithm, and improves the adaptivity and robustness of the power consumption optimization module. For different application requirements, the penalty terms of the value function are modified and optimization strategies for two modes are provided, which effectively improves the robustness of the power consumption optimization module, refines the granularity of power consumption optimization, and achieves accurate optimization among system power consumption, performance and temperature.

2. Technical scheme

(1) Preparation work:

The technical scheme of the invention uses the following formulas and numbering conventions:

In the formula: cycles_l1i_stashed and cycles_l1d_stashed are the cycles during which instruction accesses and data accesses, respectively, are stalled.

In the formula: IPC is Instructions per Clock, i.e., the number of instructions executed per cycle.

ene = (1-ρ)μf^{2β} + ρ(1-μ)f^{β} + ρμf^{β-1}    (3)

where ρ is the percentage of leakage power in total power, μ is the share of CPU computation in the application, and β describes the proportional relationship between v (voltage) and f (clock).
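As an illustration only, formula (3) can be evaluated numerically with the minimal Python sketch below. The function name `normalized_energy`, the treatment of f as a normalized clock (1.0 taken as the nominal frequency), the reading of the exponents as 2β, β and β-1, and the sample parameter values are assumptions made for this example and are not specified by the method itself.

```python
def normalized_energy(f: float, rho: float, mu: float, beta: float) -> float:
    """Evaluate formula (3) for a normalized clock frequency f.

    rho  : share of leakage power in total power (0..1)
    mu   : share of CPU computation in the application (0..1)
    beta : parameter relating voltage v to clock f
    """
    if f <= 0:
        raise ValueError("f must be positive")
    term1 = (1 - rho) * mu * f ** (2 * beta)   # first term of formula (3)
    term2 = rho * (1 - mu) * f ** beta         # second term of formula (3)
    term3 = rho * mu * f ** (beta - 1)         # third term of formula (3)
    return term1 + term2 + term3


# Hypothetical sweep over a few candidate clocks to find the lowest value.
candidates = [0.4, 0.6, 0.8, 1.0]
best_f = min(candidates, key=lambda f: normalized_energy(f, rho=0.3, mu=0.6, beta=1.0))
```

A sweep of this kind only shows how the expression behaves; in the method, the clock is chosen by the agent rather than by exhaustive search.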

First, the free optimization mode:

pnlt = w_E*pnlt_E + w_P*pnlt_P + w_T*pnlt_T    (6)

In the formula: pnlt is the penalty factor, and each w is a constant weight that can be set by the user;
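Formula (6) is a weighted sum, which the following Python sketch evaluates. The function name `free_mode_penalty`, the default weights of 1.0 and the sample values are assumptions for illustration; how the individual terms pnlt_E, pnlt_P and pnlt_T are obtained is left outside the sketch.

```python
def free_mode_penalty(
    pnlt_e: float,
    pnlt_p: float,
    pnlt_t: float,
    w_e: float = 1.0,
    w_p: float = 1.0,
    w_t: float = 1.0,
) -> float:
    """Formula (6): combine the three penalty terms with user-set weights."""
    return w_e * pnlt_e + w_p * pnlt_p + w_t * pnlt_t


# Hypothetical values: the first term is weighted twice as heavily as the second.
example = free_mode_penalty(0.2, 0.5, 0.1, w_e=2.0, w_p=1.0, w_t=0.5)
```

Raising one weight relative to the others makes the agent more reluctant to accept decisions that worsen that term.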

Second, the limited optimization mode:

if pnlt2 ≤ con2+diff2 and pnlt1 ≤ con1+diff1 then    (10)

pnlt = pnlt_obj;

if pnlt2 > con2+diff2 and pnlt1 ≤ con1+diff1 then    (11)

pnlt = pnlt_obj + c*pnlt2;

if pnlt2 ≤ con2+diff2 and pnlt1 > con1+diff1 then    (12)

pnlt = pnlt_obj + c*pnlt1;

otherwise

pnlt = pnlt_obj + c1*pnlt1 + c2*pnlt2;    (13)
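The four cases (10) to (13) can be written as a single function, sketched below in Python. The function name `limited_mode_penalty` and the default coefficients are assumptions; con1, con2, diff1 and diff2 are simply passed through as given, since the sketch only reproduces the case analysis.

```python
def limited_mode_penalty(
    pnlt_obj: float,
    pnlt1: float,
    pnlt2: float,
    con1: float,
    con2: float,
    diff1: float,
    diff2: float,
    c: float = 1.0,
    c1: float = 1.0,
    c2: float = 1.0,
) -> float:
    """Piecewise penalty of the limited optimization mode, cases (10) to (13)."""
    within1 = pnlt1 <= con1 + diff1
    within2 = pnlt2 <= con2 + diff2
    if within1 and within2:
        return pnlt_obj                          # case (10): both within bounds
    if within1 and not within2:
        return pnlt_obj + c * pnlt2              # case (11): only pnlt2 exceeds
    if within2 and not within1:
        return pnlt_obj + c * pnlt1              # case (12): only pnlt1 exceeds
    return pnlt_obj + c1 * pnlt1 + c2 * pnlt2    # case (13): both exceed
```

For example, with con1 = con2 = 1.0 and diff1 = diff2 = 0.1, a call with pnlt1 = 1.0 and pnlt2 = 2.0 falls under case (11).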

(2) the technical scheme is as follows

Specifically, the invention provides a lightweight method for optimizing system power consumption based on online learning, which comprises the following steps:

step 1, compiling and linking a user source program to generate object code;

step 2, on a Linux operating system platform, starting a kernel monitoring and analysis module such as Perf (a software performance analysis tool), and using the module to monitor hardware characteristic events;

step 3, preprocessing the system events, mainly by normalization, to obtain a group of characteristic parameter vectors;

step 4, establishing a system power consumption model that jointly covers temperature, performance and power consumption, and passing the characteristic parameter vector obtained in step 3 into the power consumption model;

step 5, designing a penalty factor module in the value function, and selecting different optimization modes according to the different ways of calculating the penalty factor;

step 6, designing a value function calculation module based on a Q learning algorithm framework;

step 7, designing an agent module (Agent) based on the Q learning algorithm and the penalty factor, and writing the power consumption model, the penalty factor and the value function module into the Agent module;

step 8, designing a software timer that triggers step 3 and step 7 at regular intervals;

step 9, executing the current application program, then executing step 7, then executing step 3, updating the Agent module, and obtaining a calculated value through the value function in the module;

step 10, setting convergence, namely judging the calculated value obtained in step 9 against the ideal clock parameters for performance, power consumption and temperature, so that the Agent module outputs a decision result sooner;

step 11, returning to step 2 according to the decision result of the Agent module, then entering the decision process of the next period and repeating from step 3 until the user program has finished running.

wherein, "compiling and linking a user source program to generate object code" in step 1 is done as follows: the source code is compiled into a binary file by a compilation tool such as gcc;

wherein, the "hardware characteristic event" in step 2 refers to IPC (Instructions/Cycles), cache memories (caches), superscalar execution, branch prediction, and the like;

wherein, the "system event" in step 3 refers to data collected by the hardware characteristic event;

the "feature parameter vector" in step 3 refers to a set of values obtained by normalizing the data of the system events;
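A minimal Python sketch of turning raw counter readings into such a vector is shown below. Only IPC = instructions / cycles follows directly from the definition given in the preparation work; dividing the two stall counters by the total cycle count is an assumed normalization chosen for illustration, and the function name `feature_vector` and the dictionary keys "cycles" and "instructions" are hypothetical.

```python
from typing import Dict, List


def feature_vector(counters: Dict[str, int]) -> List[float]:
    """Normalize raw hardware-event counts into a characteristic parameter vector."""
    cycles = counters["cycles"]
    if cycles <= 0:
        raise ValueError("cycle count must be positive")
    ipc = counters["instructions"] / cycles            # instructions per cycle
    l1i = counters["cycles_l1i_stashed"] / cycles      # assumed: share of cycles stalled on instruction access
    l1d = counters["cycles_l1d_stashed"] / cycles      # assumed: share of cycles stalled on data access
    return [ipc, l1i, l1d]


# Hypothetical sample taken during one timer period.
sample = {
    "cycles": 1000,
    "instructions": 1500,
    "cycles_l1i_stashed": 100,
    "cycles_l1d_stashed": 250,
}
vector = feature_vector(sample)
```

Dividing by the cycle count makes samples from periods of different length comparable, which is the purpose of the normalization step.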

wherein, the "power consumption model" described in step 4 refers to the formulas (3), (4) and (5) in the above "preparation work";

wherein, "a system power consumption model is established aiming at temperature, performance and power consumption in a combined mode" in step 4 means that the joint temperature, performance and power consumption model, i.e. the system power consumption model module, is given by formulas (3), (4) and (5) in the "preparation work" above;

wherein, "the characteristic parameter vector obtained in step 3 is transmitted into the power consumption model" in step 4 is done as follows: because power consumption, performance and temperature are affected by different system events to different degrees, only certain representative system events are used to build the model, which reduces the complexity of the power consumption model while still fully describing the state of system power consumption; see formulas (3), (4) and (5) in the "preparation work" above;

wherein, the "penalty factor" mentioned in step 5 refers to the formulas (6) to (13) in the above-mentioned "preparation work";

wherein, "designing a penalty factor module in the value function, and selecting different optimization modes according to different calculation modes of penalty factors" in step 5 is done as follows: different ways of optimizing the parameters of the power consumption model lead to different forms of the penalty factor. The first optimization mode is the free optimization mode, in which the learning algorithm adaptively computes the optimal space among power consumption, performance and temperature and makes its decision accordingly; the second optimization mode is the limited optimization mode, in which the user sets an optimization space within a specified range according to the application environment and the machine learning algorithm optimizes the parameters within that limited space; formulas (6) to (9) are one type of penalty factor module, and formulas (10) to (13) are the other;

wherein, the "value function" described in step 6 refers to the above formula (14) of "preparation work";

in step 6, "designing a value function calculation module based on a Q learning algorithm framework" is implemented as follows:

the Q learning algorithm is used under the assumption of deterministic returns and actions;

s denotes the state, a denotes the action, Q(s, a) denotes an estimate of the overall return obtained by taking action a in state s, r is the immediate return of the action, and γ is a discount factor, where 0 ≤ γ < 1;

step 6.1, initializing the table entry Q(s, a) to 0 for each s, a;

step 6.2, observing the current state s;

step 6.3, repeatedly executing the following:

selecting an action a that maximizes Q(s, a) and executing it;

receiving the immediate return r;

observing the new state s';

updating the table entry for Q(s, a) according to the following formula:

Q(s, a) = r(s, a) + γ * max_{a'} Q(s', a');

s = s'.
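Steps 6.1 to 6.3 can be sketched as a small tabular loop in Python. Everything around the update rule is an assumption made for the example: the function names, the toy environment with three clock levels, the use of a negative penalty as the immediate return, and the tie-breaking rule (the earliest action wins). The method specifies purely greedy selection, so the sketch adds no exploration strategy.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Sequence, Tuple

QTable = Dict[Tuple[Hashable, Hashable], float]


def greedy(q: QTable, actions: Sequence[Hashable], s: Hashable) -> Hashable:
    """Return the action with the largest Q(s, a); ties go to the earliest action."""
    return max(actions, key=lambda a: q.get((s, a), 0.0))


def q_learning(
    step: Callable[[Hashable, Hashable], Tuple[float, Hashable]],
    actions: Sequence[Hashable],
    start: Hashable,
    gamma: float = 0.9,
    iterations: int = 100,
) -> QTable:
    """Deterministic tabular Q learning following steps 6.1 to 6.3."""
    q: QTable = defaultdict(float)      # step 6.1: every Q(s, a) starts at 0
    s = start                           # step 6.2: observe the current state
    for _ in range(iterations):         # step 6.3: repeat
        a = greedy(q, actions, s)       # select the action maximizing Q(s, a)
        r, s_next = step(s, a)          # immediate return r and new state s'
        q[(s, a)] = r + gamma * max(q[(s_next, nxt)] for nxt in actions)
        s = s_next
    return dict(q)


# Hypothetical environment: clock levels 0..2, where level 1 carries no penalty.
ACTIONS = ("down", "stay", "up")
DELTA = {"down": -1, "stay": 0, "up": 1}


def toy_step(level: int, action: str) -> Tuple[float, int]:
    nxt = min(2, max(0, level + DELTA[action]))
    return -abs(nxt - 1), nxt           # return = negative penalty (an assumption)


q_table = q_learning(toy_step, ACTIONS, start=0, gamma=0.5, iterations=20)
```

In this toy run the table settles so that the greedy choice is "up" at level 0 and "stay" at level 1, that is, the agent moves to the penalty-free clock level and remains there.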

wherein, the Agent module in step 7 is a module consisting of the power consumption model, the penalty factor, the value function and the Q learning algorithm framework;

wherein, "designing an agent module (Agent) based on the Q learning algorithm and the penalty factor, and writing the power consumption model, the penalty factor and the value function module into the Agent module" in step 7 is done as follows: p in the value function formula (14) is replaced by formulas (6) to (13), and formula (14) is then used to calculate the Q values for temperature, performance and energy consumption respectively;

wherein, the "design software timer" in step 8 is done as follows: for example, a millisecond timer is written on Linux using the select function;
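One common way to build such a timer is to call select with no file descriptors and only a timeout. The sketch below uses Python's standard `select.select` wrapper rather than the C call; the helper names `sleep_ms` and `run_periodically` and the 5 ms period are assumptions for illustration. Calling select with three empty lists works on Linux and other POSIX systems but not on Windows.

```python
import select
import time
from typing import Callable


def sleep_ms(ms: float) -> None:
    """Block for roughly `ms` milliseconds using select() with only a timeout."""
    select.select([], [], [], ms / 1000.0)


def run_periodically(callback: Callable[[], None], period_ms: float, ticks: int) -> None:
    """Invoke `callback` once per period, `ticks` times in total."""
    for _ in range(ticks):
        sleep_ms(period_ms)
        callback()


# Hypothetical usage: record a timestamp every 5 ms, three times.
stamps = []
started = time.monotonic()
run_periodically(lambda: stamps.append(time.monotonic()), period_ms=5, ticks=3)
elapsed = time.monotonic() - started
```

In the method, the callback would be the point at which step 3 and step 7 are triggered; the period sets how often the agent re-evaluates its decision and therefore how much overhead the optimizer itself adds.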

wherein, "updating the Agent module, and obtaining a calculated value through the value function in the module" in step 9 is done as follows: the following steps are executed repeatedly:

selecting an action a that maximizes Q(s, a) and executing it;

receiving the immediate return r;

observing the new state s';

updating the table entry for Q(s, a) according to the following formula:

Q(s, a) = r(s, a) + γ * max_{a'} Q(s', a');

s = s'.

wherein, "setting convergence" in step 10 sets the convergence of the algorithm framework, which reduces overfitting and further reduces the load of system operation, giving finer-grained power consumption optimization; the decision process of the Agent module is improved by comparing against the optimal clock parameters for performance, power consumption and temperature, and if this fails, step 2 is executed to restart monitoring and code execution.

Through the above steps, temperature, performance and power consumption are considered jointly, and a lightweight machine learning algorithm searches the available optimization space, so that power consumption is reduced while the minimum performance requirement is guaranteed, which addresses the problem that the working time of embedded devices and similar equipment is cut short by battery limits.

3. Advantages and effects

The invention has the beneficial effects that:

The invention provides a lightweight method for optimizing system power consumption based on online learning. The method starts from the hardware characteristic events that influence system power consumption: the events triggered while an application program runs are preprocessed and calculated to obtain characteristic parameters, so that the factors influencing power consumption at different levels are modeled mathematically and a multi-level system power consumption model is established. When the timer fires, the Agent module starts and begins to receive the monitoring information passed on by the monitoring module. Selecting between optimization modes through user settings widens the range of application of the power consumption optimization scheme, and the convergence setting reduces the number of iterations of the algorithm, which lowers the load of system operation and the dynamic power consumption of the system.

(1) A multi-level system power consumption model is established for the embedded system. The hardware characteristic events triggered as static and dynamic power consumption arise while an application runs are monitored, and the monitoring information is preprocessed, calculated and passed into the power consumption model. Factors at the different levels that have little influence on power consumption are largely eliminated, which speeds up optimization with the power consumption model and saves scarce embedded system resources.

(2) To widen the applicability of the optimization algorithm, two power consumption optimization working modes are established by modifying the penalty factor of the value function. The first, the free optimization working mode, places no limit on the optimization space of the learning algorithm and makes decisions solely by evaluating the value function; in the second, the limited optimization working mode, the user sets the optimization space according to the application environment, and decisions are then made by calculating and evaluating the value function.

(3) The optimal clock parameters for power consumption, performance and temperature are compared with the clock obtained by optimizing the value function to decide whether the learning algorithm carries out the next iteration. This improves the convergence of the online learning algorithm, prevents overfitting through excess iterations, reduces calls to the value function, and avoids degrading system performance.
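The text does not state how the comparison in item (3) is carried out, so the Python sketch below is only one possible reading: iteration stops once the clock chosen through the value function lies within a tolerance of the range spanned by the three ideal clocks. The function name `has_converged`, the tolerance parameter, the dictionary keys and the MHz values are all assumptions.

```python
from typing import Dict


def has_converged(chosen_clock: float, ideal_clocks: Dict[str, float], tolerance: float) -> bool:
    """Assumed stopping test: is the chosen clock inside the tolerated band of ideal clocks?"""
    lower = min(ideal_clocks.values()) - tolerance
    upper = max(ideal_clocks.values()) + tolerance
    return lower <= chosen_clock <= upper


# Hypothetical ideal clocks (MHz) for the three criteria.
ideal = {"performance": 1200.0, "power": 800.0, "temperature": 1000.0}
stop_now = has_converged(900.0, ideal, tolerance=50.0)
keep_iterating = not has_converged(1300.0, ideal, tolerance=50.0)
```

A test of this kind is cheap compared with another pass through the value function, which is the point of the convergence setting: fewer iterations mean less load from the optimizer itself.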

Drawings

FIG. 1 is a block diagram of the process of the present invention.

Detailed Description

The invention discloses a lightweight method for optimizing system power consumption based on online learning. As shown in FIG. 1, its specific implementation steps are as follows:

step 1, compiling and linking the application source program code to generate object code;

step 2, using perf so that performance statistics for the application program are gathered through the PMU (performance monitoring unit) and kernel counters, and monitoring hardware characteristic events such as IPC (instructions per cycle), processor clock cycles, cache and branch prediction;

step 3, processing the system events obtained in step 2, i.e. the monitoring data sampled over time, with formulas (1) and (2), mainly to normalize the data;

step 4, establishing the system power consumption model through formulas (3), (4) and (5), and passing in the data calculated in step 3;

step 5, designing the first working mode, the self-adaptive optimization mode, according to formulas (6), (7), (8) and (9), and designing the second working mode, the conditional optimization mode, according to formulas (10), (11), (12) and (13);

step 6, designing the value function calculation module according to formula (14);

step 7, designing the Agent module, and writing the power consumption model, the penalty factor and value function module and the Q learning algorithm framework into the Agent module;

step 8, designing a timer that triggers step 3 and step 7 at regular intervals;

step 9, executing the code segment of the current application, then executing step 7, then executing step 3, updating the Agent module, and obtaining a calculated value through the value function in the module;

step 10, setting convergence, namely judging the calculated value obtained in step 9 against the ideal clock parameters for performance, power consumption and temperature, so that the Agent module outputs a decision result sooner.

Claims

1. A lightweight method for optimizing system power consumption based on online learning is characterized in that: the method comprises the following steps:

step 1, a user source program generates an object code through compiling and linking;

step 2, starting a kernel monitoring and analyzing module in a Linux operating system platform, and monitoring a hardware characteristic event by using the module;

step 3, preprocessing the system event, and performing normalization processing to obtain a group of characteristic parameter vectors;

step 7, designing an agent module (Agent) based on the Q learning algorithm and the penalty factor, and writing the power consumption model, the penalty factor and the value function into the Agent module;

step 10, setting convergence, namely comparing the calculated value obtained in the step 9 with ideal clock parameters of performance, power consumption and temperature to accelerate an Agent module to output a decision result;

2. The lightweight method for optimizing the system power consumption based on the online learning according to claim 1, wherein: the "user source program generates object code through compiling and linking" in step 1 is done as follows: the source code is compiled into a binary file by the gcc compilation tool.

3. The lightweight method for optimizing the system power consumption based on the online learning according to claim 1, wherein: in step 4, the characteristic parameter vector obtained in step 3 is passed into the power consumption model as follows: because power consumption, performance and temperature are affected by different system events to different degrees, only certain representative system events are used to build the model, which reduces the complexity of the power consumption model while still fully describing the state of system power consumption.

4. The lightweight method for optimizing the system power consumption based on the online learning according to claim 1, wherein: in step 5, the penalty factor module in the value function is designed and different optimization modes are selected according to the different ways of calculating the penalty factor as follows: different ways of optimizing the parameters of the power consumption model lead to different forms of the penalty factor; the first optimization mode is the free optimization mode, in which the Q learning algorithm adaptively computes the optimal space among power consumption, performance and temperature and makes its decision accordingly; the second optimization mode is the limited optimization mode, in which the user sets an optimization space within a specified range according to the application environment and the Q learning algorithm optimizes the parameters within that limited space.

5. The lightweight method for optimizing the system power consumption based on the online learning according to claim 1, wherein: the "Q-learning algorithm based framework, design value function calculation module" described in step 6 is done as follows:

the Q learning algorithm is used under the assumption of deterministic returns and actions; s denotes the state, a denotes the action, Q(s, a) denotes an estimate of the overall return obtained by taking action a in state s, r is the immediate return of the action, and γ is a discount factor, where 0 ≤ γ < 1;

step 6.1, initializing the table entry Q(s, a) to 0 for each s, a;

step 6.2, observing the current state s;

step 6.3, repeatedly executing the following:

selecting an action a that maximizes Q(s, a) and executing it;

receiving the immediate return r;

observing the new state s';

updating the table entry for Q(s, a) according to the following formula:

Q(s, a) = r(s, a) + γ * max_{a'} Q(s', a');

s = s'.

6. The lightweight method for optimizing the system power consumption based on the online learning according to claim 1, wherein: the Agent module in step 7 refers to a module composed of a power consumption model, a penalty factor, a value function and a Q learning algorithm framework.

7. The lightweight method for optimizing the system power consumption based on the online learning according to claim 1, wherein: the "design software timer" described in step 8 is done as follows: a millisecond timer is written on Linux using the select function.

8. The lightweight method for optimizing the system power consumption based on the online learning according to claim 5, wherein: in step 9, the "update Agent module, which obtains the calculated value through the value function in the module" is done as follows: the following steps are executed repeatedly:

selecting an action a that maximizes Q(s, a) and executing it;

receiving the immediate return r;

observing the new state s';

updating the table entry for Q(s, a) according to the following formula:

Q(s, a) = r(s, a) + γ * max_{a'} Q(s', a');

s = s'.

9. The lightweight method for optimizing the system power consumption based on the online learning according to claim 1, wherein: the convergence setting in step 10 sets the convergence of the Q learning algorithm framework, which reduces overfitting and further reduces the load of system operation, giving finer-grained power consumption optimization; the decision process of the Agent module is improved by comparing against the optimal clock parameters for performance, power consumption and temperature, and if this fails, step 2 is executed to restart monitoring and code execution.