CN113435042B - Reinforced learning modeling method for demand response of building air conditioning system - Google Patents
- Publication number
- CN113435042B (application CN202110716683.0A)
- Authority
- CN
- China
- Prior art keywords
- indoor
- wall
- reinforcement learning
- model
- air conditioning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2119/00—Details relating to the type or aim of the analysis or the optimisation
- G06F2119/08—Thermal analysis or thermal optimisation
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Evolutionary Computation (AREA)
- Geometry (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Air Conditioning Control Device (AREA)
Abstract
The invention discloses a reinforcement learning modeling method for demand response of a building air conditioning system, comprising the following steps: developing an RC gray-box model from known indoor temperature, cooling capacity, weather, and occupancy data of a building, thereby establishing the relation between the indoor temperature and the cooling capacity of the air conditioning system; establishing a reinforcement learning model based on linear value-function approximation for control of the air conditioning system and the storage battery; training the reinforcement learning agent with meteorological and occupancy data; and applying the fully trained agent to a target time period to obtain a control strategy for the air conditioning system and the storage battery. The method reflects the thermal characteristics of the building and reduces the need for reference data and prior knowledge. While ensuring the thermal comfort of the occupants, it provides an operation strategy for the air conditioning system and the storage battery with low energy consumption and low electricity cost.
Description
Technical Field
The invention belongs to the intersection of building energy management and artificial intelligence, and particularly relates to a reinforcement learning modeling method for demand response of a building air conditioning system.
Background
Building operation accounts for an important share of energy consumption in China, and air conditioning accounts for a large proportion of building operating energy. The response of a building to external weather conditions is delayed and attenuated, which complicates control of the air conditioning system. In practice, the operating strategy is therefore set from operator experience: the operator adjusts it according to the current weather, the weather forecast, past experience, operating economy, and other factors. Occupant comfort and energy savings are judged only subjectively, so neither occupant comfort nor reduced energy consumption can be guaranteed.
Many methods for automatically controlling a building air conditioning system already exist. Feedback control responds with a delay, so it cannot regulate the indoor temperature in time or keep the system running efficiently. Control based on supervised machine learning usually requires reference data and prior knowledge, which the conditions of actual operation rarely provide. In addition, most existing control methods control only the air conditioning system and not a storage battery, so they cannot shift load on the power grid.
Disclosure of Invention
In view of the above, the present invention provides a reinforcement learning modeling method for demand response of a building air conditioning system. It automatically controls the building air conditioner and a storage battery system with reduced dependence on reference data and prior knowledge, so as to ensure occupant comfort while reducing the energy consumption and electricity cost of the air conditioning system.
To achieve this aim, the reinforcement learning modeling method for demand response of a building air conditioning system comprises the following steps:
Step 1): establish, for a target building, an RC gray-box model reflecting the relation between indoor temperature and cooling capacity. The input variables of the model are the number of occupants n, the air-conditioning cooling capacity Q_L, the total solar radiation, the window area F_win, and the wall area (excluding windows) F_wall; the output variable is the hourly indoor temperature T_in;
Step 2): train the gray-box model on known data for the hourly indoor temperature T_in, the number of occupants n, the air-conditioning cooling capacity Q_L, the total solar radiation, the window area F_win, and the wall area (excluding windows) F_wall, and identify the wall thermal resistance R_w, the window thermal resistance R_win, the wall heat capacity C_w, the indoor air heat capacity C_in, the window heat gain coefficient c_1, and the wall heat gain coefficient c_2;
Step 3): establish a reinforcement learning model for the air conditioner control system and the storage battery control system. The environment with which the agents interact is the gray-box model of step 1), with its parameters set to the identification results of step 2). The input variables are the hourly number of occupants n, the air-conditioning cooling capacity Q_L, the total solar radiation, the window area F_win, and the wall area (excluding windows) F_wall; the output variables are the hourly cooling capacity Q_L of the air conditioning system and the charging/discharging action ΔE_t of the storage battery control system;
Step 4): train the reinforcement learning model on 60 consecutive days of measured data for the hourly indoor temperature T_in, the number of occupants n, the air-conditioning cooling capacity Q_L, the total solar radiation, the window area F_win, and the wall area (excluding windows) F_wall, and save the trained model;
Step 5): input the data for the period to be simulated (hourly indoor temperature T_in, number of occupants n, air-conditioning cooling capacity Q_L, total solar radiation, window area F_win, and wall area (excluding windows) F_wall) into the trained reinforcement learning model to obtain a control strategy for the air conditioning system and the storage battery.
Further, in step 3), the gray-box model established in step 1) is combined with the reinforcement learning algorithm and serves as the environment with which the reinforcement learning agent interacts. The RC gray-box model expression is as follows:
Further, the reinforcement learning model in step 3) adopts a reinforcement learning algorithm based on value-function approximation; the approximation is linear, and the basis functions are Gaussian kernel functions.
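A linear value-function approximation over Gaussian kernel features can be sketched as follows. This is only an illustration: the Q-learning-style update, the one-dimensional temperature state, the three discrete cooling levels, and every numeric value are assumptions, since the text specifies only the linear approximation and the Gaussian basis.

```python
import numpy as np

def gaussian_features(state, centers, width):
    """Gaussian kernel basis: phi_i(s) = exp(-||s - c_i||^2 / (2 * width^2))."""
    diff = centers - np.asarray(state, dtype=float)
    return np.exp(-np.sum(diff ** 2, axis=1) / (2.0 * width ** 2))

def q_values(weights, phi):
    """Linear approximation with one weight row per action: Q(s, a) = w_a . phi(s)."""
    return weights @ phi

def q_learning_update(weights, phi, action, reward, phi_next, alpha=0.1, gamma=0.95):
    """One semi-gradient Q-learning step (assumed update rule) for the chosen action."""
    td_target = reward + gamma * np.max(q_values(weights, phi_next))
    td_error = td_target - q_values(weights, phi)[action]
    new_weights = weights.copy()
    new_weights[action] += alpha * td_error * phi
    return new_weights, td_error

# Hypothetical setup: state = indoor temperature in deg C, 3 cooling levels.
centers = np.linspace(20.0, 30.0, 6).reshape(-1, 1)  # kernel centers at 20, 22, ..., 30
weights = np.zeros((3, len(centers)))
phi = gaussian_features([26.0], centers, width=2.0)
phi_next = gaussian_features([25.0], centers, width=2.0)
weights, td_error = q_learning_update(weights, phi, action=1, reward=-1.0, phi_next=phi_next)
```

Because the approximation is linear in the weights, each update touches only the weight row of the action taken, scaled by how strongly each kernel responds to the current state.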
Further, the return function that governs the training of the reinforcement learning model in step 4) is as follows:
In formula (3), r_AC,t is the reward obtained at time t by the agent controlling the air conditioning system, and r_ES,t is the reward obtained at time t by the agent controlling the storage battery. α_1 and α_2 are the importance weights of temperature and electricity cost, respectively; ω_1 and ω_2 are time parameters representing, respectively, the periods when occupants are indoors and the periods when the air conditioner is on; P'_t is the normalized energy consumption of the equipment; P_max is the maximum energy consumption of the equipment; λ_t is the electricity price at the current time t; and ΔE_t is the change in the electric energy stored in the battery. The return function also uses a temperature-offset penalty function. Its terms penalize, respectively: a control result that violates the temperature setpoint interval; the electricity cost of the air conditioner, −α_2 × P'_t × λ_t × ω_2; the electricity cost of the storage battery; and a discharge amount greater than the remaining stored charge.
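The structure of the two rewards can be illustrated with a short sketch. It is not formula (3): the distance-based temperature penalty, the 0/1 treatment of ω_1 and ω_2, the sign convention for ΔE_t, the fixed overdraw penalty, and all default values are assumptions chosen only to mirror the terms described above.

```python
def temperature_penalty(t_in, t_low, t_high):
    """Assumed offset penalty: distance of t_in outside [t_low, t_high], zero inside."""
    if t_in < t_low:
        return t_low - t_in
    if t_in > t_high:
        return t_in - t_high
    return 0.0

def reward_ac(t_in, power, price, occupied, ac_on,
              alpha1=1.0, alpha2=1.0, p_max=10.0, t_low=24.0, t_high=26.0):
    """Air-conditioning reward: comfort penalty plus electricity-cost penalty."""
    p_norm = power / p_max                 # P'_t, normalized by P_max
    omega1 = 1.0 if occupied else 0.0      # assumed indicator: occupants indoors
    omega2 = 1.0 if ac_on else 0.0         # assumed indicator: air conditioner on
    comfort_term = -alpha1 * temperature_penalty(t_in, t_low, t_high) * omega1
    cost_term = -alpha2 * p_norm * price * omega2
    return comfort_term + cost_term

def reward_battery(delta_e, stored, price, overdraw_penalty=10.0):
    """Battery reward; delta_e > 0 charges, delta_e < 0 discharges (assumed convention)."""
    cost_term = -delta_e * price           # charging costs money, discharging offsets it
    violation = overdraw_penalty if -delta_e > stored else 0.0
    return cost_term - violation
```

With these assumed defaults, an occupied room at 27 °C drawing half of P_max at a price of 0.8 yields a reward of about -1.4: one unit of comfort penalty plus 0.4 of cost penalty.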
Advantageous effects
1. The method models the building with an RC gray-box model, which effectively expresses the thermodynamic characteristics of the building and accurately simulates the response of the indoor temperature to the air conditioner control strategy.
2. By establishing a reinforcement learning model, the invention makes effective use of the information gained from interaction between the algorithm and the environment, reducing the dependence on reference data and prior knowledge.
3. The invention uses the reinforcement learning model to propose and simulate control strategies for the air conditioning system and the storage battery, reducing the operating load and the operating cost of the air conditioning system while ensuring the thermal comfort of the occupants.
Drawings
FIG. 1 is a technical route diagram of a reinforcement learning modeling method for demand response of a building air conditioning system according to the present invention;
FIG. 2 is a comparison of the room temperature calculated by the RC gray-box model with the room temperature simulated by DeST in one embodiment of the present invention;
FIG. 3 is a diagram illustrating the variation of the accumulated reward function with the number of iterations in the training process according to an embodiment of the present invention;
FIG. 4 is a graph illustrating simulation results of indoor temperature and air conditioner energy consumption based on reinforcement learning control according to an embodiment of the present invention;
FIG. 5 is a graph showing simulation results of indoor temperature, air conditioner energy consumption, and storage battery charge/discharge energy based on reinforcement learning control according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
The invention provides a reinforcement learning modeling method for demand response of a building air conditioning system. Its flow chart is shown in FIG. 1, and the method comprises the following steps:
Step 1: establish, for a building, an RC gray-box model reflecting the relation between indoor temperature and cooling capacity. The input variables of the model are the number of occupants n, the air-conditioning cooling capacity Q_L, the total solar radiation, the window area F_win, and the wall area (excluding windows) F_wall; the output variable is the hourly indoor temperature T_in.
The gray-box model expression is as follows:
Step 2: train the gray-box model on known data for the hourly indoor temperature T_in, the number of occupants n, the air-conditioning cooling capacity Q_L, the total solar radiation, the window area F_win, and the wall area (excluding windows) F_wall. The parameters R_w, R_win, C_w, C_in, c_1, and c_2 are identified using a particle swarm optimization algorithm.
Step 3: establish a reinforcement learning model for the air conditioner control system and the storage battery control system. The environment with which the agents interact is the gray-box model of step 1, with its parameters set to the identification results of step 2. The input variables are the hourly number of occupants n, the air-conditioning cooling capacity Q_L, the total solar radiation, the window area F_win, and the wall area (excluding windows) F_wall; the output variables are the hourly cooling capacity Q_L of the air conditioning system and the charging/discharging action ΔE_t of the storage battery. The reinforcement learning model adopts a reinforcement learning algorithm based on value-function approximation; the approximation is linear, and Gaussian kernel functions are used as the basis functions.
Step 4: train the reinforcement learning model on data measured over a continuous period (hourly indoor temperature T_in, number of occupants n, air-conditioning cooling capacity Q_L, total solar radiation, window area F_win, and wall area (excluding windows) F_wall), and save the trained model.
Step 5: input the data for the period to be simulated (hourly indoor temperature T_in, number of occupants n, air-conditioning cooling capacity Q_L, total solar radiation, window area F_win, and wall area (excluding windows) F_wall) into the reinforcement learning model trained in step 4 to obtain a control strategy for the air conditioning system and the storage battery.
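As an illustration of the kind of environment step 1 describes, the sketch below advances a two-node RC network by one hour. The topology (wall resistance split in half around the wall node, window as a single resistance to outdoors), the outdoor temperature input `t_out`, the per-person heat gain `q_person`, the explicit Euler sub-stepping, and all numeric values are assumptions for illustration and are not the patent's gray-box expression.

```python
from dataclasses import dataclass

@dataclass
class RCParams:
    r_w: float    # wall thermal resistance R_w, K/W
    r_win: float  # window thermal resistance R_win, K/W
    c_w: float    # wall heat capacity C_w, J/K
    c_in: float   # indoor air heat capacity C_in, J/K
    c1: float     # window heat gain coefficient c_1
    c2: float     # wall heat gain coefficient c_2

def step_hour(t_in, t_wall, t_out, solar, n_people, q_cool, f_win, f_wall, p,
              q_person=100.0, substeps=60):
    """Advance indoor and wall temperatures by one hour with explicit Euler sub-steps."""
    dt = 3600.0 / substeps
    half_r = p.r_w / 2.0  # assumed: wall node sits in the middle of R_w
    for _ in range(substeps):
        # Net heat flow into the wall node (W): outdoor side, indoor side, solar gain.
        q_wall = (t_out - t_wall) / half_r + (t_in - t_wall) / half_r \
            + p.c2 * solar * f_wall
        # Net heat flow into the indoor air node (W), minus cooling capacity Q_L.
        q_air = (t_wall - t_in) / half_r + (t_out - t_in) / p.r_win \
            + p.c1 * solar * f_win + q_person * n_people - q_cool
        t_wall += dt * q_wall / p.c_w
        t_in += dt * q_air / p.c_in
    return t_in, t_wall

# Hypothetical parameters and inputs.
params = RCParams(r_w=0.01, r_win=0.02, c_w=5e7, c_in=1e6, c1=0.5, c2=0.1)
at_rest = step_hour(25.0, 25.0, 25.0, 0.0, 0, 0.0, 20.0, 80.0, params)
cooled = step_hour(25.0, 25.0, 25.0, 0.0, 0, 2000.0, 20.0, 80.0, params)
occupied = step_hour(25.0, 25.0, 25.0, 0.0, 5, 0.0, 20.0, 80.0, params)
```

A reinforcement learning agent would call a function like `step_hour` once per control step, passing its chosen cooling capacity as `q_cool` and reading back the new indoor temperature.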
Example:
Step 1: establish, for the target building, an RC gray-box model reflecting the relation between indoor temperature and indoor cooling capacity.
Step 2: train the gray-box model on known data for the hourly indoor temperature T_in, the number of occupants n, the air-conditioning cooling capacity Q_L, the total solar radiation, the window area F_win, and the wall area (excluding windows) F_wall. The parameters R_w, R_win, C_w, C_in, c_1, and c_2 are identified using a particle swarm optimization algorithm.
Specifically, FIG. 2 compares the room temperature calculated by the RC gray-box model in this embodiment with the room temperature simulated by DeST. The calculated RRMSE of the RC gray-box model is 2.26%, which shows that the model accurately reflects the relation between indoor temperature and indoor load changes.
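The identification and its error metric can be sketched as follows. RRMSE is taken here as the RMSE divided by the mean observation, in percent, which is one common definition and an assumption about the one used; the particle swarm routine is a generic textbook variant; and the one-parameter decay model with its synthetic data is a hypothetical stand-in for the gray-box model and the DeST temperatures.

```python
import numpy as np

def rrmse(predicted, observed):
    """Relative RMSE in percent: 100 * RMSE / mean(observed) (assumed definition)."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    rmse = np.sqrt(np.mean((predicted - observed) ** 2))
    return 100.0 * rmse / np.mean(observed)

def pso_minimize(objective, lower, upper, n_particles=30, n_iter=200, seed=0,
                 inertia=0.72, c_cog=1.49, c_soc=1.49):
    """Minimal global-best particle swarm optimizer with box constraints."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    pos = rng.uniform(lower, upper, size=(n_particles, lower.size))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                                  # personal bests
    best_val = np.array([objective(p) for p in pos])
    g = np.argmin(best_val)
    g_pos, g_val = best_pos[g].copy(), best_val[g]         # global best
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = inertia * vel + c_cog * r1 * (best_pos - pos) + c_soc * r2 * (g_pos - pos)
        pos = np.clip(pos + vel, lower, upper)
        vals = np.array([objective(p) for p in pos])
        improved = vals < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]
        g = np.argmin(best_val)
        if best_val[g] < g_val:
            g_pos, g_val = best_pos[g].copy(), best_val[g]
    return g_pos, g_val

# Hypothetical stand-in model: indoor temperature relaxing toward 30 deg C.
hours = np.arange(24)

def decay_model(tau, t_out=30.0, t0=24.0):
    return t_out + (t0 - t_out) * np.exp(-hours / tau)

measured = decay_model(5.0)  # synthetic "measurements" with true tau = 5
best, err = pso_minimize(lambda p: rrmse(decay_model(p[0]), measured), [1.0], [20.0])
```

For the six gray-box parameters, the same routine would be called with six-element `lower` and `upper` bounds and an objective that runs the RC simulation over the training period.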
Step 3: establish a reinforcement learning model for the air conditioner control system and the storage battery control system. The environment with which the agents interact is the gray-box model of step 1, with its parameters set to the identification results of step 2. The input variables are the number of occupants n, the air-conditioning cooling capacity Q_L, the total solar radiation, the window area F_win, and the wall area (excluding windows) F_wall.
Step 4: train the reinforcement learning model on data measured over a continuous period (hourly indoor temperature T_in, number of occupants n, air-conditioning cooling capacity Q_L, total solar radiation, window area F_win, and wall area (excluding windows) F_wall), and save the trained model.
In this embodiment, hourly meteorological and occupancy data for 60 days (1440 hours) are used, and the number of iterations is set to 1000. During training the cumulative return eventually stabilizes; its variation with the number of iterations is shown in FIG. 3.
Step 5: input the data for the period to be simulated (hourly indoor temperature T_in, number of occupants n, air-conditioning cooling capacity Q_L, total solar radiation, window area F_win, and wall area (excluding windows) F_wall) into the reinforcement learning model trained in step 4 to obtain a control strategy for the air conditioning system and the storage battery.
Specifically, in this embodiment, hourly weather and occupancy data are selected for 3 days (72 hours in total) different from those used in step 4, and the number of iterations is set to 1000. The resulting indoor temperature and air conditioner energy consumption are shown in FIG. 4, and the indoor temperature and the storage battery charge/discharge control strategy are shown in FIG. 5.
Compared with a rule-based control strategy with the same control requirements, the strategy generated by the reinforcement learning model reduces the energy consumption of the air conditioning system by 9.8% and the electricity cost by 14.8%; with the storage battery in use, the electricity cost can be reduced further, by 29.7% compared with the rule-based control strategy.
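Comparisons of this kind reduce to simple cost arithmetic under a time-varying price. The sketch below shows how shifting load with a battery lowers the bill; the four-hour load profile, the prices, the battery schedule, and the lossless-battery assumption are all hypothetical and are not the embodiment's data.

```python
def electricity_cost(load_kwh, price, battery_delta_kwh=None):
    """Total cost of grid energy; battery_delta > 0 charges, < 0 discharges (assumed)."""
    if battery_delta_kwh is None:
        battery_delta_kwh = [0.0] * len(load_kwh)
    return sum((load + delta) * lam
               for load, delta, lam in zip(load_kwh, battery_delta_kwh, price))

def percent_reduction(baseline, candidate):
    """Reduction of candidate relative to baseline, in percent."""
    return 100.0 * (baseline - candidate) / baseline

# Hypothetical data: two off-peak hours followed by two peak hours.
load = [2.0, 2.0, 4.0, 4.0]      # air conditioner energy per hour, kWh
price = [0.5, 0.5, 1.0, 1.0]     # electricity price per kWh
shift = [1.0, 1.0, -1.0, -1.0]   # charge off-peak, discharge at peak (no losses)

base_cost = electricity_cost(load, price)
battery_cost = electricity_cost(load, price, shift)
saving = percent_reduction(base_cost, battery_cost)
```

The battery does not change the energy used by the air conditioner; the saving comes entirely from buying that energy at the cheaper hours.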
Claims (4)
1. A reinforcement learning modeling method for demand response of a building air conditioning system, characterized by comprising the following steps:
Step 1): establish, for a target building, an RC gray-box model reflecting the relation between indoor temperature and cooling capacity, wherein the input variables of the model are the number of occupants n, the air-conditioning cooling capacity Q_L, the total solar radiation, the window area F_win, and the wall area F_wall, and the output variable of the model is the hourly indoor temperature T_in;
Step 2): train the gray-box model on known data for the hourly indoor temperature T_in, the number of occupants n, the air-conditioning cooling capacity Q_L, the total solar radiation, the window area F_win, and the wall area F_wall, and identify the wall thermal resistance R_w, the window thermal resistance R_win, the wall heat capacity C_w, the indoor air heat capacity C_in, the window heat gain coefficient c_1, and the wall heat gain coefficient c_2;
Step 3): establish a reinforcement learning model for the air conditioner control system and the storage battery control system, wherein the environment with which the agents interact is the gray-box model of step 1), the parameters of the gray-box model are the identification results of step 2), the input variables are the hourly number of occupants n, the air-conditioning cooling capacity Q_L, the total solar radiation, the window area F_win, and the wall area F_wall, and the output variables are the hourly cooling capacity Q_L of the air conditioning system and the charging/discharging action ΔE_t of the storage battery control system;
Step 4): train the reinforcement learning model on 60 consecutive days of measured data for the hourly indoor temperature T_in, the number of occupants n, the air-conditioning cooling capacity Q_L, the total solar radiation, the window area F_win, and the wall area F_wall, and save the trained model;
Step 5): input the data for the period to be simulated (hourly indoor temperature T_in, number of occupants n, air-conditioning cooling capacity Q_L, total solar radiation, window area F_win, and wall area F_wall) into the trained reinforcement learning model to obtain a control strategy for the air conditioning system and the storage battery.
2. The reinforcement learning modeling method for demand response of a building air conditioning system according to claim 1, wherein in step 3) the gray-box model established in step 1) is combined with the reinforcement learning algorithm and serves as the environment with which the reinforcement learning agent interacts, and the RC gray-box model expression is as follows:
3. The reinforcement learning modeling method for demand response of a building air conditioning system according to claim 1, wherein the reinforcement learning model in step 3) adopts a reinforcement learning algorithm based on value-function approximation, the approximation is linear, and the basis functions are Gaussian kernel functions.
4. The reinforcement learning modeling method for demand response of a building air conditioning system according to claim 1, wherein the return function that governs the training of the reinforcement learning model in step 4) is as follows:
In formula (3), r_AC,t is the reward obtained at time t by the agent controlling the air conditioning system, and r_ES,t is the reward obtained at time t by the agent controlling the storage battery. α_1 and α_2 are the importance weights of temperature and electricity cost, respectively; ω_1 and ω_2 are time parameters representing, respectively, the periods when occupants are indoors and the periods when the air conditioner is on; P'_t is the normalized energy consumption of the equipment; P_max is the maximum energy consumption of the equipment; λ_t is the electricity price at the current time t; and ΔE_t is the change in the electric energy stored in the battery. The return function also uses a temperature-offset penalty function. Its terms penalize, respectively: a control result that violates the temperature setpoint interval; the electricity cost of the air conditioner, −α_2 × P'_t × λ_t × ω_2; the electricity cost of the storage battery; and a discharge amount greater than the remaining stored charge.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110716683.0A CN113435042B (en) | 2021-06-28 | 2021-06-28 | Reinforced learning modeling method for demand response of building air conditioning system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110716683.0A CN113435042B (en) | 2021-06-28 | 2021-06-28 | Reinforced learning modeling method for demand response of building air conditioning system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113435042A (en) | 2021-09-24 |
CN113435042B (en) | 2022-05-17 |
Family
ID=77754821
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110716683.0A Active CN113435042B (en) | 2021-06-28 | 2021-06-28 | Reinforced learning modeling method for demand response of building air conditioning system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113435042B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115018184B (en) * | 2022-06-28 | 2024-04-05 | 天津大学 | Double-layer optimal scheduling method for air conditioning system based on demand response |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109347149A (en) * | 2018-09-20 | 2019-02-15 | 国网河南省电力公司电力科学研究院 | Micro-capacitance sensor energy storage dispatching method and device based on depth Q value network intensified learning |
CN109617205A (en) * | 2018-11-28 | 2019-04-12 | 江苏理工学院 | The cooperative control method of electric car composite power source power distribution |
CN112460741A (en) * | 2020-11-23 | 2021-03-09 | 香港中文大学(深圳) | Control method of building heating, ventilation and air conditioning system |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108895633A (en) * | 2018-05-08 | 2018-11-27 | 林兴斌 | Using building structure as the central air conditioner system control method of cool storage medium |
US11002202B2 (en) * | 2018-08-21 | 2021-05-11 | Cummins Inc. | Deep reinforcement learning for air handling control |
CN110781551A (en) * | 2019-11-14 | 2020-02-11 | 长安大学 | Simulation method for thermal process of cold supply room with embedded pipe type enclosure structure |
US11573540B2 (en) * | 2019-12-23 | 2023-02-07 | Johnson Controls Tyco IP Holdings LLP | Methods and systems for training HVAC control using surrogate model |
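Non-Patent Citations (2)
Title |
---|
"Review of RC thermal network models for building energy consumption prediction"; Shi Xin et al.; Chinese Journal of Scientific Instrument; 2014-12-31; Vol. 35, No. 12; full text * |
"Reinforcement learning for building controls: The opportunities and challenges"; Zhe Wang et al.; Applied Energy; 2020-07-01; full text * |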
Also Published As
Publication number | Publication date |
---|---|
CN113435042A (en) | 2021-09-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110458443B (en) | Smart home energy management method and system based on deep reinforcement learning | |
US11610214B2 (en) | Deep reinforcement learning based real-time scheduling of Energy Storage System (ESS) in commercial campus | |
CN106487011A (en) | A kind of based on the family of Q study microgrid energy optimization method | |
Li et al. | Reinforcement learning of room temperature set-point of thermal storage air-conditioning system with demand response | |
CN113112077B (en) | HVAC control system based on multi-step prediction deep reinforcement learning algorithm | |
CN114370698B (en) | Indoor thermal environment learning efficiency improvement optimization control method based on reinforcement learning | |
CN113572157A (en) | User real-time autonomous energy management optimization method based on near-end policy optimization | |
CN105356491A (en) | Power fluctuation smoothening method based on optimum control of energy storage and virtual energy storage | |
Georgiou et al. | Implementing artificial neural networks in energy building applications—A review | |
CN116663820A (en) | Comprehensive energy system energy management method under demand response | |
CN110097217A (en) | A kind of building dynamic Room Temperature Prediction method based on equivalent RC model | |
CN113435042B (en) | Reinforced learning modeling method for demand response of building air conditioning system | |
CN114623569A (en) | Cluster air conditioner load differentiation regulation and control method based on deep reinforcement learning | |
CN112560160A (en) | Model and data-driven heating ventilation air conditioner optimal set temperature obtaining method and equipment | |
CN115840987A (en) | Hybrid vehicle thermal management strategy generation method based on deep reinforcement learning | |
CN117172499A (en) | Smart community energy optimal scheduling method, system and storage medium based on reinforcement learning | |
CN115882463A (en) | Commercial building air conditioner load schedulable potential evaluation method | |
CN114498649A (en) | Active power distribution network building thermal load control method and device, electronic equipment and storage medium | |
CN112288161A (en) | Method and device for optimizing peak-shifting electricity consumption of residents | |
CN115169839A (en) | Heating load scheduling method based on data-physics-knowledge combined drive | |
CN116734424B (en) | Indoor thermal environment control method based on RC model and deep reinforcement learning | |
CN115840986B (en) | Energy management method based on stochastic model predictive control | |
Ozawa et al. | Data-driven HVAC Control Using Symbolic Regression: Design and Implementation | |
CN114781274B (en) | Comprehensive energy system control optimization method and system for simulation and decision alternate learning | |
CN117112202A (en) | Virtual power plant distributed resource scheduling method based on deep reinforcement learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||