Building HVAC control with reinforcement learning for reduction of energy cost and demand charge

Abstract Energy efficiency remains a significant topic in the control of building heating, ventilation, and air-conditioning (HVAC) systems, and a diverse set of control strategies has been developed to optimize performance, including recently emerging deep reinforcement learning (DRL) techniques. While most existing works have focused on minimizing energy consumption, the generalization to energy cost minimization under time-varying electricity price profiles and demand charges has rarely been studied. Under these utility structures, significant cost savings can be achieved by pre-cooling buildings in the early morning when electricity is cheaper, thereby reducing expensive afternoon consumption and lowering peak demand. However, correctly identifying these savings requires planning horizons of one day or more. To tackle this problem, we develop a Deep Q-Network (DQN) with an action processor, defining the environment as a Partially Observable Markov Decision Process (POMDP) with a reward function consisting of energy cost (time-of-use and peak demand charges) and a discomfort penalty, which extends the reward functions used in most existing DRL works in this area. Moreover, we develop a reward shaping technique to overcome the reward sparsity caused by the demand charge. Using a single-zone building simulation platform, we demonstrate that the customized DQN outperforms the baseline rule-based policy, saving close to 6% of total cost with demand charges and close to 8% without demand charges.
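The reward structure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's formulation: the `Tariff` values, the comfort band, the `comfort_weight`, and the function names are all hypothetical. In particular, the abstract does not specify the reward shaping technique; the sketch swaps in one simple possibility, charging the demand rate on each increase of the running peak, so that the sparse end-of-period demand charge is spread over the steps that cause it.

```python
from dataclasses import dataclass


@dataclass
class Tariff:
    """Hypothetical utility rate structure (all values are illustrative)."""
    on_peak_price: float = 0.25   # $/kWh during on-peak hours
    off_peak_price: float = 0.10  # $/kWh otherwise
    on_peak_start: int = 12       # hour of day, inclusive
    on_peak_end: int = 18         # hour of day, exclusive
    demand_rate: float = 10.0     # $/kW applied to the period's peak power

    def energy_price(self, hour: int) -> float:
        on_peak = self.on_peak_start <= hour % 24 < self.on_peak_end
        return self.on_peak_price if on_peak else self.off_peak_price


def discomfort(temp_c: float, low: float = 22.0, high: float = 25.0) -> float:
    """Degrees Celsius outside an assumed comfort band (zero inside it)."""
    return max(low - temp_c, 0.0) + max(temp_c - high, 0.0)


def step_reward(tariff, hour, power_kw, temp_c, running_peak_kw,
                dt_hours=1.0, comfort_weight=0.5):
    """Return (reward, new_running_peak) for one control step.

    The demand term is an assumed shaping scheme: only the increase of the
    running peak is charged, so the per-step terms sum to
    demand_rate * final_peak when the running peak starts at zero.
    """
    energy_cost = tariff.energy_price(hour) * power_kw * dt_hours
    new_peak = max(running_peak_kw, power_kw)
    demand_increment = tariff.demand_rate * (new_peak - running_peak_kw)
    penalty = comfort_weight * discomfort(temp_c)
    return -(energy_cost + demand_increment + penalty), new_peak


def episode_return(tariff, hours, powers_kw, temps_c):
    """Sum the per-step rewards over one billing period."""
    total, peak = 0.0, 0.0
    for hour, power, temp in zip(hours, powers_kw, temps_c):
        reward, peak = step_reward(tariff, hour, power, temp, peak)
        total += reward
    return total, peak
```

Under this assumed scheme the agent receives a demand-related signal at the step where a new peak is set rather than only at the end of the billing period, while the undiscounted episode return still equals the negative of the time-of-use cost, the demand charge on the final peak, and the weighted discomfort.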
