Data-driven control of micro-climate in buildings: An event-triggered reinforcement learning approach

Abstract: Smart buildings have great potential for shaping an energy-efficient, sustainable, and more economical future, as buildings account for approximately 40% of global energy consumption. The future of smart buildings lies in using sensory data for adaptive decision making and control, which is currently hindered by a key challenge: learning a good control policy in a short period of time in an online and continuing fashion. To tackle this challenge, we propose an event-triggered paradigm, as opposed to the classic time-triggered one, in which learning and control decisions are made when events occur and enough information has been collected. Events are characterized by design conditions and occur when those conditions are met, for instance, when a certain state threshold is reached. By systematically adjusting the timing of learning and control decisions, the proposed framework can potentially reduce the variance in learning and, consequently, improve the control process. We formulate the micro-climate control problem as a semi-Markov decision process, which allows for variable-time state transitions and decision making. Using extended policy gradient theorems and temporal difference methods in a reinforcement learning setup, we propose two learning algorithms for event-triggered control of micro-climate in buildings. We show the efficacy of the proposed approach by designing a smart learning thermostat that simultaneously optimizes energy consumption and occupants’ comfort in a test building.
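To make the event-triggered idea concrete, the following is a minimal Python sketch, not the algorithms proposed in the paper. It assumes a hypothetical first-order thermal model, a fixed comfort band (`LOW`, `HIGH`), a hand-written threshold policy, and an illustrative cost that mixes discomfort and energy use; all names and numbers are assumptions chosen for illustration. A decision is made only when the temperature crosses a band edge, and each recorded transition carries its own sojourn time, as in a semi-Markov decision process.

```python
from dataclasses import dataclass

LOW, HIGH = 19.0, 23.0  # assumed comfort band in deg C (illustrative values)


@dataclass
class Transition:
    state: float       # temperature at the event that opened the interval
    action: int        # heater command held during the interval (0 = off, 1 = on)
    reward: float      # negated cost accumulated over the interval
    sojourn: int       # number of time steps until the next event
    next_state: float  # temperature at the event that closed the interval


def step_temperature(temp, heater_on, outdoor=5.0, leak=0.05, gain=1.0):
    """One step of a hypothetical first-order thermal model (an assumption)."""
    return temp + leak * (outdoor - temp) + gain * heater_on


def in_band(temp):
    return LOW < temp < HIGH


def crossed_threshold(prev_temp, temp):
    """Event condition: the temperature leaves the comfort band."""
    return in_band(prev_temp) and not in_band(temp)


def threshold_policy(temp):
    """Hand-written stand-in for a learned policy: heat when at or below LOW."""
    return 1 if temp <= LOW else 0


def run_event_triggered(policy, temp0=20.0, horizon=200,
                        setpoint=21.0, energy_weight=0.1):
    """Simulate the loop; decisions are taken only when an event occurs."""
    transitions = []
    temp = temp0
    action = policy(temp)
    start_temp, reward, sojourn = temp, 0.0, 0
    for _ in range(horizon):
        prev_temp = temp
        temp = step_temperature(temp, action)
        # Illustrative cost: discomfort plus a small energy penalty.
        reward -= abs(temp - setpoint) + energy_weight * action
        sojourn += 1
        if crossed_threshold(prev_temp, temp):
            # Close the variable-length interval and choose a new action.
            transitions.append(
                Transition(start_temp, action, reward, sojourn, temp))
            action = policy(temp)
            start_temp, reward, sojourn = temp, 0.0, 0
    return transitions


if __name__ == "__main__":
    for t in run_event_triggered(threshold_policy)[:4]:
        print(t)
```

In this sketch a learning update would consume one `Transition` per event rather than one sample per clock tick, which is the sense in which the timing of learning and control decisions is adjusted.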
