OCTOPUS: Deep Reinforcement Learning for Holistic Smart Building Control

Recently, significant efforts have been made to improve the comfort of commercial building occupants while also reducing energy use and costs. Most of these efforts have concentrated on energy-efficient control of the HVAC (heating, ventilation, and air conditioning) system, which is usually the core system in charge of a building's conditioning and ventilation. In practice, however, HVAC systems alone cannot control every aspect of conditioning and comfort that affects occupants. Modern lighting, blind, and window systems, usually treated as independent systems, can, when present, significantly affect building energy use and, perhaps more importantly, occupant comfort in terms of thermal, air quality, and illumination conditions. For example, it has been shown that a blind system can reduce the cooling load in summer by 12% to 35% while also improving visual comfort. In this paper, we take a holistic approach to the trade-offs between energy use and comfort in commercial buildings. We developed a system called OCTOPUS, which employs a novel deep reinforcement learning (DRL) framework that uses a data-driven approach to find the optimal control sequences for all of a building's subsystems, including HVAC, lighting, blind, and window systems. The DRL architecture includes a novel reward function that allows the framework to explore the trade-offs between energy use and occupant comfort, while at the same time enabling the solution of the high-dimensional control problem that arises from the interactions of four different building subsystems. To meet OCTOPUS's training data requirements, we argue that calibrated simulations that match the target building's operating points are the vehicle for generating enough data to train our DRL framework to find the control solution for the target building.
In our work, we trained OCTOPUS with 10 years of weather data and a building model implemented in the EnergyPlus building simulator and calibrated using data from a real production building. Through extensive simulations, we demonstrate that OCTOPUS achieves energy savings of 14.26% and 8.1% compared with the state-of-the-art rule-based method in a LEED Gold Certified building and the latest DRL-based method available in the literature, respectively, while maintaining human comfort within a desired range.
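
To make the energy-versus-comfort trade-off concrete, the following is a minimal sketch of one way a reward of this kind could be shaped. It is not the reward function defined in the paper: the observation fields, the comfort ranges, the normalization constants, and the two weights are all hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class StepObservation:
    """Hypothetical per-step measurements; field names are assumptions."""
    energy_kwh: float        # energy used by all subsystems during the step
    pmv: float               # thermal comfort index (Predicted Mean Vote)
    illuminance_lux: float   # visual comfort proxy
    co2_ppm: float           # air quality proxy


def range_violation(value: float, low: float, high: float) -> float:
    """Return how far `value` lies outside [low, high]; 0.0 when inside."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0


def reward(obs: StepObservation,
           energy_weight: float = 1.0,
           comfort_weight: float = 10.0) -> float:
    """Negative weighted sum of energy use and comfort-range violations.

    The comfort ranges and scaling below are illustrative assumptions,
    not values taken from the paper.
    """
    violation = (
        range_violation(obs.pmv, -0.5, 0.5)
        + range_violation(obs.illuminance_lux, 500.0, 1000.0) / 500.0
        + range_violation(obs.co2_ppm, 0.0, 800.0) / 800.0
    )
    return -(energy_weight * obs.energy_kwh + comfort_weight * violation)


if __name__ == "__main__":
    comfortable = StepObservation(2.0, 0.0, 600.0, 600.0)
    too_warm = StepObservation(2.0, 1.0, 600.0, 600.0)
    print(reward(comfortable), reward(too_warm))
```

In this sketch, raising `comfort_weight` relative to `energy_weight` pushes a learned policy toward keeping occupants inside the comfort ranges at the cost of more energy, which is the kind of trade-off the abstract describes.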
