Experimental evaluation of model-free reinforcement learning algorithms for continuous HVAC control

[1]  M. Thring.  World Energy Outlook , 1977 .

[2]  Martin L. Puterman,et al.  Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .

[3]  Dimitri P. Bertsekas,et al.  Dynamic Programming and Optimal Control, Two Volume Set , 1995 .

[4]  Michael C. Mozer,et al.  The Neural Network House: An Environment that Adapts to its Inhabitants , 1998 .

[5]  Yishay Mansour,et al.  Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.

[6]  John Langford,et al.  Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.

[7]  Gregor P. Henze,et al.  Evaluation of Reinforcement Learning Control for Thermal Energy Storage Systems , 2003 .

[8]  Martin A. Riedmiller.  Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method , 2005, ECML.

[9]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[10]  Simeng Liu,et al.  Experimental analysis of simulated reinforcement learning control for active and passive building thermal storage inventory: Part 2: Results and analysis , 2006 .

[11]  Simeng Liu,et al.  Experimental analysis of simulated reinforcement learning control for active and passive building thermal storage inventory: Part 1. Theoretical foundation , 2006 .

[12]  Simeng Liu,et al.  Evaluation of reinforcement learning for optimal control of building active and passive thermal storage inventory , 2007 .

[13]  Hado van Hasselt,et al.  Double Q-learning , 2010, NIPS.

[14]  Guy Lever,et al.  Deterministic Policy Gradient Algorithms , 2014, ICML.

[15]  Farrokh Janabi-Sharifi,et al.  Theory and applications of HVAC control systems – A review of model predictive control (MPC) , 2014 .

[16]  Lizhen Huang,et al.  Shelter and residential building energy consumption within the 450 ppm CO2eq constraints in different climate zones , 2015 .

[17]  Ronnie Belmans,et al.  Learning Agent for a Heat-Pump Thermostat With a Set-Back Strategy Using Model-Free Reinforcement Learning , 2015, ArXiv.

[18]  Sergey Levine,et al.  Trust Region Policy Optimization , 2015, ICML.

[19]  Giuseppe Tommaso Costanzo,et al.  Experimental analysis of data-driven control for a building heating system , 2015, ArXiv.

[20]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[21]  Yuval Tassa,et al.  Continuous control with deep reinforcement learning , 2015, ICLR.

[22]  Pieter Abbeel,et al.  Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.

[23]  David Silver,et al.  Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.

[24]  Alex Graves,et al.  Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.

[25]  Demis Hassabis,et al.  Mastering the game of Go with deep neural networks and tree search , 2016, Nature.

[26]  Sergey Levine,et al.  High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.

[27]  Wojciech Zaremba,et al.  OpenAI Gym , 2016, ArXiv.

[28]  Tianshu Wei,et al.  Deep reinforcement learning for building HVAC control , 2017, 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC).

[29]  Bart De Schutter,et al.  Residential Demand Response of Thermostatically Controlled Loads Using Batch Reinforcement Learning , 2017, IEEE Transactions on Smart Grid.

[30]  Sergey Levine,et al.  Reinforcement Learning with Deep Energy-Based Policies , 2017, ICML.

[31]  Dale Schuurmans,et al.  Bridging the Gap Between Value and Policy Based Reinforcement Learning , 2017, NIPS.

[32]  Biao Huang,et al.  A Long-Short Term Memory Recurrent Neural Network Based Reinforcement Learning Controller for Office Heating Ventilation and Air Conditioning Systems , 2017 .

[33]  Alec Radford,et al.  Proximal Policy Optimization Algorithms , 2017, ArXiv.

[34]  Henry Zhu,et al.  Soft Actor-Critic Algorithms and Applications , 2018, ArXiv.

[35]  Chayan Nadjahi,et al.  A review of thermal management and innovative cooling strategies for data center , 2018, Sustain. Comput. Informatics Syst.

[36]  Giovanni De Magistris,et al.  Reinforcement Learning Testbed for Power-Consumption Optimization , 2018, ArXiv.

[37]  Philip Bachman,et al.  Deep Reinforcement Learning that Matters , 2017, AAAI.

[38]  Herke van Hoof,et al.  Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.

[39]  D. Ślęzak,et al.  Methods and Applications for Modeling and Simulation of Complex Systems , 2018, Communications in Computer and Information Science.

[40]  Sergey Levine,et al.  Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.

[41]  Marc G. Bellemare,et al.  Distributional Reinforcement Learning with Quantile Regression , 2017, AAAI.

[42]  Zicheng Cai,et al.  Gnu-RL: A Precocial Reinforcement Learning Solution for Building HVAC Control Using a Differentiable MPC Policy , 2019, BuildSys@SenSys.

[43]  Jie Li,et al.  Energy-Efficient Thermal Comfort Control in Smart Buildings via Deep Reinforcement Learning , 2019, ArXiv.

[44]  Xingxing Zhang,et al.  A review of reinforcement learning methodologies for controlling occupant comfort in buildings , 2019, Sustainable Cities and Society.

[45]  José R. Vázquez-Canteli,et al.  Reinforcement learning for demand response: A review of algorithms and modeling techniques , 2019, Applied Energy.

[46]  V. Prasanna,et al.  Building HVAC Scheduling Using Reinforcement Learning via Neural Network Based Model Approximation , 2019, BuildSys@SenSys.

[47]  Nicolas Le Roux,et al.  Understanding the impact of entropy on policy optimization , 2018, ICML.

[48]  Khee Poh Lam,et al.  Whole building energy model for HVAC optimal control: A practical framework based on deep reinforcement learning , 2019, Energy and Buildings.

[49]  Enda Barrett,et al.  Transfer Learning Applied to Reinforcement Learning-Based HVAC Control , 2020, SN Computer Science.

[50]  Anjukan Kathirgamanathan,et al.  A Centralised Soft Actor Critic Deep Reinforcement Learning Approach to District Demand Side Management through CityLearn , 2020, ArXiv.

[51]  José R. Vázquez-Canteli,et al.  MARLISA: Multi-Agent Reinforcement Learning with Iterative Sequential Action Selection for Load Shaping of Grid-Interactive Connected Buildings , 2020, BuildSys@SenSys.

[52]  Zhe Wang,et al.  Reinforcement learning for building controls: The opportunities and challenges , 2020, Applied Energy.

[53]  Alberto E. Cerpa,et al.  MB2C: Model-Based Deep Reinforcement Learning for Multi-zone Building Control , 2020, BuildSys@SenSys.

[54]  Zheng O'Neill,et al.  One for Many: Transfer Learning for Building HVAC Control , 2020, BuildSys@SenSys.

[55]  Bingqing Chen,et al.  Gnu-RL: A Practical and Scalable Reinforcement Learning Solution for Building HVAC Control Using a Differentiable MPC Policy , 2020, Frontiers in Built Environment.

[56]  Sameera S. Ponda,et al.  Autonomous navigation of stratospheric balloons using reinforcement learning , 2020, Nature.

[57]  Hartmut Schmeck,et al.  A Guide for the Design of Benchmark Environments for Building Energy Optimization , 2020, BuildSys@SenSys.

[58]  Yonggang Wen,et al.  Transforming Cooling Optimization for Green Data Center via Deep Reinforcement Learning , 2017, IEEE Transactions on Cybernetics.

[59]  Parameswaran Kamalaruban,et al.  Applications of reinforcement learning in energy systems , 2021, Renewable and Sustainable Energy Reviews.

[60]  Helia Zandi,et al.  Intelligent multi-zone residential HVAC control strategy based on deep reinforcement learning , 2021 .

[61]  Zhibin Niu,et al.  Understanding energy demand behaviors through spatio-temporal smart meter data analysis , 2021, Energy.