Learning Agent for a Heat-Pump Thermostat With a Set-Back Strategy Using Model-Free Reinforcement Learning

The conventional control paradigm for a heat pump with a less efficient auxiliary heating element is to keep its temperature set point constant during the day. A constant set point keeps the heat pump operating in its more efficient heat-pump mode and minimizes the risk of activating the auxiliary heating element. As an alternative, this paper proposes a learning agent for a thermostat with a set-back strategy, which relaxes the set-point temperature during convenient moments, e.g., when the occupants are not at home. Finding an optimal set-back strategy requires solving a sequential decision-making problem under uncertainty, which presents two challenges. First, for most residential buildings, a description of the thermal characteristics of the building is unavailable and difficult to obtain. Second, the relevant state information, i.e., that of the building envelope, cannot be measured by the learning agent. To overcome these two challenges, the paper proposes an auto-encoder coupled with a batch reinforcement learning technique. The approach is validated in simulation for two building types with different thermal characteristics, for heating in the winter and cooling in the summer. The simulation results indicate that the proposed learning agent can reduce energy consumption by 4%–9% during 100 winter days and by 9%–11% during 80 summer days compared to the conventional constant set-point strategy.
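The batch reinforcement learning idea can be illustrated with a minimal sketch. This is not the paper's implementation: the auto-encoder is omitted, the regressor is replaced by a plain tabular average over discretized states, and the thermal model, occupancy schedule, cost weights, and all names (`step`, `occupied`, `fitted_q_iteration`, and so on) are hypothetical choices made only for illustration. The sketch collects a fixed batch of one-step transitions and repeatedly fits Q-values to the observed cost plus the discounted best next-state value.

```python
import random

GAMMA = 0.9                 # discount factor (assumed)
T_MIN, T_MAX = 10, 26       # indoor temperature range covered by the table [°C]
T_OUT, T_SET = 5.0, 20.0    # outdoor temperature and comfort set point [°C] (assumed)

def occupied(hour):
    """Hypothetical schedule: occupants are away from 08:00 to 17:00."""
    return not (8 <= hour < 17)

def step(temp, hour, action):
    """Toy first-order thermal model; stands in for the unknown building."""
    next_temp = temp + 0.1 * (T_OUT - temp) + 2.0 * action
    next_hour = (hour + 1) % 24
    discomfort = max(0.0, T_SET - next_temp) if occupied(next_hour) else 0.0
    cost = 1.0 * action + 5.0 * discomfort   # energy use plus comfort penalty
    return next_temp, next_hour, cost

def discretize(temp, hour):
    """Map the state to a (1 °C temperature bin, hour of day) pair."""
    t_bin = min(max(int(temp), T_MIN), T_MAX - 1)
    return t_bin, hour

def collect_batch(n, rng):
    """Gather one-step transitions under random exploratory actions."""
    batch = []
    for _ in range(n):
        temp, hour = rng.uniform(T_MIN, T_MAX), rng.randrange(24)
        action = rng.randrange(2)            # 0 = heating off, 1 = heating on
        next_temp, next_hour, cost = step(temp, hour, action)
        batch.append((discretize(temp, hour), action, cost,
                      discretize(next_temp, next_hour)))
    return batch

def fitted_q_iteration(batch, iterations=40):
    """Refit Q to cost + GAMMA * min over next actions, using only the fixed batch."""
    q = {}
    for _ in range(iterations):
        sums, counts = {}, {}
        for s, a, cost, s_next in batch:
            target = cost + GAMMA * min(q.get((s_next, b), 0.0) for b in (0, 1))
            sums[(s, a)] = sums.get((s, a), 0.0) + target
            counts[(s, a)] = counts.get((s, a), 0) + 1
        # Averaging regressor: one mean target per (state bin, action) cell.
        q = {sa: sums[sa] / counts[sa] for sa in sums}
    return q

def policy(q, temp, hour):
    """Pick the action with the lowest estimated cost-to-go."""
    s = discretize(temp, hour)
    return min((0, 1), key=lambda a: q.get((s, a), 0.0))

batch = collect_batch(10000, random.Random(0))
q_table = fitted_q_iteration(batch)
```

Calling `policy(q_table, temp, hour)` returns 0 or 1 for a given indoor temperature and hour. Because the agent learns only from recorded transitions, no thermal model of the building is needed at decision time, which is the property the abstract relies on; a learned feature encoder would take the place of `discretize` when the state is not directly measurable.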
