Residential Demand Response Applications Using Batch Reinforcement Learning

Driven by recent advances in batch Reinforcement Learning (RL), this paper contributes to the application of batch RL to demand response. In contrast to conventional model-based approaches, batch RL techniques do not require a system identification step, which makes them more suitable for large-scale implementation. This paper extends fitted Q-iteration, a standard batch RL technique, to the situation where a forecast of the exogenous data is provided. In general, batch RL techniques do not rely on expert knowledge of the system dynamics or of the solution; when such knowledge is available, however, it can be incorporated through our novel policy adjustment method. Finally, we tackle the challenge of finding the open-loop schedule required to participate in the day-ahead market. We propose a model-free Monte-Carlo estimator that uses a metric to construct artificial trajectories, and we illustrate the method by finding the day-ahead schedule of a heat-pump thermostat. Our experiments show that batch RL techniques provide a valuable alternative to model-based controllers and can be used to construct both closed-loop and open-loop policies.
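To make the approach concrete, here is a minimal Python sketch of fitted Q-iteration in which the forecast of the exogenous data is handled by simply augmenting the state vector with forecast features. The regressor (scikit-learn's ExtraTreesRegressor, in the spirit of tree-based batch RL) and all names and hyperparameters are illustrative assumptions, not the authors' exact implementation:

```python
# Minimal fitted Q-iteration sketch with forecast-augmented states.
# All names and hyperparameters here are illustrative assumptions.
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

def fitted_q_iteration(transitions, actions, n_iterations=50, gamma=0.95):
    """transitions: list of (x, u, r, x_next) tuples, where each state x is
    assumed to already include exogenous forecast features (e.g., the
    forecast outdoor temperature for the coming hours)."""
    X = np.array([np.append(x, u) for x, u, _, _ in transitions])
    R = np.array([r for _, _, r, _ in transitions])
    X_next = [x_next for _, _, _, x_next in transitions]

    Q = None
    for _ in range(n_iterations):
        if Q is None:
            y = R  # first iteration: Q equals the immediate reward
        else:
            # Bellman backup: evaluate every candidate action in each next state
            q_next = np.column_stack([
                Q.predict(np.array([np.append(x, u) for x in X_next]))
                for u in actions])
            y = R + gamma * q_next.max(axis=1)
        Q = ExtraTreesRegressor(n_estimators=50).fit(X, y)
    return Q

def greedy_action(Q, x, actions):
    """Closed-loop policy: pick the action with the highest fitted Q-value."""
    values = [Q.predict(np.append(x, u).reshape(1, -1))[0] for u in actions]
    return actions[int(np.argmax(values))]
```

A second sketch, under the same caveats, illustrates a model-free Monte-Carlo estimator of the kind described above: it builds artificial trajectories by repeatedly selecting the batch transition closest, under a chosen metric (Euclidean here, as an assumption), to the current state-action pair. Evaluating each candidate open-loop schedule this way and keeping the best yields a day-ahead schedule without a system model:

```python
# Sketch of a model-free Monte-Carlo return estimator that stitches
# artificial trajectories from batch transitions; the Euclidean metric
# and all names are illustrative assumptions.
import numpy as np

def mfmc_estimate(transitions, policy, x0, horizon, n_trajectories=10):
    """Estimate the return of `policy` from (x, u, r, x_next) transitions.
    For day-ahead scheduling, policy(x, t) can simply return the t-th
    action of a candidate open-loop schedule. Requires at least
    horizon * n_trajectories transitions, since each is used at most once."""
    available = list(transitions)
    returns = []
    for _ in range(n_trajectories):
        x, total = np.asarray(x0), 0.0
        for t in range(horizon):
            u = policy(x, t)
            # select the unused transition closest to (x, u) under the metric
            dists = [np.linalg.norm(np.append(xi, ui) - np.append(x, u))
                     for xi, ui, _, _ in available]
            xi, ui, ri, x_next = available.pop(int(np.argmin(dists)))
            total += ri
            x = np.asarray(x_next)
        returns.append(total)
    return float(np.mean(returns))
```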
