Direct Load Control of Thermostatically Controlled Loads Based on Sparse Observations Using Deep Reinforcement Learning

This paper considers a demand response agent that must find a near-optimal sequence of decisions based on sparse observations of its environment. Extracting a relevant set of features from these observations is challenging and may require substantial domain knowledge. One way to tackle this problem is to store sequences of past observations and actions in the state vector, which makes it high-dimensional, and to apply deep learning techniques. This paper investigates how well different deep learning techniques, such as convolutional neural networks and recurrent neural networks, extract relevant features for finding near-optimal policies for a residential heating system and an electric water heater when only sparse observations are available. Our simulation results indicate that, in this specific scenario, feeding sequences of time series to a long short-term memory (LSTM) network, a specific type of recurrent neural network, achieved higher performance than stacking these time series in the input of a convolutional neural network or a deep neural network.
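
The idea of storing past observations and actions in the state can be illustrated with a minimal sketch. It is not the implementation used in the paper: the class name `HistoryState`, the zero padding, the history depth, and the example temperature values are all assumptions made for illustration. The sketch keeps the last few (observation, action) pairs and exposes them either as a per-time-step sequence, the shape a recurrent network would consume, or as one stacked vector, the shape a feed-forward or convolutional network would consume.

```python
from collections import deque


class HistoryState:
    """Hypothetical helper: keeps the last `depth` (observation, action) pairs."""

    def __init__(self, depth, obs_dim, pad_value=0.0):
        self.depth = depth
        self.obs_dim = obs_dim
        # Pre-fill with padding (an assumption) so the state has a fixed
        # size from the very first control step.
        self.buffer = deque(
            [[pad_value] * (obs_dim + 1) for _ in range(depth)],
            maxlen=depth,
        )

    def push(self, observation, action):
        """Append the newest observation and action; the oldest pair drops out."""
        if len(observation) != self.obs_dim:
            raise ValueError("unexpected observation size")
        self.buffer.append(list(observation) + [float(action)])

    def as_sequence(self):
        """One row per time step, oldest first (input for a recurrent network)."""
        return [list(row) for row in self.buffer]

    def as_flat_vector(self):
        """All rows stacked into a single high-dimensional state vector."""
        return [value for row in self.buffer for value in row]


# Example: a single (made-up) temperature reading and an on/off action per step.
history = HistoryState(depth=3, obs_dim=1)
history.push([20.5], 1)
history.push([20.9], 0)
sequence = history.as_sequence()
flat = history.as_flat_vector()
```

Both views hold the same information; they differ only in whether the time axis is kept explicit for the network or folded into the input dimension.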
