Comparing neural architectures for demand response through model-free reinforcement learning for heat pump control

As batch reinforcement learning algorithms mature and neural networks see increasing use in reinforcement learning, the performance of the candidate network architectures deserves a systematic comparison. This paper discusses the implementation of a heat pump agent in a demand response setting and the cost effectiveness achieved when the environment dynamics are modeled with different neural network types. The agent maintains the interior air temperature of a building between pre-set temperature constraints, with four actions at its disposal, and is incentivized to shift loads against day-ahead market prices in order to minimize daily electricity costs. The simulation compares a multilayer perceptron (MLP), a convolutional neural network (CNN), and a long short-term memory network (LSTM) as models of the environment dynamics. All three architectures outperform a trivial thermostat controller and shift loads successfully after 20–25 days. For this particular setup there is no significant difference between the MLP and the LSTM, while both outperform the CNN. The MLP is preferred as it requires far less computation time.

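The abstract states that the environment dynamics are modeled with an MLP, a CNN, and an LSTM, and that the agent chooses among four discrete actions. The sketch below is one hypothetical way those three dynamics models could be laid out in PyTorch: the 24-step observation history, the three input features (indoor temperature, outdoor temperature, day-ahead price), the one-hot action encoding, the single-temperature output, and all layer sizes are illustrative assumptions, not the paper's actual configuration.

```python
# Hypothetical sketch (not the paper's code): three dynamics models that
# predict the next indoor air temperature from a state history and an action.
# History length, input features, and layer sizes are assumptions; only the
# four discrete actions come from the abstract itself.
import torch
import torch.nn as nn

H, F_IN, N_ACTIONS = 24, 3, 4  # history length, features per step, actions


class MLPDyn(nn.Module):
    """Flattens the history into one feature vector before dense layers."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(H * F_IN + N_ACTIONS, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, hist, action):          # hist: (B, H, F), action: (B, A)
        return self.net(torch.cat([hist.flatten(1), action], dim=1))


class CNNDyn(nn.Module):
    """Runs 1-D convolutions over the time axis before a dense head."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(F_IN, 16, 3, padding=1), nn.ReLU(),
            nn.Conv1d(16, 16, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(16 * H + N_ACTIONS, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, hist, action):
        z = self.conv(hist.transpose(1, 2)).flatten(1)   # (B, H, F) -> (B, 16*H)
        return self.head(torch.cat([z, action], dim=1))


class LSTMDyn(nn.Module):
    """Encodes the history sequentially; predicts from the last hidden state."""

    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(F_IN, 32, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(32 + N_ACTIONS, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, hist, action):
        out, _ = self.lstm(hist)                         # out: (B, H, 32)
        return self.head(torch.cat([out[:, -1, :], action], dim=1))


if __name__ == "__main__":
    hist = torch.randn(8, H, F_IN)                       # 8 simulated histories
    action = nn.functional.one_hot(
        torch.randint(N_ACTIONS, (8,)), N_ACTIONS).float()
    for net in (MLPDyn(), CNNDyn(), LSTMDyn()):
        print(type(net).__name__, net(hist, action).shape)  # torch.Size([8, 1])
```

Each model would be fit by regressing its predicted next indoor temperature against observed transitions (e.g., with an MSE loss). The structural difference the comparison probes is visible here: the CNN and LSTM treat the history as a sequence, while the MLP flattens it into a single feature vector, which is also why it is the cheapest of the three to evaluate.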