Using reinforcement learning for optimizing heat pump control in a building model in Modelica

With the increasing share of renewable energy sources in the electricity grid comes the need to exploit the flexibility available on the demand side. Demand response programs seek to exploit this flexibility by motivating end users to shift their consumption in response to grid signals. An important source of flexibility among residential consumers is Thermostatically Controlled Loads (TCLs). At the same time, recent advances in reinforcement learning have made it possible to apply the technique to a wide range of problems. Motivated by these promising examples, a Batch Reinforcement Learning (BRL) algorithm is applied to a TCL. An important property of the complex optimization problem studied here is its partial observability. The main contribution of this paper is the application of BRL to a detailed building and heating system model implemented in Modelica. The detailed TCL model allows an in-depth analysis of the effects of partial observability on the performance of the chosen control strategy. Ultimately, this paper illustrates that Modelica can provide a detailed environment for a BRL algorithm. Finally, the learned control policy is compared with two other control policies; it outperforms both and becomes economically viable after a limited number of training days.
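The core loop of Batch Reinforcement Learning, fitted Q-iteration over a fixed batch of transitions, can be sketched on a toy stand-in for the detailed Modelica model. Everything below is an illustrative assumption rather than the paper's actual setup: the first-order thermal model and its parameters, the 20–23 °C comfort band, the price range, and the binned (piecewise-constant) Q-function are all made up for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical first-order thermal model of a TCL -- a stand-in for the
# paper's detailed Modelica building model, with made-up parameters.
DT, TAU, GAIN, T_OUT = 0.25, 4.0, 8.0, 5.0   # step [h], time const [h], heating [degC/h], outdoor [degC]

def step(T, u):
    """Indoor temperature after one time step; u = heat pump on (1) / off (0)."""
    return T + DT * ((T_OUT - T) / TAU + GAIN * u)

def reward(T_next, u, price):
    """Comfort penalty outside an assumed 20-23 degC band, minus electricity cost."""
    discomfort = 10.0 * (max(0.0, 20.0 - T_next) + max(0.0, T_next - 23.0))
    return -discomfort - price * u

# Collect a fixed batch of random-exploration transitions (the "batch" in BRL).
batch = []
for _ in range(400):
    T = rng.uniform(12.0, 28.0)              # random initial temperature
    for _ in range(15):
        price = rng.uniform(0.1, 1.0)        # exogenous electricity price
        u = int(rng.integers(0, 2))          # random exploratory action
        T2 = step(T, u)
        batch.append((T, price, u, reward(T2, u, price), T2))
        T = T2

# Fitted Q-iteration with a piecewise-constant (binned) Q-function approximator.
N_T, N_P, GAMMA = 20, 4, 0.95

def idx(T, price):
    i = int(np.clip((T - 10.0) / 20.0 * N_T, 0, N_T - 1))
    j = int(np.clip((price - 0.1) / 0.9 * N_P, 0, N_P - 1))
    return i, j

Q = np.zeros((N_T, N_P, 2))
for _ in range(60):
    s, c = np.zeros_like(Q), np.zeros_like(Q)
    for T, p, u, r, T2 in batch:
        i, j = idx(T, p)
        i2, j2 = idx(T2, p)                  # price assumed constant over one step
        s[i, j, u] += r + GAMMA * Q[i2, j2].max()
        c[i, j, u] += 1
    Q = np.where(c > 0, s / np.maximum(c, 1.0), Q)  # "regression" = average per bin

def policy(T, price):
    i, j = idx(T, price)
    return int(Q[i, j].argmax())
```

Under these toy assumptions the learned policy behaves sensibly, e.g. heating when the building is cold and idling when it is warm and electricity is expensive; the paper replaces the toy model with a detailed Modelica simulation as the environment that generates the batch.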
