Actor-critic learning for optimal building energy management with phase change materials

Abstract: Energy management in buildings that use phase change materials (PCM) to improve thermal performance is challenging because the thermal capacity of the PCM is nonlinear. To address this problem, this paper adopts a model-free, off-policy actor-critic reinforcement learning method based on deep deterministic policy gradient (DDPG). The proposed approach overcomes the major weakness of model-based approaches, such as approximate dynamic programming (ADP), which require an explicit thermal model of the building under control. This requirement makes a plug-and-play implementation of the energy management algorithm in an existing smart meter difficult, because building designs and construction types vary widely. DDPG avoids the requirement: it learns policies in continuous action spaces without access to the full dynamics of the building. We demonstrate the competitive performance of DDPG by benchmarking it against an ADP-based approach that has access to the full thermal dynamics of the building.
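To illustrate what learning "without access to the full dynamics" can look like in practice, the following is a minimal sketch of one DDPG-style update, not the paper's implementation. It assumes a linear actor with a tanh squashing, a critic that is linear in a few hand-picked features, and hypothetical values for the discount factor and the soft-update rate; DDPG proper uses deep networks for both. The update consumes only sampled transitions (state, action, reward, next state), so no thermal model of the building appears anywhere in it.

```python
import numpy as np

GAMMA = 0.99  # discount factor (assumed value)
TAU = 0.01    # soft target-update rate (assumed value)


def actor(theta, s):
    """Deterministic policy: maps a state to one continuous action in [-1, 1]."""
    return np.tanh(theta @ s)


def features(s, a):
    """Hand-picked critic features (hypothetical, for illustration only)."""
    return np.concatenate([s, [a, a * a, 1.0]])


def critic(w, s, a):
    """Linear-in-features estimate of the action value Q(s, a)."""
    return w @ features(s, a)


def dq_da(w, s, a):
    """Analytic gradient of the critic with respect to the action."""
    n = len(s)
    return w[n] + 2.0 * w[n + 1] * a


def ddpg_step(theta, w, theta_targ, w_targ, batch,
              lr_actor=1e-3, lr_critic=1e-2):
    """One actor-critic update on a batch of (s, a, r, s_next) transitions."""
    g_w = np.zeros_like(w)
    g_theta = np.zeros_like(theta)
    for s, a, r, s_next in batch:
        # Critic: the TD target is built from the target actor and critic.
        y = r + GAMMA * critic(w_targ, s_next, actor(theta_targ, s_next))
        td_error = y - critic(w, s, a)
        g_w += td_error * features(s, a)
        # Actor: deterministic policy gradient, dQ/da * d(mu)/d(theta).
        mu = actor(theta, s)
        g_theta += dq_da(w, s, mu) * (1.0 - mu ** 2) * s
    w = w + lr_critic * g_w / len(batch)
    theta = theta + lr_actor * g_theta / len(batch)
    # Soft update: target parameters slowly track the learned ones.
    w_targ = TAU * w + (1.0 - TAU) * w_targ
    theta_targ = TAU * theta + (1.0 - TAU) * theta_targ
    return theta, w, theta_targ, w_targ
```

In a building setting the state might hold indoor and outdoor temperatures and the electricity price, and the action might be a normalised heating or cooling power; those choices are assumptions here, not details taken from the abstract.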

[18]  Giuseppe Peter Vanoli,et al.  Energy refurbishment of existing buildings through the use of phase change materials: Energy savings and indoor comfort in the cooling season , 2014 .