Stochastic Dispatch of Energy Storage in Microgrids: A Reinforcement Learning Approach Incorporated with MCTS

The dynamic dispatch (DD) of battery energy storage systems (BESSs) in microgrids integrated with volatile energy resources is essentially a multiperiod stochastic optimization problem (MSOP). Because the life span of a BESS is significantly affected by its charging and discharging behavior, lifecycle degradation costs should be incorporated into the DD model, which makes the problem non-convex. In general, this MSOP is intractable. To solve it, we propose a reinforcement learning (RL) solution augmented with Monte-Carlo tree search (MCTS) and domain knowledge expressed as dispatching rules. In this solution, Q-learning with function approximation serves as the basic learning architecture, allowing multistep bootstrapping and continuous policy learning. To improve the computational efficiency of randomized multistep simulations, we employ MCTS to estimate the expected maximum action values. Moreover, we embed several dispatching rules in the RL agent as probabilistic logic to reduce exploration of infeasible actions, which improves the quality of the data-driven solution. Numerical test results show that the proposed algorithm outperforms baseline RL algorithms in all tested cases.
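
The abstract does not give implementation details, so the following is a minimal, illustrative sketch (not the authors' code) of the learning architecture it describes: Q-learning with linear function approximation, a sampled expected-maximum bootstrap target standing in for the full MCTS, and state-of-charge limits acting as dispatching rules that mask infeasible actions. The feature map, action grid, limits, and function names are all assumptions introduced for illustration.

```python
# Illustrative sketch only: Q-learning with function approximation for BESS
# dispatch, with (i) a sampled expected-maximum bootstrap target as a stand-in
# for the paper's MCTS estimator and (ii) dispatching rules as action masks.
import numpy as np

ACTIONS = np.linspace(-1.0, 1.0, 11)  # normalized charge (-) / discharge (+) levels

def features(state, action):
    """Hypothetical feature map phi(s, a); state = (soc, net_load, price)."""
    soc, net_load, price = state
    return np.array([1.0, soc, net_load, price, action, action * soc, action * price])

def q_value(w, state, action):
    return float(w @ features(state, action))

def feasible_actions(state, soc_min=0.1, soc_max=0.9):
    """Dispatching rules as hard masks: no discharging an empty battery,
    no charging a full one (assumed limits)."""
    soc = state[0]
    return [a for a in ACTIONS
            if not (a > 0 and soc <= soc_min)
            and not (a < 0 and soc >= soc_max)]

def expected_max_q(w, state, action, sample_next_state, n_samples=32):
    """Estimate E[max_a' Q(s', a')] over sampled stochastic next states;
    a flat, one-step stand-in for the tree search described in the paper."""
    vals = []
    for _ in range(n_samples):
        s_next = sample_next_state(state, action)
        vals.append(max(q_value(w, s_next, a) for a in feasible_actions(s_next)))
    return float(np.mean(vals))

def q_learning_step(w, state, action, reward, sample_next_state,
                    alpha=0.01, gamma=0.95):
    """One temporal-difference update of the weight vector w."""
    target = reward + gamma * expected_max_q(w, state, action, sample_next_state)
    td_error = target - q_value(w, state, action)
    return w + alpha * td_error * features(state, action)
```

For example, a weight vector w = np.zeros(7) can be updated online while a scenario simulator supplies sample_next_state and a degradation-aware reward; in the paper the bootstrap target is instead produced by multistep MCTS rollouts, which this flat sampling only approximates.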
