Double Deep $Q$-Learning-Based Distributed Operation of Battery Energy Storage System Considering Uncertainties

<inline-formula> <tex-math notation="LaTeX">$Q$ </tex-math></inline-formula>-learning-based operation strategies have recently been applied to the optimal operation of energy storage systems, where a <inline-formula> <tex-math notation="LaTeX">$Q$ </tex-math></inline-formula>-table stores the <inline-formula> <tex-math notation="LaTeX">$Q$ </tex-math></inline-formula>-values for all possible state-action pairs. However, <inline-formula> <tex-math notation="LaTeX">$Q$ </tex-math></inline-formula>-learning struggles with large state space problems, i.e., continuous state spaces or problems with environment uncertainties. To address these limitations of <inline-formula> <tex-math notation="LaTeX">$Q$ </tex-math></inline-formula>-learning, this paper proposes a distributed operation strategy based on the double deep <inline-formula> <tex-math notation="LaTeX">$Q$ </tex-math></inline-formula>-learning method and applies it to the operation of a community battery energy storage system (CBESS) in a microgrid. In contrast to <inline-formula> <tex-math notation="LaTeX">$Q$ </tex-math></inline-formula>-learning, the proposed strategy can cope with uncertainties in the system in both grid-connected and islanded modes, owing to the use of a deep neural network as a function approximator for estimating the <inline-formula> <tex-math notation="LaTeX">$Q$ </tex-math></inline-formula>-values. Moreover, the proposed method mitigates the overestimation bias that is a major drawback of standard deep <inline-formula> <tex-math notation="LaTeX">$Q$ </tex-math></inline-formula>-learning, and it trains the model faster by decoupling action selection from action evaluation.
Finally, the performance of the proposed double deep <inline-formula> <tex-math notation="LaTeX">$Q$ </tex-math></inline-formula>-learning-based operation method is evaluated by comparing its results with those of a centralized operation approach.
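The decoupling of selection and evaluation mentioned in the abstract is the core of the double deep Q-learning target: the online network chooses the next action, while a separate target network evaluates it. The following minimal sketch illustrates that target computation with toy numbers; the function name, the three discrete charge/discharge actions, and the numerical values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def double_dqn_target(q_online_next, q_target_next, reward, gamma, done):
    """Double DQN bootstrap target.

    Selection: the online network's Q-values pick the greedy next action.
    Evaluation: the target network's Q-value for that action is used,
    which reduces the overestimation bias of the standard max operator.
    """
    best_action = int(np.argmax(q_online_next))      # selection (online net)
    bootstrap = q_target_next[best_action]           # evaluation (target net)
    return reward + gamma * bootstrap * (1.0 - float(done))

# Toy Q-value vectors over 3 hypothetical BESS actions (charge/idle/discharge)
q_online_next = np.array([1.0, 3.0, 2.0])  # online net selects action index 1
q_target_next = np.array([0.5, 2.0, 4.0])  # target net evaluates action 1 -> 2.0
y = double_dqn_target(q_online_next, q_target_next, reward=1.0, gamma=0.9, done=False)
# y = 1.0 + 0.9 * 2.0 = 2.8

# By contrast, a standard DQN target would take max(q_target_next) = 4.0,
# yielding 1.0 + 0.9 * 4.0 = 4.6 -- a larger (potentially overestimated) target.
```

Note how the standard-DQN comparison makes the bias reduction concrete: whenever the two networks disagree on the best action, the double-Q target is bounded by the target network's value for the online network's choice rather than by its own maximum.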
