Reinforcement Learning for Electric Power System Decision and Control: Past Considerations and Perspectives

Abstract In this paper, we review past (including very recent) research on using reinforcement learning (RL) to solve electric power system decision and control problems. The work is reviewed in terms of the specific power system problem addressed, the type of control, and the RL method used, and we provide observations drawn from a comprehensive survey of the available publications. The review reveals that RL is considered a viable solution to many decision and control problems across different time scales and power system operating states. Furthermore, we analyse the perspectives of RL approaches in light of the emergence of new generation, communications, and instrumentation technologies currently in use, or available for future use, in power systems. The perspectives are also analysed in terms of recent breakthroughs in RL algorithms (safe RL, deep RL, and path integral control for RL) and of problems not previously considered for RL (most notably restorative and emergency controls together with so-called system integrity protection schemes, fusion with existing robust controls, and the combination of preventive and emergency control).
