Curriculum Based Reinforcement Learning of Grid Topology Controllers to Prevent Thermal Cascading

This paper describes how the domain knowledge of power system operators can be integrated into reinforcement learning (RL) frameworks to effectively train agents that control the grid's topology to prevent thermal cascading. Typical RL-based topology controllers perform poorly due to the large search/optimization space. Here, we propose an actor-critic-based agent to address the problem's combinatorial nature and train the agent using the RL environment developed by RTE, the French TSO. To address the challenge of the large optimization space, a curriculum-based approach with reward tuning is incorporated into the training procedure, modifying the environment using network physics to enhance agent learning. Further, a parallel training approach on multiple scenarios is employed to avoid biasing the agent toward a few scenarios and to make it robust to the natural variability of grid operations. Without these modifications to the training procedure, the RL agent failed on most test scenarios, illustrating the importance of properly integrating domain knowledge of physical systems into real-world RL. The agent was evaluated by RTE in the 2019 Learning to Run a Power Network (L2RPN) challenge, where it placed 2nd in accuracy and 1st in speed. The developed code is open-sourced for public use.
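The curriculum idea summarized above can be sketched in a few lines: the agent first trains in a relaxed environment (generous thermal-limit margins), and the margins are tightened in stages once its success rate clears a threshold. The sketch below is purely illustrative and makes no use of the paper's actual code or the L2RPN environment; `ToyGridEnv`, the "policy strength" scalar, and the promotion threshold are all hypothetical stand-ins for the real actor-critic training loop.

```python
import random

class ToyGridEnv:
    """Stand-in for the grid environment: an episode 'survives' if a random
    loading level, scaled down by the policy, stays under the stage's
    thermal-limit margin (higher margin = easier curriculum stage)."""
    def __init__(self, limit_margin):
        self.limit_margin = limit_margin

    def run_episode(self, policy):
        loading = random.random()       # surrogate for peak line loading
        mitigated = loading * policy()  # policy mitigates the overload
        return mitigated < self.limit_margin

def train_with_curriculum(stages, episodes_per_stage=200, promote_at=0.8, seed=0):
    """Train on progressively harder stages (shrinking margins); only
    advance to the next stage once the success rate passes `promote_at`."""
    random.seed(seed)
    policy_strength = 1.0  # 1.0 = untrained (no mitigation of loading)
    history = []
    for margin in stages:  # e.g. relaxed -> strict thermal limits
        env = ToyGridEnv(margin)
        successes = 0
        for _ in range(episodes_per_stage):
            ok = env.run_episode(lambda: policy_strength)
            successes += ok
            if not ok:  # crude proxy for learning: improve after a failure
                policy_strength *= 0.99
        rate = successes / episodes_per_stage
        history.append((margin, rate))
        if rate < promote_at:
            break  # not ready for a harder stage yet
    return policy_strength, history

strength, history = train_with_curriculum([0.9, 0.7, 0.5])
```

The same staging logic applies when the "stages" are instead scenario sets of increasing difficulty, which is closer in spirit to how curriculum learning is typically combined with actor-critic training.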
