A Reinforcement Learning Approach to Parameter Selection for Distributed Optimization in Power Systems

With the increasing penetration of distributed energy resources, distributed optimization algorithms have attracted significant attention for power systems applications due to their potential for superior scalability, privacy, and robustness against single points of failure. The Alternating Direction Method of Multipliers (ADMM) is a popular distributed optimization algorithm; however, its convergence performance is highly dependent on the choice of penalty parameters, which are usually selected heuristically. In this work, we use reinforcement learning (RL) to develop an adaptive penalty parameter selection policy for the AC optimal power flow (ACOPF) problem solved via ADMM, with the goal of minimizing the number of iterations until convergence. We train our RL policy using deep Q-learning and show that this policy can significantly accelerate convergence (up to a 59% reduction in the number of iterations compared to existing curvature-informed penalty parameter selection methods). Furthermore, we show that our RL policy shows promise for generalization, performing well under unseen loading schemes as well as under unseen losses of lines and generators (up to a 50% reduction in iterations). This work thus provides a proof of concept for using RL for parameter selection in ADMM for power systems applications.
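The abstract describes a deep Q-learning policy that adapts the ADMM penalty parameter from observed convergence behavior so as to minimize the iteration count. The sketch below is only meant to illustrate that control loop: it uses a toy consensus least-squares problem and a tabular Q-learning agent in place of the paper's ACOPF formulation and deep Q-network, and the state discretization (log residual ratio), action set (multiplicative updates to rho), and reward of -1 per iteration are illustrative assumptions rather than the authors' design.

```python
# Hypothetical sketch: Q-learning for ADMM penalty-parameter selection on a toy
# consensus least-squares problem. This stands in for the paper's deep Q-learning
# on ACOPF and is not the authors' implementation.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: N agents, each holding a local term (1/2)||A_i x - b_i||^2.
N, d = 4, 5
A = [rng.normal(size=(8, d)) for _ in range(N)]
b = [rng.normal(size=8) for _ in range(N)]

def admm_step(x, z, u, rho):
    """One round of consensus ADMM; returns updated variables and residuals."""
    for i in range(N):
        # x_i-update: argmin (1/2)||A_i x - b_i||^2 + (rho/2)||x - z + u_i||^2
        x[i] = np.linalg.solve(A[i].T @ A[i] + rho * np.eye(d),
                               A[i].T @ b[i] + rho * (z - u[i]))
    z_new = np.mean(x + u, axis=0)                      # z-update (consensus average)
    u += x - z_new                                      # dual (scaled multiplier) update
    r = np.linalg.norm(x - z_new)                       # primal residual
    s = rho * np.sqrt(N) * np.linalg.norm(z_new - z)    # dual residual
    return x, z_new, u, r, s

def state_of(r, s):
    """Discretize the log ratio of primal to dual residual into 7 states."""
    ratio = np.log10((r + 1e-12) / (s + 1e-12))
    return int(np.clip(np.round(ratio), -3, 3)) + 3

ACTIONS = np.array([0.5, 1.0, 2.0])      # multiplicative updates applied to rho
Q = np.zeros((7, len(ACTIONS)))           # tabular stand-in for the deep Q-network
alpha, gamma, eps, tol, max_iter = 0.1, 0.99, 0.2, 1e-4, 200

for episode in range(300):
    x, z, u = np.zeros((N, d)), np.zeros(d), np.zeros((N, d))
    rho = 1.0
    x, z, u, r, s = admm_step(x, z, u, rho)
    state = state_of(r, s)
    for k in range(max_iter):
        # Epsilon-greedy action: adjust the penalty parameter, then run one ADMM round.
        a = rng.integers(len(ACTIONS)) if rng.random() < eps else int(np.argmax(Q[state]))
        rho = float(np.clip(rho * ACTIONS[a], 1e-3, 1e3))
        x, z, u, r, s = admm_step(x, z, u, rho)
        next_state = state_of(r, s)
        done = max(r, s) < tol
        reward = -1.0                     # every extra iteration is penalized
        target = reward + (0.0 if done else gamma * Q[next_state].max())
        Q[state, a] += alpha * (target - Q[state, a])
        state = next_state
        if done:
            break
```

After training, acting greedily with respect to Q at each iteration gives an adaptive penalty schedule; replacing the table with a small neural network and the toy problem with the component-wise ACOPF decomposition would bring the sketch closer to the setup the abstract describes.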
