Safe Off-Policy Deep Reinforcement Learning Algorithm for Volt-VAR Control in Power Distribution Systems

Volt-VAR control is critical to keeping distribution network voltages within the allowable range, minimizing losses, and reducing wear and tear on voltage-regulating devices. To cope with incomplete and inaccurate distribution network models, we propose a safe off-policy deep reinforcement learning algorithm that solves Volt-VAR control problems in a model-free manner. The Volt-VAR control problem is formulated as a constrained Markov decision process with a discrete action space and solved by our proposed constrained soft actor-critic algorithm. The algorithm achieves scalability, sample efficiency, and constraint satisfaction by combining the maximum-entropy framework, the method of multipliers, a device-decoupled neural network structure, and an ordinal encoding scheme. Comprehensive numerical studies on the IEEE distribution test feeders show that the proposed algorithm outperforms existing reinforcement learning algorithms and conventional optimization-based approaches on a large feeder.
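The abstract names two ingredients without giving details. As a rough illustration only, the sketch below shows one common form of ordinal (thermometer) encoding for a discrete device setting, and a simplified projected dual-ascent update for a constraint multiplier. The function names, the step size, and the exact update rule are assumptions made for illustration; the abstract does not specify the authors' encoding or multiplier update.

```python
import numpy as np


def ordinal_encode(level: int, num_levels: int) -> np.ndarray:
    """Thermometer-style ordinal encoding (an assumed form, not the paper's).

    Entries 0..level are 1 and the rest are 0, so neighbouring settings
    (e.g. adjacent tap positions) share most of their encoding, unlike
    one-hot vectors.
    """
    if not 0 <= level < num_levels:
        raise ValueError("level must lie in [0, num_levels)")
    return (np.arange(num_levels) <= level).astype(np.float32)


def update_multiplier(lam: float, avg_cost: float, cost_limit: float,
                      step_size: float = 0.01) -> float:
    """Simplified projected dual-ascent step (hypothetical update rule).

    The multiplier grows when the average constraint cost exceeds its
    limit and shrinks otherwise, and is clipped at zero.
    """
    return max(0.0, lam + step_size * (avg_cost - cost_limit))


if __name__ == "__main__":
    # Encode setting 2 out of 5 possible settings.
    print(ordinal_encode(2, 5))
    # Constraint violated (cost 2.0 > limit 1.0): the multiplier increases.
    print(update_multiplier(0.0, avg_cost=2.0, cost_limit=1.0, step_size=0.1))
```

In a constrained actor-critic setting, a multiplier of this kind would typically weight a constraint-cost term against the reward in the policy objective, so persistent voltage violations make the policy progressively more conservative.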
