Consensus Transfer $Q$-Learning for Decentralized Generation Command Dispatch Based on Virtual Generation Tribe

This paper develops a consensus transfer $Q$-learning (CTQ) algorithm for the decentralized generation command dispatch (GCD) of automatic generation control (AGC). A two-layer decentralized GCD framework based on virtual generation tribes (VGTs) is adopted to overcome the curse of dimensionality that arises in large-scale power systems. The leader of each VGT calculates its tribe's generation command by exchanging $Q$-value matrices with adjacent VGTs. In addition, behavior transfer is incorporated into CTQ to exploit prior knowledge from source tasks for a new optimization task according to their similarities, so that convergence is accelerated and the AGC control-period requirement is satisfied. Case studies on the China Southern Power Grid model are carried out to evaluate the performance of CTQ for decentralized GCD.
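As the abstract describes it, the method combines two ingredients: a consensus step in which each VGT leader exchanges $Q$-value matrices with its neighbours, and a behavior-transfer step that warm-starts a new task's $Q$-table from similar source tasks. The sketch below is a minimal illustration of that structure, assuming hypothetical names (`consensus_update`, `transfer_initialize`), a simple uniform-weight consensus rule, and a generic similarity weighting; the paper's exact update rule and similarity measure are not reproduced here.

```python
import numpy as np

def consensus_update(Q_local, Q_neighbors, epsilon=0.1):
    """One consensus step: pull the local Q-value matrix toward the average
    of the adjacent VGT leaders' Q-value matrices (illustrative uniform-weight
    rule; the paper's consensus weights may differ)."""
    if not Q_neighbors:
        return Q_local
    neighbor_mean = np.mean(np.stack(Q_neighbors), axis=0)
    return Q_local + epsilon * (neighbor_mean - Q_local)

def transfer_initialize(source_Qs, similarities):
    """Behavior-transfer initialization: warm-start the new task's Q-table as a
    similarity-weighted combination of source-task Q-tables (hypothetical
    weighting; the paper's similarity measure is not specified here)."""
    w = np.asarray(similarities, dtype=float)
    w = w / w.sum()
    return sum(wi * Qi for wi, Qi in zip(w, source_Qs))

def q_learning_step(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update applied locally between consensus exchanges."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    return Q
```

In a full implementation, each VGT leader would presumably alternate local $Q$-learning updates with consensus exchanges inside every AGC control period, using the transferred $Q$-table only as the starting point for the new task.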
