Paper information: Deep transfer Q-learning with virtual leader-follower for supply-demand Stackelberg game of smart grid


This paper proposes a novel deep transfer Q-learning (DTQ) algorithm, combined with a virtual leader-follower pattern, for the supply-demand Stackelberg game of a smart grid. Each generator is treated as a supplier agent and each load as a demander agent, so that economic dispatch (ED) and demand response (DR) can be solved simultaneously by DTQ. To maximize the total payoff of all agents, a virtual leader-follower pattern is employed to achieve reliable collaboration among them. Q-learning with a cooperative swarm is then adopted for each agent's knowledge learning, through appropriate exploration and exploitation in an unknown environment. Furthermore, the original, extremely large-scale knowledge matrix is efficiently decomposed into several small-scale knowledge matrices through a binary state-action chain, while continuous actions can still be generated for continuous variables. Lastly, a deep belief network (DBN) is used for knowledge transfer, so that DTQ can exploit prior knowledge from source tasks to rapidly obtain an optimal solution for a new task. Case studies evaluate the performance of DTQ for the supply-demand Stackelberg game on a 94-agent system and on the practical Shenzhen power grid in southern China.
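The abstract gives no equations, so the following is only a rough sketch of two ideas it names: the Q-learning knowledge update and the use of a binary chain to represent a continuous variable with small knowledge matrices. It is not the paper's implementation. The update rule is the textbook one-step Q-learning rule; the function names, the learning rate `alpha`, the discount `gamma`, and the layout of one 2x2 matrix per bit are all assumptions made for illustration.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Textbook one-step Q-learning update on a nested-list Q table.

    alpha and gamma are illustrative defaults, not values from the paper.
    """
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    row = q[state]
    return row.index(max(row))


def decode_bits(bits, low, high):
    """Map a chain of bits to a continuous value in [low, high].

    Assumed encoding: the bits form an unsigned integer that is scaled
    linearly onto the interval.
    """
    integer = int("".join(str(b) for b in bits), 2)
    return low + (high - low) * integer / (2 ** len(bits) - 1)


def table_sizes(n_bits):
    """Compare entry counts: one flat table versus a chain of small tables.

    Assumed layout: the flat table has one action per bit pattern, while
    the chain uses one 2x2 matrix per bit (state = previous bit,
    action = current bit).
    """
    flat = 2 ** n_bits
    chained = n_bits * 2 * 2
    return flat, chained


if __name__ == "__main__":
    rng = random.Random(0)
    q = [[0.0, 0.0], [0.0, 1.0]]
    action = epsilon_greedy(q, 1, 0.0, rng)
    print("greedy action in state 1:", action)
    print("updated Q(0, 1):", q_update(q, 0, 1, 1.0, 1))
    print("decoded output:", decode_bits([1, 0, 1, 1], 0.0, 100.0))
    print("flat vs chained entries for 16 bits:", table_sizes(16))
```

Under these assumptions, a 16-bit variable needs 65536 entries in a single flat table but only 64 entries across the chained 2x2 matrices, which illustrates why a decomposition of this kind keeps the knowledge matrices small.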
