A Multistage Game in Smart Grid Security: A Reinforcement Learning Solution

Existing smart grid security research investigates attack techniques and cascading failures mainly from the attackers’ viewpoint, while the protection strategies of defenders or operators are largely neglected. Game-theoretic methods have been applied to attacker–defender games in smart grid security, yet most existing works use only a one-shot game and do not consider the dynamic process of the electric power grid. In this paper, we propose a new reinforcement-learning-based solution for a multistage game (also called a dynamic game) between the attacker and the defender, which identifies the optimal attack sequences given certain objectives (e.g., transmission line outages or generation loss). Unlike in a one-shot game, the attacker here learns a sequence of attack actions applied to the transmission lines, while the defender protects a set of selected lines. After each time step, the cascading failure is measured, and the line outage (and/or generation loss) is used as feedback for the attacker to generate the next action. The performance is evaluated on the W&W 6-bus and IEEE 39-bus systems. A multistage attack is compared with a one-shot attack to show the significance of the multistage attack. Furthermore, different protection strategies are evaluated in simulation. The results show that the proposed reinforcement learning solution can identify optimal attack sequences under several attack objectives, and indicate that the attacker’s learned information helps the defender enhance the security of the system.
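
The attack loop described in the abstract (choose a line, observe the cascade, use the outage count as feedback) can be illustrated with a minimal tabular Q-learning sketch. Everything here is an assumption made for illustration and is not the paper's model: the six-line `LOAD`/`CAPACITY` values, the equal-redistribution `cascade` rule (a stand-in for a real power-flow cascading-failure simulation), the reward (number of new line outages per stage), the hyperparameters, and all function names.

```python
import random
from collections import defaultdict

# Hypothetical toy grid: per-line load and capacity (illustrative numbers only).
LOAD = [30.0, 40.0, 20.0, 50.0, 25.0, 35.0]
CAPACITY = [45.0, 55.0, 40.0, 70.0, 40.0, 50.0]


def cascade(out):
    """Toy cascade: spread the load of failed lines evenly over the live
    lines and trip any overloaded line, repeating until nothing new trips."""
    out = set(out)
    while True:
        live = [i for i in range(len(LOAD)) if i not in out]
        if not live:
            return frozenset(out)
        extra = sum(LOAD[i] for i in out) / len(live)
        tripped = {i for i in live if LOAD[i] + extra > CAPACITY[i]}
        if not tripped:
            return frozenset(out)
        out |= tripped


def step(state, action, protected):
    """Apply one attack. Returns (next_state, reward), where the reward is
    the number of new outages. Attacks on protected or dead lines do nothing."""
    if action in protected or action in state:
        return state, 0.0
    nxt = cascade(state | {action})
    return nxt, float(len(nxt) - len(state))


def train_attacker(protected, stages=3, episodes=2000,
                   alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning over (set of failed lines, line to attack)."""
    rng = random.Random(seed)
    q = defaultdict(float)
    actions = range(len(LOAD))
    for _ in range(episodes):
        state = frozenset()
        for _ in range(stages):
            if rng.random() < eps:
                a = rng.randrange(len(LOAD))  # explore
            else:
                a = max(actions, key=lambda x: q[(state, x)])  # exploit
            nxt, r = step(state, a, protected)
            best_next = max(q[(nxt, x)] for x in actions)
            q[(state, a)] += alpha * (r + gamma * best_next - q[(state, a)])
            state = nxt
    return q


def greedy_sequence(q, protected, stages=3):
    """Roll out the learned greedy policy; return the attack sequence and
    the total number of lines out of service at the end."""
    state, seq = frozenset(), []
    for _ in range(stages):
        a = max(range(len(LOAD)), key=lambda x: q[(state, x)])
        seq.append(a)
        state, _ = step(state, a, protected)
    return seq, len(state)


if __name__ == "__main__":
    for protected in (set(), {1, 3}):
        q = train_attacker(protected)
        print(sorted(protected), greedy_sequence(q, protected))
```

In this toy grid no single line outage triggers a cascade, whereas some two-line sequences (for example lines 1 and 3) take out every line, which mirrors the one-shot versus multistage contrast the abstract draws. Passing a different `protected` set to `train_attacker` gives a rough way to compare protection strategies against the learned attacker.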
