Distributed Reinforcement Learning for Cyber-Physical System With Multiple Remote State Estimation Under DoS Attacker

In this paper, we consider a cyber-physical system (CPS) with multiple remote state estimators under a denial-of-service (DoS) attack over an infinite time horizon. The sensors monitor the system and send their local state estimates to the remote estimators, each choosing whether to transmit over its local channel in “State 0” or “State 1”. The sensors aim to find channel-selection policies that minimize the total estimation error covariance while accounting for energy consumption over the infinite horizon. The DoS attacker pursues the opposite goal by choosing which channels to attack, if any. We investigate the games between the sensors and the DoS attacker under two public-information structures: the open-loop case, in which the sensors and the attacker cannot observe each other's behavior, and the closed-loop case, in which they can observe each other's behavior causally. For the open-loop case, under the assumption that the DoS attacker can obtain the information sent from the remote estimators to the sensors, we propose distributed reinforcement learning algorithms, based on local information, for the sensors and the attacker to find their respective Nash equilibrium policies. For the closed-loop case, we further consider the setting in which the DoS attacker cannot obtain the information sent from the remote estimators to the sensors, which leads to asymmetric information between the sensors and the attacker. To derive Nash equilibrium policies for the sensors and the attacker, we convert the original game into a belief-based continuous-state stochastic game. We prove the convergence of the distributed reinforcement learning method and present simulations that demonstrate its effectiveness.
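To make the game structure described above more concrete, the following is a minimal sketch of a single sensor and a single attacker that each learn from their own reward only. It uses plain independent tabular Q-learning, which is a stand-in and not the algorithm proposed in the paper. Every number and name in it is a hypothetical assumption introduced for illustration: the holding-time state `tau`, the error values `ERR`, the costs `ENERGY` and `ATTACK_COST`, and the arrival probabilities `ARRIVE`.

```python
import random

# All constants below are hypothetical illustration values, not taken from the paper.
T_MAX = 5                                          # cap on steps since the last received packet
ERR = [1.5 ** t for t in range(T_MAX + 1)]         # toy estimation error, growing with holding time
ENERGY = [0.2, 1.0]                                # sensor energy cost for channel "State 0" / "State 1"
ATTACK_COST = [0.0, 0.8]                           # attacker cost for idle / attack
ARRIVE = [[0.6, 0.1], [0.95, 0.6]]                 # ARRIVE[a][b]: packet arrival probability


def step(tau, a, b, rng):
    """Advance the toy model one step for sensor action a and attacker action b."""
    arrived = rng.random() < ARRIVE[a][b]
    nxt = 0 if arrived else min(tau + 1, T_MAX)
    sensor_reward = -(ERR[nxt] + ENERGY[a])        # sensor wants low error and low energy use
    attacker_reward = ERR[nxt] - ATTACK_COST[b]    # attacker wants high error at low cost
    return nxt, sensor_reward, attacker_reward


def eps_greedy(q_row, eps, rng):
    """Pick a random action with probability eps, otherwise the greedy one."""
    if rng.random() < eps:
        return rng.randrange(2)
    return max(range(2), key=lambda i: q_row[i])


def train(steps=20000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Independent Q-learning: each player updates its own table from its own reward."""
    rng = random.Random(seed)
    q_s = [[0.0, 0.0] for _ in range(T_MAX + 1)]   # sensor Q-table, indexed [tau][a]
    q_a = [[0.0, 0.0] for _ in range(T_MAX + 1)]   # attacker Q-table, indexed [tau][b]
    tau = 0
    for _ in range(steps):
        a = eps_greedy(q_s[tau], eps, rng)
        b = eps_greedy(q_a[tau], eps, rng)
        nxt, r_s, r_a = step(tau, a, b, rng)
        q_s[tau][a] += alpha * (r_s + gamma * max(q_s[nxt]) - q_s[tau][a])
        q_a[tau][b] += alpha * (r_a + gamma * max(q_a[nxt]) - q_a[tau][b])
        tau = nxt
    return q_s, q_a


if __name__ == "__main__":
    q_sensor, q_attacker = train()
    for tau in range(T_MAX + 1):
        print(tau, q_sensor[tau], q_attacker[tau])
```

Independent learners of this kind carry no general guarantee of converging to a Nash equilibrium, so the sketch only illustrates the opposing objectives and the local-information updates. It does not model the belief-based continuous-state game used for the closed-loop case.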
