Faster Learning and Adaptation in Security Games by Exploiting Information Asymmetry

With the advancement of modern technologies, the security battle between a legitimate system (LS) and an adversary is becoming increasingly sophisticated, involving complex interactions in unknown dynamic environments. Stochastic games (SGs), together with multi-agent reinforcement learning (MARL), offer a systematic framework for studying information warfare in current and emerging cyber-physical systems. In practical security games, each player usually has only incomplete information about the opponent, which induces information asymmetry. This paper approaches information asymmetry from a new angle: how a player can use information unknown to the opponent to its own advantage. Two new MARL algorithms, termed minimax post-decision state (minimax-PDS) and Win-or-Learn Fast post-decision state (WoLF-PDS), are proposed, which enable the LS to learn and adapt faster in dynamic environments by exploiting its information advantage. Minimax-PDS is provably convergent, and WoLF-PDS is provably rational. Numerical results for three important applications demonstrate the effectiveness of both algorithms.
