Improving learning and adaptation in security games by exploiting information asymmetry

As modern technologies advance, the security battle between a legitimate system (LS) and an adversary is becoming increasingly sophisticated, involving complex interactions in unknown, dynamic environments. Stochastic games (SGs), together with multi-agent reinforcement learning (MARL), offer a systematic framework for studying information warfare in current and emerging cyber-physical systems. In practical security games, each player usually has only incomplete information about its opponent, which induces information asymmetry. This work approaches information asymmetry from a new angle: how a player can use local information unknown to the opponent to its own advantage. Two new MARL algorithms, termed minimax-PDS and WoLF-PDS, are proposed; they enable the LS to learn and adapt faster in dynamic environments by exploiting its private local information. The minimax-PDS algorithm is provably convergent, and the WoLF-PDS algorithm is provably rational. Numerical results from two concrete anti-jamming examples demonstrate their effectiveness.
