Joint Optimization of Jamming Link and Power Control in Communication Countermeasures: A Multiagent Deep Reinforcement Learning Approach

Jamming link selection and jamming power allocation are nonconvex control problems, which makes the optimal resource allocation strategy in communication countermeasure scenarios difficult to obtain. We therefore propose a novel decentralized jamming resource allocation algorithm based on multiagent deep reinforcement learning (MADRL) to improve the efficiency of jamming resource allocation in battlefield communication countermeasures. We first model the communication jamming resource allocation problem as a fully cooperative multiagent task that accounts for the cooperative relationships among the jamming equipment (JE). Then, to alleviate the nonstationarity and high decision dimensionality of the multiagent system, we introduce a centralized training with decentralized execution (CTDE) framework: all JEs are trained with global information but rely only on their local observations when making decisions, so each JE obtains a decentralized policy once training is complete. Subsequently, we develop the multiagent soft actor-critic (MASAC) algorithm, which uses the maximum policy entropy criterion to enhance the exploration capability of the agents and to accelerate the learning of cooperative policies among them. Finally, simulation results demonstrate that the proposed MASAC algorithm outperforms existing centralized allocation benchmark algorithms.
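The abstract does not give the form of the maximum policy entropy criterion, so the following is only a minimal sketch of the standard soft actor-critic idea for a single agent with discrete actions, not the paper's MASAC implementation. The Q-values, the temperature `alpha`, and all function names are hypothetical and chosen for illustration. The sketch shows how a larger temperature spreads probability across candidate jamming links (more exploration), while a small temperature approaches a greedy choice.

```python
import math

def softmax_policy(q_values, alpha):
    """Entropy-regularized policy for discrete actions: pi(a) proportional to exp(Q(a)/alpha)."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / alpha) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]

def entropy(policy):
    """Shannon entropy of a discrete policy, in nats."""
    return -sum(p * math.log(p) for p in policy if p > 0.0)

def soft_value(q_values, policy, alpha):
    """Soft state value: sum_a pi(a) * (Q(a) - alpha * log pi(a))."""
    return sum(p * (q - alpha * math.log(p))
               for q, p in zip(q_values, policy) if p > 0.0)

def log_sum_exp_value(q_values, alpha):
    """Closed form of the soft value under the softmax policy."""
    m = max(q_values)
    return m + alpha * math.log(sum(math.exp((q - m) / alpha) for q in q_values))

# Hypothetical Q-values of one JE for three candidate jamming links (assumption).
q = [1.0, 0.8, 0.2]

greedy_like = softmax_policy(q, alpha=0.05)  # low temperature: nearly greedy
exploratory = softmax_policy(q, alpha=1.0)   # high temperature: more exploration

print("alpha=0.05:", [round(p, 3) for p in greedy_like], "entropy", round(entropy(greedy_like), 3))
print("alpha=1.0: ", [round(p, 3) for p in exploratory], "entropy", round(entropy(exploratory), 3))
```

Under the softmax policy, the soft value coincides with `alpha * log(sum(exp(Q / alpha)))`, which is why the entropy term can be read as a bonus for keeping several links in play rather than committing early to one.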
