Dynamic Spectrum Anti-Jamming With Reinforcement Learning Based on Value Function Approximation

This letter addresses the spectrum anti-jamming problem for uplink transmissions from multiple Internet of Things (IoT) devices, where policies for configuring frequency-domain channels must be learned without knowledge of the time-frequency distribution of the interference. This decision-making problem lends itself to reinforcement learning (RL) approaches. However, state-of-the-art RL-based spectrum anti-jamming methods may not be applicable to IoT systems, may suffer from high computational complexity, or may converge to a policy that is not the best for each user. We therefore propose a novel spectrum anti-jamming scheme in which the configuration policies of the IoT devices are optimized sequentially using multi-agent RL based on value function approximation. Simulation results show that the proposed algorithm outperforms various baselines in terms of average normalized throughput.
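To make the idea of learning a channel policy with an approximated value function concrete, the sketch below trains a single device against a toy jammer. It is an illustration under stated assumptions, not the letter's algorithm: the sweeping jammer, the single-agent setting, the circular-offset feature, the hyperparameters, and all names (`jammer_channel`, `q_value`, `train`, `greedy_throughput`) are hypothetical, and the fraction of unjammed slots is used as a simple stand-in for normalized throughput.

```python
import random

N_CHANNELS = 5  # assumed number of frequency-domain channels


def jammer_channel(t):
    """Hypothetical sweeping jammer: hits one channel per slot, in order."""
    return t % N_CHANNELS


def q_value(w, last_jammed, action):
    """Linear Q estimate with a one-hot feature on the circular offset
    between the chosen channel and the last jammed channel.
    This shares N_CHANNELS weights across all N_CHANNELS**2 state-action pairs."""
    return w[(action - last_jammed) % N_CHANNELS]


def greedy_action(w, last_jammed):
    return max(range(N_CHANNELS), key=lambda a: q_value(w, last_jammed, a))


def train(slots=5000, alpha=0.1, gamma=0.5, epsilon=0.1, seed=0):
    """Semi-gradient Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    w = [0.0] * N_CHANNELS
    last_jammed = jammer_channel(0)
    for t in range(1, slots + 1):
        if rng.random() < epsilon:
            action = rng.randrange(N_CHANNELS)
        else:
            action = greedy_action(w, last_jammed)
        jammed_now = jammer_channel(t)
        reward = 0.0 if action == jammed_now else 1.0
        next_best = max(q_value(w, jammed_now, a) for a in range(N_CHANNELS))
        td_error = reward + gamma * next_best - q_value(w, last_jammed, action)
        # The gradient of a one-hot linear model is 1 on the active weight only.
        w[(action - last_jammed) % N_CHANNELS] += alpha * td_error
        last_jammed = jammed_now
    return w


def greedy_throughput(w, slots=1000):
    """Fraction of slots in which the greedy policy avoids the jammer."""
    ok = 0
    last_jammed = jammer_channel(0)
    for t in range(1, slots + 1):
        action = greedy_action(w, last_jammed)
        jammed_now = jammer_channel(t)
        ok += action != jammed_now
        last_jammed = jammed_now
    return ok / slots


if __name__ == "__main__":
    weights = train()
    print("weights per offset:", [round(x, 2) for x in weights])
    print("greedy throughput:", greedy_throughput(weights))
```

The offset feature is what makes this an approximation rather than a lookup table: one weight per offset generalizes across every value of the last jammed channel. In the multi-device setting the letter describes, a comparable update could be run for one device at a time while the others' policies are held fixed, but how the authors actually do this is not specified in the abstract.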
