Multi-Agent Deep Reinforcement Learning-Based Cooperative Spectrum Sensing With Upper Confidence Bound Exploration

In this paper, a multi-agent deep reinforcement learning method is adopted to realize cooperative spectrum sensing in cognitive radio networks. Each secondary user learns an efficient sensing strategy from the sensing results of a subset of selected spectra, so as to avoid interference to the primary users and to coordinate with other secondary users. Deep reinforcement learning methods must balance exploration and exploitation during learning, so an upper confidence bound with a Hoeffding-style bonus is adopted in this paper to improve the efficiency of exploration. The simulation results verify that the proposed algorithm, when compared with conventional reinforcement learning methods using $\varepsilon$-greedy exploration, converges faster and achieves better reward performance.
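To illustrate the exploration rule named above, the following is a minimal tabular sketch of upper confidence bound action selection with a Hoeffding-style bonus, alongside $\varepsilon$-greedy selection for contrast. It is not the paper's implementation: the bonus form $c\sqrt{\ln t / n}$, the constant `c`, the per-action visit counts, and the function names `ucb_select` and `epsilon_greedy_select` are all assumptions made for illustration, and a deep variant would replace the value list with network outputs.

```python
import math
import random


def ucb_select(q_values, counts, t, c=1.0):
    """Return the action maximizing Q(a) + c * sqrt(ln(t) / n(a)).

    q_values: estimated value of each action (e.g. each candidate channel).
    counts:   number of times each action has been tried so far.
    t:        total number of selections made so far (t >= 1).
    c:        exploration constant (hypothetical default, not from the paper).
    """
    best_action, best_score = 0, -math.inf
    for action, (q, n) in enumerate(zip(q_values, counts)):
        if n == 0:
            # An untried action has an unbounded bonus, so try it first.
            return action
        bonus = c * math.sqrt(math.log(t) / n)  # assumed Hoeffding-style term
        score = q + bonus
        if score > best_score:
            best_action, best_score = action, score
    return best_action


def epsilon_greedy_select(q_values, epsilon, rng=random):
    """Baseline: random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    # Action 1 has a lower estimate but has been tried only once, so its
    # bonus is large and UCB prefers it over the well-explored action 0.
    print(ucb_select([0.5, 0.4], [100, 1], t=101))
```

Unlike $\varepsilon$-greedy, which explores uniformly at random, the bonus directs exploration toward actions whose estimates are still uncertain and shrinks as an action's visit count grows.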
