Multiagent Reinforcement Learning Based Spectrum Sensing Policies for Cognitive Radio Networks

This paper proposes distributed multiuser multiband spectrum sensing policies for cognitive radio networks based on multiagent reinforcement learning. The spectrum sensing problem is formulated as a partially observable stochastic game, and multiagent reinforcement learning is employed to find a solution. In the proposed reinforcement learning based sensing policies, the secondary users (SUs) collaborate to improve the sensing reliability and to distribute the sensing tasks among the network nodes. The SU collaboration is carried out through local interactions in which the SUs share with their neighbors their local test statistics or decisions, as well as information on the frequency bands they have sensed. As a result, a map of spectrum occupancy in a local neighborhood is created. The goal of the proposed sensing policies is to maximize the amount of free spectrum found subject to a constraint on the probability of missed detection. This is addressed by balancing the amount of spectrum sensed against the reliability of the sensing results. Simulation results show that the proposed sensing policies provide an efficient way to find available spectrum in multiuser multiband cognitive radio scenarios.
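
As a rough illustration of the learning idea only, the following Python sketch shows a single SU learning which frequency band is most often free by epsilon-greedy action selection with sample-average value updates. It is a simplified stand-in and not the policy proposed in the paper: it omits the stochastic game formulation, the collaboration between neighboring SUs, and the missed detection constraint. The function name `learn_band_values`, the band occupancy probabilities, and all parameter values are hypothetical and chosen purely for illustration.

```python
import random


def learn_band_values(free_prob, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy value learning over frequency bands for one SU.

    free_prob[b] is an assumed (hypothetical) probability that band b is
    free when sensed; a real SU would not know these values.
    """
    rng = random.Random(seed)
    n_bands = len(free_prob)
    q = [0.0] * n_bands      # estimated chance that each band is found free
    counts = [0] * n_bands   # number of times each band has been sensed
    for _ in range(steps):
        if rng.random() < epsilon:
            band = rng.randrange(n_bands)                  # explore a random band
        else:
            band = max(range(n_bands), key=q.__getitem__)  # exploit best estimate
        # Assumed reward: 1 if the sensed band turns out to be free, else 0.
        reward = 1.0 if rng.random() < free_prob[band] else 0.0
        counts[band] += 1
        q[band] += (reward - q[band]) / counts[band]       # sample-average update
    return q


q = learn_band_values([0.2, 0.5, 0.9])
best_band = max(range(len(q)), key=q.__getitem__)
print(best_band, [round(v, 2) for v in q])
```

The exploration rate `epsilon` governs how often the SU senses a band other than its current best estimate. This is a much simpler trade-off than the balance between sensing more spectrum and sensing reliability described above, but it shows how value estimates can steer sensing toward bands that are likely to be free.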
