Replicated Q-learning based sub-band selection for wideband spectrum sensing in cognitive radios

Spectrum sensing is a key function of any wideband cognitive radio (CR), used to detect spectral activity. However, due to hardware constraints, the instantaneous sensing bandwidth is limited to a single sub-band out of all the sub-bands in the spectrum of interest. Sub-band selection is therefore an important step in wideband spectrum sensing. In this paper, we develop a partially observable Markov decision process (POMDP) to model the sub-band dynamics and propose an efficient sub-band selection policy based on replicated Q-learning. Simulations show that the proposed selection policy has reasonably low computational complexity and significantly outperforms random sub-band selection.
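To make the idea concrete, the following is a minimal sketch of replicated Q-learning applied to a toy sub-band selection problem. It is not the paper's implementation. Everything specific here is an assumption made for illustration: two sub-bands, each modeled as an independent two-state (idle/busy) Markov chain with made-up transition probabilities, error-free sensing of the selected sub-band, a reward of 1 when the sensed sub-band is idle, and epsilon-greedy exploration. The only part taken from the general technique is the update rule: with belief b over states, Q_a(b) = sum_s b(s) q_a(s), and after taking action a every entry q_a(s) is moved toward the same temporal-difference target in proportion to b(s).

```python
import numpy as np

rng = np.random.default_rng(0)

M = 2            # number of sub-bands (toy size, assumed)
S = 2 ** M       # joint occupancy states; bit i of s is sub-band i (0 idle, 1 busy)

# Hypothetical per-sub-band transition matrices (rows: current state).
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.5, 0.5], [0.5, 0.5]])]


def bit(s, i):
    """Occupancy of sub-band i in joint state s."""
    return (int(s) >> i) & 1


# Joint transition matrix under the independence assumption.
T = np.array([[np.prod([P[i][bit(s, i), bit(s2, i)] for i in range(M)])
               for s2 in range(S)] for s in range(S)])


def replicated_q_update(q, belief, action, reward, next_belief,
                        alpha=0.1, gamma=0.9):
    """One replicated Q-learning step.

    q has shape (actions, states). Each q[action, s] moves toward the
    shared TD target, weighted by the belief b(s).
    """
    target = reward + gamma * np.max(q @ next_belief)
    q = q.copy()
    q[action] += alpha * belief * (target - q[action])
    return q


def run(steps=2000, alpha=0.1, gamma=0.9, eps=0.1):
    """Simulate the toy problem and return the learned q table."""
    q = np.zeros((M, S))
    belief = np.full(S, 1.0 / S)      # prior over the current joint state
    state = rng.integers(S)
    for _ in range(steps):
        # Epsilon-greedy choice of which sub-band to sense.
        if rng.random() < eps:
            a = int(rng.integers(M))
        else:
            a = int(np.argmax(q @ belief))
        obs = bit(state, a)                    # error-free sensing (assumed)
        reward = 1.0 if obs == 0 else 0.0      # assumed reward: idle sub-band
        # Bayes step: keep states consistent with the observation,
        # then predict one step ahead through the Markov dynamics.
        mask = np.array([bit(s, a) == obs for s in range(S)], dtype=float)
        post = belief * mask
        post /= post.sum()
        next_belief = post @ T
        q = replicated_q_update(q, belief, a, reward, next_belief, alpha, gamma)
        state = rng.choice(S, p=T[state])
        belief = next_belief
    return q
```

Calling `run()` returns a table with one row per sub-band and one column per joint state; the sub-band to sense under a given belief is `argmax(q @ belief)`. The joint state space grows as 2^M, so this direct formulation is only practical for a small number of sub-bands.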
