Multi-channel opportunistic spectrum access in unslotted primary systems with unknown models

Multi-channel opportunistic spectrum access in unslotted primary systems is considered. The primary occupancy of each channel is modeled as a general on-off renewal process. The distributions of the busy and idle times and the utilization factors of all channels are unknown to the secondary user. The objective of the secondary user is to identify and exploit the best channel (i.e., the channel with the least primary traffic) through efficient online learning. A dynamic channel access policy is constructed that achieves the throughput offered by the best channel under certain mild conditions on the busy/idle time distributions. More specifically, the cost associated with learning the unknown channel occupancy models over a horizon of length T diminishes at the rate of (log T)/T. The policy is obtained by constructing a hypothetical multi-armed bandit with a virtual reward that, while not directly reflecting throughput, preserves the ranking of the channels in terms of throughput.
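The bandit view described in the abstract can be illustrated with a minimal sketch. It is not the paper's policy: it uses the standard UCB1 index rule, and it assumes (purely for illustration) that each sensing decision returns a Bernoulli virtual reward equal to 1 with probability given by the channel's long-run idle fraction, independently across decisions. The names `simulate_ucb1`, `expected_regret`, and `idle_fractions` are hypothetical.

```python
import math
import random


def simulate_ucb1(idle_fractions, horizon, seed=0):
    """Run UCB1 over channels with Bernoulli virtual rewards.

    Assumption (not from the source): sensing channel k yields reward 1
    with probability idle_fractions[k], independently across decisions.
    Returns the number of times each channel was selected.
    """
    rng = random.Random(seed)
    n = len(idle_fractions)
    counts = [0] * n    # times each channel was sensed
    means = [0.0] * n   # running mean of the virtual reward per channel

    for t in range(1, horizon + 1):
        if t <= n:
            k = t - 1  # sense every channel once to initialize
        else:
            # UCB1 index: sample mean plus an exploration bonus
            k = max(
                range(n),
                key=lambda i: means[i] + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < idle_fractions[k] else 0.0
        counts[k] += 1
        means[k] += (reward - means[k]) / counts[k]  # incremental mean update
    return counts


def expected_regret(idle_fractions, counts):
    """Expected virtual reward lost by not always sensing the best channel."""
    best = max(idle_fractions)
    return sum((best - p) * c for p, c in zip(idle_fractions, counts))


if __name__ == "__main__":
    fractions = [0.3, 0.5, 0.8]  # hypothetical idle fractions
    for horizon in (1000, 10000):
        counts = simulate_ucb1(fractions, horizon)
        print(horizon, counts, expected_regret(fractions, counts) / horizon)
```

Under these assumptions the per-step regret printed in the last column should shrink as the horizon grows, which is the qualitative behavior the (log T)/T rate describes. The renewal structure of the busy and idle periods, and the paper's actual virtual reward, are not modeled here.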
