Channel Exploration and Exploitation with Imperfect Spectrum Sensing in Cognitive Radio Networks

This paper investigates opportunistic channel sensing and access in cognitive radio networks when sensing is imperfect and a secondary user can access only a limited number of channels at a time. The primary users' statistical information is assumed to be unknown, so a secondary user must learn it online during the channel sensing and access process; a learning loss, also referred to as regret, is therefore inevitable. The busy/idle state of each channel is initially assumed to be independent from one slot to another. The case in which all potential channels can be sensed simultaneously is investigated first. The channel access process is modeled as a multi-armed bandit problem with side observation, and channel access rules are derived and theoretically proved to have asymptotically finite regret. The case in which the secondary user can sense only a limited number of channels at a time is investigated next. Here the channel sensing and access process is modeled as a bi-level multi-armed bandit problem, and it is shown that any adaptive rule has at least logarithmic regret. We then derive channel sensing and access rules and theoretically prove that their regret is logarithmic both asymptotically and in finite time. Finally, the case in which the busy/idle states of a channel are correlated over slots is investigated, and a channel sensing and access rule with logarithmic regret is derived. The effectiveness of the derived rules is validated by simulation.
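To make the bandit formulation concrete, the following is a minimal sketch of a generic UCB1 index policy (in the sense of Auer et al.) applied to channel selection. It is not one of the rules derived in this paper. It assumes perfect sensing, one sensed and accessed channel per slot, and i.i.d. Bernoulli idle states; the function name `simulate_ucb1` and the idle probabilities used below are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_ucb1(idle_probs, num_slots, seed=0):
    """Run UCB1 over i.i.d. Bernoulli channels.

    Assumptions (not from the paper): perfect sensing, one channel
    sensed and accessed per slot, reward 1 if the channel is idle.
    Returns (cumulative expected regret, per-channel sense counts).
    """
    rng = random.Random(seed)
    num_channels = len(idle_probs)
    counts = [0] * num_channels   # times each channel was sensed
    means = [0.0] * num_channels  # empirical idle frequency per channel
    best = max(idle_probs)
    regret = 0.0

    for t in range(1, num_slots + 1):
        if t <= num_channels:
            # Initialization: sense every channel once.
            k = t - 1
        else:
            # Pick the channel with the largest upper confidence index.
            k = max(
                range(num_channels),
                key=lambda i: means[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        # Channel k is idle in this slot with probability idle_probs[k].
        reward = 1.0 if rng.random() < idle_probs[k] else 0.0
        counts[k] += 1
        means[k] += (reward - means[k]) / counts[k]
        # Expected regret relative to always using the best channel.
        regret += best - idle_probs[k]

    return regret, counts


if __name__ == "__main__":
    total_regret, sense_counts = simulate_ucb1([0.2, 0.5, 0.9], 5000)
    print("cumulative regret:", round(total_regret, 1))
    print("sense counts:", sense_counts)
```

In this sketch the regret accumulates only in slots where a suboptimal channel is chosen, which is the learning loss the abstract refers to. Handling imperfect sensing, multiple simultaneous channels, or correlated states would require the rules developed in the paper rather than this generic index.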
