Paper information - Reduction of collisions and regret in time sharing schemes for opportunistic spectrum access

Reduction of collisions and regret in time sharing schemes for opportunistic spectrum access

We examine decentralized learning and access algorithms for opportunistic spectrum access with multiple users. Several distributed algorithms have been proposed for this problem, mainly as applications of corresponding algorithms for the multiarmed bandit problem, and they are provably order optimal in terms of regret. However, none of them pays particular attention to reducing the collisions among users that arise because users do not exchange messages. The effect of these collisions becomes more pronounced as the number of users grows, adding a considerable amount of regret even though the optimal order is retained. Focusing on time division fair sharing schemes based on the idea of orthogonal offsets, we propose a simple algorithm that detects offset collisions and tries to resolve them as quickly as possible, inspired by persistent distributed schemes for multiple access. We demonstrate the improved performance achieved by our algorithm by means of simulations.
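The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of orthogonal offsets with persistent collision resolution, not the authors' method. It assumes a slotted system in which a user learns after each round whether its offset collided, that a collided user redraws its offset uniformly at random while a collision-free user keeps its own, and that a user with offset `o` targets the channel of rank `(t + o) mod M` at time `t`. The names `resolve_offsets` and `channel_rank` are hypothetical.

```python
import random
from collections import Counter


def channel_rank(t, offset, num_offsets):
    """Rank of the channel a user targets at time t (assumed round-robin rule)."""
    return (t + offset) % num_offsets


def resolve_offsets(num_users, num_offsets=None, seed=0, max_rounds=10_000):
    """Simulate offset selection until every user holds a distinct offset.

    Returns (offsets, rounds), where rounds counts the redraw rounds needed.
    Assumption: each user only observes whether its own offset collided.
    """
    rng = random.Random(seed)
    if num_offsets is None:
        num_offsets = num_users
    if num_offsets < num_users:
        raise ValueError("need at least as many offsets as users")

    # Every user starts from an independent uniform draw.
    offsets = [rng.randrange(num_offsets) for _ in range(num_users)]

    for rounds in range(max_rounds):
        counts = Counter(offsets)
        collided = [u for u, o in enumerate(offsets) if counts[o] > 1]
        if not collided:
            return offsets, rounds
        # Persistence: collision-free users keep their offset,
        # collided users redraw uniformly at random.
        for u in collided:
            offsets[u] = rng.randrange(num_offsets)

    raise RuntimeError("offsets did not become orthogonal within max_rounds")


if __name__ == "__main__":
    offsets, rounds = resolve_offsets(num_users=5, seed=1)
    print("offsets:", offsets, "rounds:", rounds)
    print("ranks at t=7:", [channel_rank(7, o, 5) for o in offsets])
```

Once the offsets are distinct, the targeted ranks are distinct in every slot, so users rotate through the channels without colliding. The sketch leaves out channel learning and regret entirely.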
