Application of multi-armed bandit algorithms for channel sensing in cognitive radio

We apply a multi-armed bandit algorithm to channel sensing in a cognitive radio system. To find a channel with a low utilization rate, we formulate several problems in cognitive wireless systems as multi-armed bandit problems, in which the goal is to identify, with as little sensing time as possible, the arm that yields rewards at the highest rate. Several effective algorithms exist for solving multi-armed bandit problems. In this paper, we introduce the ε-greedy algorithm and apply it to selecting the channel with the lowest utilization ratio. We also extend the proposed scheme to a frequency-hopping system, since more frequent use of vacant channels corresponds to a larger accumulated reward in the multi-armed bandit formulation. To evaluate the performance of the proposed scheme, we use real measured data from the 2.4 GHz band. The results show that the proposed scheme improves the performance of the communication system by efficiently sensing the channels with lower utilization rates.
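As a rough illustration of the ε-greedy idea described above, the following minimal Python sketch treats each channel as an arm and gives a reward of 1 whenever the sensed channel is found vacant. It is not the implementation evaluated in the paper: the function name `epsilon_greedy_sense`, the per-channel busy probabilities, the value `epsilon=0.1`, and the use of simulated Bernoulli occupancy in place of measured 2.4 GHz data are all assumptions made for this example.

```python
import random


def epsilon_greedy_sense(busy_prob, steps=5000, epsilon=0.1, seed=0):
    """Sense one channel per step with an epsilon-greedy policy.

    busy_prob[i] is an assumed (hypothetical) probability that channel i
    is occupied when sensed; reward is 1 for a vacant channel, else 0.
    """
    rng = random.Random(seed)
    n = len(busy_prob)
    counts = [0] * n          # how often each channel was sensed
    vacant_rate = [0.0] * n   # running estimate of P(vacant) per channel
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: sense a channel chosen uniformly at random.
            ch = rng.randrange(n)
        else:
            # Exploit: sense the channel with the highest estimated vacancy.
            ch = max(range(n), key=lambda i: vacant_rate[i])

        # Simulated sensing outcome (stands in for real measurements).
        reward = 0 if rng.random() < busy_prob[ch] else 1

        # Incremental update of the sample mean for the sensed channel.
        counts[ch] += 1
        vacant_rate[ch] += (reward - vacant_rate[ch]) / counts[ch]
        total_reward += reward

    return counts, vacant_rate, total_reward


if __name__ == "__main__":
    counts, vacant_rate, total = epsilon_greedy_sense([0.8, 0.5, 0.1])
    print("times sensed per channel:", counts)
    print("estimated vacancy rates:", [round(v, 2) for v in vacant_rate])
    print("vacant observations:", total)
```

In this sketch the least-utilized channel ends up being sensed most often, while the ε fraction of random choices keeps the estimates for the other channels from going stale. In a frequency-hopping setting, the same estimates could be used to bias the hop set toward channels with high estimated vacancy, although how the paper does this is not specified in the abstract.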
