Online Sequential Channel Accessing Control: A Double Exploration vs. Exploitation Problem

In opportunistic channel access, the user must decide in real time, and under uncertainty, when to access and which channel to use. Assuming perfect knowledge of channel statistics, several studies have applied optimal stopping theory to derive control strategies for sequential sensing/probing-based opportunistic access (s-SPA), which exploits temporary opportunities among multiple channels. Meanwhile, numerous multi-armed bandit (MAB)-based approaches have been proposed for online learning of channel selection in periodic sensing/accessing systems; however, these schemes fail to exploit short-term opportunistic diversity. In this paper, we investigate online learning of the optimal control in s-SPA systems, where statistics learning and temporary opportunity utilization are considered jointly. We propose an effective and efficient online policy, called IE-OSP, which is theoretically guaranteed to converge to the optimal s-SPA strategy with bounded probability. Experimental results further show that the regret of IE-OSP grows at a nearly optimal logarithmic rate over time and sub-linearly in the number of channels. Compared with existing solutions, the proposed algorithm achieves a 25% to 30% throughput gain in typical scenarios.
