Distributed Algorithms for Efficient Learning and Coordination in Ad Hoc Networks

We consider a distributed sampling strategy for N agents that minimizes the sample complexity and regret of identifying the best subset of size $N$ among $K \geq N$ channels in a cognitive radio access setup. Agents cannot communicate with each other directly, and no central coordination is available. Each agent transmits on one channel at a time; if multiple agents transmit on the same channel simultaneously, a collision occurs, and none of the colliding agents obtains any information about the channel gain or about how many other agents transmitted on that channel. If no collision occurs, the agent observes a reward (gain) sample drawn from the underlying distribution associated with the channel. We propose an algorithm that minimizes the sample complexity and regret in this setting. An important property that distinguishes our algorithm from prior work that does not assume knowledge of $N$ is that it requires no information about the gaps between the mean channel gains of the $K$ channels. Our approach incurs fewer collisions and achieves improved regret compared to state-of-the-art algorithms. We validate our theoretical guarantees with experiments.
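The channel-access model above can be sketched as a single time slot of a multi-player bandit environment. The sketch below is illustrative only, not the paper's algorithm; the Bernoulli gain distributions and the function name `play_round` are assumptions for the example:

```python
import random

def play_round(choices, means, rng=random):
    """One slot of the multi-player channel-access model (illustrative sketch).

    choices: list of channel indices, one entry per agent.
    means:   assumed Bernoulli mean gain of each of the K channels.
    Returns one observation per agent:
      - None on a collision (the agent learns neither the gain
        nor how many other agents chose the same channel), or
      - a 0/1 gain sample from the chosen channel's distribution.
    """
    counts = {}
    for c in choices:
        counts[c] = counts.get(c, 0) + 1
    observations = []
    for c in choices:
        if counts[c] > 1:
            # Collision: no feedback of any kind for this agent.
            observations.append(None)
        else:
            # Sole transmitter: observe a gain sample.
            observations.append(1 if rng.random() < means[c] else 0)
    return observations
```

For example, with three agents choosing channels `[0, 0, 1]`, the two agents on channel 0 collide and receive `None`, while the agent alone on channel 1 observes a gain sample.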
