Shrewd Selection Speeds Surfing: Use Smart EXP3!

In this paper, we explore the use of multi-armed bandit online learning techniques to solve distributed resource selection problems. As an example, we focus on the problem of network selection. Mobile devices often have several wireless networks at their disposal. While choosing the right network is vital for good performance, a decentralized solution remains a challenge. The impressive theoretical properties of multi-armed bandit algorithms such as EXP3 suggest that they should work well for this type of problem. Yet the real-world performance of EXP3 lags far behind. The main reasons are the hidden cost of switching networks and the algorithm's slow rate of convergence. We propose Smart EXP3, a novel bandit-style algorithm that (a) retains the good theoretical properties of EXP3, (b) bounds the number of switches, and (c) yields significantly better performance in practice. We evaluate Smart EXP3 using simulations, controlled experiments, and in-the-wild experiments. Results show that it stabilizes at the optimal state, achieves fairness among devices, and gracefully deals with transient behaviors. In real-world experiments, it can achieve 18% faster downloads than alternative strategies. We conclude that multi-armed bandit algorithms can play an important role in distributed resource selection problems, provided that practical concerns, such as switching costs and convergence time, are addressed.
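For readers unfamiliar with the baseline, the following is a minimal sketch of the standard EXP3 algorithm, not of Smart EXP3, whose details the abstract does not give. Each arm stands for a candidate network. The class name `Exp3`, the helper `simulate`, the Bernoulli reward model, and the default `gamma` are illustrative assumptions, not taken from the paper. The helper also counts how often the chosen arm changes, which is the switching behavior the abstract identifies as a hidden cost.

```python
import math
import random


class Exp3:
    """Standard EXP3 with exploration rate gamma (illustrative sketch)."""

    def __init__(self, num_arms, gamma, rng=None):
        if not 0 < gamma <= 1:
            raise ValueError("gamma must be in (0, 1]")
        self.num_arms = num_arms
        self.gamma = gamma
        self.weights = [1.0] * num_arms
        self.rng = rng or random.Random()

    def probabilities(self):
        # Mix the weight-proportional distribution with uniform exploration.
        total = sum(self.weights)
        return [
            (1 - self.gamma) * w / total + self.gamma / self.num_arms
            for w in self.weights
        ]

    def select(self):
        # Sample one arm (network) from the current distribution.
        return self.rng.choices(range(self.num_arms),
                                weights=self.probabilities())[0]

    def update(self, arm, reward):
        # reward is assumed to lie in [0, 1]; only the chosen arm is observed.
        estimate = reward / self.probabilities()[arm]  # importance weighting
        self.weights[arm] *= math.exp(self.gamma * estimate / self.num_arms)
        # Rescale so the largest weight is 1; probabilities are unchanged,
        # and this avoids floating-point overflow on long runs.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


def simulate(mean_rewards, rounds, gamma=0.1, seed=0):
    """Hypothetical harness: Bernoulli rewards, returns (pull counts, switches)."""
    rng = random.Random(seed)
    learner = Exp3(len(mean_rewards), gamma, rng)
    counts = [0] * len(mean_rewards)
    switches = 0
    previous = None
    for _ in range(rounds):
        arm = learner.select()
        reward = 1.0 if rng.random() < mean_rewards[arm] else 0.0
        learner.update(arm, reward)
        counts[arm] += 1
        if previous is not None and arm != previous:
            switches += 1
        previous = arm
    return counts, switches


if __name__ == "__main__":
    counts, switches = simulate([0.2, 0.8], rounds=2000)
    print("pulls per arm:", counts, "switches:", switches)
```

Because EXP3 samples a fresh arm from its distribution every round and keeps a uniform exploration floor of `gamma / num_arms`, it keeps switching even after the weights have concentrated on one arm. This is one way to see why an unmodified EXP3 can pay a high switching cost in a network selection setting.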
