FCD: Fast-concurrent-distributed load balancing under switching costs and imperfect observations

The problem of distributed load balancing among m agents operating in an n-server slotted system is considered. A randomized local search mechanism, the FCD (fast, concurrent, and distributed) algorithm, is run concurrently by each agent associated with a user. It involves switching to a different server with a certain exploration probability and then backtracking with a probability proportional to the ratio of the measured loads at the two servers (in consecutive time slots). The exploration and backtracking operations are executed concurrently by users in local alternating time slots. To ensure that users asymptotically stop switching to other servers, each user chooses an exploration probability that decays polynomially in time with decay rate β ∈ [0.5, 1]. The backtracking decision is then based on an estimate of the server load computed from local information. Thus, the FCD algorithm does not require synchronization or coordination with other users. The main contribution of this work, besides the FCD algorithm, is the analysis of the convergence time for the system to be approximately balanced, i.e., to reach an ϵ-Nash equilibrium. We show that the system reaches an ϵ-Nash equilibrium in expected time $O\big(\max\{\, n \log n/\epsilon + n^{1/\beta},\ (n^3/m^3 \log n^2/\epsilon)^{1/\beta} \,\}\big)$ when $m > n^2$. This implies that the convergence rate is robust in large-scale systems (large user populations) and is not affected by imperfect measurements of the server load. We also extend our analysis to open systems where users arrive at and depart from a system with an initial load of m users. We allow for general time-dependent arrival processes (including heavy-tailed processes) and consider uniform and load-oblivious routing of the arrivals to the servers. A wide class of departure processes, including load-dependent departures from the servers, is also allowed.
Our analysis demonstrates that it is possible to design fast, concurrent and distributed load balancing mechanisms in large multi-agent systems via randomized local search.
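
The explore-then-backtrack dynamic described above can be illustrated with a small toy simulation. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not give the exact rules, so the uniform choice of the new server, the exploration probability t^(-β), the backtracking probability min(1, new load / old load), and the multiplicative Gaussian noise model for imperfect load measurements are all assumptions. The function name `simulate_fcd` and its parameters are hypothetical.

```python
import random


def simulate_fcd(n_servers, m_users, beta=0.75, slots=2000, noise=0.0, seed=0):
    """Toy explore/backtrack load-balancing dynamic (assumed rules, see lead-in)."""
    rng = random.Random(seed)
    server = [rng.randrange(n_servers) for _ in range(m_users)]  # user -> server
    loads = [0] * n_servers
    for s in server:
        loads[s] += 1
    # After exploring, a user remembers (old server, measured old load).
    pending = [None] * m_users
    # Each user has its own slot parity, so users are not synchronized.
    phase = [rng.randrange(2) for _ in range(m_users)]

    def measure(s):
        # Assumed noise model: multiplicative Gaussian error on the true load.
        return max(1e-9, loads[s] * (1.0 + noise * rng.gauss(0.0, 1.0)))

    for t in range(1, slots + 1):
        explore_prob = t ** (-beta)  # assumed polynomial decay with rate beta
        for u in range(m_users):
            if (t + phase[u]) % 2 == 0:
                # Exploration slot: maybe move to a different server.
                if n_servers > 1 and rng.random() < explore_prob:
                    old = server[u]
                    old_load = measure(old)
                    new = rng.randrange(n_servers - 1)
                    if new >= old:
                        new += 1  # uniform over servers other than `old`
                    loads[old] -= 1
                    loads[new] += 1
                    server[u] = new
                    pending[u] = (old, old_load)
            elif pending[u] is not None:
                # Backtracking slot: return with probability tied to the load ratio.
                old, old_load = pending[u]
                new = server[u]
                new_load = measure(new)
                if rng.random() < min(1.0, new_load / old_load):
                    loads[new] -= 1
                    loads[old] += 1
                    server[u] = old
                pending[u] = None
    return loads
```

For example, `simulate_fcd(n_servers=5, m_users=100, beta=0.75, noise=0.1)` returns the final per-server loads, which can be compared with the ideal load m/n. Users are updated one after another within a slot, which only approximates truly concurrent moves.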
