Asymptotically Optimal Load Balancing in Large-scale Heterogeneous Systems with Multiple Dispatchers

We consider the load balancing problem in large-scale heterogeneous systems with multiple dispatchers. We introduce a general framework called Local-Estimation-Driven (LED). Under this framework, each dispatcher keeps local (possibly outdated) estimates of the queue lengths of all the servers and makes its dispatching decisions based purely on these local estimates. The local estimates are updated via infrequent communication between dispatchers and servers. We derive sufficient conditions under which LED policies achieve throughput optimality and heavy-traffic delay optimality, respectively. These conditions directly imply heavy-traffic delay optimality for many previous local-memory-based policies. Moreover, the results enable us to design new delay-optimal policies for heterogeneous systems with multiple dispatchers. Finally, the heavy-traffic delay optimality of the LED framework also sheds light on a recent open question on how to design optimal load balancing schemes using delayed information.
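The mechanics described in the abstract (per-dispatcher local estimates, dispatching from those estimates only, and infrequent updates) can be illustrated with a small discrete-time simulation. This is a minimal sketch under stated assumptions, not the paper's policy or analysis: the function name `simulate_led`, the Bernoulli arrival and service model, the join-the-shortest-estimated-queue dispatching rule, and the random single-server update with probability `sync_prob` are all hypothetical choices made for illustration.

```python
import random


def simulate_led(num_dispatchers=2, steps=2000, arrival_prob=0.4,
                 service_probs=(0.3, 0.3, 0.2, 0.2), sync_prob=0.1, seed=0):
    """Toy LED-style simulation (illustrative assumptions throughout).

    Each dispatcher holds its own, possibly stale, queue-length estimates
    and routes every arrival using those estimates only.
    """
    rng = random.Random(seed)
    num_servers = len(service_probs)
    queues = [0] * num_servers  # true queue lengths
    # estimates[d][s]: dispatcher d's local estimate of server s's queue
    estimates = [[0] * num_servers for _ in range(num_dispatchers)]
    arrivals = departures = 0

    for _ in range(steps):
        # Arrivals: at most one job per dispatcher per slot (assumption).
        for est in estimates:
            if rng.random() < arrival_prob:
                # Assumed rule: join the shortest *estimated* queue.
                target = min(range(num_servers), key=lambda s: est[s])
                queues[target] += 1
                est[target] += 1  # dispatcher tracks its own routing
                arrivals += 1

        # Service: heterogeneous per-slot completion probabilities.
        for s, mu in enumerate(service_probs):
            if queues[s] > 0 and rng.random() < mu:
                queues[s] -= 1
                departures += 1

        # Infrequent communication: with small probability a dispatcher
        # refreshes its estimate of one randomly chosen server (assumption).
        for est in estimates:
            if rng.random() < sync_prob:
                s = rng.randrange(num_servers)
                est[s] = queues[s]

    return {"queues": queues, "estimates": estimates,
            "arrivals": arrivals, "departures": departures}


if __name__ == "__main__":
    result = simulate_led()
    print("true queues:    ", result["queues"])
    print("local estimates:", result["estimates"])
```

Lowering `sync_prob` makes the local estimates staler, which is one way to explore how dispatching quality depends on the update frequency in this toy model.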
