Mean Waiting Time in Large-Scale and Critically Loaded Power of d Load Balancing Systems

Mean field models are a popular tool for analysing load balancing policies. In some exceptional cases the waiting time distribution of the mean field limit has an explicit form; in other cases it can be computed as the solution of a set of differential equations. In this paper we study the limit of the mean waiting time $E[W_\lambda]$ as the arrival rate $\lambda$ approaches 1 (i.e. as the system gets close to instability) for a number of load balancing policies in a large-scale system of homogeneous servers that process work at a constant rate equal to one, with exponential job sizes of mean 1. As $E[W_\lambda]$ diverges to infinity, we scale by $-\log(1-\lambda)$ and present a method to compute the limit $\lim_{\lambda \to 1^-} -E[W_\lambda]/\log(1-\lambda)$. We show that this limit has a surprisingly simple form for the load balancing algorithms considered. More specifically, we present a general result that holds for any policy whose associated differential equation satisfies a list of assumptions. For the well-known LL(d) policy, which assigns an incoming job to the server with the least work left among d randomly selected servers, these assumptions are trivially verified, and we prove that the limit is $1/(d-1)$. We further show that the LL(d,K) policy, which assigns batches of K jobs to the K least loaded servers among d randomly selected servers, satisfies the assumptions and that the limit equals $K/(d-K)$. For a policy that applies LL($d_i$) with probability $p_i$, we show that the limit is $1/(\sum_i p_i d_i - 1)$. We further indicate that our main result can also be used for load balancers with redundancy or memory. In addition, we propose an alternative scaling $-\log(p_\lambda)$ in place of $-\log(1-\lambda)$, where $p_\lambda$ is adapted to the policy at hand, such that $\lim_{\lambda \to 1^-} -E[W_\lambda]/\log(1-\lambda) = \lim_{\lambda \to 1^-} -E[W_\lambda]/\log(p_\lambda)$, and such that the limit $\lim_{\lambda \to 0^+} -E[W_\lambda]/\log(p_\lambda)$ is well defined and non-zero (contrary to $\lim_{\lambda \to 0^+} -E[W_\lambda]/\log(1-\lambda)$). This yields relatively flat curves for $-E[W_\lambda]/\log(p_\lambda)$ over $\lambda \in [0,1]$, which indicates that the low and high load limits can be used as approximations when $\lambda$ is close to zero or one, respectively. Our results rely on the previously proven ansatz which asserts that, for certain load balancing policies, the workloads of any finite set of queues become independent of one another as the number of servers tends to infinity.
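As a quick illustration, the three closed-form limits quoted above can be evaluated directly. The following is a minimal sketch, not code from the paper: the function names are hypothetical, exact rationals are used to avoid rounding, and `heavy_traffic_estimate` only multiplies a limit by $-\log(1-\lambda)$, which is a leading-order reading of the scaling and says nothing about the approximation error at a given $\lambda$.

```python
from fractions import Fraction
import math


def ll_d_limit(d):
    """Limit 1/(d-1) stated for LL(d); needs d >= 2."""
    if d < 2:
        raise ValueError("LL(d) limit requires d >= 2")
    return Fraction(1, d - 1)


def ll_d_k_limit(d, k):
    """Limit K/(d-K) stated for LL(d,K); needs 1 <= K < d."""
    if not 1 <= k < d:
        raise ValueError("LL(d,K) limit requires 1 <= K < d")
    return Fraction(k, d - k)


def mixed_ll_limit(probs, ds):
    """Limit 1/(sum_i p_i d_i - 1) for applying LL(d_i) with probability p_i.

    probs should be exact rationals (e.g. Fraction objects) summing to 1.
    """
    probs = [Fraction(p) for p in probs]
    if len(probs) != len(ds) or sum(probs) != 1:
        raise ValueError("probs must match ds in length and sum to 1")
    mean_d = sum(p * d for p, d in zip(probs, ds))
    if mean_d <= 1:
        raise ValueError("requires sum_i p_i d_i > 1")
    return 1 / (mean_d - 1)


def heavy_traffic_estimate(limit, lam):
    """Leading-order estimate: limit * (-log(1 - lam)), for lam close to 1.

    Assumption: only the dominant term is kept; this is not an error bound.
    """
    if not 0 < lam < 1:
        raise ValueError("lam must lie strictly between 0 and 1")
    return float(limit) * -math.log(1 - lam)


if __name__ == "__main__":
    print("LL(2):", ll_d_limit(2))
    print("LL(4,2):", ll_d_k_limit(4, 2))
    half = Fraction(1, 2)
    print("LL(1) or LL(2), each w.p. 1/2:", mixed_ll_limit([half, half], [1, 2]))
    print("LL(2) estimate at lam=0.99:", heavy_traffic_estimate(ll_d_limit(2), 0.99))
```

Two consistency checks follow from the formulas themselves: setting K = 1 in $K/(d-K)$ recovers $1/(d-1)$, and a mixture that puts all its probability on a single d reduces to the LL(d) limit.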
