Value driven load balancing

To date, the study of dispatching, or load balancing, in server farms has focused primarily on minimizing response time. Server farms are typically modeled by a front-end router that employs a dispatching policy to route each job to one of several servers, with each server scheduling the jobs in its queue via Processor-Sharing. The common assumption, however, has been that all jobs are equally important or valuable, in that they are equally sensitive to delay. Our work departs from this assumption: we model each arrival as having a randomly distributed value parameter, independent of the arrival’s service requirement (job size). Given such value heterogeneity, the correct objective is no longer to minimize response time, but rather to minimize value-weighted response time. In this context, we ask, “What is a good dispatching policy for minimizing value-weighted response time?” We propose a number of new dispatching policies motivated by this goal. Via a combination of exact analysis, asymptotic analysis, and simulation, we deduce many unexpected results regarding dispatching.
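The abstract does not state the metric formally. As a minimal sketch, assuming value-weighted response time means the average of each job's value multiplied by its response time, the quantity could be estimated from a list of completed jobs as below. The function name and the sample numbers are hypothetical and used only for illustration.

```python
def value_weighted_response_time(values, response_times):
    """Empirical mean of (value * response time) over completed jobs.

    Assumption: the metric is the average of V * T, where V is a job's
    value and T is its response time. This is an illustrative reading of
    the abstract, not a definition taken from it.
    """
    if len(values) != len(response_times):
        raise ValueError("values and response_times must have equal length")
    if len(values) == 0:
        raise ValueError("at least one job is required")
    total = sum(v * t for v, t in zip(values, response_times))
    return total / len(values)


# Hypothetical data: two low-value jobs and one high-value job.
values = [1.0, 1.0, 10.0]

# Same multiset of response times, assigned differently to the jobs.
fast_for_valuable = [4.0, 2.0, 1.0]   # high-value job finishes quickly
slow_for_valuable = [1.0, 2.0, 4.0]   # high-value job finishes slowly

print(value_weighted_response_time(values, fast_for_valuable))
print(value_weighted_response_time(values, slow_for_valuable))
```

Both assignments have the same unweighted mean response time, yet the weighted metric is lower when the high-value job is the one that finishes quickly. Plain response time cannot distinguish the two cases, which is why it stops being the right objective once jobs differ in value.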
