Load balancing for cluster systems under heavy-tailed and temporal dependent workloads

Abstract Large-scale cluster systems are employed in many areas because they offer pools of fundamental resources. Efficient allocation of the shared resources in a cluster system is a critical but challenging issue that has been studied extensively in the past few years. Existing load balancing policies, such as Random, Join Shortest Queue, and size-based policies, are widely implemented in actual systems because of their simplicity and efficiency, but their performance benefits diminish when workloads are highly variable and temporally correlated. In this paper, we propose a new load balancing policy, named ADuS, which attempts to partition jobs according to their present sizes and to rank the servers based on their loads. By dispatching jobs of similar sizes to the correspondingly ranked servers, ADuS can adaptively balance user traffic and system load in a cluster and thus achieve significant performance benefits. Extensive trace-driven simulations using both synthetic and real traces show the effectiveness and robustness of ADuS under many different environments.
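
The dispatching idea described in the abstract (partition jobs by size, rank servers by load, send each size class to the server of the corresponding rank) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed size boundaries, the use of accumulated job size as the load measure, and the choice to map the largest size class to the least loaded server are all assumptions made for illustration, since the abstract does not specify them.

```python
from bisect import bisect_right


def size_class(job_size, boundaries):
    """Return the index of the size class that job_size falls into.

    boundaries is a sorted list of hypothetical size cut-offs; with
    N servers it is assumed to hold N - 1 values, giving N classes.
    """
    return bisect_right(boundaries, job_size)


def rank_servers(loads):
    """Return server indices ordered from least to most loaded."""
    return sorted(range(len(loads)), key=lambda i: loads[i])


def dispatch(job_size, loads, boundaries):
    """Choose a server for one job and update its load in place."""
    ranked = rank_servers(loads)
    cls = size_class(job_size, boundaries)
    # Assumption (not from the abstract): the largest size class is
    # mapped to the least loaded server, the smallest to the most loaded.
    server = ranked[len(ranked) - 1 - cls]
    # Assumption: load is measured as the total size of assigned jobs.
    loads[server] += job_size
    return server


if __name__ == "__main__":
    loads = [5.0, 1.0, 3.0]        # current load of servers 0, 1, 2
    boundaries = [10.0, 100.0]     # small < 10 <= medium < 100 <= large
    for size in (500.0, 2.0, 50.0):
        chosen = dispatch(size, loads, boundaries)
        print(f"job of size {size} -> server {chosen}, loads = {loads}")
```

Because the servers are re-ranked on every call, the mapping from size class to server changes as loads evolve, which is one simple way to read the "adaptive" behaviour the abstract mentions. A real policy would also need to update the size boundaries and decrease loads as jobs complete.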
