Batch Sampling : Low Overhead Scheduling for Sub-Second Parallel Jobs

Large-scale data analytics frameworks are shifting towards shorter task durations and larger degrees of parallelism to provide low latency. However, scheduling highly parallel jobs that complete in hundreds of milliseconds poses a major scaling challenge for cluster schedulers, which will need to place millions of tasks per second on appropriate nodes while offering millisecondlevel latency and high availability. This paper presents a decentralized load balancing approach called batch sampling, based on a generalization of the power of two random choices, that performs within a few percent of an optimal centralized scheduler. We evaluate our approach through both analytical results and an implementation called Sparrow. Sparrow schedules tasks with less than 8ms of overhead and features a design that is inherently fault-tolerant and scalable.

[1]  Yuan Yu,et al.  Dryad: distributed data-parallel programs from sequential building blocks , 2007, EuroSys '07.

[2]  Michael J. Franklin,et al.  Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing , 2012, NSDI.

[3]  Diksha Verma,et al.  Quincy: Fair Scheduling for Distributed Computing Clusters , 2014 .

[4]  Jeffrey Dean,et al.  Achieving Rapid Response Times in Large Online Services , 2012 .

[5]  Scott Shenker,et al.  Why Let Resources Idle? Aggressive Cloning of Jobs with Dolly , 2012, HotCloud.

[6]  Andrey Gubarev,et al.  Dremel : Interactive Analysis of Web-Scale Datasets , 2011 .

[7]  M. Jette,et al.  Simple Linux Utility for Resource Management , 2009 .

[8]  Eli Upfal,et al.  A simple load balancing scheme for task allocation in parallel machines , 1991, SPAA '91.

[9]  Scott Shenker,et al.  Shark: fast data analysis using coarse-grained distributed memory , 2012, SIGMOD Conference.

[10]  Michael Mitzenmacher,et al.  The Power of Two Choices in Randomized Load Balancing , 2001, IEEE Trans. Parallel Distributed Syst..

[11]  Douglas Thain,et al.  Distributed computing in practice: the Condor experience , 2005, Concurr. Pract. Exp..

[12]  Randy H. Katz,et al.  Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center , 2011, NSDI.

[13]  Scott Shenker,et al.  Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling , 2010, EuroSys '10.

[14]  Anthony P. Reeves,et al.  Strategies for Dynamic Load Balancing on Highly Parallel Computers , 1993, IEEE Trans. Parallel Distributed Syst..

[15]  Andy B. Yoo,et al.  Approved for Public Release; Further Dissemination Unlimited X-ray Pulse Compression Using Strained Crystals X-ray Pulse Compression Using Strained Crystals , 2002 .

[16]  Miron Livny,et al.  An update on the scalability limits of the Condor batch system , 2011 .

[17]  Randy H. Katz,et al.  Improving MapReduce Performance in Heterogeneous Environments , 2008, OSDI.

[18]  Albert G. Greenberg,et al.  Reining in the Outliers in Map-Reduce Clusters using Mantri , 2010, OSDI.

[19]  Ramesh K. Sitaraman,et al.  The power of two random choices: a survey of tech-niques and results , 2001 .

[20]  Thomas L. Casavant,et al.  A Taxonomy of Scheduling in General-Purpose Distributed Computing Systems , 1988, IEEE Trans. Software Eng..