HFSP: Size-based scheduling for Hadoop

Size-based scheduling with aging has long been recognized as an effective approach to guarantee fairness and near-optimal system response times. We present HFSP, a scheduler that brings this technique to Hadoop, a real, complex, multi-server, and widely used system. Size-based scheduling requires a priori job size information, which is not available in Hadoop: HFSP builds this knowledge by estimating job sizes online during job execution. Our experiments, which are based on realistic workloads generated via a standard benchmarking suite, show a significant decrease in system response times with respect to the widely used Hadoop Fair scheduler, and show that HFSP is largely tolerant of job size estimation errors.
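To illustrate why size-based scheduling can reduce response times relative to fair sharing, the following minimal sketch compares the two policies on a toy single-server model. It is not HFSP itself: the function names, the unit-rate server, the assumption that all jobs arrive at time 0, and the assumption that job sizes are known exactly are all hypothetical simplifications. Aging and online size estimation are not modeled.

```python
def size_based_completion_times(sizes):
    """Serve jobs one at a time, smallest first (all jobs arrive at t=0).

    Assumption: a single unit-rate server and exactly known sizes.
    """
    t = 0.0
    done = {}
    for i in sorted(range(len(sizes)), key=lambda k: sizes[k]):
        t += sizes[i]
        done[i] = t
    return [done[i] for i in range(len(sizes))]


def fair_sharing_completion_times(sizes):
    """Idealized processor sharing: active jobs split the server equally."""
    remaining = {i: float(s) for i, s in enumerate(sizes)}
    t = 0.0
    done = {}
    while remaining:
        n = len(remaining)
        step = min(remaining.values())  # work each job receives until the next completion
        t += step * n                   # with n jobs, each progresses at rate 1/n
        for i in list(remaining):
            remaining[i] -= step
            if remaining[i] <= 1e-12:
                done[i] = t
                del remaining[i]
    return [done[i] for i in range(len(sizes))]


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    jobs = [1, 2, 10]  # hypothetical job sizes, in units of service time
    print("size-based:", size_based_completion_times(jobs))
    print("fair sharing:", fair_sharing_completion_times(jobs))
```

In this toy instance the large job finishes at the same time under both policies, while the small jobs finish earlier under the size-based policy, so the mean response time drops. A real scheduler must additionally handle later arrivals, estimation errors, and starvation of large jobs, which is where aging comes in.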
