Joint optimization of overlapping phases in MapReduce

MapReduce is a scalable parallel computing framework for big data processing. Because it processes jobs in multiple phases, an effective job scheduling mechanism is crucial for efficient resource utilization. This work studies the scheduling challenge that results from the overlapping of the "map" and "shuffle" phases in MapReduce. We propose a new, general model for this scheduling problem. Further, we prove that scheduling to minimize average response time in this model is strongly NP-hard in the offline case and that no online algorithm can be constant-competitive in the online case. However, we provide two online algorithms that match the performance of the offline optimal when given a slightly faster service rate.
