Energy Efficient Scheduling of MapReduce Jobs

MapReduce has emerged as a prominent programming model for data-intensive computation. In this work, we study power-aware MapReduce scheduling in the speed scaling setting first introduced by Yao et al. [FOCS 1995]. We focus on minimizing the total weighted completion time of a set of MapReduce jobs under a given energy budget. Using a linear programming relaxation of our problem, we derive a polynomial-time constant-factor approximation algorithm. We also propose a convex programming formulation that we combine with standard list scheduling policies, and we evaluate the performance of the resulting policies using simulations.
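To make the objective concrete, the following is a minimal sketch of how the two quantities in the problem statement, total weighted completion time and energy, can be evaluated in the speed scaling model. It is a toy illustration only, not the algorithm of this work: it assumes a single processor, the usual power function speed^ALPHA with an assumed exponent ALPHA = 3, fixed per-task speeds, and ordering by Smith's rule as one example of a list scheduling policy. The names `Task`, `duration`, `energy`, and `evaluate` are hypothetical.

```python
from dataclasses import dataclass

ALPHA = 3.0  # assumed power exponent: power drawn at speed s is s ** ALPHA


@dataclass
class Task:
    work: float    # amount of computation to perform
    weight: float  # weight of the task in the objective
    speed: float   # constant speed chosen for this task (assumed given)


def duration(task: Task) -> float:
    # Running `work` units at constant speed s takes work / s time units.
    return task.work / task.speed


def energy(task: Task) -> float:
    # Energy = power * time = s ** ALPHA * (work / s) = work * s ** (ALPHA - 1).
    return task.work * task.speed ** (ALPHA - 1)


def evaluate(tasks: list[Task]) -> tuple[float, float]:
    """Return (total weighted completion time, total energy) on one processor.

    Tasks are sequenced by Smith's rule: non-increasing weight / duration.
    """
    order = sorted(tasks, key=lambda t: t.weight / duration(t), reverse=True)
    clock = 0.0
    weighted_completion = 0.0
    for task in order:
        clock += duration(task)            # completion time of this task
        weighted_completion += task.weight * clock
    total_energy = sum(energy(t) for t in tasks)
    return weighted_completion, total_energy


if __name__ == "__main__":
    tasks = [Task(work=4.0, weight=1.0, speed=2.0),
             Task(work=2.0, weight=3.0, speed=1.0)]
    cost, used = evaluate(tasks)
    print(f"weighted completion time = {cost}, energy = {used}")
```

A speed assignment is feasible for a given budget when the returned energy does not exceed it. Raising speeds shortens completion times but increases energy superlinearly, which is the trade-off the problem captures.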
