Configuring a MapReduce Framework for Dynamic and Efficient Energy Adaptation

MapReduce has become a popular framework for Big Data applications. While MapReduce has received much praise for its scalability and efficiency, it has not been thoroughly evaluated for power consumption. Our goal with this paper is to explore the possibility of scheduling in a power-efficient manner without the need for expensive power monitors on every node. We begin by considering that no cluster is truly homogeneous with respect to energy consumption. From there we develop a MapReduce framework that can evaluate the current status of each node and dynamically react to estimated power usage. Inso doing, we shift power consumption work toward more energy efficient nodes which are currently consuming less power. Our work shows that given an ideal framework configuration, certain nodes may consume only 62.3% of the dynamic power they consumed when the same framework was configured as it would be in a traditional MapReduce implementation.

[1]  Miron Livny,et al.  A framework for reliable and efficient data placement in distributed computing systems , 2005, J. Parallel Distributed Comput..

[2]  Yun Tian,et al.  Improving MapReduce performance through data placement in heterogeneous Hadoop clusters , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW).

[3]  Lavanya Ramakrishnan,et al.  MARIANE: MApReduce Implementation Adapted for HPC Environments , 2011, 2011 IEEE/ACM 12th International Conference on Grid Computing.

[4]  Luiz André Barroso,et al.  The Case for Energy-Proportional Computing , 2007, Computer.

[5]  Sanjay Ghemawat,et al.  MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.

[6]  Madhusudhan Govindaraju,et al.  MARLA: MapReduce for Heterogeneous Clusters , 2012, 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012).

[7]  Rong Ge,et al.  Improving MapReduce energy efficiency for computation intensive workloads , 2011, 2011 International Green Computing Conference and Workshops.

[8]  Jignesh M. Patel,et al.  Energy management for MapReduce clusters , 2010, Proc. VLDB Endow..

[9]  Yanpei Chen,et al.  Towards Energy Efficient MapReduce , 2009 .

[10]  Xiao Qin,et al.  An Energy-Efficient Framework for Large-Scale Parallel Storage Systems , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.

[11]  Michael C. Schatz,et al.  CloudBurst: highly sensitive read mapping with MapReduce , 2009, Bioinform..

[12]  Christoforos E. Kozyrakis,et al.  On the energy (in)efficiency of Hadoop clusters , 2010, OPSR.

[13]  Randy H. Katz,et al.  Improving MapReduce Performance in Heterogeneous Environments , 2008, OSDI.

[14]  Archana Ganapathi,et al.  Statistical Workloads for Energy Efficient MapReduce , 2010 .

[15]  Rini T. Kaushik,et al.  GreenHDFS: towards an energy-conserving, storage-efficient, hybrid Hadoop compute cluster , 2010 .