MapReduce framework energy adaptation via temperature awareness

MapReduce has become a popular framework for Big Data applications. Although it has been widely praised for its scalability and efficiency, its power consumption has not been thoroughly evaluated. This paper explores whether work can be scheduled in a power-efficient manner without expensive power monitors on every node. We start from the observation that no cluster is truly homogeneous with respect to energy consumption. From there we develop a MapReduce framework that evaluates the current status of each node and reacts dynamically to its estimated power usage, shifting work toward the more energy-efficient nodes that are currently consuming less power. Our results show that, given an ideal framework configuration, certain nodes may consume only 62.3% of the dynamic power they consume when the same framework is configured as a traditional MapReduce implementation would be.
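The idea of shifting work toward nodes with lower estimated power usage can be illustrated with a minimal sketch. It is not the framework's actual scheduler: the `Node` fields, the use of temperature rise above idle as a power proxy, and the `heat_per_task_c` constant are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    temperature_c: float       # hypothetical current sensor reading
    idle_temperature_c: float  # hypothetical baseline when the node is idle


def estimated_load(node: Node) -> float:
    """Assumed power proxy: temperature rise above the idle baseline."""
    return max(node.temperature_c - node.idle_temperature_c, 0.0)


def assign_tasks(nodes, tasks, heat_per_task_c=1.5):
    """Greedily give each task to the node with the lowest estimated load.

    heat_per_task_c is an assumed per-task temperature increase, used only
    to update the estimate between assignments.
    """
    load = {node.name: estimated_load(node) for node in nodes}
    assignment = {node.name: [] for node in nodes}
    for task in tasks:
        target = min(load, key=load.get)  # coolest node relative to idle
        assignment[target].append(task)
        load[target] += heat_per_task_c
    return assignment


nodes = [Node("A", 50.0, 40.0), Node("B", 45.0, 40.0)]
result = assign_tasks(nodes, list(range(6)))
```

In this example node B starts cooler relative to its idle baseline, so it receives most of the tasks until its estimated load overtakes that of node A. A real deployment would refresh the readings from the nodes instead of relying on a fixed per-task estimate.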
