Energy-aware processing of big data in homogeneous cluster

Data centers are growing larger to cope with exponential data growth, and their energy consumption poses a challenge to service providers and to the environment. Various data placement strategies have been developed to reduce the energy consumed in processing big data at the storage-system level, but they typically target specific applications and storage media. This paper proposes EABD, an energy-aware algorithm for processing big data in a homogeneous cluster with general data storage. We show that a variant of this optimization problem can be reduced to the set cover problem, and we propose a heuristic algorithm that reduces energy consumption by selecting suitable nodes and assigning a balanced workload to each selected node. The algorithm is independent of the data placement strategy and the storage medium. Simulation results show that our algorithm significantly reduces energy consumption in different situations.
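The two steps named in the abstract, node selection and balanced workload assignment, can be illustrated with a minimal sketch. This is not the EABD algorithm itself, whose details the abstract does not give; it is the classic greedy heuristic for set cover followed by a simple least-loaded assignment. The function names, the `replicas` mapping (node name to the set of data block IDs stored on that node), and the assumption that every block costs the same to process are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set


def greedy_node_selection(replicas: Dict[str, Set[int]]) -> List[str]:
    """Select nodes until every data block is held by some selected node.

    Classic greedy set-cover heuristic: repeatedly pick the node that
    holds the most still-uncovered blocks (ties go to the smallest name).
    """
    uncovered: Set[int] = set().union(*replicas.values())
    selected: List[str] = []
    while uncovered:
        best = max(sorted(replicas), key=lambda n: len(replicas[n] & uncovered))
        selected.append(best)
        uncovered -= replicas[best]
    return selected


def balanced_assignment(
    replicas: Dict[str, Set[int]], selected: List[str]
) -> Dict[str, List[int]]:
    """Assign each block to the least-loaded selected node that stores it.

    Assumes (hypothetically) that every block has equal processing cost.
    Blocks with the fewest candidate nodes are placed first, so that
    flexible blocks can be used to even out the load afterwards.
    """
    load: Dict[str, List[int]] = {n: [] for n in selected}
    blocks = set().union(*(replicas[n] for n in selected))

    def holders(block: int) -> List[str]:
        return [n for n in selected if block in replicas[n]]

    for block in sorted(blocks, key=lambda b: (len(holders(b)), b)):
        target = min(holders(block), key=lambda n: (len(load[n]), n))
        load[target].append(block)
    return load
```

For example, with `replicas = {"n1": {1, 2, 3}, "n2": {3, 4}, "n3": {4, 5, 6}, "n4": {1, 6}}`, the selection step returns `["n1", "n3"]`, so nodes `n2` and `n4` could stay in a low-power state, and the assignment step gives each selected node three blocks. The greedy rule is only an approximation: set cover is NP-hard, so the selected set is not guaranteed to be the smallest possible.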
