Improving Energy Efficiency of Hadoop Clusters using Approximate Computing

There is an ongoing search for energy-efficient solutions in multi-core computing platforms. Approximate computing is one such solution: it leverages the forgiving nature of applications to improve energy efficiency at different layers of the computing platform, from applications down to hardware. We are interested in understanding the benefits of approximate computing in the realm of Apache Hadoop and its applications. Mechanisms for introducing approximation in programming models include sampling input data, selectively skipping computations, relaxing synchronization, and user-defined quality levels. We believe that it is straightforward to apply these mechanisms to conserve energy in Hadoop clusters as well. The emerging trend of approximate computing motivates us to systematically profile the thermal behavior of approximate computing strategies in this research. In particular, we design a thermal-aware approximate computing framework called tHadoop2, which is an extension of tHadoop proposed by Chavan et al. We investigate the thermal behavior of a MapReduce application called Pi running on Hadoop clusters by varying two input parameters: the number of maps and the number of sampling points per map. Our profiling results show that Pi exhibits inherent resilience to approximation, measured by the number of digits of precision in its computed value.
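The two input parameters varied in the Pi experiments can be illustrated with a minimal single-process sketch. It is not the Hadoop example's code and not part of tHadoop2: it assumes plain pseudo-random sampling, and the names `map_task` and `estimate_pi` are hypothetical. Each "map" counts the sample points that fall inside a quarter circle, and the "reduce" step combines the counts into an estimate of pi.

```python
import random


def map_task(num_samples, seed):
    """One hypothetical 'map': count random points of the unit square
    that fall inside the quarter circle of radius 1."""
    rng = random.Random(seed)  # per-map seed keeps runs reproducible
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return inside


def estimate_pi(num_maps, samples_per_map, base_seed=0):
    """Hypothetical 'reduce': sum the per-map counts and scale by 4,
    since the quarter circle covers pi/4 of the unit square."""
    inside = sum(
        map_task(samples_per_map, base_seed + i) for i in range(num_maps)
    )
    return 4.0 * inside / (num_maps * samples_per_map)


if __name__ == "__main__":
    # Fewer maps or fewer samples per map means less work
    # and, typically, fewer correct digits in the estimate.
    for num_maps, samples in [(2, 100), (4, 1000), (8, 10000)]:
        print(num_maps, samples, estimate_pi(num_maps, samples))
```

Lowering either parameter reduces the total number of sampled points, which is the kind of work-versus-precision trade-off the profiling explores.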

[1] Juan Li, et al. An overview of energy efficiency techniques in cluster computing systems, 2013, Cluster Computing.

[2] Li Shi, et al. Emerging challenges and materials for thermal management of electronics, 2014.

[3] Ajit Chavan, et al. Thermal-Aware File and Resource Allocation in Data Centers, 2017.

[4] Ayan Banerjee, et al. Cooling-aware and thermal-aware workload placement for green HPC data centers, 2010, International Conference on Green Computing.

[5] Thu D. Nguyen, et al. ApproxHadoop: Bringing Approximations to MapReduce Frameworks, 2015, ASPLOS.

[6] Sparsh Mittal. A survey of techniques for designing and managing CPU register file, 2017, Concurr. Comput. Pract. Exp.

[7] Samir Khuller, et al. Algorithms for the Thermal Scheduling Problem, 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.

[8] Adrian Sampson, et al. Hardware and Software for Approximate Computing, 2015.

[9] Ismail Akturk, et al. On Quantification of Accuracy Loss in Approximate Computing, 2015.

[10] Albert Y. Zomaya, et al. A Taxonomy and Survey of Energy-Efficient Data Centers and Cloud Computing Systems, 2010, Adv. Comput.

[11] Kaushik Roy, et al. Approximate computing and the quest for computing efficiency, 2015, 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC).

[12] Alan Edelman, et al. Language and compiler support for auto-tuning variable-accuracy algorithms, 2011, International Symposium on Code Generation and Optimization (CGO 2011).

[13] Logan Kugler. Is "good enough" computing good enough?, 2015, Commun. ACM.