Big vs little core for energy-efficient Hadoop computing

The rapid growth of data makes it challenging to process data efficiently on current high-performance server architectures, such as those built on big Xeon cores. Furthermore, physical design constraints, such as power and density, have become the dominant limiting factors for scaling out servers. Heterogeneous architectures that combine big Xeon cores with little Atom cores have emerged as a promising way to improve energy efficiency, because each application can run on the core type that matches its resource needs more closely than a one-size-fits-all architecture would. The question of whether to map an application to a big Xeon or a little Atom core in a heterogeneous server architecture therefore becomes important. In this paper, we characterize Hadoop-based applications and their corresponding MapReduce tasks on big Xeon-based and little Atom-based server architectures to understand how the choice of big vs little cores is affected by various parameters at the application, system, and architecture levels, and by the interplay among these parameters. Furthermore, we evaluate the operational and capital costs to understand how the performance, power, and area constraints of big data analytics affect the choice of a big vs little core server as the more cost- and energy-efficient architecture.
