Amdahl's Law in Big Data Analytics: Alive and Kicking in TPCx-BB (BigBench)

Big data, specifically data analytics, is responsible for driving many of consumers' most common online activities, including shopping, web searches, and interactions on social media. In this paper, we present the first (micro)architectural investigation of a new industry-standard, open source benchmark suite directed at big data analytics applications—TPCx-BB (BigBench). Where previous work has usually studied benchmarks which oversimplify big data analytics, our study of BigBench reveals that there is immense diversity among applications, owing to their varied data types, computational paradigms, and analyses. In our analysis, we also make an important discovery generally restricting processor performance in big data. Contrary to conventional wisdom that big data applications lend themselves naturally to parallelism, we discover that they lack sufficient thread-level parallelism (TLP) to fully utilize all cores. In other words, they are constrained by Amdahl's law. While TLP may be limited by various factors, ultimately we find that single-thread performance is as relevant in scale-out workloads as it is in more classical applications. To this end we present core packing: a software and hardware solution that could provide as much as 20% execution speedup for some big data analytics applications.

[1]  Weisong Shi,et al.  Workload characterization on a production Hadoop cluster: A case study on Taobao , 2012, 2012 IEEE International Symposium on Workload Characterization (IISWC).

[2]  Shengsheng Huang,et al.  HiBench : A Representative and Comprehensive Hadoop Benchmark Suite , 2012 .

[3]  Zhen Jia,et al.  The Implications of Diverse Applications and Scalable Data Sets in Benchmarking Big Data Systems , 2012, WBDB.

[4]  Antony I. T. Rowstron,et al.  Scale-up vs scale-out for Hadoop: time to rethink? , 2013, SoCC.

[5]  John Paul Shen,et al.  Best of both latency and throughput , 2004, IEEE International Conference on Computer Design: VLSI in Computers and Processors, 2004. ICCD 2004. Proceedings..

[6]  Sherief Reda,et al.  Pack & Cap: Adaptive DVFS and thread packing under power caps , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[7]  Yuqing Zhu,et al.  BigDataBench: A big data benchmark suite from internet services , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).

[8]  Age K. Smilde,et al.  Principal component analysis , 2014 .

[9]  Babak Falsafi,et al.  Clearing the clouds: a study of emerging scale-out workloads on modern hardware , 2012, ASPLOS XVII.

[10]  Eric Anderson,et al.  Efficiency matters! , 2010, OPSR.

[11]  Babak Falsafi,et al.  Scale-out processors , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[12]  Chunjie Luo,et al.  Characterizing data analysis workloads in data centers , 2013, 2013 IEEE International Symposium on Workload Characterization (IISWC).

[13]  Jian Li,et al.  Power-Performance Implications of Thread-level Parallelism on Chip Multiprocessors , 2005, IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005..

[14]  Todor Ivanov,et al.  Evaluating Hive and Spark SQL with BigBench , 2015, ArXiv.

[15]  Gu-Yeon Wei,et al.  Profiling a Warehouse-Scale Computer , 2016, IEEE Micro.

[16]  Thomas Willhalm,et al.  Memory system characterization of big data workloads , 2013, 2013 IEEE International Conference on Big Data.

[17]  John Paul Shen,et al.  Mitigating Amdahl's law through EPI throttling , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).

[18]  Tao Li,et al.  Characterizing the efficiency of data deduplication for big data storage management , 2013, 2013 IEEE International Symposium on Workload Characterization (IISWC).

[19]  Carlo Curino,et al.  Discussion of BigBench: A Proposed Industry Standard Performance Benchmark for Big Data , 2014, TPCTC.

[20]  Sanjay Ghemawat,et al.  MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.

[21]  Adam Silberstein,et al.  Benchmarking cloud serving systems with YCSB , 2010, SoCC '10.

[22]  Avi Mendelson,et al.  Deep-dive analysis of the data analytics workload in CloudSuite , 2014, 2014 IEEE International Symposium on Workload Characterization (IISWC).

[23]  Timothy G. Armstrong,et al.  LinkBench: a database benchmark based on the Facebook social graph , 2013, SIGMOD '13.