BigDataBench: A Dwarf-based Big Data and AI Benchmark Suite

As the architecture, system, data management, and machine learning communities pay greater attention to innovative big data and data-driven artificial intelligence (in short, AI) algorithms, architectures, and systems, the pressure to benchmark them rises. However, the complexity, diversity, frequently changing workloads, and rapid evolution of big data and especially AI systems pose great challenges to benchmarking. First, for the sake of conciseness, benchmarking scalability, portability cost, reproducibility, and better interpretation of performance data, we need to understand the abstractions of frequently-appearing units of computation, which we call dwarfs, among big data and AI workloads. Second, for the sake of fairness, the benchmarks must cover a diversity of data and workloads. Third, for co-design of software and hardware, the benchmarks should be consistent across different communities. Rather than creating a new benchmark or proxy for every possible workload, we propose dwarf-based benchmarks--combinations of eight dwarfs--to represent the diversity of big data and AI workloads. The current version, BigDataBench 4.0, provides 13 representative real-world data sets and 47 big data and AI benchmarks covering seven workload types: online service, offline analytics, graph analytics, AI, data warehouse, NoSQL, and streaming. BigDataBench 4.0 is publicly available from this http URL. Also, for the first time, we comprehensively characterize the benchmarks of all seven workload types in BigDataBench 4.0, alongside traditional benchmarks such as SPEC CPU, PARSEC, and HPCC, in a hierarchical manner, drilling down on five levels using the Top-Down analysis from an architecture perspective.

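To make the Top-Down characterization concrete, the sketch below shows how the level-1 breakdown of that hierarchy can be derived from raw hardware counter values, using the standard formulas of the Top-Down method. This is an illustrative sketch only, not BigDataBench's own tooling: the event names, the four-wide issue assumption, and the example counter values are assumptions for an Intel-style core.

```python
# Illustrative sketch (assumption: Intel-style four-wide core and the
# standard Top-Down level-1 formulas). Not the paper's measurement tooling.

def topdown_level1(uops_issued, uops_retired_slots, recovery_cycles,
                   idq_uops_not_delivered, cpu_clk_unhalted,
                   pipeline_width=4):
    """Return the fraction of pipeline slots in each Top-Down level-1 category."""
    slots = pipeline_width * cpu_clk_unhalted           # total issue slots
    frontend_bound = idq_uops_not_delivered / slots     # slots starved by the front end
    bad_speculation = (uops_issued - uops_retired_slots
                       + pipeline_width * recovery_cycles) / slots
    retiring = uops_retired_slots / slots                # slots doing useful work
    backend_bound = 1.0 - frontend_bound - bad_speculation - retiring
    return {
        "Frontend Bound": frontend_bound,
        "Bad Speculation": bad_speculation,
        "Retiring": retiring,
        "Backend Bound": backend_bound,
    }

if __name__ == "__main__":
    # Hypothetical counter values for demonstration only.
    breakdown = topdown_level1(
        uops_issued=9.0e9,
        uops_retired_slots=8.2e9,
        recovery_cycles=1.5e8,
        idq_uops_not_delivered=2.4e9,
        cpu_clk_unhalted=4.0e9,
    )
    for category, share in breakdown.items():
        print(f"{category:>15}: {share:6.2%}")
```

Lower levels of the hierarchy refine each of these four categories (for example, splitting Backend Bound into memory-bound and core-bound slots) using additional counters in the same slot-accounting style.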