IoTAbench: an Internet of Things Analytics Benchmark

The commoditization of sensors and communication networks is enabling vast quantities of data to be generated by and collected from cyber-physical systems. This "Internet-of-Things" (IoT) makes possible new business opportunities, from usage-based insurance to proactive equipment maintenance. While many technology vendors now offer "Big Data" solutions, a challenge for potential customers is understanding quantitatively how these solutions will perform for IoT use cases. This paper describes IoTAbench, a benchmark toolkit for IoT Big Data scenarios. The toolkit facilitates repeatable testing that can be easily extended to multiple IoT use cases, including a user's specific needs, interests, or dataset. We demonstrate the benchmark via a smart metering use case involving an eight-node cluster running the HP Vertica analytics platform. The use case involves generating, loading, repairing, and analyzing synthetic meter readings. The intent of IoTAbench is to provide the means to perform "apples-to-apples" comparisons between different sensor data and analytics platforms. We illustrate the capabilities of IoTAbench via a large experimental study, in which we store 22.8 trillion smart meter readings totaling 727 TB of data in our eight-node cluster.
