Overview of Time Series Storage and Processing in a Cloud Environment

In this paper we provide a short overview of time series storage and processing in a cloud environment. We focus on four main systems: Chukwa, OpenTSDB, TempoDB and Squwk. We compare them in terms of storage infrastructure, data acquisition, GUI and support for advanced analysis. In this comparison, OpenTSDB emerges as the most interesting option for projects that require support for advanced analysis. At the same time, TempoDB offers all the necessary basic functionality and can be a better option for the many projects that lack full IT support. We also describe other work in the field that has not yet developed into full frameworks, some of which could contribute significantly to the existing platforms.
