Specialized Storage for Big Numeric Time Series

Numeric time series data is central to many Big Data analyses, and its distinctive storage requirements and access patterns call for specialized support. Popular frameworks and databases are designed around other needs, which makes them a suboptimal fit. This paper describes the support that numeric time series need, suggests an architecture for efficient time series storage, and illustrates its potential for satisfying key requirements.
