An analysis of time series representation methods: data mining applications perspective

Because time series data are high-dimensional, established data mining and pattern recognition methods are not well suited to processing them. As a result, several time series representations have been developed that achieve a significant reduction in dimensionality without losing important features. Each representation has its own advantages and disadvantages. In this paper, the characteristics desired in an ideal time series representation are identified from the requirements of key data mining applications such as clustering, classification, and query by content. Using these characteristics as metrics, widely known time series representation methods are evaluated to determine the extent to which each satisfies the requirements.
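To make the idea of dimensionality reduction concrete, the following minimal sketch shows piecewise aggregate approximation (PAA), one widely known representation, which replaces each equal-width frame of a series with its mean. It is an illustration only, not the evaluation procedure of this paper; the function name `paa` and the parameter `n_segments` are hypothetical names chosen for the example.

```python
from statistics import mean


def paa(series, n_segments):
    """Reduce `series` to `n_segments` values by averaging near-equal-width frames."""
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    # Integer arithmetic gives frame boundaries that also work when
    # len(series) is not divisible by n_segments.
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    return [mean(series[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
reduced = paa(series, 4)
print(reduced)  # [1.5, 3.5, 5.5, 7.5]
```

Here eight points are reduced to four, and the overall upward trend is preserved while point-level detail is lost. This trade-off between compactness and retained features is the kind of property a comparison of representations would need to weigh.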
