Similarity-based Retrieval of Time-Series Data Using Multi-Scale Histograms

Queries for similarity-based time series data retrieval can be categorized into two types: (1) pattern existence queries, and (2) exact match queries. Pattern existence queries find the time series data with certain patterns while exact match queries search over the time series data by specifying exact values and detailed temporal information. In this paper, we propose multi-scale time series histograms that can be used to answer both types of queries, thus offering users more flexibility. The experimental results show that multi-scale histograms can effectively find the patterns in time series data as well as answer exact match queries, even when the data contain noise, local time shifting, local time scaling, amplitude shifting and amplitude scaling.

[1]  Alberto O. Mendelzon,et al.  Efficient Retrieval of Similar Time Sequences Using DFT , 1998, FODO.

[2]  Antonin Guttman,et al.  R-trees: a dynamic index structure for spatial searching , 1984, SIGMOD '84.

[3]  Eamonn J. Keogh,et al.  Dimensionality Reduction for Fast Similarity Search in Large Time Series Databases , 2001, Knowledge and Information Systems.

[4]  Haixun Wang,et al.  Landmarks: a new model for similarity-based pattern querying in time series databases , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[5]  Giuseppe Psaila,et al.  Querying Shapes of Histories , 1995, VLDB.

[6]  Shu Lin,et al.  An Extendible Hash for Multi-Precision Similarity Querying of Image Databases , 2001, VLDB.

[7]  Dina Q. Goldin,et al.  On Similarity Queries for Time-Series Data: Constraint Specification and Implementation , 1995, CP.

[8]  Raymond T. Ng,et al.  Multiscale Similarity Matching for Subimage Queries of Arbitrary Size , 1998, VDB.

[9]  Wesley W. Chu,et al.  An index-based approach for similarity search supporting time warping in large sequence databases , 2001, Proceedings 17th International Conference on Data Engineering.

[10]  Renée J. Miller,et al.  Similarity search over time-series data using wavelets , 2002, Proceedings 18th International Conference on Data Engineering.

[11]  Clu-istos Foutsos,et al.  Fast subsequence matching in time-series databases , 1994, SIGMOD '94.

[12]  Christos Faloutsos,et al.  Efficiently supporting ad hoc queries in large datasets of time sequences , 1997, SIGMOD '97.

[13]  Michael J. Swain,et al.  Color indexing , 1991, International Journal of Computer Vision.

[14]  Christos Faloutsos,et al.  Efficient retrieval of similar time sequences under time warping , 1998, Proceedings 14th International Conference on Data Engineering.

[15]  Dimitrios Gunopulos,et al.  A Wavelet-Based Anytime Algorithm for K-Means Clustering of Time Series , 2003 .

[16]  A. Guttmma,et al.  R-trees: a dynamic index structure for spatial searching , 1984 .

[17]  Nasser Yazdani,et al.  Matching and indexing sequences of different lengths , 1997, CIKM '97.

[18]  Christos Faloutsos,et al.  Efficient Similarity Search In Sequence Databases , 1993, FODO.

[19]  Changzhou Wang,et al.  Supporting fast search in time series for movement patterns in multiple scales , 1998, CIKM '98.

[20]  Hagit Shatkay,et al.  Approximate queries and representations for large data sequences , 1996, Proceedings of the Twelfth International Conference on Data Engineering.

[21]  Eamonn Keogh Exact Indexing of Dynamic Time Warping , 2002, VLDB.

[22]  Christos Faloutsos,et al.  Fast Time Sequence Indexing for Arbitrary Lp Norms , 2000, VLDB.

[23]  Donald J. Berndt,et al.  Finding Patterns in Time Series: A Dynamic Programming Approach , 1996, Advances in Knowledge Discovery and Data Mining.

[24]  Changzhou Wang,et al.  Supporting content-based searches on time series via approximation , 2000, Proceedings. 12th International Conference on Scientific and Statistica Database Management.

[25]  Hans-Jörg Schek,et al.  A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces , 1998, VLDB.

[26]  James Lee Hafner,et al.  Efficient Color Histogram Indexing for Quadratic Form Distance Functions , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[27]  Margrit Betke,et al.  THE CAMERA MOUSE: PRELIMINARY INVESTIGATION OF AUTOMATED VISUAL TRACKING FOR COMPUTER ACCESS , 2000 .

[28]  Jessica Lin,et al.  Finding Motifs in Time Series , 2002, KDD 2002.

[29]  Dimitrios Gunopulos,et al.  Discovering similar multidimensional trajectories , 2002, Proceedings 18th International Conference on Data Engineering.

[30]  Dimitrios Gunopulos,et al.  Finding Similar Time Series , 1997, PKDD.

[31]  Eamonn J. Keogh,et al.  On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration , 2002, Data Mining and Knowledge Discovery.

[32]  Ada Wai-Chee Fu,et al.  Efficient time series matching by wavelets , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).