Optimization of subsequence matching under time warping in time-series databases

This paper discusses effective processing of subsequence matching under time warping in time-series databases. Time warping is a transformation that enables finding of sequences with similar patterns even when they are of different lengths. Through a preliminary experiment, we first point out that Naive-Scan, a basic method for processing of subsequence matching under time warping, has its performance bottleneck in the CPU processing step. For optimizing this step, in this paper, we propose a novel method that eliminates all possible redundant calculations. It is verified that this method is not only an optimal one for processing Naive-Scan, but also does not incur any false dismissals. Our experimental results showed that the proposed method can make great improvement in performance of subsequence matching under time warping. Especially, Naive-Scan, which has been known to show the worst performance, performs much better than LB-Scan as well as ST-Filter in all the cases by employing the proposed method for CPU processing. This result is interesting and valuable in that the performance inversion among Naive-Scan, LB-Scan, and ST-Filter has occurred by optimizing the CPU processing step, which is their common performance bottleneck.

[1]  Sang-Wook Kim,et al.  Index interpolation: an approach to subsequence matching supporting normalization transform in time-series databases , 2000, CIKM '00.

[2]  Clu-istos Foutsos,et al.  Fast subsequence matching in time-series databases , 1994, SIGMOD '94.

[3]  Sriram Padmanabhan,et al.  Prefix-querying: an approach for effective subsequence matching under time warping in sequence databases , 2001, CIKM '01.

[4]  Christos Faloutsos,et al.  Efficient retrieval of similar time sequences under time warping , 1998, Proceedings 14th International Conference on Data Engineering.

[5]  Wesley W. Chu,et al.  Efficient searches for similar subsequences of different lengths in sequence databases , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[6]  Davood Rafiei,et al.  On similarity-based queries for time series data , 1997, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[7]  Kyuseok Shim,et al.  Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases , 1995, VLDB.

[8]  Man Hon Wong,et al.  Fast time-series searching with scaling and shifting , 1999, PODS '99.

[9]  Donald J. Berndt,et al.  Finding Patterns in Time Series: A Dynamic Programming Approach , 1996, Advances in Knowledge Discovery and Data Mining.

[10]  W. K. Loh Index Interpolation: A Subsequence Matching Algroithm Supporting Moving Average Transforms of Arbitrary Order in Time-Series Databases , 2001 .

[11]  Christos Faloutsos,et al.  Efficient Similarity Search In Sequence Databases , 1993, FODO.

[12]  Wesley W. Chu,et al.  Efficient processing of similarity search under time warping in sequence databases: an index-based approach , 2004, Inf. Syst..

[13]  Kim Sang-Wook,et al.  Subsequence Matching Under Time Warping in Time-Series Databases : Observation, Optimization, and Performance Results , 2004 .

[14]  Dimitrios Gunopulos,et al.  Finding Similar Time Series , 1997, PKDD.