A subsequence matching algorithm supporting moving average transform of arbitrary order in time-series databases using index interpolation

This paper proposes a subsequence matching algorithm that supports a moving-average transform of arbitrary order in time-series databases. Applying the existing subsequence matching algorithm of C. Faloutsos et al. (1994) to this problem requires a separate index for each moving-average order, which incurs serious storage and CPU time overheads. We solve this problem using index interpolation. The proposed algorithm needs only a few indexes, built for pre-selected moving-average orders k, and performs subsequence matching for an arbitrary order m (m ≤ k). We prove that the proposed algorithm causes no false dismissal. For selectivities less than 10^-2, the search performance degrades by no more than 17.2% compared with the fully indexed case when two out of 128 indexes are used. The algorithm performs better at smaller selectivities.
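To make the problem setting concrete, the following is a minimal sketch of an order-k moving-average transform and an index-free sequential-scan baseline for matching under that transform. It is not the index-interpolation algorithm of the paper. The function names (`moving_average`, `euclidean`, `scan_match`), the use of Euclidean distance, and the tolerance parameter `eps` are assumptions made for illustration.

```python
from math import sqrt


def moving_average(seq, k):
    """Order-k moving average: the mean of every window of k consecutive values."""
    if k < 1 or k > len(seq):
        raise ValueError("k must satisfy 1 <= k <= len(seq)")
    window_sum = sum(seq[:k])
    out = [window_sum / k]
    # Slide the window one step at a time, updating the running sum.
    for i in range(k, len(seq)):
        window_sum += seq[i] - seq[i - k]
        out.append(window_sum / k)
    return out


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences (assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def scan_match(data, query, m, eps):
    """Sequential-scan baseline (no index): return the offsets at which the
    order-m transformed data lies within eps of the order-m transformed query."""
    t_data = moving_average(data, m)
    t_query = moving_average(query, m)
    n = len(t_query)
    return [
        i
        for i in range(len(t_data) - n + 1)
        if euclidean(t_data[i:i + n], t_query) <= eps
    ]
```

A scan like this supports any order m without an index but touches every offset of the data sequence, which is the cost that an index-based method aims to avoid.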

[1]  Christos Faloutsos,et al.  Efficient retrieval of similar time sequences under time warping , 1998, Proceedings 14th International Conference on Data Engineering.

[2]  Christos Faloutsos,et al.  The R+-Tree: A Dynamic Index for Multi-Dimensional Objects , 1987, VLDB.

[3]  Antonin Guttman,et al.  R-trees: a dynamic index structure for spatial searching , 1984, SIGMOD '84.

[4]  Christos Faloutsos,et al.  Efficient Similarity Search In Sequence Databases , 1993, FODO.

[5]  Dina Q. Goldin,et al.  On Similarity Queries for Time-Series Data: Constraint Specification and Implementation , 1995, CP.

[6]  Chris Chatfield,et al.  The Analysis of Time Series: An Introduction , 1981 .

[7]  Kyuseok Shim,et al.  Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases , 1995, VLDB.

[8]  Maurice G. Kendall,et al.  Time-Series. 2nd edn. , 1976 .

[9]  Alberto O. Mendelzon,et al.  Similarity-based queries for time series data , 1997, SIGMOD '97.

[10]  Ada Wai-Chee Fu,et al.  Efficient time series matching by wavelets , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[11]  Hans-Jörg Schek,et al.  A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces , 1998, VLDB.

[12]  Hans-Peter Kriegel,et al.  The R*-tree: an efficient and robust access method for points and rectangles , 1990, SIGMOD '90.

[13]  Christos Faloutsos,et al.  Fast subsequence matching in time-series databases , 1994, SIGMOD '94.

[14]  William H. Press,et al.  Numerical recipes in C. The art of scientific computing , 1987 .