Efficient retrieval of similar time sequences under time warping

Fast similarity searching in large time sequence databases has typically used Euclidean distance as a dissimilarity metric. However, for several applications, including matching of voice, audio and medical signals (e.g., electrocardiograms), one is required to permit local accelerations and decelerations in the rate of sequences, leading to a popular, field tested dissimilarity metric called the "time warping" distance. From the indexing viewpoint, this metric presents two major challenges: (a) it does not lead to any natural indexable "features", and (b) comparing two sequences requires time quadratic in the sequence length. To address each problem, we propose to use: (a) a modification of the so called "FastMap", to map sequences into points, with little compromise of "recall" (typically zero); and (b) a fast linear test, to help us discard quickly many of the false alarms that FastMap will typically introduce. Using both ideas in cascade, our proposed method achieved up to an order of magnitude speed-up over sequential scanning on both real and synthetic datasets.

[1]  Miron Livny,et al.  Sequence query processing , 1994, SIGMOD '94.

[2]  Alberto O. Mendelzon,et al.  Similarity-based queries for time series data , 1997, SIGMOD '97.

[3]  Christos Faloutsos,et al.  FastMap: a fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets , 1995, SIGMOD '95.

[4]  Christos Faloutsos,et al.  A signature technique for similarity-based queries , 1997, Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171).

[5]  Joseph B. Kruskal,et al.  Time Warps, String Edits, and Macromolecules , 1999 .

[6]  Christos Faloutsos,et al.  Fast Nearest Neighbor Search in Medical Image Databases , 1996, VLDB.

[7]  Kyuseok Shim,et al.  Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases , 1995, VLDB.

[8]  Alberto O. Mendelzon,et al.  Similarity-based queries , 1995, PODS '95.

[9]  C. Faloutsos Eecient Similarity Search in Sequence Databases , 1993 .

[10]  David Sankoff,et al.  Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison , 1983 .

[11]  Clu-istos Foutsos,et al.  Fast subsequence matching in time-series databases , 1994, SIGMOD '94.

[12]  Dina Q. Goldin,et al.  On Similarity Queries for Time-Series Data: Constraint Specification and Implementation , 1995, CP.

[13]  Christos Faloutsos,et al.  Efficient Similarity Search In Sequence Databases , 1993, FODO.

[14]  Michael McGill,et al.  Introduction to Modern Information Retrieval , 1983 .

[15]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.