Fast and Exact Warping of Time Series Using Adaptive Segmental Approximations

Similarity search is a core module of many data analysis tasks, including search by example, classification, and clustering. For time series data, Dynamic Time Warping (DTW) has been proven a very effective similarity measure, since it minimizes the effects of shifting and distortion in time. However, the quadratic cost of DTW computation to the length of the matched sequences makes its direct application on databases of long time series very expensive. We propose a technique that decomposes the sequences into a number of segments and uses cheap approximations thereof to compute fast lower bounds for their warping distances. We present several, progressively tighter bounds, relying on the existence or not of warping constraints. Finally, we develop an index and a multi-step technique that uses the proposed bounds and performs two levels of filtering to efficiently process similarity queries. A thorough experimental study suggests that our method consistently outperforms state-of-the-art methods for DTW similarity search.

[1]  Christos Faloutsos,et al.  Fast subsequence matching in time-series databases , 1994, SIGMOD '94.

[2]  Hans-Peter Kriegel,et al.  Optimal multi-step k-nearest neighbor search , 1998, SIGMOD '98.

[3]  Hanan Samet,et al.  Distance browsing in spatial databases , 1999, TODS.

[4]  Ralph Arnote,et al.  Hong Kong (China) , 1996, OECD/G20 Base Erosion and Profit Shifting Project.

[5]  Eamonn Keogh Exact Indexing of Dynamic Time Warping , 2002, VLDB.

[6]  Christos Faloutsos,et al.  Efficient Similarity Search In Sequence Databases , 1993, FODO.

[7]  Dennis Shasha,et al.  Warping indexes with envelope transforms for query by humming , 2003, SIGMOD '03.

[8]  Pietro Perona,et al.  Continuous dynamic time warping for translation-invariant curve alignment with applications to signature verification , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[9]  Christos Faloutsos,et al.  Efficient retrieval of similar time sequences under time warping , 1998, Proceedings 14th International Conference on Data Engineering.

[10]  Dimitrios Gunopulos,et al.  Indexing multi-dimensional time-series with support for multiple distance measures , 2003, KDD '03.

[11]  Yang-Sae Moon,et al.  General match: a subsequence matching method in time-series databases based on generalized windows , 2002, SIGMOD '02.

[12]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[13]  Clu-istos Foutsos,et al.  Fast subsequence matching in time-series databases , 1994, SIGMOD '94.

[14]  Hans-Peter Kriegel,et al.  The R*-tree: an efficient and robust access method for points and rectangles , 1990, SIGMOD '90.

[15]  S. Chiba,et al.  Dynamic programming algorithm optimization for spoken word recognition , 1978 .

[16]  Wesley W. Chu,et al.  An index-based approach for similarity search supporting time warping in large sequence databases , 2001, Proceedings 17th International Conference on Data Engineering.

[17]  Donald J. Berndt,et al.  Using Dynamic Time Warping to Find Patterns in Time Series , 1994, KDD Workshop.

[18]  Peter J. Rousseeuw,et al.  Finding Groups in Data: An Introduction to Cluster Analysis , 1990 .

[19]  Eamonn J. Keogh,et al.  Iterative Deepening Dynamic Time Warping for Time Series , 2002, SDM.

[20]  Eamonn J. Keogh,et al.  Locally adaptive dimensionality reduction for indexing large time series databases , 2001, SIGMOD '01.

[21]  Joseph B. Kruskal,et al.  Time Warps, String Edits, and Macromolecules , 1999 .

[22]  Ali S. Hadi,et al.  Finding Groups in Data: An Introduction to Chster Analysis , 1991 .

[23]  Eamonn J. Keogh,et al.  Scaling up Dynamic Time Warping to Massive Dataset , 1999, PKDD.

[24]  Eamonn J. Keogh,et al.  Making Time-Series Classification More Accurate Using Learned Constraints , 2004, SDM.

[25]  Dina Q. Goldin,et al.  On Similarity Queries for Time-Series Data: Constraint Specification and Implementation , 1995, CP.

[26]  Eamonn J. Keogh,et al.  UCR Time Series Data Mining Archive , 1983 .