Anticipatory DTW for Efficient Similarity Search in Time Series Databases

Time series arise in many different applications in the form of sensor data, stock data, videos, and other time-related information. Analysis of this data typically requires searching for similar time series in a database. Dynamic Time Warping (DTW) is a widely used high-quality distance measure for time series. As DTW is computationally expensive, efficient algorithms for fast computation are crucial. In this paper, we propose a novel filter-and-refine DTW algorithm called Anticipatory DTW. Existing algorithms aim at efficiently finding similar time series by filtering the database with a lower bound and computing the exact DTW only in the refinement step. Unlike these algorithms, our approach exploits previously unused information from the filter step during the refinement, allowing for faster rejection of false candidates. We characterize a class of applicable filters for our approach, which comprises state-of-the-art lower bounds of the DTW. Our novel anticipatory pruning incurs hardly any overhead and produces no false dismissals. We demonstrate substantial efficiency improvements in thorough experiments on synthetic and real-world time series databases and show that our technique scales well to multivariate time series, long time series, and wide DTW bands.
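The filter-and-refine idea summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes univariate series of equal length, squared point differences, a Sakoe-Chiba band, an LB_Keogh-style envelope as the filter, and a range query with a fixed threshold. All function names (`envelope`, `lb_terms`, `anticipatory_dtw`, `range_search`) are hypothetical. The per-point filter terms are kept, and during refinement the smallest partial DTW cost in the current column plus the filter terms of the remaining columns is compared against the threshold. Every warping path must visit each remaining column at least once, so this sum never overestimates the final distance, and abandoning on it cannot cause a false dismissal.

```python
import math

def envelope(q, band):
    """Lower/upper envelope of query q within a Sakoe-Chiba band (assumed filter setup)."""
    n = len(q)
    lower = [min(q[max(0, i - band):min(n, i + band + 1)]) for i in range(n)]
    upper = [max(q[max(0, i - band):min(n, i + band + 1)]) for i in range(n)]
    return lower, upper

def lb_terms(c, lower, upper):
    """Per-point LB_Keogh-style terms; their sum lower-bounds the banded DTW."""
    terms = []
    for x, lo, hi in zip(c, lower, upper):
        if x > hi:
            terms.append((x - hi) ** 2)
        elif x < lo:
            terms.append((lo - x) ** 2)
        else:
            terms.append(0.0)
    return terms

def anticipatory_dtw(q, c, band, terms, threshold):
    """Banded DTW of q and c; returns math.inf if abandoned early.

    If not abandoned, the exact distance is returned (it may still
    exceed the threshold, so the caller compares it once more).
    """
    n = len(q)
    # suffix[i] = sum of the filter terms for columns i..n-1
    suffix = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + terms[i]
    prev = [math.inf] * n
    for i in range(n):                      # i indexes the candidate c
        cur = [math.inf] * n
        for j in range(max(0, i - band), min(n, i + band + 1)):  # j indexes q
            if i == 0 and j == 0:
                best = 0.0
            else:
                left = cur[j - 1] if j > 0 else math.inf
                diag = prev[j - 1] if j > 0 else math.inf
                best = min(prev[j], left, diag)
            cur[j] = (c[i] - q[j]) ** 2 + best
        # Anticipatory check: partial cost so far plus the lower-bound
        # terms of the columns not yet processed.
        if min(cur) + suffix[i + 1] > threshold:
            return math.inf
        prev = cur
    return prev[n - 1]

def range_search(q, database, band, threshold):
    """Filter with the summed lower bound, refine with anticipatory DTW."""
    lower, upper = envelope(q, band)
    hits = []
    for idx, c in enumerate(database):
        terms = lb_terms(c, lower, upper)
        if sum(terms) > threshold:          # filter step
            continue
        d = anticipatory_dtw(q, c, band, terms, threshold)  # refinement step
        if d <= threshold:
            hits.append((idx, d))
    return hits

query = [0.0, 1.0, 2.0, 3.0]
db = [[0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 1.0, 2.0], [10.0, 10.0, 10.0, 10.0]]
print(range_search(query, db, band=1, threshold=1.5))  # → [(0, 0.0), (1, 1.0)]
```

In this toy query the third series is rejected by the filter alone. A candidate such as `[0.0, 3.0, 3.0, 0.0]` with threshold 5.5 passes the filter (summed lower bound 5) but is abandoned after the second column, which is the case the anticipatory check is meant to speed up.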
