A tighter lower bound estimate for dynamic time warping

In this paper, we propose a new lower-bound estimate for speeding up dynamic time warping (DTW) on multivariate time sequences. It has several advantages over the inner-product lower bound [1], which was recently proposed to eliminate a large number of DTW computations. First, we prove that it is tighter than the inner-product lower bound while its computational complexity remains comparable. Second, the inner-product lower bound is specifically designed for the inner-product distance, whereas the proposed lower bound is valid for any distance measure. Third, DTW search can be sped up further because the distance matrix is already calculated at the lower-bound estimation stage. Spoken term detection experiments on the TIMIT corpus show that the proposed lower-bound estimate reduces the computational requirements of DTW-KNN search by 54% compared with the inner-product lower bound.
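To illustrate how a lower bound prunes DTW computations and how a precomputed distance matrix can be reused, the following is a minimal sketch and not the bound proposed in this paper, which the abstract does not specify. It assumes a simple stand-in bound (the sum, over query frames, of the minimum local distance to any candidate frame), unnormalized DTW with the standard three-way step pattern, Euclidean local distance, and a 1-nearest-neighbour search. All function names are hypothetical.

```python
import math


def distance_matrix(query, candidate):
    """Local distances D[i][j] between query frame i and candidate frame j.

    Euclidean distance is an assumption; any non-negative measure would do.
    """
    return [[math.dist(q, c) for c in candidate] for q in query]


def row_min_lower_bound(dist):
    """Stand-in lower bound (not the paper's): sum of per-row minima.

    Every warping path visits each query frame at least once and all local
    distances are non-negative, so the path cost cannot be smaller than this.
    """
    return sum(min(row) for row in dist)


def dtw_from_matrix(dist):
    """Unnormalized DTW cost computed from an already available distance matrix."""
    n, m = len(dist), len(dist[0])
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
            acc[i][j] = dist[i - 1][j - 1] + best_prev
    return acc[n][m]


def nearest_neighbor(query, candidates):
    """1-NN search that skips DTW whenever the lower bound cannot beat the best cost."""
    best_cost, best_idx, n_dtw = math.inf, -1, 0
    for idx, cand in enumerate(candidates):
        dist = distance_matrix(query, cand)      # computed once per candidate
        if row_min_lower_bound(dist) >= best_cost:
            continue                             # pruned: full DTW not needed
        n_dtw += 1
        cost = dtw_from_matrix(dist)             # reuses the same matrix
        if cost < best_cost:
            best_cost, best_idx = cost, idx
    return best_idx, best_cost, n_dtw


if __name__ == "__main__":
    query = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    candidates = [
        [(5.0, 5.0), (6.0, 5.0), (7.0, 5.0)],
        [(0.0, 0.1), (1.0, 0.1), (2.0, 0.1), (2.0, 0.1)],
        [(9.0, 9.0), (9.0, 9.0)],
    ]
    idx, cost, n_dtw = nearest_neighbor(query, candidates)
    print(idx, round(cost, 3), n_dtw)
```

In this toy run the third candidate is rejected by the bound alone, so only two of the three DTW computations are carried out. A tighter bound would reject more candidates at the same stage, which is the effect the abstract reports.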

[1] James R. Glass, et al. An inner-product lower-bound estimate for dynamic time warping, 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[2] Wesley W. Chu, et al. An index-based approach for similarity search supporting time warping in large sequence databases, 2001, Proceedings 17th International Conference on Data Engineering.

[3] Patrick Wambacq, et al. Template-Based Continuous Speech Recognition, 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[4] R. Manmatha, et al. Lower-Bounding of Dynamic Time Warping Distances for Multivariate Time Series, 2003.

[5] James R. Glass, et al. Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams, 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.

[6] Qiang Huang, et al. Speech Pattern Discovery using Audio-Visual Fusion and Canonical Correlation Analysis, 2012, INTERSPEECH.

[7] Bin Ma, et al. An acoustic segment modeling approach to query-by-example spoken term detection, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[8] Eamonn J. Keogh, et al. Searching and Mining Trillions of Time Series Subsequences under Dynamic Time Warping, 2012, KDD.

[9] James R. Glass, et al. Unsupervised Pattern Discovery in Speech, 2008, IEEE Transactions on Audio, Speech, and Language Processing.

[10] James R. Glass, et al. A Piecewise Aggregate Approximation Lower-Bound Estimate for Posteriorgram-Based Dynamic Time Warping, 2011, INTERSPEECH.

[11] Eamonn J. Keogh, et al. Exact indexing of dynamic time warping, 2002, Knowledge and Information Systems.

[12] Christos Faloutsos, et al. Efficient retrieval of similar time sequences under time warping, 1998, Proceedings 14th International Conference on Data Engineering.

[13] Li Deng, et al. Structure-based and template-based automatic speech recognition - comparing parametric and non-parametric approaches, 2007, INTERSPEECH.

[14] Bin Ma, et al. Acoustic TextTiling for story segmentation of spoken documents, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).