Impact of the Sakoe-Chiba Band on the DTW Time Series Distance Measure for kNN Classification

For time-series classification, the simple 1-nearest neighbor (1NN) classifier combined with an elastic distance measure such as Dynamic Time Warping (DTW) is considered more accurate than many more elaborate methods, including k-nearest neighbor (kNN) with neighborhood size k > 1. In this paper we revisit this apparently peculiar relationship and investigate the differences between 1NN and kNN classifiers in the context of time-series data and constrained DTW distance. By varying the neighborhood size k and the constraint width r, and by evaluating 1NN and kNN with and without distance-based weighting under different cross-validation schemes, we show that the first nearest neighbor indeed has special significance in labeled time-series data, but also that weighting can drastically improve the accuracy of kNN. The improvement shows in three ways: weighted kNN is more accurate than 1NN for small values of k (3–4), weighted kNN is more accurate than unweighted kNN in general, and weighted kNN has less need for large values of the constraint r.
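The two ingredients named above, DTW constrained by a Sakoe-Chiba band of width r and kNN with distance-based weighting, can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper. The function names `dtw_band` and `knn_predict` are hypothetical, r is taken here as a half-width in samples (an assumption; it is often given as a fraction of series length), and the weights follow Dudani's linear rule, chosen only as one common example of a distance-based scheme.

```python
import math
from collections import defaultdict


def dtw_band(a, b, r):
    """DTW distance between sequences a and b, restricted to a
    Sakoe-Chiba band of half-width r (in samples; an assumption)."""
    n, m = len(a), len(b)
    # Widen the band if lengths differ so the final cell stays reachable.
    r = max(r, abs(n - m))
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefixes align at zero cost
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        # Only cells with |i - j| <= r are filled; the rest stay infinite.
        for j in range(max(1, i - r), min(m, i + r) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            curr[j] = cost + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def knn_predict(train, query, k, r, weighted=True):
    """Classify query from train, a list of (series, label) pairs.

    With weighted=True, neighbor i gets Dudani's weight
    (d_k - d_i) / (d_k - d_1); otherwise every neighbor votes with 1.
    """
    scored = [(dtw_band(series, query, r), label) for series, label in train]
    neighbors = sorted(scored, key=lambda t: t[0])[:k]
    d_near, d_far = neighbors[0][0], neighbors[-1][0]
    votes = defaultdict(float)
    for d, label in neighbors:
        if not weighted or d_far == d_near:
            w = 1.0
        else:
            w = (d_far - d) / (d_far - d_near)
        votes[label] += w
    return max(votes, key=votes.get)


# Toy data: one close "A" series and two distant "B" series.
train = [([0.0, 0.0, 0.0], "A"), ([1.0, 1.0, 1.0], "B"), ([1.2, 1.2, 1.2], "B")]
query = [0.1, 0.1, 0.1]
print(knn_predict(train, query, k=3, r=1, weighted=False))
print(knn_predict(train, query, k=3, r=1, weighted=True))
```

On this toy data, unweighted 3NN is outvoted by the two distant "B" series, while the weighted vote follows the much closer "A" neighbor, which is the kind of effect the abstract attributes to weighting. Setting r = 0 on equal-length series reduces `dtw_band` to the Euclidean distance, and larger r allows more warping.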
