Efficient Classification of Long Time Series by 3-D Dynamic Time Warping

In recent years, dynamic time warping (DTW) has remained a robust similarity measure for time series classification (TSC). The 1-nearest neighbor (1-NN) algorithm with DTW is the most widely used classification method for time series and serves as a benchmark. With the increasing demand for TSC on low-resource devices and the spread of wearable devices, the need for an efficient and accurate time series classifier has never been greater. Although 1-NN DTW attains accurate results, it is inefficient because its complexity is quadratic in the length of the time series. In this paper, we propose a new approximation method that reduces the length of the time series given as input to DTW. We call it control chart approximation (CCA), after a similar concept used in statistical quality control. The CCA representation approximates a raw time series by transforming it into a set of segments with aggregated values and durations, forming a reduced sequence of 3-D vectors. We also propose an adaptation of DTW to 3-D space as the distance measure for the 1-NN classifier, and denote the method 1-NN 3-D DTW. Our experiments on 85 datasets from the UCR archive, including 28 datasets of long time series (>500 points), show a speedup in running time of up to two orders of magnitude over the state-of-the-art 1-NN DTW implementation. Moreover, the method achieves similar or better accuracy on the long time series in the experiments.
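To make the cost argument concrete, the following is a minimal sketch of textbook DTW over sequences of fixed-dimension points, combined with a 1-NN classifier. It is not the authors' CCA or 3-D DTW implementation: the function names `dtw_distance` and `classify_1nn` are hypothetical, the squared Euclidean local cost is an assumption, and the 3-D tuples in the usage lines are placeholder values whose components carry no particular meaning. The nested loop fills an n-by-m table, which is where the quadratic cost comes from; shortening the input sequences shrinks that table accordingly.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW between two sequences of equal-dimension points (tuples).

    Assumption: squared Euclidean local cost, no warping window.
    """
    n, m = len(a), len(b)
    inf = math.inf
    # prev[j] holds the minimal cumulative cost of aligning a[:i-1] with b[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            # Local cost between the i-th point of a and the j-th point of b.
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1]))
            # Extend the cheapest of the three admissible predecessor cells.
            curr[j] = d + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def classify_1nn(query, train):
    """Return the label of the training sequence closest to query under DTW.

    train is a list of (sequence, label) pairs.
    """
    best_label, best_dist = None, math.inf
    for seq, label in train:
        d = dtw_distance(query, seq)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Usage with placeholder 3-D points (illustrative values only).
train = [
    ([(0, 0, 0), (1, 1, 1), (2, 2, 2)], "up"),
    ([(2, 2, 2), (1, 1, 1), (0, 0, 0)], "down"),
]
query = [(0, 0, 0), (0, 0, 0), (1, 1, 1), (2, 2, 2)]
print(classify_1nn(query, train))
```

Because the table has n times m cells, halving both sequence lengths cuts the work per comparison to roughly a quarter, which illustrates why a reduced representation can pay off on long series.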
