Random Warping Series: A Random Features Method for Time-Series Embedding

Time series data analytics has been a problem of substantial interest for decades, and Dynamic Time Warping (DTW) has been the most widely adopted technique for measuring dissimilarity between time series. A number of global-alignment kernels have since been proposed in the spirit of DTW to extend its use to kernel-based estimation methods such as support vector machines. However, those kernels suffer from diagonal dominance of the Gram matrix and from quadratic complexity w.r.t. the sample size. In this work, we study a family of alignment-aware positive definite (p.d.) kernels whose feature embedding is given by a distribution of \emph{Random Warping Series (RWS)}. The proposed kernel does not suffer from diagonal dominance and naturally admits a \emph{Random Features} (RF) approximation, which reduces the computational complexity of existing DTW-based techniques from quadratic to linear in both the number and the length of time series. We also study the convergence of the RF approximation for the domain of time series of unbounded length. Our extensive experiments on 16 benchmark datasets demonstrate that RWS outperforms or matches state-of-the-art classification and clustering methods in both accuracy and computational time. Our code and data are available at \url{this https URL}.
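
To make the idea of a random-features embedding built from DTW alignments concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the sampling distribution or the exact feature map, so everything here is an assumption made for illustration: the random series are drawn with Gaussian entries and uniformly random short lengths, each feature is exp(-gamma * DTW(x, w)), and the names `dtw_distance`, `rws_embed`, `num_features`, `max_len`, `sigma`, and `gamma` are hypothetical.

```python
import numpy as np


def dtw_distance(x, y):
    """Plain dynamic-programming DTW with squared pointwise cost.

    Runs in O(len(x) * len(y)) time.
    """
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]


def rws_embed(series_list, num_features=32, max_len=5, sigma=1.0, gamma=1.0, seed=0):
    """Map variable-length series to fixed-length vectors (illustrative only).

    Assumed construction: draw `num_features` short random series, align each
    input series to each random series with DTW, and use exp(-gamma * DTW)
    as the feature value.
    """
    rng = np.random.default_rng(seed)
    # Each random series has a random length in [1, max_len] and Gaussian entries.
    random_series = [
        rng.normal(0.0, sigma, size=int(rng.integers(1, max_len + 1)))
        for _ in range(num_features)
    ]
    Z = np.empty((len(series_list), num_features))
    for i, x in enumerate(series_list):
        for j, w in enumerate(random_series):
            Z[i, j] = np.exp(-gamma * dtw_distance(x, w))
    # Scale so that Z @ Z.T is a Monte Carlo average over the random series.
    return Z / np.sqrt(num_features)


# Usage: three series of different lengths become rows of one feature matrix.
series = [
    np.array([0.0, 0.5, 1.0, 0.5, 0.0]),
    np.array([0.0, 0.0, 0.5, 1.0, 1.0, 0.5, 0.0]),
    np.array([1.0, -1.0, 1.0]),
]
Z = rws_embed(series)
K = Z @ Z.T  # approximate kernel matrix; positive semidefinite by construction
```

In this sketch each input series is aligned only against `num_features` random series whose length is bounded by `max_len`, so the cost grows linearly with the number of series and with their length. The resulting rows of `Z` can be passed to any linear classifier or to k-means.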
