A distance based time series classification framework

Abstract: Time series classification is one of the challenging tasks in machine learning. It differs from standard classification mainly in that time shifts across the series must first be corrected with a suitable alignment algorithm. In this study, we propose a framework for distance-based time series classification that lets users easily apply different alignment and classification methods to different time series datasets. The framework can be extended with new alignment and classification algorithms. Using the framework, we implemented the k-Nearest Neighbor and Support Vector Machine classifiers, as well as the alignment methods Dynamic Time Warping, Signal Alignment via Genetic Algorithm, Parametric Time Warping and Canonical Time Warping. We also evaluated the framework on the UCR time series repository, and the results indicate that a suitable alignment method improves classification performance on nearly every dataset.
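As an illustration of the idea summarized in the abstract, the following is a minimal sketch of distance-based classification in which the alignment step is Dynamic Time Warping and the classifier is k-Nearest Neighbor. It is not the framework's actual code: the function names (`dtw_distance`, `euclidean_distance`, `knn_classify`), the squared-difference local cost, and the toy data are assumptions made for this example only.

```python
from collections import Counter
from math import inf


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) Dynamic Time Warping distance (illustrative)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (a[i - 1] - b[j - 1]) ** 2  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched to b[j-1] again
                cost[i][j - 1],      # b[j-1] is matched to a[i-1] again
                cost[i - 1][j - 1],  # both series advance
            )
    return cost[n][m] ** 0.5


def euclidean_distance(a, b):
    """Point-by-point distance with no alignment, for comparison."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def knn_classify(query, train_series, train_labels, distance=dtw_distance, k=1):
    """Majority vote among the k training series closest to the query."""
    ranked = sorted(
        zip(train_series, train_labels),
        key=lambda pair: distance(query, pair[0]),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: one series with a peak, one flat series.
train = [[0, 1, 2, 1, 0, 0], [0, 0, 0, 0, 0, 0]]
labels = ["peak", "flat"]
query = [0, 0, 0, 1, 2, 1]  # the same peak, shifted later in time

print(knn_classify(query, train, labels, distance=dtw_distance))
print(knn_classify(query, train, labels, distance=euclidean_distance))
```

On this toy data the shifted peak is assigned to the "peak" class when the distance includes alignment, and to the "flat" class when it is compared point by point. Passing the distance as a parameter mirrors, in a simplified way, the idea of swapping alignment methods independently of the classifier.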
