TIME SERIES CLASSIFICATION BASED ON THE LONGEST COMMON SUBSEQUENCE SIMILARITY AND ENSEMBLE LEARNING

The dynamic time warping (DTW) algorithm provides a powerful way to measure the distance between two time series. However, DTW may not be suitable for every type of time series. This paper proposes a similarity measure for two real-valued time series, based on the longest common subsequence (LCS) problem combined with data diversity. In addition, to reduce the error rate of time series classification, the behavior knowledge space (BKS) method is used to build ensemble classifiers that combine three classifiers: DTW with a warping window (DTWW), derivative dynamic time warping (DDTW), and LCS. The experimental results show that the LCS similarity measure with diversity achieves accuracy comparable to that of the DTW algorithm. In addition, the BKS method reduces the error rate by about 20% compared with the previously best-known DTWW method.
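
As an illustration of the two building blocks named in the abstract, the following is a minimal sketch and not the paper's exact method. It assumes that two real values "match" when they differ by at most a tolerance eps (a common way to apply LCS to real numbers), and it omits the diversity term, which the abstract does not specify. The BKS part follows the usual lookup-table formulation: each combination of base-classifier decisions indexes a cell, and the cell predicts the true label seen most often for that combination in training data. The names lcs_similarity, bks_fit, bks_predict and the majority-vote fallback for unseen cells are assumptions made for this example.

```python
from collections import Counter, defaultdict


def lcs_similarity(a, b, eps):
    """LCS length of two real-valued series, normalised to [0, 1].

    Assumption: values match when |x - y| <= eps. The result is divided
    by the length of the shorter series.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        return 0.0
    prev = [0] * (m + 1)  # DP row for the first i-1 elements of a
    for i in range(1, n + 1):
        curr = [0] * (m + 1)
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                curr[j] = prev[j - 1] + 1  # extend the common subsequence
            else:
                curr[j] = max(prev[j], curr[j - 1])  # skip one element
        prev = curr
    return prev[m] / min(n, m)


def bks_fit(decisions, labels):
    """Build the BKS table: decision tuple -> counts of true labels."""
    table = defaultdict(Counter)
    for combo, true_label in zip(decisions, labels):
        table[tuple(combo)][true_label] += 1
    return table


def bks_predict(table, combo):
    """Predict the most frequent true label stored for this decision tuple.

    Assumption: for a tuple never seen in training, fall back to a
    majority vote over the base classifiers' own decisions.
    """
    combo = tuple(combo)
    cell = table.get(combo)
    if cell:
        return cell.most_common(1)[0][0]
    return Counter(combo).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical toy data: three base classifiers (e.g. DTWW, DDTW, LCS).
    train_decisions = [("A", "A", "B"), ("A", "A", "B"),
                       ("A", "A", "B"), ("B", "B", "B")]
    train_labels = ["B", "B", "A", "B"]
    table = bks_fit(train_decisions, train_labels)
    print(lcs_similarity([1.0, 2.0, 3.0, 4.0], [1.1, 2.9, 4.0], eps=0.2))
    print(bks_predict(table, ("A", "A", "B")))
```

In this toy table, the cell for the decisions ("A", "A", "B") stores the true label "B" more often than "A", so BKS overrides the simple majority of the base classifiers. This is the behaviour that lets BKS exploit systematic disagreements among classifiers, at the cost of needing enough training data to populate the cells.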
