A fast shapelet selection algorithm for time series classification

Abstract. Time series classification has attracted significant interest over the past decade. Shapelet-based algorithms are one promising approach: they are interpretable, more accurate, and faster than most classifiers. However, their training time is high, even though training is performed off-line. To overcome this problem, we propose a fast shapelet selection algorithm (FSS) that sharply reduces the time spent on shapelet selection. FSS first samples some time series from the training dataset with the help of a subclass splitting method. It then identifies the local farthest deviation points (LFDPs) of the sampled time series and selects the subsequences between two nonadjacent LFDPs as shapelet candidates. Together, these two steps greatly reduce the number of shapelet candidates, which leads to a marked reduction in time complexity. Other methods accelerate shapelet selection at the expense of accuracy; in contrast, our experimental results demonstrate that FSS is thousands of times faster than the original shapelet transformation method with no reduction in accuracy. Our results also demonstrate that FSS is the fastest among the shapelet-based methods that achieve a leading level of accuracy.
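The candidate-generation step described above can be illustrated with a short Python sketch. The abstract does not define LFDPs precisely, so this sketch makes an assumption: an LFDP is taken to be the point that deviates most (in perpendicular distance) from the straight line joining its two neighbouring selected points, found greedily starting from the series endpoints. The function names `find_lfdps` and `shapelet_candidates`, the `num_points` parameter, and the choice to include both bounding LFDPs in each candidate are hypothetical and not taken from the paper. The subclass-splitting sampling step is not shown.

```python
import math


def point_line_distance(series, i, a, b):
    """Perpendicular distance of point (i, series[i]) from the line
    through (a, series[a]) and (b, series[b])."""
    x1, y1, x2, y2 = a, series[a], b, series[b]
    num = abs((y2 - y1) * i - (x2 - x1) * series[i] + x2 * y1 - y2 * x1)
    return num / math.hypot(x2 - x1, y2 - y1)


def find_lfdps(series, num_points):
    """Assumed LFDP search: start from the two endpoints and repeatedly add
    the point with the largest deviation from its enclosing segment, until
    num_points indices are selected or no point deviates at all."""
    points = [0, len(series) - 1]
    while len(points) < num_points:
        best_dist, best_idx = 0.0, None
        for a, b in zip(points, points[1:]):
            for i in range(a + 1, b):
                d = point_line_distance(series, i, a, b)
                if d > best_dist:
                    best_dist, best_idx = d, i
        if best_idx is None:  # every remaining point lies on a segment
            break
        points.append(best_idx)
        points.sort()
    return points


def shapelet_candidates(series, lfdps):
    """Subsequences bounded by two nonadjacent LFDPs (both ends included,
    which is an assumption of this sketch)."""
    candidates = []
    for i in range(len(lfdps)):
        for j in range(i + 2, len(lfdps)):  # skip adjacent pairs
            candidates.append(series[lfdps[i]:lfdps[j] + 1])
    return candidates


series = [0, 1, 0, 0, 0, 3, 0, 0]
lfdps = find_lfdps(series, num_points=4)
candidates = shapelet_candidates(series, lfdps)
```

Under these assumptions, k selected points yield (k-1)(k-2)/2 candidates per series, compared with a number quadratic in the series length when every subsequence is enumerated, which is the kind of reduction the abstract attributes to the LFDP step.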
