Evaluating Improvements to the Shapelet Transform

The Shapelet tree algorithm was proposed in 2009 as a novel way to find phase-independent subsequences that could be used for time series classification. The shapelet discovery algorithm is O(n^2 m^4), where n is the number of cases and m is the length of the series. Several methods have sought to speed up shapelet discovery. The ShapeletTransform reduces discovery to a single pass, and FastShapelets smooths the series and reduces their length through PAA and SAX. However, neither of these techniques can enumerate all shapelets on the largest datasets in the UCR repository. We first evaluate whether the FastShapelet algorithm is better as a transform. Secondly, we provide a contract classifier for the shapelet transform: by calculating the number of fundamental operations, we can estimate the run time of the algorithm and sample the data to fulfil this contract. We found that whilst the FastShapeletTransform does drastically reduce the operation count of finding shapelets, it is not significantly better than FastShapelets, nor can it compete with the ShapeletTransform. The factory method for sampling the data is competitive with the ShapeletTransform, and in some cases it yields minor improvements despite being much faster.
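
The contract idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the operation-counting model (one operation per point compared in a sliding-window distance) and the default minimum shapelet length of 3 are assumptions made for illustration. The sketch counts the operations needed for full enumeration and then picks the largest number of series that fits within an operation budget.

```python
def shapelet_op_count(n, m, min_len=3, max_len=None):
    """Approximate operation count for full shapelet enumeration.

    Assumed cost model (illustrative only): every subsequence of every
    series is a candidate, and each candidate is compared against every
    window of the same length in the other n - 1 series, at a cost of
    one operation per point compared.
    """
    if max_len is None:
        max_len = m
    total = 0
    for length in range(min_len, max_len + 1):
        windows = m - length + 1               # windows of this length per series
        candidates = n * windows               # candidates across all series
        per_candidate = (n - 1) * windows * length
        total += candidates * per_candidate
    return total


def largest_sample_within_budget(n, m, budget_ops, min_len=3):
    """Largest number of series (at most n) whose full enumeration fits
    within budget_ops. Returns 0 if even two series exceed the budget."""
    best = 0
    for k in range(2, n + 1):
        if shapelet_op_count(k, m, min_len) > budget_ops:
            break                              # cost only grows with k
        best = k
    return best


if __name__ == "__main__":
    n, m = 500, 200
    full = shapelet_op_count(n, m)
    # Hypothetical budget: one thousandth of the full enumeration cost.
    k = largest_sample_within_budget(n, m, full // 1000)
    print(f"full enumeration: {full} ops; series within budget: {k}")
```

Under this model the count grows quadratically in n and roughly with the fourth power of m, which matches the stated complexity. Converting an operation budget to a time limit would need a machine-specific calibration of operations per second, which this sketch leaves out.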
