A shapelet transform for time series classification

The problem of time series classification (TSC), where we consider any real-valued ordered data a time series, presents a specific machine learning challenge, as the ordering of variables is often crucial in finding the best discriminating features. One of the most promising recent approaches is to find shapelets within a data set. A shapelet is a time series subsequence that is identified as being representative of class membership. The original research in this field embedded the procedure of finding shapelets within a decision tree. We propose decoupling the process of finding shapelets from the classification algorithm by means of a shapelet transformation. We describe a means of extracting the k best shapelets from a data set in a single pass, and then use these shapelets to transform the data by calculating the distance from each series to each shapelet. We demonstrate that transformation into this new data space can improve classification accuracy, whilst retaining the explanatory power provided by shapelets.
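
The transformation step described above can be illustrated with a minimal sketch. It assumes the k shapelets have already been selected, and it takes the distance from a series to a shapelet to be the smallest Euclidean distance between the shapelet and any equal-length subsequence of the series. The function names `shapelet_distance` and `shapelet_transform` are hypothetical, and subsequence normalisation and the shapelet selection procedure are deliberately omitted, so this is an illustration rather than the authors' implementation.

```python
import math
from typing import List, Sequence


def shapelet_distance(series: Sequence[float], shapelet: Sequence[float]) -> float:
    """Smallest Euclidean distance between the shapelet and any
    equal-length subsequence of the series (no normalisation; an assumption)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best_sq = math.inf
    for start in range(len(series) - m + 1):
        sq = 0.0
        for j in range(m):
            diff = series[start + j] - shapelet[j]
            sq += diff * diff
            if sq >= best_sq:
                break  # early abandon: this window cannot beat the best so far
        if sq < best_sq:
            best_sq = sq
    return math.sqrt(best_sq)


def shapelet_transform(
    dataset: Sequence[Sequence[float]],
    shapelets: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Map each series to a vector of k distances, one per shapelet."""
    return [[shapelet_distance(s, sh) for sh in shapelets] for s in dataset]


# Toy data: one series containing a bump, one flat series.
dataset = [
    [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
shapelets = [[1.0, 2.0, 1.0], [0.0, 0.0, 0.0]]

features = shapelet_transform(dataset, shapelets)
for row in features:
    print(row)
```

Each row of `features` is a k-dimensional vector that can be passed to any standard classifier, which is what separates shapelet discovery from the classification algorithm. Because each attribute is the distance to one specific shapelet, the transformed attributes can still be interpreted in terms of the original subsequences.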
