Mining time-series data using discriminative subsequences

Time-series data is abundant and must be analysed to extract usable knowledge. Local-shape-based methods offer improved performance on many problems and a comprehensible way of understanding both data and models. For time-series classification, we transform the data into a local-shape space using a shapelet transform. A shapelet is a time-series subsequence that is discriminative of the class of the original series. We then apply a heterogeneous ensemble classifier to the transformed data. The accuracy of our method is significantly better than that of the time-series classification benchmark (1-nearest-neighbour with dynamic time-warping distance), and significantly better than that of the previous best shapelet-based classifiers. We use two methods to increase interpretability. First, we cluster the shapelets using a novel, parameterless clustering method based on Minimum Description Length, reducing dimensionality and removing duplicate shapelets. Second, we transform the shapelet data into binary data reflecting the presence or absence of particular shapelets, a representation that is straightforward to interpret. We supplement the ensemble classifier with partial classification: we generate rule sets on the binary-shapelet data, improving performance on certain classes and revealing the relationship between the shapelets and the class label. To aid interpretability further, we use a novel algorithm, BruteSuppression, that can substantially reduce the size of a rule set without negatively affecting performance, leading to a more compact, comprehensible model. Finally, we propose three novel algorithms for unsupervised mining of approximately repeated patterns in time-series data, and test their speed and accuracy on synthetic data and on a real-world electricity-consumption device-disambiguation problem. We show that individual devices can be found automatically and in an unsupervised manner using a local-shape-based approach.
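The shapelet transform and the binary presence/absence representation described above can be illustrated with a minimal sketch. It makes several assumptions that the abstract does not state: the distance between a shapelet and a series is taken to be the smallest Euclidean distance over all sliding windows, no normalisation is applied, and the per-shapelet thresholds used for binarisation are supplied by hand. The function names (`subsequence_distance`, `shapelet_transform`, `binarise`) are hypothetical and chosen only for illustration; this is not the implementation used in the work.

```python
import math


def subsequence_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    equal-length window of the series (assumed distance definition)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        # Euclidean distance between the shapelet and the window at `start`.
        d = math.sqrt(sum((series[start + i] - shapelet[i]) ** 2 for i in range(m)))
        best = min(best, d)
    return best


def shapelet_transform(dataset, shapelets):
    """Map each series to a vector of distances, one per shapelet."""
    return [[subsequence_distance(s, sh) for sh in shapelets] for s in dataset]


def binarise(transformed, thresholds):
    """Mark a shapelet as present (1) when its distance is within the
    hand-picked threshold for that shapelet, otherwise absent (0)."""
    return [[1 if d <= t else 0 for d, t in zip(row, thresholds)]
            for row in transformed]


# Toy data: the first series contains the bump [1, 2, 1]; the second is flat.
dataset = [[0, 0, 1, 2, 1, 0, 0], [0, 0, 0, 0, 0, 0, 0]]
shapelets = [[1, 2, 1]]

distances = shapelet_transform(dataset, shapelets)
presence = binarise(distances, thresholds=[1.0])
print(distances)
print(presence)
```

In this toy case the first series contains the shapelet exactly, so its distance is zero and the shapelet is marked present, while the flat series lies further away than the threshold and is marked absent. Any standard classifier or rule learner could then be trained on either representation.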
