A Recurrence Plot-Based Distance Measure

Given a set of time series, our goal is to identify prototypes that cover as many of the occurring subsequences as possible, regardless of their order. This scenario arises in the automotive industry, where the goal is to determine operational profiles that comprise frequently recurring driving behavior patterns. The problem can be solved by clustering; however, standard distance measures such as the dynamic time warping distance might not be suitable for this task, because they capture the cost of aligning two time series rather than rewarding pairwise recurring patterns. In this contribution, we propose a novel time series distance measure, based on the notion of recurrence plots, which enables us to determine the (dis)similarity of multivariate time series that contain segments of similar trajectories at arbitrary positions. We use recurrence quantification analysis to measure the structures observed in recurrence plots and to investigate dynamical properties, such as determinism, which reflect the pairwise (dis)similarity of time series. In experiments on real-life test drives from Volkswagen, we demonstrate that clustering multivariate time series with the proposed recurrence plot-based distance measure yields prototypical test drives that cover significantly more recurring patterns than the same clustering algorithm with the dynamic time warping distance.
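To make the idea concrete, the following is a minimal sketch of how a recurrence plot-based distance could be computed. It is not the exact formulation of the paper. It assumes a cross recurrence plot with a fixed threshold `eps`, the standard determinism measure (the fraction of recurrence points that lie on diagonal lines of length at least `l_min`), and a distance defined as one minus determinism. The function names, `eps`, and `l_min` are illustrative assumptions.

```python
import numpy as np

def cross_recurrence_plot(x, y, eps):
    """Binary matrix R with R[i, j] = 1 if ||x[i] - y[j]|| <= eps.

    x and y have shape (T,) or (T, d); eps is an assumed fixed threshold.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    if y.ndim == 1:
        y = y[:, None]
    # Pairwise Euclidean distances between all states of x and y.
    dists = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)
    return (dists <= eps).astype(int)

def determinism(R, l_min=2):
    """Fraction of recurrence points on diagonal lines of length >= l_min."""
    total = R.sum()
    if total == 0:
        return 0.0
    n, m = R.shape
    in_lines = 0
    # Walk every diagonal and sum the lengths of sufficiently long runs.
    for k in range(-(n - 1), m):
        run = 0
        for v in np.diagonal(R, offset=k):
            if v:
                run += 1
            else:
                if run >= l_min:
                    in_lines += run
                run = 0
        if run >= l_min:
            in_lines += run
    return in_lines / total

def rp_distance(x, y, eps, l_min=2):
    """Assumed distance: 1 - determinism of the cross recurrence plot."""
    return 1.0 - determinism(cross_recurrence_plot(x, y, eps), l_min)

# Two series that contain the same segments in a different order.
a = [0, 1, 2, 3, 10, 20, 30]
b = [10, 20, 30, 0, 1, 2, 3]
# A series that shares no values with a.
c = [50, 60, 70, 80, 90, 100, 110]
d_ab = rp_distance(a, b, eps=0.5)
d_ac = rp_distance(a, c, eps=0.5)
```

In this toy case the shared segments of `a` and `b` form diagonal lines in the cross recurrence plot even though they occur at different positions, so `d_ab` is small, whereas `a` and `c` share no recurrences and `d_ac` is maximal. An alignment-based measure such as dynamic time warping would instead penalize the reordering.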
