A Review and Evaluation of Elastic Distance Functions for Time Series Clustering

Time series clustering is the task of grouping time series without recourse to labels. Algorithms that cluster time series fall into two groups: those that employ a time series specific distance measure, and those that derive features from time series. Both approaches usually rely on traditional clustering algorithms such as k-means. Our focus is on distance-based time series clustering with elastic distance measures, i.e. distances that perform some kind of realignment whilst measuring distance. We describe nine commonly used elastic distance measures and compare their performance with k-means and k-medoids clustering. Our findings are surprising. With k-means, the most popular technique, dynamic time warping (DTW), performs worse than Euclidean distance and, even when tuned, is no better. Using k-medoids rather than k-means improved the clusterings for all nine distance measures. With k-medoids, DTW is not significantly better than Euclidean distance. Generally, distance measures that employ editing in conjunction with warping perform better, and one of them, the move-split-merge (MSM) method, is the best performing measure in this study. We also compare with clustering using DTW barycentre averaging (DBA). We find that DBA does improve DTW k-means, but that standard DBA is still worse than using MSM. Our conclusion is to recommend MSM with k-medoids as the benchmark algorithm for clustering time series with elastic distance measures. We provide implementations in the aeon toolkit, together with results and guidance on reproducing them, in the associated GitHub repository.
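To make the idea of an elastic distance concrete, the following is a minimal pure-Python sketch, not the aeon implementation and not necessarily the configuration used in the study. It assumes univariate series, a squared pointwise cost, and a warping window given as a fraction of series length. The names `dtw_distance`, `k_medoids`, `window` and `init`, and the toy data, are all illustrative. The k-medoids routine uses a simple alternating update on a precomputed distance matrix, which is a simplification and may differ from the variant evaluated in the paper.

```python
import math


def dtw_distance(a, b, window=None):
    """DTW with squared pointwise cost between two univariate series.

    window: assumed here to be a fraction in [0, 1] of the series length
    (a Sakoe-Chiba style band); None means unconstrained warping.
    """
    n, m = len(a), len(b)
    if window is None:
        w = max(n, m)
    else:
        # The band must be at least |n - m| wide for a path to exist.
        w = max(int(window * max(n, m)), abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def k_medoids(dist, k, init, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix.

    init: list of k starting medoid indices (illustrative initialisation).
    """
    n = len(dist)
    medoids = list(init)
    labels = [0] * n
    for _ in range(max_iter):
        # Assignment step: each series joins its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                new_medoids.append(medoids[c])  # keep the medoid of an empty cluster
                continue
            # Update step: the medoid is the member with the smallest total distance.
            new_medoids.append(min(members, key=lambda i: sum(dist[i][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels, medoids


# Toy data: three shifted "bumps" and three shifted "dips".
series = [
    [0, 0, 1, 2, 1, 0, 0, 0],
    [0, 0, 0, 1, 2, 1, 0, 0],
    [0, 1, 2, 1, 0, 0, 0, 0],
    [2, 2, 1, 0, 0, 1, 2, 2],
    [2, 1, 0, 0, 1, 2, 2, 2],
    [2, 2, 2, 1, 0, 0, 1, 2],
]
dist = [[dtw_distance(s, t) for t in series] for s in series]
labels, medoids = k_medoids(dist, k=2, init=[0, 5])
print(labels, medoids)
```

With `window=0` and equal-length series the band collapses to the diagonal, so the function returns the squared Euclidean distance. Because the medoid is always one of the series, k-medoids needs only pairwise distances and no averaging step, whereas k-means needs an averaging procedure for the chosen distance.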
