Adaptive dissimilarity index for measuring time series proximity

The most widely used measures of time series proximity are the Euclidean distance and dynamic time warping. The latter can be derived from the distance that Maurice Fréchet introduced in 1906 to measure the proximity between curves. The major limitation of these measures is that they rely only on the closeness of the values and ignore how similar the series are in their growth behavior. To alleviate this drawback, we propose a new dissimilarity index, based on an automatic adaptive tuning function, that accounts for proximity with respect to both values and behavior. The proposed index is compared numerically with the classical distance measures on two datasets: a synthetic dataset and a dataset from a public health study.
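As a rough illustration of the limitation described above, the following sketch shows two series that lie at the same Euclidean distance from a reference series, one rising and falling with it and one moving against it. The abstract does not give the form of the index, so everything here is an assumption made for illustration: the behavior term is taken to be a correlation between successive increments (`temporal_correlation`), the tuning function is taken to be the logistic-type weight `2 / (1 + exp(k * c))`, and the parameter `k` and all function names are hypothetical.

```python
import math

def euclidean(u, v):
    """Value-based proximity: ordinary Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def temporal_correlation(u, v):
    """Behavior-based proximity (assumed form): correlation between the
    successive increments of the two series, in [-1, 1]."""
    du = [u[i + 1] - u[i] for i in range(len(u) - 1)]
    dv = [v[i + 1] - v[i] for i in range(len(v) - 1)]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    return num / den if den else 0.0  # constant series: no behavior information

def tuning(c, k):
    """Assumed tuning function: equals 1 when k = 0 (values only) and
    weights behavior more heavily as k grows."""
    return 2.0 / (1.0 + math.exp(k * c))

def adaptive_dissimilarity(u, v, k=2.0):
    """Hypothetical index: value distance modulated by behavior similarity."""
    return tuning(temporal_correlation(u, v), k) * euclidean(u, v)

base = [0, 1, 0, 1, 0]
same_behavior = [1, 2, 1, 2, 1]      # base shifted up by 1
opposite_behavior = [1, 0, 1, 0, 1]  # moves against base at every step

# Both candidates are equally far from base in Euclidean terms.
d_same = euclidean(base, same_behavior)
d_opposite = euclidean(base, opposite_behavior)

# The behavior-aware index separates them.
a_same = adaptive_dissimilarity(base, same_behavior)
a_opposite = adaptive_dissimilarity(base, opposite_behavior)
```

Under these assumptions, setting `k = 0` recovers the plain Euclidean distance, while larger values of `k` let the growth behavior dominate the comparison. The Euclidean distance could be swapped for dynamic time warping as the value-based term without changing the structure of the sketch.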
