Probabilistic Models for Joint Clustering and Time-Warping of Multidimensional Curves

In this paper we present a family of models and learning algorithms that simultaneously align and cluster sets of multidimensional curves measured on a discrete time grid. Our approach is based on a generative mixture model that allows both local nonlinear time warping and global linear shifts of the observed curves, in time and in measurement space, relative to the mean curve of each cluster. The resulting model can be viewed as a Bayesian network with a special temporal structure. The Expectation-Maximization (EM) algorithm is used to recover, at the same time, the curve model for each cluster and the most likely alignment and cluster membership for each curve. We evaluate the methodology on two real-world data sets and show that the Bayesian network models provide systematic improvements in predictive power over more conventional clustering approaches.
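To make the idea of joint alignment and clustering concrete, the following is a minimal sketch of EM for a much simpler model than the one described above. It assumes one-dimensional curves, a single unknown integer time shift per curve (applied circularly with `np.roll`), a uniform prior over shifts, and one shared noise variance. Local nonlinear warping and offsets in measurement space are not modeled. The function name `em_shift_mixture` and all of its parameters are hypothetical and chosen only for illustration; this is not the authors' algorithm.

```python
import numpy as np

def em_shift_mixture(curves, n_clusters, max_shift, n_iter=50, seed=0):
    """EM for a mixture of mean curves with a hidden integer time shift per curve.

    Simplifying assumptions (not from the paper): 1-D curves, circular shifts,
    uniform prior over shifts, one shared Gaussian noise variance.
    curves has shape (n_curves, length).
    """
    rng = np.random.default_rng(seed)
    n, T = curves.shape
    shifts = np.arange(-max_shift, max_shift + 1)
    # shifted[i, s] is curve i rolled back by shifts[s], i.e. one candidate alignment
    shifted = np.stack([np.roll(curves, -s, axis=1) for s in shifts], axis=1)  # (n, S, T)

    # Initialize cluster means from randomly chosen curves
    means = curves[rng.choice(n, n_clusters, replace=False)].copy()
    weights = np.full(n_clusters, 1.0 / n_clusters)
    var = max(curves.var(), 1e-8)

    for _ in range(n_iter):
        # E-step: joint posterior over (cluster, shift) for every curve.
        # Squared distance between each aligned candidate and each mean curve.
        sq = ((shifted[:, None, :, :] - means[None, :, None, :]) ** 2).sum(-1)  # (n, K, S)
        # Terms that are constant across (cluster, shift) cancel on normalization.
        log_p = np.log(weights + 1e-300)[None, :, None] - 0.5 * sq / var
        log_p -= log_p.max(axis=(1, 2), keepdims=True)  # numerical stability
        post = np.exp(log_p)
        post /= post.sum(axis=(1, 2), keepdims=True)

        # M-step: posterior-weighted averages over aligned curves
        nk = post.sum(axis=(0, 2))  # effective number of curves per cluster
        means = np.einsum('iks,ist->kt', post, shifted) / (nk[:, None] + 1e-12)
        weights = nk / n
        sq = ((shifted[:, None, :, :] - means[None, :, None, :]) ** 2).sum(-1)
        var = max((post * sq).sum() / (n * T), 1e-8)

    return means, weights, post, shifts
```

Given the returned `post` array of shape `(n_curves, n_clusters, n_shifts)`, the cluster membership of curve `i` can be read off as `post[i].sum(axis=1).argmax()` and its most probable shift as `shifts[post[i].sum(axis=0).argmax()]`. Like any EM procedure, the sketch converges to a local optimum that depends on the initialization.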
