Probabilistic curve-aligned clustering and prediction with regression mixture models

Clustering and prediction of sets of curves is an important problem in many areas of science and engineering. Most clustering algorithms operate on fixed-dimensional feature vectors, and as a result curve analysis is often forced into this unnatural paradigm. Perhaps more importantly, curves tend to be misaligned with one another in a continuous manner, either in space (across the measurements) or in time. The notion of time within a feature vector, however, is rigid: it corresponds only to the discrete dimensions of the vector space itself. In contrast, we develop a probabilistic framework that allows for the joint clustering and continuous alignment of sets of curves in curve space. The proposed methodology integrates new probabilistic alignment models with model-based curve clustering algorithms, and the probabilistic formulation allows consistent EM-type learning algorithms to be derived for the joint clustering-alignment problem. We use both simulated and real-world datasets for detailed experimentation and present two extensive applications to the clustering of cyclone trajectories.
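To make the model-based curve clustering ingredient concrete, the following is a minimal sketch of EM for a mixture of polynomial regressions, in which each curve is assigned as a whole to a cluster and curves may have different lengths. It is an illustration under stated assumptions, not the method of the paper: it omits the alignment component entirely, and the function names (`design_matrix`, `fit_regression_mixture`, `make_demo_curves`), the polynomial basis, the deterministic initialisation, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np


def design_matrix(t, order):
    """Polynomial design matrix with columns 1, t, ..., t**order."""
    return np.vander(np.asarray(t, dtype=float), order + 1, increasing=True)


def fit_regression_mixture(curves, n_clusters, order=1, n_iter=30):
    """EM for a mixture of polynomial regressions over whole curves.

    curves: list of (t, y) pairs; curves may have different lengths.
    Returns mixing weights, coefficients, noise variances, and memberships.
    """
    X = [design_matrix(t, order) for t, _ in curves]
    Y = [np.asarray(y, dtype=float) for _, y in curves]
    n = len(curves)

    # Crude deterministic initialisation (an assumption of this sketch):
    # seed each component with the least-squares fit of one evenly spaced curve.
    seeds = np.linspace(0, n - 1, n_clusters).astype(int)
    beta = np.array([np.linalg.lstsq(X[i], Y[i], rcond=None)[0] for i in seeds])
    sigma2 = np.full(n_clusters, np.var(np.concatenate(Y)))
    alpha = np.full(n_clusters, 1.0 / n_clusters)

    for _ in range(n_iter):
        # E-step: posterior probability that curve i belongs to cluster k,
        # computed in log space for numerical stability.
        log_w = np.empty((n, n_clusters))
        for i in range(n):
            for k in range(n_clusters):
                resid = Y[i] - X[i] @ beta[k]
                log_w[i, k] = (np.log(alpha[k])
                               - 0.5 * resid.size * np.log(2 * np.pi * sigma2[k])
                               - 0.5 * (resid @ resid) / sigma2[k])
        log_w -= log_w.max(axis=1, keepdims=True)
        W = np.exp(log_w)
        W /= W.sum(axis=1, keepdims=True)

        # M-step: membership-weighted least squares for each cluster.
        alpha = np.clip(W.mean(axis=0), 1e-12, None)
        for k in range(n_clusters):
            A = sum(W[i, k] * (X[i].T @ X[i]) for i in range(n))
            b = sum(W[i, k] * (X[i].T @ Y[i]) for i in range(n))
            beta[k] = np.linalg.lstsq(A, b, rcond=None)[0]
            rss = sum(W[i, k] * np.sum((Y[i] - X[i] @ beta[k]) ** 2)
                      for i in range(n))
            n_points = sum(W[i, k] * Y[i].size for i in range(n))
            sigma2[k] = max(rss / max(n_points, 1e-12), 1e-8)
    return alpha, beta, sigma2, W


def make_demo_curves(n_per_group=20, noise=0.1, seed=0):
    """Synthetic data: two groups of noisy straight lines of varying length."""
    rng = np.random.default_rng(seed)
    curves = []
    for intercept, slope in [(1.0, 2.0), (4.0, -1.0)]:
        for j in range(n_per_group):
            t = np.linspace(0.0, 1.0, 8 + j % 5)
            y = intercept + slope * t + noise * rng.normal(size=t.size)
            curves.append((t, y))
    return curves


curves = make_demo_curves()
alpha, beta, sigma2, W = fit_regression_mixture(curves, n_clusters=2)
print("hard cluster labels:", W.argmax(axis=1))
print("fitted coefficients per cluster:\n", beta)
```

Because the likelihood is evaluated per curve rather than per fixed-length vector, curves of different lengths need no resampling or padding. A joint clustering-alignment model of the kind described above would additionally treat per-curve transformations (for example shifts or scalings in time or in measurement space) as latent variables inside the same EM loop; that extension is not shown here.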
