Clustering for Sparsely Sampled Functional Data

We develop a flexible model-based procedure for clustering functional data. The technique can be applied to all types of curve data but is particularly useful when individuals are observed at a sparse set of time points. In addition to producing final cluster assignments, the procedure generates predictions and confidence intervals for missing portions of curves. Our approach also provides many useful tools for evaluating the resulting models. Clustering can be assessed visually via low-dimensional representations of the curves, and the regions of greatest separation between clusters can be determined using a discriminant function. Finally, we extend the model to handle multiple functional and finite-dimensional covariates and show how it can be applied to standard finite-dimensional clustering problems involving missing data.
