Review of Clustering Methods for Functional Data

Functional data clustering aims to identify heterogeneous morphological patterns in the continuous functions underlying discrete measurements or observations. Applications of functional data clustering have appeared in many publications across the sciences, including biology, (bio)chemistry, engineering, environmental science, medical science, psychology, and social science. The rapid growth of these applications indicates an urgent need for a systematic approach to developing efficient clustering methods and scalable algorithmic implementations. At the same time, there is abundant literature on the cluster analysis of time series, trajectory data, and spatio-temporal data, all of which are closely related to functional data. An overarching structure for existing functional data clustering methods would therefore enable the cross-pollination of ideas across research fields. Here we conduct a comprehensive review of original clustering methods for functional data. We propose a systematic taxonomy that explores the connections and differences among existing functional data clustering methods and relates them to conventional multivariate clustering methods. The taxonomy is built on three main attributes of a functional data clustering method and is therefore more reliable than existing categorizations. The review aims to bridge the gap between the functional data analysis community and the clustering community and to generate new principles for functional data clustering.
