Classification methods for Hilbert data based on surrogate density

An unsupervised and a supervised classification approach for Hilbert random curves are studied. Both rest on a surrogate of the probability density, defined in a distribution-free mixture context from an asymptotic factorization of the small-ball probability. The surrogate density is estimated by a kernel approach applied to the principal components of the data. The focus is on illustrating the classification algorithms and their computational implications, with particular attention to the tuning of the parameters involved. Some asymptotic results are sketched. Applications to simulated and real datasets illustrate how the proposed methods perform.
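
The supervised approach can be pictured with a minimal sketch, given here only as an illustration under stated assumptions and not as the authors' implementation. It assumes curves observed on a common grid, a pooled principal component basis, a Gaussian product kernel with a single bandwidth, and a plug-in rule that assigns each new curve to the group maximizing prior times estimated surrogate density. The function names (`pca_scores`, `kde`, `classify`), the dimension `d`, the bandwidth `h`, and the simulated curves are all hypothetical choices made for this example.

```python
import numpy as np

def pca_scores(curves, d):
    """Project discretized curves (n x p array) on their first d principal components."""
    mean = curves.mean(axis=0)
    centered = curves - mean
    # Rows of vt are the discretized eigenfunctions of the empirical covariance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:d]
    return centered @ basis.T, mean, basis

def kde(points, sample, h):
    """Gaussian product-kernel density estimate built on `sample`, evaluated at `points`."""
    d = sample.shape[1]
    diff = (points[:, None, :] - sample[None, :, :]) / h
    kern = np.exp(-0.5 * (diff ** 2).sum(axis=2)) / (2 * np.pi) ** (d / 2)
    return kern.mean(axis=1) / h ** d

def classify(train, labels, new, d=2, h=0.5):
    """Assign each new curve to the group with the largest prior * estimated density."""
    scores, mean, basis = pca_scores(train, d)
    new_scores = (new - mean) @ basis.T
    classes = np.unique(labels)
    # One column per group: empirical prior times kernel estimate on the scores.
    post = np.column_stack([
        np.mean(labels == g) * kde(new_scores, scores[labels == g], h)
        for g in classes
    ])
    return classes[post.argmax(axis=1)]

# Simulated example (assumption): two groups of noisy sine and cosine curves.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)

def simulate(n):
    """Return 2n curves (n per group) and their group labels."""
    amp = rng.normal(1.0, 0.2, size=(2 * n, 1))
    shape = np.vstack([np.tile(np.sin(2 * np.pi * t), (n, 1)),
                       np.tile(np.cos(2 * np.pi * t), (n, 1))])
    noise = rng.normal(0.0, 0.1, size=(2 * n, t.size))
    return amp * shape + noise, np.repeat([0, 1], n)

train, labels = simulate(40)
test, truth = simulate(20)
predicted = classify(train, labels, test)
accuracy = np.mean(predicted == truth)
print("test accuracy:", accuracy)
```

In this sketch `d` and `h` are fixed by hand; in practice they are tuning parameters, and the bandwidth in particular would normally be chosen by a data-driven rule such as cross-validation.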
