Sparse Functional Principal Component Analysis in High Dimensions

Functional principal component analysis (FPCA) is a fundamental tool that has attracted increasing attention in recent decades, but existing methods are restricted to data with a small number of random functions. In this work, we focus on high-dimensional functional processes where the number of random functions $p$ is comparable to, or even much larger than, the sample size $n$. Such data are ubiquitous in fields such as neuroimaging analysis and cannot be properly modeled by existing methods. We propose a new algorithm, called sparse FPCA, which models principal eigenfunctions effectively under sensible sparsity regimes. While sparsity assumptions are standard in multivariate statistics, they have not been investigated in the more complex setting where not only is $p$ large, but each variable is itself an intrinsically infinite-dimensional process. The sparsity structure motivates a thresholding rule that is easy to compute without smoothing operations, exploiting the relationship between univariate orthonormal basis expansions and multivariate Karhunen-Lo\`eve (K-L) representations. We investigate the theoretical properties of the resulting estimators under two sparsity regimes. Simulated and real data examples provide empirical support, and the method also performs well in subsequent analyses such as classification.
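To make the thresholding idea concrete, here is a minimal sketch and not the authors' algorithm. It assumes curves observed on a common grid in $[0, 1]$, a cosine basis, and a simple rule that keeps basis coefficients whose sample variance exceeds a multiple `c` of the median variance. The function name `sparse_fpca_sketch`, its parameters, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def sparse_fpca_sketch(X, grid, n_basis=8, c=5.0, n_components=2):
    """Illustrative coordinate-thresholding FPCA (assumed rule, not the paper's).

    X has shape (n, p, T): n samples of p curves on a common grid in [0, 1].
    """
    n, p, T = X.shape
    # Orthonormal cosine basis on [0, 1] (an assumed basis choice).
    basis = np.ones((n_basis, T))
    for k in range(1, n_basis):
        basis[k] = np.sqrt(2.0) * np.cos(k * np.pi * grid)
    # Trapezoid quadrature weights for the inner products.
    w = np.zeros(T)
    d = np.diff(grid)
    w[:-1] += d / 2
    w[1:] += d / 2
    # Center, then project every curve onto the basis: (n, p * n_basis).
    Xc = X - X.mean(axis=0, keepdims=True)
    coef = np.einsum("npt,kt,t->npk", Xc, basis, w).reshape(n, p * n_basis)
    # Threshold on coefficient variances (hypothetical rule: c * median).
    var = coef.var(axis=0)
    keep = var > c * np.median(var)
    if not keep.any():
        raise ValueError("no coefficient passed the threshold")
    # Ordinary PCA restricted to the retained coefficients.
    sub = coef[:, keep]
    evals, evecs = np.linalg.eigh(sub.T @ sub / n)
    order = np.argsort(evals)[::-1][:n_components]
    # Embed loadings back into the full coefficient space (zeros elsewhere)
    # and map them to eigenfunctions on the grid: (r, p, T).
    loadings = np.zeros((len(order), p * n_basis))
    loadings[:, keep] = evecs[:, order].T
    eigfuns = loadings.reshape(len(order), p, n_basis) @ basis
    return evals[order], eigfuns, keep

# Simulated example: only the first 3 of p = 30 processes carry signal.
rng = np.random.default_rng(0)
n, p, T = 50, 30, 101
grid = np.linspace(0.0, 1.0, T)
phi1 = np.sqrt(2.0) * np.cos(np.pi * grid)
scores = rng.normal(size=n)
X = 0.1 * rng.normal(size=(n, p, T))
for j, a in enumerate([1.0, 0.8, 0.6]):
    X[:, j, :] += a * scores[:, None] * phi1
evals, eigfuns, keep = sparse_fpca_sketch(X, grid)
```

No smoothing is involved: the only operations are basis projection, a variance comparison, and an eigendecomposition of a small matrix. Processes with no retained coefficients receive eigenfunction components that are identically zero, which is how the sparsity shows up in the estimate.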
