Smoothing and Mean-Covariance Estimation of Functional Data with a Bayesian Hierarchical Model

Functional data, whose basic observational units are functions (e.g., curves, surfaces) varying over a continuum, are frequently encountered in applications. While many statistical tools have been developed for functional data analysis, the problem of smoothing all functional observations simultaneously has received less attention. Existing methods often smooth each function separately, at the risk of removing important systematic patterns common across functions. We propose a nonparametric Bayesian approach that smooths all functional observations simultaneously. In the proposed approach, we assume that the functional observations are independent Gaussian processes subject to a common level of measurement error, which enables the borrowing of strength across all observations. Unlike most Gaussian process regression models, which rely on pre-specified structures for the covariance kernel, we adopt a hierarchical framework that assumes a Gaussian process prior for the mean function and an Inverse-Wishart process prior for the covariance function. These prior assumptions induce automatic mean-covariance estimation in the posterior inference, in addition to the simultaneous smoothing of all observations. The hierarchical framework is flexible enough to accommodate functional data with different characteristics, including data measured on either common or uncommon grids and data with either stationary or nonstationary covariance structures. Simulations and real data analysis demonstrate that, compared with alternative methods, the proposed Bayesian approach achieves better smoothing accuracy and comparable mean-covariance estimation results. Furthermore, it successfully retains the systematic patterns in the functional observations that are usually neglected by existing functional data analyses based on individual-curve smoothing.
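To make the hierarchy described above concrete, the following is a minimal sketch on a common grid, not the authors' implementation. It draws a covariance matrix from an Inverse-Wishart distribution, draws a mean vector and latent curves from Gaussian distributions with that covariance, adds measurement error with a common variance, and then smooths all curves at once using the standard Gaussian conditional mean. Everything specific here is an assumption made for illustration: the exponential kernel used as the Inverse-Wishart scale, the degrees of freedom `df`, the prior scaling constant `c`, the noise variance, and all function names. The smoothing step also conditions on the true mean and covariance, whereas the proposed approach infers them from the data in the posterior.

```python
import numpy as np


def exponential_kernel(t, variance=1.0, length_scale=0.3):
    """Exponential (Matern nu=1/2) covariance on grid t; an illustrative choice."""
    return variance * np.exp(-np.abs(t[:, None] - t[None, :]) / length_scale)


def sample_inverse_wishart(rng, df, scale):
    """Draw Sigma ~ IW(df, scale) using: W ~ Wishart(df, scale^{-1}) => W^{-1} ~ IW."""
    p = scale.shape[0]
    chol = np.linalg.cholesky(np.linalg.inv(scale))
    a = chol @ rng.standard_normal((p, df))  # columns are N(0, scale^{-1})
    sigma = np.linalg.inv(a @ a.T)
    return 0.5 * (sigma + sigma.T)  # symmetrize against round-off


def smooth_given_mean_cov(y, mu, sigma, noise_var):
    """Rows of y are curves; returns E[Z_i | Y_i, mu, Sigma, noise_var] per row."""
    p = mu.shape[0]
    # (Sigma + noise_var * I)^{-1} Sigma is the transpose of the gain matrix.
    gain_t = np.linalg.solve(sigma + noise_var * np.eye(p), sigma)
    return mu + (y - mu) @ gain_t


def simulate_and_smooth(seed=0, n_curves=50, n_grid=20, noise_var=0.25, c=1.0):
    """Simulate from the assumed hierarchy and compare raw vs. smoothed error."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_grid)
    df = n_grid + 10  # assumed degrees of freedom
    # Scale chosen so that the prior mean of Sigma equals the kernel matrix.
    scale = (df - n_grid - 1) * exponential_kernel(t)
    sigma = sample_inverse_wishart(rng, df, scale)
    chol = np.linalg.cholesky(sigma)
    mu = (chol @ rng.standard_normal(n_grid)) / np.sqrt(c)  # mu ~ N(0, Sigma / c)
    z = mu + rng.standard_normal((n_curves, n_grid)) @ chol.T  # latent curves
    y = z + np.sqrt(noise_var) * rng.standard_normal((n_curves, n_grid))
    z_hat = smooth_given_mean_cov(y, mu, sigma, noise_var)
    return np.mean((y - z) ** 2), np.mean((z_hat - z) ** 2)
```

Calling `simulate_and_smooth()` returns the mean squared error of the raw observations and of the smoothed curves against the latent curves. Because every curve shares the same mean, covariance, and noise variance, a single gain matrix smooths all observations together, which is the sense in which strength is borrowed across curves in this sketch.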