Locally Adaptive Bayes Nonparametric Regression via Nested Gaussian Processes

We propose a nested Gaussian process (nGP) as a locally adaptive prior for Bayesian nonparametric regression. Specified through a set of stochastic differential equations (SDEs), the nGP imposes a Gaussian process prior on the function’s mth-order derivative. The nesting enters through a local instantaneous mean function, itself drawn from another Gaussian process, which induces adaptivity to locally varying smoothness. We discuss the support of the nGP prior in terms of the closure of a reproducing kernel Hilbert space and consider theoretical properties of the posterior. The posterior mean under the nGP prior is shown to be equivalent to the minimizer of a nested penalized sum-of-squares with penalties on both the global and local roughness of the function. With posterior inference carried out by highly efficient Markov chain Monte Carlo, the proposed method performs well in simulation studies compared with several alternatives and scales to massive data, as illustrated through a proteomics application.
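
To make the nesting concrete, the following is a minimal sketch of drawing one path from a prior of this kind by Euler-Maruyama discretization. It is an illustration under stated assumptions, not the paper's implementation: the orders m = 2 and n = 1, the specific SDE form (D^2 U = A + sigma_u * white noise, D A = sigma_a * white noise), the zero initial conditions, and the names `simulate_ngp`, `sigma_u`, and `sigma_a` are all chosen here for illustration.

```python
import numpy as np


def simulate_ngp(n_steps=1000, t_max=1.0, sigma_u=0.1, sigma_a=5.0, seed=0):
    """Euler-Maruyama draw from an illustrative nested GP (assumed m=2, n=1).

    Assumed SDEs (illustrative notation, not taken from the abstract):
        D^2 U(t) = A(t) + sigma_u * dW_U/dt   # function with local mean A
        D   A(t) = sigma_a * dW_A/dt          # local instantaneous mean
    Initial values U(0), DU(0), A(0) are set to zero for simplicity.
    """
    rng = np.random.default_rng(seed)
    dt = t_max / n_steps
    t = np.linspace(0.0, t_max, n_steps + 1)
    u = np.zeros(n_steps + 1)    # the function U
    du = np.zeros(n_steps + 1)   # its first derivative DU
    a = np.zeros(n_steps + 1)    # the local instantaneous mean A
    for k in range(n_steps):
        # Independent Brownian increments for the two driving noises.
        dw_u, dw_a = rng.normal(0.0, np.sqrt(dt), size=2)
        u[k + 1] = u[k] + du[k] * dt
        du[k + 1] = du[k] + a[k] * dt + sigma_u * dw_u
        a[k + 1] = a[k] + sigma_a * dw_a
    return t, u, a


if __name__ == "__main__":
    t, u, a = simulate_ngp()
    print(t.shape, u.shape, a.shape)
```

In this sketch, a larger `sigma_a` lets the local mean A wander more, so the curvature of U changes more from one region to the next; setting `sigma_a` to zero with A(0) = 0 removes the nesting and leaves a single integrated-noise prior on U.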
