Laplace Approximation for Divisive Gaussian Processes for Nonstationary Regression

Standard Gaussian process (GP) regression is usually formulated under stationarity assumptions: the noise power is assumed constant across the input space, and the covariance of the prior distribution is typically modeled as depending only on the difference between input samples. These assumptions can be too restrictive and unrealistic for many real-world problems. Although nonstationarity can be achieved with specific covariance functions, such functions require prior knowledge of the type of nonstationarity, which is not available in most applications. In this paper we propose using the Laplace approximation to perform inference in a divisive GP model for nonstationary regression, including heteroscedastic-noise cases. The log-concavity of the likelihood ensures a unimodal posterior and guarantees that the Laplace approximation converges to a unique maximum. The characteristics of the likelihood also yield posterior approximations that are accurate when compared to Expectation Propagation (EP) and to the asymptotically exact posterior provided by a Markov chain Monte Carlo implementation with elliptical slice sampling (ESS), while requiring a lower computational load than both EP and ESS.
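To illustrate the kind of computation the Laplace approximation involves, the sketch below shows the standard Newton-iteration mode search for a GP latent vector under a generic log-concave likelihood (in the style of Rasmussen and Williams, Algorithm 3.1). It is a minimal, generic sketch, not the paper's implementation: the divisive likelihood itself is not reproduced, and the function names (laplace_mode, grad_loglik, hess_loglik) are hypothetical placeholders for whatever log-concave likelihood is plugged in.

    import numpy as np

    def laplace_mode(K, grad_loglik, hess_loglik, y, n_iter=50, tol=1e-8):
        """Newton iterations to locate the posterior mode of a GP latent vector f
        under a log-concave likelihood p(y | f).

        K            : (n, n) prior covariance matrix.
        grad_loglik  : callable (y, f) -> gradient of log p(y | f), shape (n,).
        hess_loglik  : callable (y, f) -> W = -diag Hessian of log p(y | f), shape (n,).
                       W >= 0 elementwise when the likelihood is log-concave,
                       which is what makes the mode unique.
        """
        n = K.shape[0]
        f = np.zeros(n)
        for _ in range(n_iter):
            W = hess_loglik(y, f)
            sqrt_W = np.sqrt(W)
            # B = I + W^{1/2} K W^{1/2}, the well-conditioned matrix factored once per step.
            B = np.eye(n) + sqrt_W[:, None] * K * sqrt_W[None, :]
            L = np.linalg.cholesky(B)
            b = W * f + grad_loglik(y, f)
            # Newton step f_new = (K^{-1} + W)^{-1} b, computed via the matrix inversion lemma.
            v = np.linalg.solve(L, sqrt_W * (K @ b))
            a = b - sqrt_W * np.linalg.solve(L.T, v)
            f_new = K @ a
            if np.max(np.abs(f_new - f)) < tol:
                return f_new, W
            f = f_new
        return f, W

At convergence, the posterior is approximated by a Gaussian centered at the mode with covariance (K^{-1} + W)^{-1}; because W is nonnegative for a log-concave likelihood, the objective is concave and the iteration converges to the single maximum mentioned in the abstract.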
