Improved evaluation of predictive probabilities in probit models with Gaussian process priors

Predictive models for binary data are fundamental in various fields, ranging from spatial statistics to machine learning. In such settings, the growing complexity of the phenomena to be analyzed has motivated a variety of flexible specifications that avoid strong parametric assumptions when defining the relationship between the observed predictors and the binary response data. A widely implemented solution within this class expresses the probability parameter via a probit mapping of a Gaussian process indexed by the predictors. However, unlike in continuous settings with Gaussian responses, closed-form results for predictive distributions are lacking in binary models with Gaussian process priors. Markov chain Monte Carlo methods and approximate solutions are common options to address this issue, but state-of-the-art strategies are either computationally intractable or lead to low-quality approximations in moderate-to-high dimensions. In this article, we aim to fill this gap by deriving closed-form expressions for the predictive probabilities in probit Gaussian processes that rely either on cumulative distribution functions of multivariate Gaussians or on functionals of multivariate truncated normals. To evaluate such quantities, we develop novel scalable solutions based on tile-low-rank Monte Carlo methods for computing multivariate Gaussian probabilities and on accurate variational approximations of multivariate truncated normal densities. Closed-form expressions for the marginal likelihood and for the conditional distribution of the Gaussian process given the binary responses are also discussed. As illustrated in simulations and in a real-world environmental application, the proposed methods can scale to dimensions where state-of-the-art solutions are impractical.
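To make the role of multivariate Gaussian cumulative distribution functions concrete, the following is a minimal sketch and not the article's implementation. It assumes a zero-mean Gaussian process, the standard latent-variable form of the probit model (y_i = 1 if z_i > 0, with z = f + unit-variance Gaussian noise), and a squared-exponential kernel chosen purely for illustration. Under these assumptions the marginal probability of the labels is an orthant probability of N(0, K + I), and the predictive probability is a ratio of two such probabilities. The helper names `rbf_kernel`, `orthant_probability`, and `predictive_probability` are hypothetical, and SciPy's general-purpose `multivariate_normal.cdf` stands in for the scalable tile-low-rank evaluation described above, so the sketch is only practical for small n.

```python
import numpy as np
from scipy.stats import multivariate_normal


def rbf_kernel(X1, X2, lengthscale=1.0):
    """Squared-exponential kernel (illustrative assumption, not from the article)."""
    sq_dist = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)


def orthant_probability(y, K):
    """Pr(y) when y_i = 1(z_i > 0) and z ~ N(0, K + I).

    Flipping the sign of each z_i with y_i = 0 turns the event into
    "all coordinates positive", which by symmetry of the zero-mean Gaussian
    equals the CDF of N(0, D (K + I) D) evaluated at zero, D = diag(2y - 1).
    """
    signs = 2.0 * np.asarray(y, dtype=float) - 1.0
    n = len(signs)
    cov = np.outer(signs, signs) * (K + np.eye(n))
    return float(multivariate_normal.cdf(np.zeros(n), mean=np.zeros(n), cov=cov))


def predictive_probability(X, y, x_new, lengthscale=1.0):
    """Pr(y_new = 1 | y) as a ratio of (n+1)- and n-dimensional Gaussian CDFs."""
    n = len(y)
    X_all = np.vstack([X, x_new[None, :]])
    K_all = rbf_kernel(X_all, X_all, lengthscale)
    numerator = orthant_probability(np.append(y, 1), K_all)   # joint Pr(y, y_new = 1)
    denominator = orthant_probability(y, K_all[:n, :n])       # marginal Pr(y)
    return numerator / denominator


# Small usage example with made-up one-dimensional predictors.
X = np.array([[0.0], [1.0], [2.0]])
y = np.array([1, 0, 1])
p_new = predictive_probability(X, y, np.array([0.5]))
print(f"Pr(y_new = 1 | y) is approximately {p_new:.3f}")
```

The cost of this naive evaluation grows quickly with the number of observations, which is the bottleneck that motivates the scalable methods the abstract refers to.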
