PriorVAE: encoding spatial priors with variational autoencoders for small-area estimation

Gaussian processes (GPs), implemented through multivariate Gaussian distributions for a finite collection of data, are the most popular approach in small-area spatial statistical modelling. In this context, they are used to encode correlation structures over space and can generalize well in interpolation tasks. Despite their flexibility, off-the-shelf GPs present serious computational challenges which limit their scalability and practical usefulness in applied settings. Here, we propose a novel deep generative modelling approach to tackle this challenge, termed PriorVAE: for a particular spatial setting, we approximate a class of GP priors by drawing samples from the prior and then fitting a variational autoencoder (VAE) to those samples. Once the VAE is trained, its decoder replaces the GP within a Bayesian sampling framework, and spatial inference becomes highly efficient because the latent space of the VAE is low-dimensional with independently distributed Gaussian components. This approach provides a tractable and easy-to-implement means of approximately encoding spatial priors and facilitates efficient statistical inference. We demonstrate the utility of our two-stage VAE approach on Bayesian small-area estimation tasks.
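
The two-stage idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: to stay short and dependency-free it swaps the VAE for a linear decoder obtained by principal component analysis of the GP prior draws, so the latent variables are still low-dimensional independent standard normals but the decoder is a matrix rather than a neural network. The squared-exponential kernel, the one-dimensional grid, the latent dimension, the noise level and all function names (`rbf_kernel`, `sample_gp_prior`, `fit_linear_decoder`, `latent_posterior`) are assumptions made for illustration only.

```python
import numpy as np

def rbf_kernel(x, lengthscale=0.2, variance=1.0, jitter=1e-6):
    """Squared-exponential covariance over 1-D locations (kernel choice is an assumption)."""
    diff = x[:, None] - x[None, :]
    cov = variance * np.exp(-0.5 * (diff / lengthscale) ** 2)
    return cov + jitter * np.eye(len(x))  # jitter keeps the Cholesky factorisation stable

def sample_gp_prior(x, n_draws, rng):
    """Stage 1a: draw zero-mean GP prior realisations at the fixed locations x."""
    chol = np.linalg.cholesky(rbf_kernel(x))
    return (chol @ rng.standard_normal((len(x), n_draws))).T  # (n_draws, n_locations)

def fit_linear_decoder(draws, latent_dim):
    """Stage 1b: PCA decoder W, a linear stand-in for a trained VAE decoder.
    The draws are zero-mean by construction, so no centring is applied."""
    _, sing_vals, vt = np.linalg.svd(draws, full_matrices=False)
    scale = sing_vals[:latent_dim] / np.sqrt(draws.shape[0])
    return vt[:latent_dim].T * scale  # (n_locations, latent_dim)

def latent_posterior(W, y, obs_idx, noise_sd):
    """Stage 2: posterior over z for y = W[obs_idx] @ z + noise, with z ~ N(0, I).
    It is closed-form only because this stand-in decoder is linear; with a
    neural-network decoder one would sample z with MCMC instead."""
    W_obs = W[obs_idx]
    precision = np.eye(W.shape[1]) + W_obs.T @ W_obs / noise_sd**2
    cov = np.linalg.inv(precision)
    mean = cov @ W_obs.T @ y / noise_sd**2
    return mean, cov

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)

# Stage 1: sample the GP prior, then fit the decoder to those samples.
gp_draws = sample_gp_prior(x, n_draws=2000, rng=rng)
W = fit_linear_decoder(gp_draws, latent_dim=10)

# Approximate prior draws: push independent standard normals through the decoder.
approx_draws = rng.standard_normal((2000, 10)) @ W.T

# Stage 2: infer the 10 latent variables from noisy observations of a fresh GP draw
# at every fifth location, then decode the posterior mean back to all 50 locations.
truth = sample_gp_prior(x, n_draws=1, rng=rng)[0]
obs_idx = np.arange(0, 50, 5)
y = truth[obs_idx] + 0.1 * rng.standard_normal(len(obs_idx))
z_mean, z_cov = latent_posterior(W, y, obs_idx, noise_sd=0.1)
field_mean = W @ z_mean
```

In this sketch inference operates on 10 independent latent variables instead of a 50-dimensional correlated Gaussian, and no covariance matrix over the locations is factorised at inference time. A nonlinear VAE decoder plays the same role in the approach described above, with the closed-form posterior replaced by a sampler over the latent variables.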
