Max-and-Smooth: A Two-Step Approach for Approximate Bayesian Inference in Latent Gaussian Models

Modern high-dimensional data call for complex statistical models, which in turn require computationally feasible inference schemes. We introduce Max-and-Smooth, an approximate Bayesian inference scheme for a flexible class of latent Gaussian models (LGMs) where one or more of the likelihood parameters are modeled by latent additive Gaussian processes. Max-and-Smooth consists of two steps. In the first step (Max), the likelihood function is approximated by a Gaussian density with mean and covariance equal to either (a) the maximum likelihood estimate and the inverse observed information, respectively, or (b) the mean and covariance of the normalized likelihood function. In the second step (Smooth), the latent parameters and hyperparameters are inferred and smoothed with the approximated likelihood function. The proposed method ensures that the uncertainty from the first step is correctly propagated to the second step. Since the approximated likelihood function is Gaussian, the approximate posterior density of the latent parameters of the LGM (conditional on the hyperparameters) is also Gaussian, thus facilitating efficient posterior inference in high dimensions. Furthermore, the approximate marginal posterior distribution of the hyperparameters is tractable, and as a result, the hyperparameters can be sampled independently of the latent parameters. In the case of a large number of independent data replicates, sparse precision matrices, and high-dimensional latent vectors, the speedup is substantial in comparison to an MCMC scheme that infers the posterior density from the exact likelihood function. The proposed inference scheme is demonstrated on one spatially referenced real dataset and on simulated data mimicking spatial, temporal, and spatio-temporal inference problems. Our results show that Max-and-Smooth is accurate and fast.
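
The two steps can be illustrated with a minimal sketch on a toy problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the likelihood is Poisson with one log-rate per group, the latent model is a first-order random walk with a small ridge term, the hyperparameter `tau` is held fixed rather than inferred, and dense matrices are used in place of sparse ones. The function names `max_step`, `rw1_precision`, and `smooth_step` are hypothetical. The Max step uses variant (a), the maximum likelihood estimate and the observed information.

```python
import numpy as np

def max_step(y):
    """Max step, variant (a): Gaussian approximation of each group's likelihood.

    y has shape (G, n): n Poisson counts per group with rate exp(eta_g).
    Returns the MLE of eta_g and the observed information at the MLE.
    """
    total = y.sum(axis=1).astype(float)
    if np.any(total == 0):
        raise ValueError("a group with all-zero counts has no finite MLE")
    eta_hat = np.log(total / y.shape[1])  # MLE of the log-rate
    obs_info = total                      # n * exp(eta_hat) = sum of counts
    return eta_hat, obs_info

def rw1_precision(G, tau, ridge=1e-6):
    """Precision matrix of a first-order random walk (assumed latent model)."""
    D = np.diff(np.eye(G), axis=0)        # (G-1, G) first-difference matrix
    return tau * D.T @ D + ridge * np.eye(G)

def smooth_step(eta_hat, obs_info, Q_prior):
    """Smooth step: Gaussian posterior of eta given fixed hyperparameters.

    The approximate likelihood is N(eta_hat | eta, diag(1 / obs_info)),
    and the prior is N(0, Q_prior^{-1}), so the posterior is Gaussian.
    """
    Q_post = Q_prior + np.diag(obs_info)
    mean = np.linalg.solve(Q_post, obs_info * eta_hat)
    cov = np.linalg.inv(Q_post)
    return mean, cov

# Simulated data: a smooth log-rate curve observed through Poisson counts.
rng = np.random.default_rng(0)
G, n = 30, 5
eta_true = 1.0 + np.sin(np.linspace(0.0, 3.0, G))
y = rng.poisson(np.exp(eta_true)[:, None], size=(G, n))

eta_hat, obs_info = max_step(y)
post_mean, post_cov = smooth_step(eta_hat, obs_info, rw1_precision(G, tau=50.0))
```

Because the observed information enters the posterior precision directly, groups with less information are pulled more strongly toward their neighbours, which is one way to read the claim that first-step uncertainty is propagated to the second step. In the scheme described above the hyperparameters are also inferred from their approximate marginal posterior; this sketch omits that part.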
