Large Multi-scale Spatial Modeling Using Tree Shrinkage Priors

We develop a multiscale spatial kernel convolution technique in which higher-order functions capture fine-scale local features and lower-order terms capture large-scale features. To achieve parsimony, the coefficients in the multiscale kernel convolution model are assigned a new class of "tree shrinkage prior" distributions. Tree shrinkage priors exert increasing shrinkage on the coefficients as resolution grows, so that the model adapts to the degree of resolution needed in any sub-domain. The proposed model has several advantages over existing multiscale spatial models for big data: it auto-tunes the degree of resolution necessary to model a subregion of the domain, achieves scalability through parallelized local updating of parameters, and is backed by theoretical support. Excellent empirical performance is illustrated using several simulation experiments and a geostatistical analysis of sea surface temperature data from the Pacific Ocean.
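The abstract does not specify the kernel, the knot layout, or the exact form of the prior, so the following is only a rough sketch of the general idea on the interval [0, 1]. It assumes a bisquare kernel, dyadic knots (2^r per resolution r), and a simplified stand-in for a tree shrinkage prior in which each node's prior scale is its parent's scale multiplied by a random factor in (0, 1). All function names and numeric constants here are hypothetical choices for illustration, not the authors' specification.

```python
import random


def bisquare(d, bandwidth):
    """Compactly supported bisquare kernel; zero beyond the bandwidth."""
    u = abs(d) / bandwidth
    return (1.0 - u * u) ** 2 if u < 1.0 else 0.0


def knots(r):
    """Centres of the 2**r dyadic intervals of [0, 1] at resolution r."""
    n = 2 ** r
    return [(j + 0.5) / n for j in range(n)]


def tree_scales(max_res, rng):
    """Prior standard deviations arranged on a binary tree.

    Each node's scale is its parent's scale times a factor in (0, 1),
    so scales can only shrink as resolution grows. The uniform factor
    is an illustrative assumption, not the prior from the paper.
    """
    scales = [[1.0]]  # resolution 0: a single root node
    for r in range(1, max_res + 1):
        parent = scales[r - 1]
        scales.append(
            [parent[j // 2] * rng.uniform(0.1, 0.9) for j in range(2 ** r)]
        )
    return scales


def draw_surface(xs, max_res, seed=0):
    """Draw one random multiscale kernel-convolution curve at points xs."""
    rng = random.Random(seed)
    scales = tree_scales(max_res, rng)
    # Gaussian coefficients whose spread is set by the tree of scales.
    coefs = [[rng.gauss(0.0, s) for s in level] for level in scales]
    values = []
    for x in xs:
        total = 0.0
        for r in range(max_res + 1):
            bandwidth = 1.5 / 2 ** r  # narrower kernels at finer resolutions
            for centre, beta in zip(knots(r), coefs[r]):
                total += bisquare(x - centre, bandwidth) * beta
        values.append(total)
    return values, scales


if __name__ == "__main__":
    grid = [i / 10 for i in range(11)]
    surface, scales = draw_surface(grid, max_res=3, seed=1)
    print([round(v, 3) for v in surface])
```

Because a child's scale is tied to its parent's, a sub-domain whose coarse coefficient is heavily shrunk also has its finer coefficients shrunk, which is one way to read the abstract's claim that the resolution adapts locally. Coefficients attached to disjoint subtrees touch different parts of the domain, which hints at why local updates could be run in parallel.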
