Incorporating spatial structure into inclusion probabilities for Bayesian variable selection in generalized linear models with the spike-and-slab elastic net.

Abstract: Spike-and-slab priors model each predictor's coefficient as arising from a mixture of two distributions: a slab for predictors that should remain in the model and a spike for those that should not. The spike-and-slab lasso (SSL) uses a mixture of double exponential distributions, extending the single lasso penalty by imposing different penalties on parameters according to their inclusion probabilities. The SSL has been extended to generalized linear models (GLMs) for applications in genetics and genomics. It can handle many highly correlated predictors of a scalar outcome, but it does not incorporate the relationships among predictors into variable selection. When images or other spatial data are used to model a scalar outcome, relevant parameters tend to cluster spatially, so model performance may benefit from incorporating spatial structure into variable selection. We propose to incorporate spatial information by assigning intrinsic autoregressive priors to the logit prior probabilities of inclusion, which results in more similar shrinkage penalties among spatially adjacent parameters. Because fitting Bayesian models with MCMC can be computationally prohibitive for large-scale data, we instead fit the model by adapting a computationally efficient coordinate-descent-based EM algorithm. A simulation study and an application to Alzheimer's disease imaging data show that incorporating spatial information can improve model fit.
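The following minimal sketch illustrates how inclusion probabilities translate into parameter-specific penalties, and how an intrinsic autoregressive (ICAR) prior pulls the logit of each prior inclusion probability toward the average of its neighbors. It is not the authors' EM algorithm. The function names, the scale parameterization of the double exponential (density exp(-|b|/s) / (2s), with spike scale s0 smaller than slab scale s1), the one-dimensional chain of five locations, and all numeric values are assumptions made for illustration only.

```python
import math


def double_exp_density(beta, scale):
    """Double exponential (Laplace) density centered at 0 (assumed parameterization)."""
    return math.exp(-abs(beta) / scale) / (2.0 * scale)


def inclusion_probability(beta, theta, s0, s1):
    """Conditional probability that beta came from the slab, given prior probability theta."""
    slab = theta * double_exp_density(beta, s1)
    spike = (1.0 - theta) * double_exp_density(beta, s0)
    return slab / (slab + spike)


def penalty_weight(p, s0, s1):
    """Expected inverse scale: a larger inclusion probability gives a weaker penalty."""
    return p / s1 + (1.0 - p) / s0


def icar_conditional_mean(psi, neighbors, j):
    """Under an ICAR prior, the conditional mean of psi[j] is its neighbors' average."""
    return sum(psi[k] for k in neighbors[j]) / len(neighbors[j])


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical 1-D chain of five adjacent locations (e.g., neighboring pixels).
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
psi = [-2.0, -2.0, 3.0, 2.0, 2.0]        # assumed logit prior inclusion probabilities
beta = [0.0, 0.05, 0.1, 0.8, 1.2]        # assumed current coefficient estimates
s0, s1 = 0.05, 1.0                       # assumed spike and slab scales

# One illustrative smoothing sweep: replace each logit by its ICAR conditional mean.
psi_smoothed = [icar_conditional_mean(psi, neighbors, j) for j in range(len(psi))]
theta = [expit(v) for v in psi_smoothed]

# Parameter-specific penalties implied by the spatially smoothed prior probabilities.
p = [inclusion_probability(b, t, s0, s1) for b, t in zip(beta, theta)]
weights = [penalty_weight(pj, s0, s1) for pj in p]
```

In a full fitting procedure, weights of this kind would serve as per-coefficient penalty factors in a coordinate-descent update, with the inclusion probabilities and logits re-estimated at each iteration; the single smoothing sweep here only shows the direction in which the spatial prior moves neighboring penalties toward each other.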
