Bayesian Variable Selection Using Continuous Shrinkage Priors for Nonparametric Models and Non-Gaussian Data.

WEI, RAN. Bayesian Variable Selection Using Continuous Shrinkage Priors for Nonparametric Models and Non-Gaussian Data. (Under the direction of Subhashis Ghosal and Brian Reich.)

In this thesis, we study the properties and applications of Bayesian variable selection in regression models. We focus on Bayesian shrinkage priors of global-local mixture form, which we use to select an appropriate subset of covariates.

First, we propose a Bayesian nonparametric regression model with a multivariate continuous shrinkage prior. The approach generalizes the commonly used linear regression model to a sum of nonlinear main effects and two-way interaction terms, and applies the proposed computationally advantageous continuous shrinkage prior to identify important effects. We construct a multivariate Dirichlet-Laplace prior that aggressively shrinks many of the terms towards zero, thus mitigating the noise from including unimportant exposures and allowing us to isolate the effects of important exposures. Theoretical studies establish asymptotic prediction and variable selection consistency, while numerical simulations assess prediction and variable selection performance under practical scenarios. The method is applied to neurobehavioral data from the Agricultural Health Study, which investigates the associations between pesticide use and neurobehavioral outcomes in farmers. The proposed method shows improved accuracy in predicting the joint effects on neurobehavioral responses, while restricting the number of covariates included in the model through variable selection.

Next, we investigate the contraction properties of shrinkage priors in the logistic regression model when the number of covariates is high. For a prior distribution that is heavy-tailed and places large probability mass around zero, the logistic regression coefficient estimates concentrate asymptotically around the true sparse vector, at a contraction rate measured in L2 error. It is shown that the proposed contraction rate is comparable with that of the point mass prior studied in Atchadé (2017). A simulation study under the logistic regression model supports the theoretical results: shrinkage priors such as the horseshoe prior and the Dirichlet-Laplace prior perform similarly to the point mass prior in estimation, variable selection and prediction, and yield much better results than the Bayesian lasso and the non-informative normal prior.

Last, we propose a spatial extension of the Bayesian shrinkage prior for calibrating forecasts of fine particulate matter from wildland fire smoke. To improve the forecast, we expand the space-time forecasts by computing spatial summaries of the forecast in surrounding areas and use the constructed covariates in a multivariate regression model. We apply an additive nonparametric regression model to delineate the associations between the fine particulate measurements at monitoring stations and the multivariate spatial summaries. Our proposed model incorporates both spatial variation in the additive nonparametric effects and Bayesian variable selection across locations. The simulation study evaluates the additive regression model under different model assumptions (nonlinearity, spatial dependence in the mean functions and residuals) and demonstrates its advantages in both prediction and variable selection. The regression model is used to downscale the fine particulate matter forecasts of wildland fire smoke in Washington state in 2015, and we find an improvement of 18% in mean squared error compared to the standard linear regression calibration.

© Copyright 2017 by Ran Wei
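To make the shrinkage behaviour described in the abstract concrete, the following is a minimal sketch of drawing from the standard (univariate) Dirichlet-Laplace prior of reference [41], not the multivariate construction developed in the thesis. It assumes the usual hierarchy: local weights phi ~ Dirichlet(a, ..., a), a global scale tau ~ Gamma(p a, rate 1/2), and coefficients theta_j ~ Laplace(0, phi_j tau). The function name, the values of p and a, and the seed are illustrative assumptions.

```python
import numpy as np


def sample_dirichlet_laplace(p, a, n_draws, rng):
    """Draw n_draws coefficient vectors of length p from a DL_a prior (sketch)."""
    # Local weights on the simplex: phi ~ Dirichlet(a, ..., a).
    phi = rng.dirichlet(np.full(p, a), size=n_draws)        # shape (n_draws, p)
    # Global scale: tau ~ Gamma(shape = p * a, rate = 1/2), i.e. scale = 2.
    tau = rng.gamma(shape=p * a, scale=2.0, size=n_draws)   # shape (n_draws,)
    # Conditionally on (phi, tau): theta_j ~ Laplace(0, phi_j * tau).
    scale = phi * tau[:, None]
    return rng.laplace(loc=0.0, scale=scale)


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # assumed seed, for reproducibility only
    theta = sample_dirichlet_laplace(p=100, a=0.5, n_draws=2000, rng=rng)
    abs_theta = np.abs(theta)
    # Summaries of |theta|: most draws are small, a few are large.
    print("median |theta|:", np.median(abs_theta))
    print("mean   |theta|:", abs_theta.mean())
    print("99.9% quantile:", np.quantile(abs_theta, 0.999))
```

Under these assumed settings, the median absolute draw should fall well below the mean, which reflects a prior that puts much of its mass near zero while keeping heavy tails for the few large coefficients.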

[1]  Curtis B. Storlie,et al.  Variable Selection in Bayesian Smoothing Spline ANOVA Models: Application to Deterministic Computer Codes , 2009, Technometrics.

[2]  Subhashis Ghosal,et al.  Asymptotic normality of posterior distributions in high-dimensional linear models , 1999 .

[3]  Montserrat Fuentes,et al.  Model Evaluation and Spatial Interpolation by Bayesian Combination of Observations with Outputs from Numerical Models , 2005, Biometrics.

[4]  Howard D. Bondell,et al.  High Dimensional Linear Regression via the R2-D2 Shrinkage Prior , 2016, 1609.00046.

[5]  Brent A. Coull,et al.  Bayesian kernel machine regression for estimating the health effects of multi-pollutant mixtures , 2013 .

[6]  Nicholas G. Polson,et al.  The Horseshoe+ Estimator of Ultra-Sparse Signals , 2015, 1502.00560.

[7]  James A. Coan,et al.  Spatial Bayesian variable selection and grouping for high-dimensional scalar-on-image regression , 2015, 1509.04069.

[8]  L. Fahrmeir,et al.  Spatial Bayesian Variable Selection With Application to Functional Magnetic Resonance Imaging , 2007 .

[9]  B. Reich,et al.  Scalar‐on‐image regression via the soft‐thresholded Gaussian process , 2016, Biometrika.

[10]  Jaeyong Lee,et al.  Generalized Double Pareto Shrinkage , 2011, Statistica Sinica.

[11]  Van Der Vaart,et al.  The Horseshoe Estimator: Posterior Concentration around Nearly Black Vectors , 2014, 1404.0202.

[12]  R. Kohn,et al.  Nonparametric regression using Bayesian variable selection , 1996 .

[13]  Stephen G. Walker,et al.  Empirical Bayes posterior concentration in sparse high-dimensional linear models , 2014, 1406.7718.

[14]  J. Griffin,et al.  Inference with normal-gamma prior distributions in regression problems , 2010 .

[15]  Ah Chung Tsoi,et al.  Face recognition: a convolutional neural-network approach , 1997, IEEE Trans. Neural Networks.

[16]  J. S. Rao,et al.  Spike and slab variable selection: Frequentist and Bayesian strategies , 2005, math/0505633.

[17]  Yves F. Atchadé On the contraction properties of some high-dimensional quasi-posterior distributions , 2015, 1508.07929.

[18]  J. Friedman,et al.  Multivariate adaptive regression splines , 1991 .

[19]  James G. Scott,et al.  The horseshoe estimator for sparse signals , 2010 .

[20]  A. Armagan,et al.  Posterior consistency in linear models under shrinkage priors , 2013 .

[21]  Montserrat Fuentes,et al.  Spatial variable selection methods for investigating acute health effects of fine particulate matter components , 2015, Biometrics.

[22]  Kenny Q. Ye,et al.  Variable Selection for Gaussian Process Models in Computer Experiments , 2006, Technometrics.

[23]  A. V. D. Vaart,et al.  Convergence rates of posterior distributions , 2000 .

[24]  Aad van der Vaart,et al.  How many needles in the haystack? Adaptive inference and uncertainty quantification for the horseshoe , 2016 .

[25]  A. V. D. Vaart,et al.  Bayesian Linear Regression with Sparse Priors , 2014, 1403.0735.

[26]  Donald McKenzie,et al.  Mapping fuels at multiple scales: landscape application of the fuel characteristic classification system. , 2007 .

[27]  James G. Scott,et al.  Handling Sparsity via the Horseshoe , 2009, AISTATS.

[28]  C. F. Sirmans,et al.  Spatial Modeling With Spatially Varying Coefficient Processes , 2003 .

[29]  Howard H. Chang,et al.  A spectral method for spatial downscaling , 2014, Biometrics.

[30]  R. Draxler,et al.  NOAA’s HYSPLIT Atmospheric Transport and Dispersion Modeling System , 2015 .

[31]  Alan E Gelfand,et al.  A Spatio-Temporal Downscaler for Output From Numerical Models , 2010, Journal of agricultural, biological, and environmental statistics.

[32]  T. Maiti,et al.  Additive model building for spatial regression , 2017 .

[33]  A. V. D. Vaart,et al.  Needles and Straw in a Haystack: Posterior concentration for possibly sparse sequences , 2012, 1211.1197.

[34]  Hao Helen Zhang,et al.  Component selection and smoothing in multivariate nonparametric regression , 2006, math/0702659.

[35]  Daniel F. Schmidt,et al.  High-Dimensional Bayesian Regularised Regression with the BayesReg Package , 2016, 1611.06649.

[36]  T. J. Mitchell,et al.  Bayesian Variable Selection in Linear Regression , 1988 .

[37]  R. Draxler An Overview of the HYSPLIT_4 Modelling System for Trajectories, Dispersion, and Deposition , 1998 .

[38]  Yun Yang,et al.  Minimax-optimal nonparametric regression in high dimensions , 2014, 1401.7278.

[39]  S. Ghosal,et al.  Adaptive Bayesian density regression for high-dimensional data , 2014, 1403.2695.

[40]  F. Gerr,et al.  Peripheral Nervous System Function and Organophosphate Pesticide Use among Licensed Pesticide Applicators in the Agricultural Health Study , 2012, Environmental health perspectives.

[41]  N. Pillai,et al.  Dirichlet–Laplace Priors for Optimal Shrinkage , 2014, Journal of the American Statistical Association.

[42]  M. Fuentes,et al.  Bayesian Variable Selection for Multivariate Spatially Varying Coefficient Regression , 2010, Biometrics.

[43]  E. George,et al.  Journal of the American Statistical Association is currently published by American Statistical Association. , 2007 .

[44]  Sangram Ganguly,et al.  DeepSD: Generating High Resolution Climate Change Projections through Single Image Super-Resolution , 2017, KDD.

[45]  F. Liang,et al.  Nearly optimal Bayesian shrinkage for high-dimensional regression , 2017, Science China Mathematics.

[46]  A. Gelfand,et al.  A bivariate space-time downscaler under space and time misalignment. , 2010, The annals of applied statistics.

[47]  Alan E Gelfand,et al.  Space‐Time Data fusion Under Error in Computer Model Output: An Application to Modeling Air Quality , 2012, Biometrics.

[48]  Thomas S. Shively,et al.  Model selection in spline nonparametric regression , 2002 .

[49]  Ciprian M Crainiceanu,et al.  Smooth Scalar-on-Image Regression via Spatial Bayesian Variable Selection , 2014, Journal of computational and graphical statistics : a joint publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America.

[50]  D. Sullivan,et al.  The BlueSky smoke modeling framework , 2008 .

[51]  L. Mark Berliner,et al.  Combining Information Across Spatial Scales , 2005, Technometrics.

[52]  Subhashis Ghosal,et al.  Supremum Norm Posterior Contraction and Credible Sets for Nonparametric Multivariate Regression , 2014, 1411.6716.

[53]  F. Gerr,et al.  High pesticide exposure events and central nervous system function among pesticide applicators in the Agricultural Health Study , 2012, International Archives of Occupational and Environmental Health.

[54]  James G. Scott,et al.  Bayesian Inference for Logistic Models Using Pólya–Gamma Latent Variables , 2012, 1205.0310.

[55]  N. Fann,et al.  Forecast-based interventions can reduce the health and economic burden of wildfires. , 2014, Environmental science & technology.

[56]  Subhashis Ghosal,et al.  Fast Bayesian model assessment for nonparametric additive regression , 2014, Comput. Stat. Data Anal..

[57]  E. Belitser,et al.  Needles and straw in a haystack: robust empirical Bayes confidence for possibly sparse sequences , 2015 .

[58]  G. Casella,et al.  The Bayesian Lasso , 2008 .

[59]  C. Carvalho,et al.  Decoupling Shrinkage and Selection in Bayesian Linear Models: A Posterior Summary Perspective , 2014, 1408.0464.

[60]  Marina Vannucci,et al.  Variable Selection for Nonparametric Gaussian Process Priors: Models and Computational Strategies. , 2011, Statistical science : a review journal of the Institute of Mathematical Statistics.

[61]  D. Pati,et al.  Adaptive Bayesian Estimation of Conditional Densities , 2014, Econometric Theory.

[62]  S. Ghosal Asymptotic Normality of Posterior Distributions for Exponential Families when the Number of Parameters Tends to Infinity , 2000 .

[63]  Brian S Caffo,et al.  Spatial Bayesian Variable Selection Models on Functional Magnetic Resonance Imaging Time-Series Data. , 2014, Bayesian analysis.