Bayesian sparse linear regression with unknown symmetric error

We study Bayesian procedures for sparse linear regression in which the unknown error distribution is endowed with a non-parametric prior. Specifically, we put a symmetrized Dirichlet process mixture of Gaussians prior on the error density, with compactly supported mixing distributions. The prior on the regression coefficients is a mixture of a point mass at zero and a continuous distribution. Assuming that the model is well specified, we study the behavior of the posterior as the number of predictors diverges. Under the compatibility and restricted eigenvalue conditions, the posterior of the regression coefficients attains the minimax convergence rate in the $\ell _1$- and $\ell _2$-norms, respectively. In addition, strong model selection consistency and a semi-parametric Bernstein–von Mises theorem are proven under slightly stronger conditions.
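
For orientation, the model and priors described above can be written schematically as follows. This is only a sketch under assumed notation: the symbols $x_i$, $\theta$, $\eta$, $F$, $\alpha$, $H$, $w$ and $g$ are illustrative choices rather than notation taken from the paper, and the exact hierarchy used there may differ.

$$
Y_i = x_i^{\top}\theta + \epsilon_i, \qquad \epsilon_i \overset{\text{iid}}{\sim} \eta, \qquad i = 1, \dots, n,
$$

$$
\eta(y) = \int \phi_{\sigma}(y - z)\, d\bar F(z, \sigma), \qquad \bar F = \tfrac{1}{2}\left(F + F^{-}\right), \qquad F \sim \mathrm{DP}(\alpha, H),
$$

$$
\theta_j \mid w \overset{\text{ind}}{\sim} (1 - w)\,\delta_0 + w\, g, \qquad j = 1, \dots, p.
$$

Here $\phi_{\sigma}$ is the $N(0, \sigma^2)$ density, $F^{-}$ denotes the image of $F$ under $(z, \sigma) \mapsto (-z, \sigma)$, and the base measure $H$ is assumed to have compact support. Because $\bar F$ is invariant under this reflection, $\eta(-y) = \eta(y)$, so the error density is symmetric about zero by construction. The point mass $\delta_0$ and the continuous density $g$ are the two components of the mixture prior on each coefficient.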
