Priors on the Variance in Sparse Bayesian Learning; the demi-Bayesian Lasso

We explore the use of proper priors for the variance parameters of certain sparse Bayesian regression models. This leads to a connection between sparse Bayesian learning (SBL) models (Tipping, 2001) and the recently proposed Bayesian Lasso (Park and Casella, 2008). We outline simple modifications of existing algorithms to handle this new variant, which essentially fits the Bayesian Lasso model by type-II maximum likelihood. We also propose an elastic-net (Zou and Hastie, 2005) heuristic to help with modeling correlated inputs. Experimental results show that the proposals compare favorably with the Lasso and with both traditional and more recent sparse Bayesian algorithms.

[1]  James O. Berger,et al.  Statistical Decision Theory and Bayesian Analysis, Second Edition , 1985 .

[2]  Matthew West,et al.  Bayesian factor regression models in the''large p , 2003 .

[3]  G. Casella,et al.  The Bayesian Lasso , 2008 .

[4]  T. Stamey,et al.  Prostate specific antigen in the diagnosis and treatment of adenocarcinoma of the prostate. II. Radical prostatectomy treated patients. , 1989, The Journal of urology.

[5]  George Eastman House,et al.  Sparse Bayesian Learning and the Relevance Vector Machine , 2001 .

[6]  Dirk Husmeier Automatic Relevance Determination (ARD) , 1999 .

[7]  H. Zou The Adaptive Lasso and Its Oracle Properties , 2006 .

[8]  R. Tibshirani,et al.  Least angle regression , 2004, math/0406456.

[9]  H. Zou,et al.  Regularization and variable selection via the elastic net , 2005 .

[10]  Michael E. Tipping,et al.  Fast Marginal Likelihood Maximisation for Sparse Bayesian Models , 2003 .

[11]  Michael E. Tipping Sparse Bayesian Learning and the Relevance Vector Machine , 2001, J. Mach. Learn. Res..

[12]  T. Fearn,et al.  The choice of variables in multivariate regression: a non-conjugate Bayesian decision theory approach , 1999 .

[13]  R. Tibshirani Regression Shrinkage and Selection via the Lasso , 1996 .

[14]  David J. C. MacKay,et al.  Bayesian Interpolation , 1992, Neural Computation.

[15]  Robert Tibshirani,et al.  The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition , 2001, Springer Series in Statistics.

[16]  David Madigan,et al.  Large-Scale Bayesian Logistic Regression for Text Categorization , 2007, Technometrics.

[17]  David P. Wipf,et al.  A New View of Automatic Relevance Determination , 2007, NIPS.