Generalized Beta Mixtures of Gaussians

In recent years, a rich variety of shrinkage priors has been proposed that holds great promise for addressing massive regression problems. In general, these new priors can be expressed as scale mixtures of normals, but they have more complex forms and better properties than traditional Cauchy and double-exponential priors. We first propose a new class of normal scale mixtures through a novel generalized beta distribution that encompasses many interesting priors as special cases. This encompassing framework should prove useful for comparing competing priors, examining their properties, and revealing close connections among them. We then develop a class of variational Bayes approximations through the proposed hierarchy that scale more efficiently to the types of truly massive data sets that are now encountered routinely.
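As a concrete illustration of a normal scale mixture, consider the horseshoe prior, one well-known special case covered by an encompassing beta-type family. It can be sampled in two equivalent ways: through a half-Cauchy prior on the scale, or through a Beta(1/2, 1/2) prior on the shrinkage coefficient kappa = 1/(1 + lambda^2). The sketch below is illustrative only; the function names are not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)


def horseshoe_via_half_cauchy(n, rng):
    """Sample theta ~ N(0, lambda^2) with lambda ~ C+(0, 1)."""
    lam = np.abs(rng.standard_cauchy(n))   # half-Cauchy scale
    return rng.normal(0.0, lam)


def horseshoe_via_beta_shrinkage(n, rng):
    """Equivalent construction: kappa = 1/(1 + lambda^2) ~ Beta(1/2, 1/2)."""
    kappa = rng.beta(0.5, 0.5, size=n)     # shrinkage coefficient in (0, 1)
    lam = np.sqrt((1.0 - kappa) / kappa)   # invert kappa = 1/(1 + lam^2)
    return rng.normal(0.0, lam)


theta1 = horseshoe_via_half_cauchy(100_000, rng)
theta2 = horseshoe_via_beta_shrinkage(100_000, rng)
# Both constructions yield the same marginal prior on theta:
# heavy Cauchy-like tails combined with a pole at zero.
```

Generalizing the beta mixing distribution (e.g., allowing its shape parameters and an extra scale parameter to vary) moves continuously between such special cases, which is what makes an encompassing framework convenient for comparing priors.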
