On the half-Cauchy prior for a global scale parameter

This paper argues that the half-Cauchy distribution should replace the inverse-gamma distribution as a default prior for a top-level scale parameter in Bayesian hierarchical models, at least for cases where a proper prior is necessary. Our arguments involve a blend of Bayesian and frequentist reasoning, and are intended to complement the original case made by Gelman (2006) in support of the folded-t family of priors. First, we generalize the half-Cauchy prior to the wider class of hypergeometric inverted-beta priors. We derive expressions for posterior moments and marginal densities when these priors are used for a top-level normal variance in a Bayesian hierarchical model. We go on to prove a proposition that, together with the results for moments and marginals, allows us to characterize the frequentist risk of the Bayes estimators under all global-shrinkage priors in the class. These theoretical results, in turn, allow us to study the frequentist properties of the half-Cauchy prior versus a wide class of alternatives. The half-Cauchy occupies a sensible “middle ground” within this class: it performs very well near the origin, but does not lead to drastic compromises in other parts of the parameter space. This provides an alternative, classical justification for the repeated, routine use of this prior. We also consider situations where the underlying mean vector is sparse, where we argue that the usual conjugate choice of an inverse-gamma prior is particularly inappropriate, and can lead to highly distorted posterior inferences. Finally, we briefly summarize some open issues in the specification of default priors for scale terms in hierarchical models.
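The contrast the abstract draws near the origin can be seen directly from the two prior densities. The following is a minimal numerical sketch (not from the paper): it compares a standard half-Cauchy prior on the scale τ against a nominally vague conjugate inverse-gamma prior IG(ε, ε) on the variance τ², a common default; the ε = 0.01 value and variable names are illustrative assumptions.

```python
import numpy as np
from scipy import stats

# Half-Cauchy prior on the scale tau (unit scale hyperparameter, for illustration).
half_cauchy = stats.halfcauchy(scale=1.0)

# "Vague" conjugate inverse-gamma prior IG(eps, eps) on the variance tau^2,
# a common default choice; eps = 0.01 is an illustrative value.
eps = 0.01
inv_gamma = stats.invgamma(eps, scale=eps)

tau = np.array([1e-4, 1e-2, 0.1, 1.0])

# The half-Cauchy density on tau is bounded and strictly positive at the
# origin, so it does not rule out scales near zero.
print(half_cauchy.pdf(tau))

# The inverse-gamma density on tau^2 vanishes extremely fast as tau -> 0,
# so the prior pushes the group-level variance away from zero even when the
# data favor near-complete pooling -- the distortion the abstract warns about.
print(inv_gamma.pdf(tau**2))
```

Running this shows the half-Cauchy density staying near 2/π ≈ 0.64 as τ → 0, while the inverse-gamma density on τ² collapses to numerical zero for small τ.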
