Hierarchical Bayes, maximum a posteriori estimators, and minimax concave penalized likelihood estimation

Priors constructed from scale mixtures of normal distributions have long played an important role in decision theory and shrinkage estimation. This paper demonstrates the equivalence between the maximum a posteriori estimator constructed under one such prior and Zhang’s minimax concave penalization estimator. This equivalence and related multivariate generalizations stem directly from an intriguing representation of the minimax concave penalty function as the Moreau envelope of a simple convex function. Maximum a posteriori estimation under the corresponding marginal prior distribution, a generalization of the quasi-Cauchy distribution proposed by Johnstone and Silverman, leads to thresholding estimators having excellent frequentist risk properties.

AMS 2000 subject classifications: Primary 62C60, 62J07.
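The Moreau-envelope representation mentioned in the abstract can be checked numerically. The sketch below is illustrative only and rests on two assumptions not spelled out in the abstract: that the penalty uses Zhang's usual parameterization with threshold lam > 0 and concavity parameter gamma > 1, and that the "simple convex function" is s -> s*|t| restricted to s >= 0, enveloped in the lam argument with envelope parameter 1/gamma. All function names are hypothetical. The last function is the resulting thresholding rule (firm shrinkage) for a scalar normal mean with unit variance.

```python
import numpy as np


def mcp_penalty(t, lam, gamma):
    """Closed form of the minimax concave penalty (assumed parameterization)."""
    a = np.abs(t)
    return np.where(a <= gamma * lam,
                    lam * a - a**2 / (2.0 * gamma),
                    0.5 * gamma * lam**2)


def mcp_via_envelope(t, lam, gamma):
    """Moreau-envelope form: min over s >= 0 of s*|t| + (gamma/2)*(s - lam)**2.

    The minimiser is s* = max(lam - |t|/gamma, 0); this is an assumed
    reading of the representation described in the abstract.
    """
    a = np.abs(t)
    s_star = np.maximum(lam - a / gamma, 0.0)
    return s_star * a + 0.5 * gamma * (s_star - lam)**2


def firm_threshold(z, lam, gamma):
    """Minimiser over t of 0.5*(z - t)**2 + mcp_penalty(t, lam, gamma), gamma > 1."""
    a = np.abs(z)
    shrunk = np.sign(z) * np.maximum(a - lam, 0.0) / (1.0 - 1.0 / gamma)
    return np.where(a <= gamma * lam, shrunk, z)


if __name__ == "__main__":
    lam, gamma = 1.0, 3.0
    t = np.linspace(-5.0, 5.0, 201)
    gap = np.max(np.abs(mcp_penalty(t, lam, gamma) - mcp_via_envelope(t, lam, gamma)))
    print("max |closed form - envelope form|:", gap)
    print("thresholded values:", firm_threshold(np.array([0.5, 2.0, 4.0]), lam, gamma))
```

With lam = 1 and gamma = 3, observations below the threshold are set to zero, moderate observations are shrunk, and observations beyond gamma*lam are left unchanged, which is the behaviour expected of a thresholding estimator with small bias for large signals.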

[1]  J. Griffin,et al.  Bayesian adaptive lassos with non-convex penalization , 2007 .

[2]  M. Yuan,et al.  Model selection and estimation in regression with grouped variables , 2006 .

[3]  J. Griffin,et al.  Inference with normal-gamma prior distributions in regression problems , 2010 .

[4]  Richard Courant,et al.  Wiley Classics Library , 2011 .

[5]  Michael E. Tipping.  Sparse Bayesian Learning and the Relevance Vector Machine , 2001 .

[6]  R. Strawderman,et al.  On hierarchical prior specifications and penalized likelihood , 2012 .

[7]  Chris Hans.  Bayesian lasso regression , 2009 .

[8]  T. Hastie,et al.  SparseNet: Coordinate Descent With Nonconvex Penalties , 2011, Journal of the American Statistical Association.

[9]  Sanjo Zlobec,et al.  Estimating convexifiers in continuous optimization , 2003 .

[10]  Jianqing Fan,et al.  Regularization of Wavelet Approximations , 2001 .

[11]  Ming-Hui Chen,et al.  Propriety of posterior distribution for dichotomous quantal response models , 2000 .

[12]  Cun-Hui Zhang.  Nearly unbiased variable selection under minimax concave penalty , 2010, 1002.4734.

[13]  Stein's positive part estimator and bayes estimator , 1979 .

[14]  Christian P. Robert,et al.  The Bayesian choice , 1994 .

[15]  James G. Scott,et al.  Shrink Globally, Act Locally: Sparse Bayesian Regularization and Prediction , 2022 .

[16]  James O. Berger,et al.  Subjective Hierarchical Bayes Estimation of a Multivariate Normal Mean: On the Frequentist Interface , 1990 .

[17]  W. Strawderman.  Proper Bayes Minimax Estimators of the Multivariate Normal Mean , 1971 .

[18]  J. Berger,et al.  Choice of hierarchical priors: admissibility in estimation of normal means , 1996 .

[19]  I. Johnstone,et al.  Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences , 2004, math/0410088.

[20]  G. C. Tiao,et al.  Bayesian inference in statistical analysis , 1973 .

[21]  Hong-Ye Gao,et al.  Applied wavelet analysis with S-plus , 1996 .

[22]  Jaeyong Lee,et al.  Generalized double Pareto shrinkage , 2011, Statistica Sinica.

[23]  Elizabeth D. Schifano,et al.  Topics In Penalized Estimation , 2010 .

[24]  A. Bruce,et al.  WaveShrink with firm shrinkage , 1997 .

[25]  James O. Berger,et al.  Posterior propriety and admissibility of hyperpriors in normal hierarchical models , 2005, math/0505605.

[26]  H. Zou,et al.  One-step Sparse Estimates in Nonconcave Penalized Likelihood Models. , 2008, Annals of statistics.

[27]  M. Wells,et al.  On the construction of Bayes minimax estimators , 1998 .

[28]  L. Wasserman,et al.  The Selection of Prior Distributions by Formal Rules , 1996 .

[29]  G. Casella,et al.  The Bayesian Lasso , 2008 .

[30]  D. Cox.  Regression Models and Life-Tables , 1972 .

[31]  Ker-Chau Li,et al.  From Stein's Unbiased Risk Estimates to the Method of Generalized Cross Validation , 1985 .

[32]  R. Tibshirani.  Regression Shrinkage and Selection via the Lasso , 1996 .

[33]  Joseph G. Ibrahim,et al.  Posterior propriety and computation for the Cox regression model with applications to missing covariates , 2006 .

[34]  James G. Scott,et al.  The horseshoe estimator for sparse signals , 2010 .

[35]  R. Strawderman,et al.  Majorization-Minimization algorithms for nonsmoothly penalized objective functions , 2010 .

[36]  Jian Huang,et al.  Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection , 2011, The Annals of Applied Statistics.

[37]  Árpád Baricz,et al.  Mills' ratio: Monotonicity patterns and functional inequalities , 2008 .

[38]  M. Sampford Some Inequalities on Mill's Ratio and Related Functions , 1953 .

[39]  Jian Huang,et al.  Penalized methods for bi-level variable selection. , 2009, Statistics and its interface.

[40]  Miguel A. Gómez-Villegas,et al.  Multivariate Exponential Power Distributions as Mixtures of Normal Distributions with Bayesian Applications , 2008 .

[41]  Jianqing Fan,et al.  Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties , 2001 .

[42]  Irene A. Stegun,et al.  Handbook of Mathematical Functions. , 1966 .

[43]  Bastian Goldlücke,et al.  Variational Analysis , 2014, Computer Vision, A Reference Guide.