Calibration and empirical Bayes variable selection

For the problem of variable selection for the normal linear model, selection criteria such as AIC, C_p, BIC and RIC have fixed dimensionality penalties. Such criteria are shown to correspond to selection of maximum posterior models under implicit hyperparameter choices for a particular hierarchical Bayes formulation. Based on this calibration, we propose empirical Bayes selection criteria that use hyperparameter estimates instead of fixed choices. For obtaining these estimates, both marginal and conditional maximum likelihood methods are considered. As opposed to traditional fixed-penalty criteria, these empirical Bayes criteria have dimensionality penalties that depend on the data. Their performance is seen to approximate adaptively the performance of the best fixed-penalty criterion across a variety of orthogonal and nonorthogonal set-ups, including wavelet regression. Empirical Bayes shrinkage estimators of the selected coefficients are also proposed.
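To make the fixed-penalty criteria concrete: in an orthogonal design with known error variance, each of AIC/C_p, BIC and RIC reduces to hard thresholding of the squared t-statistics at a fixed cutoff F (F = 2, log n, and 2 log p respectively), which is why their relative performance depends on the unknown sparsity of the true model. The following is a minimal sketch of that reduction; the simulated data, dimensions, and seed are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma = 100, 20, 1.0  # illustrative sizes; sigma assumed known

# Orthogonal design: orthonormal columns scaled so that X'X = n * I.
X = np.linalg.qr(rng.standard_normal((n, p)))[0] * np.sqrt(n)
beta = np.zeros(p)
beta[:5] = 1.0  # a sparse truth with five nonzero coefficients
y = X @ beta + sigma * rng.standard_normal(n)

# With X'X = n*I the least-squares estimates decouple coordinatewise.
bhat = X.T @ y / n
t2 = n * bhat**2 / sigma**2  # squared t-statistics

# Each fixed-penalty criterion keeps variable i iff t_i^2 > F.
penalties = {"AIC/Cp": 2.0, "BIC": np.log(n), "RIC": 2.0 * np.log(p)}
for name, F in penalties.items():
    print(name, np.flatnonzero(t2 > F))
```

Because the cutoffs here are ordered (2 < log 100 < 2 log 20), RIC always selects a subset of what BIC selects, which in turn is a subset of AIC's selection; the empirical Bayes criteria in the paper instead let the data choose the effective penalty.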
