Steinized Empirical Bayes Estimation for Heteroscedastic Data

Consider the problem of estimating normal means from independent observations with known, possibly different, variances. Suppose that a second-level normal model is specified on the unknown means, with the prior means depending on a vector of covariates and the prior variances constant. For this two-level normal model, existing empirical Bayes methods are constructed from the Bayes rule with the prior parameters selected by maximum likelihood, by moment equations, or by minimizing Stein’s unbiased risk estimate. Such methods tend to deteriorate, sometimes substantially, when the second-level model is misspecified. We develop a Steinized empirical Bayes approach that improves robustness to misspecification of the second-level model while preserving the effectiveness in risk reduction when the second-level model appropriately captures the unknown means. The proposed methods are constructed from a minimax Bayes estimator or, interpreted by its form, a Steinized Bayes estimator, which is not only globally minimax but also achieves close to the minimum Bayes risk over a scale class of normal priors including the specified prior. The prior parameters are then estimated by standard moment methods. We provide formal results showing that the proposed methods yield asymptotic risks no greater than those of existing methods using the same estimates of prior parameters, without requiring the second-level model to be correct. We present an application to predicting baseball batting averages and two simulation studies to demonstrate the practical advantage of the proposed methods.
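
For orientation, the conventional empirical Bayes rule that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration, not the proposed Steinized estimator: it assumes the two-level model Y_j ~ N(theta_j, d_j), theta_j ~ N(x_j' beta, gamma), and estimates gamma by one simple moment equation based on ordinary least squares residuals. The function name `eb_shrinkage` and the symbols `y`, `d`, `X`, `gamma` are hypothetical choices made for this sketch.

```python
import numpy as np

def eb_shrinkage(y, d, X):
    """Baseline empirical Bayes shrinkage for a two-level normal model.

    Assumed model (notation chosen for this sketch only):
        y[j] ~ N(theta[j], d[j]),  theta[j] ~ N(X[j] @ beta, gamma),
    with the variances d[j] known. Returns (theta_hat, beta_hat, gamma_hat).
    """
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    n, q = X.shape

    # Ordinary least squares fit and leverages h_jj of the hat matrix.
    xtx_inv = np.linalg.inv(X.T @ X)
    resid = y - X @ (xtx_inv @ (X.T @ y))
    lev = np.einsum("ij,jk,ik->i", X, xtx_inv, X)

    # Moment equation: E[RSS] = sum_j d_j (1 - h_jj) + gamma (n - q),
    # solved for gamma and truncated at zero.
    gamma = max(0.0, (resid @ resid - np.sum(d * (1.0 - lev))) / (n - q))

    # Weighted least squares for beta, with weights 1 / (d_j + gamma).
    w = 1.0 / (d + gamma)
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

    # Bayes rule with plug-in parameters: shrink y toward the fitted prior mean.
    shrink = d / (d + gamma)
    theta = y - shrink * (y - X @ beta)
    return theta, beta, gamma
```

Each estimate lies between the observation and the fitted prior mean, with noisier observations (larger `d`) shrunk more heavily. When the second-level model is misspecified, this heavier shrinkage is one place where such a rule can perform poorly, which is the situation the abstract's robustness claim addresses.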
