Model Selection in Estimating Equations

Model selection is a necessary step in many practical regression analyses. For methods based on estimating equations, however, such as the quasi-likelihood and generalized estimating equation (GEE) approaches, there seem to be few well-studied model selection techniques. In this article, we propose a new model selection criterion that minimizes the expected predictive bias (EPB) of estimating equations. A bootstrap-smoothed cross-validation (BCV) estimate of EPB is presented, and its performance is assessed via simulation for overdispersed generalized linear models. For illustration, the method is applied to a real data set taken from a study of the development of ewe embryos.
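
The abstract does not spell out the EPB criterion or the BCV estimate, so the following is only a minimal sketch of one plausible reading, not the article's actual procedure. It assumes a log-link quasi-likelihood model with variance proportional to the mean, so that the estimating function is X^T (y - mu). Each candidate model is fitted on a bootstrap sample, the estimating function is evaluated on the observations left out of that sample, and the norm of the held-out average is averaged over bootstrap replicates. Evaluating the held-out estimating function over the full covariate set (so that candidates of different sizes are comparable) is an assumption of this sketch. The names `fit_log_link`, `bcv_epb`, and `demo`, and the simulated data, are hypothetical.

```python
import numpy as np


def fit_log_link(X, y, n_iter=100, tol=1e-10):
    """Solve the quasi-score equation X^T (y - mu) = 0 with mu = exp(X beta).

    Assumed working model: log link, variance proportional to the mean.
    The dispersion factor cancels out of the equation.
    """
    # A least-squares fit to log(y + 0.5) gives a stable starting value.
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]
    for _ in range(n_iter):
        mu = np.exp(np.clip(X @ beta, -30.0, 30.0))
        score = X.T @ (y - mu)
        info = X.T @ (X * mu[:, None])
        step = np.linalg.solve(info, score)  # Newton / Fisher scoring step
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def bcv_epb(X, y, cols, n_boot=100, seed=0):
    """Bootstrap cross-validation estimate of predictive bias (sketch).

    cols selects the candidate model's columns of X. The held-out
    estimating function uses all columns of X (an assumption made here).
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    Xc = X[:, cols]
    norms = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)               # bootstrap sample
        held_out = np.setdiff1d(np.arange(n), idx)     # points not drawn
        if held_out.size == 0:
            continue
        beta = fit_log_link(Xc[idx], y[idx])
        mu = np.exp(Xc[held_out] @ beta)
        u = X[held_out].T @ (y[held_out] - mu) / held_out.size
        norms.append(np.linalg.norm(u))
    return float(np.mean(norms))


def demo(seed=1, n=400):
    """Compare three candidate models on simulated overdispersed counts."""
    rng = np.random.default_rng(seed)
    x1 = rng.uniform(-1, 1, n)   # covariate with a real effect
    x2 = rng.uniform(-1, 1, n)   # pure noise covariate
    X = np.column_stack([np.ones(n), x1, x2])
    mu = np.exp(1.0 + 1.0 * x1)
    # Gamma-mixed Poisson counts: mean mu, variance larger than mu.
    y = rng.poisson(mu * rng.gamma(shape=2.0, scale=0.5, size=n))
    candidates = {"intercept": [0], "x1": [0, 1], "x1+x2": [0, 1, 2]}
    return {name: bcv_epb(X, y, cols) for name, cols in candidates.items()}


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.3f}")
```

Under this reading, the candidate with the smallest estimated value would be selected. Averaging over many bootstrap splits, rather than using a single train/validation split, is what makes the estimate "smoothed" in this sketch.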
