Model-averaged confidence intervals for factorial experiments

We consider the coverage rate of model-averaged confidence intervals for the treatment means in a factorial experiment, when a normal linear model is used in the analysis. Model-averaging provides a useful compromise between using the full model (containing all main effects and interactions) and using a "best model" obtained by some model-selection process. Use of the full model guarantees perfect coverage, whereas use of a best model is known to lead to narrow intervals with poor coverage. Model-averaging allows us to achieve good coverage using intervals that are also narrower than those from the full model. We compare three information criteria that might be used for model-averaging in this setting: AIC, AICc, and BIC. If the full model is "truth", all three criteria will have perfect coverage rates asymptotically. We use simulation to assess the coverage rates and interval widths likely to be achieved by a confidence interval with a nominal coverage of 95%. Our results suggest that AIC performs best in terms of coverage rate: across a wide range of scenarios and replication levels, it consistently provides coverage rates within 1.5 percentage points of the nominal level, while also reducing interval width by up to 30% compared to the full model. AICc performed worst overall, with a coverage rate that was up to 5.2 percentage points too low. We recommend that model-averaging become standard practice when summarising the results of a factorial experiment in terms of the treatment means, and that AIC be used to perform the model-averaging.
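As a rough illustration of how information criteria can be turned into model-averaging weights, the sketch below computes AIC, AICc and BIC from a maximised log-likelihood and converts a set of criterion values into weights proportional to exp(-delta/2), the usual information-theoretic weighting. It is a minimal sketch, not the procedure used in the paper: the function names, the log-likelihoods, the parameter counts and the per-model estimates are all hypothetical, and the construction of the model-averaged interval itself is not shown because the abstract does not specify it.

```python
import math


def information_criteria(log_lik, k, n):
    """Return AIC, AICc and BIC for a model with maximised log-likelihood
    log_lik, k estimated parameters and n observations.
    AICc is only defined when n - k - 1 > 0."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n - k - 1 > 0")
    aic = -2.0 * log_lik + 2.0 * k
    aicc = aic + (2.0 * k * (k + 1)) / (n - k - 1)
    bic = -2.0 * log_lik + k * math.log(n)
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


def model_weights(scores):
    """Convert criterion values (smaller is better) into weights that sum
    to one, using exp(-0.5 * difference from the best score)."""
    best = min(scores)
    raw = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(raw)
    return [r / total for r in raw]


def model_averaged_estimate(estimates, weights):
    """Weighted average of the per-model estimates of one treatment mean."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical candidate models for a two-factor experiment with n = 20:
# (maximised log-likelihood, number of parameters, estimated treatment mean)
candidates = [
    (-52.0, 3, 10.4),  # one main effect only (assumed values)
    (-50.5, 4, 10.9),  # both main effects (assumed values)
    (-50.0, 5, 11.2),  # full model with interaction (assumed values)
]
n_obs = 20

aic_scores = [information_criteria(ll, k, n_obs)["AIC"] for ll, k, _ in candidates]
aic_weights = model_weights(aic_scores)
averaged_mean = model_averaged_estimate([m for _, _, m in candidates], aic_weights)
```

Swapping `"AIC"` for `"AICc"` or `"BIC"` in the list comprehension gives the weights under the other two criteria, which is the kind of comparison the abstract describes.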
