A comparison of inclusive and restrictive strategies in modern missing data procedures.

Two classes of modern missing data procedures, maximum likelihood (ML) and multiple imputation (MI), tend to yield similar results when implemented in comparable ways. In either approach, it is possible to include auxiliary variables solely for the purpose of improving the missing data procedure. A simulation was presented to assess the potential costs and benefits of a restrictive strategy, which makes minimal use of auxiliary variables, versus an inclusive strategy, which makes liberal use of such variables. The simulation showed that the inclusive strategy is to be greatly preferred. With an inclusive strategy, not only is there a reduced chance of inadvertently omitting an important cause of missingness, but there is also the possibility of noticeable gains in efficiency and reductions in bias, at only minor cost. As implemented in currently available software, the ML approach tends to encourage a restrictive strategy, whereas the MI approach makes it relatively simple to use an inclusive strategy.
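
The contrast between the two strategies can be illustrated with a small toy simulation. The sketch below is not the simulation reported in the paper: the data-generating model, the coefficients, the missingness mechanism, and the function names (`simulate`, `impute_mean_y`, `run`) are all assumptions chosen for illustration, and the imputation step is a simplified stochastic regression imputation rather than a full MI procedure with parameter draws. An auxiliary variable Z both predicts the outcome Y and drives its missingness; the restrictive imputation model uses only X, while the inclusive model also uses Z.

```python
import numpy as np

def simulate(n, rng):
    # Hypothetical data-generating model (an assumption, not the paper's design).
    z = rng.normal(size=n)                       # auxiliary variable
    x = rng.normal(size=n)                       # predictor in the analysis model
    y = 0.5 * x + 0.7 * z + rng.normal(size=n)   # outcome; true mean is 0
    # Y is more likely to be missing when Z is high (MAR given Z).
    p_miss = 1.0 / (1.0 + np.exp(-1.5 * z))
    missing = rng.random(n) < p_miss
    return x, z, y, missing

def impute_mean_y(y, missing, predictors, rng, m=5):
    """Mean of Y averaged over m stochastic regression imputations.

    Simplification: regression parameters are estimated once from the
    observed cases and not redrawn for each imputation.
    """
    obs = ~missing
    design = np.column_stack([np.ones(len(y))] + predictors)
    beta, *_ = np.linalg.lstsq(design[obs], y[obs], rcond=None)
    resid = y[obs] - design[obs] @ beta
    sigma = resid.std(ddof=design.shape[1])
    estimates = []
    for _ in range(m):
        y_imp = y.copy()
        # Values of y at missing positions are never read, only overwritten.
        noise = rng.normal(scale=sigma, size=int(missing.sum()))
        y_imp[missing] = design[missing] @ beta + noise
        estimates.append(y_imp.mean())
    return float(np.mean(estimates))

def run(reps=200, n=500, seed=0):
    rng = np.random.default_rng(seed)
    restrictive, inclusive = [], []
    for _ in range(reps):
        x, z, y, missing = simulate(n, rng)
        restrictive.append(impute_mean_y(y, missing, [x], rng))     # omits Z
        inclusive.append(impute_mean_y(y, missing, [x, z], rng))    # includes Z
    # The true mean of Y is 0, so the average estimate equals the bias.
    return float(np.mean(restrictive)), float(np.mean(inclusive))

bias_restrictive, bias_inclusive = run()
print(f"restrictive bias: {bias_restrictive:.3f}")
print(f"inclusive bias:   {bias_inclusive:.3f}")
```

Under these assumed settings, omitting Z leaves the estimated mean of Y noticeably biased, because the omitted variable is a cause of missingness that is also related to Y, whereas including Z should bring the bias close to zero. The size of the effect depends entirely on the assumed coefficients and should not be read as a result from the paper.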
