Number of imputations needed to stabilize estimated treatment difference in longitudinal data analysis

Multiple imputation procedures replace each missing value with a set of plausible values drawn from the posterior predictive distribution of the missing data given the observed data. In many applications, as few as five imputations are adequate to achieve high efficiency relative to an infinite number of imputations. However, substantially more imputations are often needed to stabilize imputation-based inference at the analysis stage. We consider such inference stable if, given the observed data, the conditional variability of each of the following is within a specified threshold for simulation error: the multiple imputation estimator, the half-width of the 95% confidence interval, the test statistic, and the estimated fraction of missing information. For the estimation of the treatment difference at study end for normally distributed responses in longitudinal trials, we calculate the multiple imputation quantities for an infinite number of imputations analytically and use simulations to assess how the number of imputations needed at the analysis stage varies in repeated sampling.
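As an illustration of the quantities named above, the following is a minimal sketch of the standard Rubin combining rules and of the usual relative-efficiency formula (1 + γ/m)^(-1). It is not the procedure used in this paper; the function names `relative_efficiency` and `pool` and the input values are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, variance


def relative_efficiency(gamma, m):
    """Efficiency of m imputations relative to infinitely many,
    for a fraction of missing information gamma (Rubin's formula)."""
    return 1.0 / (1.0 + gamma / m)


def pool(estimates, variances):
    """Combine m completed-data estimates and their variances
    with Rubin's rules. Inputs are illustrative, not from the paper."""
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    w = mean(variances)            # within-imputation variance
    b = variance(estimates)        # between-imputation variance (m - 1 divisor)
    t = w + (1.0 + 1.0 / m) * b    # total variance
    r = (1.0 + 1.0 / m) * b / w    # relative increase in variance
    if r > 0:
        df = (m - 1) * (1.0 + 1.0 / r) ** 2
        gamma = (r + 2.0 / (df + 3.0)) / (r + 1.0)
    else:
        # No between-imputation variability: no information is missing.
        df = math.inf
        gamma = 0.0
    return {"estimate": q_bar, "total_var": t, "df": df, "fmi": gamma}


if __name__ == "__main__":
    # Hypothetical treatment-difference estimates from m = 3 imputed data sets.
    result = pool([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
    print(result)
    print(relative_efficiency(0.5, 5))
```

With γ = 0.5 and m = 5, the relative efficiency is 1/1.1, or about 0.91, which is the sense in which five imputations are often called adequate. The pooled estimate, interval half-width, test statistic, and `fmi` still vary from one set of m imputations to another, and that simulation error is what a larger m reduces.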
