Methods to Account for Attrition in Longitudinal Data: Do They Work? A Simulation Study

Attrition threatens the internal validity of cohort studies. Epidemiologists use various imputation and weighting methods to limit bias due to attrition. However, the ability of these methods to correct for attrition bias has not been tested. We simulated a cohort of 300 subjects, with 500 computer replications, to determine whether regression imputation, individual weighting, or multiple imputation reduces attrition bias, and we compared the results with those of a complete subject analysis. Our logistic regression model included a binary exposure and two confounders. We generated 10%, 25%, and 40% attrition through three missing data mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR), and used four covariance matrices to vary attrition. We compared true and estimated mean odds ratios (ORs), standard deviations (SDs), and coverage. When data were MCAR or MAR, at every attrition rate, the complete subject analysis produced results at least as valid as those from the imputation and weighting methods. When data were MNAR, no method provided unbiased estimates of the OR at attrition rates of 25% or 40%. When observations are not MAR or MCAR, imputation and weighting methods may not effectively reduce attrition bias.
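To make the three missing data mechanisms concrete, the following is a minimal sketch of how attrition of a fixed size could be imposed on one simulated cohort. It is not the study's actual data-generating code: the function names (simulate_cohort, attrition_mask), all regression coefficients, the true OR of 2, and the way dropout probabilities are formed are illustrative assumptions, and the four covariance matrices mentioned in the abstract are not reproduced.

```python
import numpy as np


def simulate_cohort(n=300, true_log_or=np.log(2.0), rng=None):
    """Simulate one cohort: binary exposure x, confounders c1 and c2, binary outcome y.

    All coefficients are hypothetical choices for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    c1 = rng.normal(size=n)
    c2 = rng.normal(size=n)
    # Exposure depends on both confounders (assumed effect sizes).
    p_x = 1.0 / (1.0 + np.exp(-(0.5 * c1 + 0.5 * c2)))
    x = rng.binomial(1, p_x)
    # Outcome follows a logistic model with the exposure and both confounders.
    p_y = 1.0 / (1.0 + np.exp(-(-1.0 + true_log_or * x + 0.5 * c1 + 0.5 * c2)))
    y = rng.binomial(1, p_y)
    return x, c1, c2, y


def attrition_mask(x, c1, y, rate, mechanism, rng):
    """Return a boolean array marking subjects lost to follow-up.

    MCAR: every subject is equally likely to drop out.
    MAR:  dropout depends only on baseline data that stay observed (x, c1).
    MNAR: dropout depends on the outcome y, which is unobserved for dropouts.
    """
    n = len(x)
    if mechanism == "MCAR":
        score = np.zeros(n)
    elif mechanism == "MAR":
        score = 1.0 * x + 1.0 * c1  # assumed dependence on observed data
    elif mechanism == "MNAR":
        score = 2.0 * y             # assumed dependence on the outcome
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    weights = np.exp(score)
    n_lost = int(round(rate * n))
    # Draw exactly n_lost subjects without replacement, weighted by dropout propensity.
    lost = rng.choice(n, size=n_lost, replace=False, p=weights / weights.sum())
    mask = np.zeros(n, dtype=bool)
    mask[lost] = True
    return mask


rng = np.random.default_rng(2024)
x, c1, c2, y = simulate_cohort(rng=rng)
for mechanism in ("MCAR", "MAR", "MNAR"):
    for rate in (0.10, 0.25, 0.40):
        lost = attrition_mask(x, c1, y, rate, mechanism, rng)
        # A complete subject analysis would then fit the model on the rows where ~lost.
        print(mechanism, rate, int(lost.sum()), "lost;", int((~lost).sum()), "retained")
```

In a full simulation along the lines the abstract describes, this step would be repeated over many replications (500 in the study), with each correction method applied to the retained subjects and the estimated ORs compared against the true value.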
