Responsiveness-informed multiple imputation and inverse probability-weighting in cohort studies with missing data that are non-monotone or not missing at random

Population-based cohort studies are invaluable to health research because of the breadth of data collection over time, and the representativeness of their samples. However, they are especially prone to missing data, which can compromise the validity of analyses when data are not missing at random. Having many waves of data collection presents an opportunity for participants’ responsiveness to be observed over time, which may be informative about missing data mechanisms and thus useful as an auxiliary variable. Modern approaches to handling missing data, such as multiple imputation and maximum likelihood, can be difficult to implement with the large numbers of auxiliary variables and large amounts of non-monotone missing data that occur in cohort studies. Inverse probability-weighting can be easier to implement, but conventional wisdom holds that it cannot be applied to non-monotone missing data. This paper describes two methods of applying inverse probability-weighting to non-monotone missing data, and explores the potential value of including measures of responsiveness in either inverse probability-weighting or multiple imputation. Simulation studies are used to compare methods and demonstrate that responsiveness in longitudinal studies can be used to mitigate bias induced by missing data, even when data are not missing at random.
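The idea of using responsiveness as an auxiliary variable in inverse probability-weighting can be illustrated with a small simulation. The sketch below is not the paper's method or simulation design: the data-generating model, the number of prior waves, the definition of responsiveness as the proportion of prior waves answered, and all variable names are assumptions chosen only for illustration. An unobserved trait drives both the outcome and the chance of responding, so the outcome is not missing at random; weighting by a response model fitted on responsiveness is then compared with a complete-case mean.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_prior_waves = 20000, 8  # hypothetical cohort size and number of earlier waves


def expit(z):
    return 1.0 / (1.0 + np.exp(-z))


# Unobserved trait that shifts the outcome AND the chance of responding,
# so missingness depends on something the analyst never sees.
u = rng.normal(size=n)
y = u + rng.normal(size=n)  # outcome at the final wave
p_respond = expit(0.5 + u)

# Responsiveness: proportion of earlier waves at which each person responded.
prior = rng.random((n, n_prior_waves)) < p_respond[:, None]
responsiveness = prior.mean(axis=1)

# Whether the final-wave outcome is actually observed.
observed = rng.random(n) < p_respond

# Logistic regression of "observed" on responsiveness, fitted by Newton-Raphson.
X = np.column_stack([np.ones(n), responsiveness])
obs = observed.astype(float)
beta = np.zeros(X.shape[1])
for _ in range(25):
    mu = expit(X @ beta)
    grad = X.T @ (obs - mu)
    hess = (X * (mu * (1.0 - mu))[:, None]).T @ X
    beta += np.linalg.solve(hess, grad)

# Inverse probability weights for the respondents.
weights = 1.0 / expit(X @ beta)[observed]

full_mean = y.mean()          # known only because the data are simulated
cc_mean = y[observed].mean()  # complete-case estimate
ipw_mean = np.average(y[observed], weights=weights)

print(f"full-sample mean:   {full_mean:.3f}")
print(f"complete-case mean: {cc_mean:.3f}")
print(f"IPW mean:           {ipw_mean:.3f}")
```

In this toy setting responsiveness is only a noisy proxy for the unobserved trait, so the weighted estimate would be expected to reduce the bias of the complete-case mean rather than remove it.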
