Missing data methods in longitudinal studies: a review

Incomplete data are common in biomedical and other types of research, especially in longitudinal studies. During the last three decades, a vast amount of work has been done in the area. This has led, on the one hand, to a rich taxonomy of missing-data concepts, issues, and methods and, on the other hand, to a variety of data-analytic tools. Elements of the taxonomy include missing-data patterns, mechanisms, and modeling frameworks; inferential paradigms; and sensitivity analysis frameworks. Each of these elements is described in detail, and a variety of concrete modeling devices is presented. Two case studies illustrate the methods: the first concerns quality of life among breast cancer patients, and the second examines data from the Muscatine children’s obesity study.
