Analysis of longitudinal data from outcome‐dependent visit processes: Failure of proposed methods in realistic settings and potential improvements

The timing and frequency of the measurement of longitudinal outcomes in databases may be associated with the value of the outcome. Such visit processes are termed outcome dependent, and previous work showed that standard analyses that ignore outcome-dependent visit times can produce highly biased estimates of the associations of covariates with outcomes. The literature contains several classes of approaches to analyze longitudinal data subject to outcome-dependent visit times, all of which rest on simplifying assumptions about the visit process. Based on extensive discussions with subject matter investigators, we identified common characteristics of outcome-dependent visit processes, which allowed us to evaluate the performance of existing methods in settings with more realistic visit processes than have been previously investigated. This paper uses the analysis of data from a study of kidney function, theory, and simulation studies to examine settings that range from those where all visits have a low degree of missingness and outcome dependence (which we call "regular" visits) to those where all visits have a high degree of missingness and outcome dependence (which we call "irregular" visits). Our results show that while all the approaches we studied can yield biased estimates of some covariate effects, other covariate effects can be estimated with little bias. In particular, mixed effects models fit by maximum likelihood yielded little bias in estimates of the effects of covariates not associated with the random effects and small bias in estimates of the effects of covariates associated with the random effects. Other approaches produced estimates with greater bias. Our results also show that the presence of some regular visits in the data set protects mixed model analyses, but not the other methods, from bias.
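As a rough illustration of why ignoring an outcome-dependent visit process can bias a standard analysis, the following minimal sketch simulates subjects with a random intercept and records an outcome only when a visit occurs, with the visit probability depending on the outcome itself. The data-generating model, the parameter values, and the function name `simulate` are assumptions made for this illustration and are not taken from the paper; the sketch compares simple means only and does not reproduce the mixed model or other methods evaluated in the study.

```python
import math
import random


def simulate(n_subjects=2000, n_times=10, gamma=1.0, seed=0):
    """Return (mean of all outcomes, mean of outcomes seen at visits).

    Hypothetical setup: at each scheduled time a visit occurs with
    probability expit(-1 + gamma * y), so gamma controls the strength of
    outcome dependence. gamma = 0 gives visits unrelated to the outcome.
    """
    rng = random.Random(seed)
    all_y, seen_y = [], []
    for _ in range(n_subjects):
        b = rng.gauss(0.0, 1.0)  # subject-level random intercept
        for _ in range(n_times):
            y = b + rng.gauss(0.0, 1.0)  # outcome at this scheduled time
            all_y.append(y)
            p_visit = 1.0 / (1.0 + math.exp(-(-1.0 + gamma * y)))
            if rng.random() < p_visit:
                seen_y.append(y)  # outcome is recorded only at a visit
    return sum(all_y) / len(all_y), sum(seen_y) / len(seen_y)


if __name__ == "__main__":
    for gamma in (0.0, 1.0):
        full, seen = simulate(gamma=gamma)
        print(f"gamma={gamma}: full mean={full:.3f}, observed mean={seen:.3f}")
```

Under these assumptions, the mean of the recorded outcomes should sit close to the mean of all outcomes when `gamma` is 0 and clearly above it when `gamma` is 1, because subjects are seen more often when their outcome is high.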
