Impact of discretization of the timeline for longitudinal causal inference methods

In longitudinal settings, causal inference methods usually rely on a discretization of the patient timeline that may not reflect the underlying data-generating process. This article investigates the estimation of causal parameters from discretized data. It sets out the implicit assumptions that practitioners make, often without acknowledging them, when discretizing data to assess longitudinal causal parameters. We illustrate that point estimates differ across discretizations because data coarsening both modifies the definition of the parameter of interest and discards information about time-dependent confounders. We further investigate several tools to guide analysts in selecting a timeline discretization for use with pooled longitudinal targeted maximum likelihood estimation of the parameters of a marginal structural model. In a simulation study, we empirically evaluate bias at different discretizations and assess the cross-validated variance as a measure of data support for selecting a discretization under a chosen data coarsening mechanism. We then apply our approach to a study of the relative effect of alternative asthma treatments during pregnancy on pregnancy duration. The simulation results illustrate how coarsening changes the target parameter of interest and how it may create bias through inadequate control for time-dependent confounders. We also observe evidence that the cross-validated variance performs well as a measure of support in the data: it is minimized at finer discretizations as the sample size increases.
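To make the notion of coarsening concrete, the following minimal sketch collapses a daily record into wider intervals. The function names, the toy series, and the two summary rules ("any exposure in the interval" for treatment, "last observed value" for the confounder) are illustrative assumptions, not the coarsening mechanism or data used in the article.

```python
from typing import Callable, List, Sequence


def coarsen(
    daily: Sequence[int],
    width: int,
    summarize: Callable[[Sequence[int]], int],
) -> List[int]:
    """Collapse a daily series into consecutive intervals of `width` days.

    `summarize` maps the daily values inside one interval to a single value.
    The final interval may be shorter if len(daily) is not a multiple of width.
    """
    if width < 1:
        raise ValueError("width must be a positive integer")
    return [
        summarize(daily[start:start + width])
        for start in range(0, len(daily), width)
    ]


def any_exposure(block: Sequence[int]) -> int:
    """Hypothetical rule: treated in the interval if treated on any day."""
    return int(any(block))


def last_value(block: Sequence[int]) -> int:
    """Hypothetical rule: keep only the last daily value in the interval."""
    return block[-1]


# Toy 12-day record for one patient (made-up values, for illustration only).
daily_treatment = [0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0]
daily_confounder = [0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

for width in (1, 3, 6):
    treatment = coarsen(daily_treatment, width, any_exposure)
    confounder = coarsen(daily_confounder, width, last_value)
    print(f"width={width}: treatment={treatment} confounder={confounder}")
```

In this toy record, the coarsened treatment no longer distinguishes one day of exposure from three, so the intervention being compared is defined differently at each width. The two confounder events on days 2 and 8 vanish entirely at widths 3 and 6 under the last-value rule, which is one way coarsening can discard information about time-dependent confounders.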
