Paper information - Simple Examples of Estimating Causal Effects Using Targeted Maximum Likelihood Estimation

Simple Examples of Estimating Causal Effects Using Targeted Maximum Likelihood Estimation

We present a brief overview of targeted maximum likelihood for estimating the causal effect of a single time point treatment and of a two time point treatment. We focus on simple examples demonstrating how to apply the methodology developed in (van der Laan and Rubin, 2006; Moore and van der Laan, 2007; van der Laan, 2010a,b). We include R code for the single time point case.

1 Single Time Point Treatment

We present a brief example, in the context of an observational study of HIV positive individuals on antiretroviral therapy. Assume we have a binary exposure A0, such as medication adherence being above 90% or not, and a binary outcome Y, such as virologic failure. Assume we have baseline variables L0 that should include all important confounders of the effect of A0 on Y. Say we want to estimate the causal effect of A0 on the mean of Y, as a risk difference; that is, we’d like to estimate the difference between the population mean of Y were everyone to have had exposure set as A0 = 0, and the population mean were everyone to have had exposure set as A0 = 1. (Below, we use both the terms “exposure” and “treatment” to refer to A0.) Below, for simplicity, we just show how to estimate the treatment specific (counterfactual) mean setting A0 = 1.

Let p denote a joint probability density on the variables (L0, A0, Y). (Throughout, we use the term “density” in the general sense; that is, it refers to a frequency function for discrete valued variables and refers to a density for continuous valued variables.) Here we will put no restrictions on p, except that we only consider p for which all the conditional distributions we give below are well-defined. Assume that for each subject i we get a vector of data (L0^(i), A0^(i), Y^(i)), where each such vector is an independent draw from the true (unknown) data generating distribution p∗ on (L0, A0, Y). Assume we have n subjects.
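
The setup above (binary exposure A0, binary outcome Y, baseline variables L0, n independent draws) can be made concrete with a small simulation. The following is a minimal sketch in Python with NumPy, not the R code the authors refer to. It shows one common form of the targeted maximum likelihood estimator for the treatment specific mean setting A0 = 1: fit an initial outcome regression, fit a model for the exposure given L0, fluctuate the outcome regression along the covariate A0 / P(A0 = 1 | L0), and average the updated predictions at A0 = 1. The simulated data generating distribution, the logistic working models, and the helper names `fit_logistic` and `tmle_treated_mean` are all illustrative assumptions and do not come from the paper.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def logit(p):
    return np.log(p / (1.0 - p))

def fit_logistic(X, y, offset=None, n_iter=50):
    """Logistic regression fitted by Newton-Raphson; returns the coefficients."""
    if offset is None:
        offset = np.zeros(len(y))
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = expit(X @ beta + offset)
        grad = X.T @ (y - mu)                       # score vector
        hess = X.T @ (X * (mu * (1.0 - mu))[:, None])  # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def tmle_treated_mean(L0, A0, Y):
    """Sketch of a TMLE for the mean of Y had everyone received A0 = 1."""
    n = len(Y)
    ones = np.ones(n)
    # Step 1: initial estimate of P(Y = 1 | A0, L0) (assumed logistic model).
    XQ = np.column_stack([ones, A0, L0])
    bQ = fit_logistic(XQ, Y)
    Q_obs = expit(XQ @ bQ)                               # at observed A0
    Q_1 = expit(np.column_stack([ones, ones, L0]) @ bQ)  # at A0 = 1
    # Step 2: estimate of P(A0 = 1 | L0) (assumed logistic model).
    Xg = np.column_stack([ones, L0])
    g1 = expit(Xg @ fit_logistic(Xg, A0))
    # Step 3: fluctuation: regress Y on H = A0 / g1 with offset logit(Q_obs).
    H = (A0 / g1)[:, None]
    eps = fit_logistic(H, Y, offset=logit(Q_obs))[0]
    # Step 4: update the predictions at A0 = 1 and average over subjects.
    Q_1_star = expit(logit(Q_1) + eps / g1)
    return Q_1_star.mean()

# Hypothetical simulated data: L0 confounds the effect of A0 on Y.
rng = np.random.default_rng(0)
n = 5000
L0 = rng.normal(size=n)
A0 = rng.binomial(1, expit(0.5 * L0))
Y = rng.binomial(1, expit(-1.0 + 1.0 * A0 + 0.8 * L0))

estimate = tmle_treated_mean(L0, A0, Y)
unadjusted = Y[A0 == 1].mean()  # ignores confounding by L0
print("TMLE estimate of the mean of Y setting A0 = 1:", estimate)
print("Unadjusted mean of Y among the exposed:", unadjusted)
```

In this simulation the mean of Y setting A0 = 1 is 0.5 by symmetry, because the outcome probability at A0 = 1 is expit(0.8 L0) and L0 is symmetric about zero. The unadjusted mean among the exposed is included only for contrast, since exposed subjects tend to have larger L0.
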
Under certain assumptions, the treatment specific mean of Y setting A0 = 1 equals the mean over the baseline variables L0 of p∗(Y = 1|A0 = 1, L0), which we denote by

Mark J. van der Laan | Michael Rosenblum

[1] M. J. van der Laan, et al. Targeted Maximum Likelihood Learning. The International Journal of Biostatistics, 2011.

[2] Roderick J. A. Little, et al. Adjusting for Nonignorable Drop-Out Using Semiparametric Nonresponse Models: Comment. 1999.

[3] Readings in Targeted Maximum Likelihood Estimation. 2009.

[4] M. J. Laan. Statistical Inference for Variable Importance. 2006.

[5] M. Laan, et al. Selecting Optimal Treatments Based on Predictive Factors. 2009.

[6] Mark J. van der Laan, et al. Targeted Maximum Likelihood Based Causal Inference: Part I. 2010.

[7] Michael Rosenblum, et al. Targeted Maximum Likelihood Estimation of the Parameter of a Marginal Structural Model. The International Journal of Biostatistics, 2010.

[8] Michael Rosenblum, et al. Simple, Efficient Estimators of Treatment Effects in Randomized Trials Using Generalized Linear Models to Leverage Baseline Variables. The International Journal of Biostatistics, 2011.

[9] J. Robins, et al. Adjusting for Nonignorable Drop-Out Using Semiparametric Nonresponse Models. 1999.

[10] M. J. van der Laan, et al. Covariate Adjustment in Randomized Trials with Binary Outcomes: Targeted Maximum Likelihood Estimation. Statistics in Medicine, 2009.