Simple Examples of Estimating Causal Effects Using Targeted Maximum Likelihood Estimation

We present a brief overview of targeted maximum likelihood estimation for the causal effect of a single time point treatment and of a two time point treatment. We focus on simple examples that demonstrate how to apply the methodology developed in van der Laan and Rubin (2006), Moore and van der Laan (2007), and van der Laan (2010a,b). We include R code for the single time point case.

1 Single Time Point Treatment

We present a brief example in the context of an observational study of HIV positive individuals on antiretroviral therapy. Assume we have a binary exposure A0, such as whether medication adherence is above 90%, and a binary outcome Y, such as virologic failure. Assume we have baseline variables L0 that should include all important confounders of the effect of A0 on Y.

Say we want to estimate the causal effect of A0 on the mean of Y as a risk difference; that is, we would like to estimate the difference between the population mean of Y were everyone to have had exposure set to A0 = 0 and the population mean were everyone to have had exposure set to A0 = 1. (We use the terms “exposure” and “treatment” interchangeably to refer to A0.) For simplicity, we show only how to estimate the treatment specific (counterfactual) mean setting A0 = 1.

Let p denote a joint probability density on the variables (L0, A0, Y). (Throughout, we use the term “density” in the general sense: it refers to a frequency function for discrete valued variables and to a density for continuous valued variables.) We put no restrictions on p, except that we only consider p for which all the conditional distributions given below are well-defined. Assume that for each subject i we observe a vector of data (L0^(i), A0^(i), Y^(i)), where each such vector is an independent draw from the true (unknown) data generating distribution p∗ on (L0, A0, Y). Assume we have n subjects.
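To make the data structure concrete, the following is a minimal R sketch that simulates n independent draws of (L0, A0, Y). The data generating distribution here is purely hypothetical: the single binary baseline covariate, the logistic forms, and all coefficient values are assumptions chosen for illustration and are not taken from any study. Because the outcome mechanism is known in a simulation, the sketch also computes the treatment specific mean setting A0 = 1 that an estimator would be trying to recover.

```r
# Illustrative simulation only: every distribution and coefficient below is
# an assumption made for this sketch, not part of the original example.
set.seed(1)
n <- 1000

# Baseline variable L0: a single binary confounder, for simplicity.
L0 <- rbinom(n, size = 1, prob = 0.5)

# Binary exposure A0 (e.g. adherence above 90%); its probability depends
# on L0, so L0 confounds the effect of A0 on Y.
A0 <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.0 * L0))

# Binary outcome Y (e.g. virologic failure); depends on both A0 and L0.
Y <- rbinom(n, size = 1, prob = plogis(-1.0 - 0.8 * A0 + 0.7 * L0))

# One row per subject i: (L0^(i), A0^(i), Y^(i)).
dat <- data.frame(L0 = L0, A0 = A0, Y = Y)

# True treatment specific mean setting A0 = 1 under this hypothetical
# mechanism: average P(Y = 1 | A0 = 1, L0) over the distribution of L0,
# which puts probability 0.5 on each of L0 = 0 and L0 = 1.
psi_true <- 0.5 * plogis(-1.0 - 0.8) + 0.5 * plogis(-1.0 - 0.8 + 0.7)
```

In a real analysis p∗ is unknown, so psi_true is not available; it is included here only as a benchmark against which an estimate computed from dat could be compared.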
Under certain assumptions, the treatment specific mean of Y setting A0 = 1 equals the mean over the baseline variables L0 of p∗(Y = 1|A0 = 1, L0), which we denote by