Doubly Robust Estimation of Optimal Dynamic Treatment Regimes

We compare methods for estimating optimal dynamic decision rules from observational data, with particular focus on estimating the regret functions defined by Murphy (in J. R. Stat. Soc., Ser. B, Stat. Methodol. 65:331–355, 2003). We formulate a doubly robust version of the regret-regression approach of Almirall et al. (in Biometrics 66:131–139, 2010) and Henderson et al. (in Biometrics 66:1192–1201, 2010) and demonstrate that it is equivalent to a reduced form of Robins’ efficient g-estimation procedure (Robins, in Proceedings of the Second Symposium on Biostatistics. Springer, New York, pp. 189–326, 2004). Simulation studies suggest that the regret-regression approach is the most efficient when the models are correctly specified, whereas the efficient g-estimation procedure is more robust under misspecification. However, the g-estimation method can be difficult to apply in complex settings. We illustrate the ideas and methods through an application to the control of blood clotting time in patients on long-term anticoagulation.
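The double robustness property mentioned above can be illustrated with a minimal single-stage sketch. This is not the authors' procedure or code: it assumes one decision point, a binary treatment, and a linear blip function a(psi0 + psi1 x), and the names g_estimate, propensity and baseline are hypothetical. The estimating equation stays unbiased if either the treatment model or the treatment-free outcome model is correct.

```python
import numpy as np

def g_estimate(x, a, y, propensity, baseline):
    """Solve sum_i (a_i - p_i) h_i (y_i - a_i h_i'psi - m_i) = 0 for psi,
    where h_i = (1, x_i), p_i is the fitted propensity and m_i is the
    fitted treatment-free outcome (both supplied by the caller)."""
    h = np.column_stack([np.ones_like(x), x])
    w = (a - propensity)[:, None] * h      # centred treatment times h
    lhs = w.T @ (a[:, None] * h)           # 2 x 2 matrix, linear in psi
    rhs = w.T @ (y - baseline)
    return np.linalg.solve(lhs, rhs)

# Simulated data (illustrative assumption, not from the paper).
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
p_true = 1.0 / (1.0 + np.exp(-0.5 * x))    # true treatment probability
a = rng.binomial(1, p_true).astype(float)
psi_true = np.array([1.0, -2.0])           # blip: a * (1 - 2x)
m_true = x + np.sin(2.0 * x)               # treatment-free mean outcome
y = m_true + a * (psi_true[0] + psi_true[1] * x) + rng.normal(size=n)

# (a) Correct propensity, deliberately wrong outcome model (baseline = 0).
psi_a = g_estimate(x, a, y, propensity=p_true, baseline=np.zeros(n))

# (b) Deliberately wrong propensity (constant 0.5), correct outcome model.
psi_b = g_estimate(x, a, y, propensity=np.full(n, 0.5), baseline=m_true)

# Estimated rule under this blip model: treat when psi0 + psi1 * x > 0.
treat = psi_b[0] + psi_b[1] * x > 0

print("truth:", psi_true)
print("correct propensity, wrong baseline:", psi_a)
print("wrong propensity, correct baseline:", psi_b)
```

In both cases the estimate should land near the true blip parameters, up to sampling error. If both nuisance models are misspecified, no such guarantee holds.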

[1]  M. Kosorok, et al.  Reinforcement Learning Strategies for Clinical Trials in Nonsmall Cell Lung Cancer, 2011, Biometrics.

[2]  J. Robins, et al.  Comparison of dynamic treatment regimes via inverse probability weighting, 2006, Basic & Clinical Pharmacology & Toxicology.

[3]  J. Robins, et al.  Estimating causal effects from epidemiological data, 2006, Journal of Epidemiology and Community Health.

[4]  James M. Robins, et al.  Optimal Structural Nested Models for Optimal Sequential Decisions, 2004.

[5]  Stephen R Cole, et al.  The consistency statement in causal inference: a definition or an assumption?, 2009, Epidemiology.

[6]  Donglin Zeng, et al.  Estimating Individualized Treatment Rules Using Outcome Weighted Learning, 2012, Journal of the American Statistical Association.

[7]  Erica E M Moodie, et al.  Demystifying Optimal Dynamic Treatment Regimes, 2007, Biometrics.

[8]  Daniel Almirall, et al.  Structural Nested Mean Models for Assessing Time‐Varying Effect Moderation, 2010, Biometrics.

[9]  J. Robins, et al.  Dynamic Regime Marginal Structural Mean Models for Estimation of Optimal Dynamic Treatment Regimes, Part I: Main Content, 2011, The International Journal of Biostatistics.

[10]  Olli Saarela, et al.  Optimal Dynamic Regimes: Presenting a Case for Predictive Inference, 2010, The International Journal of Biostatistics.

[11]  Stephen R Cole, et al.  Constructing inverse probability weights for marginal structural models, 2008, American Journal of Epidemiology.

[12]  A. Philip Dawid, et al.  Identifying the consequences of dynamic treatment strategies: A decision-theoretic overview, 2010, ArXiv.

[13]  Robin Henderson, et al.  Estimation of optimal dynamic anticoagulation regimes from observational data: a regret-based approach, 2006, Statistics in Medicine.

[14]  Phil Ansell, et al.  Regret‐Regression for Optimal Dynamic Treatment Regimes, 2010, Biometrics.

[15]  Eric B. Laber, et al.  A Robust Method for Estimating Optimal Treatment Regimes, 2012, Biometrics.

[16]  S. Murphy, et al.  Optimal dynamic treatment regimes, 2003.

[17]  Susan Murphy, et al.  Inference for non-regular parameters in optimal dynamic treatment regimes, 2010, Statistical Methods in Medical Research.

[18]  J. Pearl, et al.  Confounding and Collapsibility in Causal Inference, 1999.

[19]  Erica E M Moodie, et al.  Estimating Optimal Dynamic Regimes: Correcting Bias under the Null, 2009, Scandinavian Journal of Statistics, Theory and Applications.

[20]  M. Kramer, et al.  Estimating Response-Maximized Decision Rules With Applications to Breastfeeding, 2009.

[21]  M. Hernán.  A definition of causal effect for epidemiological research, 2004, Journal of Epidemiology and Community Health.