The optimal dynamic treatment rule superlearner: considerations, performance, and application to criminal justice interventions

The optimal dynamic treatment rule (ODTR) framework offers an approach for understanding which kinds of patients respond best to specific treatments – in other words, treatment effect heterogeneity. Recently, there has been a proliferation of methods for estimating the ODTR. One such method extends the SuperLearner algorithm, an ensemble method widely used in prediction problems to optimally combine candidate algorithms, to ODTRs. Following the “causal roadmap,” we causally and statistically define the ODTR and provide an introduction to estimating it using the ODTR SuperLearner. Additionally, we highlight practical choices when implementing the algorithm, including the choice of candidate algorithms, metalearners to combine the candidates, and risk functions to select the best combination of algorithms. Using simulations, we illustrate how estimating the ODTR using this SuperLearner approach can uncover treatment effect heterogeneity more effectively than traditional approaches based on fitting a parametric regression of the outcome on the treatment, covariates, and treatment-covariate interactions. We investigate the implications of choices in implementing an ODTR SuperLearner at various sample sizes. Our results show the advantages of: (1) including a combination of both flexible machine learning algorithms and simple parametric estimators in the library of candidate algorithms; (2) using an ensemble metalearner to combine candidates rather than selecting only the best-performing candidate; and (3) using the mean outcome under the rule as a risk function. Finally, we apply the ODTR SuperLearner to the “Interventions” study, an ongoing randomized controlled trial, to identify which justice-involved adults with mental illness benefit most from cognitive behavioral therapy (CBT) to reduce criminal
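As a rough illustration of two ideas named in the abstract, the sketch below builds a single candidate rule from an outcome regression with a treatment-covariate interaction and then scores it by the estimated mean outcome under the rule. It is not the ODTR SuperLearner itself: there is no library of candidates, no metalearner, and a simple train/test split stands in for cross-validation. The data-generating process, the effect sizes, and all function names (`simulate_trial`, `cell_means`, `plug_in_rule`, `ipw_value`) are assumptions made for this example. With one binary covariate, the regression with an interaction term is saturated, so it is fit here as cell means.

```python
import random


def simulate_trial(n, seed=0):
    """Simulate a two-arm randomized trial with one binary covariate W.

    Assumed (hypothetical) effects: treatment helps when W == 1 and
    harms when W == 0, so the best rule is to treat only W == 1.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        w = rng.randint(0, 1)
        a = rng.randint(0, 1)  # randomized with P(A = 1) = 0.5
        effect = 1.0 if w == 1 else -0.5
        y = 0.5 * w + effect * a + rng.gauss(0.0, 1.0)
        data.append((w, a, y))
    return data


def cell_means(data):
    """Estimate E[Y | A = a, W = w] by the sample mean in each (a, w) cell.

    For binary A and W this equals a regression of Y on A, W, and A*W.
    """
    sums, counts = {}, {}
    for w, a, y in data:
        sums[(a, w)] = sums.get((a, w), 0.0) + y
        counts[(a, w)] = counts.get((a, w), 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


def plug_in_rule(q):
    """Treat (1) when the estimated effect Q(1, w) - Q(0, w) is positive."""
    return {w: int(q[(1, w)] - q[(0, w)] > 0) for w in (0, 1)}


def ipw_value(data, rule, p_treat=0.5):
    """Inverse-probability-weighted estimate of the mean outcome under a rule.

    Uses the known randomization probability, as in a randomized trial.
    """
    total = 0.0
    for w, a, y in data:
        if a == rule[w]:
            prob = p_treat if a == 1 else 1.0 - p_treat
            total += y / prob
    return total / len(data)


# Learn the rule on one sample, then evaluate it on an independent sample.
train = simulate_trial(4000, seed=1)
test = simulate_trial(4000, seed=2)

rule = plug_in_rule(cell_means(train))
value_rule = ipw_value(test, rule)
value_treat_all = ipw_value(test, {0: 1, 1: 1})

print("estimated rule (W -> A):", rule)
print("estimated mean outcome under the rule:", round(value_rule, 3))
print("estimated mean outcome treating everyone:", round(value_treat_all, 3))
```

Evaluating the rule on data not used to learn it mirrors, in simplified form, the role a risk function plays when comparing candidate rules. A fuller version would repeat this over cross-validation folds and over several candidate estimators.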
