Practical guidance on modeling choices for the virtual twins method

Individuals can vary drastically in their response to the same treatment, and this heterogeneity has driven the push for more personalized medicine. Achieving this goal requires accurate and interpretable methods for identifying subgroups whose response to treatment differs from the population average. The Virtual Twins (VT) method by Foster et al. [1] is widely cited and implemented for subgroup identification because of its intuitive framework. However, many researchers still rely heavily on the authors’ initial modeling suggestions without examining newer and more powerful alternatives, which leaves much of the method's potential untapped. We comprehensively evaluate the performance of VT with different combinations of methods in each of its component steps, under a collection of linear and nonlinear problem settings. Our simulations show that the choice of method for step 1 of VT strongly influences the overall accuracy of the method, and that Super Learner is a promising choice. We illustrate our findings by using VT to identify subgroups with heterogeneous treatment effects in a randomized, double-blind nicotine reduction trial.
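To make the two-step structure concrete: in Foster et al.'s formulation, step 1 predicts each subject's outcome under both treatments and takes the difference as an estimated individual treatment effect, and step 2 fits a tree to those estimates to describe a subgroup. The Python sketch below is a toy illustration under stated assumptions, not the models evaluated in the paper. It swaps in k-nearest-neighbour averaging for step 1 and a single-split regression stump for step 2, runs on simulated data, and every function and variable name in it is hypothetical.

```python
import random

def simulate(n=400, seed=0):
    """Toy randomized trial: treatment helps only when x1 > 0 (assumed setup)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        t = rng.randint(0, 1)                      # randomized treatment arm
        effect = 2.0 if x[0] > 0 else 0.0          # true heterogeneous effect
        y = x[1] + t * effect + rng.gauss(0, 0.3)  # observed outcome
        rows.append((x, t, y))
    return rows

def knn_predict(x, arm_rows, k=10):
    """Mean outcome of the k subjects in one arm closest to x."""
    nearest = sorted(
        arm_rows,
        key=lambda r: sum((a - b) ** 2 for a, b in zip(r[0], x)),
    )[:k]
    return sum(r[2] for r in nearest) / len(nearest)

def step1_effects(rows, k=10):
    """Step 1: predicted outcome if treated minus predicted outcome if control.

    Simplification: a subject can be among its own neighbours in its own arm.
    """
    treated = [r for r in rows if r[1] == 1]
    control = [r for r in rows if r[1] == 0]
    return [knn_predict(x, treated, k) - knn_predict(x, control, k)
            for x, _, _ in rows]

def mean(values):
    return sum(values) / len(values)

def step2_stump(xs, z, min_leaf=20):
    """Step 2: one split on one covariate that best separates the effects z."""
    best = None
    for j in range(len(xs[0])):
        for thr in sorted({x[j] for x in xs}):
            left = [zi for x, zi in zip(xs, z) if x[j] <= thr]
            right = [zi for x, zi in zip(xs, z) if x[j] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            ml, mr = mean(left), mean(right)
            sse = (sum((v - ml) ** 2 for v in left)
                   + sum((v - mr) ** 2 for v in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, ml, mr)
    return best[1:]

rows = simulate()
z = step1_effects(rows)                 # estimated individual treatment effects
xs = [x for x, _, _ in rows]
feature, threshold, left_mean, right_mean = step2_stump(xs, z)
print(f"split on x{feature + 1} at {threshold:.2f}: "
      f"mean effect {left_mean:.2f} (left) vs {right_mean:.2f} (right)")
```

Because the two steps are separate, the step 1 learner can be replaced (for example by a random forest or an ensemble) without touching step 2, which is the kind of modeling choice the abstract refers to.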

[1]  Ravi Varadhan,et al.  A framework for the analysis of heterogeneity of treatment effect in patient-centered outcomes research. , 2013, Journal of clinical epidemiology.

[2]  W. Loh,et al.  REGRESSION TREES WITH UNBIASED VARIABLE SELECTION AND INTERACTION DETECTION , 2002 .

[3]  S. Hall,et al.  Smoking Behavior and Exposure to Tobacco Toxicants during 6 Months of Smoking Progressively Reduced Nicotine Content Cigarettes , 2012, Cancer Epidemiology, Biomarkers & Prevention.

[4]  Susan Athey,et al.  Recursive partitioning for heterogeneous causal effects , 2015, Proceedings of the National Academy of Sciences.

[5]  Claudio Conversano,et al.  Combining an Additive and Tree-Based Regression Model Simultaneously: STIMA , 2010 .

[6]  R. Tibshirani Regression Shrinkage and Selection via the Lasso , 1996 .

[7]  M. J. van der Laan,et al.  Super Learner , 2010, Statistical Applications in Genetics and Molecular Biology.

[8]  J. Friedman,et al.  Multivariate adaptive regression splines , 1991 .

[9]  M. al’Absi,et al.  Reduced Nicotine Content Cigarettes and Nicotine Patch , 2013, Cancer Epidemiology, Biomarkers & Prevention.

[10]  Trevor Hastie,et al.  Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent. , 2011, Journal of statistical software.

[11]  L. Cox,et al.  Evaluation of the brief questionnaire of smoking urges (QSU-brief) in laboratory and clinical settings. , 2001, Nicotine & tobacco research : official journal of the Society for Research on Nicotine and Tobacco.

[12]  Jason D. Robinson,et al.  Effect of Immediate vs Gradual Reduction in Nicotine Content of Cigarettes on Biomarkers of Smoke Exposure: A Randomized Clinical Trial , 2018, JAMA.

[13]  A. Alberg,et al.  The 2014 Surgeon General's report: “The Health Consequences of Smoking–50 Years of Progress”: A paradigm shift in cancer care , 2014, Cancer.

[14]  Justin Grimmer,et al.  Estimating Heterogeneous Treatment Effects and the Effects of Heterogeneous Treatments with Ensemble Methods , 2017, Political Analysis.

[15]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[16]  I. Lipkovich,et al.  Tutorial in biostatistics: data‐driven subgroup identification and analysis in clinical trials , 2017, Statistics in medicine.

[17]  K. Hornik,et al.  Unbiased Recursive Partitioning: A Conditional Inference Framework , 2006 .

[18]  Marc Ratkovic,et al.  Estimating treatment effect heterogeneity in randomized program evaluation , 2013, 1305.5682.

[19]  Hemant Ishwaran,et al.  Random Survival Forests , 2008, Wiley StatsRef: Statistics Reference Online.

[20]  J. M. Taylor,et al.  Subgroup identification from randomized clinical trial data , 2011, Statistics in medicine.

[21]  S. Hall,et al.  Effect of reducing the nicotine content of cigarettes on cigarette smoking behavior and tobacco smoke toxicant exposure: 2-year follow up. , 2015, Addiction.

[22]  D. Hatsukami,et al.  Reduced nicotine content cigarettes: effects on toxicant exposure, dependence and cessation. , 2010, Addiction.

[23]  Wei-Yin Loh,et al.  Classification and regression trees , 2011, WIREs Data Mining Knowl. Discov..

[24]  C. Backinger,et al.  Nicotine reduction revisited: science and future directions , 2010, Tobacco Control.