Random forests of interaction trees for estimating individualized treatment effects in randomized trials

Assessing heterogeneous treatment effects is of growing interest in advancing precision medicine, and individualized treatment effects (ITEs) play a critical role in that endeavor. For experimental data collected from randomized trials, we put forward a method, termed random forests of interaction trees (RFIT), for estimating ITEs on the basis of interaction trees. To this end, we propose a smooth sigmoid surrogate method, as an alternative to greedy search, to speed up tree construction. RFIT outperforms the "separate regression" approach in estimating ITEs. Furthermore, standard errors for the ITEs estimated via RFIT are obtained with the infinitesimal jackknife method. We assess and illustrate the use of RFIT through both simulation and the analysis of data from an acupuncture headache trial.
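For orientation, the "separate regression" baseline mentioned in the abstract can be sketched as follows: fit one outcome model per trial arm and take the difference of the two predictions as the estimated ITE. This is only a minimal illustration and not the authors' implementation or RFIT itself. The simple linear working model, the single covariate, the simulated data, and the names `fit_line` and `separate_regression_ite` are all assumptions made for the example.

```python
import random


def fit_line(x, y):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def separate_regression_ite(x, y, trt, x_new):
    """Estimate ITEs as the difference of two arm-specific fits.

    Assumption: a linear model per arm; any regression method could be
    substituted for fit_line.
    """
    x1 = [xi for xi, t in zip(x, trt) if t == 1]
    y1 = [yi for yi, t in zip(y, trt) if t == 1]
    x0 = [xi for xi, t in zip(x, trt) if t == 0]
    y0 = [yi for yi, t in zip(y, trt) if t == 0]
    a1, b1 = fit_line(x1, y1)  # treated-arm model
    a0, b0 = fit_line(x0, y0)  # control-arm model
    return [(a1 + b1 * v) - (a0 + b0 * v) for v in x_new]


# Hypothetical simulated randomized trial (not data from the paper).
rng = random.Random(0)
n = 2000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
trt = [rng.randint(0, 1) for _ in range(n)]       # 1:1 randomization
true_ite = [1.0 + 2.0 * xi for xi in x]           # effect varies with x
y = [0.5 * xi + t * tau + rng.gauss(0.0, 0.5)
     for xi, t, tau in zip(x, trt, true_ite)]

ite_hat = separate_regression_ite(x, y, trt, x)
rmse = (sum((h - tau) ** 2 for h, tau in zip(ite_hat, true_ite)) / n) ** 0.5
print(f"RMSE of estimated ITE: {rmse:.3f}")
```

Because treatment is randomized, each arm-specific fit estimates the mean outcome under that treatment, so their difference targets the ITE. The sketch works well here only because the simulated effect is linear in the covariate; with a misspecified working model the difference of two separately fitted regressions can be a poor ITE estimate.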
