Recursively Imputed Survival Trees

We propose recursively imputed survival tree (RIST) regression for right-censored data. This new nonparametric regression procedure combines a novel recursive imputation approach with extremely randomized trees, which allows significantly better use of censored observations than previous tree-based methods and yields improved model fit and reduced prediction error. The proposed method can also be viewed as a type of Monte Carlo EM algorithm, which generates extra diversity in the tree-based fitting process. Simulation studies and data analyses demonstrate the superior performance of RIST compared with previous methods.
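
As a rough illustration of the recursive imputation idea, the sketch below is a minimal toy version, not the paper's implementation: a marginal Kaplan-Meier curve stands in for the model-based conditional survival functions, scikit-learn's ExtraTreesRegressor stands in for extremely randomized survival trees, and the function names (`rist_sketch`, `draw_beyond`) and the fixed number of imputation cycles are illustrative assumptions. Each cycle draws an event time for every censored subject from the working survival distribution restricted beyond that subject's censoring time, then refits the tree ensemble on the completed data.

```python
# Minimal sketch of a recursive-imputation loop (illustrative assumptions only:
# a marginal Kaplan-Meier curve replaces the model-based conditional survival
# functions, ExtraTreesRegressor replaces extremely randomized survival trees).
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor


def kaplan_meier(times, events):
    """Kaplan-Meier estimate: unique event times and survival probabilities."""
    uniq = np.unique(times[events == 1])
    surv, s = [], 1.0
    for u in uniq:
        at_risk = np.sum(times >= u)
        deaths = np.sum((times == u) & (events == 1))
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return uniq, np.array(surv)


def draw_beyond(c, uniq, surv, rng, tau):
    """Draw an event time from S(t)/S(c) restricted to t > c; fall back to tau."""
    if len(uniq) == 0:
        return tau
    idx = np.searchsorted(uniq, c, side="right") - 1   # step-function value S(c)
    s_c = surv[idx] if idx >= 0 else 1.0
    mask = uniq > c
    if s_c <= 0 or not mask.any():
        return tau
    # Jump sizes of the survival curve beyond c, renormalized to sum to one
    # (residual mass past the last event time is folded back for simplicity).
    probs = -np.diff(np.concatenate(([s_c], surv[mask])))
    if probs.sum() <= 0:
        return tau
    return rng.choice(uniq[mask], p=probs / probs.sum())


def rist_sketch(X, time, event, n_cycles=3, tau=None, seed=0):
    """Recursively impute censored times, refitting a tree ensemble each cycle."""
    rng = np.random.default_rng(seed)
    tau = time.max() if tau is None else tau
    imputed = time.astype(float)
    model = None
    for cycle in range(n_cycles):
        if cycle == 0:                      # working distribution from observed data
            uniq, surv = kaplan_meier(time, event)
        else:                               # ...then from the previously completed data
            uniq, surv = kaplan_meier(imputed, np.ones_like(event))
        for i in np.where(event == 0)[0]:   # impute only the censored subjects
            imputed[i] = draw_beyond(time[i], uniq, surv, rng, tau)
        model = ExtraTreesRegressor(n_estimators=200, random_state=seed)
        model.fit(X, imputed)               # refit the ensemble on "completed" data
    return model, imputed
```

A call such as `model, completed = rist_sketch(X, time, event)` returns the ensemble fit on the final completed dataset. The actual RIST procedure instead imputes from covariate-conditional survival curves estimated in the terminal nodes and handles the tail beyond the study end time explicitly, so this sketch only conveys the shape of the recursion.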
