A Partitioning Deletion/Substitution/Addition Algorithm for Creating Survival Risk Groups

Accurately assessing a patient's risk of a given event is essential in making informed treatment decisions. One approach is to stratify patients into two or more distinct risk groups with respect to a specific outcome using both clinical and demographic variables. Outcomes may be categorical or continuous; important examples in cancer studies include level of toxicity and time to recurrence. Recursive partitioning methods are well suited to building such risk groups. Two such methods are Classification and Regression Trees (CART) and a more recent competitor, the partitioning Deletion/Substitution/Addition (partDSA) algorithm. Both use loss functions (e.g., squared error for a continuous outcome) as the basis for building, selecting, and assessing predictors, but they differ in how the regression trees are constructed. Recently, we have shown that partDSA often outperforms CART in so-called "full data" settings (e.g., uncensored outcomes). When the outcome is censored, however, the loss functions used by both procedures must be modified. There have been several attempts to adapt CART for right-censored data. This article describes two such extensions for partDSA that use observed data loss functions constructed with inverse probability of censoring weights. Such loss functions are consistent estimates of their uncensored counterparts provided that the corresponding censoring model is correctly specified. The relative performance of these new methods is evaluated via simulation studies and illustrated through an analysis of clinical trial data on brain cancer patients. The implementation of partDSA for uncensored and right-censored outcomes is publicly available in the R package partDSA.
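
To make the idea of an inverse probability of censoring weighted (IPCW) loss concrete, the following is a minimal sketch in Python, not the partDSA implementation and not necessarily either of the two extensions described in the article. It assumes that censoring is independent of the covariates, so that the censoring survival function G can be estimated by Kaplan-Meier, and it uses squared error on the observed time as the full data loss. The function names (`km_censoring_survival`, `ipcw_weights`, `ipcw_squared_error`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def km_censoring_survival(times, events):
    """Kaplan-Meier estimate of the censoring survival function G.

    Censoring (events == 0) is treated as the "event" of interest.
    Returns a function that evaluates the left limit G(t-).
    Assumption: tie handling is simplified; everyone with time >= u
    counts as at risk for censoring at u.
    """
    times = np.asarray(times, dtype=float)
    censored = 1 - np.asarray(events, dtype=int)
    cens_times = np.unique(times[censored == 1])
    factors = []
    for u in cens_times:
        at_risk = np.sum(times >= u)
        n_cens = np.sum((times == u) & (censored == 1))
        factors.append(1.0 - n_cens / at_risk)
    # surv[k] is the product of the first k factors, so surv[0] == 1.
    surv = np.concatenate([[1.0], np.cumprod(factors)])

    def g_left(t):
        # Number of distinct censoring times strictly less than t.
        k = np.searchsorted(cens_times, t, side="left")
        return surv[k]

    return g_left

def ipcw_weights(times, events):
    """Weight delta_i / G(T_i-): zero for censored observations."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=float)
    g_left = km_censoring_survival(times, events)
    return events / g_left(times)

def ipcw_squared_error(times, events, predictions):
    """IPCW estimate of the full data squared error loss."""
    times = np.asarray(times, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    w = ipcw_weights(times, events)
    return np.sum(w * (times - predictions) ** 2) / len(times)

# Toy data (made up for illustration): one observation is censored.
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [1, 0, 1, 1]
toy_weights = ipcw_weights(toy_times, toy_events)
toy_loss = ipcw_squared_error(toy_times, toy_events, [2.5] * 4)
```

Censored observations receive weight zero, and uncensored observations are up-weighted to stand in for comparable subjects lost to censoring. With no censoring every weight equals one and the loss reduces to the ordinary mean squared error, which is the sense in which the observed data loss estimates its uncensored counterpart.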
