Trees and splines in survival analysis

During the past few years, several nonparametric alternatives to the Cox proportional hazards model have appeared in the literature. These methods extend techniques that are well known from regression analysis to the analysis of censored survival data. In this paper we discuss methods based on (partition) trees and (polynomial) splines, analyse two datasets using both survival trees and hazard regression (HARE), and compare the strengths and weaknesses of the two methods. One strength of HARE is that its model-fitting procedure includes an implicit check for proportionality of the underlying hazards model. HARE also provides an explicit model for the conditional hazard function, which makes it very convenient to obtain graphical summaries. The tree-based methods, on the other hand, automatically partition a dataset into groups of cases that are similar in survival history. Results obtained by survival trees and HARE are often complementary. Together, trees and splines should provide the data analyst with two useful tools for analysing survival data.
