Boosting method for nonlinear transformation models with censored survival data.

We propose a general class of nonlinear transformation models for analyzing censored survival data, of which the nonlinear proportional hazards and proportional odds models are special cases. A component-wise boosting algorithm based on cubic smoothing splines is derived to estimate covariate effects nonparametrically. The algorithm uses the gradient of the marginal likelihood, which is computed by importance sampling. The proposed method can be applied to survival data with high-dimensional covariates, including the case where the sample size is smaller than the number of predictors. The empirical performance of the proposed method is evaluated through simulations and an analysis of a microarray survival data set.
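The component-wise boosting idea can be illustrated with a minimal sketch. It is not the algorithm derived in the paper: it assumes the proportional hazards special case, replaces the importance-sampled marginal likelihood gradient with the gradient of the Cox partial likelihood, and replaces the cubic smoothing spline base learner with a one-covariate least-squares line. The function names (`cox_gradient`, `fit_linear`, `componentwise_boost`), the step size `nu`, the number of steps, and the toy data are all illustrative assumptions.

```python
import math
import random


def cox_gradient(times, events, scores):
    """Gradient of the Cox partial log-likelihood w.r.t. each subject's score."""
    n = len(times)
    grad = [float(e) for e in events]
    for i in range(n):
        if not events[i]:
            continue
        # Risk set: subjects still under observation at the i-th event time.
        risk = [j for j in range(n) if times[j] >= times[i]]
        denom = sum(math.exp(scores[j]) for j in risk)
        for j in risk:
            grad[j] -= math.exp(scores[j]) / denom
    return grad


def fit_linear(x, u):
    """Least-squares slope of u on one centered covariate, plus the residual sum of squares."""
    sxx = sum(v * v for v in x)
    slope = sum(a * b for a, b in zip(x, u)) / sxx if sxx > 0 else 0.0
    rss = sum((b - slope * a) ** 2 for a, b in zip(x, u))
    return slope, rss


def componentwise_boost(X, times, events, n_steps=40, nu=0.1):
    """Update one covariate per step: the one whose fit to the gradient is best."""
    n, p = len(X), len(X[0])
    cols = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        cols.append([v - mean for v in col])  # center each covariate
    coefs = [0.0] * p
    scores = [0.0] * n
    for _ in range(n_steps):
        u = cox_gradient(times, events, scores)
        fits = [fit_linear(cols[j], u) for j in range(p)]
        best = min(range(p), key=lambda j: fits[j][1])  # smallest residual sum of squares
        step = nu * fits[best][0]  # shrunken update (nu is an assumed step size)
        coefs[best] += step
        scores = [s + step * x for s, x in zip(scores, cols[best])]
    return coefs


# Toy data with more predictors than subjects; only covariate 0 drives the hazard.
random.seed(0)
n, p = 30, 50
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
event_t = [random.expovariate(math.exp(1.5 * row[0])) for row in X]
cens_t = [random.expovariate(0.3) for _ in range(n)]
times = [min(a, b) for a, b in zip(event_t, cens_t)]
events = [1 if a <= b else 0 for a, b in zip(event_t, cens_t)]

coefs = componentwise_boost(X, times, events, n_steps=40, nu=0.1)
selected = [j for j, c in enumerate(coefs) if c != 0.0]
print("covariates touched by boosting:", selected)
```

Because each step updates a single covariate, at most `n_steps` covariates can enter the fit. This is why the approach remains usable when the number of predictors exceeds the sample size, with the number of steps acting as the tuning parameter.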
