PENALIZED VARIABLE SELECTION PROCEDURE FOR COX MODELS WITH SEMIPARAMETRIC RELATIVE RISK.

We study Cox models with semiparametric relative risk, which can be partially linear with one nonparametric component, or can have multiple additive or nonadditive nonparametric components. A penalized partial likelihood procedure is proposed to simultaneously estimate the parameters and select variables for both the parametric and the nonparametric parts. Two penalties are applied sequentially. The first penalty, governing the smoothness of the multivariate nonlinear covariate effect function, provides a smoothing spline ANOVA framework that is exploited to derive an empirical model selection tool for the nonparametric part. The second penalty, either the smoothly clipped absolute deviation (SCAD) penalty or the adaptive LASSO penalty, achieves variable selection in the parametric part. We show that the resulting estimator of the parametric part possesses the oracle property, and that the estimator of the nonparametric part achieves the optimal rate of convergence. The proposed procedures are shown to work well in simulation experiments and are then applied to a real data example on sexually transmitted diseases.
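
The abstract names the two candidate penalties for the parametric part without stating them. As a minimal sketch, assuming the paper uses the standard forms from Fan and Li (2001) for SCAD and Zou (2006) for the adaptive LASSO, they can be written as below. The function names, the default a = 3.7, and the weight exponent gamma are illustrative choices, not taken from the abstract.

```python
def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty for a single coefficient (standard Fan-Li form, assumed).

    Linear near zero (like the LASSO), quadratic in the middle,
    and constant for large |beta|, so large effects are not shrunk.
    """
    theta = abs(beta)
    if theta <= lam:
        return lam * theta
    if theta <= a * lam:
        return (2 * a * lam * theta - theta ** 2 - lam ** 2) / (2 * (a - 1))
    return (a + 1) * lam ** 2 / 2


def adaptive_lasso_penalty(beta, beta_init, lam, gamma=1.0):
    """Adaptive LASSO penalty: lam * sum_j |beta_j| / |beta_init_j|**gamma.

    beta_init is an initial consistent estimate (hypothetical input here);
    its entries must be nonzero, otherwise the weight is undefined.
    """
    return lam * sum(abs(b) / abs(b0) ** gamma for b, b0 in zip(beta, beta_init))


if __name__ == "__main__":
    # The three SCAD pieces join continuously at lam and a * lam.
    print(scad_penalty(0.5, lam=1.0), scad_penalty(2.0, lam=1.0), scad_penalty(5.0, lam=1.0))
    print(adaptive_lasso_penalty([2.0, -1.0], [4.0, 0.5], lam=0.5))
```

In the procedure described above, a penalty of this kind would be added to the negative log partial likelihood for the parametric coefficients only; the smoothness penalty on the nonparametric part is a separate term and is not sketched here.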
