Variable selection in semiparametric hazard regression for multivariate survival data

This paper addresses the selection of significant variables in the partially linear varying-coefficient hazard model for multivariate survival data. We propose a new variable selection procedure that simultaneously estimates the parameters and selects variables for the parametric part. Compared with the profile pseudo-partial likelihood approach of Cai et al. (2008), our method has the advantage of being practically feasible and easy to implement. We show that the estimators of both the parametric and nonparametric parts achieve the optimal convergence rates, and we establish their asymptotic normality. Moreover, we demonstrate that the proposed procedures perform as well as an oracle procedure. Monte Carlo simulations are conducted to examine the finite-sample performance of the proposed procedures, and a real dataset from the Colon Cancer Study is analyzed for illustration.
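
For orientation, the setting can be sketched in generic notation. The symbols below are illustrative assumptions, not notation taken from the paper: $Z_{ij}$ and $V_{ij}$ are covariate vectors for the $j$-th failure type of subject $i$, $W_{ij}$ is an exposure variable, $\beta$ is the parametric part, $\alpha(\cdot)$ is the vector of varying coefficient functions, and $p_{\gamma}$ is a generic sparsity-inducing penalty (the SCAD penalty of reference [5] is one common choice in this literature). The abstract does not state the objective function, so the penalized form shown is only a typical template.

```latex
% Assumed generic form of a marginal partially linear
% varying-coefficient hazard model (notation is illustrative):
\lambda_{ij}(t) \;=\; \lambda_{0}(t)\,
  \exp\!\bigl\{ \beta^{\top} Z_{ij} + \alpha(W_{ij})^{\top} V_{ij} \bigr\}

% A typical penalized objective for selecting components of \beta,
% where \ell_n is a (pseudo-)partial log-likelihood and
% p_{\gamma} is a penalty with tuning parameter \gamma:
Q_n(\beta) \;=\; \ell_n(\beta) \;-\; n \sum_{k=1}^{p} p_{\gamma}\bigl(|\beta_k|\bigr)
```

In this template, maximizing $Q_n$ shrinks some components of $\beta$ exactly to zero, so estimation and selection happen in one step. An oracle property in the sense of reference [5] means that, asymptotically, the truly zero coefficients are estimated as zero with probability tending to one, and the nonzero coefficients are estimated as efficiently as if the true submodel were known in advance.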

[1] D. Cox. Regression Models and Life-Tables, 1972.

[2] Emmanuel Mitry, et al. Levamisole and fluorouracil for adjuvant therapy of resected colon carcinoma, 1990, The New England Journal of Medicine.

[3] Runze Li, et al. Variable selection for multivariate failure time data, 2005, Biometrika.

[4] R. Tibshirani. The lasso method for variable selection in the Cox model, 1997, Statistics in Medicine.

[5] Jianqing Fan, et al. Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties, 2001.

[6] Cun-Hui Zhang. Nearly unbiased variable selection under minimax concave penalty, 2010, arXiv:1002.4734.

[7] D. Pollard. Asymptotics for Least Absolute Deviation Regression Estimators, 1991, Econometric Theory.

[8] Jianwen Cai, et al. Partially Linear Hazard Regression for Multivariate Survival Data, 2007.

[9] Jian Huang. Efficient estimation of the partly linear additive Cox model, 1999.

[10] Jianqing Fan, et al. Efficient Estimation and Inferences for Varying-Coefficient Models, 2000.

[11] Irène Gijbels, et al. Local likelihood and local partial likelihood in hazard regression, 1997.

[12] Donglin Zeng, et al. Partially Linear Additive Hazards Regression With Varying Coefficients, 2008.

[13] Hao Helen Zhang, et al. Adaptive Lasso for Cox's proportional hazards model, 2007.

[14] L. J. Wei, et al. Regression analysis of multivariate incomplete failure time data by modeling marginal distributions, 1989.

[15] D. Clayton, et al. Multivariate generalizations of the proportional hazards model, 1985.

[16] D. Lin, et al. Cox regression analysis of multivariate failure time data: the marginal approach, 1994, Statistics in Medicine.

[17] Ib M. Skovgaard, et al. Efficient Estimation of Fixed and Time-varying Covariate Effects in Multiplicative Intensity Models, 2002.

[18] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.

[19] Jianwen Cai, et al. Partially linear hazard regression with varying coefficients for multivariate survival data, 2008.

[20] Jianqing Fan, et al. Generalized Partially Linear Single-Index Models, 1997.

[21] R. Tibshirani, et al. Local Likelihood Estimation, 1987.

[22] Runze Li, et al. Variable Selection in Semiparametric Regression Modeling, 2008, Annals of Statistics.

[23] K. Liang, et al. Modelling Marginal Hazards in Multivariate Failure Time Data, 1993.

[24] Danyu Lin, et al. Marginal Regression Models for Multivariate Failure Time Data, 1998.

[25] Jianqing Fan, et al. Variable Selection for Cox's Proportional Hazards Model and Frailty Model, 2002.

[26] Kani Chen, et al. Global Partial Likelihood for Nonparametric Proportional Hazards Models, 2010, Journal of the American Statistical Association.