Variable selection for high‐dimensional partly linear additive Cox model with application to Alzheimer's disease

Variable selection has been studied in many contexts, and a large literature exists in particular for the analysis of right-censored failure time data. In this article, we consider interval-censored failure time data with two sets of covariates: one low-dimensional with possibly nonlinear effects, and the other high-dimensional. For this problem, we present a penalized estimation procedure that performs variable selection and estimation simultaneously, with Bernstein polynomials used to approximate the unknown nonlinear functions. For implementation, we develop a coordinate-wise optimization algorithm that can accommodate most commonly used penalty functions. A numerical study conducted to evaluate the proposed approach suggests that it works well in practical situations. Finally, the method is applied to the Alzheimer's disease study that motivated this investigation.
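To illustrate the Bernstein polynomial approximation mentioned above, the following is a minimal sketch of how a smooth function on an interval can be represented by a Bernstein basis. It is not the authors' implementation: the function names `bernstein_basis` and `bernstein_approx`, the interval bounds `lower` and `upper`, and the choice of degree are all assumptions made for illustration. In a sieve approach of this kind, the coefficients would be estimated from the data rather than fixed in advance.

```python
from math import comb


def bernstein_basis(x, degree, lower=0.0, upper=1.0):
    """Evaluate the degree+1 Bernstein basis polynomials at x in [lower, upper].

    Hypothetical helper for illustration only.
    """
    if not lower <= x <= upper:
        raise ValueError("x must lie in [lower, upper]")
    # Rescale x to the unit interval, where the basis is defined.
    t = (x - lower) / (upper - lower)
    return [
        comb(degree, k) * t**k * (1 - t) ** (degree - k)
        for k in range(degree + 1)
    ]


def bernstein_approx(x, coefs, lower=0.0, upper=1.0):
    """Evaluate sum_k coefs[k] * B_k(x), a Bernstein polynomial of degree len(coefs) - 1."""
    degree = len(coefs) - 1
    basis = bernstein_basis(x, degree, lower, upper)
    return sum(c * b for c, b in zip(coefs, basis))


# The basis functions are nonnegative and sum to one at every point.
basis_at_point = bernstein_basis(0.3, degree=5)
basis_total = sum(basis_at_point)

# With coefficients k/m, the Bernstein polynomial reproduces the identity function.
linear_value = bernstein_approx(0.3, [k / 5 for k in range(6)])
```

Because each nonlinear covariate effect is replaced by a finite linear combination of basis functions, the unknown function reduces to a vector of coefficients that can be estimated together with the regression parameters.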
