A surrogate ℓ0 sparse Cox's regression with applications to sparse high‐dimensional massive sample size time‐to‐event data

Sparse high-dimensional massive sample size (sHDMSS) time-to-event data present multiple challenges to quantitative researchers because most current sparse survival regression methods and software grind to a halt and become practically inoperable on data of this scale. This paper develops a scalable ℓ0-based sparse Cox regression tool for right-censored time-to-event data that easily takes advantage of existing high-performance implementations of ℓ2-penalized regression for sHDMSS time-to-event data. Specifically, we extend the ℓ0-based broken adaptive ridge (BAR) methodology to the Cox model, which involves repeatedly performing reweighted ℓ2-penalized regression. We rigorously show that the resulting estimator for the Cox model is selection consistent, is oracle for parameter estimation, and has a grouping property for highly correlated covariates. Furthermore, we implement our BAR method in an R package for sHDMSS time-to-event data by leveraging existing efficient algorithms for massive ℓ2-penalized Cox regression. We evaluate the BAR Cox regression method by extensive simulations and illustrate its application on an sHDMSS time-to-event data set from the National Trauma Data Bank with hundreds of thousands of observations and tens of thousands of sparsely represented covariates.
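
The idea of repeatedly performing reweighted ℓ2-penalized regression can be illustrated with a minimal sketch. It is not the R package described above: it is a dense NumPy toy that assumes no tied event times, uses plain Newton steps without step-halving, and assumes the usual BAR reweighting in which each coefficient's ridge penalty is divided by the square of its previous estimate. The function names (`cox_grad_info`, `weighted_ridge_cox`, `bar_cox`) and the tuning parameters `lam`, `xi`, and `eps` are hypothetical choices made for illustration.

```python
import numpy as np


def cox_grad_info(X, time, event, beta):
    """Gradient and observed information of the Cox log partial likelihood.

    Assumes no tied event times, so risk sets can be formed by cumulative sums.
    """
    order = np.argsort(-time)                  # latest time first
    Xs, d = X[order], event[order].astype(bool)
    eta = Xs @ beta
    w = np.exp(eta - eta.max())                # rescaled for stability; ratios unchanged
    s0 = np.cumsum(w)                          # risk-set sums of weights
    s1 = np.cumsum(w[:, None] * Xs, axis=0)    # risk-set sums of weighted covariates
    s2 = np.cumsum(w[:, None, None] * Xs[:, :, None] * Xs[:, None, :], axis=0)
    xbar = s1 / s0[:, None]                    # risk-set weighted covariate means
    grad = (Xs[d] - xbar[d]).sum(axis=0)
    info = (s2[d] / s0[d][:, None, None]).sum(axis=0) - xbar[d].T @ xbar[d]
    return grad, info


def weighted_ridge_cox(X, time, event, penalty, beta_init, n_newton=50, tol=1e-8):
    """Minimize -2*loglik(beta) + sum_j penalty[j] * beta[j]**2 by Newton steps."""
    beta = beta_init.copy()
    for _ in range(n_newton):
        grad, info = cox_grad_info(X, time, event, beta)
        step = np.linalg.solve(info + np.diag(penalty), grad - penalty * beta)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def bar_cox(X, time, event, lam, xi=1.0, n_outer=100, eps=1e-6, tol=1e-8):
    """Illustrative BAR loop: an initial ridge fit, then reweighted ridge fits."""
    p = X.shape[1]
    # Initial estimate: ordinary ridge-penalized Cox regression with penalty xi.
    beta = weighted_ridge_cox(X, time, event, np.full(p, xi), np.zeros(p))
    for _ in range(n_outer):
        active = np.abs(beta) > eps            # coefficients not yet shrunk to zero
        new = np.zeros(p)
        if active.any():
            # Reweighted ridge: penalty lam / (previous estimate)^2 per coefficient.
            new[active] = weighted_ridge_cox(
                X[:, active], time, event, lam / beta[active] ** 2, beta[active]
            )
        converged = np.max(np.abs(new - beta)) < tol
        beta = new
        if converged:
            break
    beta[np.abs(beta) <= eps] = 0.0            # report negligible coefficients as exact zeros
    return beta
```

Calling `bar_cox(X, time, event, lam=np.log(len(time)))` on an `n × p` covariate matrix with observed times and 0/1 event indicators returns a coefficient vector in which small effects are set exactly to zero. The choice `lam = log(n)` here is only an illustrative, BIC-like default. The sketch forms dense `p × p` matrices and would not scale to sHDMSS data; it is meant only to show the structure of the iteration.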
