L1 Penalized Estimation in the Cox Proportional Hazards Model

This article presents a novel algorithm that efficiently computes L1 penalized (lasso) estimates of parameters in high-dimensional models. The lasso simultaneously performs variable selection and shrinkage, which makes it very useful for finding interpretable prediction rules in high-dimensional data. The new algorithm combines gradient ascent optimization with the Newton–Raphson algorithm. It is described for a general likelihood function and can be applied in generalized linear models and other models with an L1 penalty. The algorithm is demonstrated in the Cox proportional hazards model, predicting survival of breast cancer patients using gene expression data, and its performance is compared with competing approaches. The method is implemented in the R package penalized, which is available on CRAN.
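To make the estimation target concrete, the following is a minimal sketch of an L1 penalized Cox fit in Python. It is not the article's combined gradient ascent and Newton–Raphson algorithm, and it is not the code of the penalized package: it substitutes plain proximal gradient ascent (soft-thresholding with backtracking), chosen only because it is short. The function names, the simulated data, and the assumption of no tied event times are all illustrative choices that do not come from the article.

```python
import numpy as np

def cox_loglik(beta, X, time, event):
    """Cox partial log-likelihood, assuming no tied event times."""
    order = np.argsort(-time)              # latest time first: risk sets become prefixes
    eta = X[order] @ beta
    log_s0 = np.logaddexp.accumulate(eta)  # log of sum over the risk set of exp(eta)
    return event[order].astype(float) @ (eta - log_s0)

def cox_grad(beta, X, time, event):
    """Gradient of the partial log-likelihood (O(n^2) memory, fine for a sketch)."""
    order = np.argsort(-time)
    Xo, ev = X[order], event[order].astype(float)
    eta = Xo @ beta
    log_s0 = np.logaddexp.accumulate(eta)
    at_risk = np.tri(len(eta), dtype=bool)  # at_risk[i, j]: subject j is in risk set i
    expo = np.where(at_risk, eta[None, :] - log_s0[:, None], -np.inf)
    W = np.exp(expo)                        # each row sums to 1 over its risk set
    return ev @ (Xo - W @ Xo)

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cox(X, time, event, lam, n_iter=500, tol=1e-8):
    """Maximize loglik(beta) - lam * ||beta||_1 by proximal gradient ascent."""
    beta = np.zeros(X.shape[1])
    step = 1.0
    for _ in range(n_iter):
        ll = cox_loglik(beta, X, time, event)
        grad = cox_grad(beta, X, time, event)
        for _ in range(60):                 # backtracking on the step size
            new = soft_threshold(beta + step * grad, step * lam)
            diff = new - beta
            ll_new = cox_loglik(new, X, time, event)
            if ll_new >= ll + grad @ diff - diff @ diff / (2 * step) - 1e-12:
                break
            step *= 0.5
        beta = new
        if np.max(np.abs(diff)) < tol:
            break
    return beta

def penalized_loglik(beta, X, time, event, lam):
    return cox_loglik(beta, X, time, event) - lam * np.sum(np.abs(beta))

# Simulated toy data (hypothetical, not the breast cancer data from the article).
rng = np.random.default_rng(0)
n, p = 120, 8
X = rng.standard_normal((n, p))
beta_true = np.array([1.0, -1.0] + [0.0] * (p - 2))
t_event = rng.exponential(1.0 / np.exp(X @ beta_true))
t_cens = rng.exponential(2.0, size=n)
time = np.minimum(t_event, t_cens)
event = t_event <= t_cens

# Smallest penalty at which the all-zero vector is optimal.
lam_max = np.max(np.abs(cox_grad(np.zeros(p), X, time, event)))
beta_hat = lasso_cox(X, time, event, lam=0.1 * lam_max)
beta_null = lasso_cox(X, time, event, lam=1.01 * lam_max)
print("estimated coefficients:", np.round(beta_hat, 3))
```

The sketch illustrates the two lasso properties named in the abstract: the soft-thresholding step sets coefficients exactly to zero (selection) and pulls the remaining ones toward zero (shrinkage). Any penalty above `lam_max` leaves every coefficient at zero.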
