Survival Analysis with High-Dimensional Covariates: An Application in Microarray Studies

Statistical Applications in Genetics and Molecular Biology

Microarray technology often produces high-dimensional, low-sample-size (HDLSS) data. Many approaches to variable selection have been proposed for this setting, but only a few have been adapted for time-to-event data in which censoring is present. Among standard variable selection methods, the elastic net penalization approach has been shown both to have good predictive accuracy and to be computationally efficient. In this paper, adaptations of the elastic net approach are presented for variable selection under the Cox proportional hazards model and under an accelerated failure time (AFT) model. The two methods are assessed through simulation studies and through analysis of microarray data from a set of patients with diffuse large B-cell lymphoma, where survival time is of interest. The approaches are shown to match or exceed the predictive performance of a Cox-based and an AFT-based variable selection method, and to be much more computationally efficient than these respective Cox- and AFT-based counterparts.
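To make the idea of elastic net penalization under the Cox model concrete, the following is a minimal sketch of the generic penalized objective: the negative Cox log partial likelihood plus an L1 and a squared L2 penalty. It is an illustration under stated assumptions, not the paper's algorithm. The function names, the tuning parameters `lam1` and `lam2`, and the assumption of no tied event times are all hypothetical choices made for this example.

```python
import numpy as np


def cox_neg_log_partial_likelihood(beta, X, time, event):
    """Negative Cox log partial likelihood (assumes no tied event times)."""
    eta = X @ beta  # linear predictor for every subject
    nll = 0.0
    for i in np.flatnonzero(event):  # only uncensored subjects contribute
        at_risk = time >= time[i]    # risk set at subject i's event time
        nll -= eta[i] - np.log(np.sum(np.exp(eta[at_risk])))
    return nll


def elastic_net_cox_objective(beta, X, time, event, lam1, lam2):
    """Penalized objective: lam1 * ||beta||_1 + lam2 * ||beta||_2^2 added to the loss.

    lam1 and lam2 are illustrative tuning parameters (assumed names).
    """
    penalty = lam1 * np.sum(np.abs(beta)) + lam2 * np.sum(beta ** 2)
    return cox_neg_log_partial_likelihood(beta, X, time, event) + penalty
```

Minimizing this objective over `beta` shrinks coefficients and sets some exactly to zero through the L1 term, while the L2 term stabilizes the fit when covariates are strongly correlated, as is typical of gene expression data. In practice the tuning parameters would be chosen by a data-driven criterion such as cross-validation.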
