Extended Bayesian information criterion in the Cox model with a high-dimensional feature space

Variable selection in the Cox proportional hazards model (the Cox model) has proven important in many microarray genetic studies. However, theoretical results on variable selection procedures for the Cox model with a high-dimensional feature space are rare because of the model's complicated data structure. In this paper, we consider the extended Bayesian information criterion (EBIC) for variable selection in the Cox model and establish its selection consistency when the feature space is high-dimensional. The EBIC is used to select the best model from a sequence of candidate models generated by the SIS-ALasso procedure. Simulation studies and a real data analysis demonstrate the merits of the EBIC.
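
As a rough illustration of how an EBIC-based choice among candidate models could work, the sketch below scores each model with the commonly stated general form of the criterion, -2 log L + k log n + 2 gamma log C(p, k), and picks the minimizer. It is not the paper's implementation. It assumes that log L is the maximized log partial likelihood of a fitted Cox model, that n is the sample size (the exact penalty used in the paper may differ), and that the candidate sequence has already been produced by some screening and penalization step. The function names, the default gamma, and the numbers in the usage lines are hypothetical.

```python
import math


def log_binom(p, k):
    """Log of the binomial coefficient C(p, k), computed via log-gamma."""
    return math.lgamma(p + 1) - math.lgamma(k + 1) - math.lgamma(p - k + 1)


def ebic(log_lik, k, n, p, gamma=1.0):
    """EBIC score for a model with k selected features out of p.

    log_lik : maximized log (partial) likelihood of the model (assumed given)
    n       : sample size (assumption; the paper's penalty may differ)
    gamma   : weight on the model-space penalty; gamma = 0 gives ordinary BIC
    """
    return -2.0 * log_lik + k * math.log(n) + 2.0 * gamma * log_binom(p, k)


def select_by_ebic(candidates, n, p, gamma=1.0):
    """Return (index of the best model, list of scores).

    candidates is a list of (log_lik, k) pairs, one per model in the sequence.
    """
    scores = [ebic(ll, k, n, p, gamma) for ll, k in candidates]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical model sequence: (log partial likelihood, number of features).
candidates = [(-120.0, 1), (-100.0, 2), (-99.5, 3)]
best, scores = select_by_ebic(candidates, n=200, p=1000, gamma=1.0)
```

The extra term 2 gamma log C(p, k) grows with the number of models of size k, so a larger model must improve the likelihood by more than BIC alone would require when p is large.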
