Identifying cancer biomarkers through a network regularized Cox model

A central problem in cancer genomics is to identify interpretable biomarkers for better disease prognosis. Many of the biomarkers identified through Cox proportional hazards (PH) models are biologically uninterpretable. We propose a graph Laplacian regularized Cox PH model that integrates biological networks into the feature selection problem in survival analysis. Simulation studies demonstrate that the proposed algorithm outperforms L1 and L1+L2 regularized Cox PH models. We further validate the algorithm by its ability to identify key known biomarkers such as p53 and myc in estrogen receptor positive breast cancer patients, using genomic aberration data generated by The Cancer Genome Atlas consortium. With the rapid expansion of our knowledge of biological networks, this approach will become increasingly useful for mining high-throughput genomic datasets.
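
To make the idea of a network regularized Cox objective concrete, here is a minimal sketch in Python. The abstract does not give the exact form of the penalty, so this sketch assumes one: the negative Cox log partial likelihood plus an L1 term and a graph Laplacian quadratic term. The function names, the weights `lam1` and `lam2`, and the use of the unnormalized Laplacian are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Unnormalized Laplacian L = D - A of an undirected network (assumed form)."""
    adjacency = np.asarray(adjacency, dtype=float)
    return np.diag(adjacency.sum(axis=1)) - adjacency


def neg_log_partial_likelihood(beta, X, time, event):
    """Negative Cox log partial likelihood, assuming no tied event times."""
    order = np.argsort(-np.asarray(time, dtype=float))  # descending survival time
    eta = np.asarray(X, dtype=float)[order] @ beta       # linear predictors
    observed = np.asarray(event)[order].astype(bool)     # True = event, False = censored
    # Running log-sum-exp: entry i covers every subject still at risk at time i.
    log_risk = np.logaddexp.accumulate(eta)
    return -np.sum(eta[observed] - log_risk[observed])


def penalized_objective(beta, X, time, event, L, lam1, lam2):
    """Assumed objective: Cox loss + lam1 * ||beta||_1 + lam2 * beta^T L beta."""
    beta = np.asarray(beta, dtype=float)
    return (neg_log_partial_likelihood(beta, X, time, event)
            + lam1 * np.abs(beta).sum()
            + lam2 * beta @ L @ beta)
```

Under this assumed form, the Laplacian term equals the sum of squared coefficient differences over network edges, so it pulls the coefficients of interacting genes toward each other, while the L1 term keeps the selected feature set sparse.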
