Semiparametric Bayesian kernel survival model for evaluating pathway effects

Massive amounts of high-dimensional data have accumulated over the past two decades, fostering growing interest in identifying gene pathways related to particular biological processes. Pathway-based analysis can detect subtle changes among differentially expressed genes that gene-based analysis may miss, so identifying the gene pathways that regulate certain diseases can suggest new strategies for medical procedures and new targets for drug discovery. Existing pathway-based work has been carried out primarily in regression settings, and only limited work has studied the effects of pathways on survival outcomes. Motivated by a breast cancer gene-pathway data set that exhibits the “small n, large p” characteristic, we propose a semiparametric Bayesian kernel survival model (s-BKSurv) to study the effects of both clinical covariates and gene expression levels within a pathway on survival time. We model the unknown high-dimensional functions of pathways via a Gaussian kernel machine to allow for the possibility that genes within the same pathway interact with each other. To address the multiple comparisons problem in a fully Bayesian setting, we propose a similarity-dependent procedure based on the Bayes factor to control the family-wise error rate. We demonstrate the superior performance of our approach under various simulation settings and on pathway data.
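As a rough illustration of the Gaussian kernel machine idea mentioned above, the sketch below builds a Gaussian kernel matrix from the gene expression vectors of a pathway. It assumes the common parametrization K(z_i, z_j) = exp(-||z_i - z_j||^2 / rho); the abstract does not state the exact form used in s-BKSurv, and the function name, the scale parameter `rho`, and the toy data are hypothetical.

```python
import math


def gaussian_kernel_matrix(Z, rho):
    """Return the n x n Gaussian kernel matrix for the rows of Z.

    Z   : list of n gene expression vectors (one per subject), each of length p
    rho : positive scale parameter (assumed name, not taken from the paper)
    """
    if rho <= 0:
        raise ValueError("rho must be positive")
    n = len(Z)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Squared Euclidean distance between subjects i and j
            sq_dist = sum((a - b) ** 2 for a, b in zip(Z[i], Z[j]))
            # The kernel is symmetric, so fill both triangles at once
            K[i][j] = K[j][i] = math.exp(-sq_dist / rho)
    return K


# Toy data: 3 subjects, 3 genes in one pathway (made-up values)
Z = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0], [1.0, 1.0, 3.0]]
K = gaussian_kernel_matrix(Z, rho=2.0)
```

Because each entry depends on the whole expression vector of both subjects, subjects with similar joint expression profiles receive kernel values near 1, which is how a kernel of this type can capture interactions among genes without specifying them one by one.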
