Bayesian Network Marker Selection via the Thresholded Graph Laplacian Gaussian Prior

Selecting informative nodes over large-scale networks has become increasingly important in many research areas. Most existing methods focus on the local network structure and incur heavy computational costs on large-scale problems. In this work, we propose a novel prior model for Bayesian network marker selection in the generalized linear model (GLM) framework: the Thresholded Graph Laplacian Gaussian (TGLG) prior, which uses the graph Laplacian matrix to characterize the conditional dependence between neighboring markers while accounting for the global network structure. Under mild conditions, we show that the proposed model enjoys posterior consistency with a diverging number of edges and nodes in the network. We also develop a Metropolis-adjusted Langevin algorithm (MALA) for efficient posterior computation that scales to large networks. We illustrate the advantages of the proposed method over existing alternatives through extensive simulation studies and an analysis of the breast cancer gene expression dataset from The Cancer Genome Atlas (TCGA).
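The abstract names two generic ingredients, the graph Laplacian and a Metropolis-adjusted Langevin step, without giving formulas. The sketch below illustrates only those textbook building blocks on toy inputs; it is not the TGLG prior or the authors' sampler. The function names, the unnormalized Laplacian L = D - A, the scalar state, and the standard normal target are all assumptions made for illustration.

```python
import math
import random


def graph_laplacian(n_nodes, edges):
    """Unnormalized Laplacian L = D - A of an undirected graph (assumed form)."""
    lap = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i, j in edges:
        lap[i][j] -= 1.0  # off-diagonal: minus the adjacency entry
        lap[j][i] -= 1.0
        lap[i][i] += 1.0  # diagonal: node degree
        lap[j][j] += 1.0
    return lap


def mala_step(x, log_density, grad_log_density, step, rng):
    """One generic MALA update for a scalar state x (illustrative only)."""

    def proposal_mean(z):
        # Langevin drift: move half a step along the gradient of the log density.
        return z + 0.5 * step * grad_log_density(z)

    def log_q(to, frm):
        # Log proposal density of moving frm -> to, up to an additive constant.
        return -((to - proposal_mean(frm)) ** 2) / (2.0 * step)

    y = proposal_mean(x) + math.sqrt(step) * rng.gauss(0.0, 1.0)
    log_accept = (log_density(y) + log_q(x, y)) - (log_density(x) + log_q(y, x))
    # Metropolis-Hastings correction; accept outright when the ratio is >= 1.
    if log_accept >= 0.0 or rng.random() < math.exp(log_accept):
        return y, True
    return x, False


# Toy usage: a 3-node path graph and a standard normal target (both hypothetical).
lap = graph_laplacian(3, [(0, 1), (1, 2)])

rng = random.Random(0)
x, draws = 0.0, []
for _ in range(5000):
    x, _accepted = mala_step(x, lambda z: -0.5 * z * z, lambda z: -z, 1.0, rng)
    draws.append(x)
```

Each row of `lap` sums to zero, which is the property that lets a Laplacian-based precision matrix tie neighboring nodes together. The accept/reject step is what distinguishes MALA from an unadjusted Langevin discretization.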
