A Bayesian integrative approach for multi‐platform genomic data: A kidney cancer case study

Integrating genomic data from multiple platforms can increase precision, accuracy, and statistical power in the identification of prognostic biomarkers. A fundamental problem in many multi‐platform studies is unbalanced sample sizes, which arise because measurements cannot be obtained from every platform for every patient in the study. We have developed a novel Bayesian approach that integrates multi‐regression models to identify a small set of biomarkers that can accurately predict time‐to‐event outcomes. This method uses all the information available across platforms and does not exclude any subjects from the analysis. Through simulations, we demonstrate the utility of our method and compare its performance with that of methods that do not borrow information across regression models. Applied to The Cancer Genome Atlas kidney renal cell carcinoma dataset, which motivated this work, our methodology provides novel insights missed by non‐integrative models.
