A Mixture Copula Bayesian Network Model for Multimodal Genomic Data

Gaussian Bayesian networks are a widely used framework for estimating directed associations among jointly Gaussian variables, where the network structure encodes the decomposition of the multivariate normal density into local terms. However, the resulting estimates can be inaccurate when the normality assumption is moderately or severely violated, which makes the framework ill-suited to recent genomic data such as The Cancer Genome Atlas (TCGA) data. In this paper, we propose a mixture copula Bayesian network model that offers great flexibility in modeling non-Gaussian and multimodal data for causal inference. The parameters of the mixture copula functions can be estimated efficiently by a routine Expectation-Maximization algorithm. We develop a heuristic search algorithm based on the Bayesian information criterion to estimate the network structure, and prediction can be further improved by selecting the best-scoring network among multiple estimates obtained from random initial values. Our method outperforms Gaussian Bayesian networks and regular copula Bayesian networks in modeling flexibility and prediction accuracy, as demonstrated on a cell signaling dataset. We apply the proposed method to the TCGA data to study the genetic and epigenetic pathways that underlie serous ovarian cancer.
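
The two estimation ideas mentioned above, fitting mixture parameters by Expectation-Maximization and keeping the best-scoring fit among several random initial values, can be illustrated with a deliberately simplified sketch. This is not the model proposed in the paper: a univariate Gaussian mixture stands in for the mixture copula, no network structure is learned, and the function names (`em_gmm_1d`, `bic`, `best_of_restarts`) are hypothetical. The sketch also assumes the convention that a lower Bayesian information criterion is better.

```python
import numpy as np


def _log_terms(x, w, mu, var):
    """Log of w_j * N(x_i | mu_j, var_j) for every point i and component j."""
    return (np.log(w) - 0.5 * np.log(2 * np.pi * var)
            - 0.5 * (x[:, None] - mu) ** 2 / var)


def em_gmm_1d(x, k, rng, n_iter=200, tol=1e-8):
    """Fit a k-component 1-D Gaussian mixture by EM from a random start.

    Stand-in for the mixture copula fit; returns (log-likelihood, params).
    """
    n = x.size
    w = np.full(k, 1.0 / k)
    mu = rng.choice(x, size=k, replace=False)  # random initial means
    var = np.full(k, x.var())
    prev = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each point.
        log_p = _log_terms(x, w, mu, var)
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        # M-step: weighted updates of weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / n
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        loglik = log_norm.sum()
        if loglik - prev < tol:
            break
        prev = loglik
    # Log-likelihood evaluated at the final parameters.
    final = np.logaddexp.reduce(_log_terms(x, w, mu, var), axis=1).sum()
    return final, (w, mu, var)


def bic(loglik, n_params, n):
    """Bayesian information criterion; lower is better under this convention."""
    return -2.0 * loglik + n_params * np.log(n)


def best_of_restarts(x, k, n_starts=5, seed=0):
    """Run EM from several random initial values and keep the lowest-BIC fit."""
    rng = np.random.default_rng(seed)
    best_score, best_params = np.inf, None
    for _ in range(n_starts):
        loglik, params = em_gmm_1d(x, k, rng)
        score = bic(loglik, 3 * k - 1, x.size)  # k means, k variances, k-1 weights
        if score < best_score:
            best_score, best_params = score, params
    return best_score, best_params
```

On clearly bimodal data, `best_of_restarts(x, 2)` should return a lower score than `best_of_restarts(x, 1)`, which is the kind of comparison a criterion-based search relies on when choosing among candidate models.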
