Bayesian Data Fusion of Gene Expression and Histone Modification Profiles for Inference of Gene Regulatory Network

Accurately reconstructing gene regulatory networks (GRNs) from high-throughput gene expression data has been a major challenge in systems biology for decades. Many approaches have been proposed, yet the accuracy of GRN inference still leaves much room for improvement. Integrating data from different sources is a promising strategy. Because epigenetic modifications are closely tied to gene regulation, epigenetic data such as histone modification profiles can provide useful information for uncovering regulatory interactions between genes. In this paper, we propose a method that integrates epigenetic data into GRN inference. Specifically, a dynamic Bayesian network (DBN) is used to infer gene regulatory interactions from time-series gene expression data, and the epigenetic data (here, histone modification profiles) are incorporated into the prior probability distribution of the Bayesian model. We validated the method on both synthetic and real datasets. Experimental results show that integrating epigenetic data can significantly improve the performance of GRN inference. As more epigenetic datasets become available, our method should be useful for elucidating the gene regulatory mechanisms that drive various cellular activities. The source code and testing datasets are available at https://github.com/Zheng-Lab/MetaGRN/tree/master/histonePrior.
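
To make the idea of an epigenetic structure prior concrete, the following is a minimal sketch and not the implementation in the repository above. It assumes a commonly used energy-based (Gibbs) form of prior, log P(G) = -beta * E(G) + const, where E(G) sums the disagreement between a candidate network G and a prior-belief matrix. The names `belief`, `beta`, `prior_energy`, `log_structure_prior`, and `log_posterior_score` are hypothetical, the belief values are made up for illustration, and the way histone modification profiles would be converted into such beliefs is left open here.

```python
from typing import List

IntMatrix = List[List[int]]
FloatMatrix = List[List[float]]


def prior_energy(graph: IntMatrix, belief: FloatMatrix) -> float:
    """Sum of |belief[i][j] - graph[i][j]| over all ordered gene pairs.

    graph[i][j] is 1 if gene i regulates gene j in the candidate network,
    else 0. belief[i][j] in [0, 1] is an assumed prior confidence in that
    edge (0.5 means "no information").
    """
    return sum(
        abs(belief[i][j] - graph[i][j])
        for i in range(len(graph))
        for j in range(len(graph[i]))
    )


def log_structure_prior(graph: IntMatrix, belief: FloatMatrix, beta: float) -> float:
    """Unnormalised log prior: -beta * E(G). Larger beta trusts the prior more."""
    return -beta * prior_energy(graph, belief)


def log_posterior_score(
    log_likelihood: float, graph: IntMatrix, belief: FloatMatrix, beta: float
) -> float:
    """Unnormalised log posterior: log-likelihood of the expression data
    under the candidate network plus the log structure prior."""
    return log_likelihood + log_structure_prior(graph, belief, beta)


# Illustrative two-gene example with made-up belief values:
# the prior favours the edge gene0 -> gene1 and disfavours gene1 -> gene0.
belief = [[0.5, 0.9],
          [0.1, 0.5]]
graph_a = [[0, 1],
           [0, 0]]  # gene0 -> gene1
graph_b = [[0, 0],
           [1, 0]]  # gene1 -> gene0

# With equal data likelihood, the prior alone breaks the tie in favour of graph_a.
score_a = log_posterior_score(-10.0, graph_a, belief, beta=2.0)
score_b = log_posterior_score(-10.0, graph_b, belief, beta=2.0)
```

In this sketch, `beta = 0` switches the prior off entirely, so the score reduces to the data likelihood; a structure search (for example, a sampling or greedy procedure) would compare such scores across candidate networks.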
