Combining gene expression data and prior knowledge for inferring gene regulatory networks via Bayesian networks using structural restrictions

Abstract Gene Regulatory Networks (GRNs) are widely regarded as one of the most suitable tools for gaining clear insight into cellular systems. Bayesian networks (BNs) are among the most successful techniques for reconstructing GRNs from gene expression data, and they have proven well suited to integrating heterogeneous data in the learning process. So far, however, prior knowledge has been incorporated mainly through prior beliefs or by using known networks as the starting point of the search. This work considers instead the use of different kinds of structural restrictions within algorithms for learning BNs from gene expression data. These restrictions encode prior knowledge as constraints that the learned BN must satisfy. One aim of this work is therefore to review in detail how prior knowledge and gene expression data have been combined to infer GRNs with BNs. The main purpose, however, is to investigate whether structural learning algorithms for BNs achieve better results from expression data when they exploit prior knowledge through structural restrictions. The experimental study shows that this way of incorporating prior knowledge yields better reverse-engineered networks.
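
To make the idea of a structural restriction concrete, the following minimal Python sketch checks whether a candidate network structure satisfies a set of restrictions. It is an illustration only, not the authors' implementation: the abstract does not specify which kinds of restrictions are used, so the two kinds shown here (arcs that must be present and arcs that must be absent), the function names `is_acyclic` and `satisfies_restrictions`, and the gene labels `g1` to `g3` are all assumptions.

```python
from typing import Iterable, Set, Tuple

Arc = Tuple[str, str]  # (regulator, target); hypothetical representation


def is_acyclic(nodes: Iterable[str], arcs: Set[Arc]) -> bool:
    """Return True if the directed graph has no cycle (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in indegree}
    for parent, child in arcs:
        children[parent].append(child)
        indegree[child] += 1
    # Repeatedly remove nodes that have no remaining parents.
    queue = [n for n, d in indegree.items() if d == 0]
    removed = 0
    while queue:
        node = queue.pop()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Every node can be removed only when there is no cycle.
    return removed == len(indegree)


def satisfies_restrictions(
    nodes: Iterable[str],
    arcs: Iterable[Arc],
    required: Iterable[Arc] = (),
    forbidden: Iterable[Arc] = (),
) -> bool:
    """Check a candidate structure against assumed prior-knowledge restrictions.

    required:  arcs that prior knowledge says must be present (assumption).
    forbidden: arcs that prior knowledge says must be absent (assumption).
    """
    arc_set = set(arcs)
    if not set(required) <= arc_set:
        return False  # a required arc is missing
    if set(forbidden) & arc_set:
        return False  # a forbidden arc is present
    return is_acyclic(list(nodes), arc_set)


# Usage with hypothetical genes g1, g2, g3.
genes = ["g1", "g2", "g3"]
candidate = {("g1", "g2"), ("g2", "g3")}
ok = satisfies_restrictions(
    genes, candidate, required={("g1", "g2")}, forbidden={("g3", "g1")}
)
```

In a score-based search, a check of this kind could be applied to each candidate structure so that only networks consistent with the prior knowledge are scored against the expression data; how the restrictions are actually integrated into the learning algorithms is not described in the abstract.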
