High-order dynamic Bayesian Network learning with hidden common causes for causal gene regulatory network

Background: Inferring gene regulatory networks (GRNs) has been an important topic in bioinformatics. Many computational methods infer the GRN from high-throughput expression data. Because regulatory relationships involve time delays, the High-Order Dynamic Bayesian Network (HO-DBN) is a good model of a GRN. However, previous GRN inference methods assume causal sufficiency, i.e., that there are no unobserved common causes. This assumption is convenient but unrealistic, because relevant factors may not even have been conceived of and are therefore unmeasured. An inference method that also handles hidden common cause(s) is therefore highly desirable. In addition, previous methods for discovering hidden common causes either do not handle multi-step time delays or require that the parents of hidden common causes not be observed genes.

Results: We have developed a discrete HO-DBN learning algorithm that can also infer hidden common cause(s) from discrete time series expression data. It makes some assumptions on the conditional distribution but is less restrictive than previous methods. We assume that each hidden variable has only observed variables as children and parents, with at least two children and possibly no parents. We also make the simplifying assumption that the children of hidden variable(s) are not linked to each other. Moreover, because long time series are difficult to obtain, our proposed algorithm can utilize multiple short time series, which need not be of the same length.

Conclusions: We have performed extensive experiments using synthetic data on GRNs of size up to 100, with up to 10 hidden nodes. Experimental results show that our proposed algorithm can recover the causal GRNs adequately given the incomplete data. Using the limited real expression data and small subnetworks of the YEASTRACT network, we have also demonstrated the potential of our algorithm on real data, though more time series expression data is needed.
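The structural assumptions stated in the Results can be illustrated with a small sketch. This is not the authors' algorithm or code; it only checks whether a candidate network satisfies the stated constraints on hidden variables. The names `Edge` and `check_hidden_assumptions`, the gene labels, and the choice to ignore self-loops when testing whether children are linked are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A time-delayed link: parent at time t - delay acts on child at time t."""
    parent: str
    child: str
    delay: int  # lag in time steps (hypothetical field, for illustration)


def check_hidden_assumptions(edges, hidden):
    """Return a list of violated assumptions; an empty list means none found."""
    hidden = set(hidden)
    problems = []
    for h in sorted(hidden):
        children = {e.child for e in edges if e.parent == h}
        parents = {e.parent for e in edges if e.child == h}
        # Each hidden variable needs at least two children (parents are optional).
        if len(children) < 2:
            problems.append(f"{h}: needs at least two children, has {len(children)}")
        # Children and parents of a hidden variable must be observed.
        if children & hidden:
            problems.append(f"{h}: has a hidden child")
        if parents & hidden:
            problems.append(f"{h}: has a hidden parent")
        # Children must not be linked to each other.
        # Assumption: a self-loop on a single child is not counted as such a link.
        if any(e.parent in children and e.child in children and e.parent != e.child
               for e in edges):
            problems.append(f"{h}: children are linked to each other")
    return problems


# Hidden node H has one observed parent (g1) and two observed children (g2, g3)
# with different delays.
ok_edges = [Edge("g1", "H", 1), Edge("H", "g2", 1), Edge("H", "g3", 2)]
# Adding a direct link between the two children breaks the simplifying assumption.
bad_edges = ok_edges + [Edge("g2", "g3", 1)]

print(check_hidden_assumptions(ok_edges, {"H"}))   # []
print(check_hidden_assumptions(bad_edges, {"H"}))  # ['H: children are linked to each other']
```

A check of this kind could be useful when generating synthetic networks for testing, since it rejects structures that fall outside the class the abstract says the method targets.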
