An Evaluation of Methods for Inferring Boolean Networks from Time-Series Data

Regulatory networks play a central role in cellular behavior and decision making. Learning these regulatory networks is a major task in biology, and devising computational methods and mathematical models for this task is a major endeavor in bioinformatics. Boolean networks have been used extensively for modeling regulatory networks. In this model, each gene is either ‘on’ or ‘off’, and the next state of a gene is updated, synchronously or asynchronously, according to a Boolean rule applied to the current state of the entire system. Inferring a Boolean network from a set of experimental data entails two main steps: first, the experimental time-series data are discretized into Boolean trajectories; then, a Boolean network is learned from these trajectories. In this paper, we consider three methods for data discretization, including a new one we propose, and three methods for learning Boolean networks, and we study the performance of all nine possible combinations on four regulatory systems of varying dynamic complexity. We find that employing the right combination of methods for data discretization and network learning results in Boolean networks that capture the dynamics well and provide predictive power. Our findings contrast with a recent survey that placed Boolean networks on the low end of the “faithfulness to biological reality” and “ability to model dynamics” spectra. Further, contrary to the common argument in favor of Boolean networks, we find that a relatively large number of time points in the time-series data is required to learn good Boolean networks for certain data sets. Finally, although methods for inferring Boolean networks have been proposed, publicly available implementations of them are still missing. Here, we make our implementation of the methods publicly available as open source at http://bioinfo.cs.rice.edu/.
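
To make the model concrete, the following is a minimal Python sketch of the two ingredients described in the abstract: discretizing a time series into Boolean values and applying a synchronous Boolean update. It is an illustration under stated assumptions, not the implementation released with the paper. The mean-threshold rule in binarize, the two-gene toy network, and all function names are hypothetical choices made only for this example.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]
Rule = Callable[[State], bool]


def binarize(series: List[float]) -> List[bool]:
    """Discretize one gene's time series by thresholding at its mean.

    Assumption: a mean threshold is used purely for illustration; it is
    not claimed to be one of the discretization methods in the paper.
    """
    threshold = sum(series) / len(series)
    return [value > threshold for value in series]


def step(state: State, rules: Dict[str, Rule]) -> State:
    """Synchronous update: every rule reads the same current state."""
    return {gene: rule(state) for gene, rule in rules.items()}


def simulate(state: State, rules: Dict[str, Rule], n_steps: int) -> List[State]:
    """Return the trajectory of n_steps synchronous updates, including the start."""
    trajectory = [state]
    for _ in range(n_steps):
        state = step(state, rules)
        trajectory.append(state)
    return trajectory


# Hypothetical two-gene toy network: B represses A, A activates B.
toy_rules: Dict[str, Rule] = {
    "A": lambda s: not s["B"],
    "B": lambda s: s["A"],
}

if __name__ == "__main__":
    print(binarize([0.1, 0.2, 0.9, 1.0]))
    for s in simulate({"A": True, "B": False}, toy_rules, 4):
        print(s)
```

In this toy network the synchronous dynamics cycle through four states and return to the starting state. Learning a network would run in the opposite direction: given Boolean trajectories such as those produced by binarize, search for rules whose step function reproduces the observed transitions.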
