ATEN: And/Or tree ensemble for inferring accurate Boolean network topology and dynamics

MOTIVATION: Inferring gene regulatory networks from gene expression time series data is important for gaining insights into the complex processes of cell life. A popular approach is to infer Boolean networks. However, inferring accurate Boolean networks from experimental data, which are typically short and noisy, remains a pressing open problem.

RESULTS: To address this problem, we propose a Boolean network inference algorithm that can infer accurate Boolean network topology and dynamics from short and noisy time series data. The main idea is that, for each target gene, we use an And/Or tree ensemble algorithm to select prime implicants, each of which is a conjunction of a set of input genes. The selected prime implicants are important features for predicting the states of the target gene. Using these features, we then infer the Boolean function of the target gene. Finally, the Boolean functions of all target genes are combined into a Boolean network. Using data generated from artificial and real-world gene regulatory networks, we show that our algorithm infers more accurate Boolean network topology and dynamics from short and noisy time series data than other algorithms. Our algorithm enables us to gain better insights into the complex regulatory mechanisms of cell life.

AVAILABILITY: Package ATEN is freely available at https://github.com/ningshi/ATEN.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
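To make the representation described in the abstract concrete, the following minimal Python sketch shows how a Boolean function can be written as a disjunction of conjunctions of input genes, and how such per-gene functions combine into a network that can be simulated. It is an illustration only, not the ATEN implementation: the three-gene network, the helper names (`eval_dnf`, `step`, `simulate`), and the choice of synchronous updating are all assumptions made for the example, and the sketch does not perform any inference.

```python
from typing import Dict, FrozenSet, List, Tuple

# A literal is (gene_name, required_value); a conjunction is a set of literals.
Literal = Tuple[str, bool]
Conjunction = FrozenSet[Literal]
State = Dict[str, bool]


def eval_dnf(terms: List[Conjunction], state: State) -> bool:
    """Return True if at least one conjunction is fully satisfied by the state."""
    return any(all(state[gene] == value for gene, value in term) for term in terms)


def step(network: Dict[str, List[Conjunction]], state: State) -> State:
    """Synchronous update (an assumption): all genes are updated from the same state."""
    return {gene: eval_dnf(terms, state) for gene, terms in network.items()}


def simulate(network: Dict[str, List[Conjunction]], state: State, n_steps: int) -> List[State]:
    """Return the trajectory of n_steps + 1 states, starting from the given state."""
    trajectory = [state]
    for _ in range(n_steps):
        state = step(network, state)
        trajectory.append(state)
    return trajectory


# Hypothetical three-gene network (not taken from the paper):
#   A(t+1) = B AND NOT C
#   B(t+1) = A OR C
#   C(t+1) = A AND B
toy_network: Dict[str, List[Conjunction]] = {
    "A": [frozenset({("B", True), ("C", False)})],
    "B": [frozenset({("A", True)}), frozenset({("C", True)})],
    "C": [frozenset({("A", True), ("B", True)})],
}

if __name__ == "__main__":
    start = {"A": True, "B": False, "C": False}
    for t, s in enumerate(simulate(toy_network, start, 4)):
        print(t, s)
```

In this form, each conjunction plays the role of one selected feature for its target gene, and the network is simply the mapping from each target gene to its list of conjunctions.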
