Learning Boolean logic models of signaling networks with ASP

Boolean networks provide a simple yet powerful qualitative modeling approach in systems biology. However, manually identifying the logic rules that underlie the system under study is in most cases out of reach. Automated inference of Boolean logic networks from experimental data is therefore a fundamental question in this field. This paper addresses the problem of learning Boolean logic models of immediate-early response in signal transduction networks from a prior knowledge network describing causal interactions and from phosphorylation activities at a pseudo-steady state. The underlying optimization problem has so far been addressed through mathematical programming approaches and dedicated genetic algorithms. In recent work we showed severe limitations of stochastic approaches in this domain and proposed using Answer Set Programming (ASP), considering a simpler problem setting. Herein, we extend our previous work to more realistic biological conditions, including numerical datasets, the presence of feedback loops in the prior knowledge network, and the necessity of multi-objective optimization. To cope with these extensions, we propose several discretization schemes and elaborate upon our previous ASP encoding. As a step towards real-world biological data, we evaluate the performance of our approach on in silico numerical datasets based on a real, large-scale prior knowledge network. The correctness of our encoding and discretization schemes is dealt with in Appendices A-B.
