Process-Driven Inference of Biological Network Structure: Feasibility, Minimality, and Multiplicity

A common problem in molecular biology is to use experimental data, such as microarray data, to infer the structure of interactions between important molecules in subsystems of the cell. Approximating the state of each molecule as “on” or “off” simplifies the problem and allows the tools of Boolean analysis to be used for such inference. Among Boolean techniques, the process-driven approach has shown promise in identifying putative network structures, as well as stability and modularity properties. This paper examines the process-driven approach more formally and makes four contributions concerning the computational complexity of the inference problem under the “dominant inhibition” assumption of molecular interactions. First, the feasibility problem (does there exist a network that explains the data?) is proved to be solvable in polynomial time. Second, the minimality problem (what is the smallest network that explains the data?) is shown to be NP-hard, and is therefore unlikely to admit a polynomial-time algorithm. Third, a simple polynomial-time heuristic is shown, by simulation, to produce near-minimal solutions. Fourth, the theoretical framework explains how multiplicity (the number of network solutions that realize a given biological process), which can take exponential time to compute, can instead be accurately estimated by a fast, polynomial-time heuristic.
