Reverse Engineering Molecular Hypergraphs

Analysis of molecular interaction networks is pervasive in systems biology. This research relies almost entirely on graphs for modeling interactions. However, edges in graphs cannot represent multiway interactions among molecules, which occur frequently within cells. Hypergraphs may be better representations for networks with such interactions, since hyperedges can naturally represent relationships among multiple molecules. Here, we propose using hypergraphs to capture the uncertainty inherent in reverse engineering gene-gene networks. Some subsets of nodes may induce highly varying subgraphs across an ensemble of networks inferred by a reverse engineering algorithm. We provide a novel formulation of hyperedges to capture this uncertainty in network topology, and we propose a clustering-based approach to discover them. We show that our approach can recover hyperedges planted in synthetic data sets with high precision and recall, even for moderate amounts of noise. We apply our techniques to a data set of pathways inferred from genetic interaction data in S. cerevisiae related to the unfolded protein response. Our approach discovers several hyperedges that capture the uncertain connectivity of genes in relevant protein complexes, suggesting that further experiments may be required to precisely discern their interaction patterns. We also show that these complexes are not discovered by an algorithm that computes frequent and dense subgraphs.
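To make the notion of "highly varying induced subgraphs" concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes each inferred network is given as a set of undirected edges and scores a node subset by the Shannon entropy of the distinct subgraphs it induces across the ensemble. The function names, the entropy score, and the toy ensemble are all illustrative assumptions.

```python
from collections import Counter
from math import log2


def edge(u, v):
    """Represent an undirected edge as a frozenset of its two endpoints."""
    return frozenset((u, v))


def induced_edges(edges, nodes):
    """Return the edges of the subgraph induced by `nodes`, as a frozenset."""
    nodes = set(nodes)
    return frozenset(e for e in edges if e <= nodes)


def induced_subgraph_entropy(ensemble, nodes):
    """Shannon entropy (in bits) of the induced subgraphs over the ensemble.

    A value of 0 means every network agrees on how `nodes` are connected;
    larger values mean the connectivity is more uncertain. This score is an
    assumption made for illustration, not the paper's hyperedge definition.
    """
    counts = Counter(induced_edges(g, nodes) for g in ensemble)
    total = sum(counts.values())
    return sum((c / total) * log2(total / c) for c in counts.values())


# Hypothetical ensemble of four inferred networks over genes A to E.
ensemble = [
    {edge("A", "B"), edge("B", "C"), edge("D", "E")},
    {edge("A", "C"), edge("B", "C"), edge("D", "E")},
    {edge("A", "B"), edge("A", "C"), edge("D", "E")},
    {edge("A", "B"), edge("B", "C"), edge("D", "E")},
]

uncertain = induced_subgraph_entropy(ensemble, {"A", "B", "C"})
stable = induced_subgraph_entropy(ensemble, {"D", "E"})
```

In this toy ensemble, the subset {A, B, C} is wired differently in different networks and so receives a positive score, while {D, E} is connected identically in every network and scores zero. A subset with a high score is the kind of candidate that a hyperedge could summarize.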
