Organization of Physical Interactomes as Uncovered by Network Schemas

Large-scale protein-protein interaction networks provide new opportunities for understanding cellular organization and functioning. We introduce network schemas to elucidate shared mechanisms within interactomes. Network schemas specify descriptions of proteins and the topology of interactions among them. We develop algorithms for systematically uncovering recurring, over-represented schemas in physical interaction networks. We apply our methods to the S. cerevisiae interactome, focusing on schemas consisting of proteins described via sequence motifs and molecular function annotations and interacting with one another in one of four basic network topologies. We identify hundreds of recurring and over-represented network schemas of various complexity, and demonstrate via graph-theoretic representations how more complex schemas are organized in terms of their lower-order constituents. The uncovered schemas span a wide range of cellular activities, with many signaling and transport related higher-order schemas. We establish the functional importance of the schemas by showing that they correspond to functionally cohesive sets of proteins, are enriched in the frequency with which they have instances in the H. sapiens interactome, and are useful for predicting protein function. Our findings suggest that network schemas are a powerful paradigm for organizing, interrogating, and annotating cellular networks.
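As a rough illustration of what a network schema and its instances could look like, the following minimal Python sketch treats a schema as a list of required protein descriptions plus the interactions among them, and finds matching instances in a toy network by brute force. It is not the authors' algorithm. The protein names, annotation labels, and the `find_instances` helper are all hypothetical, and the sketch makes no attempt at the over-representation statistics.

```python
from itertools import permutations

# Toy annotated network. Protein names and labels are hypothetical,
# chosen only to illustrate the idea of a schema instance.
annotations = {
    "P1": {"kinase"},
    "P2": {"SH3"},
    "P3": {"kinase", "SH3"},
    "P4": {"transporter"},
}
interactions = {
    frozenset(edge)
    for edge in [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P1", "P3")]
}


def find_instances(node_labels, schema_edges, annotations, interactions):
    """Return the protein tuples that match a schema.

    node_labels[i] is the description required of the i-th schema node;
    schema_edges lists index pairs (i, j) that must physically interact.
    Brute force over ordered tuples of distinct proteins, so it is only
    suitable for tiny examples. Schemas with interchangeable nodes are
    reported once per label assignment.
    """
    k = len(node_labels)
    found = set()
    for combo in permutations(annotations, k):
        # Every protein must carry the description its schema node requires.
        if not all(node_labels[i] in annotations[combo[i]] for i in range(k)):
            continue
        # Every schema edge must be present in the interaction network.
        if all(frozenset((combo[i], combo[j])) in interactions
               for i, j in schema_edges):
            found.add(combo)
    return found


# A two-protein schema: a kinase interacting with an SH3-containing protein.
pair = find_instances(["kinase", "SH3"], [(0, 1)], annotations, interactions)

# A three-protein linear schema: kinase - SH3 - transporter.
path = find_instances(
    ["kinase", "SH3", "transporter"], [(0, 1), (1, 2)], annotations, interactions
)

print(sorted(pair))
print(sorted(path))
```

In this toy network the linear schema's single instance contains an instance of the two-protein schema as a sub-instance, which loosely mirrors the idea of describing more complex schemas in terms of their lower-order constituents.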
