Unraveling Protein Networks with Power Graph Analysis

Networks play a crucial role in computational biology, yet their analysis and representation remain an open problem. Power Graph Analysis is a lossless transformation of biological networks into a compact, less redundant representation, exploiting the abundance of cliques and bicliques as elementary topological motifs. We demonstrate the advantages of Power Graph Analysis with five examples. Investigating protein-protein interaction networks, we show how the catalytic subunits of the casein kinase II complex can be distinguished from the regulatory subunits, how interaction profiles and sequence phylogeny of SH3 domains correlate, and how false positive interactions can be spotted among high-throughput interactions. Additionally, we demonstrate the generality of Power Graph Analysis by applying it to two other types of networks. We show how power graphs induce a clustering of both transcription factors and target genes in bipartite transcription networks, and how the erosion of a phosphatase domain in type 22 non-receptor tyrosine phosphatases is detected. We apply Power Graph Analysis to high-throughput protein interaction networks and show that up to 85% (56% on average) of the information is redundant. Experimental networks are more compressible than rewired networks with the same degree distribution, indicating that experimental networks are rich in cliques and bicliques. Power graphs are a novel representation of networks that reduces network complexity by explicitly representing recurring network motifs. Power graphs compress up to 85% of the edges in protein interaction networks and are applicable to all types of networks, such as protein interaction networks, regulatory networks, and homology networks.
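To make the idea of a lossless, compressed representation concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a power graph can be written as a list of "power edges" between node sets, where an edge from a set to itself stands for a clique and an edge between two disjoint sets stands for a biclique. The function names `expand` and `edge_reduction`, the toy node labels, and the definition of edge reduction as one minus the ratio of power edges to original edges are assumptions made for illustration. The sketch only checks that a hand-written power graph expands back to the original edges; it does not search for the power graph.

```python
from itertools import combinations, product


def expand(power_edges):
    """Expand power edges (pairs of node sets) into plain undirected edges.

    Assumption for this sketch: the two sets of a power edge are either
    identical (clique) or disjoint (biclique).
    """
    edges = set()
    for a, b in power_edges:
        if a == b:
            # A power edge from a set to itself stands for a clique.
            pairs = combinations(sorted(a), 2)
        else:
            # A power edge between two disjoint sets stands for a biclique.
            pairs = product(a, b)
        edges.update(frozenset(p) for p in pairs)
    return edges


def edge_reduction(n_edges, n_power_edges):
    """Fraction of edges saved by the power graph representation."""
    return 1 - n_power_edges / n_edges


# Hypothetical toy network: three mutually interacting proteins C1..C3
# (a clique), each also interacting with both R1 and R2 (a biclique).
core = frozenset({"C1", "C2", "C3"})
partners = frozenset({"R1", "R2"})

original_edges = {
    frozenset(p)
    for p in [
        ("C1", "C2"), ("C1", "C3"), ("C2", "C3"),
        ("C1", "R1"), ("C2", "R1"), ("C3", "R1"),
        ("C1", "R2"), ("C2", "R2"), ("C3", "R2"),
    ]
}

# Two power edges replace the nine original edges.
power_edges = [(core, core), (core, partners)]

restored = expand(power_edges)
reduction = edge_reduction(len(original_edges), len(power_edges))
print(f"lossless: {restored == original_edges}, edge reduction: {reduction:.0%}")
```

In this toy case the nine edges collapse into two power edges because every edge belongs to either the clique or the biclique. A network without such motifs would need roughly one power edge per original edge, which is the intuition behind comparing experimental networks with rewired ones.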
