An evolutionary and structural characterization of mammalian protein complex organization

Background: We have recently released a comprehensive, manually curated database of mammalian protein complexes called CORUM. Combining CORUM with other resources, we assembled a dataset of over 2700 mammalian complexes. The availability of this rich information resource allows us to search for organizational properties of these complexes.

Results: We observed that as the complexity of a protein complex, measured by its number of unique subunits, increases, the number of such complexes and the mean non-synonymous to synonymous substitution ratio of the associated genes tend to decrease. Similarly, as the number of different complexes a given protein participates in increases, the number of such proteins and the substitution ratio of the associated gene also tend to decrease. These observations provide evidence relating natural selection to the organization of mammalian complexes. We also observed greater homogeneity in predicted protein isoelectric points, secondary structure and substitution ratio in annotated complexes than in randomly generated ones. A large proportion of the protein content and interactions in the complexes could be predicted from known binary protein-protein and domain-domain interactions. In particular, we found that large proteins interact preferentially with much smaller proteins.

Conclusion: We observed similar trends in yeast and other data. Our results support the existence of conserved relations associated with the mammalian protein complexes.
