The Effect of Insertions and Deletions on Wirings in Protein-Protein Interaction Networks: A Large-Scale Study

Although insertions and deletions (indels) are a common type of sequence variation, their origin and functional consequences are not yet fully understood. Indels are known to occur preferentially in the loop regions of the affected proteins. Moreover, it has recently been demonstrated that indels are significantly more strongly correlated with functional changes than substitutions are. Taken together, there is substantial evidence that indels, not substitutions, are the predominant evolutionary factor in structural changes in proteins. It is therefore natural to hypothesize that sizable indels can modify protein interaction interfaces, causing a gain or loss of protein-protein interactions and thereby significantly rewiring the interaction networks. In this paper, we analyze this relationship in a large-scale study. We computed all paralogous protein pairs in Saccharomyces cerevisiae (yeast) and Drosophila melanogaster (fruit fly) and sorted the respective alignments according to whether they contained indels of significant length, as determined by the pair hidden Markov model (HMM)-based framework of a recent study. We subsequently computed well-known centrality measures for proteins that participated in indel alignments (indel proteins) and for those that did not. We found that indel proteins indeed showed greater variation in these measures. This demonstrates that indels have a significant influence on the evolutionary rewiring of interaction networks, which confirms our hypothesis. More generally, this study may yield relevant insights into the functional interplay of proteins and the evolutionary dynamics behind it.
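To make the comparison concrete, the following is a minimal sketch of how centrality measures might be contrasted between indel and non-indel paralog pairs. It is not the pipeline used in the study: the toy network, the protein names (`P1` to `P5`), the pair labels, and the choice of "mean absolute centrality difference within a pair" as the measure of variation are all illustrative assumptions. Betweenness is computed with a plain-Python version of Brandes' algorithm for unweighted, undirected graphs.

```python
from collections import deque
from statistics import mean


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) interactions."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def betweenness_centrality(adj):
    """Unnormalized betweenness via Brandes' algorithm (unweighted graph)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []                       # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}     # shortest-path predecessors
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                     # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}    # dependency accumulation
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair of endpoints was counted twice in an undirected graph.
    return {v: b / 2 for v, b in bc.items()}


def mean_abs_difference(pairs, centrality):
    """Mean absolute centrality difference between the members of each pair."""
    return mean([abs(centrality[a] - centrality[b]) for a, b in pairs])


# Hypothetical toy interaction network (not data from the study).
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P2", "P5")]
adj = build_adjacency(edges)

# Hypothetical paralog pairs, labeled by whether their alignment has a sizable indel.
indel_pairs = [("P2", "P4")]
non_indel_pairs = [("P1", "P5")]

bc = betweenness_centrality(adj)
dc = degree_centrality(adj)
print("betweenness, indel pairs:    ", mean_abs_difference(indel_pairs, bc))
print("betweenness, non-indel pairs:", mean_abs_difference(non_indel_pairs, bc))
print("degree, indel pairs:         ", mean_abs_difference(indel_pairs, dc))
print("degree, non-indel pairs:     ", mean_abs_difference(non_indel_pairs, dc))
```

In this toy case the indel pair differs strongly in both measures while the non-indel pair does not, which is the kind of contrast the abstract describes. On real interaction data, a statistical test over many pairs would be needed before drawing any conclusion.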
