Comparable contributions of structural-functional constraints and expression level to the rate of protein sequence evolution

Background
Proteins show a broad range of evolutionary rates. Understanding the factors responsible for the characteristic rate of evolution of a given protein is arguably one of the major goals of evolutionary biology. A long-standing general assumption was that the evolution rate is determined primarily by the specific functional constraints that affect the given protein. These constraints were traditionally thought to depend on both the specific features of the protein's structure and its biological role. The advent of systems biology brought new types of data, such as expression level and protein-protein interactions, and, unexpectedly, a variety of correlations between protein evolution rate and these variables have been observed. By far the strongest connections were repeatedly seen between protein sequence evolution rate and the expression level of the respective gene. It has been hypothesized that this link is due to selection for robustness of the protein structure to mistranslation-induced misfolding, which is particularly important for highly expressed proteins, and that this selection is the dominant determinant of the sequence evolution rate.

Results
This work is an attempt to assess the relative contributions of protein domain structure and function, on the one hand, and expression level, on the other hand, to the rate of sequence evolution. To this end, we performed a genome-wide analysis of the effect of the fusion of a pair of domains in multidomain proteins on the difference in the domain-specific evolutionary rates. The mistranslation-induced misfolding hypothesis would predict that fused domains within multidomain proteins should, on average, evolve at substantially closer rates than the same domains in different proteins because, within a multidomain protein, all domains are translated at the same rate.
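The prediction above can be made concrete with a minimal sketch. It assumes that each domain pair is summarized by two positive evolutionary rates and that the divergence of a pair is measured as the absolute log-ratio of those rates; this measure, the function names, and the toy numbers are illustrative assumptions, not the procedure or data used in the study.

```python
import math
from statistics import median


def abs_log_ratio(rate_a, rate_b):
    """Absolute log-ratio of two positive rates; 0 means identical rates."""
    return abs(math.log(rate_a / rate_b))


def median_rate_divergence(pairs):
    """Median absolute log-ratio over a collection of (rate_a, rate_b) pairs."""
    return median(abs_log_ratio(a, b) for a, b in pairs)


# Hypothetical toy rates, for illustration only (not data from the paper).
# Each tuple holds the rates of the same two domain types, either fused in
# one multidomain protein or located in two different proteins.
fused_pairs = [(0.20, 0.25), (0.50, 0.40), (0.10, 0.12)]
separate_pairs = [(0.20, 0.60), (0.50, 0.15), (0.10, 0.30)]

fused_divergence = median_rate_divergence(fused_pairs)
separate_divergence = median_rate_divergence(separate_pairs)

# Fraction of the rate difference that disappears when domains are fused:
# 1 would mean complete homogenization, 0 would mean no effect of fusion.
homogenization = 1 - fused_divergence / separate_divergence

print(f"fused: {fused_divergence:.3f}")
print(f"separate: {separate_divergence:.3f}")
print(f"homogenization: {homogenization:.3f}")
```

Under this toy measure, a homogenization value close to 1 would favor a dominant role for translation rate, whereas an intermediate value would correspond to comparable contributions of translation rate and domain-specific constraints.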
We performed a comprehensive comparison of the evolutionary rates of mammalian and plant protein domains that are either joined in multidomain proteins or contained in distinct proteins. Substantial homogenization of evolutionary rates in multidomain proteins was, indeed, observed in both animals and plants, although highly significant differences between domain-specific rates remained. The contributions of the translation rate, as determined by the effect of the fusion of a pair of domains within a multidomain protein, and intrinsic, domain-specific structural-functional constraints appear to be comparable in magnitude.

Conclusion
Fusion of domains in a multidomain protein results in substantial homogenization of the domain-specific evolutionary rates, but significant differences between domain-specific evolution rates remain. Thus, the rate of translation and intrinsic structural-functional constraints both exert sizable and comparable effects on sequence evolution.

Reviewers
This article was reviewed by Sergei Maslov, Dennis Vitkup, Claus Wilke (nominated by Orly Alter), and Allan Drummond (nominated by Joel Bader). For the full reviews, please go to the Reviewers' Reports section.
