Universal distribution of protein evolution rates as a consequence of protein folding physics

The hypothesis that folding robustness is the primary determinant of protein evolution rates is explored using a coarse-grained off-lattice model. The simplicity of the model allows rapid computation of the probability that a sequence folds into any given conformation. For each robust folder, the network of sequences that share its native structure is identified. The fitness of a sequence is postulated to be a simple function of the number of misfolded molecules that must be produced to reach a characteristic protein abundance. Fixation probabilities of mutants are computed under a simple population dynamics model, a Markov chain on the fold network is then constructed, and the fold-averaged evolution rate is computed. The distribution of the logarithm of the evolution rates across distinct networks exhibits a peak with a long tail on the low-rate side, and it resembles the universal empirical distribution of evolutionary rates more closely than either distribution resembles the log-normal distribution. The results suggest that the universal distribution of the evolutionary rates of protein-coding genes is a direct consequence of the basic physics of protein folding.
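The pipeline summarized above (misfolding cost, fitness, fixation probability, Markov chain on a fold network, fold-averaged rate) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every specific choice in it is an assumption, since the abstract does not give the functional forms: fitness is taken as an exponential of the misfolding cost, fixation follows a Kimura-style formula for a haploid population, mutants outside the network are assumed never to fix, and the stationary distribution is taken from detailed balance, which holds for this particular fixation formula when mutation is symmetric and the network is connected. All function and parameter names are hypothetical.

```python
import math


def log_fitness(p_fold, abundance, cost):
    """Log fitness of a sequence (assumed exponential form).

    To obtain `abundance` correctly folded copies at folding probability
    `p_fold`, about abundance * (1 - p_fold) / p_fold misfolded copies
    are produced; each is assumed to cost `cost` in log fitness.
    """
    misfolded = abundance * (1.0 - p_fold) / p_fold
    return -cost * misfolded


def fixation_probability(s, pop_size):
    """Kimura-style fixation probability of a single mutant (assumed).

    `s` is the log-fitness difference between mutant and resident.
    """
    if abs(s) < 1e-12:
        return 1.0 / pop_size           # neutral limit
    x = -2.0 * pop_size * s
    if x > 700.0:
        return 0.0                      # strongly deleterious; avoids overflow
    return math.expm1(-2.0 * s) / math.expm1(x)


def fold_averaged_rate(p_fold, neighbors, n_mutants, abundance, cost, pop_size):
    """Stationary-average substitution rate, in units of the neutral rate.

    p_fold[i]    : folding probability of sequence i in the network
    neighbors[i] : indices of single mutants of i that lie in the network
    n_mutants    : number of possible single mutants per sequence
    """
    logf = [log_fitness(p, abundance, cost) for p in p_fold]
    top = max(logf)
    # Detailed balance gives pi_i proportional to fitness_i ** (2 * (N - 1)).
    weights = [math.exp(2.0 * (pop_size - 1) * (lf - top)) for lf in logf]
    total = sum(weights)
    rate = 0.0
    for i, w in enumerate(weights):
        # Rate of leaving sequence i, summed over in-network mutants.
        exit_rate = sum(
            pop_size * fixation_probability(logf[j] - logf[i], pop_size)
            for j in neighbors[i]
        ) / n_mutants
        rate += (w / total) * exit_rate
    return rate


# Toy three-sequence network arranged as a path: 0 - 1 - 2.
neighbors = [[1], [0, 2], [1]]
neutral = fold_averaged_rate([0.9, 0.9, 0.9], neighbors, 4, 100.0, 1e-3, 100)
selected = fold_averaged_rate([0.9, 0.5, 0.9], neighbors, 4, 100.0, 1e-3, 100)
print(neutral, selected)
```

In the toy network, the variant with a poorly folding intermediate evolves more slowly than the uniformly robust one, because the population sits on the robust sequences and most moves away from them are deleterious. Repeating such a calculation over many networks and taking the logarithm of the rates would give a distribution of the kind discussed in the abstract.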
