Nonadaptive Amino Acid Convergence Rates Decrease over Time

Convergence is a central concept in evolutionary studies because it provides strong evidence for adaptation. It also provides information about the nature of the fitness landscape and the repeatability of evolution, and can mislead phylogenetic inference. To understand the role of adaptive convergence, we need to understand the patterns of nonadaptive convergence. Here, we consider the relationship between nonadaptive convergence and divergence in mitochondrial and model proteins. Surprisingly, nonadaptive convergence is much more common than expected in closely related organisms and falls off as organisms diverge. The extent of this drop-off in convergence in mitochondrial proteins is well predicted by epistatic or coevolutionary effects in our “evolutionary Stokes shift” models and poorly predicted by conventional evolutionary models. Convergence probabilities decrease dramatically if the ancestral amino acids of the branches being compared have diverged, but they also drop slowly over evolutionary time even if the ancestral amino acids have not substituted. Convergence probabilities drop off rapidly for quickly evolving sites, but much more slowly for slowly evolving sites. Furthermore, once sites have diverged, their convergence probabilities are extremely low and indistinguishable from convergence levels at randomized sites. These results indicate that we cannot assume that excessive convergence early on is necessarily adaptive. This new understanding should help us to better discriminate adaptive from nonadaptive convergence and to develop more relevant evolutionary models with improved validity for phylogenetic inference.
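
To make the notion of a convergence probability concrete, the following is a minimal sketch of how it might be computed for a single site under a simple site-independent model. It is not the authors' method and not the evolutionary Stokes shift model. It assumes a 20-state equal-input (F81-style) substitution model, and the function names `transition_prob` and `convergence_prob`, the uniform frequency vector, and the branch lengths are all hypothetical choices made for illustration.

```python
import math


def transition_prob(pi, i, j, t):
    """P(state i -> state j) over branch length t under an equal-input model.

    pi is a list of equilibrium frequencies (assumed, not taken from the paper);
    t is measured in expected substitutions per site.
    """
    # beta rescales time so that one unit of t equals one expected substitution.
    beta = 1.0 / (1.0 - sum(p * p for p in pi))
    decay = math.exp(-beta * t)
    return pi[j] * (1.0 - decay) + (decay if i == j else 0.0)


def convergence_prob(pi, a1, a2, t1, t2):
    """Probability that two branches end in the same state that is new to both.

    a1 and a2 are the ancestral states of the two branches. An event counts as
    convergent here only if both descendants differ from their own ancestor
    and match each other (an illustrative definition).
    """
    return sum(
        transition_prob(pi, a1, d, t1) * transition_prob(pi, a2, d, t2)
        for d in range(len(pi))
        if d != a1 and d != a2
    )


# Hypothetical inputs: uniform amino acid frequencies, branch length 0.1 each.
uniform = [1.0 / 20] * 20
same = convergence_prob(uniform, 0, 0, 0.1, 0.1)  # identical ancestral states
diff = convergence_prob(uniform, 0, 1, 0.1, 0.1)  # diverged ancestral states
```

With uniform frequencies this toy model gives a ratio `same / diff` of exactly 19/18, so the convergence probability depends only weakly on whether the ancestral states match. That is one way to see why a site-independent model of this kind would not by itself produce a dramatic drop in convergence after ancestral divergence; non-uniform, site-specific frequencies can be passed in through `pi` to explore other cases.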
