Evolutionary switches between two serine codon sets are driven by selection

Significance: When a rare evolutionary event is observed, such as the substitution of two adjacent nucleotides, the question arises whether such rare changes are caused by mutational bias or by selection. Here we address this question through genome-wide analysis of double substitutions that switch between the two codon sets for serine, the only amino acid that is encoded by two disjoint sets of codons. We show that selection is the primary factor behind these changes. These findings suggest that short-term evolution of proteins is subject to stronger purifying selection than previously thought, and they have significant implications for methods of phylogenetic analysis.

Serine is the only amino acid that is encoded by two disjoint codon sets, so that a tandem substitution of two nucleotides is required to switch between the two sets. Previously published evidence suggests that, for the most evolutionarily conserved serines, the codon set switch occurs by simultaneous substitution of two nucleotides. Here we report a genome-wide reconstruction of the evolution of serine codons in triplets of closely related species from diverse prokaryotes and eukaryotes. The results indicate that the great majority of codon set switches proceed by two consecutive nucleotide substitutions, via a threonine or cysteine intermediate, and are driven by selection. These findings imply a strong pressure of purifying selection in protein evolution, which in the case of serine codon set switches acts through an initial deleterious substitution that is quickly followed by a second, compensatory substitution. The result is frequent reversal of amino acid replacements and, at short evolutionary distances, pervasive homoplasy.
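The codon arithmetic behind this argument can be checked directly. The Python sketch below is an illustration only, not the authors' analysis pipeline. It assumes the standard genetic code, and the names `CODON_TABLE`, `hamming`, `intermediates`, and `switch_paths` are hypothetical helpers introduced here. It enumerates the serine codon pairs from the two sets that differ at exactly two positions and lists the codons that lie one substitution away from both.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (assumption), listed in TCAG order at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# The two disjoint serine codon sets: TCN (four codons) and AGY (two codons).
SER = sorted(c for c, aa in CODON_TABLE.items() if aa == "S")
TCN = [c for c in SER if c.startswith("TC")]
AGY = [c for c in SER if c.startswith("AG")]


def hamming(a, b):
    """Number of nucleotide positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def intermediates(src, dst):
    """Codons exactly one substitution away from both src and dst."""
    return [c for c in CODON_TABLE
            if hamming(src, c) == 1 and hamming(c, dst) == 1]


def switch_paths():
    """Map each two-substitution TCN/AGY pair to its single-step intermediates."""
    paths = {}
    for s in TCN:
        for d in AGY:
            if hamming(s, d) == 2:
                paths[(s, d)] = {c: CODON_TABLE[c] for c in intermediates(s, d)}
    return paths


if __name__ == "__main__":
    for (s, d), mids in switch_paths().items():
        print(s, "<->", d, "via", mids)
```

Under the standard code, the only pairs two substitutions apart are TCT/AGT and TCC/AGC, and every intermediate codon encodes threonine (ACT, ACC) or cysteine (TGT, TGC), matching the intermediates named in the abstract.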
