Synonymous codon usage and selection on proteins

Selection pressures on proteins are usually measured by comparing homologous nucleotide sequences (Zuckerkandl and Pauling 1965). Recently we introduced a novel method, termed 'volatility', to estimate selection pressures on protein sequences from their synonymous codon usage (Plotkin and Dushoff 2003, Plotkin et al. 2004a). Here we provide a theoretical foundation for this approach. We derive the expected frequencies of synonymous codons as a function of the strength of selection, the mutation rate, and the effective population size. We analyze the conditions under which we can expect to draw inferences from biased codon usage, and we estimate the time scales required to establish and maintain such a signal. Our results indicate that, over a broad range of parameters, synonymous codon usage can reliably distinguish between negative selection, positive selection, and neutrality. While the power of volatility to detect negative selection depends on the population size, there is no such dependence for the detection of positive selection. Furthermore, we show that phenomena such as transient hyper-mutators in microbes can improve the power of volatility to detect negative selection, even when the typical observed neutral site heterozygosity is low.
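As a rough illustration of the quantity involved, the sketch below computes a per-codon volatility under the standard genetic code. It assumes the commonly stated definition (the fraction of a codon's single-nucleotide neighbours, excluding stop codons, that encode a different amino acid); the exact treatment of stop codons and the function name `volatility` are assumptions for this example, not details taken from the abstract.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def volatility(codon: str) -> float:
    """Fraction of non-stop point-mutation neighbours encoding a different amino acid.

    Assumption: stop-codon neighbours are dropped from both the numerator
    and the denominator.
    """
    codon = codon.upper()
    own = CODON_TABLE[codon]
    if own == "*":
        raise ValueError("volatility is not defined here for stop codons")

    total = 0      # non-stop neighbours
    changed = 0    # neighbours that encode a different amino acid
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            neighbour = codon[:pos] + base + codon[pos + 1:]
            aa = CODON_TABLE[neighbour]
            if aa == "*":
                continue
            total += 1
            if aa != own:
                changed += 1
    return changed / total


if __name__ == "__main__":
    # Two synonymous arginine codons with different volatilities.
    for c in ("CGA", "AGA"):
        print(c, CODON_TABLE[c], volatility(c))
```

Under this definition the synonymous arginine codons CGA and AGA have volatilities 0.5 and 0.75, which is the kind of difference among synonymous codons that the method draws on.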

[1]  M. Bulmer The selection-mutation-drift theory of synonymous codon usage. , 1991, Genetics.

[2]  Paul Higgs,et al.  Error thresholds and stationary mutant distributions in multi-locus diploid genetics models , 1994 .

[3]  B. Charlesworth,et al.  The effect of deleterious mutations on neutral molecular variation. , 1993, Genetics.

[4]  L. Pauling,et al.  Molecules as documents of evolutionary history. , 1965, Journal of theoretical biology.

[5]  J. Wakeley,et al.  The excess of transitions among nucleotide substitutions: new methods of estimating transition bias underscore its significance. , 1996, Trends in ecology & evolution.

[6]  T. Nagylaki Introduction to Theoretical Population Genetics , 1992 .

[7]  S. Wright Evolution in mendelian populations , 1931 .

[8]  F. Tajima The amount of DNA polymorphism maintained in a finite population when the neutral mutation rate varies among sites. , 1996, Genetics.

[9]  D. Hartl,et al.  Selection intensity for codon bias. , 1994, Genetics.

[10]  D. Hartl,et al.  Directional selection and the site-frequency spectrum. , 2001, Genetics.

[11]  W. L. Payne,et al.  High Mutation Frequencies Among Escherichia coli and Salmonella Pathogens , 1996, Science.

[12]  G. Churchill,et al.  Properties of statistical tests of neutrality for DNA polymorphism data. , 1995, Genetics.

[13]  J. B. S. Haldane,et al.  A Mathematical Theory of Natural and Artificial Selection, Part V: Selection and Mutation , 1927, Mathematical Proceedings of the Cambridge Philosophical Society.

[14]  A. Oliver,et al.  High frequency of hypermutable Pseudomonas aeruginosa in cystic fibrosis lung infection. , 2000, Science.

[15]  M. Kreitman,et al.  Methods to detect selection in populations with applications to the human. , 2000, Annual review of genomics and human genetics.

[16]  T. Ferenci,et al.  Enrichment and elimination of mutY mutators in Escherichia coli populations. , 2002, Genetics.

[17]  J. L. King,et al.  Non-Darwinian evolution. , 1969, Science.

[18]  M. Lynch,et al.  The Origins of Genome Complexity , 2003, Science.

[19]  Wen-Hsiung Li Unbiased estimation of the rates of synonymous and nonsynonymous substitution , 1993, Journal of Molecular Evolution.

[20]  Colin J. Thompson,et al.  On Eigen's theory of the self-organization of matter and the evolution of biological macromolecules , 1974 .

[21]  W. Marzluff,et al.  Selection on silent sites in the rodent H3 histone gene family. , 1994, Genetics.

[22]  K. H. Wolfe,et al.  Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae , 2000, Yeast.

[23]  N. Goldman,et al.  Codon-substitution models for heterogeneous selection pressure at amino acid sites. , 2000, Genetics.

[24]  Gerald J. Wyckoff,et al.  A universal evolutionary index for amino acid changes. , 2004, Molecular biology and evolution.

[25]  C. Pál,et al.  Highly expressed genes in yeast evolve slowly. , 2001, Genetics.

[26]  Jianzhi Zhang,et al.  On the Evolution of Codon Volatility , 2005, Genetics.

[27]  François Taddei,et al.  Evolutionary Implications of the Frequent Horizontal Transfer of Mismatch Repair Genes , 2000, Cell.

[28]  Ziheng Yang Maximum Likelihood Estimation on Large Phylogenies and Analysis of Adaptive Evolution in Human Influenza Virus A , 2000, Journal of Molecular Evolution.

[29]  F. Taddei,et al.  The rise and fall of mutator bacteria. , 2001, Current opinion in microbiology.

[30]  Joshua B. Plotkin,et al.  Detecting selection using a single genome sequence of M. tuberculosis and P. falciparum , 2004, Nature.

[31]  S. Salzberg,et al.  Whole-Genome Comparison of Mycobacterium tuberculosis Clinical and Laboratory Strains , 2002, Journal of bacteriology.

[32]  Maria Anisimova,et al.  The accuracy and power of likelihood ratio tests to detect positive selection at amino acid sites , 2001 .

[33]  P. Sharp,et al.  The codon adaptation index: a measure of directional synonymous codon usage bias, and its potential applications. , 1987, Nucleic acids research.

[34]  J. Gillespie IS THE POPULATION SIZE OF A SPECIES RELEVANT TO ITS EVOLUTION? , 2001, Evolution; international journal of organic evolution.

[35]  R H Borts,et al.  Direct estimate of the mutation rate and the distribution of fitness effects in the yeast Saccharomyces cerevisiae. , 2001, Genetics.

[36]  M. Adams,et al.  Inferring Nonneutral Evolution from Human-Chimp-Mouse Orthologous Gene Trios , 2003, Science.

[37]  N. Goldman,et al.  A codon-based model of nucleotide substitution for protein-coding DNA sequences. , 1994, Molecular biology and evolution.

[38]  A K Konopka Theory of degenerate coding and informational parameters of protein coding genes. , 1985, Biochimie.

[39]  S. Miyazawa,et al.  Two types of amino acid substitutions in protein evolution , 1979, Journal of Molecular Evolution.

[40]  M. Kreitman,et al.  Adaptive protein evolution at the Adh locus in Drosophila , 1991, Nature.

[41]  C. Kurland,et al.  Codon usage determines translation rate in Escherichia coli. , 1989, Journal of molecular biology.

[42]  D. Hartl,et al.  Population genetics of polymorphism and divergence. , 1992, Genetics.

[43]  O. Berg,et al.  Selection intensity for codon bias and the effective population size of Escherichia coli. , 1996, Genetics.

[44]  H. Akashi,et al.  Gene expression and molecular evolution. , 2001, Current opinion in genetics & development.

[45]  Valeria Souza,et al.  Stress-Induced Mutagenesis in Bacteria , 2003, Science.

[46]  J. Crow,et al.  THE NUMBER OF ALLELES THAT CAN BE MAINTAINED IN A FINITE POPULATION. , 1964, Genetics.

[47]  W. Ewens Mathematical Population Genetics , 1980 .

[48]  M. Kimura,et al.  An introduction to population genetics theory , 1971 .

[49]  Jonathan Dushoff,et al.  Codon bias and frequency-dependent selection on the hemagglutinin epitopes of influenza A virus , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[50]  A. E. Hirsh,et al.  Adjusting for selection on synonymous sites in estimates of evolutionary distance. , 2005, Molecular biology and evolution.