An improved likelihood ratio test for detecting site-specific functional divergence among clades of protein-coding genes.

Maximum likelihood codon substitution models have proven useful for studying when and how protein function evolves, but they have recently been criticized on a number of fronts. The strengths and weaknesses of such methods must therefore be identified, and the weaknesses addressed. Here, using simulations, we show that the Clade model C versus M1a test for functional divergence among clades is prone to false positives under simple evolutionary conditions. We then propose a new null model (M2a_rel) that better accounts for among-site variation in selective constraint. We show that the revised test has an improved false-positive rate and good power. Applying this test to previously analyzed data sets of primate ribonucleases and mammalian rhodopsins reveals that some earlier conclusions may have been artifacts of the original method. The improved test should prove useful for identifying patterns of divergence in selective constraint among paralogous gene families and among orthologs from ecologically divergent species.
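The comparison of Clade model C against a null model such as M2a_rel is a likelihood ratio test, which can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `chi2_sf` and `likelihood_ratio_test` are hypothetical, the log-likelihood values are invented for the example, and the degrees of freedom are left as an input because the abstract does not state how many parameters separate the models. The sketch also assumes the usual chi-square approximation to the null distribution of the test statistic.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Base cases of the regularized upper incomplete gamma function Q(a, z):
    # Q(1/2, z) = erfc(sqrt(z)) for odd df, Q(1, z) = exp(-z) for even df.
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1),
    # applied until a reaches df / 2.
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (test statistic, p-value) for two nested models.

    lnl_null and lnl_alt are maximized log-likelihoods; df is the number
    of extra free parameters in the alternative (supplied by the caller).
    """
    # Clamp at zero: small negative values can arise from optimizer noise.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)


if __name__ == "__main__":
    # Invented log-likelihoods, for illustration only.
    lnl_m2a_rel = -5123.40   # null model (assumed value)
    lnl_cmc = -5119.85       # Clade model C (assumed value)
    stat, p = likelihood_ratio_test(lnl_m2a_rel, lnl_cmc, df=1)  # df=1 is an assumption
    print(f"2*deltaLnL = {stat:.2f}, p = {p:.4f}")
```

In practice the two log-likelihoods would come from fitting both models to the same alignment and tree, and the null is rejected in favor of divergence among clades when the p-value falls below the chosen significance level.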
