Detecting Amino Acid Sites Under Positive Selection and Purifying Selection

An excess of nonsynonymous over synonymous substitution at individual amino acid sites is an important indicator that positive selection has affected the evolution of a protein between the extant sequences under study and their most recent common ancestor. Several methods exist to detect the presence, and sometimes location, of positively selected sites in alignments of protein-coding sequences. This article describes the “sitewise likelihood-ratio” (SLR) method for detecting nonneutral evolution, a statistical test that can identify sites that are unusually conserved as well as those that are unusually variable. We show that the SLR method can be more powerful than currently published methods for detecting the location of positive selection, especially in difficult cases where the strength of selection is low. The increase in power is achieved while relaxing assumptions about how the strength of selection varies over sites and without elevated rates of false-positive results that have been reported with some other methods. We also show that the SLR method performs well even under circumstances where the results from some previous methods can be misleading.
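As a rough illustration of what a sitewise likelihood-ratio test involves, the sketch below compares, for each site, the log-likelihood under neutrality (nonsynonymous/synonymous rate ratio ω = 1) with the log-likelihood at the site's maximum-likelihood estimate of ω. It is not the SLR implementation. The input log-likelihoods and ω estimates are made-up numbers, the function names are hypothetical, and the use of a chi-square distribution with one degree of freedom as the reference distribution is an assumed textbook asymptotic approximation rather than something stated in the abstract. No correction for multiple comparisons is applied.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 df this has the closed form erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))


def sitewise_lrt(lnl_null, lnl_alt, omega_hat, alpha=0.05):
    """Classify one site from a likelihood-ratio test of omega = 1.

    lnl_null  -- site log-likelihood with omega fixed at 1 (hypothetical input)
    lnl_alt   -- site log-likelihood at the estimated omega (hypothetical input)
    omega_hat -- estimated omega for the site (hypothetical input)
    alpha     -- per-site significance level, uncorrected for multiple tests
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_value = chi2_sf_1df(stat)
    if p_value >= alpha:
        label = "neutral"      # no significant departure from omega = 1
    elif omega_hat > 1.0:
        label = "positive"     # unusually variable site
    else:
        label = "purifying"    # unusually conserved site
    return stat, p_value, label


# Illustrative values only: (lnL under omega = 1, lnL at omega_hat, omega_hat)
sites = [
    (-12.40, -12.35, 1.10),
    (-15.80, -11.20, 4.70),
    (-9.90, -6.10, 0.05),
]
results = [sitewise_lrt(*site) for site in sites]
for index, (stat, p_value, label) in enumerate(results, start=1):
    print(f"site {index}: statistic={stat:.2f} p={p_value:.4f} {label}")
```

Because the test is two-sided in ω, the same statistic flags both unusually conserved and unusually variable sites; the direction is read from whether the estimate lies below or above 1.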
