Modeling the site-specific variation of selection patterns along lineages.

The unambiguous footprint of positive Darwinian selection in protein-coding DNA sequences is revealed by an excess of nonsynonymous substitutions over synonymous substitutions compared with the neutral expectation. Methods for analyzing the patterns of nonsynonymous and synonymous substitutions usually rely on stochastic models in which the selection regime may vary across the sequence but remains constant across lineages for any amino acid position. Despite some work that has relaxed the constraint that selection patterns remain constant over time, no model provides a strong statistical framework to deal with switches between selection processes at individual sites during the course of evolution. This paper describes an approach that allows the site-specific selection process to vary along lineages of a phylogenetic tree. The parameters of the switching model of codon substitution are estimated by using maximum likelihood. The analysis of eight HIV-1 env homologous sequence data sets shows that this model provides a significantly better fit to the data than one that does not take into account switches between selection patterns in the phylogeny at individual sites. We also provide strong evidence that the strength and the frequency of occurrence of selection might not be estimated accurately when the site-specific variation of selection regimes is ignored.
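The claim of a "significantly better fit" rests on comparing the maximized likelihoods of nested models. The following is a minimal sketch of such a likelihood ratio test, not the authors' implementation. The function names (`chi2_sf`, `likelihood_ratio_test`), the log-likelihood values, and the number of extra parameters are all hypothetical. The chi-square reference distribution is itself an assumption, and it can be conservative when the null model fixes parameters at the boundary of their range.

```python
import math


def chi2_sf(stat: float, df: int) -> float:
    """Upper tail probability of a chi-square variable with integer df >= 1.

    Uses the closed forms Q(1/2, x) = erfc(sqrt(x)) and Q(1, x) = exp(-x)
    together with the recurrence Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1).
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if stat <= 0.0:
        return 1.0
    x = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    # Step a up to df / 2 in increments of one.
    while a < df / 2.0:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, extra_params: int):
    """Return (statistic, p_value) for nested models fitted by maximum likelihood.

    lnl_null: maximized log-likelihood of the simpler model (no switches).
    lnl_alt: maximized log-likelihood of the richer model (with switches).
    extra_params: number of additional free parameters in the richer model.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, extra_params)


if __name__ == "__main__":
    # Hypothetical log-likelihoods, chosen only to illustrate the call.
    stat, p_value = likelihood_ratio_test(-1000.0, -990.0, extra_params=2)
    print(f"2*delta lnL = {stat:.2f}, p = {p_value:.3g}")
```

A small p-value would indicate that the extra switching parameters improve the fit more than expected by chance under the simpler model.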
