A maximum likelihood method for detecting directional evolution in protein sequences and its application to influenza A virus.

We develop a model-based phylogenetic maximum likelihood test for evidence of preferential substitution toward a given residue at individual positions of a protein alignment, which we call directional evolution of protein sequences (DEPS). DEPS can identify both the target residue and the sites evolving toward it, and it can help detect selective sweeps and frequency-dependent selection, two scenarios that confound most existing tests for selection. On simulated data, DEPS achieves good power and accuracy. We applied DEPS to alignments representing different genomic regions of influenza A virus (IAV), sampled from avian hosts (H5N1 serotype) and human hosts (H3N2 serotype), and identified multiple directionally evolving sites in 5/8 genomic segments of H5N1 and H3N2 IAV. We propose a simple descriptive classification of directionally evolving sites into 5 groups, based on the temporal distribution of residue frequencies, and document known functional correlates such as immune escape or host adaptation.
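The idea of preferential substitution toward a target residue, tested by maximum likelihood, can be illustrated with a minimal Python sketch. It is not the DEPS implementation, and everything in it is an assumption made for illustration: the uniform baseline rates, the single multiplicative `bias` parameter, the function names, and the use of a chi-square reference distribution with one degree of freedom for the likelihood ratio test. The abstract specifies none of these details.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def directional_rate_matrix(target, bias, base_rate=1.0):
    """Toy 20x20 substitution rate matrix (hypothetical parameterization).

    Every off-diagonal rate equals `base_rate`, except rates INTO `target`,
    which are multiplied by `bias`. bias == 1 gives the non-directional
    baseline; bias > 1 favors substitutions toward `target`.
    """
    n = len(AMINO_ACIDS)
    t = AMINO_ACIDS.index(target)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = base_rate * (bias if j == t else 1.0)
        # The diagonal is still 0 here, so this makes the row sum to zero.
        q[i][i] = -sum(q[i])
    return q


def lrt_p_value(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    Assumes a chi-square(1) reference distribution, whose survival function
    is erfc(sqrt(x / 2)). Returns (test statistic, p-value).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    q = directional_rate_matrix("K", bias=5.0)
    a, k = AMINO_ACIDS.index("A"), AMINO_ACIDS.index("K")
    print("rate A->K:", q[a][k], " rate K->A:", q[k][a])
    # The log-likelihoods below are made-up numbers, not results from the paper.
    print("LRT:", lrt_p_value(lnl_null=-1000.0, lnl_alt=-996.0))
```

In a real analysis the two log-likelihoods would come from fitting the baseline and directional models on a phylogeny, and testing many sites and candidate residues would call for a multiple-testing correction.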
