Directionality in the evolution of influenza A haemagglutinin

The evolution of haemagglutinin (HA), an important influenza virus antigen, has been the subject of intensive research for more than two decades. Many characteristics of HA's sequence evolution are captured by standard Markov chain substitution models. Such models assign equal fitness to all accessible amino acids at a site. We show, however, that such models strongly underestimate the number of homoplastic amino acid substitutions during the course of HA's evolution, i.e. substitutions that repeatedly give rise to the same amino acid at a site. We develop statistics to detect individual homoplastic events and find that they preferentially occur at positively selected epitopic sites. Our results suggest that the evolution of the influenza A HA, including evolution by positive selection, is strongly affected by the long-term site-specific preferences for individual amino acids.
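
To make the notion of a homoplastic substitution concrete, the following is a minimal sketch and not the statistic developed in the paper. It assumes that ancestral states have already been reconstructed on a phylogeny, so that each substitution can be written as a (branch, site, ancestral amino acid, derived amino acid) tuple. The function name, the branch labels and the toy site numbers are hypothetical and chosen only for illustration. A (site, derived amino acid) pair is flagged when the same amino acid arises at that site on two or more distinct branches.

```python
from collections import defaultdict


def homoplastic_events(substitutions):
    """Return {(site, derived_aa): [branches]} for amino acids that arise
    at the same site on at least two distinct branches.

    `substitutions` is an iterable of (branch, site, ancestral_aa, derived_aa)
    tuples, assumed to come from a prior ancestral-state reconstruction.
    """
    branches_by_target = defaultdict(set)
    for branch, site, ancestral, derived in substitutions:
        if ancestral != derived:  # ignore entries that are not real changes
            branches_by_target[(site, derived)].add(branch)
    # Keep only targets reached independently on two or more branches.
    return {
        target: sorted(branches)
        for target, branches in branches_by_target.items()
        if len(branches) >= 2
    }


# Hypothetical toy data: branch labels and site numbers are made up.
toy = [
    ("b1", 145, "N", "K"),
    ("b7", 145, "S", "K"),   # K arises again at site 145: homoplastic
    ("b3", 145, "K", "N"),
    ("b4", 156, "K", "E"),
    ("b9", 193, "S", "F"),
    ("b12", 193, "S", "F"),  # F arises twice at site 193: homoplastic
]
result = homoplastic_events(toy)
```

Comparing such observed counts with the counts expected under a standard substitution model would require simulating or computing the model's expectation on the same tree, which this sketch does not attempt.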
