Diversity and Evolution Trend of Protein Types of Human Influenza A (H1N1) Virus HA Segment

The influenza A virus is prone to mutation, and ongoing research on its evolution is of great significance for prevention and control. The WHO did not update the recommended vaccine after A/California/07/2009 was recommended as the vaccine strain, but the virus has continued to mutate. This paper proposes an Integrated-Clustering-Analysis (ICA) model to study the distribution and evolution of the influenza A (H1N1) virus. We report the following findings. In every year there is one major type of virus sequence, which accounts for the overwhelming majority of all sequences. Viral sequences after 2009 undergo cumulative changes, deviating further from the vaccine strain over time. According to the drift rate, the evolution process can be divided into three stages. The first stage, from 2009 to 2011, is a period of high-speed mutation. In the second stage, from 2012 to 2014, the mutation speed drops continuously and then stays at a low level. In the third stage, from 2015 to 2017, the mutation speed jumps for one year and is followed by a two-year trough. The evolution of the influenza A virus therefore appears to follow a three-year cycle, and we cautiously predict that the drift rate would jump again in 2018. The ICA model proposed in this paper makes the process of virus type change intuitive to observe.
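The abstract does not define how the drift rate is computed, so the following is only a minimal sketch of the general idea under stated assumptions: divergence is taken to be the mean fraction of differing positions between each year's aligned HA sequences and a reference (vaccine) sequence, and the drift rate is taken to be the year-over-year change in that mean. The function names, the distance measure, and the toy sequences are all hypothetical and are not taken from the paper or its ICA model.

```python
from collections import defaultdict


def hamming_fraction(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def yearly_divergence(records, reference):
    """Mean distance from the reference per year; records are (year, sequence) pairs."""
    by_year = defaultdict(list)
    for year, seq in records:
        by_year[year].append(hamming_fraction(seq, reference))
    return {year: sum(d) / len(d) for year, d in sorted(by_year.items())}


def drift_rates(divergence):
    """Year-over-year change in mean divergence (an assumed stand-in for the drift rate)."""
    years = sorted(divergence)
    return {y2: divergence[y2] - divergence[y1] for y1, y2 in zip(years, years[1:])}


# Toy data, invented purely for illustration (not real HA sequences).
reference = "MKAILVVLLY"
records = [
    (2009, "MKAILVVLLY"),
    (2009, "MKAILVVLLH"),
    (2010, "MKAILVVMLH"),
    (2010, "MKTILVVLLH"),
    (2011, "MKTILVVMLH"),
]

divergence = yearly_divergence(records, reference)
rates = drift_rates(divergence)
for year, rate in rates.items():
    print(year, round(rate, 3))
```

With real data, the per-year rates could be inspected for the kind of stage boundaries the abstract describes; the simple Hamming distance used here ignores insertions, deletions, and any weighting of antigenic sites.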
