Efficient Influenza A Virus Origin Detection

This research describes a novel, alignment-free method for comparing genomic sequences based on absent nucleotide words and expression levels. Applied to Influenza A virus isolates, the method yields three classifications that successfully identify: 1) the geographic origins of domestic bird H5N1 isolates throughout China and Southeast Asia during 2006; 2) the country of origin of human H5N1 isolates that crossed over from domestic bird hosts; and 3) the historical flu season from which human H3N2 isolates originated. Because the comparison methods do not rely on alignment, they are computationally efficient and well suited to the large numbers of sequences needed for comprehensive delineation of flu transmission networks.
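The abstract does not spell out the comparison procedure, so the following is only a minimal sketch of the general idea of comparing sequences by their absent nucleotide words without alignment. The function names (`absent_words`, `jaccard_distance`, `nearest_label`), the use of Jaccard distance, the nearest-reference rule, and the toy sequences are all assumptions made for illustration; they are not taken from the paper, and the expression-level component is not modeled here.

```python
from itertools import product


def absent_words(seq, k):
    """Return the set of length-k words over ACGT that never occur in seq."""
    seq = seq.upper()
    present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    all_words = {"".join(p) for p in product("ACGT", repeat=k)}
    # Words containing ambiguity codes (e.g. N) are ignored by the difference.
    return all_words - present


def jaccard_distance(a, b):
    """Jaccard distance between two sets (an assumed choice of metric)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def nearest_label(query, references, k):
    """Assign the label of the reference whose absent-word set is closest.

    `references` maps a hypothetical label (e.g. a region or season)
    to a representative sequence.
    """
    query_absent = absent_words(query, k)
    return min(
        references,
        key=lambda label: jaccard_distance(
            query_absent, absent_words(references[label], k)
        ),
    )


if __name__ == "__main__":
    # Toy sequences, not real influenza data.
    refs = {"region_x": "ACGTACGTACGT", "region_y": "TTTTGGGGTTTT"}
    print(nearest_label("ACGTACGTAC", refs, k=2))
```

Each sequence is scanned once, so the cost grows linearly with sequence length plus the 4^k words enumerated, which is the kind of efficiency an alignment-free comparison is meant to offer over pairwise alignment.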
