Predicting transmission of avian influenza A viruses from avian to human by using informative physicochemical properties

Some strains of avian influenza A virus (AIV) can be transmitted directly from their natural avian hosts to humans. Such avian-to-human transmissions have been reported continually since 1997 and have caused human deaths worldwide. Predicting whether an AIV strain can transmit from avian hosts to humans is therefore valuable for early warning of strains with human pandemic potential. In this study, we constructed a computational model that predicts avian-to-human transmission of AIV from physicochemical properties. First, ninety signature positions in the inner protein sequences were extracted with the entropy method. These positions were then encoded with 531 physicochemical features. Next, the optimal subset of these features was selected with several feature selection methods. Finally, a support vector machine (SVM) model, named A2H, was built on the selected features. Results from cross-validation and an independent test show that A2H can predict the transmission of AIV from avian hosts to humans.
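
The first two steps of the pipeline (entropy-based selection of signature positions, then physicochemical encoding) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact entropy criterion, so the rule used here (a position is conserved within each host group but its most common residue differs between groups), the threshold `max_entropy`, the function names, and the toy sequences are all assumptions made for illustration. The Kyte-Doolittle hydropathy scale stands in for a single physicochemical property; the study used 531 such features. Feature selection and SVM training are not shown.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def signature_positions(avian, human, max_entropy=0.5):
    """Return 0-based alignment columns that are conserved within each host
    group (entropy <= max_entropy) but whose most common residue differs
    between the groups. This criterion is an assumption, not the paper's."""
    positions = []
    for i in range(len(avian[0])):
        a_col = [seq[i] for seq in avian]
        h_col = [seq[i] for seq in human]
        conserved = (column_entropy(a_col) <= max_entropy
                     and column_entropy(h_col) <= max_entropy)
        a_top = Counter(a_col).most_common(1)[0][0]
        h_top = Counter(h_col).most_common(1)[0][0]
        if conserved and a_top != h_top:
            positions.append(i)
    return positions

# Kyte-Doolittle hydropathy values, used here as one example property.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def encode(seq, positions, scale=KD):
    """Map the residues at the signature positions to property values."""
    return [scale[seq[i]] for i in positions]

# Toy aligned sequences (hypothetical, not real influenza data).
avian = ["MEKA", "MEKA", "MERA"]
human = ["MKKV", "MKKV", "MKKV"]

sites = signature_positions(avian, human)
features = [encode(seq, sites) for seq in avian + human]
```

In this toy alignment the third column is rejected because the avian group is too variable there, and the first column is rejected because both groups share the same residue. Each row of `features` is the kind of numeric vector that a feature selection step and a classifier such as an SVM would then consume.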
