Tempel: time-series mutation prediction of influenza A viruses via attention-based recurrent neural networks

MOTIVATION Influenza viruses persistently threaten public health, causing annual epidemics and sporadic pandemics. Because these viruses mutate rapidly, their evolution remains the main obstacle to effective antiviral treatment. The goal of this work is to predict whether mutations are likely to occur in the next flu season using historical glycoprotein hemagglutinin (HA) sequence data. One of the major challenges is to model the temporality and dimensionality of sequential influenza strains and to interpret the prediction results.

RESULTS In this paper, we propose Tempel, an efficient and robust time-series model for predicting mutations of influenza A viruses. We first construct sequential training samples by splitting and embedding the sequences. By employing recurrent neural networks (RNNs) with attention mechanisms, Tempel takes historical residue information into account. Attention mechanisms are increasingly used to improve mutation prediction by selectively focusing on parts of the residues. A framework built on Tempel enables us to predict mutations at any specific residue site. Experimental results on three influenza datasets show that Tempel significantly improves predictive performance over widely used approaches and provides novel insights into the dynamics of viral mutation and evolution.

SUPPLEMENTARY INFORMATION The datasets, source code and supplementary documents are available at: https://drive.google.com/open?id=15WULR5__6k47iRotRPl3H7ghi3RpeNXH.
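To make the sample construction and attention step described under RESULTS concrete, the following is a minimal sketch, not the authors' implementation. It assumes one aligned HA sequence per year, overlapping 3-gram splitting, a label defined as "residue at the site differs between consecutive years", and dot-product attention over per-year hidden vectors. All function names (`split_trigrams`, `site_time_series`, `mutation_label`, `attention_pool`) are hypothetical, and the abstract does not specify these details.

```python
import numpy as np

def split_trigrams(seq):
    """Split an amino-acid sequence into overlapping 3-grams (assumed scheme)."""
    return [seq[i:i + 3] for i in range(len(seq) - 2)]

def site_time_series(yearly_seqs, site):
    """Collect the 3-gram centered on `site` (0-based) from one sequence per year.

    Assumes the sequences are aligned and 1 <= site <= len(seq) - 2.
    """
    return [seq[site - 1:site + 2] for seq in yearly_seqs]

def mutation_label(prev_seq, next_seq, site):
    """Return 1 if the residue at `site` changes between two years, else 0."""
    return int(prev_seq[site] != next_seq[site])

def attention_pool(hidden, query):
    """Dot-product attention over T hidden states.

    hidden: array of shape (T, d), e.g. one RNN state per year.
    query:  array of shape (d,), a learned vector in a real model.
    Returns the attention weights (T,) and the weighted context vector (d,).
    """
    scores = hidden @ query
    scores = scores - scores.max()      # subtract max for numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()   # softmax over the time axis
    context = weights @ hidden
    return weights, context

# Toy data: three consecutive "years" of a short, made-up aligned sequence.
years = ["MKTII", "MKAII", "MRAII"]
series = site_time_series(years, site=2)
label = mutation_label(years[0], years[1], site=2)

# Random vectors stand in for RNN hidden states and a learned query.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(len(years), 4))
query = rng.normal(size=4)
weights, context = attention_pool(hidden, query)
```

In a trained model the attention weights indicate which years contribute most to the prediction for a site, which is one way such a model can be interpreted. Here the hidden states are random placeholders, so the weights carry no biological meaning.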
