Predicting Influenza Antigenicity by Matrix Completion With Antigen and Antiserum Similarity

The rapid mutation of influenza viruses, especially in the two surface proteins hemagglutinin (HA) and neuraminidase (NA), enables them to escape population immunity, which is a key challenge for influenza vaccine design. Thus, it is crucial to predict influenza antigenic evolution and identify new antigenic variants in a timely manner. However, traditional experimental methods for selecting vaccine strains, such as the hemagglutination inhibition (HI) assay, are time- and labor-intensive, while popular computational methods are less sensitive, which creates a need for more accurate algorithms. In this study, we propose a novel low-rank matrix completion model, MCAAS, to infer antigenic distances between antigens and antisera from three sources: partially revealed antigenic distances, virus similarity based on HA protein sequences, and vaccine similarity based on vaccine strains. The model exploits the correlations among viruses and vaccines in serological tests, as well as the ability of HA sequences from viruses and vaccine strains to inform influenza antigenicity. We also compared the effects of 65 amino acid substitution matrices on the prediction of influenza antigenicity. We applied MCAAS to H3N2 seasonal influenza virus data. Our model achieved a 10-fold cross-validation root-mean-squared error (RMSE) of 0.5982, significantly outperforming existing computational methods such as antigenic cartography, AntigenMap, and BMCSI. We also constructed the antigenic map and studied the association between the genetic and antigenic evolution of H3N2 influenza viruses. Finally, our analyses showed that the homologous structure derived amino acid substitution matrix (HSDM) is the most powerful in predicting influenza antigenicity, which is consistent with previous studies.
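To make the general idea concrete, the following is a minimal sketch of low-rank matrix completion with row and column similarity as side information. It is not the MCAAS algorithm itself, whose objective and solver are not given in this abstract. It assumes a plain factorization D ≈ U Vᵀ fitted by gradient descent on the observed entries, with graph-Laplacian penalties that pull the factors of similar antigens (rows) and similar antisera (columns) together. The function names `complete_matrix` and `rmse`, and all parameter values, are hypothetical choices made for illustration.

```python
import numpy as np

def complete_matrix(D, mask, S_row, S_col, rank=2, lam=0.01, mu=0.01,
                    lr=0.01, n_iter=3000, seed=0):
    """Fill in missing entries of D (hypothetical helper, not MCAAS).

    D      : (m, n) distances; values where mask is False are ignored
    mask   : (m, n) boolean array, True where a distance was measured
    S_row  : (m, m) symmetric similarity between rows (e.g. antigens)
    S_col  : (n, n) symmetric similarity between columns (e.g. antisera)
    """
    rng = np.random.default_rng(seed)
    m, n = D.shape
    D_obs = np.where(mask, D, 0.0)            # zero out unobserved entries
    U = 0.1 * rng.standard_normal((m, rank))  # row factors
    V = 0.1 * rng.standard_normal((n, rank))  # column factors
    # Graph Laplacians L = diag(S 1) - S built from the similarity matrices.
    L_row = np.diag(S_row.sum(axis=1)) - S_row
    L_col = np.diag(S_col.sum(axis=1)) - S_col
    for _ in range(n_iter):
        R = mask * (U @ V.T - D_obs)          # residual on observed entries only
        grad_U = R @ V + lam * U + mu * (L_row @ U)
        grad_V = R.T @ U + lam * V + mu * (L_col @ V)
        U -= lr * grad_U
        V -= lr * grad_V
    return U @ V.T

def rmse(pred, truth, mask):
    """Root-mean-squared error over the entries selected by a boolean mask."""
    diff = (pred - truth)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

if __name__ == "__main__":
    # Synthetic rank-2 data standing in for an antigen-by-antiserum table.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((20, 2))
    B = rng.standard_normal((15, 2))
    truth = A @ B.T
    observed = rng.random(truth.shape) < 0.6  # about 60% of entries revealed
    # Gaussian-kernel similarities computed from the synthetic latent factors.
    S_row = np.exp(-((A[:, None, :] - A[None, :, :]) ** 2).sum(axis=-1))
    S_col = np.exp(-((B[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1))
    pred = complete_matrix(truth, observed, S_row, S_col)
    print("held-out RMSE:", rmse(pred, truth, ~observed))
```

In a cross-validation setting such as the 10-fold evaluation reported above, the observed entries would be split into folds, each fold hidden in turn, and `rmse` evaluated on the hidden entries only. The similarity matrices here come from synthetic factors; in practice they would be derived from HA sequences, for example through an amino acid substitution matrix.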
