Attention mechanism-based deep learning pan-specific model for interpretable MHC-I peptide binding prediction

Accurate prediction of peptide binding affinity to major histocompatibility complex (MHC) proteins could enable the design of better therapeutic vaccines. Previous work has shown that pan-specific prediction algorithms can achieve better prediction performance than other approaches. However, most of the top algorithms are neural-network-based black-box models. Here, we propose DeepAttentionPan, an improved pan-specific model based on convolutional neural networks and attention mechanisms, for more flexible, stable and interpretable MHC-I binding prediction. With the attention mechanism, our ensemble model consisting of 20 trained networks achieves high and more stable prediction performance. Extensive tests on IEDB's weekly benchmark dataset show that our method achieves state-of-the-art prediction performance on 21 test allele datasets. Analysis of the peptide positional attention weights learned by our model demonstrates its capability to capture critical binding positions of the peptides, yielding a mechanistic understanding of MHC-peptide binding that aligns closely with experimentally verified results. Furthermore, we show that with transfer learning, our pan model can be fine-tuned for alleles with few samples to achieve additional performance improvement. DeepAttentionPan is freely available as open-source software at https://github.com/jjin49/DeepAttentionPan.
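
To make the idea of peptide positional attention weights concrete, the following is a minimal sketch and not the DeepAttentionPan implementation. It assumes each peptide position already has a feature vector (for example from convolutional layers) and a scalar attention score; the function names, the toy scores and the toy features are all hypothetical. The scores are normalized with a softmax into per-position weights, which are used to pool the features and can be inspected to see which positions the model emphasizes. A simple mean over several networks' outputs is shown as one plausible way to combine an ensemble; the abstract does not specify the combination rule.

```python
import math


def softmax(scores):
    """Turn raw per-position scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(position_features, position_scores):
    """Pool per-position feature vectors using attention weights.

    Returns the pooled vector and the weights, so the weights can be
    inspected for interpretation.
    """
    weights = softmax(position_scores)
    dim = len(position_features[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, position_features))
        for d in range(dim)
    ]
    return pooled, weights


def ensemble_mean(predictions):
    """Average predictions from several networks (assumed combination rule)."""
    return sum(predictions) / len(predictions)


# Toy 9-mer peptide: hypothetical scores, hand-picked for illustration only.
scores = [0.1, 2.0, 0.3, 0.0, 0.2, 0.1, 0.4, 0.3, 2.5]
features = [[float(i), 1.0] for i in range(9)]  # hypothetical 2-d features

pooled, weights = attention_pool(features, scores)

# Zero-based indices of the two most heavily weighted positions.
top_positions = sorted(range(9), key=lambda i: weights[i], reverse=True)[:2]
print(top_positions)
```

In this toy input the largest weights fall on the second and last positions only because the scores were chosen that way; in a trained model the weights would be learned from binding data.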
