NetMHCpan-3.0; improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length datasets

Background: Binding of peptides to MHC class I molecules (MHC-I) is essential for antigen presentation to cytotoxic T-cells.

Results: Here, we demonstrate how a simple alignment step allowing insertions and deletions in a pan-specific MHC-I binding machine-learning model enables combining information across both multiple MHC molecules and peptide lengths. This pan-allele/pan-length algorithm significantly outperforms state-of-the-art methods, and captures differences in the length profile of binders to different MHC molecules leading to increased accuracy for ligand identification. Using this model, we demonstrate that percentile ranks in contrast to affinity-based thresholds are optimal for ligand identification due to uniform sampling of the MHC space.

Conclusions: We have developed a neural network-based machine-learning algorithm leveraging information across multiple receptor specificities and ligand length scales, and demonstrated how this approach significantly improves the accuracy for prediction of peptide binding and identification of MHC ligands. The method is available at www.cbs.dtu.dk/services/NetMHCpan-3.0.
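The alignment step with insertions and deletions can be illustrated with a minimal sketch. It is not the published implementation. It assumes a fixed 9-residue core, a single contiguous gap per peptide, and a wildcard character `X` for inserted positions. The names `candidate_cores` and `best_core`, and the scoring function passed in, are hypothetical.

```python
from typing import Callable, List

CORE_LEN = 9    # assumed fixed core length (not stated in the abstract)
WILDCARD = "X"  # assumed placeholder residue for inserted positions


def candidate_cores(peptide: str, core_len: int = CORE_LEN) -> List[str]:
    """Enumerate cores of length core_len reachable with one contiguous gap."""
    n = len(peptide)
    if n == core_len:
        return [peptide]
    if n > core_len:
        # Longer peptide: delete `gap` consecutive residues at each start position.
        gap = n - core_len
        cores = [peptide[:i] + peptide[i + gap:] for i in range(n - gap + 1)]
    else:
        # Shorter peptide: insert `gap` wildcard residues at each position.
        gap = core_len - n
        cores = [peptide[:i] + WILDCARD * gap + peptide[i:] for i in range(n + 1)]
    # Drop duplicate cores while keeping the enumeration order.
    return list(dict.fromkeys(cores))


def best_core(peptide: str, score: Callable[[str], float]) -> str:
    """Return the candidate core with the highest score under a user-supplied model."""
    return max(candidate_cores(peptide), key=score)
```

Because every peptide is reduced or padded to the same core length, peptides of different lengths can be fed to one fixed-width model. In this sketch the scoring function stands in for the trained network, and the highest-scoring candidate defines the alignment.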
