TEPITOPEpan: Extending TEPITOPE for Peptide Binding Prediction Covering over 700 HLA-DR Molecules

Motivation: Accurate identification of peptides that bind to specific Major Histocompatibility Complex Class II (MHC-II) molecules is important both for elucidating the mechanism of immune recognition and for developing epitope-based vaccines and immunotherapies for many severe diseases. Because MHC-II alleles are extremely polymorphic and biochemical experiments are costly, computational methods that accurately predict the binding peptides of MHC-II molecules, particularly molecules with few or no experimental data, have attracted increasing interest. TEPITOPE is a widely used computational approach because of its good interpretability and relatively high performance. However, TEPITOPE can be applied to only 51 of the more than 700 known HLA-DR molecules.

Method: We have developed a new method, called TEPITOPEpan, by extrapolating from the binding specificities of the HLA-DR molecules characterized by TEPITOPE to those that are uncharacterized. First, each HLA-DR binding pocket is represented by the amino acid residues that are in close contact with the corresponding residues of the peptide binding core. Then, the pocket similarity between two HLA-DR molecules is calculated as the sequence similarity of these residues. Finally, for an uncharacterized HLA-DR molecule, the binding specificity of each pocket is computed as a weighted average of the pocket binding specificities of the HLA-DR molecules characterized by TEPITOPE.

Result: The performance of TEPITOPEpan has been extensively evaluated on various data sets from different viewpoints: predicting MHC binding peptides, identifying HLA ligands and T-cell epitopes, and recognizing binding cores. Among the four state-of-the-art competing pan-specific methods, TEPITOPEpan was roughly the second best method, next to NetMHCIIpan-2.0, for predicting the binding specificities of unknown HLA-DR molecules. Additionally, TEPITOPEpan achieved the best performance in recognizing binding cores.
We further analyzed the motifs detected by TEPITOPEpan in light of the corresponding immunology literature. The online server and its PSSMs are available at http://www.biokdd.fudan.edu.cn/Service/TEPITOPEpan/.
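
The three steps of the method can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the exact similarity measure or weighting scheme, so the sketch assumes a simple fraction of identical residues as the pocket similarity and uses normalized similarities as weights. The function names, allele labels and toy profiles are hypothetical. The second function shows the usual way a position-specific scoring matrix (PSSM) is applied to find a 9-residue binding core, by scoring every 9-mer window of the peptide.

```python
from typing import Dict, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pocket_similarity(a: str, b: str) -> float:
    """Assumed similarity: fraction of identical positions between two
    equal-length strings of pocket-contact residues."""
    if len(a) != len(b):
        raise ValueError("pocket residue strings must have equal length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def extrapolate_pocket_profile(
    query_pocket: str,
    known_pockets: Dict[str, str],
    known_profiles: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Weighted average of the pocket profiles of characterized molecules,
    weighted by their pocket similarity to the query molecule."""
    weights = {
        allele: pocket_similarity(query_pocket, residues)
        for allele, residues in known_pockets.items()
    }
    total = sum(weights.values())
    if total == 0:
        # No similar pocket at all: fall back to a plain average (assumption).
        weights = {allele: 1.0 for allele in known_pockets}
        total = float(len(weights))
    profile = {aa: 0.0 for aa in AMINO_ACIDS}
    for allele, weight in weights.items():
        for aa, score in known_profiles[allele].items():
            profile[aa] += (weight / total) * score
    return profile


def best_core(peptide: str, pssm: List[Dict[str, float]]) -> Tuple[float, str]:
    """Score every window of len(pssm) residues and return the best
    (score, core) pair; residues missing from a profile score 0."""
    width = len(pssm)
    if len(peptide) < width:
        raise ValueError("peptide is shorter than the binding core")
    candidates = []
    for start in range(len(peptide) - width + 1):
        core = peptide[start:start + width]
        score = sum(pssm[i].get(aa, 0.0) for i, aa in enumerate(core))
        candidates.append((score, core))
    return max(candidates)


# Toy example with two hypothetical characterized molecules.
toy_pockets = {"allele_A": "WLF", "allele_B": "WLY"}
toy_profiles = {"allele_A": {"L": 1.0}, "allele_B": {"L": -1.0}}
toy_profile = extrapolate_pocket_profile("WLF", toy_pockets, toy_profiles)
```

In the toy example the query pocket is identical to that of `allele_A` and shares two of three residues with `allele_B`, so the extrapolated profile leans toward `allele_A`. A full matrix for an uncharacterized molecule would repeat this extrapolation for each pocket and pass the resulting list of profiles to `best_core`.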
