EpicCapo: epitope prediction using combined information of amino acid pairwise contact potentials and HLA-peptide contact site information

Background
Epitope identification is an essential step toward synthetic vaccine development, since epitopes play an important role in activating the immune response. Classical experimental approaches are laborious and time-consuming, and computational methods for generating epitope candidates have therefore been actively studied. Most of these methods, however, rely on sophisticated nonlinear techniques to achieve higher predictive performance. The use of these techniques tends to diminish interpretability with respect to binding potential: that is, the resulting models provide little insight into binding mechanisms.

Results
We have developed a novel epitope prediction method named EpicCapo and its variants, EpicCapo+ and EpicCapo+REF. Nonapeptides were encoded numerically for machine learning algorithms using a novel peptide-encoding scheme based on 40 amino acid pairwise contact potentials (referred to as AAPPs throughout this paper). EpicCapo+ and EpicCapo+REF outperformed other state-of-the-art methods without losing interpretability. Interestingly, the most informative AAPPs identified in our study were those developed by Micheletti and Simons, whereas previous studies used two AAPPs developed by Miyazawa & Jernigan and Betancourt & Thirumalai. In addition, we found that all amino acid positions in nonapeptides, including non-anchor positions, could affect the performance of the predictive models. Finally, EpicCapo+REF was applied to identify candidate promiscuous epitopes. As a result, 67.1% of the predicted nonapeptide epitopes were consistent with preceding studies based on immunological experiments.

Conclusions
Our method achieved high performance in testing with benchmark datasets. In addition, our study identified a number of candidate promiscuous CTL epitopes consistent with previously reported immunological experiments. We speculate that our techniques may be useful in the development of new vaccines.
The R implementation of EpicCapo+REF is available at http://pirun.ku.ac.th/~fsciiok/EpicCapoREF.zip. Datasets are available at http://pirun.ku.ac.th/~fsciiok/Datasets.zip.
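
The abstract does not spell out the encoding scheme, so the following Python fragment is only a minimal sketch of the general idea of combining pairwise contact potentials with HLA-peptide contact sites. It assumes that each feature is the potential between a peptide residue and an HLA residue it contacts. The names `toy_potential` and `encode_peptide`, the contact list, and all numeric values are hypothetical placeholders, not the published AAPPs, the actual contact sites, or the authors' R implementation.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_potential():
    """Build a placeholder symmetric 20x20 table.

    The values are made up for illustration and are NOT a published AAPP.
    """
    table = {}
    for i, a in enumerate(AMINO_ACIDS):
        for j, b in enumerate(AMINO_ACIDS):
            table[(a, b)] = -((i + j) % 7) / 10.0
    return table


def encode_peptide(peptide, contacts, potentials):
    """Encode a nonapeptide as a flat feature vector.

    peptide    : 9-letter amino acid string
    contacts   : list of (peptide_position, hla_residue) pairs, positions 1..9
                 (assumed to come from HLA-peptide contact site data)
    potentials : list of pairwise potential tables keyed by (aa1, aa2)

    One feature is emitted per (potential, contact) combination.
    """
    if len(peptide) != 9:
        raise ValueError("expected a nonapeptide (9 residues)")
    features = []
    for potential in potentials:
        for position, hla_residue in contacts:
            peptide_residue = peptide[position - 1]
            features.append(potential[(peptide_residue, hla_residue)])
    return features


# Hypothetical contact list: peptide positions 1, 2 and 9 each touching
# one HLA residue. Real contact sites would be taken from structural data.
example_contacts = [(1, "Y"), (2, "M"), (9, "L")]
example_vector = encode_peptide("GILGFVFTL", example_contacts, [toy_potential()])
print(example_vector)
```

With 40 potentials in the `potentials` list, the same function would yield 40 features per contact. Because each feature maps back to a specific potential and a specific peptide position, a linear model trained on such vectors stays interpretable, which is the property the abstract emphasizes.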
