Quantitative prediction of mouse class I MHC peptide binding affinity using support vector machine regression (SVR) models

Background: The binding between peptide epitopes and major histocompatibility complex proteins (MHCs) is an important event in the cellular immune response. Accurate prediction of the binding between short peptides and MHC molecules has long been a principal challenge for immunoinformatics. Recently, the modeling of MHC-peptide binding has come to emphasize quantitative predictions: instead of categorizing peptides as "binders" or "non-binders", or as "strong binders" or "weak binders", recent methods seek to predict precise binding affinities.

Results: We developed a quantitative support vector machine regression (SVR) approach, called SVRMHC, to model peptide-MHC binding affinities. As a non-linear method, SVRMHC was able to generate models that outperformed existing linear models, such as the "additive method". By adopting a new "11-factor encoding" scheme, SVRMHC takes into account similarities in the physicochemical properties of the amino acids constituting the input peptides. When applied to MHC-peptide binding data for three mouse class I MHC alleles, the SVRMHC models produced more accurate predictions than those produced previously. Furthermore, comparisons based on Receiver Operating Characteristic (ROC) analysis indicated that SVRMHC was able to outperform several prominent methods in identifying strongly binding peptides.

Conclusion: As a method with demonstrated performance in the quantitative modeling of MHC-peptide binding and in identifying strong binders, SVRMHC is a promising immunoinformatics tool with considerable future potential.
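The two ideas mentioned in the Results, encoding residues by physicochemical properties and comparing methods by ROC analysis, can be illustrated with a minimal sketch. It is not the SVRMHC implementation: the abstract does not list the 11 factors, so the Kyte-Doolittle hydropathy scale stands in as a single illustrative factor, and the function names `encode_peptide` and `roc_auc` are hypothetical. The resulting feature vectors would be the input to a regression model such as an SVR, which is not shown here.

```python
# Kyte-Doolittle hydropathy values. ASSUMPTION: used only as a one-factor
# stand-in for the paper's 11 physicochemical factors, which are not given
# in the abstract.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_peptide(peptide):
    """Return one property value per residue (hypothetical helper).

    With k factors per residue, the vector would have k * len(peptide)
    entries; here k = 1.
    """
    return [HYDROPATHY[aa] for aa in peptide.upper()]


def roc_auc(scores, is_strong_binder):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive has the higher score, counting ties as one half.
    """
    pos = [s for s, y in zip(scores, is_strong_binder) if y]
    neg = [s for s, y in zip(scores, is_strong_binder) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # An example 8-residue peptide, encoded position by position.
    print(encode_peptide("SIINFEKL"))
    # Toy predicted scores and strong-binder labels (made-up values).
    print(roc_auc([0.9, 0.8, 0.4, 0.3], [True, False, True, False]))
```

Encoding by properties rather than by residue identity lets chemically similar amino acids (for example, isoleucine and valine) receive similar feature values, which is the motivation the abstract gives for its encoding scheme.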
