POPI: predicting immunogenicity of MHC class I binding peptides by mining informative physicochemical properties

MOTIVATION Both modeling of the antigen-processing pathway, including major histocompatibility complex (MHC) binding, and immunogenicity prediction of the resulting MHC-binding peptides are essential for developing a computer-aided system for peptide-based vaccine design, which is one goal of immunoinformatics. Numerous studies have dealt with modeling the immunogenic pathway, but not with the problem of immunogenicity prediction, which is intractable because of the complex effects of many intrinsic and extrinsic factors. Moderate affinity of the MHC-peptide complex is essential to induce immune responses, but the relationship between affinity and peptide immunogenicity is too weak to be used for predicting immunogenicity. This study focuses on mining informative physicochemical properties from known experimental immunogenicity data in order to understand immune responses and to predict the immunogenicity of MHC-binding peptides accurately.

RESULTS This study proposes a computational method that mines a feature set of informative physicochemical properties from MHC class I binding peptides and uses it to design a support vector machine (SVM) based system, named POPI, for the prediction of peptide immunogenicity. The high performance of POPI arises mainly from an inheritable bi-objective genetic algorithm, which simultaneously determines the best number m of properties to select out of 531 physicochemical properties, identifies these m properties, and tunes the SVM parameters. A dataset of 428 human MHC class I binding peptides belonging to four classes of immunogenicity was established from MHCPEP, a database of MHC-binding peptides (Brusic et al., 1998). Using the m = 23 selected properties, POPI achieves an accuracy of 64.72% under leave-one-out cross-validation, compared with 54.91% for ALIGN and 53.23% for PSI-BLAST, two sequence alignment-based prediction methods. POPI is the first computational system for prediction of peptide immunogenicity based on physicochemical properties.
AVAILABILITY A web server for prediction of peptide immunogenicity (POPI) and the dataset of MHC class I binding peptides used in this study (PEPMHCI) are available at http://iclab.life.nctu.edu.tw/POPI
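
The evaluation protocol named in the RESULTS paragraph, leave-one-out cross-validation over peptides encoded by physicochemical properties, can be illustrated with a minimal sketch. Several things here are assumptions and not taken from the paper: each peptide is encoded as the mean property value over its residues, the Kyte-Doolittle hydropathy index stands in for the 23 selected properties, a nearest-centroid rule stands in for the SVM so that the example needs only the standard library, and the peptides and labels are hypothetical toy data.

```python
from statistics import mean

# Kyte-Doolittle hydropathy index: one illustrative physicochemical property.
# (Assumption: POPI's actual 23 selected properties are not listed in the abstract.)
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(peptide, scales):
    """Encode a peptide as the mean of each property over its residues (assumed encoding)."""
    return [mean(scale[aa] for aa in peptide) for scale in scales]


def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (not an SVM): return the label of the closest class centroid."""
    groups = {}
    for xi, yi in zip(train_X, train_y):
        groups.setdefault(yi, []).append(xi)
    best_label, best_dist = None, float("inf")
    for label, rows in groups.items():
        centroid = [mean(col) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, predict):
    """Leave-one-out cross-validation: hold out each sample once, train on the rest."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)


# Hypothetical toy peptides and labels, for illustration only.
peptides = ["LLIVVAFL", "ILVFLAVI", "VLLIAVFL", "DEKRNQDE", "KRDENQKE", "EDKRQNED"]
labels = ["high", "high", "high", "low", "low", "low"]

X = [encode(p, [KD]) for p in peptides]
accuracy = loocv_accuracy(X, labels, nearest_centroid_predict)
print(f"LOOCV accuracy on toy data: {accuracy:.2%}")
```

In a setting closer to the one described, the `predict` argument would wrap an SVM trained on the selected property subset, and the same `loocv_accuracy` loop would serve as the accuracy objective that the genetic algorithm maximizes while minimizing the number of properties.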

[1]  Loris Nanni,et al.  An ensemble of K-local hyperplanes for predicting protein-protein interactions , 2006, Bioinform..

[2]  Arne Elofsson,et al.  Prediction of MHC class I binding peptides, using SVMHC , 2002, BMC Bioinformatics.

[3]  Wen Liu,et al.  Quantitative prediction of mouse class I MHC peptide binding affinity using support vector machine regression (SVR) models , 2006, BMC Bioinformatics.

[4]  Shinn-Ying Ho,et al.  Inheritable genetic algorithm for biobjective 0/1 combinatorial optimization problems and its applications , 2004, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[5]  S. M. Lewis,et al.  Orthogonal Fractional Factorial Designs , 1986 .

[6]  Søren Brunak,et al.  Improved prediction of MHC class I and class II epitopes using a novel Gibbs sampling approach , 2004, Bioinform..

[7]  S. Brunak,et al.  Prediction of proteasome cleavage motifs by neural networks. , 2002, Protein engineering.

[8]  O. Lund,et al.  The Immune Epitope Database and Analysis Resource: From Vision to Blueprint , 2005, PLoS biology.

[9]  Vladimir Brusic,et al.  MHCPEP, a database of MHC-binding peptides: update 1996 , 1997, Nucleic Acids Res..

[10]  Bhaskar D. Kulkarni,et al.  A support vector machine-based method for predicting the propensity of a protein to be soluble or to form inclusion body on overexpression in Escherichia coli , 2006, Bioinform..

[11]  Shinn-Ying Ho,et al.  Intelligent evolutionary algorithms for large parameter optimization problems , 2004, IEEE Trans. Evol. Comput..

[12]  Minoru Kanehisa,et al.  AAindex: Amino Acid index database , 2000, Nucleic Acids Res..

[13]  Jiang Wang,et al.  Prediction of protein structural class with Rough Sets , 2006, BMC Bioinformatics.

[14]  R. Jernigan,et al.  Estimation of effective interresidue contact energies from protein crystal structures: quasi-chemical approximation , 1985 .

[15]  Chih-Hung Hsieh,et al.  Interpretable gene expression classifier with an accurate and compact fuzzy rule base for microarray data analysis. , 2006, Bio Systems.

[16]  P. Dönnes,et al.  Integrated modeling of the major events in the MHC class I antigen processing pathway , 2005, Protein science : a publication of the Protein Society.

[17]  C. Nachtsheim Orthogonal Fractional Factorial Designs , 1985 .

[18]  Michael J. Geisow,et al.  Amino acid preferences for secondary structure vary with protein class , 1980 .

[19]  Arun Krishnan,et al.  pSLIP: SVM based protein subcellular localization prediction using multiple physicochemical properties , 2005, BMC Bioinformatics.

[20]  M. V. Van Regenmortel,et al.  Antigenicity and immunogenicity of synthetic peptides. , 2001, Biologicals : journal of the International Association of Biological Standardization.

[21]  G. Hämmerling,et al.  Antigen processing and presentation‐towards the Millennium , 1999, Immunological reviews.

[22]  D. Flower,et al.  Benchmarking B cell epitope prediction: Underperformance of existing methods , 2005, Protein science : a publication of the Protein Society.

[23]  Bjoern Peters,et al.  Identifying MHC Class I Epitopes by Predicting the TAP Transport Efficiency of Epitope Precursors , 2003, The Journal of Immunology.

[24]  Gajendra P. S. Raghava,et al.  Pcleavage: an SVM based method for prediction of constitutive proteasome and immunoproteasome cleavage sites in antigenic sequences , 2005, Nucleic Acids Res..

[25]  M. Feltkamp,et al.  Efficient MHC class I-peptide binding is required but does not ensure MHC class I-restricted immunogenicity. , 1994, Molecular immunology.

[26]  Manoj Bhasin,et al.  Analysis and prediction of affinity of TAP binding peptides using cascade SVM , 2004, Protein science : a publication of the Protein Society.

[27]  T. Auton,et al.  Statistical comparison of established T-cell epitope predictors against a large database of human and murine antigens. , 1996, Molecular immunology.

[28]  O. Lund,et al.  An integrative approach to CTL epitope prediction: A combined algorithm integrating MHC class I binding, TAP transport efficiency, and proteasomal cleavage predictions , 2005, European journal of immunology.

[29]  M. McMillan,et al.  The ability of peptides to induce cytotoxic T cells in vitro does not strongly correlate with their affinity for the H-2Ld molecule: implications for vaccine design and immunotherapy. , 1997, Molecular immunology.

[30]  Darja Kanduc,et al.  Peptimmunology: immunogenic peptides and sequence redundancy. , 2005, Current drug discovery technologies.

[31]  Eugene W. Myers,et al.  Optimal alignments in linear space , 1988, Comput. Appl. Biosci..
