Application of RotaSVM for HLA Class II Protein-Peptide Interaction Prediction

In this article, the recently developed RotaSVM classifier is used to predict which peptides bind to Human Leukocyte Antigen class II (HLA class II) proteins. HLA class II-peptide complexes are generated in antigen-presenting cells (APCs) and transported to the cell membrane, where they elicit an immune response via T-cell activation. Understanding the HLA class II protein-peptide binding interaction facilitates the design of peptide-based vaccines, for which the high rate of polymorphism in HLA class II molecules poses a major challenge. The present study considers a set of 27 HLA class II proteins and determines the binding activity of 636 non-redundant peptides against them. HLA class II-peptide binding is predicted with an ensemble classifier called RotaSVM. In RotaSVM, the feature selection scheme generates bootstrap samples, which are then used to create a diverse set of features by means of Principal Component Analysis. Thereafter, Support Vector Machines are trained on these bootstrap samples, integrated with their original feature values. The effectiveness of RotaSVM for HLA class II protein-peptide binding prediction is demonstrated by comparing it with other traditional classifiers on several validity measures and on ROC curve plots. Finally, the Friedman test is conducted to assess the statistical significance of RotaSVM's performance in predicting peptides that bind to HLA class II proteins.
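As an illustration of the final step, the Friedman test ranks the competing classifiers within each data set and then asks whether their mean ranks differ more than chance would allow. With N data sets, k classifiers and mean ranks R_j, the standard statistic is chi2_F = 12N / (k(k+1)) * (sum of R_j^2 - k(k+1)^2 / 4). The following is a minimal Python sketch under stated assumptions, not the procedure used in this study: the function names are illustrative, the scores are invented toy values rather than reported results, a higher score is assumed to be better, and no tie correction is applied to the statistic.

```python
def average_ranks(row):
    """Rank the scores of one data set: highest score gets rank 1,
    tied scores share the mean of the ranks they occupy."""
    order = sorted(range(len(row)), key=lambda j: -row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """Return (chi2_F, mean_ranks) for a table with one row per data set
    and one column per classifier. No tie correction is applied."""
    n = len(scores)       # number of data sets (blocks)
    k = len(scores[0])    # number of classifiers
    rank_sums = [0.0] * k
    for row in scores:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    mean_ranks = [s / n for s in rank_sums]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return chi2, mean_ranks


# Hypothetical accuracy scores: 4 data sets (rows) x 3 classifiers (columns).
toy_scores = [
    [0.90, 0.85, 0.80],
    [0.88, 0.84, 0.79],
    [0.91, 0.86, 0.83],
    [0.87, 0.82, 0.81],
]
chi2, mean_ranks = friedman_statistic(toy_scores)
print(chi2, mean_ranks)
```

Under the null hypothesis the statistic is approximately chi-square distributed with k - 1 degrees of freedom, so it would be compared against the corresponding critical value; with so few data sets as in this toy table, the approximation is rough and exact tables are usually preferred.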
