Predicting peptides binding to MHC class II molecules using multi-objective evolutionary algorithms

Background: Peptides binding to Major Histocompatibility Complex (MHC) class II molecules are crucial for the initiation and regulation of immune responses. Predicting peptides that bind to a specific MHC molecule plays an important role in determining potential candidates for vaccines. The binding groove in class II MHC is open at both ends, allowing peptides longer than 9-mers to bind. Finding the consensus motif that facilitates the binding of peptides to an MHC class II molecule is difficult because binding peptides differ in length and the location of the 9-mer binding core varies. The difficulty increases when the molecule is promiscuous and binds to a large number of low-affinity peptides.

In this paper, we propose two approaches using multi-objective evolutionary algorithms (MOEA) for predicting peptides binding to MHC class II molecules. One uses the information from both binders and non-binders for self-discovery of motifs. The other, in addition, uses information from experimentally determined motifs for guided discovery of motifs.

Results: The proposed methods are intended for finding peptides binding to the MHC class II I-Ag7 molecule, a promiscuous binder to a large number of low-affinity peptides. Cross-validation results across experiments on two motifs derived for I-Ag7 datasets demonstrate better generalization abilities and accuracies of the present method over earlier approaches. Further, the proposed method was validated and compared on two publicly available benchmark datasets: (1) an ensemble of qualitative HLA-DRB1*0401 peptide data obtained from five different sources, and (2) quantitative peptide data obtained for sixteen different alleles, comprising three mouse alleles and thirteen HLA alleles. The proposed method outperformed earlier methods on most datasets, indicating that it is well suited for finding peptides binding to MHC class II molecules.

Conclusion: We present two MOEA-based algorithms for finding motifs, one for self-discovery and the other for guided discovery by experimentally determined motifs, and thereby for predicting peptides binding to the I-Ag7 molecule. Our experiments show that the proposed MOEA-based algorithms are better than earlier methods in predicting binding sites not only on I-Ag7 but also on most alleles of the class II MHC benchmark datasets. This suggests that our methods could be applicable to finding binding motifs in a wide range of alleles.
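As a rough illustration of the two ideas the abstract relies on, the sketch below enumerates the candidate 9-mer cores of a longer peptide and filters a set of candidate solutions down to its non-dominated (Pareto) front. It is not the authors' algorithm: the function names are hypothetical, and the two objectives (for example, accuracy on binders and accuracy on non-binders) are assumed here only for illustration, since the abstract does not specify the fitness functions.

```python
from typing import List, Sequence, Tuple

CORE_LEN = 9  # length of the binding core described in the abstract


def candidate_cores(peptide: str, core_len: int = CORE_LEN) -> List[str]:
    """Return every contiguous window of core_len residues in the peptide.

    A peptide shorter than core_len yields an empty list.
    """
    return [peptide[i:i + core_len] for i in range(len(peptide) - core_len + 1)]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# An 11-residue peptide (made-up sequence) has three possible 9-mer cores.
cores = candidate_cores("ABCDEFGHIJK")

# Hypothetical (binder accuracy, non-binder accuracy) scores of three motifs.
front = pareto_front([(0.9, 0.5), (0.6, 0.8), (0.5, 0.4)])
```

In an evolutionary setting, a filter of this kind would typically be applied to each generation of candidate motifs so that trade-offs between the objectives are retained instead of being collapsed into a single weighted score.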
