Introduction to the Peptide Binding Problem of Computational Immunology: New Results

We attempt to establish geometrical methods for amino acid sequences. To measure the similarity of such sequences, we define a kernel on strings using only the sequence structure and a good amino acid substitution matrix (e.g. BLOSUM62). The kernel is used in learning machines to predict the binding affinities of peptides to human leukocyte antigen DR (HLA-DR) molecules. On both fixed-allele (Nielsen and Lund in BMC Bioinform. 10:296, 2009) and pan-allele (Nielsen et al. in Immunome Res. 6(1):9, 2010) benchmark databases, our algorithm achieves state-of-the-art performance. The kernel is also used to define a distance on a set of HLA-DR alleles; a clustering analysis based on this distance precisely recovers the serotype classifications assigned by WHO (Holdsworth et al. in Tissue Antigens 73(2):95–170, 2009; Marsh et al. in Tissue Antigens 75(4):291–455, 2010). These results suggest that our kernel relates the sequence structure of both peptides and HLA-DR molecules well to their biological functions, and that it offers a simple, powerful and promising methodology for immunology and amino acid sequence studies.
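
As a rough illustration of how a kernel on strings can induce a distance, the sketch below builds a toy kernel and the standard kernel-induced metric d(x, y) = sqrt(2 - 2 K̂(x, y)), where K̂ is the normalized kernel. It is not the kernel of the paper: the residue features in `FEATURES` are hypothetical stand-ins for a substitution matrix such as BLOSUM62, the restriction to equal-length strings is a simplifying assumption, and all function names are invented for this example.

```python
import math

# Hypothetical 2-d residue features (NOT taken from BLOSUM62):
# (hydrophobicity-like score, charge-like score).
FEATURES = {
    "A": (1.0, 0.1),
    "L": (1.2, 0.1),
    "K": (0.2, 1.0),
    "D": (0.2, -1.0),
}


def residue_kernel(a, b):
    """Inner product of feature vectors; positive semidefinite by construction."""
    fa, fb = FEATURES[a], FEATURES[b]
    return fa[0] * fb[0] + fa[1] * fb[1]


def string_kernel(x, y):
    """Product of residue kernels over aligned positions (equal lengths assumed)."""
    if len(x) != len(y):
        raise ValueError("this toy kernel only compares equal-length strings")
    return math.prod(residue_kernel(a, b) for a, b in zip(x, y))


def normalized_kernel(x, y):
    """Rescale so that every string has unit self-similarity."""
    return string_kernel(x, y) / math.sqrt(string_kernel(x, x) * string_kernel(y, y))


def kernel_distance(x, y):
    """Distance induced by the normalized kernel; max() guards against rounding."""
    return math.sqrt(max(0.0, 2.0 - 2.0 * normalized_kernel(x, y)))


if __name__ == "__main__":
    for pair in [("ALK", "ALK"), ("AAK", "ALK"), ("ALK", "ALD")]:
        print(pair, round(kernel_distance(*pair), 4))
```

A pairwise distance matrix computed this way could then be passed to any standard hierarchical clustering routine; the clustering procedure used for the HLA-DR alleles is not specified in this abstract.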

[1]  Jean-Philippe Vert,et al.  Efficient peptide-MHC-I binding prediction for alleles with few known binders , 2008, Bioinform..

[2]  H. Edelsbrunner,et al.  Efficient algorithms for agglomerative hierarchical clustering methods , 1984 .

[3]  O. Lund,et al.  Definition of supertypes for HLA molecules using clustering of specificity matrices , 2004, Immunogenetics.

[4]  Bjoern Peters,et al.  HLA class I supertypes: a revised and updated classification , 2008, BMC Immunology.

[5]  N. Aronszajn Theory of Reproducing Kernels. , 1950 .

[6]  Alain Sanson,et al.  HLA-DP4, the Most Frequent HLA II Molecule, Defines a New Supertype of Peptide-Binding Specificity , 2002, The Journal of Immunology.

[7]  James Robinson,et al.  IMGT/HLA and IMGT/MHC: sequence databases for the study of the major histocompatibility complex , 2003, Nucleic Acids Res..

[8]  G. Wahba Spline models for observational data , 1990 .

[9]  Morten Nielsen,et al.  NN-align. An artificial neural network-based alignment algorithm for MHC class II peptide binding prediction , 2009, BMC Bioinformatics.

[10]  Nello Cristianini,et al.  Kernel Methods for Pattern Analysis , 2003, ICTAI.

[11]  Laurent Bartholdi,et al.  Hodge Theory on Metric Spaces , 2009, Found. Comput. Math..

[12]  Vasant Honavar,et al.  On Evaluating MHC-II Binding Peptide Prediction Methods , 2008, PloS one.

[13]  Charles R. Johnson,et al.  Topics in Matrix Analysis , 1991 .

[14]  M F del Guercio,et al.  Several common HLA-DR types share largely overlapping peptide binding repertoires. , 1998, Journal of immunology.

[15]  Vladimir Brusic,et al.  Evaluation of MHC-II peptide binding prediction servers: applications for vaccine research , 2008, BMC Bioinformatics.

[16]  Lorenzo Rosasco,et al.  Mathematics of the Neural Response , 2022 .

[17]  Tatsuya Akutsu,et al.  Protein homology detection using string alignment kernels , 2004, Bioinform..

[18]  Tatsuya Akutsu,et al.  Optimizing amino acid substitution matrices with a local alignment kernel , 2006, BMC Bioinformatics.

[19]  Ronald R. Yager,et al.  On ordered weighted averaging aggregation operators in multicriteria decision-making , 1988 .

[20]  O. Lund,et al.  NetMHCIIpan-2.0 - Improved pan-specific HLA-DR predictions using a novel concurrent alignment and weight optimization training procedure , 2010, Immunome research.

[21]  Ding-Xuan Zhou,et al.  Learning Theory: An Approximation Theory Viewpoint , 2007 .

[22]  Bernhard Schölkopf,et al.  Learning with kernels , 2001 .

[23]  S G Marsh,et al.  The HLA dictionary 2001: a summary of HLA-A, -B, -C, -DRB1/3/4/5, -DQB1 alleles and their association with serologically defined HLA-A, -B, -C, -DR, and -DQ antigens. , 2001, Human immunology.

[24]  Gunnar Rätsch,et al.  Novel Machine Learning Methods for MHC Class I Binding Prediction , 2010, PRIB.

[25]  J. Yewdell,et al.  Immunodominance in major histocompatibility complex class I-restricted T lymphocyte responses. , 1999, Annual review of immunology.

[26]  Darren R. Flower,et al.  Predicting Class II MHC-Peptide binding: a kernel based approach using similarity scores , 2006, BMC Bioinformatics.

[27]  Richard H. Scheuermann,et al.  Departments of Pathology and , 2022 .

[28]  Gesine Reinert,et al.  Alignment-Free Sequence Comparison (II): Theoretical Power of Comparison Statistics , 2010, J. Comput. Biol..

[29]  John Sidney,et al.  A Systematic Assessment of MHC Class II Peptide Binding Predictions and Evaluation of a Consensus Approach , 2008, PLoS Comput. Biol..

[30]  Solomon Tesfamariam,et al.  Probability density functions based weights for ordered weighted averaging (OWA) operators: An example of water quality indices , 2007, Eur. J. Oper. Res..

[31]  David Haussler,et al.  Convolution kernels on discrete structures , 1999 .

[32]  M Setterholm,et al.  Use of a neural network to assign serologic specificities to HLA-A, -B and -DRB1 allelic products. , 2003, Tissue antigens.

[33]  W. Bodmer,et al.  Nomenclature for factors of the HLA system, 2010 , 2010, Tissue antigens.

[34]  A Sette,et al.  Practical, biochemical and evolutionary implications of the discovery of HLA class I supermotifs. , 1996, Immunology today.

[35]  Ora Schueler-Furman,et al.  Learning MHC I - peptide binding , 2006, ISMB.

[36]  Bernhard Schölkopf,et al.  Kernel Methods in Computational Biology , 2005 .

[37]  L Adorini,et al.  Capacity of intact proteins to bind to MHC class II molecules. , 1989, Journal of immunology.

[38]  S. Henikoff,et al.  Amino acid substitution matrices from protein blocks. , 1992, Proceedings of the National Academy of Sciences of the United States of America.

[39]  Ronald R. Yager,et al.  On ordered weighted averaging aggregation operators in multicriteria decisionmaking , 1988, IEEE Trans. Syst. Man Cybern..

[40]  Steven G.E. Marsh,et al.  Nomenclature for factors of the HLA system , 1975 .

[41]  Gene H. Golub,et al.  Generalized cross-validation as a method for choosing a good ridge parameter , 1979, Milestones in Matrix Computation.

[42]  Irini A. Doytchinova,et al.  In Silico Identification of Supertypes for Class II MHCs , 2005, The Journal of Immunology.

[43]  Ole Lund,et al.  Immunological Bioinformatics (Computational Molecular Biology) , 2005 .

[44]  Eleazar Eskin,et al.  The Spectrum Kernel: A String Kernel for SVM Protein Classification , 2001, Pacific Symposium on Biocomputing.

[45]  Morten Nielsen,et al.  Quantitative Predictions of Peptide Binding to Any HLA-DR Molecule of Known Sequence: NetMHCIIpan , 2008, PLoS Comput. Biol..

[46]  Felipe Cucker,et al.  Learning Theory: An Approximation Theory Viewpoint (Cambridge Monographs on Applied & Computational Mathematics) , 2007 .

[47]  Gilles Caraux,et al.  A 454 multiplex sequencing method for rapid and reliable genotyping of highly polymorphic genes in large-scale studies , 2010, BMC Genomics.

[48]  L. Mitchell,et al.  A new categorization of HLA DR alleles on a functional basis. , 1998, Human immunology.

[49]  G. Chelvanayagam,et al.  Peptide binding motifs and specificities for HLA-DQ molecules , 1999, Immunogenetics.

[50]  J. Sidney,et al.  Nine major HLA class I supertypes account for the vast preponderance of HLA-A and -B polymorphism , 1999, Immunogenetics.

[51]  Lawrence Hunter,et al.  Pacific symposium on biocomputing 2006 , 2005, PSB 2016.

[52]  Wen-Hsiung Li,et al.  Fundamentals of molecular evolution , 1990 .