On the feasibility of mining CD8+ T cell receptor patterns underlying immunogenic peptide recognition

Current T cell epitope prediction tools are a valuable resource in designing targeted immunogenicity experiments. They typically focus on, and can accurately predict, peptide binding and presentation by major histocompatibility complex (MHC) molecules on the surface of antigen-presenting cells. However, recognition of the peptide-MHC complex by a T cell receptor (TCR) is often not included in these tools. We developed a classification approach based on random forest classifiers to predict recognition of a peptide by a TCR and to discover patterns that contribute to recognition. We considered two formulations of this problem: (1) distinguishing between two sets of TCRs that each bind to a known peptide and (2) retrieving TCRs that bind to a given peptide from a large pool of TCRs. Evaluation of the models on two HIV-1, B*08-restricted epitopes reveals good performance and hints at structural CDR3 features that can determine peptide immunogenicity. These results are of particular importance because they show that predicting T cell epitopes and their recognition by TCRs from sequence data is feasible. In addition, the validity of our models not only serves as a proof of concept for the prediction of immunogenic T cell epitopes but also paves the way for more general and higher-performing models.
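The two problem formulations can be made concrete with a small sketch of how labelled training data might be assembled from CDR3 sequences. Everything below is an illustrative assumption rather than the feature set or data used in this work: the function names `cdr3_features` and `build_dataset`, the choice of features (length, hydrophobic fraction, amino acid counts), the set of residues treated as hydrophobic, and the example sequences are all hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: a simple, hypothetical set of residues treated as hydrophobic.
HYDROPHOBIC = set("AVILMFWC")


def cdr3_features(cdr3):
    """Encode one CDR3 amino acid sequence as a fixed-length numeric vector.

    Layout: [length, hydrophobic fraction, count of each of the 20 amino acids].
    """
    counts = Counter(cdr3)
    length = len(cdr3)
    hydrophobic = sum(counts[a] for a in HYDROPHOBIC)
    fraction = hydrophobic / length if length else 0.0
    return [length, fraction] + [counts[a] for a in AMINO_ACIDS]


def build_dataset(positives, negatives):
    """Return (X, y): feature vectors and 1/0 labels for two sets of TCRs."""
    X = [cdr3_features(seq) for seq in positives + negatives]
    y = [1] * len(positives) + [0] * len(negatives)
    return X, y


# Made-up CDR3 sequences, for illustration only.
epitope_a = ["CASSLGQAYEQYF", "CASSPGTGGYEQYF"]
epitope_b = ["CASSIRSSYEQYF"]
background = ["CASSQDRGNTEAFF", "CASSLAGGYNEQFF", "CASRTGESYGYTF"]

# Formulation (1): TCRs binding one known peptide versus TCRs binding another.
X1, y1 = build_dataset(epitope_a, epitope_b)

# Formulation (2): TCRs binding a given peptide versus a larger pool of TCRs.
X2, y2 = build_dataset(epitope_a, background)
```

Feature matrices of this shape could then be passed to a random forest implementation such as scikit-learn's `RandomForestClassifier`; in formulation (2) the negative class would normally be far larger than the positive class, so class imbalance would need to be considered when evaluating performance.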
