VDJtools: Unifying Post-analysis of T Cell Receptor Repertoires

Despite the growing number of immune repertoire sequencing studies, the field still lacks software for analysis and comprehension of this high-dimensional data. Here we report VDJtools, a complementary software suite that solves a wide range of T cell receptor (TCR) repertoire post-analysis tasks, provides detailed tabular output and publication-ready graphics, and is built on top of a flexible API. Using TCR datasets for a large cohort of unrelated healthy donors, twins, and multiple sclerosis patients, we demonstrate that VDJtools greatly facilitates the analysis and leads to sound biological conclusions. VDJtools software and documentation are available at https://github.com/mikessh/vdjtools.
