powerTCR: A model-based approach to comparative analysis of the clone size distribution of the T cell receptor repertoire

Sequencing of the T cell receptor (TCR) repertoire is a powerful tool for deeper study of immune response, but the unique structure of this type of data makes its meaningful quantification challenging. We introduce a new method, the Gamma-GPD spliced threshold model, to address this difficulty. This biologically interpretable model captures the distribution of the TCR repertoire, demonstrates stability across varying sequencing depths, and permits comparative analysis across any number of sampled individuals. We apply our method to several datasets and obtain insights regarding the differentiating features in the T cell receptor repertoire among sampled individuals across conditions. We have implemented our method in the open-source R package powerTCR.
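
As a rough illustration of what a spliced threshold model looks like, the following sketch writes a generic continuous bulk-and-tail density in which a gamma distribution, truncated at a threshold, describes clone sizes at or below the threshold and a generalized Pareto distribution (GPD) describes clone sizes above it. The symbols here (threshold $u$, tail weight $\phi$, gamma shape $\alpha$ and rate $\beta$, GPD scale $\sigma$ and shape $\xi$) are assumptions chosen for illustration, not notation taken from the abstract. Because clone sizes are counts, the model implemented in powerTCR may use a discretized form and a different parameterization.

```latex
% Generic spliced threshold density (illustrative notation, not from the abstract).
% h, H : gamma density and distribution function with shape \alpha and rate \beta
% g    : generalized Pareto density with threshold u, scale \sigma > 0, shape \xi \neq 0
% \phi : probability mass assigned to the tail above the threshold u
\[
f(x \mid \alpha, \beta, u, \sigma, \xi, \phi) =
\begin{cases}
(1 - \phi)\, \dfrac{h(x \mid \alpha, \beta)}{H(u \mid \alpha, \beta)}, & x \le u, \\[2ex]
\phi\, g(x \mid u, \sigma, \xi), & x > u,
\end{cases}
\qquad
g(x \mid u, \sigma, \xi) = \frac{1}{\sigma}
\left( 1 + \xi\, \frac{x - u}{\sigma} \right)^{-1/\xi - 1}.
\]
```

Under this form the bulk piece integrates to $1 - \phi$ and the tail piece to $\phi$, so the density integrates to one. The threshold $u$ separates the many small clones from the few highly expanded ones.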
