Visual analysis of mass cytometry data by hierarchical stochastic neighbour embedding reveals rare cell types

Mass cytometry allows high-resolution dissection of the cellular composition of the immune system. However, the high dimensionality, large size, and non-linear structure of the data pose considerable challenges for data analysis. In particular, dimensionality-reduction techniques such as t-SNE offer single-cell resolution but are limited in the number of cells that can be analyzed. Here we introduce Hierarchical Stochastic Neighbor Embedding (HSNE) for the analysis of mass cytometry data sets. HSNE constructs a hierarchy of non-linear similarities that can be interactively explored with a stepwise increase in detail up to the single-cell level. We apply HSNE to a study on gastrointestinal disorders and three other available mass cytometry data sets. We find that HSNE efficiently replicates previous observations and identifies rare cell populations that were previously missed due to downsampling. Thus, HSNE removes the scalability limit of conventional t-SNE analysis, a feature that makes it highly suitable for the analysis of massive high-dimensional data sets.

Single-cell profiling yields high-dimensional data on very large numbers of cells, posing challenges for visualization and analysis. Here the authors introduce a method for the analysis of mass cytometry data that can handle very large data sets and allows their intuitive, hierarchical exploration.
