From Bivariate to Multivariate Analysis of Cytometric Data: Overview of Computational Methods and Their Application in Vaccination Studies

Flow and mass cytometry are used to quantify the expression of multiple extracellular or intracellular molecules on single cells, allowing the phenotypic and functional characterization of complex cell populations. Multiparametric flow cytometry is particularly suitable for in-depth analysis of immune responses after vaccination, as it allows measurement of the frequency, phenotype, and functional features of antigen-specific cells. When many parameters are investigated simultaneously, it is not feasible to analyze all possible two-dimensional combinations of marker expression by classical manual analysis, and advanced automated tools for processing and analyzing high-dimensional data sets become necessary. In recent years, many tools for the automated analysis of multiparametric cytometry data have been reported, with a growing number of publications since 2014. However, the use of these tools has largely been restricted to bioinformaticians, and few of them are routinely employed by the biomedical community. Bridging the gap between algorithm developers and end users is fundamental to exploiting the advantages of computational tools in the analysis of cytometry data. The potential of automated analysis ranges from improving data quality in the pre-processing steps to the unbiased, data-driven examination of complex data sets using a variety of algorithms based on different approaches. This review provides an overview of the automated analysis pipeline, from the pre-processing phase to automated population analysis. Analysis based on computational tools might overcome both the subjectivity of manual gating and the operator-biased exploration of expected populations. Examples of applications of automated tools that have successfully improved the characterization of different cell populations in vaccination studies are also presented.
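
To illustrate why manual inspection of every two-dimensional marker combination stops being practical as panels grow, the short sketch below counts the distinct bivariate plots for a given number of markers. The panel sizes are illustrative assumptions, not figures from the review, and the function names are hypothetical. The second count additionally assumes that each marker is reduced to a positive/negative call, which is a simplification introduced here only to show how quickly the phenotype space expands.

```python
from math import comb


def bivariate_plot_count(n_markers: int) -> int:
    """Number of distinct two-marker scatter plots for n_markers parameters."""
    return comb(n_markers, 2)


def binary_phenotype_count(n_markers: int) -> int:
    """Number of +/- marker combinations, assuming each marker is dichotomized."""
    return 2 ** n_markers


if __name__ == "__main__":
    # Illustrative panel sizes only (assumed, not taken from the review).
    for n in (4, 10, 20, 40):
        print(
            f"{n} markers: {bivariate_plot_count(n)} bivariate plots, "
            f"{binary_phenotype_count(n)} binary phenotypes"
        )
```

The number of bivariate plots grows quadratically with panel size, while the number of possible positive/negative phenotypes grows exponentially, which is one way to see why data-driven methods become attractive for high-dimensional panels.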
