Infinity Flow: High-Throughput Single-Cell Quantification of 100s of Proteins Using Conventional Flow Cytometry and Machine Learning

Modern immunologic research increasingly requires high-dimensional analyses to understand the complex milieu of cell types that make up the tissue microenvironments of disease. To this end, we developed Infinity Flow, which uses machine learning to combine hundreds of overlapping flow cytometry panels, enabling simultaneous analysis of the co-expression patterns of hundreds of surface-expressed proteins across millions of individual cells. In this study, we demonstrate that this approach allows comprehensive analysis of the cellular constituency of the steady-state murine lung and the identification of novel cellular heterogeneity in the lungs of melanoma metastasis-bearing mice. We show that, by using supervised machine learning, Infinity Flow enhances the accuracy and depth of clustering or dimensionality reduction algorithms. Infinity Flow is a highly scalable, low-cost, and accessible solution for single-cell proteomics in complex tissues.
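The idea of combining overlapping panels with supervised learning can be illustrated with a small sketch. It is not the authors' implementation: it assumes that every panel shares a common set of markers (called the "backbone" here, a name chosen for illustration) and adds one panel-specific marker, and it substitutes ordinary least squares for whatever regressor the method actually uses. The function name `impute_panels` and the synthetic data are hypothetical.

```python
import numpy as np

def impute_panels(backbone_by_panel, target_by_panel):
    """Fit one regressor per panel (shared markers -> panel-specific marker),
    then predict every panel's marker for the cells of all panels."""
    def design(x):
        # Append an intercept column to the shared-marker measurements.
        return np.hstack([x, np.ones((x.shape[0], 1))])

    # One least-squares fit per panel (assumption: a linear stand-in
    # for a more flexible supervised model).
    weights = [
        np.linalg.lstsq(design(x), y, rcond=None)[0]
        for x, y in zip(backbone_by_panel, target_by_panel)
    ]
    all_cells = design(np.vstack(backbone_by_panel))
    # Rows: cells pooled across panels; columns: one imputed marker per panel.
    return np.column_stack([all_cells @ w for w in weights])

# Synthetic example: 3 panels, 200 cells each, 4 shared markers.
rng = np.random.default_rng(0)
n_panels, n_cells, n_backbone = 3, 200, 4
true_w = rng.normal(size=(n_panels, n_backbone))
backbones = [rng.normal(size=(n_cells, n_backbone)) for _ in range(n_panels)]
targets = [
    b @ w + 0.01 * rng.normal(size=n_cells)
    for b, w in zip(backbones, true_w)
]

imputed = impute_panels(backbones, targets)
print(imputed.shape)  # (600, 3)
```

Each cell was measured for only one panel-specific marker, yet the output holds an estimate of all three for every cell. This is the sense in which overlapping panels can be merged into a single high-dimensional table for downstream clustering or dimensionality reduction.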
