Mondrian Processes for Flow Cytometry Analysis

Analysis of flow cytometry data is an essential tool for the clinical diagnosis of hematological and immunological conditions. Current clinical workflows rely on a manual process called gating to classify cells into their canonical types. This dependence on human annotation limits the speed, reproducibility, and complexity of flow cytometry analysis. In this paper, we propose using Mondrian processes to perform automated gating while incorporating prior information of the kind used by gating technicians. The method segments cells into types via Bayesian nonparametric trees. Examining the posterior over trees allows for interpretable visualizations and uncertainty quantification, two qualities that are vital for implementation in clinical practice.
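To make the tree-based segmentation concrete, the following is a minimal sketch of drawing one sample from a Mondrian process prior on an axis-aligned box, using the standard generative construction: the cost of the next cut is exponential with rate equal to the sum of the box's side lengths, the cut dimension is chosen in proportion to side length, and the cut position is uniform. The function names (`sample_mondrian`, `leaves`), the dictionary-based tree, and the two-marker unit box are illustrative assumptions, not the paper's implementation; the sketch covers only the prior and omits the expert-informed prior and posterior inference described above.

```python
import random


def sample_mondrian(box, budget, rng):
    """Sample a Mondrian partition of `box` (hypothetical helper).

    box    -- list of (low, high) intervals, one per marker/dimension
    budget -- remaining lifetime; larger values yield more cuts
    rng    -- a random.Random instance
    """
    lengths = [hi - lo for lo, hi in box]
    total = sum(lengths)
    if total <= 0:
        return {"box": box}  # degenerate box: nothing left to cut

    # Cost of the next cut ~ Exponential(rate = sum of side lengths).
    cost = rng.expovariate(total)
    if cost > budget:
        return {"box": box}  # budget exhausted: this box is a leaf

    # Pick the cut dimension proportionally to its length,
    # then the cut position uniformly along that dimension.
    dim = rng.choices(range(len(box)), weights=lengths)[0]
    lo, hi = box[dim]
    cut = rng.uniform(lo, hi)

    left = list(box)
    left[dim] = (lo, cut)
    right = list(box)
    right[dim] = (cut, hi)

    remaining = budget - cost
    return {
        "box": box,
        "dim": dim,
        "cut": cut,
        "left": sample_mondrian(left, remaining, rng),
        "right": sample_mondrian(right, remaining, rng),
    }


def leaves(node):
    """Return the leaf boxes (candidate gates) of a sampled tree."""
    if "cut" not in node:
        return [node["box"]]
    return leaves(node["left"]) + leaves(node["right"])


if __name__ == "__main__":
    rng = random.Random(0)
    # Two hypothetical markers, each rescaled to [0, 1].
    tree = sample_mondrian([(0.0, 1.0), (0.0, 1.0)], budget=2.0, rng=rng)
    for gate in leaves(tree):
        print(gate)
```

Each leaf is an axis-aligned rectangle in marker space, which is the same geometric form as a manually drawn rectangular gate; this is one reason a tree of such cuts lends itself to interpretable visualization.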
