QFMatch: multidimensional flow and mass cytometry samples alignment

Part of the flow/mass cytometry data analysis process is aligning (matching) cell subsets between relevant samples. Current methods for this cluster-matching problem are computationally expensive, suffer from the curse of dimensionality, or fail when population patterns vary significantly between samples. Here, we introduce a quadratic form (QF)-based cluster-matching algorithm (QFMatch) that is computationally efficient and accommodates cases where population locations differ significantly from sample to sample, or where populations disappear or appear. We demonstrate the effectiveness of QFMatch by evaluating sample datasets from immunology studies. The algorithm is based on a novel multivariate extension of the quadratic form distance for the comparison of flow cytometry data sets. We show that this QF distance has attractive computational and statistical properties that make it well suited for analysis tasks that involve the comparison of flow/mass cytometry samples.
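
To make the underlying distance concrete, the following is a minimal sketch of the classical quadratic form distance between two binned histograms, D(h, f) = sqrt((h - f)^T A (h - f)), using the common similarity matrix A_ij = 1 - d_ij / d_max from the histogram-comparison literature. It is not the multivariate extension or the QFMatch algorithm introduced here; the function name `quadratic_form_distance`, the shared bin layout, and the choice of A are assumptions made for illustration only.

```python
import numpy as np


def quadratic_form_distance(h, f, centers):
    """Classical QF distance between two histograms on the same bins.

    h, f    : bin weights (assumed already normalized by the caller).
    centers : bin-center coordinates, shape (n_bins,) or (n_bins, n_dims).
    Hypothetical helper for illustration; not the QFMatch implementation.
    """
    h = np.asarray(h, dtype=float)
    f = np.asarray(f, dtype=float)
    centers = np.asarray(centers, dtype=float)
    if centers.ndim == 1:
        centers = centers[:, None]  # treat 1-D centers as one-column points

    # Pairwise Euclidean distances d_ij between bin centers.
    diff = centers[:, None, :] - centers[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))

    # Similarity matrix: 1 on the diagonal, 0 for the most distant bin pair.
    d_max = d.max()
    A = 1.0 - d / d_max if d_max > 0 else np.ones_like(d)

    delta = h - f
    q = float(delta @ A @ delta)
    # Clamp at zero: this A is not guaranteed positive semidefinite in
    # general, and rounding can also push q slightly below zero.
    return float(np.sqrt(max(q, 0.0)))


# Three bins on a line; all mass in one bin per histogram.
centers = [0.0, 1.0, 2.0]
base = [1.0, 0.0, 0.0]
near = [0.0, 1.0, 0.0]  # mass shifted to the adjacent bin
far = [0.0, 0.0, 1.0]   # mass shifted to the most distant bin

d_same = quadratic_form_distance(base, base, centers)
d_near = quadratic_form_distance(base, near, centers)
d_far = quadratic_form_distance(base, far, centers)
```

Because A credits similarity between neighboring bins, a small shift in a population (`d_near`) yields a smaller distance than a large shift (`d_far`), whereas a purely bin-by-bin comparison would score the two shifts identically.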
