A clustering hybrid method to identify cellular populations and their phenotypic signatures

Flow cytometers enable researchers to measure 8 to 16 different cellular markers at the single-cell level. Because flow cytometry datasets encode complex structure across diverse cellular subtypes, new computational methods are required to extract biological insights and to identify potentially rare subpopulations. In this paper, we present a hybrid clustering algorithm that generates a 2-dimensional distillation of flow cytometry data and then automatically extracts the subtypes and their phenotypic signatures based on the markers' distributions.
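The two-stage idea (reduce to two dimensions, then cluster and summarize each cluster by its markers) can be illustrated with a minimal sketch. The abstract does not name the specific reduction or clustering techniques, so this example substitutes PCA for the 2-dimensional distillation and a small k-means for the clustering step; both are stand-ins, not the authors' method. The marker names, the synthetic populations, the function names, and the 0.5 threshold are all hypothetical, and the markers are assumed to be on a comparable scale already.

```python
import numpy as np

def project_2d(X):
    """Stand-in 2-D distillation: project centered data onto its top two principal components."""
    Z = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:2].T

def kmeans(Y, k, n_iter=20):
    """Stand-in clustering: k-means with deterministic farthest-first initialization."""
    centers = [Y[0]]
    for _ in range(k - 1):
        # Squared distance from every point to its nearest chosen center.
        d = np.min([((Y - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Y[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)
    return labels

def phenotypic_signatures(X, labels, markers, threshold=0.5):
    """Label a marker '+' or '-' when the cluster median deviates from the
    global mean by more than `threshold` global standard deviations."""
    mu, sd = X.mean(axis=0), X.std(axis=0)
    out = {}
    for c in np.unique(labels):
        z = (np.median(X[labels == c], axis=0) - mu) / sd
        parts = [m + ("+" if v > 0 else "-")
                 for m, v in zip(markers, z) if abs(v) > threshold]
        out[int(c)] = " ".join(parts)
    return out

# Synthetic data (hypothetical): two populations measured on four markers.
rng = np.random.default_rng(0)
markers = ["CD3", "CD4", "CD8", "CD19"]
pop_a = rng.normal([5, 5, 1, 1], 0.3, size=(200, 4))
pop_b = rng.normal([1, 1, 1, 5], 0.3, size=(200, 4))
X = np.vstack([pop_a, pop_b])

embedding = project_2d(X)           # stage 1: 2-D representation
labels = kmeans(embedding, k=2)     # stage 2: clusters found in 2-D
signatures = phenotypic_signatures(X, labels, markers)
for cluster, sig in signatures.items():
    print(cluster, sig)
```

Note that the signatures are computed on the original marker values rather than on the 2-D coordinates, so each cluster is described in terms a biologist can read directly. A marker that does not differ between populations (CD8 in this synthetic data) falls under the threshold and is left out of the signature.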
