Flow Cytometry Data Analysis

Flow cytometry is frequently used to analyze blood samples, and analysis of flow cytometry data is needed for tasks such as diagnosing disease and monitoring its progression. However, manual analysis of these multi-dimensional data cannot be performed at the desired level for various reasons. This study aims to create an algorithm that automates the gating process that experts otherwise perform manually on flow cytometry data. The algorithm combines k-means and Gaussian mixture model (GMM) clustering, the Expectation-Maximization (EM) algorithm, and the Chernoff distance measure. Tested on the DLBCL dataset, the algorithm achieved a success rate of 86.06%.
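As an illustration of the distance measure named above, the following is a minimal sketch of the closed-form Chernoff distance between two multivariate Gaussians, such as two components of a fitted GMM. The abstract does not describe how the study uses the distance, so the function name `chernoff_distance`, the parameter `alpha`, and the choice of NumPy are assumptions made for this example only. With `alpha = 0.5` the expression reduces to the Bhattacharyya distance.

```python
import numpy as np


def chernoff_distance(mean1, cov1, mean2, cov2, alpha=0.5):
    """Chernoff distance between N(mean1, cov1) and N(mean2, cov2).

    Hypothetical helper, not taken from the study. `alpha` must lie
    strictly between 0 and 1; alpha = 0.5 gives the Bhattacharyya distance.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be in the open interval (0, 1)")

    mean1 = np.atleast_1d(np.asarray(mean1, dtype=float))
    mean2 = np.atleast_1d(np.asarray(mean2, dtype=float))
    cov1 = np.atleast_2d(np.asarray(cov1, dtype=float))
    cov2 = np.atleast_2d(np.asarray(cov2, dtype=float))

    diff = mean2 - mean1
    # Convex combination of the two covariance matrices.
    cov_mix = alpha * cov1 + (1.0 - alpha) * cov2

    # Quadratic term: separation of the means relative to the mixed covariance.
    quad = 0.5 * alpha * (1.0 - alpha) * (diff @ np.linalg.solve(cov_mix, diff))

    # Log-determinant term: mismatch between the covariance shapes.
    # slogdet avoids overflow/underflow of the raw determinants.
    _, logdet_mix = np.linalg.slogdet(cov_mix)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    log_term = 0.5 * (logdet_mix - alpha * logdet1 - (1.0 - alpha) * logdet2)

    return float(quad + log_term)


if __name__ == "__main__":
    # Two unit-variance 1-D Gaussians whose means are 2 apart: distance 0.5.
    print(chernoff_distance([0.0], [[1.0]], [2.0], [[1.0]]))
```

A small distance indicates heavily overlapping components and a large one indicates well-separated components, which is one plausible reason to pair such a measure with mixture-model clustering.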
