Bayesian clustering of flow cytometry data for the diagnosis of B-Chronic Lymphocytic Leukemia

In the rapidly advancing field of flow cytometry, methodologies that facilitate automated clinical decision support are increasingly needed. In the case of B-chronic lymphocytic leukemia (B-CLL), discriminating among the various subpopulations of blood cells is an important task. In this work, our objective is to provide a useful paradigm of computer-based assistance for flow-cytometric data analysis by proposing a Bayesian methodology for clustering flow cytometry data. Using Bayesian clustering, we replicate a series of (unsupervised) data clustering tasks that are usually performed manually by the expert. The proposed methodology incorporates the expert's knowledge as prior information for data-driven statistical learning methods in a simple and efficient way. We observe near-optimal clustering results with respect to the expert's gold standard. The model is flexible enough to correctly identify non-canonical clustering structures despite the presence of various abnormalities and heterogeneities in the data; this is an advantage over approaches that rely on hierarchical or distance-based concepts.
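To make the idea of expert knowledge acting as prior information concrete, the following is a minimal sketch and not the model used in this work. It assumes a one-dimensional Gaussian mixture with a known, shared noise variance, and it encodes the expert's expected location of each cell population as a Normal prior on the corresponding cluster mean; the cluster means are then fitted by MAP-EM. The function names (`normal_pdf`, `map_em_cluster`) and all parameters are hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Normal(mean, var) evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def map_em_cluster(data, prior_means, prior_var=1.0, noise_var=1.0, n_iter=50):
    """MAP-EM for a 1-D Gaussian mixture with Normal priors on the cluster means.

    Assumptions (illustrative only): the noise variance is known and shared,
    and prior_means[j] is the expert's guess for the location of cluster j.
    """
    k = len(prior_means)
    means = list(prior_means)      # start from the expert's guesses
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each cluster.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, noise_var) for w, m in zip(weights, means)]
            total = sum(dens)
            if total == 0.0:
                # Point is numerically far from every cluster: fall back to uniform.
                resp.append([1.0 / k] * k)
            else:
                resp.append([d / total for d in dens])
        # M-step: the prior mean acts like a pseudo-observation with precision 1/prior_var.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            sx = sum(r[j] * x for r, x in zip(resp, data))
            means[j] = (sx / noise_var + prior_means[j] / prior_var) / (
                nj / noise_var + 1.0 / prior_var
            )
            weights[j] = nj / len(data)
    # Hard assignment: the most probable cluster for each point.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, weights, labels
```

In this sketch a small `prior_var` keeps each cluster close to the expert's expected location, while a large `prior_var` lets the data dominate; with many events per population, as is typical in flow cytometry, the influence of the prior on the fitted means becomes small.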
