Automated High-Dimensional Flow Cytometric Data Analysis

Flow cytometry is widely used for single-cell interrogation of surface and intracellular protein expression by measuring the fluorescence intensity of fluorophore-conjugated reagents. We focus on the recently developed procedure of Pyne et al. (2009, Proceedings of the National Academy of Sciences USA 106, 8519-8524) for automated high-dimensional flow cytometric analysis, called FLAME (FLow analysis with Automated Multivariate Estimation). FLAME introduced novel finite mixture models of heavy-tailed and asymmetric distributions to identify and model cell populations in a flow cytometric sample. This approach robustly addresses the complexities of flow data without the need for transformation or projection to lower dimensions. It also addresses the critical task of matching cell populations across samples, which enables downstream analysis, and thus facilitates the application of flow cytometry to new biological and clinical problems. To facilitate pipelining with standard bioinformatic applications such as high-dimensional visualization, subject classification, or outcome prediction, FLAME has been incorporated into the GenePattern package of the Broad Institute. The analysis of flow data can thereby be approached in the same way as data from other genomic platforms. We also consider some new work that proposes a rigorous and robust solution to the registration problem through a multi-level approach that models and registers cell populations simultaneously across a cohort of high-dimensional flow samples. This new approach, called JCM (Joint Clustering and Matching), enables direct and rigorous comparisons across different time points or phenotypes in a complex biological study, as well as classification of new patient samples in a more clinical setting.
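
As a rough illustration of the finite mixture models mentioned above, the generic form of a mixture density with heavy-tailed (multivariate t) components can be written as follows. The notation (g components, mixing proportions \pi_i, dimension p, degrees of freedom \nu_i) is assumed here for illustration and does not appear in the abstract. The asymmetric (skew) extension is left out because its exact parameterization depends on the formulation adopted.

```latex
% Finite mixture density for one cell measurement y in R^p (notation assumed)
f(y; \Psi) = \sum_{i=1}^{g} \pi_i \, f_i(y; \theta_i),
\qquad \pi_i \ge 0, \quad \sum_{i=1}^{g} \pi_i = 1 .

% Heavy-tailed component: multivariate t density with location \mu_i,
% scale matrix \Sigma_i and \nu_i degrees of freedom
f_i(y; \theta_i) =
\frac{\Gamma\!\left(\frac{\nu_i + p}{2}\right)}
     {\Gamma\!\left(\frac{\nu_i}{2}\right) (\pi \nu_i)^{p/2} \, |\Sigma_i|^{1/2}}
\left[ 1 + \frac{(y - \mu_i)^{\top} \Sigma_i^{-1} (y - \mu_i)}{\nu_i} \right]^{-\frac{\nu_i + p}{2}} .

% Posterior probability that a cell with measurement y belongs to component i
\tau_i(y) = \frac{\pi_i \, f_i(y; \theta_i)}{\sum_{h=1}^{g} \pi_h \, f_h(y; \theta_h)} .
```

Under this sketch, each cell would be assigned to the component with the largest posterior probability \tau_i(y), so that each fitted component plays the role of one cell population.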
