Modeling of cytometry data in logarithmic space: when is a bimodal distribution not bimodal?

Recent efforts in systems immunology have led researchers to build quantitative models of cell activation and differentiation. One goal is to account for the distributions of protein abundances obtained from single-cell measurements by flow cytometry or mass cytometry, as a readout of biological regulation. In that context, large cell-to-cell variability is often observed in biological quantities. We show here that these readouts, when viewed on a logarithmic scale, may display two easily distinguishable modes even though the underlying distribution (on a linear scale) is unimodal. We introduce a simple mathematical test to highlight this mismatch. We then use a graphical model to dissect the flow of influence of cell-to-cell variability and its effect on measurement noise. Finally, we show how acquiring additional biological information can reduce the uncertainty introduced by cell-to-cell variability, helping to clarify whether the data are unimodal or bimodal. This communication has cautionary implications for manual and automatic gating strategies, as well as for clustering and modeling of single-cell measurements.
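As an illustration of the mismatch described above, the following minimal sketch uses a toy distribution that is an assumption of this example, not data or a model from this work: an equal-weight mixture of two exponentials with scales 1 and 1000. If X has density f(x), then Y = ln(X) has density g(y) = f(e^y) e^y, and the extra factor e^y can create a second peak. The function names and the grid-based mode count are hypothetical helpers and are not the mathematical test introduced in the abstract.

```python
import math

def linear_density(x, w=0.5, scale_lo=1.0, scale_hi=1000.0):
    """Toy mixture of two exponentials (assumed for illustration).
    It decreases monotonically, so its only mode is at x = 0."""
    return ((w / scale_lo) * math.exp(-x / scale_lo)
            + ((1.0 - w) / scale_hi) * math.exp(-x / scale_hi))

def log_density(y):
    """Density of Y = ln(X) by change of variables: g(y) = f(e^y) * e^y."""
    x = math.exp(y)
    return linear_density(x) * x

def count_modes(values):
    """Count strict local maxima on a grid, allowing maxima at the endpoints."""
    padded = [-math.inf] + list(values) + [-math.inf]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    )

# Linear-scale grid from 0 to 5000, and a log-scale grid covering x = 1e-3 to 1e5.
xs = [i * 0.25 for i in range(20001)]
y_min, y_max = math.log(1e-3), math.log(1e5)
ys = [y_min + i * (y_max - y_min) / 2000 for i in range(2001)]

modes_linear = count_modes([linear_density(x) for x in xs])
modes_log = count_modes([log_density(y) for y in ys])

print("modes on linear scale:", modes_linear)
print("modes on log scale:", modes_log)
```

For this toy mixture the sketch should report a single mode on the linear scale and two modes on the logarithmic scale, located near x = 1 and x = 1000. Counting maxima of an exact density on a grid only works because the density is known in closed form here; with measured data one would have to rely on a statistical criterion for unimodality instead.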
