A Hierarchical, Data-Driven Approach to Modeling Single-Cell Populations Predicts Latent Causes of Cell-To-Cell Variability.

All biological systems exhibit cell-to-cell variability. Frameworks exist for understanding how stochastic fluctuations and transient differences in cell state contribute to experimentally observable variations in cellular responses. However, current methods do not allow the sources of variability between and within stable subpopulations of cells to be identified. We present a data-driven modeling framework for the analysis of populations comprising heterogeneous subpopulations. Our approach combines mixture modeling with frameworks for distribution approximation, facilitating the integration of multiple single-cell datasets and the detection of causal differences between and within subpopulations. The framework is computationally efficient enough that hundreds of competing hypotheses can be compared. We first validate our method on simulated data with a known ground truth, and then analyze data collected using quantitative single-cell microscopy of cultured sensory neurons involved in pain initiation. This approach allows us to quantify the relative contributions of neuronal subpopulations, culture conditions, and expression levels of signaling proteins to the observed cell-to-cell variability in NGF/TrkA-initiated Erk1/2 signaling.
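As a rough illustration of the mixture-modeling and hypothesis-comparison idea, the following minimal sketch fits one- and two-subpopulation Gaussian mixtures to simulated single-cell measurements and compares them by BIC. It is not the framework described above: static Gaussian components fitted by expectation-maximization stand in for the mechanistic, distribution-approximating subpopulation models, and all function names, parameter values, and the simulated data are assumptions made for illustration only.

```python
import numpy as np


def component_logs(x, weights, means, stds):
    """Return log(w_k) + log N(x_i | mean_k, std_k) as a (K, N) array."""
    z = (x[None, :] - means[:, None]) / stds[:, None]
    log_pdf = -0.5 * np.log(2.0 * np.pi * stds[:, None] ** 2) - 0.5 * z ** 2
    return np.log(weights)[:, None] + log_pdf


def mixture_loglik(x, weights, means, stds):
    """Log-likelihood of the data under the mixture (stable log-sum-exp)."""
    comp = component_logs(x, weights, means, stds)
    top = comp.max(axis=0)
    return float(np.sum(top + np.log(np.exp(comp - top).sum(axis=0))))


def fit_em(x, n_components, n_iter=200):
    """Fit a 1-D Gaussian mixture by plain EM (illustrative, no restarts)."""
    quantiles = (np.arange(n_components) + 0.5) / n_components
    means = np.quantile(x, quantiles)          # spread initial means over the data
    stds = np.full(n_components, x.std())
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of each subpopulation for each cell
        comp = component_logs(x, weights, means, stds)
        resp = np.exp(comp - comp.max(axis=0))
        resp /= resp.sum(axis=0)
        # M-step: update weights, means, and standard deviations
        nk = resp.sum(axis=1)
        weights = nk / x.size
        means = (resp @ x) / nk
        var = (resp * (x[None, :] - means[:, None]) ** 2).sum(axis=1) / nk
        stds = np.maximum(np.sqrt(var), 1e-6)  # guard against collapse
    return weights, means, stds


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; lower values are preferred."""
    return -2.0 * loglik + n_params * np.log(n_obs)


# Simulated data with a known ground truth: two subpopulations (assumed values).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 600), rng.normal(5.0, 1.0, 400)])

# Hypothesis 1: a single homogeneous population (2 parameters).
w1, m1, s1 = fit_em(x, 1)
bic_one = bic(mixture_loglik(x, w1, m1, s1), 2, x.size)

# Hypothesis 2: two subpopulations (2 means + 2 stds + 1 free weight).
w2, m2, s2 = fit_em(x, 2)
bic_two = bic(mixture_loglik(x, w2, m2, s2), 5, x.size)

print("BIC, one population:   ", bic_one)
print("BIC, two subpopulations:", bic_two)
```

In this toy setting the hypothesis with the lower BIC is the one better supported by the data. Comparing many hypotheses in the same way would mean repeating the fit for each candidate model structure.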
