Dynamic distribution decomposition for single-cell snapshot time series identifies subpopulations and trajectories during iPSC reprogramming

Recent high-dimensional single-cell technologies such as mass cytometry enable time series experiments that monitor the temporal evolution of cell state distributions and identify dynamically important cell states, such as fate decision states in differentiation. However, these technologies are destructive and therefore require analysis approaches that map between cell state distributions across time points. Current approaches that approximate the single-cell time series as a dynamical system either make overly restrictive assumptions about the type of kinetics or link pairs of sequential measurements in a discontinuous fashion. We propose Dynamic Distribution Decomposition (DDD), an operator approximation approach to infer a continuous distribution map between time points. Using single-cell snapshot time series data, DDD approximates the continuous-time Perron-Frobenius operator by means of a finite set of basis functions. This procedure can be interpreted as a continuous-time Markov chain over a continuum of states. Because DDD assumes only a memoryless Markov (autonomous) process, the dynamics it can represent are more general than those of other common models, e.g., chemical reaction networks or stochastic differential equations. Furthermore, we can check a posteriori whether the autonomy assumption is valid by calculating the prediction error, which we show gives a measure of autonomy within the studied system. The continuity and autonomy assumptions ensure that the same dynamical system maps between all time points rather than changing arbitrarily at each time point. We demonstrate the ability of DDD to reconstruct dynamically important cell states and their transitions on both synthetic data and mass cytometry time series of iPSC reprogramming of a fibroblast system. We use DDD to find previously identified subpopulations of cells and to visualise differentiation trajectories.
Dynamic Distribution Decomposition allows interpretation of high-dimensional snapshot time series data as a low-dimensional Markov process, thereby enabling an interpretable dynamics analysis for a variety of biological processes by means of identifying their dynamically important cell states.
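
The Markov interpretation described above can be illustrated with a toy example. The sketch below is not the DDD implementation: it replaces the continuum of cell states with three hypothetical coarse states, and the generator `Q`, the initial distribution `c0`, and the helper names `expm`, `propagate` and `prediction_error` are all assumptions made for illustration. It shows only how a single autonomous generator maps a distribution between arbitrary time points, and how a prediction error could be computed against observed distributions.

```python
import numpy as np


def expm(A, terms=20, squarings=10):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    A_scaled = A / (2 ** squarings)
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ A_scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result


# Hypothetical generator over three coarse cell states (illustrative values only).
# Columns sum to zero and off-diagonal entries are non-negative, so that
# dc/dt = Q c conserves total probability mass. State 3 is absorbing.
Q = np.array([
    [-0.5,  0.0, 0.0],
    [ 0.4, -0.2, 0.0],
    [ 0.1,  0.2, 0.0],
])


def propagate(c0, t):
    """Map the distribution c0 forward by time t using the same generator Q."""
    return expm(Q * t) @ c0


def prediction_error(observed, predicted):
    """Summed L1 distance between observed and predicted distributions."""
    return float(sum(np.abs(o - p).sum() for o, p in zip(observed, predicted)))


# All cells start in state 1; predict the distribution at later time points.
c0 = np.array([1.0, 0.0, 0.0])
times = [0.0, 1.0, 2.0, 4.0]
predicted = [propagate(c0, t) for t in times]
```

Because the process is autonomous, propagating for one time unit twice gives the same result as propagating for two time units once. In this toy setting, a large `prediction_error` against measured distributions would suggest that a single time-independent generator does not describe the data well.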
