Modelling the Dynamics of Biological Systems with the Geometric Hidden Markov Model

Many biological processes have a simple geometric description: stem cell differentiation can be represented as a branching tree, and cell division can be depicted as a cycle. In this paper, we introduce the geometric hidden Markov model (GHMM), a dynamical model that aims to capture the low-dimensional characteristics of biological processes from multivariate time series data. The framework integrates a graph-theoretical algorithm for dimensionality reduction with a latent variable model for sequential data. We analysed time series data generated by an in silico model of a biomolecular circuit, the repressilator. The trained model has a simple structure: the latent Markov chain corresponds to a two-dimensional lattice. We show that the short-term and long-term predictions of the GHMM reflect the oscillatory behaviour of the genetic circuit. Analysis of the inferred model with community detection methods leads to a coarse-grained representation of the process.
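
To make the kind of input data concrete, the following minimal sketch generates an oscillatory multivariate time series from a repressilator-like circuit. It is an illustration only, not the simulation used in the paper: it assumes a deterministic, dimensionless, protein-only simplification in which each of three proteins represses the next, integrated with a forward Euler scheme. The function name `simulate_repressilator` and the parameters `alpha` (maximal production rate), `n` (Hill coefficient), `dt` and `steps` are hypothetical choices made for this example.

```python
import numpy as np


def simulate_repressilator(alpha=50.0, n=3.0, x0=(1.0, 2.0, 3.0),
                           dt=0.01, steps=20000):
    """Forward-Euler integration of a protein-only repressilator sketch.

    Assumed dynamics (illustrative, not taken from the paper):
        dx_i/dt = alpha / (1 + x_{i-1}^n) - x_i,   i = 0, 1, 2 (cyclic)
    """
    x = np.array(x0, dtype=float)
    traj = np.empty((steps + 1, 3))
    traj[0] = x
    for t in range(steps):
        # np.roll(x, 1)[i] == x[i - 1], so protein i-1 represses protein i.
        repressor = np.roll(x, 1)
        x = x + dt * (alpha / (1.0 + repressor ** n) - x)
        traj[t + 1] = x
    return traj


traj = simulate_repressilator()

# Discard the first half as a transient and measure the swing of each protein.
late = traj[len(traj) // 2:]
swing = late.max(axis=0) - late.min(axis=0)
print("peak-to-peak amplitude per protein:", swing)
```

Each row of `traj` is one time point and each column one protein concentration, which is the shape of multivariate time series a model such as the GHMM would take as input. A stochastic simulation would yield noisier trajectories than this deterministic sketch.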
