Gene regulatory network inference from sparsely sampled noisy data

The complexity of biological systems is encoded in gene regulatory networks. Unravelling this intricate web is a fundamental step in understanding the mechanisms of life and, eventually, in developing efficient therapies to treat and cure diseases. The major obstacle in inferring gene regulatory networks is the lack of data. While time series data are nowadays widely available, they are typically noisy, with a low sampling frequency and a small overall number of samples. This paper develops a method called BINGO to deal specifically with these issues. Benchmarked with both real and simulated time series data covering many different gene regulatory networks, BINGO clearly and consistently outperforms state-of-the-art methods. The novelty of BINGO lies in a nonparametric approach featuring statistical sampling of continuous gene expression profiles. BINGO’s superior performance and ease of use, even by non-specialists, make gene regulatory network inference available to any researcher, helping to decipher the complex mechanisms of life.

Gene regulatory network inference is a topical problem in systems biology. Here, the authors present BINGO, a powerful method for network inference from time series data.
