Reverse engineering gene regulatory networks using approximate Bayesian computation

Gene regulatory networks are collections of genes that interact with one another and with other substances in the cell. By measuring gene expression over time using high-throughput technologies, it may be possible to reverse engineer, or infer, the structure of the gene network involved in a particular cellular process. These gene expression data are typically high-dimensional, with a limited number of biological replicates and time points. Because of these issues and the complexity of biological systems, reverse engineering networks from gene expression data demands a specialized suite of statistical tools and methodologies. We propose a non-standard adaptation of a simulation-based approach known as Approximate Bayesian Computation, based on Markov chain Monte Carlo sampling. This approach is particularly well suited to the inference of gene regulatory networks from longitudinal data. The performance of this approach is investigated via simulations and using longitudinal expression data from a genetic repair system in Escherichia coli.
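
To make the general idea concrete, the following is a minimal sketch of the standard likelihood-free (ABC) Markov chain Monte Carlo scheme on a toy problem. It is not the non-standard adaptation proposed here, whose details the abstract does not give. Everything in it is an assumption chosen for illustration: the one-gene autoregressive simulator `simulate`, the root-mean-square `distance`, the uniform prior on (-1, 1), the Gaussian random-walk proposal, and the values of `epsilon` and `proposal_sd`.

```python
import math
import random


def simulate(theta, x0, n_steps, noise_sd, rng):
    """Toy time course (hypothetical model): x[t] = theta * x[t-1] + noise."""
    x = [x0]
    for _ in range(n_steps - 1):
        x.append(theta * x[-1] + rng.gauss(0.0, noise_sd))
    return x


def distance(a, b):
    """Root-mean-square difference between two equal-length time courses."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))


def abc_mcmc(observed, n_iter, epsilon, proposal_sd, rng,
             theta0=0.0, noise_sd=0.1):
    """Likelihood-free MCMC with a uniform prior on (-1, 1).

    The proposal is a symmetric Gaussian random walk and the prior is flat,
    so the Metropolis-Hastings ratio equals 1 inside the prior support.
    A candidate is therefore accepted exactly when it lies in the support
    and its simulated data fall within epsilon of the observed data.
    """
    theta = theta0
    chain = []
    for _ in range(n_iter):
        candidate = theta + rng.gauss(0.0, proposal_sd)
        if -1.0 < candidate < 1.0:
            simulated = simulate(candidate, observed[0], len(observed),
                                 noise_sd, rng)
            if distance(simulated, observed) <= epsilon:
                theta = candidate
        chain.append(theta)  # a rejected move repeats the current value
    return chain


# Usage: generate pseudo-observed data, then sample the approximate posterior.
rng = random.Random(1)
observed = simulate(0.8, 1.0, 10, 0.1, rng)
chain = abc_mcmc(observed, n_iter=200, epsilon=0.5, proposal_sd=0.2, rng=rng)
```

The tolerance `epsilon` trades accuracy against acceptance rate: smaller values bring the sampled distribution closer to the true posterior but cause more proposals to be rejected. In a network setting the scalar `theta` would be replaced by a matrix of gene-to-gene interaction parameters, which this sketch does not attempt.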
