MC4: A Tempering Algorithm for Large-Sample Network Inference

Bayesian networks and their variants are widely used to model gene regulatory and protein signalling networks. In many settings, the underlying network structure is itself the object of inference. Within a Bayesian framework, inferences about network structure are made via a posterior probability distribution over graphs. In practical problems, however, the space of graphs is usually too large to permit exact inference, which motivates approximate approaches. An MCMC-based algorithm known as MC3 is widely used for network inference in this setting. We argue that the recent trend towards datasets with larger sample sizes, while otherwise advantageous, can make inference by MC3 harder, for reasons related to the concentration of posterior mass. We therefore use an approach known as parallel tempering to construct an algorithm for network inference, which we call MC4. Empirical results on both synthetic and proteomic data show that MC4 converges faster than MC3 and thereby yields demonstrably accurate results, even in challenging settings where MC3 fails.
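To make the idea of parallel tempering concrete, the following is a minimal sketch on a toy problem and not the MC4 algorithm itself. It assumes a hypothetical one-dimensional state space (integers 0 to 40) with two equally good modes in place of a space of graphs, and a made-up log-posterior whose sharpness grows with a sample-size parameter `n`. All names (`log_post`, `mh_step`, `sample`, `LADDER`) and all numerical settings are illustrative assumptions. Several chains target the posterior raised to powers `beta` between 0 and 1, and neighbouring chains occasionally exchange states, so the `beta = 1` chain can reach modes that a single chain would not.

```python
import math
import random

MODES = (8, 32)   # hypothetical mode locations (assumption, not from the paper)
N_STATES = 41     # toy state space: integers 0..40
# Hypothetical geometric ladder of inverse temperatures from 1.0 down to 0.02.
LADDER = [0.02 ** (k / 5) for k in range(6)]


def log_post(x, n):
    """Toy bimodal log-posterior; a larger n gives sharper, better-separated modes."""
    d = min(abs(x - MODES[0]), abs(x - MODES[1]))
    return -n * (d / 10.0) ** 2


def mh_step(x, beta, n, rng):
    """One Metropolis step targeting exp(beta * log_post) with a +/-1 proposal."""
    y = x + rng.choice((-1, 1))
    if y < 0 or y >= N_STATES:
        return x  # proposals outside the state space are rejected
    log_ratio = beta * (log_post(y, n) - log_post(x, n))
    return y if math.log(1.0 - rng.random()) < log_ratio else x


def sample(n, betas, iters, seed=0):
    """Run tempered chains and return the trace of the chain at betas[0].

    With betas == [1.0] there are no swaps, so this is a single plain chain.
    """
    rng = random.Random(seed)
    states = [MODES[0]] * len(betas)  # every chain starts in the left mode
    trace = []
    for _ in range(iters):
        # Local move within each chain at its own temperature.
        states = [mh_step(x, b, n, rng) for x, b in zip(states, betas)]
        # Propose exchanging states between each pair of neighbouring chains.
        for i in range(len(betas) - 1):
            log_ratio = (betas[i] - betas[i + 1]) * (
                log_post(states[i + 1], n) - log_post(states[i], n)
            )
            if math.log(1.0 - rng.random()) < log_ratio:
                states[i], states[i + 1] = states[i + 1], states[i]
        trace.append(states[0])
    return trace


def right_mode_fraction(trace):
    """Fraction of samples lying in the basin of the right-hand mode."""
    return sum(x > 20 for x in trace) / len(trace)


if __name__ == "__main__":
    plain = sample(n=50, betas=[1.0], iters=30000)
    tempered = sample(n=50, betas=LADDER, iters=30000)
    print("single chain, right-mode fraction:", right_mode_fraction(plain))
    print("tempered,     right-mode fraction:", right_mode_fraction(tempered))
```

In this toy setting the two modes carry equal mass, so a well-mixed sampler should spend time in both. The single chain started in the left mode faces a barrier that grows with `n` and in practice stays there, which mirrors the large-sample difficulty described above; the tempered sampler crosses via its flatter, low-`beta` chains.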
