Theoretical analysis and practical insights on importance sampling in Bayesian networks

The AIS-BN algorithm [J. Cheng, M.J. Druzdzel, AIS-BN: An adaptive importance sampling algorithm for evidential reasoning in large Bayesian networks, Journal of Artificial Intelligence Research 13 (2000) 155-188] is a successful importance sampling-based algorithm for Bayesian networks that relies on two heuristic methods to obtain an initial importance function: ε-cutoff, which replaces small probabilities in the conditional probability tables with a larger value ε, and setting the probability distributions of the parents of evidence nodes to uniform. However, why these simple heuristics are so effective was not well understood. In this paper, we point out that their effectiveness stems from a practical requirement: a good importance function should possess thicker tails than the actual posterior probability distribution. By studying the basic assumptions behind importance sampling and the properties of importance sampling in Bayesian networks, we develop several theoretical insights into the desirability of thick tails for importance functions. These insights not only shed light on the success of the two heuristics of AIS-BN, but also provide a common theoretical basis for several other successful heuristic methods.
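The effect of ε-cutoff on the tails of an importance function can be illustrated with a minimal sketch. It is not the AIS-BN implementation: the function names `epsilon_cutoff` and `max_weight`, the choice to take the added mass from the largest entry of the row, and all numbers below are assumptions made for illustration only.

```python
def epsilon_cutoff(row, eps=0.05):
    """Raise every probability below eps to eps.

    Assumption for this sketch: the mass added to the small entries is
    subtracted from the largest entry so the row still sums to 1.
    """
    if not row or abs(sum(row) - 1.0) > 1e-9:
        raise ValueError("row must be a non-empty distribution summing to 1")
    adjusted = [max(p, eps) for p in row]
    surplus = sum(adjusted) - 1.0
    largest = max(range(len(row)), key=row.__getitem__)
    adjusted[largest] -= surplus
    if adjusted[largest] < eps:
        raise ValueError("eps is too large for this row")
    return adjusted


def max_weight(target, proposal):
    """Largest importance weight target(x) / proposal(x) over all states."""
    return max(p / q for p, q in zip(target, proposal) if p > 0)


# Hypothetical numbers: a thin-tailed importance function and a posterior
# that puts substantial mass on the states the importance function neglects.
proposal = [0.998, 0.001, 0.001]
posterior = [0.5, 0.3, 0.2]

thicker = epsilon_cutoff(proposal, eps=0.05)

print("thin-tailed proposal, largest weight:", max_weight(posterior, proposal))
print("after epsilon-cutoff, largest weight:", max_weight(posterior, thicker))
```

In this toy case the cutoff lowers the largest importance weight from roughly 300 to roughly 6. Bounded weights of this kind are what a thicker-tailed importance function is meant to provide, at the cost of sampling the dominant state somewhat less often.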
