Recognition Networks for Approximate Inference in BN2O Networks

A recognition network is a multilayer perceptron (MLP) trained to predict posterior marginals given observed evidence in a particular Bayesian network. The input to the MLP is a vector of the states of the evidential nodes. The activity of an output unit is interpreted as a prediction of the posterior marginal of the corresponding variable. The MLP is trained using samples generated from the corresponding Bayesian network. We evaluate a recognition network that was trained to do inference in a large Bayesian network, similar in structure and complexity to the Quick Medical Reference, Decision Theoretic (QMR-DT) network. Our network is a binary, two-layer, noisy-OR (BN2O) network containing over 4000 potentially observable nodes and over 600 unobservable, hidden nodes. In real medical diagnosis, most observables are unavailable, and there is a complex and unknown process that selects which ones are provided. We incorporate a very basic type of selection bias in our network: a known preference that available observables are positive rather than negative. Even this simple bias has a significant effect on the posterior. We compare the performance of our recognition network to state-of-the-art approximate inference algorithms on a large set of test cases. In order to evaluate the effect of our simplistic model of the selection bias, we evaluate algorithms using a variety of incorrectly modelled selection biases. Recognition networks perform well using both correct and incorrect selection biases.

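To make the setup concrete, the sketch below (not the authors' code) illustrates the three pieces the abstract describes: ancestral sampling from a toy BN2O network, a crude reporting step that prefers positive findings, and a small recognition MLP trained by cross-entropy on the sampled hidden states. All sizes, probabilities, and architecture choices (a single tanh hidden layer, the +1/-1/0 evidence encoding, the 0.8/0.05 reporting rates) are illustrative assumptions. The property the sketch relies on is that cross-entropy against sampled hidden states is minimised by E[h | evidence], so the trained outputs approximate the posterior marginals under the same selection process that generated the training data.

```python
# Minimal sketch, assuming a toy BN2O model; the paper's network has
# ~600 hidden and ~4000 observable nodes, but the recipe is the same.
import numpy as np

rng = np.random.default_rng(0)

N_HIDDEN, N_OBS = 20, 50                     # toy sizes (assumption)
prior = rng.uniform(0.01, 0.1, N_HIDDEN)     # P(h_i = 1)
leak = np.full(N_OBS, 0.01)                  # noisy-OR leak probabilities
# q[i, j] = P(observable j stays off | hidden i on); 1.0 means "no edge".
q = np.where(rng.random((N_HIDDEN, N_OBS)) < 0.1,
             rng.uniform(0.2, 0.9, (N_HIDDEN, N_OBS)), 1.0)

def sample(n):
    """Ancestral sampling: draw hidden causes, then noisy-OR observables."""
    h = (rng.random((n, N_HIDDEN)) < prior).astype(float)
    # P(obs_j = 0 | h) = (1 - leak_j) * prod_i q[i, j] ** h_i
    p_off = (1.0 - leak) * np.exp(h @ np.log(q))
    v = (rng.random((n, N_OBS)) >= p_off).astype(float)
    return v, h

def report(v):
    """Simple selection bias: positive findings are reported far more often
    than negative ones (the rates 0.8 / 0.05 are assumptions).
    Encoding: +1 reported positive, -1 reported negative, 0 unobserved."""
    seen = rng.random(v.shape) < np.where(v == 1.0, 0.8, 0.05)
    return np.where(seen, 2.0 * v - 1.0, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Recognition MLP: evidence vector in, predicted posterior marginals out.
H = 64                                       # hidden width (assumption)
W1 = rng.normal(0.0, 0.1, (N_OBS, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.1, (H, N_HIDDEN)); b2 = np.zeros(N_HIDDEN)

lr = 0.05
for step in range(2000):
    v, h = sample(256)                       # fresh model samples each step
    x = report(v)                            # apply the selection bias
    a = np.tanh(x @ W1 + b1)
    p = sigmoid(a @ W2 + b2)                 # predicted marginals
    # Cross-entropy gradient w.r.t. the sigmoid pre-activations is (p - h);
    # the minimiser of this loss is E[h | x], the true posterior marginals.
    g_out = (p - h) / len(x)
    g_a = (g_out @ W2.T) * (1.0 - a * a)     # backprop through tanh
    W2 -= lr * (a.T @ g_out); b2 -= lr * g_out.sum(axis=0)
    W1 -= lr * (x.T @ g_a); b1 -= lr * g_a.sum(axis=0)

x_test = report(sample(1)[0])
marginals = sigmoid(np.tanh(x_test @ W1 + b1) @ W2 + b2)
print("predicted posterior marginals:", np.round(marginals[0], 3))
```

Because the reporting step is baked into the training data, the MLP learns the posterior under that selection process; evaluating with a mismatched `report` function at test time is one way to mimic the paper's experiments with incorrectly modelled selection biases.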