Continuous Sigmoidal Belief Networks Trained using Slice Sampling

Real-valued random hidden variables can be useful for modelling latent structure that explains correlations among observed variables. I propose a simple unit that adds zero-mean Gaussian noise to its input before passing it through a sigmoidal squashing function. Such units can produce a variety of useful behaviors, ranging from deterministic to binary stochastic to continuous stochastic. I show how "slice sampling" can be used for inference and learning in top-down networks of these units and demonstrate learning on two simple problems.

[1]  Geoffrey E. Hinton,et al.  Learning and relearning in Boltzmann machines , 1986 .

[2]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[3]  Franz Josef Radermacher,et al.  Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (Judea Pearl) , 1990, SIAM Rev..

[4]  Steffen L. Lauritzen,et al.  Independence properties of directed markov fields , 1990, Networks.

[5]  R. Tibshirani Principal curves revisited , 1992 .

[6]  Radford M. Neal Connectionist Learning of Belief Networks , 1992, Artif. Intell..

[7]  Javier R. Movellan,et al.  Learning Continuous Probability Distributions with Symmetric Diffusion Networks , 1993, Cogn. Sci..

[8]  Michael I. Jordan,et al.  MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .

[9]  Geoffrey E. Hinton,et al.  The Helmholtz Machine , 1995, Neural Computation.

[10]  Geoffrey E. Hinton,et al.  The "wake-sleep" algorithm for unsupervised neural networks. , 1995, Science.

[11]  David Heckerman,et al.  Learning Bayesian Networks: A Unification for Discrete and Gaussian Domains , 1995, UAI.

[12]  Volker Tresp,et al.  Discovering Structure in Continuous Variables Using Bayesian Networks , 1995, NIPS.

[13]  Christopher M. Bishop,et al.  EM Optimization of Latent-Variable Density Models , 1995, NIPS 1995.

[14]  D. Mackay,et al.  Bayesian neural networks and density networks , 1995 .

[15]  Radford M. Neal Markov Chain Monte Carlo Methods Based on `Slicing' the Density Function , 1997 .