A Mean Field Learning Algorithm for Unsupervised Neural Networks

We introduce a learning algorithm for unsupervised neural networks based on ideas from statistical mechanics. The algorithm is derived from a mean field approximation for large, layered sigmoid belief networks. We show how to (approximately) infer the statistics of these networks without resorting to sampling. This is done by solving the mean field equations, which relate the statistics of each unit to those of its Markov blanket. Using these statistics as target values, the weights in the network are adapted by a local delta rule. We evaluate the strengths and weaknesses of these networks for problems in statistical pattern recognition.
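
As a rough illustration of the two steps described above (solving fixed-point equations for the unit statistics, then adapting weights with a local delta rule), the following is a minimal sketch for a two-layer sigmoid belief network. It is not the paper's derivation: it assumes a crude approximation in which each hidden unit is replaced by its mean inside the sigmoid, and all names (`mean_field_hidden`, `delta_rule_step`, `visible_error`, `W`, `b`, `c`, `damping`, `lr`) are hypothetical choices made for this example.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def visible_error(v, mu, W, b):
    """Difference between each visible unit and its predicted mean.

    W[i][j] connects hidden unit j (parent) to visible unit i (child).
    """
    errors = []
    for i in range(len(v)):
        z = b[i] + sum(W[i][j] * mu[j] for j in range(len(mu)))
        errors.append(v[i] - sigmoid(z))
    return errors


def mean_field_hidden(v, W, b, c, n_iter=50, damping=0.5):
    """Damped fixed-point iteration for the hidden means mu.

    ASSUMPTION: simplified update, not the paper's exact equations:
        mu_j = sigmoid(c_j + sum_i W[i][j] * (v_i - sigmoid(z_i)))
    Each mean depends only on the unit's bias, its children, and
    (through z_i) the other parents of those children.
    """
    mu = [sigmoid(cj) for cj in c]  # start from the prior means
    for _ in range(n_iter):
        err = visible_error(v, mu, W, b)
        new_mu = [
            sigmoid(c[j] + sum(W[i][j] * err[i] for i in range(len(v))))
            for j in range(len(c))
        ]
        mu = [damping * m + (1.0 - damping) * n for m, n in zip(mu, new_mu)]
    return mu


def delta_rule_step(v, mu, W, b, c, lr=0.1):
    """One local delta-rule update, in place.

    Each weight changes by lr * (target - prediction) * parent mean,
    using only quantities available at the two connected units.
    """
    err = visible_error(v, mu, W, b)
    for i in range(len(v)):
        for j in range(len(mu)):
            W[i][j] += lr * err[i] * mu[j]
        b[i] += lr * err[i]
    for j in range(len(mu)):
        c[j] += lr * (mu[j] - sigmoid(c[j]))
```

In this sketch the means returned by `mean_field_hidden` play the role of target values: a training loop would call it once per pattern `v` and then apply `delta_rule_step` with the result. The damping factor is a stabilizing choice made here, not something stated in the abstract.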
