On MCMC sampling in Bayesian MLP neural networks

Bayesian MLP neural networks are a flexible tool for complex nonlinear problems. The approach is complicated by the need to evaluate integrals over high-dimensional probability distributions; these integrals are generally approximated with Markov chain Monte Carlo (MCMC) methods, which raise several practical issues in implementation. This article discusses the choice of starting values and the number of chains in Bayesian MLP models. We propose a new method for choosing the starting values, based on early stopping, and demonstrate the benefits of using several independent chains.
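The ideas in the abstract can be illustrated with a minimal sketch: several independent Metropolis chains, each started from a point found by a short, deliberately unconverged optimisation run (a crude stand-in for the early-stopping initialisation proposed in the article), with convergence assessed by the Gelman-Rubin potential scale reduction factor. The toy Gaussian log-posterior, the function names, and all tuning constants are hypothetical; a real application would use the posterior over MLP weights and a gradient-based sampler such as hybrid Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy log-posterior (standard 2-D Gaussian). In the article's setting this
# would be the posterior over MLP weights; this stand-in keeps the sketch short.
def log_post(w):
    return -0.5 * np.sum(w ** 2)

def early_stop_init(dim, steps=5, lr=0.3):
    """Stand-in for an early-stopped optimisation run: a few gradient steps
    from a dispersed random start, stopped well before convergence so each
    chain begins in the bulk of the posterior rather than at the mode."""
    w = rng.normal(scale=5.0, size=dim)
    for _ in range(steps):
        w -= lr * w  # gradient of -log_post is simply w for this toy target
    return w

def metropolis(w0, n_iter=2000, step=0.5):
    """Random-walk Metropolis chain of length n_iter started at w0."""
    w, lp = w0.copy(), log_post(w0)
    samples = np.empty((n_iter, w0.size))
    for t in range(n_iter):
        prop = w + rng.normal(scale=step, size=w.size)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject
            w, lp = prop, lp_prop
        samples[t] = w
    return samples

def gelman_rubin(chains):
    """Potential scale reduction factor R-hat per dimension
    (Gelman & Rubin, 1992), from m chains of length n."""
    m, n, _ = chains.shape
    means = chains.mean(axis=1)               # (m, dim) per-chain means
    B = n * means.var(axis=0, ddof=1)         # between-chain variance
    W = chains.var(axis=1, ddof=1).mean(axis=0)  # within-chain variance
    var_hat = (n - 1) / n * W + B / n         # pooled variance estimate
    return np.sqrt(var_hat / W)

# Four independent chains, each from its own early-stopped starting value.
chains = np.stack([metropolis(early_stop_init(2)) for _ in range(4)])
burned = chains[:, 1000:, :]                  # discard burn-in
print("R-hat:", gelman_rubin(burned))         # values near 1 suggest convergence
```

With a single chain the between-chain term is unavailable, which is precisely why multiple independent chains make convergence failures visible: chains stuck in different modes produce R-hat values well above 1.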
