A Practical Bayesian Framework for Backprop Networks

A quantitative and practical Bayesian framework is described for learning of mappings in feedforward networks. The framework makes possible: (1) objective comparisons between solutions using alternative network architectures; (2) objective stopping rules for deletion of weights; (3) objective choice of magnitude and type of weight decay terms or additive regularisers (for penalising large weights, etc.); (4) a measure of the effective number of well-determined parameters in a model; (5) quantified estimates of the error bars on network parameters and on network output; (6) objective comparisons with alternative learning and interpolation models such as splines and radial basis functions. The Bayesian 'evidence' automatically embodies 'Occam's razor', penalising over-flexible and over-complex architectures. The Bayesian approach helps detect poor underlying assumptions in learning models. For learning models well-matched to a problem, a good correlation between generalisation ability and the Bayesian evidence is obtained.
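To make items (3)-(5) concrete, the sketch below shows how the standard Gaussian (evidence) approximation quantities can be computed for a quadratic error model: the log evidence for given regulariser strength alpha and noise level beta, and the effective number of well-determined parameters gamma. This is a minimal illustration assuming the usual formulas of the evidence framework; the function name, arguments, and use of a precomputed data-error Hessian are illustrative, not part of the paper.

```python
# A minimal sketch of the Gaussian "evidence" approximation, assuming the
# standard formulas: M(w) = alpha*E_W + beta*E_D, with Hessian A = beta*H + alpha*I
# at the most probable weights. All names here are illustrative.
import numpy as np

def evidence_quantities(hess_ED, E_D, E_W, alpha, beta, N):
    """Return (log_evidence, gamma) given the Hessian H of the data error E_D
    at the most probable weights, E_D itself, E_W = 0.5*||w||^2, the
    hyperparameters alpha and beta, and the number of data points N."""
    k = hess_ED.shape[0]                      # number of weights
    lam = beta * np.linalg.eigvalsh(hess_ED)  # eigenvalues of beta*H
    gamma = np.sum(lam / (lam + alpha))       # effective number of parameters
    log_det_A = np.sum(np.log(lam + alpha))   # log det(beta*H + alpha*I)
    log_ev = (-alpha * E_W - beta * E_D
              - 0.5 * log_det_A
              + 0.5 * k * np.log(alpha)
              + 0.5 * N * np.log(beta)
              - 0.5 * N * np.log(2 * np.pi))
    return log_ev, gamma
```

Under the same assumptions, gamma also drives the usual fixed-point re-estimates of the hyperparameters, alpha = gamma / (2 E_W) and beta = (N - gamma) / (2 E_D), which is how the framework chooses the magnitude of the weight-decay term without a validation set.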
