Consistency of posterior distributions for neural networks

We show that the posterior distribution for feedforward neural networks is asymptotically consistent, extending earlier results on the universal approximation properties of neural networks to the Bayesian setting. The proof embeds the regression problem in a density estimation problem, uses bounds on the bracketing entropy to show that the posterior is consistent over Hellinger neighborhoods, and then relates this result back to the regression setting. Consistency is shown in two settings: when the number of hidden nodes grows with the sample size, and when the number of hidden nodes is treated as a parameter. This provides a theoretical justification for using neural networks for nonparametric regression in a Bayesian framework.
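
To make the terms in the abstract concrete, the following is a sketch of how a consistency statement over Hellinger neighborhoods is commonly written. The notation is assumed for illustration and is not taken from the paper: a single hidden layer with $k$ logistic nodes, Gaussian noise with variance $\sigma^2$, a true joint density $f_0$ of $(x, y)$, and a neighborhood radius $\varepsilon$.

```latex
% Assumed regression model: one hidden layer, k logistic hidden nodes (illustrative notation)
y_i = \beta_0 + \sum_{j=1}^{k} \beta_j \, \psi\!\left(\gamma_{j0} + \gamma_j^{\top} x_i\right) + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2),
\qquad \psi(z) = \frac{1}{1 + e^{-z}}.

% Hellinger distance between a candidate joint density f and the true density f_0
D_H(f, f_0) = \left( \iint \left( \sqrt{f(x, y)} - \sqrt{f_0(x, y)} \right)^{2} \, dx \, dy \right)^{1/2}.

% Posterior consistency over Hellinger neighborhoods: for every \varepsilon > 0,
P\!\left( \{ f : D_H(f, f_0) \le \varepsilon \} \;\middle|\; (x_1, y_1), \dots, (x_n, y_n) \right)
\;\longrightarrow\; 1 \quad \text{in probability as } n \to \infty.
```

Under this reading, the final step mentioned in the abstract, relating the result back to regression, amounts to translating closeness of the joint densities into closeness of the corresponding regression functions.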
