Paper information - Bayesian invariant measurements of generalisation for discrete distributions

Neural network learning rules can be viewed as statistical estimators. They should be studied in a Bayesian framework even if they are not Bayesian estimators. Generalisation should be measured by the divergence between the true distribution and the estimated distribution. Information divergences are invariant measurements of the divergence between two distributions. The posterior average information divergence is used to measure the generalisation ability of a network. The optimal estimators for multinomial distributions with Dirichlet priors are studied in detail. The results confirm that the definition is compatible with intuition. They also show that many commonly used methods can be put under this unified framework by assuming special priors and special divergences.
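As a rough illustration of the idea in the abstract, the sketch below takes one particular information divergence, the Kullback-Leibler divergence KL(p || q), and a multinomial with a Dirichlet prior. For this choice, it is a standard result that the estimate q minimising the posterior average of KL(p || q) is the posterior mean. The function names, the example counts, and the Monte Carlo check are assumptions made for illustration; they are not taken from the paper, which treats a wider family of divergences.

```python
import math
import random


def posterior_mean(counts, prior):
    """Posterior mean of multinomial probabilities under a Dirichlet prior.

    counts: observed counts n_i; prior: Dirichlet parameters a_i.
    Returns q_i = (n_i + a_i) / sum_j (n_j + a_j).
    """
    post = [n + a for n, a in zip(counts, prior)]
    total = sum(post)
    return [x / total for x in post]


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalising gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [x / s for x in g]


def posterior_average_kl(q, counts, prior, n_samples=5000, seed=0):
    """Monte Carlo estimate of E[KL(p || q)] with p drawn from the posterior.

    The fixed seed means every candidate q is scored on the same samples.
    """
    rng = random.Random(seed)
    alpha = [n + a for n, a in zip(counts, prior)]
    total = sum(kl(sample_dirichlet(alpha, rng), q) for _ in range(n_samples))
    return total / n_samples


# Hypothetical data: three outcomes, four observations.
counts = [3, 1, 0]

# Uniform prior (a_i = 1) gives Laplace's rule of succession.
laplace = posterior_mean(counts, [1, 1, 1])

# Letting a_i go to 0 recovers the maximum likelihood estimate n_i / N
# (an improper limiting prior, used here only to compute the formula).
mle = posterior_mean(counts, [0, 0, 0])

# The posterior mean should score better than an arbitrary alternative.
alternative = [0.5, 0.3, 0.2]
score_mean = posterior_average_kl(laplace, counts, [1, 1, 1])
score_alt = posterior_average_kl(alternative, counts, [1, 1, 1])
```

Under these assumptions, changing the prior parameters moves the estimate between familiar rules (maximum likelihood, Laplace's rule), which is one concrete reading of the abstract's remark that common methods correspond to special priors and special divergences.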

Huaiyu Zhu | Richard Rohwer

[1] J. Berger. Statistical Decision Theory and Bayesian Analysis, 1988.

[2] Shun-ichi Amari, et al. Differential geometrical theory of statistics, 1987.

[3] Radford M. Neal. Bayesian Learning via Stochastic Dynamics, 1992, NIPS.

[4] T. Loredo. From Laplace to Supernova SN 1987A: Bayesian Inference in Astrophysics, 1990.

[5] S. Eguchi. Second Order Efficiency of Minimum Contrast Estimators in a Curved Exponential Family, 1983.

[6] Huaiyu Zhu, et al. Bayesian invariant measurements of generalisation for continuous distributions, 1995.

[7] David H. Wolpert, et al. On the Use of Evidence in Neural Networks, 1992, NIPS.

[8] Halbert White, et al. Learning in Artificial Neural Networks: A Statistical Perspective, 1989, Neural Computation.

[9] Huaiyu Zhu, et al. Information geometric measurements of generalisation, 1995.

[10] R. Romer, et al. Tables of functions with formulae and curves, 1934.

[11] Howard Raiffa, et al. Applied Statistical Decision Theory, 1961.

[12] Gerald S. Rogers, et al. Mathematical Statistics: A Decision Theoretic Approach, 1967.

[13] Michael I. Jordan, et al. Advances in Neural Information Processing Systems 30, 1995.