Some PAC-Bayesian Theorems

This paper gives PAC guarantees for "Bayesian" algorithms, that is, algorithms that optimize risk-minimization expressions involving a prior probability and a likelihood for the training data. PAC-Bayesian algorithms are motivated by the desire to use an informative prior, one encoding knowledge about the expected experimental setting, while still retaining PAC performance guarantees over all IID settings. The PAC-Bayesian theorems given here apply to an arbitrary prior measure on an arbitrary concept space. These theorems provide an alternative to VC dimension in proving PAC bounds for parameterized concepts.
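As a sketch of the flavor of these results: assume a countable concept class C, a prior P on C, a sample of m IID examples, true error err(c), empirical error on the sample (written \widehat{err}(c) below), and a confidence parameter δ. The notation here is illustrative, and the paper's own statements and constants may differ, but the simplest bound of this kind reads

\[
\Pr\!\left[\,\forall c \in C:\ \mathrm{err}(c) \;\le\; \widehat{\mathrm{err}}(c) + \sqrt{\frac{\ln\frac{1}{P(c)} + \ln\frac{1}{\delta}}{2m}}\,\right] \;\ge\; 1 - \delta ,
\]

where the probability is over the IID draw of the sample. This form follows from Hoeffding's inequality combined with a union bound weighted by the prior: concepts assigned larger prior probability pay a smaller complexity penalty. The theorems in the paper generalize this idea to an arbitrary prior measure on an arbitrary concept space.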
