An Improved Predictive Accuracy Bound for Averaging Classifiers

We present an improved bound on the difference between the training and test errors of voting classifiers. This improved averaging bound provides a theoretical justification for popular averaging techniques such as Bayesian classification, Maximum Entropy discrimination, Winnow, and Bayes point machines, and it has implications for the design of learning algorithms.
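For orientation, bounds of this kind typically take a PAC-Bayesian form. The following is a minimal sketch under assumed notation, not the paper's exact theorem: fix a prior distribution P over classifiers before seeing the data, draw m i.i.d. training examples, and let Q be any posterior distribution chosen after seeing the data. Then with probability at least 1 - \delta, simultaneously for all Q,

\[
  \mathrm{KL}\!\left(\hat{e}_Q \,\middle\|\, e_Q\right)
  \;\le\;
  \frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{m+1}{\delta}}{m},
  \qquad
  \mathrm{KL}(q \,\|\, p) \;=\; q \ln\frac{q}{p} + (1-q)\ln\frac{1-q}{1-p},
\]

where \hat{e}_Q and e_Q denote the expected training and test error rates of a classifier drawn at random from Q (the Gibbs classifier). The train-test gap is thus controlled by the divergence KL(Q || P) rather than by the complexity of any single classifier, which is what favors averaging methods: a bound for the Q-weighted voting classifier follows, since its error rate is at most twice that of the Gibbs classifier.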
