The Value of Agreement, a New Boosting Algorithm

We present a new generalization bound in which the use of unlabeled examples yields a better trade-off between training-set size and the resulting classifier's quality, thereby reducing the number of labeled examples needed to reach a given quality. The improvement is obtained by requiring the algorithms that generate the classifiers to agree on the unlabeled examples. The extent of the improvement depends on the diversity of the learners: a more diverse group of learners yields a larger improvement, whereas two copies of a single algorithm give no advantage at all. As a proof of concept, we apply the resulting algorithm, named AgreementBoost, to a web classification problem and obtain a reduction of up to 40% in the number of labeled examples required.
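The abstract describes boosting several learners while penalizing their disagreement on unlabeled data. Below is a minimal sketch of that idea, assuming a logistic loss on the labeled examples, a cross-learner variance penalty on the unlabeled examples, and regression stumps fit to the negative functional gradient. The function name agreement_boost_sketch, the penalty weight eta, and the feature subsampling used to make the two learners diverse are illustrative assumptions; this is not the paper's exact AgreementBoost procedure.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def agreement_boost_sketch(X_lab, y_lab, X_unl, n_rounds=50, eta=1.0, lr=0.1, seed=0):
    """Boost two ensembles under a shared objective:
    logistic loss on labeled data + eta * cross-learner variance on unlabeled data."""
    rng = np.random.RandomState(seed)
    n_learners = 2
    X_all = np.vstack([X_lab, X_unl])
    n_lab = len(X_lab)
    F = np.zeros((n_learners, len(X_all)))   # current scores of each learner on all points
    ensembles = [[] for _ in range(n_learners)]

    for _ in range(n_rounds):
        F_mean = F.mean(axis=0)              # per-point mean score across learners
        for j in range(n_learners):
            # Negative functional gradient of the combined objective w.r.t. learner j.
            neg_grad = np.zeros(len(X_all))
            margins = y_lab * F[j, :n_lab]
            neg_grad[:n_lab] = y_lab / (1.0 + np.exp(margins))                               # labeled part
            neg_grad[n_lab:] = -(2.0 * eta / n_learners) * (F[j, n_lab:] - F_mean[n_lab:])   # agreement part
            # Weak learner: a regression stump on a random half of the features,
            # a stand-in for the diverse learners the bound rewards.
            feats = rng.choice(X_all.shape[1], max(1, X_all.shape[1] // 2), replace=False)
            stump = DecisionTreeRegressor(max_depth=1).fit(X_all[:, feats], neg_grad)
            ensembles[j].append((feats, stump))
            F[j] += lr * stump.predict(X_all[:, feats])

    def predict(X_new):
        scores = np.zeros(len(X_new))
        for j in range(n_learners):
            for feats, stump in ensembles[j]:
                scores += lr * stump.predict(X_new[:, feats])
        return np.sign(scores / n_learners)  # averaged vote of the two ensembles

    return predict

if __name__ == "__main__":
    # Tiny synthetic demo: 60 labeled, 140 unlabeled, 100 held-out points, labels in {-1, +1}.
    rng = np.random.RandomState(1)
    X = rng.randn(300, 6)
    y = np.sign(X[:, 0] + 0.5 * X[:, 1])
    predict = agreement_boost_sketch(X[:60], y[:60], X[60:200])
    print("held-out accuracy:", (predict(X[200:]) == y[200:]).mean())

The agreement term uses the per-point variance across learners, whose gradient for learner j is (2/J)(F_j - mean), so each ensemble is nudged toward the others on unlabeled points while fitting the labels on labeled points.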
