Boosting in the presence of noise

Boosting algorithms are procedures that "boost" low-accuracy weak learning algorithms to achieve arbitrarily high accuracy. Over the past decade boosting has been widely used in practice and has become a major research topic in computational learning theory. In this paper we study boosting in the presence of random classification noise, giving both positive and negative results. We show that a modified version of a boosting algorithm due to Mansour and McAllester (J. Comput. System Sci. 64(1) (2002) 103) can achieve accuracy arbitrarily close to 1 − η, where η is the noise rate; equivalently, the error of the final hypothesis can be made arbitrarily close to the noise rate. We also give a matching lower bound by showing that, assuming one-way functions exist, no efficient black-box boosting algorithm can guarantee error below the noise rate. Finally, we consider a variant of the standard boosting scenario in which the "weak learner" satisfies a slightly stronger condition than the usual weak learning guarantee. We give an efficient algorithm in this framework that boosts to arbitrarily high accuracy in the presence of classification noise.
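For concreteness, the standard definitions underlying these statements (assumed here; the abstract itself does not spell them out) can be written as follows, with γ the weak learner's advantage and η the noise rate:

\[
\Pr_{x \sim \mathcal{D}}\bigl[h(x) = c(x)\bigr] \;\ge\; \tfrac{1}{2} + \gamma
\qquad \text{(weak learning: hypothesis } h \text{ beats random guessing by } \gamma\text{)}
\]
\[
\mathrm{EX}^{\eta}(c, \mathcal{D}):\ \text{draw } x \sim \mathcal{D},\ \text{return } (x, c(x)) \text{ w.p. } 1-\eta,\ \text{else } (x, \neg c(x))
\qquad (0 \le \eta < \tfrac{1}{2})
\]

In this notation, the positive result says that for every \(\epsilon > 0\) the modified booster outputs \(h\) with \(\Pr[h(x) \neq c(x)] \le \eta + \epsilon\), while the negative result says no efficient black-box booster can guarantee error below \(\eta\) (assuming one-way functions exist).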

[1] Michael J. Kearns, Robert E. Schapire. Efficient Distribution-Free Learning of Probabilistic Concepts. Proceedings of the 31st Annual Symposium on Foundations of Computer Science (FOCS), 1990.

[2] Robert E. Schapire. Theoretical Views of Boosting and Applications. ALT, 1999.

[3] Adam Tauman Kalai. Learning Monotonic Linear Functions. COLT, 2004.

[4] Michael J. Kearns, Yishay Mansour. On the Boosting Ability of Top-Down Decision Tree Learning Algorithms. J. Comput. Syst. Sci., 1999.

[5] Thomas G. Dietterich. An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization. Machine Learning, 2000.

[6] Michael J. Kearns, Umesh V. Vazirani. An Introduction to Computational Learning Theory. MIT Press, 1994.

[7] Javed A. Aslam, Scott E. Decatur. Specification and Simulation of Statistical Query Algorithms for Efficiency and Noise Tolerance. COLT, 1995.

[8] Johan Håstad, Russell Impagliazzo, Leonid A. Levin, Michael Luby. A Pseudorandom Generator from any One-Way Function. SIAM J. Comput., 1999.

[9] Robert E. Schapire. Theoretical Views of Boosting. EuroCOLT, 1999.

[10] Yoav Freund. Boosting a Weak Learning Algorithm by Majority. Inf. Comput., 1995.

[11] Michael J. Kearns, Leslie G. Valiant. Cryptographic Limitations on Learning Boolean Formulae and Finite Automata. J. ACM, 1994.

[12] Gábor Lugosi, Hans Ulrich Simon (eds.). Proceedings of the 19th Annual Conference on Learning Theory (COLT), 2006.

[13] Yishay Mansour, David A. McAllester. Boosting Using Branching Programs. J. Comput. Syst. Sci., 2002.

[14] Robert E. Schapire. The Strength of Weak Learnability. Machine Learning, 1990.

[15] Avrim Blum, Alan M. Frieze, Ravi Kannan, Santosh Vempala. A Polynomial-Time Algorithm for Learning Noisy Linear Threshold Functions. Algorithmica, 1998.

[16] Oded Goldreich, Shafi Goldwasser, Silvio Micali. How to Construct Random Functions. J. ACM, 1986.

[17] Yoav Freund, Robert E. Schapire. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. J. Comput. Syst. Sci., 1997.

[18] Jerome H. Friedman, Trevor Hastie, Robert Tibshirani. Additive Logistic Regression: A Statistical View of Boosting. Ann. Statist., 2000.