Optimally-Smooth Adaptive Boosting and Application to Agnostic Learning

We construct a boosting algorithm that is the first to be both smooth and adaptive. These two features make it possible to improve performance for many learning tasks whose solutions use a boosting technique.

Originally, the boosting approach was suggested for the standard PAC model; we analyze possible applications of boosting in the model of agnostic learning, which is "more realistic" than PAC. We derive a lower bound on the final error achievable by boosting in the agnostic model, and we show that our algorithm actually achieves that accuracy (within a constant factor of 2): when the booster faces distribution D, its final error is bounded above by (1/(1/2 − β)) · err_D(F) + ε, where err_D′(F) + β is an upper bound on the error of a hypothesis received from the (agnostic) weak learner when it faces distribution D′, and ε is any real such that the complexity of the boosting is polynomial in 1/ε. We note that the idea of applying boosting in the agnostic model was first suggested by Ben-David, Long and Mansour [15], and the above accuracy is an exponential improvement with respect to β over their result of (1/(1/2 − β)) · err_D(F)^(2(1/2 − β)² / ln(1/β − 1)) + ε.

Finally, we construct a boosting "tandem", thereby achieving, up to a constant factor, both the lowest possible number of boosting iterations and the best possible smoothness. This allows solving adaptively those problems whose solutions are based on smooth boosting (such as noise-tolerant boosting and learning DNF with membership queries), while preserving the original solutions' complexity.
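To make the claimed improvement concrete, the two final-error bounds from the abstract can be compared numerically. The sketch below is purely illustrative (the function names are ours, not the paper's): it evaluates the bound of this paper, (1/(1/2 − β)) · err_D(F) + ε, against the earlier Ben-David–Long–Mansour bound, whose exponent on err_D(F) is 2(1/2 − β)² / ln(1/β − 1) rather than 1.

```python
import math

def smooth_adaptive_bound(err, beta, eps):
    # Bound from this paper: (1/(1/2 - beta)) * err_D(F) + eps.
    return err / (0.5 - beta) + eps

def bdlm_bound(err, beta, eps):
    # Earlier Ben-David-Long-Mansour bound:
    # (1/(1/2 - beta)) * err_D(F)^alpha + eps,
    # with alpha = 2*(1/2 - beta)^2 / ln(1/beta - 1) < 1.
    alpha = 2 * (0.5 - beta) ** 2 / math.log(1 / beta - 1)
    return err ** alpha / (0.5 - beta) + eps

# Example: a weak learner with advantage gap beta = 0.1,
# an optimal-error term err_D(F) = 0.01, and slack eps = 0.001.
print(smooth_adaptive_bound(0.01, 0.1, 0.001))  # linear in err_D(F)
print(bdlm_bound(0.01, 0.1, 0.001))             # much weaker: err_D(F)^alpha
```

Since the exponent alpha is strictly below 1, err_D(F)^alpha can be orders of magnitude larger than err_D(F) itself when err_D(F) is small, which is the "exponential improvement w.r.t. β" claimed in the abstract.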

[1]  Rocco A. Servedio, et al. Smooth Boosting and Learning with Malicious Noise, 2001, J. Mach. Learn. Res.

[2]  Nader H. Bshouty, et al. More efficient PAC-learning of DNF with membership queries under the uniform distribution, 2004, J. Comput. Syst. Sci.

[3]  R. Schapire. The Strength of Weak Learnability, 1990, Machine Learning.

[4]  Yoav Freund, et al. An Adaptive Version of the Boost by Majority Algorithm, 1999, COLT '99.

[5]  Russell Impagliazzo, et al. Hard-core distributions for somewhat hard problems, 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.

[6]  Osamu Watanabe, et al. MadaBoost: A Modification of AdaBoost, 2000, COLT.

[7]  Yoav Freund, et al. Boosting a weak learning algorithm by majority, 1995, COLT '90.

[8]  Nader H. Bshouty, et al. More efficient PAC-learning of DNF with membership queries under the uniform distribution, 1999, COLT '99.

[9]  Leslie G. Valiant, et al. A theory of the learnable, 1984, STOC '84.

[10]  Leslie G. Valiant, et al. Cryptographic Limitations on Learning Boolean Formulae and Finite Automata, 1993, Machine Learning: From Theory to Applications.

[11]  Yoav Freund, et al. An improved boosting algorithm and its implications on learning complexity, 1992, COLT '92.

[12]  R. Schapire, et al. Toward efficient agnostic learning, 1992, COLT '92.

[13]  Vitaly Feldman, et al. On Using Extended Statistical Queries to Avoid Membership Queries, 2001, J. Mach. Learn. Res.

[14]  Yishay Mansour, et al. Weakly learning DNF and characterizing statistical query learning using Fourier analysis, 1994, STOC '94.

[15]  Shai Ben-David, et al. Agnostic Boosting, 2001, COLT/EuroCOLT.

[16]  Dmitry Gavinsky, et al. On Boosting with Optimal Poly-Bounded Distributions, 2001, COLT/EuroCOLT.

[17]  Rocco A. Servedio, et al. Boosting and hard-core sets, 1999, 40th Annual Symposium on Foundations of Computer Science.

[18]  Jeffrey C. Jackson, et al. An efficient membership-query algorithm for learning DNF with respect to the uniform distribution, 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.

[19]  Yishay Mansour, et al. Learning Boolean Functions via the Fourier Transform, 1994.

[20]  David Haussler, et al. Learnability and the Vapnik-Chervonenkis dimension, 1989, JACM.