Boosting a Weak Learning Algorithm by Majority
(To be published in Information and Computation)

We present an algorithm for improving the accuracy of algorithms that learn binary concepts. The improvement is achieved by combining a large number of hypotheses, each generated by training the given learning algorithm on a different set of examples. Our algorithm is based on ideas presented by Schapire in his paper "The strength of weak learnability", and improves on his results. The analysis of our algorithm yields general upper bounds on the resources required for learning in Valiant's polynomial PAC learning framework, which are the best general upper bounds known today. We show that the number of hypotheses combined by our algorithm is the smallest possible. Other outcomes of our analysis are results on the representational power of threshold circuits, the relation between learnability and compression, and a method for parallelizing PAC learning algorithms. We also extend our algorithm to concepts that are not binary and to the case where the accuracy of the learning algorithm depends on the distribution of the instances.
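To make the combine-by-majority idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's exact construction: the hypothetical `weak_learn` stump learner stands in for the given learning algorithm, and the simple error-doubling reweighting is a stand-in for Freund's more careful example-filtering scheme. Only the overall shape (train many weak hypotheses on differently weighted views of the data, then take their unweighted majority vote) reflects the abstract.

```python
# Sketch of boosting by majority vote. Assumptions (not from the paper):
# the stump-based weak learner and the doubling reweighting rule below.
import numpy as np

def weak_learn(X, y, weights):
    """Hypothetical weak learner: a decision stump minimizing weighted
    error. Stands in for the given PAC learning algorithm."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thresh in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(sign * (X[:, j] - thresh) > 0, 1, -1)
                err = weights @ (pred != y)
                if err < best_err:
                    best_err, best = err, (j, thresh, sign)
    j, t, s = best
    return lambda Z: np.where(s * (Z[:, j] - t) > 0, 1, -1)

def boost_by_majority(X, y, rounds):
    """Train `rounds` weak hypotheses, each on a reweighted view of the
    sample, and return their unweighted majority vote."""
    n = len(y)  # labels y are assumed to be +1/-1
    weights = np.full(n, 1.0 / n)
    hypotheses = []
    for _ in range(rounds):
        h = weak_learn(X, y, weights)
        hypotheses.append(h)
        # Up-weight misclassified examples so later hypotheses focus on
        # them (a simplification of the paper's weighting scheme).
        mistakes = h(X) != y
        weights = np.where(mistakes, weights * 2.0, weights)
        weights /= weights.sum()
    def majority(Z):
        votes = np.sum([h(Z) for h in hypotheses], axis=0)
        return np.where(votes >= 0, 1, -1)
    return majority
```

On the choice of `rounds`: for a weak learner whose error is bounded by 1/2 - gamma, a Hoeffding-style argument suggests that on the order of (1/(2 gamma^2)) ln(1/epsilon) hypotheses suffice for final error epsilon, which is the flavor of the bound the abstract claims is the smallest possible.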
