Projection Learning

A method of combining learning algorithms is described that preserves attribute-efficiency. It yields learning algorithms that require a number of examples that is polynomial in the number of relevant variables and logarithmic in the number of irrelevant ones. The algorithms are simple to implement and realizable on networks with a number of nodes linear in the total number of variables. They include generalizations of Littlestone's Winnow algorithm, and are, therefore, good candidates for experimentation on domains having very large numbers of attributes but where nonlinear hypotheses are sought.
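
For readers unfamiliar with the base algorithm named above, the following is a minimal sketch of a Winnow-style learner with multiplicative weight updates. It is not the projection learning construction itself. It assumes Boolean attributes and the variant often called Winnow2, with promotion factor 2 and threshold n as illustrative defaults. The class name `Winnow` and its parameters are hypothetical choices made for this example.

```python
from typing import List, Optional, Sequence


class Winnow:
    """Illustrative Winnow2-style online learner over n Boolean attributes."""

    def __init__(self, n: int, alpha: float = 2.0,
                 threshold: Optional[float] = None) -> None:
        # Every weight starts at 1; the threshold defaults to n (an assumed choice).
        self.weights: List[float] = [1.0] * n
        self.alpha = alpha
        self.threshold = float(n) if threshold is None else threshold
        self.mistakes = 0

    def predict(self, x: Sequence[int]) -> int:
        # Linear threshold prediction: sum the weights of the active attributes.
        total = sum(w for w, xi in zip(self.weights, x) if xi)
        return 1 if total >= self.threshold else 0

    def update(self, x: Sequence[int], y: int) -> int:
        """Predict on x, then adjust weights only if the prediction was wrong."""
        y_hat = self.predict(x)
        if y_hat != y:
            self.mistakes += 1
            # False negative: promote active weights. False positive: demote them.
            factor = self.alpha if y == 1 else 1.0 / self.alpha
            self.weights = [w * factor if xi else w
                            for w, xi in zip(self.weights, x)]
        return y_hat
```

Because the updates are multiplicative, the weight of a relevant attribute can reach the threshold after roughly log n promotions. This is the source of the logarithmic dependence on the number of irrelevant attributes mentioned in the abstract.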
