Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm

Valiant (1984) and others have studied the problem of learning various classes of Boolean functions from examples. Here we discuss incremental learning of these functions. We consider a setting in which the learner responds to each example according to a current hypothesis. Then the learner updates the hypothesis, if necessary, based on the correct classification of the example. One natural measure of the quality of learning in this setting is the number of mistakes the learner makes. For suitable classes of functions, learning algorithms are available that make a bounded number of mistakes, with the bound independent of the number of examples seen by the learner. We present one such algorithm that learns disjunctive Boolean functions, along with variants for learning other classes of Boolean functions. The basic method can be expressed as a linear-threshold algorithm. A primary advantage of this algorithm is that the number of mistakes grows only logarithmically with the number of irrelevant attributes in the examples. At the same time, the algorithm is computationally efficient in both time and space.
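The abstract describes a linear-threshold algorithm with mistake bounds that grow only logarithmically in the number of irrelevant attributes. Below is a minimal sketch of such a learner for monotone disjunctions, assuming the standard Winnow-style formulation (all weights start at 1, the threshold is fixed at n, and the weights of active attributes are multiplied or divided by a factor alpha on each mistake). The class and variable names are illustrative, not taken from the paper.

```python
class Winnow:
    """Sketch of a multiplicative-update linear-threshold learner
    for monotone disjunctions over n Boolean attributes."""

    def __init__(self, n, alpha=2.0):
        self.n = n                # number of Boolean attributes
        self.alpha = alpha        # promotion/demotion factor
        self.w = [1.0] * n        # one weight per attribute, initialized to 1
        self.theta = float(n)     # fixed threshold

    def predict(self, x):
        # x is a list of 0/1 attribute values; predict positive iff
        # the weighted sum of active attributes reaches the threshold
        return sum(wi * xi for wi, xi in zip(self.w, x)) >= self.theta

    def update(self, x, y):
        """Predict on x, then on a mistake multiplicatively adjust
        the weights of the active attributes. Returns the prediction."""
        yhat = self.predict(x)
        if yhat and not y:        # false positive: demote active weights
            self.w = [wi / self.alpha if xi else wi
                      for wi, xi in zip(self.w, x)]
        elif y and not yhat:      # false negative: promote active weights
            self.w = [wi * self.alpha if xi else wi
                      for wi, xi in zip(self.w, x)]
        return yhat
```

Because only the weights of attributes active in a misclassified example are adjusted, irrelevant attributes contribute to the mistake count only through the logarithmic growth needed for relevant weights to cross the threshold, which is the behavior the abstract highlights.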

[1] V. N. Vapnik and A. Ya. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities, 1971.

[2] Saburo Muroga. Threshold logic and its applications, 1971.

[3] Richard O. Duda and Peter E. Hart. Pattern classification and scene analysis, 1974, Wiley-Interscience.

[4] Tom M. Mitchell. Generalization as Search, 1982, Artificial Intelligence.

[5] Dana Angluin and Carl H. Smith. Inductive Inference: Theory and Methods, 1983, ACM Computing Surveys.

[6] Leslie G. Valiant. A theory of the learnable, 1984, STOC '84.

[7] Leslie G. Valiant. Learning Disjunctions of Conjunctions, 1985, IJCAI.

[8] Ranan B. Banerji. The Logic of Learning: A Basis for Pattern Recognition and for Improvement of Performance, 1985, Advances in Computers.

[9] David Haussler. Quantifying the Inductive Bias in Concept Learning (Extended Abstract), 1986, AAAI.

[10] David E. Rumelhart, James L. McClelland, et al. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations, 1986.

[11] Michael Kearns, Ming Li, Leonard Pitt, and Leslie G. Valiant. On the learnability of Boolean formulae, 1987, STOC.

[12] Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, and Manfred K. Warmuth. Occam's Razor, 1987, Information Processing Letters.

[13] M. Kearns, et al. Recent Results on Boolean Concept Learning, 1987.

[14] Dana Angluin. Queries and Concept Learning, 1988, Machine Learning.

[15] David Haussler, Nick Littlestone, and Manfred K. Warmuth. Predicting {0,1}-functions on randomly drawn points, 1988, 29th Annual Symposium on Foundations of Computer Science (FOCS).

[16] Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, and Manfred K. Warmuth. Learnability and the Vapnik-Chervonenkis dimension, 1989, Journal of the ACM.