Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm (Extended Abstract)

Valiant and others have studied the problem of learning various classes of Boolean functions from examples. Here we discuss on-line learning of these functions. In on-line learning, the learner responds to each example according to its current hypothesis and then, if necessary, updates the hypothesis based on the correct classification of the example. One natural measure of the quality of learning in the on-line setting is the number of mistakes the learner makes. For suitable classes of functions, on-line learning algorithms are available that make a bounded number of mistakes, with the bound independent of the number of examples seen by the learner. We present one such algorithm, which learns disjunctive Boolean functions, along with variants for learning other classes of Boolean functions. The algorithm can be expressed as a linear-threshold algorithm. A primary advantage of this algorithm is that the number of mistakes it makes is little affected by the presence of large numbers of irrelevant attributes in the examples; we show that the number of mistakes grows only logarithmically with the number of irrelevant attributes. At the same time, the algorithm is computationally efficient in both time and space.
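The on-line protocol described above (predict from the current hypothesis, then update only after a mistake) can be illustrated with a minimal sketch of a mistake-driven linear-threshold learner for monotone disjunctions. The abstract does not state the update rule, so everything specific here is an assumption made for illustration: the multiplicative promotion factor `alpha = 2`, the threshold equal to the number of attributes `n`, the rule that zeroes active weights after a false positive, and the helper names `predict`, `run_online`, and `make_examples`.

```python
import random


def predict(weights, x, threshold):
    """Linear-threshold prediction: 1 if the active weights sum to at least the threshold."""
    return 1 if sum(w for w, xi in zip(weights, x) if xi) >= threshold else 0


def run_online(examples, n, alpha=2.0):
    """Process (x, label) pairs in order; return (final weights, mistake count).

    Assumed parameters (not taken from the abstract): all weights start
    at 1, the threshold is n, and the promotion factor is alpha.
    """
    weights = [1.0] * n
    threshold = float(n)
    mistakes = 0
    for x, label in examples:
        if predict(weights, x, threshold) == label:
            continue  # hypothesis is unchanged when the prediction is correct
        mistakes += 1
        if label == 1:
            # False negative: multiply the weights of active attributes by alpha.
            weights = [w * alpha if xi else w for w, xi in zip(weights, x)]
        else:
            # False positive: no active attribute can be relevant to a
            # monotone disjunction, so their weights are set to zero.
            weights = [0.0 if xi else w for w, xi in zip(weights, x)]
    return weights, mistakes


def make_examples(n, relevant, count, seed=0):
    """Hypothetical data generator: sparse random examples labelled by
    the disjunction of the attributes whose indices are in `relevant`."""
    rng = random.Random(seed)
    examples = []
    for _ in range(count):
        x = [1 if rng.random() < 0.1 else 0 for _ in range(n)]
        examples.append((x, 1 if any(x[i] for i in relevant) else 0))
    return examples


if __name__ == "__main__":
    n, relevant = 200, [3, 57]  # 2 relevant attributes, 198 irrelevant ones
    _, mistakes = run_online(make_examples(n, relevant, 2000), n)
    print("mistakes:", mistakes)
```

Under these assumed parameters, a counting argument on the total weight bounds the number of mistakes on a k-attribute monotone disjunction by 2k⌈log₂ n⌉ + 1, regardless of how many examples are presented. This is consistent with the logarithmic dependence on the number of irrelevant attributes described above, though the constants in the paper's own analysis may differ.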
