Learning nested differences of intersection-closed concept classes

This paper introduces a new framework for constructing learning algorithms. Our methods involve master algorithms that use learning algorithms for intersection-closed concept classes as subroutines. For example, we give a master algorithm capable of learning any concept class whose members can be expressed as nested differences (e.g., c1 − (c2 − (c3 − (c4 − c5)))) of concepts from an intersection-closed class. We show that our algorithms are optimal or nearly optimal with respect to several different criteria. These criteria include: the number of examples needed to produce a good hypothesis with high confidence, the worst-case total number of mistakes made, and the expected number of mistakes made in the first t trials.
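To make the hypothesis class concrete, the following minimal Python sketch uses axis-parallel boxes, a standard intersection-closed class whose closure operator maps a finite sample to the smallest enclosing box. The names Box, nested_difference_predict, and fit_nested_difference are illustrative choices, not identifiers from the paper, and the layer-peeling loop is only a rough rendering of the closure-based idea, not the authors' exact master algorithm.

    # Minimal sketch (not the paper's pseudocode): nested differences of
    # axis-parallel boxes, a standard intersection-closed class whose closure
    # operator maps a finite sample to the smallest box containing it.
    # Box, nested_difference_predict and fit_nested_difference are
    # illustrative names, not identifiers from the paper.

    from dataclasses import dataclass
    from typing import List, Sequence, Tuple

    Point = Tuple[float, ...]

    @dataclass
    class Box:
        lower: Point
        upper: Point

        def contains(self, x: Point) -> bool:
            return all(lo <= xi <= hi
                       for lo, xi, hi in zip(self.lower, x, self.upper))

        @staticmethod
        def closure(points: Sequence[Point]) -> "Box":
            # Smallest enclosing box: componentwise min / max.
            dim = len(points[0])
            lows = tuple(min(p[i] for p in points) for i in range(dim))
            highs = tuple(max(p[i] for p in points) for i in range(dim))
            return Box(lows, highs)

    def nested_difference_predict(layers: List[Box], x: Point) -> bool:
        # x lies in c1 - (c2 - (c3 - ...)) exactly when the maximal prefix
        # of layers that all contain x has odd length.
        depth = 0
        for c in layers:
            if not c.contains(x):
                break
            depth += 1
        return depth % 2 == 1

    def fit_nested_difference(sample: Sequence[Tuple[Point, bool]],
                              max_depth: int = 32) -> List[Box]:
        # Peel off layers: layer i is the closure of the examples that reach
        # depth i and carry the label expected there (positive at odd depth).
        layers: List[Box] = []
        remaining = list(sample)
        want_positive = True
        # The depth cap guards against contradictorily labeled samples.
        while remaining and len(layers) < max_depth:
            target = [x for x, y in remaining if y == want_positive]
            if not target:
                break
            layer = Box.closure(target)
            layers.append(layer)
            # Only examples inside this layer are passed to the next depth,
            # where the expected label flips.
            remaining = [(x, y) for x, y in remaining if layer.contains(x)]
            want_positive = not want_positive
        return layers

On data actually generated by a nested difference of boxes, each layer's closure stays inside the previous one, so the fitted hypothesis has the nested form studied in the paper; the mistake bounds and sample-size bounds quoted above are proved in the paper for the authors' master algorithms, not for this simplified sketch.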
