Asking questions to minimize errors

A number of efficient learning algorithms achieve exact identification of an unknown function from some class using membership and equivalence queries. Using a standard transformation, such algorithms can easily be converted to on-line learning algorithms that use membership queries. Under such a transformation, the number of equivalence queries made by the query algorithm directly corresponds to the number of mistakes made by the on-line algorithm. In this paper we consider several of the natural classes known to be learnable in this setting, and investigate the minimum number of equivalence queries with accompanying counterexamples (or, equivalently, the minimum number of mistakes in the on-line model) that can be made by a learning algorithm that uses a polynomial number of membership queries and polynomial computation time. We are able both to reduce the number of equivalence queries used by the previous algorithms and often to prove matching lower bounds. As an example, consider the class of DNF formulas over n variables with at most k = O(log n) terms. Previously, the algorithm of Blum and Rudich provided the best known upper bound of 2^{O(k)} log n on the number of equivalence queries needed for exact identification. We greatly improve on this upper bound, showing that exactly k counterexamples are needed if the learner knows k a priori, and exactly k+1 counterexamples are needed if it does not. This exactly matches known lower bounds of Bshouty and Cleve. For many of our results we obtain a complete characterization of the trade-off between the number of membership and equivalence queries needed for exact identification. The classes we consider here are monotone DNF formulas, Horn sentences, O(log n)-term DNF formulas, read-k sat-j DNF formulas, read-once formulas over various bases, and deterministic finite automata.
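The membership/equivalence-query model referred to above can be illustrated with Angluin's classic learner for monotone DNF, one of the classes listed. The sketch below is ours, not the paper's refined algorithm: the toy target is hypothetical, and the equivalence oracle is simulated by brute force over all assignments (feasible only for tiny n). It extracts one new term of the target per counterexample:

```python
from itertools import product

def learn_monotone_dnf(n, member):
    """Angluin-style exact learning of a monotone DNF target over n
    variables, using membership queries (`member`) and equivalence
    queries. Terms are frozensets of variable indices; the hypothesis
    is their OR."""
    hypothesis = set()          # empty hypothesis = constant false

    def h_eval(x):
        return any(all(x[i] for i in t) for t in hypothesis)

    def equivalence():
        # Exhaustive equivalence oracle (fine for tiny n): return a
        # counterexample, or None when the hypothesis is exact.
        for x in product([0, 1], repeat=n):
            if bool(h_eval(x)) != bool(member(x)):
                return list(x)
        return None

    while True:
        x = equivalence()
        if x is None:
            return hypothesis
        # Every term we add is an implicant of the target, so the
        # hypothesis never over-covers and each counterexample is a
        # positive example. Shrink it to a minimal positive example
        # using membership queries.
        for i in range(n):
            if x[i]:
                x[i] = 0
                if not member(tuple(x)):
                    x[i] = 1    # variable i is essential; restore it
        hypothesis.add(frozenset(i for i in range(n) if x[i]))

# Hypothetical target (x0 AND x1) OR x2 over n = 4 variables:
learned = learn_monotone_dnf(4, lambda x: (x[0] and x[1]) or x[2])
```

Each counterexample yields one term and a final query confirms the hypothesis, so a k-term target costs k+1 equivalence queries here, which matches the flavor of the counterexample bounds discussed in the abstract.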

[1] Leonard Pitt et al. Exact learning of read-k disjoint DNF and not-so-disjoint DNF. COLT '92, 1992.

[2] J. A. Bondy et al. Graph Theory with Applications. 1978.

[3] Nader H. Bshouty et al. On the exact learning of formulas in parallel. Proceedings of the 33rd Annual Symposium on Foundations of Computer Science, 1992.

[4] Moni Naor et al. Small-bias probability spaces: efficient constructions and applications. STOC '90, 1990.

[5] Nader H. Bshouty. Exact learning via the Monotone theory. Proceedings of the 34th Annual IEEE Symposium on Foundations of Computer Science, 1993.

[6] Leslie G. Valiant. A theory of the learnable. CACM, 1984.

[7] Marek Karpinski et al. Learning read-once formulas with queries. JACM, 1993.

[8] H. Aizenstein et al. Exact learning of read-twice DNF formulas. Proceedings of the 32nd Annual Symposium on Foundations of Computer Science, 1991.

[9] Lisa Hellerstein et al. Learning Boolean read-once formulas with arbitrary symmetric and constant fan-in gates. COLT '92, 1992.

[10] Lisa Hellerstein et al. Learning Arithmetic Read-Once Formulas. SIAM J. Comput., 1995.

[11] Thomas R. Hancock. Learning 2µ DNF Formulas and kµ Decision Trees. COLT '91, 1991.

[12] Avrim Blum et al. Fast learning of k-term DNF formulas with queries. STOC '92, 1992.

[13] Stephen R. Schach et al. Learning switch configurations. COLT '90, 1990.

[14] Dana Angluin. Learning Regular Sets from Queries and Counterexamples. Inf. Comput., 1987.

[15] Ronald L. Rivest et al. Inference of finite automata using homing sequences. STOC '89, 1989.

[16] David Haussler et al. Equivalence of models for polynomial learnability. COLT '88, 1988.

[17] N. Littlestone. Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm. 28th Annual Symposium on Foundations of Computer Science, 1987.

[18] Lisa Hellerstein et al. Learning arithmetic read-once formulas. STOC '92, 1992.