Complexity theoretic hardness results for query learning

Abstract. We investigate the complexity of learning in the well-studied model in which the learning algorithm may ask membership and equivalence queries. While complexity-theoretic techniques have previously been used to prove hardness results in various learning models, these techniques are typically not strong enough to apply when the learning algorithm may make membership queries. We develop a general technique for proving hardness results for learning with membership and equivalence queries (and for more general query models). We apply the technique to show that, assuming $\mathrm{NP} \neq \mathrm{co\text{-}NP}$, no polynomial-time membership and (proper) equivalence query algorithms exist for exactly learning read-thrice DNF formulas, unions of $k \ge 3$ halfspaces over the Boolean domain, or some other related classes. Our hardness results are representation dependent, and do not preclude the existence of representation-independent algorithms.

The general technique introduces the representation problem for a class F of representations (e.g., formulas), which is naturally associated with the learning problem for F. This problem is related to the structural question of how to characterize the functions representable by formulas in F, and is a generalization of standard complexity problems such as Satisfiability. While in general the representation problem is in $\Sigma^{\mathrm{P}}_2$, we present a theorem demonstrating that for "reasonable" classes F, the existence of a polynomial-time membership and equivalence query algorithm for exactly learning F implies that the representation problem for F is in fact in co-NP. The theorem is applied to prove hardness results such as the ones mentioned above, by showing that the representation problem for specific classes of formulas is NP-hard.
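To make the quantifier structure behind the $\Sigma^{\mathrm{P}}_2$ upper bound concrete, the following is a minimal sketch of the representation problem, assuming the input h is given in some fixed general representation class (for instance, Boolean circuits or DNF formulas over variables $x_1, \dots, x_n$); the paper's exact formulation may differ in these details.

\[
\mathrm{REP}(F) \;=\; \{\, h \;:\; \exists\, g \in F \;\; \forall\, x \in \{0,1\}^n \;\; g(x) = h(x) \,\}.
\]

Guessing a candidate g in F and verifying the universally quantified equivalence with a co-NP predicate gives the general $\Sigma^{\mathrm{P}}_2$ upper bound. Under the theorem sketched in the abstract, a polynomial-time membership and proper equivalence query learner for F would place $\mathrm{REP}(F)$ in co-NP, so NP-hardness of $\mathrm{REP}(F)$ for a specific class would imply $\mathrm{NP} = \mathrm{co\text{-}NP}$; the hardness results follow by contraposition.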
