Learnability of DNF with representation-specific queries

We study the problem of PAC learning the class of DNF formulas with a type of natural pairwise query specific to the DNF representation. Specifically, given a pair of positive examples from a polynomial-sized sample, we consider Boolean queries that ask whether the two examples satisfy at least one term in common in the target DNF, and numerical queries that ask how many terms the two examples satisfy in common. We provide both positive and negative results for learning with these queries under both uniform and general distributions. For Boolean queries, we show that learning an arbitrary DNF target under an arbitrary distribution is no easier than in the traditional PAC model. On the positive side, we show that under the uniform distribution we can properly learn any DNF formula with O(log n) relevant variables, any DNF formula in which each variable appears in at most O(log n) terms, and any DNF formula having at most 2^{O(√log n)} terms. Under general distributions, we show that 2-term DNFs are efficiently properly learnable, as are disjoint DNFs. For numerical queries, we show that arbitrary DNF formulas can be learned under the uniform distribution; in the process, we give an algorithm for learning a sum of monotone terms from labeled data alone. Numerical queries also allow us to properly learn, under arbitrary distributions, any DNF with O(log n) relevant variables, any DNF having O(log n) terms, and any DNF in which each example satisfies at most O(1) terms. We also consider two generalizations of the query: allowing the algorithm to ask the query about an arbitrary number of examples from the sample at once (rather than just two), and allowing it to ask the query about examples of its own construction. We show that both generalizations permit efficient proper learning of arbitrary DNF formulas under arbitrary distributions.
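To make the query model concrete, here is a minimal Python sketch of the two pairwise queries, plus the subset generalization mentioned above. It assumes a DNF target represented as a list of terms, each term a mapping from variable index to the required bit (1 for x_i, 0 for its negation); the function names (satisfies_term, common_terms, boolean_query, numerical_query, subset_query) are illustrative choices for this sketch, not identifiers from the paper.

```python
# Sketch of the representation-specific query model, under the assumed
# encoding: a DNF is a list of terms; each term maps variable index -> bit.

def satisfies_term(x, term):
    """True iff assignment x (a tuple of 0/1 values) satisfies every literal in term."""
    return all(x[i] == bit for i, bit in term.items())

def common_terms(x, y, dnf):
    """Terms of the target DNF satisfied by both positive examples x and y."""
    return [t for t in dnf if satisfies_term(x, t) and satisfies_term(y, t)]

def boolean_query(x, y, dnf):
    """Boolean query: do x and y satisfy at least one term of the target in common?"""
    return len(common_terms(x, y, dnf)) > 0

def numerical_query(x, y, dnf):
    """Numerical query: how many terms of the target do x and y satisfy in common?"""
    return len(common_terms(x, y, dnf))

def subset_query(xs, dnf):
    """Generalization: do all examples in xs satisfy some single term in common?"""
    return any(all(satisfies_term(x, t) for x in xs) for t in dnf)

# Example target: (x0 AND x1) OR (NOT x2), over 3 variables.
target = [{0: 1, 1: 1}, {2: 0}]
x, y = (1, 1, 0), (1, 1, 1)                # both are positive examples
assert boolean_query(x, y, target)         # they share the term x0 AND x1
assert numerical_query(x, y, target) == 1  # x also satisfies NOT x2; y does not
assert subset_query([x, y], target)
```

In the learning setting, of course, the target DNF is hidden and only the query answers are available to the algorithm; the sketch simply spells out what each query returns.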
