Learning DNF Expressions from Fourier Spectrum

Since its introduction by Valiant in 1984, PAC learning of DNF expressions has remained one of the central problems in learning theory. We consider this problem in the setting where the underlying distribution is uniform or, more generally, a product distribution. Kalai, Samorodnitsky and Teng (2009) showed that in this setting a DNF expression can be efficiently approximated from its "heavy" low-degree Fourier coefficients alone. This is in contrast to previous approaches, which used boosting and therefore needed the Fourier coefficients of the target function modified by various distributions. This property is crucial for learning DNF expressions over smoothed product distributions, a learning model introduced by Kalai et al. (2009) and inspired by the seminal smoothed analysis model of Spielman and Teng (2001). We introduce a new approach to learning (or approximating) polynomial threshold functions, based on creating a function with range [-1,1] that approximately agrees with the unknown function on its low-degree Fourier coefficients. We then describe conditions under which this is sufficient for learning polynomial threshold functions. Our approach yields a new, simple algorithm for approximating any polynomial-size DNF expression from its "heavy" low-degree Fourier coefficients alone. Our algorithm greatly simplifies the proof of learnability of DNF expressions over smoothed product distributions. We also describe an application of our algorithm to learning monotone DNF expressions over product distributions. Building on the work of Servedio (2001), we give an algorithm that runs in time $\poly((s \cdot \log{(s/\eps)})^{\log{(s/\eps)}}, n)$, where $s$ is the size of the target DNF expression and $\eps$ is the accuracy parameter. This improves on the $\poly((s \cdot \log{(ns/\eps)})^{\log{(s/\eps)} \cdot \log{(1/\eps)}}, n)$ bound of Servedio (2001).
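The central idea above, building a function with range [-1,1] that approximately agrees with the target on a set of Fourier coefficients, can be illustrated with a small brute-force sketch. This is an illustration under stated assumptions rather than the algorithm as specified in the paper: it works over the uniform distribution on $\{-1,1\}^n$, enumerates all $2^n$ points (so it is only feasible for tiny $n$), computes coefficients exactly instead of estimating them from samples, and uses a simple "correct the worst coefficient, then clip" update. The function names, the toy DNF, the encoding of true as $+1$, and the thresholds `theta` and `gamma` are all hypothetical choices made for the example.

```python
import itertools
import numpy as np

def all_points(n):
    """All 2^n points of {-1, +1}^n, one per row."""
    return np.array(list(itertools.product([-1, 1], repeat=n)))

def character(points, subset):
    """Parity chi_S(x) = prod_{i in S} x_i evaluated on every row."""
    if not subset:
        return np.ones(len(points))
    return np.prod(points[:, list(subset)], axis=1)

def fourier_coefficient(values, points, subset):
    """Exact coefficient E_x[h(x) * chi_S(x)] under the uniform distribution."""
    return float(np.mean(values * character(points, subset)))

def low_degree_subsets(n, degree):
    """All index sets S with |S| <= degree."""
    for k in range(degree + 1):
        yield from itertools.combinations(range(n), k)

def fit_bounded_function(target_coeffs, points, gamma, max_steps=10_000):
    """Values of a [-1,1]-valued function whose coefficients on the
    given subsets are all within gamma of target_coeffs."""
    g = np.zeros(len(points))  # unclipped running sum of corrections
    for _ in range(max_steps):
        clipped = np.clip(g, -1.0, 1.0)  # projection onto range [-1, 1]
        worst, gap = None, 0.0
        for subset, coeff in target_coeffs.items():
            diff = coeff - fourier_coefficient(clipped, points, subset)
            if abs(diff) > abs(gap):
                worst, gap = subset, diff
        if abs(gap) <= gamma:
            return clipped
        # Correct the worst-matching coefficient of the clipped function.
        g = g + gap * character(points, worst)
    raise RuntimeError("no gamma-close function found within max_steps")

# Toy target (hypothetical): f = (x0 AND x1) OR (x2 AND x3), true encoded as +1.
n, degree, theta, gamma = 4, 2, 0.1, 0.05
points = all_points(n)
bits = points == 1
f = np.where((bits[:, 0] & bits[:, 1]) | (bits[:, 2] & bits[:, 3]), 1.0, -1.0)

# "Heavy" low-degree coefficients: degree <= 2 and magnitude >= theta.
heavy = {}
for S in low_degree_subsets(n, degree):
    c = fourier_coefficient(f, points, S)
    if abs(c) >= theta:
        heavy[S] = c

approx = fit_bounded_function(heavy, points, gamma)
print("L1 distance E|f - approx|:", float(np.mean(np.abs(f - approx))))
```

The sketch only guarantees agreement on the selected coefficients up to `gamma`; whether that agreement implies a small $L_1$ distance to the target depends on the conditions the paper describes, which the code does not check.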

[1]  Yoav Freund,et al.  Boosting a weak learning algorithm by majority , 1990, COLT '90.

[2]  Rocco A. Servedio,et al.  Nearly Optimal Solutions for the Chow Parameters Problem and Low-Weight Approximation of Halfspaces , 2012, J. ACM.

[3]  Linda Sellie,et al.  Exact learning of random DNF over the uniform distribution , 2009, STOC '09.

[4]  Leonid A. Levin,et al.  A hard-core predicate for all one-way functions , 1989, STOC '89.

[5]  Jeffrey C. Jackson,et al.  An efficient membership-query algorithm for learning DNF with respect to the uniform distribution , 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.

[6]  Ryan O'Donnell,et al.  Learning DNF from random walks , 2003, 44th Annual IEEE Symposium on Foundations of Computer Science, 2003. Proceedings.

[7]  Hans Ulrich Simon,et al.  On restricted-focus-of-attention learnability of Boolean functions , 1996, COLT '96.

[8]  Vitaly Feldman,et al.  A Complete Characterization of Statistical Query Learning with Applications to Evolvability , 2009, 2009 50th Annual IEEE Symposium on Foundations of Computer Science.

[9]  Rocco A. Servedio,et al.  Learning DNF in time $2^{\tilde{O}(n^{1/3})}$ , 2004, J. Comput. Syst. Sci.

[10]  Karsten A. Verbeurgt Learning DNF under the uniform distribution in quasi-polynomial time , 1990, COLT '90.

[11]  Eyal Kushilevitz,et al.  Learning decision trees using the Fourier spectrum , 1991, STOC '91.

[12]  Pavel Pudlák,et al.  On the computational power of depth 2 circuits with threshold and modulo gates , 1994, STOC '94.

[13]  Russell Impagliazzo,et al.  Hard-core distributions for somewhat hard problems , 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.

[14]  Ryan O'Donnell,et al.  The Chow Parameters Problem , 2011, SIAM J. Comput..

[15]  Shang-Hua Teng,et al.  Smoothed analysis of algorithms: why the simplex algorithm usually takes polynomial time , 2001, STOC '01.

[16]  Noam Nisan,et al.  Constant depth circuits, Fourier transform, and learnability , 1989, 30th Annual Symposium on Foundations of Computer Science.

[17]  Akira Maruoka,et al.  Learning Monotone Log-Term DNF Formulas under the Uniform Distribution , 2000, Theory of Computing Systems.

[18]  Rocco A. Servedio On Learning Monotone DNF under Product Distributions , 2001, COLT/EuroCOLT.

[19]  Adam Tauman Kalai,et al.  The Hebrew University , 1998 .

[20]  Nathan Linial,et al.  The influence of variables on Boolean functions , 1988, [Proceedings 1988] 29th Annual Symposium on Foundations of Computer Science.

[21]  Rocco A. Servedio,et al.  Learning DNF in time , 2001, STOC '01.

[22]  Adam Tauman Kalai,et al.  Agnostically learning decision trees , 2008, STOC.

[23]  Yishay Mansour,et al.  An $O(n^{\log \log n})$ learning algorithm for DNF under the uniform distribution , 1992, COLT '92.

[24]  M. Bellare The spectral norm of finite functions , 1991 .

[25]  Vitaly Feldman Attribute-Efficient and Non-adaptive Learning of Parities and DNF Expressions , 2007, J. Mach. Learn. Res..

[26]  Rocco A. Servedio,et al.  Boosting and Hard-Core Set Construction , 2003, Machine Learning.

[27]  Madhur Tulsiani,et al.  Regularity, Boosting, and Efficiently Simulating Every High-Entropy Distribution , 2009, 2009 24th Annual IEEE Conference on Computational Complexity.

[28]  Sean W. Smith,et al.  Improved learning of AC0 functions , 1991, COLT '91.

[29]  Leslie G. Valiant,et al.  On the learnability of Boolean formulae , 1987, STOC.

[30]  Adam Tauman Kalai,et al.  Reliable Agnostic Learning , 2009, COLT.

[31]  Yishay Mansour,et al.  An $O(n^{\log \log n})$ Learning Algorithm for DNF under the Uniform Distribution , 1995, J. Comput. Syst. Sci.

[32]  Nader H. Bshouty,et al.  More efficient PAC-learning of DNF with membership queries under the uniform distribution , 1999, COLT '99.

[33]  Leslie G. Valiant,et al.  A theory of the learnable , 1984, STOC '84.

[34]  Nader H. Bshouty,et al.  On the Fourier spectrum of monotone functions , 1996, JACM.

[35]  Adam R. Klivans,et al.  Learning DNF in time $2^{\tilde{O}(n^{1/3})}$ , 2001, STOC 2001.

[36]  Ryan O'Donnell,et al.  The chow parameters problem , 2008, SIAM J. Comput..

[37]  Leonard Pitt,et al.  On the learnability of disjunctive normal form formulas , 2004, Machine Learning.

[38]  Rocco A. Servedio,et al.  Learning random monotone DNF , 2008, Discret. Appl. Math..

[39]  Vitaly Feldman,et al.  On Using Extended Statistical Queries to Avoid Membership Queries , 2001, J. Mach. Learn. Res.

[40]  Yishay Mansour,et al.  Weakly learning DNF and characterizing statistical query learning using Fourier analysis , 1994, STOC '94.