On the Fourier spectrum of symmetric Boolean functions with applications to learning symmetric juntas

We study the following question: what is the smallest t such that every symmetric Boolean function on k variables that is neither a constant nor a parity function has a non-zero Fourier coefficient of order at least 1 and at most t? We exclude the constant functions, for which no such t exists, and the parity functions, for which t has to be k. Let τ(k) be the smallest such t. The main contribution of this paper is a proof of the following self-similar nature of this question: if τ(l) ≤ s, then for any ε > 0 and for k ≥ k_0(l, ε), τ(k) ≤ ((s + 1)/(l + 1) + ε)k. Coupling this result with a computer-based search which establishes τ(30) = 2, one obtains that for large enough k, τ(k) ≤ 3k/31. The motivation for our work is to understand the complexity of learning symmetric juntas. A k-junta is a Boolean function of n variables that depends only on an unknown subset of k variables. If f is symmetric in the variables it depends on, it is called a symmetric k-junta. Our results imply an algorithm to learn the class of symmetric k-juntas, in the uniform PAC learning model, in time approximately n^(3k/31). This improves on a result of Mossel, O'Donnell and Servedio [2003], who show that symmetric k-juntas can be learned in time n^(2k/3) (the main result in [13] is much more general, giving a bound of n^(0.7k) for learning juntas). Technically, the study of τ(k) is equivalent to the study of 0/1 solutions of a system of Diophantine equations involving binomial coefficients. As a first step, we simplify these Diophantine equations by moving to a representation of Boolean functions which is equivalent to their Fourier representation but seems much simpler for the application of number-theoretic tools. Once this is done, we reduce these equations modulo carefully chosen prime numbers to get a simpler system of equations which we can analyze. Finally, we combine the information about the equations over the finite fields in a combinatorial manner to deduce the nature of the 0/1 solutions.
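To make the definition of τ(k) concrete, the following is a minimal brute-force sketch in Python. It is not the computer search used in the paper, and it is only practical for small k. It assumes a symmetric function is given by a 0/1 tuple g of length k + 1, where g[w] is the value on inputs of Hamming weight w. The names `order_sum`, `lowest_order`, and `tau` are hypothetical. The sketch relies on the standard fact that, for a symmetric function, every Fourier coefficient of order j equals 2^(-k) times the sum over w of g[w] times a signed count of the weight-w inputs.

```python
from itertools import product
from math import comb


def order_sum(k, j, w):
    """Sum of (-1)^(x_1 + ... + x_j) over all x in {0,1}^k of Hamming weight w.

    An input with i ones among the first j coordinates contributes (-1)^i,
    and there are comb(j, i) * comb(k - j, w - i) such inputs.
    """
    return sum((-1) ** i * comb(j, i) * comb(k - j, w - i)
               for i in range(min(j, w) + 1))


def lowest_order(g):
    """Smallest j >= 1 with a non-zero order-j Fourier coefficient, or None.

    g is a 0/1 tuple of length k + 1 describing a symmetric function.
    The 2^(-k) normalisation is dropped because only zero/non-zero matters.
    """
    k = len(g) - 1
    for j in range(1, k + 1):
        if sum(g[w] * order_sum(k, j, w) for w in range(k + 1)) != 0:
            return j
    return None  # only constant functions reach this point


def tau(k):
    """Brute-force tau(k) over all 2^(k+1) symmetric functions; needs k >= 2."""
    parity = tuple(w % 2 for w in range(k + 1))
    co_parity = tuple(1 - b for b in parity)
    worst = 0
    for g in product((0, 1), repeat=k + 1):
        # Skip the excluded cases: constants, parity, and negated parity.
        if len(set(g)) == 1 or g in (parity, co_parity):
            continue
        worst = max(worst, lowest_order(g))
    return worst


if __name__ == "__main__":
    for k in range(2, 9):
        print(k, tau(k))
```

For example, `tau(3)` evaluates to 2: the not-all-equal function on three bits has vanishing order-1 coefficients but a non-zero order-2 coefficient. The enumeration doubles with each increment of k, so a value such as τ(30) would need a far more careful search than this sketch.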

[1]  Jeffrey C. Jackson, et al. An efficient membership-query algorithm for learning DNF with respect to the uniform distribution, 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.

[2]  Karsten A. Verbeurgt. Learning DNF under the uniform distribution in quasi-polynomial time, 1990, COLT '90.

[3]  Leslie G. Valiant, et al. A theory of the learnable, 1984, CACM.

[4]  Manfred K. Warmuth, et al. Learning integer lattices, 1990, COLT '90.

[5]  Pat Langley, et al. Selection of Relevant Features and Examples in Machine Learning, 1997, Artif. Intell.

[6]  Karsten A. Verbeurgt. Learning Sub-classes of Monotone DNF on the Uniform Distribution, 1998, ALT.

[7]  Avrim Blum, et al. Relevant Examples and Relevant Features: Thoughts from Computational Learning Theory, 1994.

[8]  Nader H. Bshouty, et al. More efficient PAC-learning of DNF with membership queries under the uniform distribution, 2004, J. Comput. Syst. Sci.

[9]  Serge Tabachnikov, et al. Arithmetical properties of binomial coefficients, 2007.

[10]  Yishay Mansour, et al. An O(n^(log log n)) Learning Algorithm for DNF under the Uniform Distribution, 1995, J. Comput. Syst. Sci.

[11]  Noam Nisan, et al. Constant depth circuits, Fourier transform, and learnability, 1993, JACM.

[12]  Joachim von zur Gathen, et al. Polynomials with two values, 1997, Comb.

[13]  Ryan O'Donnell, et al. Learning juntas, 2003, STOC '03.

[14]  Noam Nisan, et al. Constant depth circuits, Fourier transform, and learnability, 1989, 30th Annual Symposium on Foundations of Computer Science.

[15]  Yishay Mansour, et al. An O(n^(log log n)) learning algorithm for DNF under the uniform distribution, 1992, COLT '92.

[16]  Richard J. Lipton, et al. Cryptographic Primitives Based on Hard Learning Problems, 1993, CRYPTO.

[17]  Klaas Pieter Hart, et al. Open Problems, 2022, Dimension Groups and Dynamical Systems.