Testing juntas nearly optimally

A function on n variables is called a k-junta if it depends on at most k of its variables. In this article, we show that it is possible to test whether a function is a k-junta or is "far" from being a k-junta with O(k/ε + k log k) queries, where ε is the approximation parameter. This result improves on the previous best upper bound of O(k^{3/2})/ε queries and is asymptotically optimal, up to a logarithmic factor. We obtain the improved upper bound by introducing a new algorithm with one-sided error for testing juntas. Notably, the algorithm is a valid junta tester under very general conditions: it holds for functions with arbitrary finite domains and ranges, and it holds under any product distribution over the domain. A key component of the analysis of the new algorithm is a new structural result on juntas: roughly, we show that if a function f is "far" from being a k-junta, then f is "far" from being determined by k parts in a random partition of the variables. The structural lemma is proved using the Efron-Stein decomposition method.
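
To make the definition of a k-junta concrete, the following minimal sketch checks the property by brute force for a Boolean function on a small number of variables. It is not the testing algorithm of the article: it reads all 2^n inputs, whereas the tester uses a number of queries independent of n. The names `relevant_variables`, `is_k_junta`, `xor_of_first_two`, and `majority_of_three` are hypothetical, chosen for illustration only.

```python
from itertools import product


def relevant_variables(f, n):
    """Return the indices i for which flipping x_i changes f on some input.

    Brute force over all 2^n Boolean inputs; only feasible for small n.
    """
    relevant = set()
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if i in relevant:
                continue
            y = list(x)
            y[i] ^= 1  # flip the i-th coordinate
            if f(x) != f(tuple(y)):
                relevant.add(i)
    return relevant


def is_k_junta(f, n, k):
    """A function is a k-junta if it depends on at most k of its n variables."""
    return len(relevant_variables(f, n)) <= k


# Illustrative functions (assumptions, not taken from the article).
def xor_of_first_two(x):
    return x[0] ^ x[1]


def majority_of_three(x):
    return int(x[0] + x[1] + x[2] >= 2)


if __name__ == "__main__":
    print(is_k_junta(xor_of_first_two, 5, 2))
    print(is_k_junta(xor_of_first_two, 5, 1))
    print(sorted(relevant_variables(majority_of_three, 4)))
```

The exhaustive check costs on the order of n·2^n evaluations of f, which is why a property tester that only distinguishes k-juntas from functions "far" from every k-junta is of interest.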
