Nearly Tight Bounds on ℓ1 Approximation of Self-Bounding Functions

We study the complexity of learning and approximation of self-bounding functions over the uniform distribution on the Boolean hypercube $\{0,1\}^n$. Informally, a function $f:\{0,1\}^n \rightarrow \mathbb{R}$ is self-bounding if for every $x \in \{0,1\}^n$, $f(x)$ upper bounds the sum of all $n$ marginal decreases in the value of the function at $x$. Self-bounding functions include such well-known classes as submodular and fractionally subadditive (XOS) functions. They were introduced by Boucheron et al. in the context of concentration of measure inequalities. Our main result is a nearly tight $\ell_1$-approximation of self-bounding functions by low-degree juntas. Specifically, every self-bounding function can be $\epsilon$-approximated in $\ell_1$ by a polynomial of degree $\tilde{O}(1/\epsilon)$ over $2^{\tilde{O}(1/\epsilon)}$ variables. Both the degree and the junta size are optimal up to logarithmic terms. Previously, the best known bounds were $O(1/\epsilon^{2})$ on the degree and $2^{O(1/\epsilon^2)}$ on the number of variables (Feldman and Vondrák 2013). These results lead to improved, and in several cases almost tight, bounds for PAC and agnostic learning of submodular, XOS and self-bounding functions. In particular, assuming hardness of learning juntas, we show that PAC and agnostic learning of self-bounding functions have complexity $n^{\tilde{\Theta}(1/\epsilon)}$.
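To make the informal definition concrete, the following is a minimal brute-force sketch that checks the sum condition on every point of a small hypercube. It is an illustration only, not the paper's method: the helper names `marginal_decrease_sum` and `is_self_bounding` are hypothetical, the marginal decrease in coordinate $i$ is taken to be $f(x) - \min(f(x), f(x^{(i)}))$ where $x^{(i)}$ is $x$ with bit $i$ flipped, and any additional requirement that a formal definition may impose (such as a bound on each individual decrease) is not checked.

```python
from itertools import product


def marginal_decrease_sum(f, x):
    """Sum over coordinates i of f(x) - min(f(x), f(x with bit i flipped)).

    Assumption: x is a tuple of 0/1 values and f maps such tuples to reals.
    """
    fx = f(x)
    total = 0.0
    for i in range(len(x)):
        # Flip bit i to obtain the neighbouring point of the hypercube.
        y = x[:i] + (1 - x[i],) + x[i + 1:]
        # The decrease is zero when flipping bit i does not lower the value.
        total += fx - min(fx, f(y))
    return total


def is_self_bounding(f, n, tol=1e-9):
    """Check, for all 2^n points, that f(x) upper bounds the summed decreases."""
    return all(
        marginal_decrease_sum(f, x) <= f(x) + tol
        for x in product((0, 1), repeat=n)
    )


def hamming_weight(x):
    # Number of ones: each set bit contributes a decrease of exactly 1.
    return sum(x)


def squared_weight(x):
    # Square of the number of ones: decreases grow too fast for the condition.
    return sum(x) ** 2
```

For example, `is_self_bounding(hamming_weight, 4)` holds with equality at every point, while `is_self_bounding(squared_weight, 3)` fails because at a point with two ones the summed decreases equal 6 but the function value is 4. The check enumerates all $2^n$ points, so it is only practical for small $n$.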

[1]  Robert E. Schapire,et al.  Efficient distribution-free learning of probabilistic concepts , 1990, Proceedings [1990] 31st Annual Symposium on Foundations of Computer Science.

[2]  Maurice Queyranne,et al.  A combinatorial algorithm for minimizing symmetric submodular functions , 1995, SODA '95.

[3]  Ryan O'Donnell,et al.  Analysis of Boolean Functions , 2014, ArXiv.

[4]  Sofya Raskhodnikova,et al.  Learning pseudo-Boolean k-DNF and submodular functions , 2013, SODA.

[5]  Vahab S. Mirrokni,et al.  Maximizing Non-Monotone Submodular Functions , 2011, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[6]  Yishay Mansour,et al.  Weakly learning DNF and characterizing statistical query learning using Fourier analysis , 1994, STOC '94.

[7]  Pravesh Kothari,et al.  Submodular functions are noise stable , 2012, SODA.

[8]  Bruce A. Reed,et al.  Concentration for self‐bounding functions and an inequality of Talagrand , 2006, Random Struct. Algorithms.

[9]  Uriel Feige,et al.  On maximizing welfare when utility functions are subadditive , 2006, STOC '06.

[10]  Jan Vondrák,et al.  Optimal Bounds on Approximation of Submodular and XOS Functions by Juntas , 2013, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.

[11]  Satoru Iwata,et al.  A combinatorial, strongly polynomial-time algorithm for minimizing submodular functions , 2000, STOC '00.

[12]  Avrim Blum.  Learning a Function of r Relevant Variables , 2003, COLT.

[13]  F. Dunstan MATROIDS AND SUBMODULAR FUNCTIONS , 1976 .

[14]  Jan Vondrák,et al.  Tight Bounds on Low-Degree Spectral Concentration of Submodular and XOS Functions , 2015, 2015 IEEE 56th Annual Symposium on Foundations of Computer Science.

[15]  Ambuj Tewari,et al.  On the Complexity of Linear Prediction: Risk Bounds, Margin Bounds, and Regularization , 2008, NIPS.

[16]  David Haussler,et al.  Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications , 1992, Inf. Comput.

[17]  Maria-Florina Balcan,et al.  Learning Valuation Functions , 2011, COLT.

[18]  R. Schapire,et al.  Toward efficient agnostic learning , 1992, COLT '92.

[19]  Pat Langley,et al.  Selection of Relevant Features and Examples in Machine Learning , 1997, Artif. Intell..

[20]  Pravesh Kothari,et al.  Representation, Approximation and Learning of Submodular Functions Using Low-rank Decision Trees , 2013, COLT.

[21]  Satoru Iwata,et al.  A combinatorial strongly polynomial algorithm for minimizing submodular functions , 2001, JACM.

[22]  Pravesh Kothari,et al.  Learning Coverage Functions , 2013, ArXiv.

[23]  S. Boucheron,et al.  A sharp concentration inequality with applications , 1999, Random Struct. Algorithms.

[24]  Ryan O'Donnell,et al.  Learning functions of k relevant variables , 2004, J. Comput. Syst. Sci..

[25]  Ehud Friedgut,et al.  Boolean Functions With Low Average Sensitivity Depend On Few Coordinates , 1998, Comb..

[26]  Rocco A. Servedio,et al.  Agnostically learning halfspaces , 2005, 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05).

[27]  Leslie G. Valiant,et al.  A theory of the learnable , 1984, STOC '84.

[28]  Li-Yang Tan,et al.  Approximate resilience, monotonicity, and the complexity of agnostic learning , 2014, SODA.

[30]  Gregory Valiant,et al.  Finding Correlations in Subquadratic Time, with Applications to Learning Parities and Juntas , 2012, 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science.

[31]  Colin McDiarmid,et al.  Concentration for self-bounding functions and an inequality of Talagrand , 2006 .

[32]  Vahab S. Mirrokni,et al.  Approximating submodular functions everywhere , 2009, SODA.

[33]  Rocco A. Servedio,et al.  Bounded Independence Fools Halfspaces , 2009, 2009 50th Annual IEEE Symposium on Foundations of Computer Science.

[34]  George L. Nemhauser,et al.  Note--On "Location of Bank Accounts to Optimize Float: An Analytic Study of Exact and Approximate Algorithms" , 1979 .

[35]  Nathan Linial,et al.  Collective coin flipping, robust voting schemes and minima of Banzhaf values , 1985, 26th Annual Symposium on Foundations of Computer Science (sfcs 1985).

[36]  Jan Vondrák,et al.  A note on concentration of submodular functions , 2010, ArXiv.

[37]  Pravesh Kothari,et al.  Learning Coverage Functions and Private Release of Marginals , 2014, COLT.

[38]  Vahab Mirrokni,et al.  Maximizing Non-Monotone Submodular Functions , 2007, FOCS 2007.

[39]  Aaron Roth,et al.  Privately releasing conjunctions and the statistical query barrier , 2010, STOC '11.

[40]  Noam Nisan,et al.  CREW PRAMS and decision trees , 1989, STOC '89.

[41]  Nathan Linial,et al.  The influence of variables on Boolean functions , 1988, [Proceedings 1988] 29th Annual Symposium on Foundations of Computer Science.

[42]  David P. Williamson,et al.  Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming , 1995, JACM.

[43]  Daniel Lehmann,et al.  Combinatorial auctions with decreasing marginal utilities , 2001, EC '01.

[44]  Adam Tauman Kalai,et al.  Agnostically learning decision trees , 2008, STOC.

[45]  Tim Roughgarden,et al.  Sketching valuation functions , 2012, SODA.

[46]  László Lovász,et al.  Submodular functions and convexity , 1982, ISMP.

[47]  G. Nemhauser,et al.  Exceptional Paper—Location of Bank Accounts to Optimize Float: An Analytic Study of Exact and Approximate Algorithms , 1977 .