Approximate $\mathbb{F}_2$-Sketching of Valuation Functions

We study the problem of constructing a linear sketch of minimum dimension that allows approximation of a given real-valued function $f \colon \mathbb{F}_2^n \rightarrow \mathbb{R}$ with small expected squared error. We develop a general theory of linear sketching for such functions and use it to analyze the sketch dimension for the most commonly studied types of valuation functions: additive, budget-additive, coverage, $\alpha$-Lipschitz submodular, and matroid rank functions. This characterizes how many bits of information about the input $x$ have to be stored so that one can compute $f$ under additive updates to the coordinates of $x$. Our results are tight in most cases, and we also give extensions to the distributional version of the problem, where the input $x \in \mathbb{F}_2^n$ is generated uniformly at random. Using known connections with dynamic streaming algorithms, both the upper and lower bounds on dimension obtained in our work extend to the space complexity of algorithms that evaluate $f(x)$ under long sequences of additive updates to the input $x$ presented as a stream. Similar results hold for simultaneous communication in a distributed setting.
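To make the notion of an $\mathbb{F}_2$-linear sketch under additive updates concrete, here is a minimal Python illustration. It is not one of the constructions from this work: the function names (`random_sketch_matrix`, `sketch`, `apply_update`) and the choice of a uniformly random sketch matrix are assumptions made purely for illustration. It shows only that a sketch $Mx$ over $\mathbb{F}_2$ can be maintained from the $k$ stored bits alone when a coordinate of $x$ is updated.

```python
import random

def random_sketch_matrix(k, n, seed=0):
    """Hypothetical helper: a k x n matrix over F_2 with uniformly random bits."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n)] for _ in range(k)]

def sketch(M, x):
    """Compute the linear sketch Mx over F_2 (each output bit is a parity of x)."""
    return [sum(m & xi for m, xi in zip(row, x)) % 2 for row in M]

def apply_update(M, s, i):
    """Update the sketch s after adding the unit vector e_i to x over F_2.

    Adding e_i flips coordinate i of x, so by linearity the sketch changes
    by column i of M. The input x itself is never consulted.
    """
    return [sj ^ row[i] for sj, row in zip(s, M)]

if __name__ == "__main__":
    n, k = 10, 4
    M = random_sketch_matrix(k, n, seed=1)
    x = [0] * n
    s = sketch(M, x)
    for i in [3, 7, 3, 1]:       # a short stream of additive updates
        x[i] ^= 1                # ground truth, kept here only for checking
        s = apply_update(M, s, i)
    print(s == sketch(M, x))     # the maintained sketch matches a fresh one
```

The sketch occupies $k$ bits regardless of the stream length, which is why bounds on the sketch dimension translate into space bounds for streaming algorithms of this form. How $f(x)$ is then estimated from the $k$ parities depends on the function class and is not shown here.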
