Approximate F_2-Sketching of Valuation Functions

We study the problem of constructing a linear sketch of minimum dimension that allows approximation of a given real-valued function f : F_2^n → R with small expected squared error. We develop a general theory of linear sketching for such functions and use it to analyze the sketch dimension for the most commonly studied types of valuation functions: additive, budget-additive, coverage, α-Lipschitz submodular, and matroid rank functions. This characterizes how many bits of information must be stored about the input x so that one can compute f under additive updates to its coordinates. Our results are tight in most cases, and we also give extensions to the distributional version of the problem, where the input x ∈ F_2^n is generated uniformly at random. Using known connections with dynamic streaming algorithms, both the upper and lower bounds on dimension obtained in our work extend to the space complexity of algorithms evaluating f(x) under long sequences of additive updates to the input x presented as a stream. Similar results hold for simultaneous communication in a distributed setting.

2012 ACM Subject Classification: Theory of computation → Sketching and sampling
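To make the notion of an F_2-linear sketch under additive updates concrete, here is a minimal Python sketch. It is an illustration only, not the construction analyzed in this work: the random parity rows, the dimension k, and the helper names `make_sketch_rows`, `sketch`, and `apply_flip` are all assumptions introduced for this example. It shows only that a sketch made of k parities of x can be maintained when a coordinate of x is flipped, without storing x itself; how f(x) is then estimated from those parities is not shown.

```python
import random


def make_sketch_rows(k, n, seed=0):
    """Return k random parity masks over n coordinates (hypothetical choice).

    Each mask is an n-bit integer; bit j is 1 when coordinate j takes part
    in that parity.
    """
    rng = random.Random(seed)
    return [rng.getrandbits(n) for _ in range(k)]


def sketch(rows, x):
    """Compute the k-bit sketch of x, where x is an n-bit integer.

    Bit i of the sketch is the inner product <rows[i], x> over F_2,
    i.e. the parity of the coordinates selected by rows[i].
    """
    return [bin(r & x).count("1") % 2 for r in rows]


def apply_flip(rows, s, j):
    """Update sketch s after the additive update x <- x + e_j over F_2.

    Flipping coordinate j flips sketch bit i exactly when rows[i]
    includes coordinate j, so x itself is never needed.
    """
    return [b ^ ((r >> j) & 1) for b, r in zip(s, rows)]


if __name__ == "__main__":
    n, k = 16, 8
    rows = make_sketch_rows(k, n, seed=1)

    # Process a stream of coordinate flips, keeping only the k-bit sketch.
    s = sketch(rows, 0)
    x = 0  # kept here only to check the result; a streaming algorithm would not store it
    for j in [3, 7, 3, 0, 15]:
        s = apply_flip(rows, s, j)
        x ^= 1 << j

    # The maintained sketch matches the sketch of the final input.
    assert s == sketch(rows, x)
```

Linearity is what makes this work: the sketch of x + y over F_2 equals the coordinate-wise XOR of the two sketches, so the same k bits can also be combined across parties in a simultaneous communication setting.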
