Robust lower bounds for communication and stream computation

We study the communication complexity of evaluating functions when the input data is randomly allocated (according to some known distribution) amongst two or more players, possibly with information overlap. This naturally extends previously studied variable partition models such as the best-case and worst-case partition models [32,29]. We aim to understand whether the hardness of a communication problem holds for almost every allocation of the input, rather than for just a few atypical partitions. A key application is to the heavily studied data stream model: there is a strong connection between our communication lower bounds and lower bounds in the data stream model that are "robust" to the ordering of the data. That is, we prove lower bounds for the case where the order of the items in the stream is chosen not adversarially but uniformly (or near-uniformly) from the set of all permutations. This random-order data stream model has attracted recent interest, since lower bounds in it give stronger evidence for the inherent hardness of streaming problems. Our results include the first random-partition communication lower bounds for problems including multi-party set disjointness and gap-Hamming-distance; both bounds are tight. We also extend and improve previous results [19,7] for a form of pointer jumping that is relevant to the problem of selection (in particular, median finding). Collectively, these results yield lower bounds for a variety of problems in the random-order data stream model, including estimating the number of distinct elements, approximating frequency moments, and quantile estimation.

[1] Noga Alon, et al. The space complexity of approximating the frequency moments, 1996, STOC '96.

[2] Alexandr Andoni, et al. Better Bounds for Frequency Moments in Random-Order Streams, 2008, ArXiv.

[3] Tak Wah Lam, et al. Results on Communication Complexity Classes, 1992, J. Comput. Syst. Sci.

[4] J. Ian Munro, et al. Selection and sorting with limited storage, 1978, 19th Annual Symposium on Foundations of Computer Science (sfcs 1978).

[5] Jiří Sgall, et al. An Upper Bound for a Communication Game Related to Time-Space Tradeoffs, 1995, The Mathematics of Paul Erdős I.

[6] Sanjeev Khanna, et al. Space-efficient online computation of quantile summaries, 2001, SIGMOD '01.

[7] Sudipto Guha, et al. Sketching information divergences, 2007, Machine Learning.

[8] Hartmut Klauck, et al. Interaction in quantum communication and the complexity of set disjointness, 2001, STOC '01.

[9] Ziv Bar-Yossef, et al. An information statistics approach to data stream and communication complexity, 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings.

[10] Philippe Flajolet, et al. Probabilistic Counting Algorithms for Data Base Applications, 1985, J. Comput. Syst. Sci.

[11] Sudipto Guha, et al. Approximate quantiles and the order of the stream, 2006, PODS.

[12] Andrew Chi-Chih Yao, et al. Informational complexity and the direct sum problem for simultaneous message complexity, 2001, Proceedings 2001 IEEE International Conference on Cluster Computing.

[13] Ravi Kumar, et al. The One-Way Communication Complexity of Hamming Distance, 2008, Theory Comput.

[14] David P. Woodruff. Optimal space lower bounds for all frequency moments, 2004, SODA '04.

[15] Joan Feigenbaum, et al. Graph Distances in the Data-Stream Model, 2008, SIAM J. Comput.

[16] David P. Woodruff. The average-case complexity of counting distinct elements, 2009, ICDT '09.

[17] T. S. Jayram, et al. Tight lower bounds for selection in randomly ordered streams, 2008, SODA '08.

[18] Mahesh Viswanathan, et al. An Approximate L1-Difference Algorithm for Massive Data Streams, 2002, SIAM J. Comput.

[19] Luca Trevisan, et al. Counting Distinct Elements in a Data Stream, 2002, RANDOM.

[20] Peter Bro Miltersen, et al. On data structures and asymmetric communication complexity, 1994, STOC '95.

[21] Sudipto Guha, et al. Streaming and sublinear approximation of entropy and information distances, 2005, SODA '06.

[22] André Gronemeier, et al. Asymptotically Optimal Lower Bounds on the NIH-Multi-Party Information Complexity of the AND-Function and Disjointness, 2009, STACS.

[23] Sudipto Guha, et al. Space-Efficient Sampling, 2007, AISTATS.

[24] Ravi Kumar, et al. Two applications of information complexity, 2003, STOC '03.

[25] Andrew Chi-Chih Yao, et al. Some complexity questions related to distributive computing (Preliminary Report), 1979, STOC.

[26] Thomas M. Cover, et al. Elements of Information Theory, 2005.

[27] David P. Woodruff, et al. Optimal approximations of the frequency moments of data streams, 2005, STOC '05.

[28] Prabhakar Raghavan, et al. Computing on data streams, 1999, External Memory Algorithms.

[29] Subhash Khot, et al. Near-optimal lower bounds on the multi-party communication complexity of set disjointness, 2003, 18th IEEE Annual Conference on Computational Complexity, 2003. Proceedings.

[30] B. Bollobás, et al. Extremal Graph Theory, 2013.

[31] Alfred V. Aho, et al. On notions of information transfer in VLSI circuits, 1983, STOC.

[32] Farid M. Ablayev, et al. Lower Bounds for One-Way Probabilistic Communication Complexity and Their Application to Space Complexity, 1996, Theor. Comput. Sci.

[33] Graham Cormode, et al. A near-optimal algorithm for computing the entropy of a stream, 2007, SODA '07.

[34] Sudipto Guha, et al. Lower Bounds for Quantile Estimation in Random-Order and Multi-pass Streaming, 2007, ICALP.

[35] Joan Feigenbaum, et al. Graph distances in the streaming model: the value of space, 2005, SODA '05.

[36] Sudipto Guha, et al. Stream Order and Order Statistics: Quantile Estimation in Random-Order Streams, 2009, SIAM J. Comput.

[37] Sumit Ganguly, et al. Counting distinct items over update streams, 2005, Theor. Comput. Sci.

[38] Amit Chakrabarti, et al. An Optimal Lower Bound on the Communication Complexity of Gap-Hamming-Distance, 2012, SIAM J. Comput.

[39] Erik D. Demaine, et al. Frequency Estimation of Internet Packet Streams with Limited Space, 2002, ESA.

[40] Krzysztof Onak, et al. Sketching and Streaming Entropy via Approximation Theory, 2008, 2008 49th Annual IEEE Symposium on Foundations of Computer Science.

[41] David P. Woodruff, et al. Tight lower bounds for the distinct elements problem, 2003, 44th Annual IEEE Symposium on Foundations of Computer Science, 2003. Proceedings.

[42] Sudipto Guha, et al. Revisiting the Direct Sum Theorem and Space Lower Bounds in Random Order Streams, 2009, ICALP.

[43] Srinivasan Venkatesh, et al. Lower bounds for predecessor searching in the cell probe model, 2003, J. Comput. Syst. Sci.