CS369E: Communication Complexity (for Algorithm Designers) Lecture #6: Data Structure Lower Bounds

Next we discuss how to use communication complexity to prove lower bounds on the performance of data structures, where performance means space, query time, and approximation quality. Our case study is the high-dimensional approximate nearest neighbor problem. There is a large literature on data structure lower bounds, and there are several different ways to use communication complexity to prove them; unfortunately, we will only have time to discuss one. For example, we consider only a static data structure problem, where the data structure can be queried but not modified; lower bounds for dynamic data structures tend to use somewhat different techniques. See [8, 10] for some starting points for further reading.

We focus on the approximate nearest neighbor problem for a few reasons: it is a fundamental problem that gets solved all the time (in data mining, for example); there are some non-trivial upper bounds; for certain parameter ranges, we have matching lower bounds; and the techniques used to prove these lower bounds, namely asymmetric communication complexity and reductions from the “Lopsided Disjointness” problem, are representative of work in the area.

[1]  Andrew Chi-Chih Yao, et al.  Should Tables Be Sorted?, 1981, JACM.

[2]  Andrew Chi-Chih Yao, et al.  Some complexity questions related to distributive computing (Preliminary Report), 1979, STOC.

[3]  Ronen Shaltiel, et al.  Increasing the output length of zero-error dispersers, 2012, Random Struct. Algorithms.

[4]  Peter Bro Miltersen.  Cell probe complexity - a survey, 1999.

[5]  Peter Bro Miltersen, et al.  On data structures and asymmetric communication complexity, 1994, STOC '95.

[6]  Thomas Vidick, et al.  A concentration inequality for the overlap of a vector on a large set, with application to the communication complexity of the Gap-Hamming-Distance problem, 2011, Chic. J. Theor. Comput. Sci.

[7]  Avinatan Hassidim, et al.  Derandomizing Algorithms on Product Distributions and Other Applications of Order-Based Extraction, 2010, ICS.

[8]  Alexandr Andoni, et al.  On the Optimality of the Dimensionality Reduction Method, 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[9]  Alexander A. Sherstov.  The Communication Complexity of Gap Hamming Distance, 2012, Theory Comput.

[10]  Amos Fiat, et al.  Implicit O(1) probe search, 1989, STOC '89.

[11]  Mihai Patrascu, et al.  Lower bound techniques for data structures, 2008.

[12]  S. Yen, et al.  Nearest neighbor searching in high dimensions using multiple KD-trees, 2010.

[13]  Michael L. Fredman, et al.  Storing a Sparse Table with O(1) Worst Case Access Time, 1984.

[14]  Amit Chakrabarti, et al.  An Optimal Lower Bound on the Communication Complexity of Gap-Hamming-Distance, 2012, SIAM J. Comput.