Low-contention data structures

We consider the problem of minimizing contention in static (read-only) dictionary data structures, where contention is measured with respect to a fixed query distribution by the maximum expected number of probes to any given cell. The query distribution is known by the algorithm that constructs the data structure but not by the algorithm that queries it. Assume that the dictionary has n items. When all queries in the dictionary are equiprobable, and all queries not in the dictionary are equiprobable, we show how to construct a data structure in O(n) space where queries require O(1) probes and the contention is O(1/n). Asymptotically, all of these quantities are optimal. For arbitrary query distributions, we construct a data structure in O(n) space where each query requires O(logn/loglogn) probes and the contention is O(logn/(nloglogn)). The lack of knowledge of the query distribution by the query algorithm prevents perfect load leveling in this case: for a large class of algorithms, we present a lower bound, based on VC-dimension, that shows that for a wide range of data structure problems, achieving contention even within a polylogarithmic factor of optimal requires a cell-probe complexity of @W(loglogn).

[1]  Maurice Herlihy,et al.  Low contention load balancing on large-scale multiprocessors , 1992, SPAA '92.

[2]  Friedhelm Meyer auf der Heide,et al.  How to distribute a dictionary in a complete network , 1990, STOC '90.

[3]  Larry Rudolph,et al.  A Complexity Theory of Efficient Parallel Algorithms , 1990, Theor. Comput. Sci..

[4]  Friedhelm Meyer auf der Heide,et al.  An optimal parallel dictionary , 1989, SPAA '89.

[5]  János Komlós,et al.  Storing a sparse table with O(1) worst case access time , 1982, 23rd Annual Symposium on Foundations of Computer Science (sfcs 1982).

[6]  Rasmus Pagh,et al.  Cuckoo Hashing , 2001, Encyclopedia of Algorithms.

[7]  Andrew Chi-Chih Yao,et al.  Should Tables Be Sorted? , 1981, JACM.

[8]  Nian-Feng Tzeng,et al.  Distributing Hot-Spot Addressing in Large-Scale Multiprocessors , 1987, IEEE Transactions on Computers.

[9]  W. Hoeffding Probability Inequalities for sums of Bounded Random Variables , 1963 .

[10]  Vladimir Vapnik,et al.  Chervonenkis: On the uniform convergence of relative frequencies of events to their probabilities , 1971 .

[11]  Steven Fortune,et al.  Parallelism in random access machines , 1978, STOC.

[12]  Maurice Herlihy,et al.  Contention in shared memory algorithms , 1997, J. ACM.

[13]  Ramesh Subramonian,et al.  LogP: towards a realistic model of parallel computation , 1993, PPOPP '93.

[14]  Friedhelm Meyer auf der Heide,et al.  A New Universal Class of Hash Functions and Dynamic Hashing in Real Time , 1990, ICALP.

[15]  Larry Carter,et al.  Universal Classes of Hash Functions , 1979, J. Comput. Syst. Sci..