On universal classes of fast high performance hash functions, their time-space tradeoff, and their applications

A mechanism is provided for constructing (log n)-wise independent hash functions that can be evaluated in O(1) time. A probabilistic argument shows that, for fixed ε < 1, a table of n^ε random words can be accessed by a small O(1)-time program to compute one important family of hash functions. An explicit algorithm for such a family, which achieves comparable performance for all practical purposes, is also given. A lower bound shows that any such program computing a k-wise independent family must take Ω(k/ε) time, and a probabilistic argument shows that such programs can run in O(k²/ε²) time. An immediate consequence of these constructions is that double hashing with these universal functions achieves (constant-factor) optimal time performance for suitably moderate loads. Another consequence is that a T-time PRAM algorithm for n log n processors (and n^k memory) can be emulated on an n-processor machine interconnected by an n × log n Omega network with a multiplicative penalty for total work that, with high probability, is only O(1).
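To make k-wise independence concrete, the sketch below shows the classic polynomial construction over a prime field together with its use in double hashing. This is a minimal illustration under stated assumptions, not the paper's construction: the paper's table-based families evaluate in O(1) time, whereas polynomial evaluation costs O(k). The prime, parameter names, and helper functions here are illustrative choices, not taken from the paper.

```python
import random

# Hedged sketch: the classic Wegman-Carter k-wise independent family,
# a random polynomial of degree k-1 over a prime field. This is NOT the
# paper's O(1)-time table-based construction (evaluation here is O(k));
# it only illustrates the independence property that the paper's
# families achieve in constant time.

P = (1 << 61) - 1  # a Mersenne prime; keys are assumed to lie in [0, P)

def make_kwise_hash(k, m, rng=random):
    """Draw one function from a k-wise independent family mapping into [0, m)."""
    coeffs = [rng.randrange(P) for _ in range(k)]  # k random coefficients
    def h(x):
        acc = 0
        for c in coeffs:        # Horner's rule: O(k) arithmetic operations
            acc = (acc * x + c) % P
        return acc % m          # final reduction to table size is slightly non-uniform
    return h

# Double hashing, one of the abstract's applications: probe sequence
# h1(x), h1(x)+s, h1(x)+2s, ... with a key-dependent step s. Taking m
# prime keeps every step coprime to the table size.
def double_hash_probe(x, i, h1, h2, m):
    step = 1 + h2(x) % (m - 1)  # force a nonzero step
    return (h1(x) + i * step) % m

# Example usage with illustrative parameters:
m = 101                          # prime table size
h1 = make_kwise_hash(5, m)       # two independently drawn 5-wise
h2 = make_kwise_hash(5, m)       # independent functions
probes = [double_hash_probe(42, i, h1, h2, m) for i in range(4)]
```

With m prime and the step forced nonzero, the probe sequence visits all m slots; the degree of independence k trades directly against evaluation time here, which is exactly the time-space tradeoff the paper's table-based construction improves on.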
