Dynamic Space Efficient Hashing

We consider space-efficient hash tables that can grow and shrink dynamically while always remaining highly space efficient, i.e., their space consumption stays close to the lower bound even while growing and when temporarily needed storage is taken into account. None of the traditionally used hash tables have this property. We show how known approaches like linear probing and bucket cuckoo hashing can be adapted to this scenario by subdividing them into many subtables or by using virtual memory overcommitting. However, these rather straightforward solutions suffer from slow amortized insertion times due to frequent reallocation in small increments. Our main result is the Dynamic Space Efficient Cuckoo Table (DySECT), which avoids these problems. DySECT consists of many subtables, each of which grows by doubling its size. The resulting inhomogeneity in subtable sizes is counterbalanced by the flexibility available in bucket cuckoo hashing, where each element can go to several buckets, each of which contains several cells. Experiments indicate that DySECT works well with loads up to 98%, with up to 1.9 times better performance than the next best solution. Additionally, we give a tight theoretical analysis of the load threshold of DySECT, i.e., a bound such that, with high probability, the table can be filled up to that load but not beyond it. This threshold also matches our experimental findings.
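The structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `DysectSketch`, its parameters, the random-walk displacement, the round-robin choice of which subtable to double, and the policy of growing only after an insertion fails are all assumptions made for illustration. It only shows the core idea that each key has several candidate buckets spread over subtables of unequal size, and that doubling one subtable moves only the elements stored in that subtable.

```python
import random


class DysectSketch:
    """Illustrative only: bucket cuckoo hashing over subtables that double."""

    def __init__(self, subtables=8, buckets=4, cells=4, choices=3, seed=1):
        self.T, self.B, self.H = subtables, cells, choices
        self.rng = random.Random(seed)
        self.salts = [self.rng.getrandbits(64) for _ in range(choices)]
        # tables[t][b] is a bucket: a list of at most B (key, value) pairs.
        self.tables = [[[] for _ in range(buckets)] for _ in range(subtables)]
        self.next_grow = 0  # assumed policy: double subtables round-robin
        self.size = 0

    def _hash(self, key, i):
        return hash((self.salts[i], key))

    def _slot(self, key, i):
        h = self._hash(key, i)
        t = h % self.T  # subtable index never changes for a given choice
        return t, (h // self.T) % len(self.tables[t])

    def _grow(self):
        """Double one subtable; only its own elements are moved."""
        t = self.next_grow
        self.next_grow = (t + 1) % self.T
        old = self.tables[t]
        n = len(old)
        self.tables[t] = [[] for _ in range(2 * n)]
        for b, bucket in enumerate(old):
            for key, value in bucket:
                # Find a choice that placed the key in old bucket b; its new
                # bucket is b or b + n, so no new bucket can overflow.
                for i in range(self.H):
                    h = self._hash(key, i)
                    if h % self.T == t and (h // self.T) % n == b:
                        self.tables[t][(h // self.T) % (2 * n)].append((key, value))
                        break

    def get(self, key, default=None):
        for i in range(self.H):
            t, b = self._slot(key, i)
            for k, v in self.tables[t][b]:
                if k == key:
                    return v
        return default

    def insert(self, key, value, max_steps=200):
        for i in range(self.H):  # overwrite if the key is already stored
            t, b = self._slot(key, i)
            bucket = self.tables[t][b]
            for j, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[j] = (key, value)
                    return
        item = (key, value)
        while True:
            for _ in range(max_steps):
                slots = [self._slot(item[0], i) for i in range(self.H)]
                for t, b in slots:
                    if len(self.tables[t][b]) < self.B:
                        self.tables[t][b].append(item)
                        self.size += 1
                        return
                # All candidate buckets are full: evict a random occupant
                # and continue the walk with the evicted element.
                t, b = self.rng.choice(slots)
                bucket = self.tables[t][b]
                j = self.rng.randrange(self.B)
                bucket[j], item = item, bucket[j]
            self._grow()  # walk failed: double one subtable and retry

    def load(self):
        cells = sum(len(sub) for sub in self.tables) * self.B
        return self.size / cells


table = DysectSketch()
for k in range(2000):
    table.insert(k, k * k)
print(table.get(1234), round(table.load(), 2), [len(s) for s in table.tables])
```

Because each growth step doubles a single subtable, the total capacity increases in small increments relative to the whole table, while subtable sizes never differ by more than a factor of two under the assumed round-robin policy.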
