Comparison of Hash Table Performance with Open Addressing and Closed Addressing: An Empirical Study

In this paper, we report empirical experiments on hash table performance with a large data set and compare the results of different collision resolution approaches. The results favored closed addressing over open addressing and showed linear probing to be impractical because of its poor performance. Moreover, when items are randomly distributed with keys in a large space, different hash algorithms might produce similar performance. Increasing the randomness of the keys does not improve hash table performance either; the load factor alone appears to determine the probability of collision. These findings may help programmers who design software that uses hash tables.
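As a rough illustration of the comparison described above, the following minimal sketch counts the average number of probes per successful lookup under separate chaining and under linear probing at several load factors. It is not the experimental setup of the paper: the table size, the modulo hash, the key range, and the function names `chaining_probes` and `linear_probing_probes` are all assumptions made for this example.

```python
import random


def chaining_probes(keys, m):
    """Average key comparisons per successful lookup with separate chaining."""
    buckets = [[] for _ in range(m)]
    for k in keys:
        buckets[k % m].append(k)  # assumed hash: key modulo table size
    # Position in the chain (1-based) is the number of comparisons needed.
    total = sum(buckets[k % m].index(k) + 1 for k in keys)
    return total / len(keys)


def linear_probing_probes(keys, m):
    """Average slots inspected per successful lookup with linear probing."""
    if len(keys) >= m:
        raise ValueError("linear probing needs fewer keys than slots")
    table = [None] * m
    for k in keys:
        i = k % m
        while table[i] is not None:  # step to the next slot on collision
            i = (i + 1) % m
        table[i] = k
    total = 0
    for k in keys:
        i = k % m
        probes = 1
        while table[i] != k:
            i = (i + 1) % m
            probes += 1
        total += probes
    return total / len(keys)


if __name__ == "__main__":
    random.seed(0)   # fixed seed so repeated runs use the same keys
    m = 10007        # hypothetical table size (a prime)
    for alpha in (0.25, 0.5, 0.75, 0.9):
        n = int(alpha * m)
        keys = random.sample(range(2**32), n)  # distinct random keys
        print(f"load={alpha:.2f}  "
              f"chaining={chaining_probes(keys, m):.2f}  "
              f"linear={linear_probing_probes(keys, m):.2f}")
```

Under these assumptions, the probe count for linear probing should grow much faster than the chaining count as the load factor approaches 1, which is one way to observe the kind of effect the abstract attributes to the load factor.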
