A faster algorithm for constructing minimal perfect hash functions

Our previous research on one-probe access to large collections of data indexed by alphanumeric keys has produced the first practical minimal perfect hash functions for this problem. Here, a new algorithm is described for quickly finding minimal perfect hash functions whose specification space is very close to the theoretical lower bound, i.e., around 2 bits per key. The various stages of processing are detailed, along with analytical and empirical results, including timing for a set of over 3.8 million keys that was processed on a NeXTstation in about 6 hours.
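To make the term concrete: a hash function is minimal perfect for a set of n keys when it maps them one-to-one onto the slots 0..n-1, so a lookup needs exactly one probe and no slot is wasted. The sketch below only illustrates that property on a handful of keys by brute-force search over seeds. It is not the algorithm described here: the helper names (`seeded_hash`, `is_minimal_perfect`, `find_seed`), the use of BLAKE2b, and the sample keys are all assumptions made for illustration.

```python
import hashlib


def seeded_hash(key: str, seed: int, n: int) -> int:
    """Hash `key` into 0..n-1 using a seed-dependent BLAKE2b digest (illustrative choice)."""
    digest = hashlib.blake2b(
        key.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little") % n


def is_minimal_perfect(keys, h) -> bool:
    """True if `h` maps the keys one-to-one onto 0..len(keys)-1."""
    return sorted(h(k) for k in keys) == list(range(len(keys)))


def find_seed(keys, max_tries: int = 1_000_000) -> int:
    """Brute-force search for a seed that makes seeded_hash minimal perfect.

    Only viable for tiny key sets: the expected number of tries is n^n / n!,
    which grows exponentially with n.
    """
    n = len(keys)
    for seed in range(max_tries):
        if is_minimal_perfect(keys, lambda k: seeded_hash(k, seed, n)):
            return seed
    raise RuntimeError("no suitable seed found within max_tries")


if __name__ == "__main__":
    keys = ["alpha", "beta", "gamma", "delta", "epsilon"]  # hypothetical sample keys
    seed = find_seed(keys)
    for k in keys:
        print(k, "->", seeded_hash(k, seed, len(keys)))
```

The exponential cost of this naive search is the reason it cannot reach key sets in the millions, where a dedicated construction algorithm is needed instead.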
