Order-preserving minimal perfect hash functions and information retrieval

Rapid access to information is essential for a wide variety of retrieval systems and applications. Hashing has long been used when the fastest possible direct search is desired, but it is generally not appropriate when sequential or range searches are also required. This paper describes a hashing method, developed for collections that are relatively static, that supports both direct and sequential access. Indeed, the algorithm described gives hash functions that are optimal in terms of time and hash table space utilization, and that preserve any a priori ordering desired. Furthermore, the resulting order-preserving minimal perfect hash functions (OPMPHFs) can be found using space and time that are, on average, linear in the number of keys involved.
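To make the idea of an OPMPHF concrete, the sketch below builds one for a small static key list. It is not the algorithm of this paper; it uses a simpler, well-known acyclic random-graph construction, chosen only because it fits in a few lines. Each key becomes an edge between two hashed vertices, and if the resulting graph has no cycles, a table g can be filled so that (g[h1(k)] + g[h2(k)]) mod n equals the key's position in the chosen ordering. The names build_opmphf and lookup, the MD5-based seeded hash, and the table size of 3n + 1 are all assumptions made for illustration.

```python
import hashlib


def _h(key, seed, m):
    """Seeded hash of a string key into range(m) (illustrative choice: MD5)."""
    digest = hashlib.md5(f"{seed}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % m


def build_opmphf(keys, max_tries=1000):
    """Return (g, seed1, seed2, n) such that lookup() maps keys[i] to i.

    keys must be distinct, non-empty, and listed in the order to preserve.
    """
    n = len(keys)
    if n == 0:
        raise ValueError("need at least one key")
    m = 3 * n + 1  # assumed vertex count; a sparse graph is likely acyclic
    for attempt in range(max_tries):
        s1, s2 = 2 * attempt, 2 * attempt + 1
        adj = [[] for _ in range(m)]
        ok = True
        # Each key is an edge (u, v) labelled with its rank in the ordering.
        for rank, key in enumerate(keys):
            u, v = _h(key, s1, m), _h(key, s2, m)
            if u == v:  # a self-loop is a cycle; retry with new seeds
                ok = False
                break
            adj[u].append((v, rank))
            adj[v].append((u, rank))
        if not ok:
            continue
        g = [None] * m
        # Walk every component, assigning g so that g[u] + g[v] = rank (mod n).
        for root in range(m):
            if g[root] is not None:
                continue
            g[root] = 0
            stack = [(root, -1)]
            while stack and ok:
                u, via = stack.pop()
                for v, rank in adj[u]:
                    if rank == via:  # do not walk back along the same edge
                        continue
                    if g[v] is not None:  # reached a visited vertex: cycle
                        ok = False
                        break
                    g[v] = (rank - g[u]) % n
                    stack.append((v, rank))
            if not ok:
                break
        if ok:
            return g, s1, s2, n
    raise RuntimeError("no acyclic graph found; increase m or max_tries")


def lookup(table, key):
    """Evaluate the hash: two table probes and one addition."""
    g, s1, s2, n = table
    m = len(g)
    return (g[_h(key, s1, m)] + g[_h(key, s2, m)]) % n


if __name__ == "__main__":
    words = ["apple", "banana", "cherry", "date", "elderberry", "fig", "grape"]
    table = build_opmphf(words)
    for w in words:
        print(w, lookup(table, w))
```

Because the hash value of each key is its rank, a record can be fetched directly by key, and a sequential or range scan is just a walk over consecutive table slots. A key outside the original set still hashes to some slot, so a real system would compare the stored key before accepting a match.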
