Order preserving minimal perfect hash functions and information retrieval

Rapid access to information is essential for a wide variety of retrieval systems and applications. Hashing has long been used when the fastest possible direct search is desired, but it is generally not appropriate when sequential or range searches are also required. This paper describes a hashing method, developed for relatively static collections, that supports both direct and sequential access. The algorithm described yields hash functions that are optimal in terms of lookup time and hash table space utilization, and that preserve any a priori ordering of the keys. Furthermore, the resulting order preserving minimal perfect hash functions (OPMPHFs) can be found using space and time that are, on average, linear in the number of keys involved.
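To make the idea of an OPMPHF concrete, the following minimal Python sketch builds one for a small static key list. It is not the algorithm of this paper: it swaps in a simple random-graph construction (in the style of later work by Czech, Havas, and Majewski), chosen only because it is short. The names `build_opmphf`, `lookup`, `_vertex`, and `_assign`, the MD5-based seeded hash, and the table size of slightly more than 2n entries are all assumptions made for illustration.

```python
import hashlib


def _vertex(key, seed, m):
    # Seeded hash of a string key onto a vertex in [0, m).
    # MD5 is an illustrative choice only, not taken from the paper.
    digest = hashlib.md5(f"{seed}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % m


def _assign(adj, n):
    # Try to label vertices so that each edge i satisfies
    # (g[u] + g[v]) % n == i. Returns None if some edge conflicts.
    g = [None] * len(adj)
    used = [False] * n
    for root in range(len(adj)):
        if g[root] is not None:
            continue
        g[root] = 0
        stack = [root]
        while stack:
            u = stack.pop()
            for w, i in adj[u]:
                if used[i]:
                    continue
                used[i] = True
                if g[w] is None:
                    g[w] = (i - g[u]) % n
                    stack.append(w)
                elif (g[u] + g[w]) % n != i:
                    return None  # conflicting cycle: caller retries
    return g


def build_opmphf(keys, max_tries=1000):
    # keys: non-empty list of distinct strings, already in the desired order.
    n = len(keys)
    m = 2 * n + max(2, n // 10)  # assumed size: slightly more than 2n
    for attempt in range(max_tries):
        s1, s2 = 2 * attempt, 2 * attempt + 1
        adj = [[] for _ in range(m)]
        for i, key in enumerate(keys):
            u, v = _vertex(key, s1, m), _vertex(key, s2, m)
            adj[u].append((v, i))
            adj[v].append((u, i))
        g = _assign(adj, n)
        if g is not None:
            return s1, s2, g
    raise RuntimeError("no suitable seeds found")


def lookup(key, s1, s2, g, n):
    # Position of key in the original ordering, for keys in the built set.
    m = len(g)
    return (g[_vertex(key, s1, m)] + g[_vertex(key, s2, m)]) % n


keys = ["apple", "banana", "cherry", "date", "elderberry", "fig", "grape"]
s1, s2, g = build_opmphf(keys)
positions = [lookup(k, s1, s2, g, len(keys)) for k in keys]
print(positions)
```

Each key hashes to exactly one slot in 0..n-1 (perfect and minimal), and the slot equals the key's rank in the input order (order preserving), so a direct lookup and a sequential scan of the table both work. Keys outside the built set still map to some slot, so a real system would compare the stored key before trusting a hit.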
