Partial-match retrieval using hashing and descriptors

This paper studies a partial-match retrieval scheme based on hash functions and descriptors, with emphasis on showing how a descriptor file can improve the scheme's performance. Each record in the file is given an address determined by hash functions, one for each field in the record. In addition, each page of the file has an associated descriptor, a fixed-length bit string determined by the records actually present in the page. Before a page is accessed to see whether it contains records that answer a query, its descriptor is checked. This check may show that the page holds no relevant records and therefore does not have to be accessed. The method is shown to have a very substantial performance advantage over pure hashing schemes when some fields in the records have large key spaces. A mathematical model of the scheme and an algorithm for optimizing its performance are given.
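The descriptor check described above can be illustrated with a minimal Python sketch. It is not the paper's construction: the bit widths in `ADDRESS_BITS` and `DESCRIPTOR_BITS`, the CRC32-based hash `h`, the class name `DescriptorFile`, and the rule of setting one descriptor bit per field value are all assumptions made for illustration, and deletions are not handled.

```python
import zlib
from itertools import product

ADDRESS_BITS = (1, 1)      # assumed: address bits contributed by each field
DESCRIPTOR_BITS = (8, 8)   # assumed: descriptor bits reserved for each field

def h(value, salt, modulus):
    """Deterministic hash of a field value into range(modulus)."""
    return zlib.crc32(f"{salt}:{value}".encode()) % modulus

def join_bits(parts):
    """Concatenate per-field address parts into one page number."""
    addr = 0
    for part, bits in zip(parts, ADDRESS_BITS):
        addr = (addr << bits) | part
    return addr

def descriptor(fields):
    """Set one bit per specified field; None marks an unspecified field."""
    desc, offset = 0, 0
    for i, (value, width) in enumerate(zip(fields, DESCRIPTOR_BITS)):
        if value is not None:
            desc |= 1 << (offset + h(value, f"desc{i}", width))
        offset += width
    return desc

class DescriptorFile:
    def __init__(self):
        n_pages = 1 << sum(ADDRESS_BITS)
        self.pages = [[] for _ in range(n_pages)]
        self.descriptors = [0] * n_pages   # one bit string per page
        self.page_reads = 0                # pages actually accessed by searches

    def insert(self, record):
        parts = [h(v, f"addr{i}", 1 << b)
                 for i, (v, b) in enumerate(zip(record, ADDRESS_BITS))]
        p = join_bits(parts)
        self.pages[p].append(record)
        self.descriptors[p] |= descriptor(record)  # OR in the record's bits

    def search(self, query):
        # Unspecified fields range over all of their address-bit values.
        choices = [range(1 << b) if v is None else [h(v, f"addr{i}", 1 << b)]
                   for i, (v, b) in enumerate(zip(query, ADDRESS_BITS))]
        qdesc = descriptor(query)
        results = []
        for parts in product(*choices):
            p = join_bits(parts)
            if (self.descriptors[p] & qdesc) != qdesc:
                continue  # a required bit is missing: skip the page unread
            self.page_reads += 1
            results.extend(r for r in self.pages[p]
                           if all(q is None or q == v for q, v in zip(query, r)))
        return results

f = DescriptorFile()
for rec in [("smith", "perth"), ("jones", "melbourne"),
            ("smith", "sydney"), ("lee", "perth")]:
    f.insert(rec)
print(sorted(f.search(("smith", None))), f.page_reads)
```

A page is skipped only when a bit required by the query is absent from its descriptor, so the check can produce false positives (a page read that yields nothing) but never false negatives. In this sketch the saving comes from fields whose values spread over more descriptor bits than address bits, which is in the spirit of the large-key-space case mentioned above.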
