A Comparison of Concatenated and Superimposed Code Word Surrogate Files for Very Large Data/Knowledge Bases

Surrogate files are well suited as an index for very large knowledge bases that must support multiple logic programming inference mechanisms, because they are small and simple to maintain. In this paper, we analyse the superimposed code word (SCW) and concatenated code word (CCW) surrogate file techniques in terms of storage space and query response time in various cases. One of the most important results of our analysis is that both the size and the query response time of the CCW are smaller than those of the SCW when the average number of arguments specified in a query is small. We also show that most of the query response time is spent on surrogate file processing when the extensional database is very large. Therefore, a special architecture that speeds up surrogate file processing can reduce the total query response time considerably.
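To make the two techniques concrete, the following is a minimal Python sketch of how a fact's arguments might be encoded and matched under each scheme. It is an illustration only and not the implementation analysed in the paper: the function names, the SHA-256 based hash, the 8-bit field width, the 32-bit SCW width, and the three bits set per argument are all assumptions chosen for readability.

```python
import hashlib

def _hash(text):
    """Deterministic integer hash of a string (hypothetical choice of hash)."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")

def ccw_encode(args, bits_per_field=8):
    """CCW: hash each argument to a fixed-width field and keep the fields side by side."""
    return tuple(_hash(a) % (1 << bits_per_field) for a in args)

def ccw_match(query, code_word, bits_per_field=8):
    """Compare only the fields whose argument is specified (None = unspecified)."""
    return all(
        q is None or _hash(q) % (1 << bits_per_field) == field
        for q, field in zip(query, code_word)
    )

def scw_encode(args, width=32, bits_per_arg=3):
    """SCW: each specified argument sets a few bits; all patterns are OR-ed into one word."""
    word = 0
    for pos, a in enumerate(args):
        if a is None:
            continue
        for k in range(bits_per_arg):
            word |= 1 << (_hash(f"{pos}:{k}:{a}") % width)
    return word

def scw_match(query, code_word, width=32, bits_per_arg=3):
    """An entry qualifies if it contains every bit set in the query's code word."""
    q = scw_encode(query, width, bits_per_arg)
    return (q & code_word) == q

def candidates(query, surrogate, match):
    """Scan a surrogate file of (code_word, fact_id) pairs and return qualifying fact ids."""
    return [fact_id for code_word, fact_id in surrogate if match(query, code_word)]

# Two facts of a hypothetical binary predicate, e.g. parent(tom, bob) and parent(ann, bob).
facts = [("tom", "bob"), ("ann", "bob")]
ccw_file = [(ccw_encode(f), i) for i, f in enumerate(facts)]
scw_file = [(scw_encode(f), i) for i, f in enumerate(facts)]

# Partial-match query parent(X, bob): only the second argument is specified.
ccw_hits = candidates((None, "bob"), ccw_file, ccw_match)
scw_hits = candidates((None, "bob"), scw_file, scw_match)
```

In both schemes a true match always qualifies, but hash collisions (and, for the SCW, overlapping bit patterns) can also let non-matching entries through, so the returned fact ids are only candidates that still have to be checked against the facts themselves.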
