Hashing Methods for Temporal Data

External dynamic hashing has been used in traditional database systems as a fast method for answering membership queries. Given a dynamic set S of objects, a membership query asks whether an object with identity k is in (the most current state of) S. This paper addresses the more general problem of temporal hashing. In this setting, changes to the dynamic set are time-stamped and the membership query has a temporal predicate, as in: "Find whether the object with identity k was in set S at time t". We present an efficient solution to this problem that takes an ephemeral hashing scheme and makes it partially persistent. Our solution, also termed partially persistent hashing, uses space that is linear in the total number of changes in the evolution of set S and has a small O(log_B(n/B)) query overhead. An experimental comparison of partially persistent hashing with various straightforward approaches (such as external linear hashing, the multi-version B-tree and the R*-tree) shows that it provides the fastest membership query response time. Partially persistent hashing should be seen as an extension of traditional external dynamic hashing to a temporal environment. It is independent of the ephemeral dynamic hashing scheme used; while this paper concentrates on linear hashing, the methodology applies to other dynamic hashing schemes as well.
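To make the temporal membership query concrete, the following Python sketch models only its semantics. It is not the partially persistent hashing structure described in the abstract, and it does not reproduce its external-memory space or query bounds. The class name `TemporalSet` and its methods are hypothetical, chosen for illustration. The sketch assumes that changes arrive in nondecreasing time order and that an object is considered present from its insertion time up to, but not including, its deletion time.

```python
from bisect import bisect_right
from collections import defaultdict


class TemporalSet:
    """Toy in-memory model of temporal membership (illustrative only)."""

    def __init__(self):
        # key -> list of [start, end] lifetimes in time order;
        # end is None while the key is still in the set.
        self._lifetimes = defaultdict(list)

    def insert(self, k, t):
        """Record that key k enters the set at time t."""
        spans = self._lifetimes[k]
        if spans and spans[-1][1] is None:
            return  # k is already present; nothing changes
        spans.append([t, None])

    def delete(self, k, t):
        """Record that key k leaves the set at time t."""
        spans = self._lifetimes.get(k)
        if spans and spans[-1][1] is None:
            spans[-1][1] = t

    def contains(self, k, t):
        """Answer: was key k in the set at time t?"""
        spans = self._lifetimes.get(k, [])
        # Locate the last lifetime that started at or before t.
        i = bisect_right([start for start, _ in spans], t) - 1
        if i < 0:
            return False
        _, end = spans[i]
        return end is None or t < end


s = TemporalSet()
s.insert("k", 5)
s.delete("k", 9)
s.insert("k", 12)
print(s.contains("k", 8), s.contains("k", 10), s.contains("k", 20))
```

A query on the current state is the special case where t is at least as large as the latest change time. The sketch keeps the whole history in one in-memory dictionary, whereas the abstract is concerned with a disk-based structure built on a dynamic hashing scheme such as linear hashing.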
