The hB^Π-Tree: A Concurrent and Recoverable Multi-Attribute Index Structure

The number of applications that deal with multi-attribute (point or spatial) data is continually increasing. These applications require a Database Management System (DBMS) that offers the same functionality for this kind of data as it offers for traditional data. The DBMS should store, index, and access the data efficiently and reliably. It should also let as many users as possible access the data concurrently, and it should be able to recover from application errors or system crashes that leave the data inconsistent. Approaches that use multiple single-attribute indexes are quite inefficient, which is why there has been extensive research on explicitly multi-attribute indexing. Most proposed multi-attribute indexes offer neither performance guarantees nor well-understood methods for concurrency and recovery, yet these are the requirements for including an index in a general-purpose DBMS. We propose a new multi-attribute index. Our approach combines the hB-tree (Lomet & Salzberg), a multi-attribute index with promising performance guarantees, and the Π-tree (Lomet & Salzberg), an abstract index that offers well-understood and efficient concurrency and recovery methods. We call the resulting method the hB^Π-tree. We describe several versions of the hB^Π-tree, each using a different node splitting and index term posting algorithm. We also describe a very efficient new node deletion algorithm. We have implemented all the versions of the hB^Π-tree. Our performance results show that even the version that offers no performance guarantees performs very well in terms of storage utilization, index size (fan-out), and exact-match and range searching, under various data types and distributions. We have also shown that our index is fairly insensitive to increases in dimension. Thus, it is suitable for indexing spatial (non-point) data that is mapped to higher-dimensional points. This property, and the fact that all our versions of the hB^Π-tree guarantee very high concurrency, make the hB^Π-tree a promising candidate for inclusion in a general-purpose DBMS.
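The remark about mapping spatial (non-point) data to higher-dimensional points can be illustrated with a minimal sketch. It assumes the common transformation in which a 2-D rectangle becomes a 4-D point, so that a rectangle-intersection query becomes a 4-D range query. The names `rect_to_point`, `intersection_box`, and `range_search` are hypothetical, and a linear scan stands in for the index; this is not the hB^Π-tree search algorithm.

```python
from typing import List, Tuple

# A 2-D rectangle and its 4-D point image share the same layout:
# (x_lo, y_lo, x_hi, y_hi). Rectangles are treated as closed.
Rect = Tuple[float, float, float, float]
Point4 = Tuple[float, float, float, float]


def rect_to_point(rect: Rect) -> Point4:
    """Map a 2-D rectangle to a point in 4-D space."""
    x_lo, y_lo, x_hi, y_hi = rect
    if x_lo > x_hi or y_lo > y_hi:
        raise ValueError("rectangle corners are out of order")
    return (x_lo, y_lo, x_hi, y_hi)


def intersection_box(query: Rect) -> Tuple[Point4, Point4]:
    """Return the 4-D range holding every point whose rectangle meets `query`.

    A rectangle intersects the query exactly when
    x_lo <= q_x_hi, y_lo <= q_y_hi, x_hi >= q_x_lo, and y_hi >= q_y_lo.
    """
    qx_lo, qy_lo, qx_hi, qy_hi = query
    inf = float("inf")
    low = (-inf, -inf, qx_lo, qy_lo)
    high = (qx_hi, qy_hi, inf, inf)
    return low, high


def range_search(points: List[Point4], query: Rect) -> List[Point4]:
    """Linear-scan stand-in for a multi-attribute index range search."""
    low, high = intersection_box(query)
    return [
        p for p in points
        if all(lo <= c <= hi for lo, c, hi in zip(low, p, high))
    ]


rects = [(0, 0, 2, 2), (5, 5, 6, 6), (1, 1, 3, 3)]
points = [rect_to_point(r) for r in rects]
hits = range_search(points, (1.5, 1.5, 4, 4))
```

Each added spatial dimension doubles the dimension of the point space under this mapping, which is why insensitivity to dimension matters for an index used this way.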

[1] David B. Lomet, et al. The hB-tree: a multiattribute indexing method with good guaranteed performance, 1990, TODS.

[2] S. B. Yao, et al. Efficient locking for concurrent operations on B-trees, 1981, TODS.

[3] N. S. Barnett, et al. Private communication, 1969.

[4] M. Stonebraker, et al. The Sequoia 2000 Benchmark, 1993, SIGMOD Conference.

[5] C. Mohan, et al. ARIES/IM: an efficient and high concurrency index management method using write-ahead logging, 1992, SIGMOD '92.

[6] T. H. Merrett, et al. A class of data structures for associative searching, 1984, PODS.

[7] Jon Louis Bentley, et al. Multidimensional Binary Search Trees in Database Applications, 1979, IEEE Transactions on Software Engineering.

[8] Hans-Peter Kriegel, et al. The R*-tree: an efficient and robust access method for points and rectangles, 1990, SIGMOD '90.

[9] B. Salzberg. Practical spatial database access methods, 1991, Proceedings of the 1991 Symposium on Applied Computing.

[10] David B. Lomet, et al. Bounded index exponential hashing, 1983, TODS.

[11] Kenneth Baclawski, et al. Quickly generating billion-record synthetic databases, 1994, SIGMOD '94.

[12] Betty Salzberg, et al. File Structures: An Analytic Approach, 1988.

[13] J. T. Robinson, et al. The K-D-B-tree: a search structure for large multidimensional dynamic indexes, 1981, SIGMOD '81.

[14] Christos Faloutsos, et al. Multiattribute hashing using Gray codes, 1986, SIGMOD '86.

[15] Yehoshua Sagiv. Concurrent Operations on B*-Trees with Overtaking, 1986, J. Comput. Syst. Sci.

[16] David B. Lomet, et al. Grow and Post Index Trees: Roles, Techniques and Future Potential, 1991, SSD.

[17] Oliver Günther, et al. The design of the cell tree: an object-oriented index structure for geometric databases, 1989, Proceedings of the Fifth International Conference on Data Engineering.

[18] Witold Litwin, et al. Linear Hashing: A new Algorithm for Files and Tables Addressing, 1980, ICOD.

[19] Christos Faloutsos, et al. The R+-Tree: A Dynamic Index for Multi-Dimensional Objects, 1987, VLDB.

[20] Antonin Guttman, et al. R-trees: a dynamic index structure for spatial searching, 1984, SIGMOD '84.

[21] David B. Lomet, et al. Access method concurrency with recovery, 1992, SIGMOD '92.

[22] Nils J. Nilsson, et al. Problem-solving methods in artificial intelligence, 1971, McGraw-Hill computer science series.

[23] D. B. Lomet. Process structuring, synchronization, and recovery using atomic actions, 1977.

[24] Dennis Shasha, et al. Concurrent search structure algorithms, 1988, TODS.

[25] Jürg Nievergelt, et al. The Grid File: An Adaptable, Symmetric Multikey File Structure, 1984, TODS.

[26] Sally E. Fischbeck, et al. The Ubiquitous B-tree: Volume II, 1987.