Optimizing multidimensional index trees for main memory access

Recent studies have shown that cache-conscious indexes such as the CSB+-tree outperform conventional main memory indexes such as the T-tree. The key idea of these cache-conscious indexes is to eliminate most of child pointers from a node to increase the fanout of the tree. When the node size is chosen in the order of the cache block size, this pointer elimination effectively reduces the tree height, and thus improves the cache behavior of the index. However, the pointer elimination cannot be directly applied to multidimensional index structures such as the R-tree, where the size of a key, typically, an MBR (minimum bounding rectangle), is much larger than that of a pointer. Simple elimination of four-byte pointers does not help much to pack more entries in a node. This paper proposes a cache-conscious version of the R-tree called the CR-tree. To pack more entries in a node, the CR-tree compresses MBR keys, which occupy almost 80% of index data in the two-dimensional case. It first represents the coordinates of an MBR key relatively to the lower left corner of its parent MBR to eliminate the leading O's from the relative coordinate representation. Then, it quantizes the relative coordinates with a fixed number of bits to further cut off the trailing less significant bits. Consequently, the CR-tree becomes significantly wider and smaller than the ordinary R-tree. Our experimental and analytical study shows that the two-dimensional CR-tree performs search up to 2.5 times faster than the ordinary R-tree while maintaining similar update performance and consuming about 60% less memory space.

[1]  Hans-Peter Kriegel,et al.  Index research (panel session): forest or trees? , 2000, SIGMOD 2000.

[2]  Jonathan Goldstein,et al.  Compressing relations and indexes , 1998, Proceedings 14th International Conference on Data Engineering.

[3]  A. Guttman,et al.  A Dynamic Index Structure for Spatial Searching , 1984, SIGMOD 1984.

[4]  Antonin Guttman,et al.  R-trees: a dynamic index structure for spatial searching , 1984, SIGMOD '84.

[5]  David A. Patterson,et al.  Computer Architecture - A Quantitative Approach, 5th Edition , 1996 .

[6]  David A. Patterson,et al.  Computer architecture (2nd ed.): a quantitative approach , 1996 .

[7]  Martin L. Kersten,et al.  Database Architecture Optimized for the New Bottleneck: Memory Access , 1999, VLDB.

[8]  Hans-Peter Kriegel,et al.  The pyramid-technique: towards breaking the curse of dimensionality , 1998, SIGMOD '98.

[9]  Christos Faloutsos,et al.  The R+-Tree: A Dynamic Index for Multi-Dimensional Objects , 1987, VLDB.

[10]  Shmuel Tomi Klein,et al.  Is Huffman coding dead? (extended abstract) , 1993, SIGIR.

[11]  Joseph M. Hellerstein,et al.  Indexing research: forest or trees? , 2000, SIGMOD 2000.

[12]  Christos Faloutsos,et al.  On packing R-trees , 1993, CIKM '93.

[13]  Oliver Günther,et al.  Multidimensional access methods , 1998, CSUR.

[14]  David J. DeWitt,et al.  DBMSs on a Modern Processor: Where Does Time Go? , 1999, VLDB.

[15]  Christos Faloutsos,et al.  Hilbert R-tree: An Improved R-tree using Fractals , 1994, VLDB.

[16]  Christos Faloutsos,et al.  Beyond uniformity and independence: analysis of R-trees using the concept of fractal dimension , 1994, PODS.

[17]  Mario A. López,et al.  STR: a simple and efficient algorithm for R-tree packing , 1997, Proceedings 13th International Conference on Data Engineering.

[18]  Kenneth A. Ross,et al.  Cache Conscious Indexing for Decision-Support in Main Memory , 1999, VLDB.

[19]  Kenneth A. Ross,et al.  Making B+- trees cache conscious in main memory , 2000, SIGMOD '00.

[20]  Hans-Peter Kriegel,et al.  The R*-tree: an efficient and robust access method for points and rectangles , 1990, SIGMOD '90.

[21]  David A. Patterson,et al.  Computer Architecture: A Quantitative Approach , 1969 .