A revised r*-tree in comparison with related index structures

In this paper we present an improved redesign of the R*-tree that is entirely suitable for running within a DBMS. Most importantly, an insertion is guaranteed to be restricted to a single path because re-insertion could be abandoned. We re-engineered both, subtree choice and split algorithm, to be more robust against specific data distributions and insertion orders, as well as peculiarities often found in real multidimensional data sets. This comes along with a substantial reduction in CPU-time. Our experimental setup covers a wide range of different artificial and real data sets. The experimental comparison shows that the search performance of our revised R*-tree is superior to that of its three most important competitors. In comparison to its predecessor, the original R*-tree, the creation of a tree is substantially faster, while the I/O cost required for processing queries is improved by more than 30% on average for two- and three-dimensional data. For higher dimensional data, particularly for real data sets, much larger improvements are achieved.

[1]  Jon Louis Bentley,et al.  Multidimensional binary search trees used for associative searching , 1975, CACM.

[2]  Antonin Guttman,et al.  R-trees: a dynamic index structure for spatial searching , 1984, SIGMOD '84.

[3]  Hans-Peter Kriegel,et al.  The R*-tree: an efficient and robust access method for points and rectangles , 1990, SIGMOD '90.

[4]  Peter Widmayer,et al.  Enclosing Many Boxes by an Optimal Pair of Boxes , 1992, STACS.

[5]  Bernd-Uwe Pagel,et al.  Towards an analysis of range query performance in spatial data structures , 1993, PODS '93.

[6]  Christos Faloutsos,et al.  Hilbert R-tree: An Improved R-tree using Fractals , 1994, VLDB.

[7]  Christos Faloutsos,et al.  Beyond uniformity and independence: analysis of R-trees using the concept of fractal dimension , 1994, PODS.

[8]  David J. DeWitt,et al.  Shoring up persistent applications , 1994, SIGMOD '94.

[9]  Hanan Samet,et al.  Benchmarking Spatial Join Operations with Spatial Output , 1995, VLDB.

[10]  Jeffrey F. Naughton,et al.  Generalized Search Trees for Database Systems , 1995, VLDB.

[11]  Bernd-Uwe Pagel,et al.  Window query-optimal clustering of spatial objects , 1995, PODS.

[12]  Hans-Peter Kriegel,et al.  The X-tree : An Index Structure for High-Dimensional Data , 2001, VLDB.

[13]  The SR-tree: An Index Structure for High-Dimensional Nearest Neighbor Queries , 1997, SIGMOD Conference.

[14]  Bernhard Seeger,et al.  A Generic Approach to Bulk Loading Multidimensional Index Structures , 1997, VLDB.

[15]  Shin'ichi Satoh,et al.  The SR-tree: an index structure for high-dimensional nearest neighbor queries , 1997, SIGMOD '97.

[16]  C. Mohan,et al.  Concurrency and recovery in generalized search trees , 1997, SIGMOD '97.

[17]  Mario A. López,et al.  On Optimal Node Splitting for R-trees , 1998, VLDB.

[18]  Jay Banerjee,et al.  Indexing medium-dimensionality data in Oracle , 1999, SIGMOD '99.

[19]  Sharad Mehrotra,et al.  The hybrid tree: an index structure for high dimensional feature spaces , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[20]  Sharad Mehrotra,et al.  Efficient concurrency control in multidimensional access methods , 1999, SIGMOD '99.

[21]  Masatoshi Yoshikawa,et al.  The A-tree: An Index Structure for High-Dimensional Spaces Using Relative Approximation , 2000, VLDB.

[22]  Christian Böhm,et al.  Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases , 2001, CSUR.

[23]  Keishi Tajima,et al.  SIGMOD Conference 2002 , 2002 .

[24]  Kothuri Venkata Ravi Kanth,et al.  Quadtree and R-tree indexes in oracle spatial: a comparison using GIS data , 2002, SIGMOD '02.

[25]  장윤희,et al.  Y. , 2003, Industrial and Labor Relations Terms.

[26]  Mark de Berg,et al.  The Priority R-tree: a practically efficient and worst-case optimal R-tree , 2004, SIGMOD '04.

[27]  Robert S. Leiken,et al.  A User’s Guide , 2011 .