A model for the prediction of R-tree performance

In this paper we present an analytical model that predicts the performance of R-trees (and their variants) when a range query needs to be answered. The cost model uses only knowledge of the dataset: the proposed formula, which estimates the number of disk accesses, is a function of data properties alone, namely the amount of data and their density in the work space. Consequently, the model is applicable even before the R-tree index is constructed, which makes it a useful tool for dynamic spatial databases. Several experiments on synthetic and real datasets show that the proposed analytical model is very accurate, with a relative error usually around 10%-15%, for both uniform and non-uniform distributions. We believe that this error is related to the gap between efficient R-tree variants, such as the R*-tree, and an optimal method that has not yet been implemented. Our work extends previous research on R-tree analysis and constitutes a useful tool for spatial query optimizers that need to evaluate the cost of a complex spatial query and its execution procedure.
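To make the idea of a data-only cost estimate concrete, the following is a minimal sketch, not the formula of the paper, which the abstract does not reproduce. It assumes a unit work space, uniformly distributed data, square node rectangles, and nodes filled to an average fanout. It relies on the standard observation that, under uniformity, a rectangle with side `s` intersects a query window with extents `q_k` with probability equal to the product of `(s + q_k)`. The function name, its parameters, and in particular the density-propagation step between levels are assumptions of this sketch.

```python
import math


def estimate_node_accesses(num_objects, density, fanout, query_extents):
    """Rough expected number of R-tree node accesses for a window query.

    Hypothetical helper. Assumptions: unit work space, uniform data,
    square node rectangles, every node holding `fanout` entries.
    `density` is the sum of the data rectangles' areas (0 for points).
    """
    n = len(query_extents)

    # Tree height from the amount of data and the fanout only.
    height = 1
    nodes = math.ceil(num_objects / fanout)
    while nodes > 1:
        nodes = math.ceil(nodes / fanout)
        height += 1

    accesses = 1.0  # the root is always read
    level_density = density
    for level in range(1, height):  # level 1 = leaves, root excluded
        # Assumed density propagation from one level to the next.
        level_density = (
            1 + (level_density ** (1 / n) - 1) / fanout ** (1 / n)
        ) ** n
        level_nodes = num_objects / fanout ** level
        side = (level_density / level_nodes) ** (1 / n)

        # Probability that one node rectangle intersects the query window.
        hit_probability = 1.0
        for q in query_extents:
            hit_probability *= min(1.0, side + q)
        accesses += level_nodes * hit_probability
    return accesses


if __name__ == "__main__":
    # 10,000 rectangles, density 0.5, fanout 100, a 10% x 10% query window.
    print(estimate_node_accesses(10_000, 0.5, 100, (0.1, 0.1)))
```

Because the estimate needs only the number of objects, their density, and the fanout, it can be evaluated before any index exists, which is the property the abstract emphasizes. A query optimizer could call such a function for each candidate plan and compare the resulting costs.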
