Index Clustering for High-Performance Sequential Index Access

This paper presents an index clustering technique called segment-page clustering (SP-clustering). In a conventional index, pages that are relevant to the same query tend to be scattered widely across the disk because pages are allocated dynamically, so query processing incurs many random disk accesses. SP-clustering avoids this scattering by storing related nodes contiguously in a segment, which is a sequence of contiguous disk pages, and improves query performance by allowing sequential disk access within a segment. A new cost model is also introduced to estimate the performance of SP-clustering; it takes into account the physical adjacency of the pages read as well as the number of pages accessed. Experimental results demonstrate that SP-clustering improves query performance, measured as total elapsed time, by up to several times compared with traditional approaches.
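
To illustrate why physical adjacency matters as much as the page count, the following is a minimal sketch of an adjacency-aware cost estimate. It is not the cost model introduced in the paper, whose details are not given here. The function names, the linear form of the estimate, and the timing constants `seek_ms` and `transfer_ms` are all assumptions chosen for illustration: each maximal run of consecutive page numbers is charged one positioning cost, and each page read is charged one transfer cost.

```python
def count_runs(page_ids):
    """Count maximal runs of physically consecutive page numbers."""
    runs = 0
    prev = None
    for page in sorted(set(page_ids)):
        # A new run starts whenever the page does not directly follow the previous one.
        if prev is None or page != prev + 1:
            runs += 1
        prev = page
    return runs


def estimated_cost_ms(page_ids, seek_ms=8.0, transfer_ms=0.1):
    """Illustrative estimate: one positioning cost per run, one transfer per page.

    seek_ms and transfer_ms are assumed values, not measurements from the paper.
    """
    distinct_pages = len(set(page_ids))
    return count_runs(page_ids) * seek_ms + distinct_pages * transfer_ms


# Hypothetical page numbers: eight pages scattered by dynamic allocation
# versus the same number of pages stored contiguously in one segment.
scattered = [3, 97, 512, 1040, 2077, 3001, 4500, 8191]
clustered = list(range(100, 108))

print("scattered:", count_runs(scattered), "runs,", estimated_cost_ms(scattered), "ms")
print("clustered:", count_runs(clustered), "runs,", estimated_cost_ms(clustered), "ms")
```

Both access patterns read the same number of pages, so a model that counts only page accesses would rate them equally. Under the assumed constants, the scattered pattern is dominated by positioning cost, which is the effect that storing related nodes in one segment is meant to remove.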
