Filter Trees for Managing Spatial Data over a Range of Size Granularities

We introduce a new file organization for the storage and manipulation of spatial (or multidimensional) data that is able to execute spatial join operations with great efficiency. The Filter Tree information structure is a hierarchical organization that tends to separate spatial entities by size, placing larger entities at the higher levels of the Filter Tree, and smaller entities at lower levels. Within each level, index entries for the entities are ordered by a space-filling curve (Hilbert curve). This allows the algorithms to use bulk I/O requests, exploiting the locality in the index information, and minimizing the number of I/O transfers from disk. We provide algorithms for constructing Filter Trees, for performing range queries on a Filter Tree, and for performing spatial joins between a pair of Filter Trees. Finally, we include results from experiments using a prototype implementation of Filter Trees to treat both synthetic and real sets of spatial entities. Our experimental results show that full spatial joins can always be done more efficiently with Filter Trees than with current competitive methods, and that in some cases the improvement in performance is very large.

Proceedings of the 22nd VLDB Conference, Mumbai (Bombay), India, 1996.
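To make the organization described above concrete, the following is a minimal sketch of how entities might be separated by size and ordered along a Hilbert curve. The abstract does not state the level-assignment rule, so the rule used here is an assumption: an entity's bounding box goes to the deepest level whose grid cell (side 2^-level in the unit square) still contains it entirely. The names `hilbert_index`, `assign_level`, and `build_levels` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def hilbert_index(order, x, y):
    """Position of cell (x, y) along a Hilbert curve over a 2^order x 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def cell_of(coord, cells):
    """Grid cell containing a coordinate in [0, 1], clamped to the last cell."""
    return min(cells - 1, int(coord * cells))


def assign_level(mbr, max_level):
    """Deepest level at which the box fits in one grid cell (assumed rule)."""
    xlo, ylo, xhi, yhi = mbr
    level = 0
    for j in range(1, max_level + 1):
        cells = 1 << j
        if cell_of(xlo, cells) != cell_of(xhi, cells):
            break  # a vertical grid line of level j cuts the box
        if cell_of(ylo, cells) != cell_of(yhi, cells):
            break  # a horizontal grid line of level j cuts the box
        level = j
    return level


def build_levels(entities, max_level):
    """Map each level to its (hilbert_key, entity_id) entries, sorted by key.

    `entities` is a dict: entity_id -> (xlo, ylo, xhi, yhi) in the unit square.
    """
    levels = defaultdict(list)
    for entity_id, mbr in entities.items():
        level = assign_level(mbr, max_level)
        cells = 1 << level
        key = hilbert_index(level, cell_of(mbr[0], cells), cell_of(mbr[1], cells))
        levels[level].append((key, entity_id))
    for entries in levels.values():
        entries.sort()
    return dict(levels)


if __name__ == "__main__":
    sample = {
        "large": (0.1, 0.1, 0.9, 0.9),
        "small": (0.1, 0.1, 0.12, 0.12),
        "straddler": (0.49, 0.49, 0.51, 0.51),
    }
    print(build_levels(sample, max_level=3))
```

Under this assumed rule a small entity that happens to straddle a coarse grid line is still placed at a high level, which is consistent with the abstract's wording that the structure "tends to" separate entities by size. Sorting each level by Hilbert key keeps entries for nearby cells close together in the file, which is what makes sequential bulk reads useful.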
