Analysis of n-Dimensional Quadtrees using the Hausdorff Fractal Dimension

There is mounting evidence [Man77, Sch91] that real datasets are statistically self-similar, and thus ‘fractal’. This is an important insight because it permits a compact statistical description of spatial datasets; as we show, it also forms the basis for the theoretical analysis of spatial access methods without the typical, but unrealistic, uniformity assumption. In this paper, we focus on estimating the number of quadtree blocks that a real spatial dataset will require. Using the well-known Hausdorff fractal dimension, we derive closed formulas that predict the number of quadtree blocks from a few parameters. With these formulas, it is possible to predict the space overhead and the response time of linear quadtrees/z-ordering [OM88], which are widely used in practice. To verify our analytical model, we performed an extensive experimental investigation using several real datasets from different domains. In these experiments, the analytical model agreed well with our measurements as well as with older empirical observations on 2-d [Gae95b] and 3-d [ACF+94] data.

Proceedings of the 22nd VLDB Conference, Mumbai (Bombay), India, 1996.

Volker Gaede, Institut für Wirtschaftsinformatik, Humboldt-Universität zu Berlin, Spandauer Str. 1, 10178 Berlin, Germany, gaede@wiwi.hu-berlin.de

*This work was partially supported by the National Science Foundation under Grants No. CDR8803012, EEC-94-02384, IRI-8958546 and IRI-9205273, with matching funds from Empress Software Inc. and Thinking Machines Inc. Some of the work was performed while he was visiting AT&T Bell Laboratories, Murray Hill, NJ.

[1]  Antonin Guttman,et al.  R-trees: a dynamic index structure for spatial searching , 1984, SIGMOD '84.

[2]  Christos Faloutsos,et al.  Fast subsequence matching in time-series databases , 1994, SIGMOD '94.

[3]  Christos Faloutsos,et al.  QBISM: extending a DBMS to support 3D medical images , 1994, Proceedings of 1994 IEEE 10th International Conference on Data Engineering.

[4]  Hans-Peter Kriegel,et al.  The R*-tree: an efficient and robust access method for points and rectangles , 1990, SIGMOD '90.

[5]  Christos Faloutsos,et al.  Beyond uniformity and independence: analysis of R-trees using the concept of fractal dimension , 1994, PODS.

[6]  Clifford A. Shaffer,et al.  A formula for computing the number of quadtree node fragments created by a shift , 1988, Pattern Recognit. Lett..

[7]  Christos Faloutsos,et al.  The R+-Tree: A Dynamic Index for Multi-Dimensional Objects , 1987, VLDB.

[8]  Volker Gaede.  Geometric Information Makes Spatial Query Processing More Efficient , 1995, ACM-GIS.

[9]  Charles R. Dyer,et al.  The space efficiency of quadtrees , 1982, Comput. Graph. Image Process..

[10]  Christos Faloutsos,et al.  Analytical results on the quadtree decomposition of arbitrary rectangles , 1992, Pattern Recognit. Lett..

[11]  Yannis Manolopoulos,et al.  A random model for analyzing region quadtrees , 1995, Pattern Recognit. Lett..

[12]  Irene Gargantini,et al.  An effective way to represent quadtrees , 1982, CACM.

[13]  Allen Klinger,et al.  PATTERNS AND SEARCH STATISTICS , 1971 .

[14]  T. H. Merrett,et al.  A class of data structures for associative searching , 1984, PODS.

[15]  Kenneth Steiglitz,et al.  Operations on Images Using Quad Trees , 1979, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Christos Faloutsos,et al.  Fast subsequence matching in time-series databases , 1994, SIGMOD '94.

[17]  Volker Gaede,et al.  Optimal Redundancy in Spatial Database Systems , 1995, SSD.

[18]  H. V. Jagadish,et al.  A retrieval technique for similar shapes , 1991, SIGMOD '91.

[19]  Frank Manola,et al.  PROBE Spatial Data Modeling and Query Processing in an Image Database Application , 1988, IEEE Trans. Software Eng..

[20]  Azriel Rosenfeld,et al.  Computer Vision , 1988, Adv. Comput..

[21]  Jack A. Orenstein.  Redundancy in spatial databases , 1989, SIGMOD '89.

[22]  Benoit B. Mandelbrot,et al.  Fractal Geometry of Nature , 1984 .

[23]  Clifford A. Shaffer,et al.  QUILT: a geographic information system based on quadtrees , 1990, Int. J. Geogr. Inf. Sci..

[24]  Clifford A. Shaffer,et al.  A Model for the Analysis of Neighbor Finding in Pointer-Based Quadtrees , 1985, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Manfred Schroeder,et al.  Fractals, Chaos, Power Laws: Minutes From an Infinite Paradise , 1992 .

[26]  Christos Faloutsos,et al.  Estimating the Selectivity of Spatial Queries Using the 'Correlation' Fractal Dimension , 1995, VLDB.

[27]  Ralf Hartmut Güting,et al.  An introduction to spatial database systems , 1994, VLDB J..

[28]  Wolf-Fritz Riekert,et al.  Spatial Access Methods and Query Processing in the Object-Oriented GIS GODOT , 1994, AGDM.

[29]  Christos Faloutsos,et al.  Analysis of the n-Dimensional Quadtree Decomposition for Arbitrary Hyperrectangles , 1997, IEEE Trans. Knowl. Data Eng..