The SB-index and the HSB-index: efficient indices for spatial data warehouses

Spatial data warehouses (SDWs) support spatial analysis together with analytical multidimensional queries over huge volumes of data. The challenge is to retrieve the data related to ad hoc spatial query windows according to spatial predicates while avoiding the high cost of joining large tables. Mechanisms that provide efficient query processing over SDWs are therefore essential. In this paper, we propose two efficient indices for SDWs: the SB-index and the HSB-index. The proposed indices share the following characteristics. They enable multidimensional queries with spatial predicates over SDWs and support predefined spatial hierarchies. Furthermore, they compute the spatial predicate and transform it into a conventional one, which can be evaluated together with other conventional predicates by accessing a star-join Bitmap index. While the SB-index has a sequential data structure, the HSB-index uses a hierarchical data structure to cluster spatial objects and a specialized buffer pool to decrease the number of disk accesses. The advantages of the SB-index and the HSB-index over the DBMS resources for SDW indexing (i.e., star-join computation and materialized views) were investigated through performance tests that issued roll-up operations extended with containment and intersection range queries. The performance results showed improvements ranging from 68% to 99% over both the star-join computation and the materialized view. Furthermore, the proposed indices proved to be very compact, adding less than 1% to the storage requirements. Therefore, both the SB-index and the HSB-index are excellent choices for SDW indexing. The choice between them depends mainly on the query selectivity of the spatial predicates: low query selectivity benefits the HSB-index, whereas the SB-index performs better for higher query selectivity.
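The idea of turning a spatial predicate into a conventional one can be illustrated with a minimal sketch. It is not the authors' implementation: the entry layout (a key plus a minimum bounding rectangle), the names `Entry`, `spatial_to_conventional` and `bitmap_for_keys`, and the use of Python integers as bitmaps are all assumptions made for illustration. A sequential scan selects the keys of the spatial objects that intersect the query window, and the resulting key list is then evaluated against a toy star-join bitmap index together with another conventional predicate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    key: int    # hypothetical primary key of a spatial dimension row
    mbr: tuple  # assumed layout: (xmin, ymin, xmax, ymax)


def intersects(a, b):
    """True if two axis-aligned rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def spatial_to_conventional(entries, window):
    """Scan the entries sequentially and return the keys whose MBR
    intersects the query window. This is only a filter step; a real
    system would refine the candidates against the exact geometries."""
    return [e.key for e in entries if intersects(e.mbr, window)]


def bitmap_for_keys(join_bitmaps, keys):
    """OR together the join bitmaps of the selected keys, i.e. evaluate
    the conventional predicate 'key IN (...)' on the fact table."""
    result = 0
    for k in keys:
        result |= join_bitmaps.get(k, 0)
    return result


# Toy spatial dimension with three objects (made-up data).
entries = [
    Entry(1, (0, 0, 2, 2)),
    Entry(2, (5, 5, 7, 7)),
    Entry(3, (1, 1, 6, 6)),
]

# Toy star-join bitmaps over a 6-row fact table:
# bit i is set when fact row i references the given key.
join_bitmaps = {1: 0b000011, 2: 0b001100, 3: 0b110000}

# Bitmap of some other conventional predicate (e.g. a year filter).
year_bitmap = 0b010110

window = (0, 0, 3, 3)
keys = spatial_to_conventional(entries, window)
spatial_bitmap = bitmap_for_keys(join_bitmaps, keys)
answer = spatial_bitmap & year_bitmap
matching_rows = [i for i in range(6) if (answer >> i) & 1]
print(keys, matching_rows)
```

In this sketch the spatial work is confined to the scan over the entries; once the keys are known, the rest of the query is ordinary bitmap arithmetic. A hierarchical variant would replace the sequential scan with a tree traversal while leaving the bitmap step unchanged.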
