Towards Optimal Multi-Dimensional Query Processing with Bitmap Indices

Bitmap indices are widely used in scientific applications and commercial systems for processing complex, multi-dimensional queries on which traditional tree-based indices perform poorly. This paper studies strategies for minimizing the access costs of processing multi-dimensional queries using bitmap indices with binning. Innovative features of our algorithm include (a) optimally placing the bin boundaries and (b) dynamically reordering the evaluation of the query terms. In addition, we derive several analytical results concerning optimal bin allocation for a probabilistic query model. Our experimental evaluation with real-life data shows an average I/O cost improvement of at least a factor of 10 for multi-dimensional queries on datasets from two different applications. Our experiments also indicate that the speedup increases with the number of query dimensions.
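To make the role of bin boundaries concrete, the following is a minimal one-dimensional sketch and not the algorithm of this paper. It assumes that each bin's bitmap can be modeled as a Python set of row ids, that every value lies in the half-open range covered by the boundaries, and that the number of rows whose raw value must be re-read (the candidate check) is a fair proxy for I/O cost. The names `build_binned_index` and `range_query` are hypothetical; dynamic reordering of query terms is not covered.

```python
import bisect

def build_binned_index(values, boundaries):
    """Build one 'bitmap' (modeled as a set of row ids) per bin
    [boundaries[i], boundaries[i+1]). Assumes every value v satisfies
    boundaries[0] <= v < boundaries[-1]."""
    bins = [set() for _ in range(len(boundaries) - 1)]
    for row, v in enumerate(values):
        i = bisect.bisect_right(boundaries, v) - 1
        bins[i].add(row)
    return bins

def range_query(values, boundaries, bins, lo, hi):
    """Answer lo <= v < hi. Returns (matching row ids, candidate checks),
    where a candidate check is one look-up of a raw value."""
    hits, checks = set(), 0
    for i, rows in enumerate(bins):
        b_lo, b_hi = boundaries[i], boundaries[i + 1]
        if b_hi <= lo or b_lo >= hi:
            continue              # bin lies entirely outside the query range
        if lo <= b_lo and b_hi <= hi:
            hits |= rows          # bin lies entirely inside: bitmap suffices
        else:
            for r in rows:        # edge bin: raw values must be checked
                checks += 1
                if lo <= values[r] < hi:
                    hits.add(r)
    return hits, checks

values = [1, 3, 4, 6, 7, 9]

# Boundaries that ignore the query range: both bins are edge bins.
coarse = [0, 5, 10]
hits_a, checks_a = range_query(values, coarse,
                               build_binned_index(values, coarse), 2, 8)

# Boundaries aligned with the query range [2, 8): no edge bins remain.
aligned = [0, 2, 8, 10]
hits_b, checks_b = range_query(values, aligned,
                               build_binned_index(values, aligned), 2, 8)
```

Both layouts return the same rows, but in this toy setting the aligned boundaries avoid every candidate check, whereas the coarse boundaries force a check of all six rows. A real workload has many queries with differing ranges, which is why boundary placement becomes an optimization problem.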
