Improved Bitmap Indexing Strategy for Data Warehouses

Improving query performance is critical in data warehousing and decision support systems. Researchers have proposed many methods for this purpose, and indexing the data warehouse is a common and effective one. Bitmap indices in particular play an important role in improving query performance in these systems. In this paper we present a new bitmap indexing strategy that can be applied to any existing bitmap compression scheme based on run-length encoding. In most cases, the new strategy requires less space and also provides performance gains. The strategy is tested on two commonly used bitmap compression schemes, namely word-aligned hybrid (WAH) and byte-aligned bitmap code (BBC), and the results are presented graphically. The proposed strategy simply sorts the table on the field on which a bitmap index is to be created. Sorting the field ensures long runs of ones and zeros, which are desirable for any compression scheme based on run-length encoding or its variants, so the space required to store the bitmap indexes drops dramatically. The effect of sorting on query response time is studied for equality and range queries, and a considerable decrease in response time is observed. The overheads of the proposed strategy are sorting a table on a particular field and maintaining the sorted table. These extra tasks can easily be performed during the ETL process or while the data warehouse is offline. The new strategy thus reduces both the space requirement of the bitmap index and the response time of queries, without incurring any processing overhead while the data warehouse is online.
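
The effect of sorting on run lengths can be illustrated with a minimal sketch. It is not the paper's implementation and does not implement WAH or BBC. It assumes a simple equality-encoded bitmap index (one bitmap per distinct value) and uses the number of runs per bitmap as a rough proxy for run-length-encoded size. The function names, the synthetic column, and its cardinality of 8 are hypothetical choices made for illustration.

```python
import random


def build_bitmaps(column):
    """Build one bitmap (a list of 0/1) per distinct value.

    Bit i of the bitmap for value v is 1 when row i holds v.
    """
    return {v: [1 if x == v else 0 for x in column] for v in sorted(set(column))}


def count_runs(bits):
    """Count maximal runs of identical bits.

    Fewer runs means a run-length encoder has fewer (longer) runs to store.
    """
    if not bits:
        return 0
    return 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)


def total_runs(column):
    """Sum the run counts over all bitmaps of the index on this column."""
    return sum(count_runs(bits) for bits in build_bitmaps(column).values())


# Hypothetical synthetic column: 10,000 rows, 8 distinct values.
random.seed(0)
column = [random.randrange(8) for _ in range(10_000)]

unsorted_runs = total_runs(column)
sorted_runs = total_runs(sorted(column))

# After sorting, each bitmap has the shape 0...0 1...1 0...0,
# so it contains at most three runs regardless of the row count.
print("runs without sorting:", unsorted_runs)
print("runs after sorting:  ", sorted_runs)
```

In this sketch the sorted column yields at most three runs per bitmap, whereas the run count of the unsorted column grows with the number of rows. Real WAH and BBC sizes depend on word and byte alignment, so the run count should be read only as an indication of the trend.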
