Block Histogram Compression Method for Selectivity Estimation in High-dimensions

Database query optimates the selectivety of a query to find the most efficient access plan. Multi-dimensional selectivity estimation technique is required for a query with multiple attributes because the attributes are not independent each other. Histogram is practically used in most commercial database products because it approximates data distributions with small overhead and small error rates. However, histogram is inadequate for a query with multiple attributes because it incurs high storage overhead and high error rates. In this paper, we propose a novel method for multi-dimentional selectivity estimation. Compressed information from a large number of small-sized histogram buckets is maintained using the discrete cosine transform. This enables low error rates and low storage overheads even in high dimensions. Extensive experimental results show adventages of the proposed approach.