Selectivity Estimation for Optimizing Similarity Query in Multimedia Databases

In multimedia databases, a fuzzy query is a logical combination of content-based similarity queries on features such as color and texture, which are represented in continuous dimensions. Because these features are intrinsically multi-dimensional, optimizing a fuzzy query requires multi-dimensional selectivity estimation. Histograms are widely used for selectivity estimation, but they have a shortcoming here: a typical similarity query has the shape of a hypersphere and the feature ranges are continuous, which makes its selectivity difficult to estimate with a histogram. In this paper, we propose a curve-fitting method based on the discrete cosine transform (DCT) to estimate the selectivity of a similarity query with a spherical shape in multimedia databases. Experiments show the effectiveness of the proposed method.
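
The abstract does not spell out the estimation procedure, so the following Python sketch is only an illustration of the general idea, not the method evaluated in the paper. It assumes two-dimensional features normalized to [0, 1], a uniform grid histogram as the input to the DCT, truncation to the low-frequency coefficients with u + v <= max_order, and midpoint-rule integration of the fitted density over the circular query region. All function and parameter names are hypothetical.

```python
import math
import random


def build_histogram(points, n):
    """Fraction of points per bucket on an n x n grid over [0, 1]^2."""
    hist = [[0.0] * n for _ in range(n)]
    for x, y in points:
        i = min(int(x * n), n - 1)
        j = min(int(y * n), n - 1)
        hist[i][j] += 1.0 / len(points)
    return hist


def _scale(u, n):
    # Orthonormal DCT-II scaling factor.
    return math.sqrt((1.0 if u == 0 else 2.0) / n)


def dct_coefficients(hist, max_order):
    """2-D DCT-II of the histogram, keeping only terms with u + v <= max_order."""
    n = len(hist)
    coeffs = {}
    for u in range(min(n, max_order + 1)):
        for v in range(min(n, max_order + 1 - u)):
            total = 0.0
            for i in range(n):
                cu = math.cos((2 * i + 1) * u * math.pi / (2 * n))
                for j in range(n):
                    cv = math.cos((2 * j + 1) * v * math.pi / (2 * n))
                    total += hist[i][j] * cu * cv
            coeffs[(u, v)] = _scale(u, n) * _scale(v, n) * total
    return coeffs


def density(coeffs, n, x, y):
    """Inverse DCT evaluated at a continuous point, scaled to a density."""
    total = 0.0
    for (u, v), g in coeffs.items():
        total += (_scale(u, n) * _scale(v, n) * g
                  * math.cos(u * math.pi * x) * math.cos(v * math.pi * y))
    return total * n * n  # bucket mass -> mass per unit area


def estimate_selectivity(coeffs, n, center, radius, steps=100):
    """Integrate the fitted density over the query disc (midpoint rule)."""
    cx, cy = center
    x0, x1 = max(0.0, cx - radius), min(1.0, cx + radius)
    y0, y1 = max(0.0, cy - radius), min(1.0, cy + radius)
    dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
    total = 0.0
    for i in range(steps):
        x = x0 + (i + 0.5) * dx
        for j in range(steps):
            y = y0 + (j + 0.5) * dy
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                total += density(coeffs, n, x, y) * dx * dy
    return min(1.0, max(0.0, total))  # clamp: truncation can over/undershoot


# Usage: synthetic clustered data, compared against the exact fraction.
random.seed(0)
clip = lambda t: min(1.0, max(0.0, t))
data = [(clip(random.gauss(0.4, 0.15)), clip(random.gauss(0.6, 0.15)))
        for _ in range(5000)]
grid = 16
coeffs = dct_coefficients(build_histogram(data, grid), max_order=6)
query_center, query_radius = (0.45, 0.55), 0.2
estimated = estimate_selectivity(coeffs, grid, query_center, query_radius)
actual = sum((x - query_center[0]) ** 2 + (y - query_center[1]) ** 2
             <= query_radius ** 2 for x, y in data) / len(data)
print(f"estimated={estimated:.3f} actual={actual:.3f}")
```

Because the retained coefficients define a smooth function over the continuous feature space, the query region can be integrated directly rather than approximated by whole buckets. Only the truncated coefficient set needs to be stored, and the choice of `max_order` trades storage against accuracy.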