Reduce Pruning Cost to Accelerate Multimedia kNN Search over MBRs Based Index Structures

The MBR (Minimum Bounding Rectangle) is widely used to represent multimedia data objects in multimedia indexing techniques. In kNN search, MINDIST and MINMAXDIST are the most popular pruning metrics employed by MBR-based index structures. However, as dimensionality increases, both metrics become expensive to compute, and the distance computations involved may account for a considerable share of the total search cost. In this paper, to improve the performance of kNN search over multimedia data, we propose an approach that reduces the computation cost of the pruning heuristics by using a cheap upper bound of MINMAXDIST instead of its precise value. Based on this upper bound, we then derive enhanced cheap heuristics for kNN pruning, which help to avoid visiting unnecessary data objects and MBRs. We conducted an extensive experimental study on MSRA-MM 2.0, a large real-life multimedia dataset obtained from the commercial Bing Image search engine. The results show that the proposed approach significantly reduces the computation cost and boosts the overall performance of MBR-based multimedia kNN search.
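
To make the pruning metrics concrete, the following Python sketch computes squared MINDIST and MINMAXDIST for a query point and an axis-aligned MBR, using their standard definitions from the nearest-neighbor literature. The abstract does not state which upper bound of MINMAXDIST the paper uses, so the sketch substitutes MAXDIST (the distance to the farthest corner of the MBR) purely as an illustrative stand-in: it is a valid upper bound that needs no per-dimension minimization, but it is an assumption here and not necessarily the authors' bound. The function names and the 1-NN pruning helper are likewise hypothetical.

```python
from typing import List, Sequence, Tuple

MBR = Tuple[Sequence[float], Sequence[float]]  # (lower corner, upper corner)


def mindist_sq(q: Sequence[float], lo: Sequence[float], hi: Sequence[float]) -> float:
    """Squared MINDIST: distance from q to the nearest point of the MBR."""
    total = 0.0
    for x, l, h in zip(q, lo, hi):
        if x < l:
            total += (l - x) ** 2
        elif x > h:
            total += (x - h) ** 2
    return total


def maxdist_sq(q: Sequence[float], lo: Sequence[float], hi: Sequence[float]) -> float:
    """Squared MAXDIST: distance from q to the farthest corner of the MBR.

    Used here as an assumed cheap upper bound of MINMAXDIST.
    """
    return float(sum(max((x - l) ** 2, (x - h) ** 2) for x, l, h in zip(q, lo, hi)))


def minmaxdist_sq(q: Sequence[float], lo: Sequence[float], hi: Sequence[float]) -> float:
    """Squared MINMAXDIST: for each dimension k, take the nearer face along k
    and the farther face along every other dimension, then minimize over k."""
    far = [max((x - l) ** 2, (x - h) ** 2) for x, l, h in zip(q, lo, hi)]
    near = [min((x - l) ** 2, (x - h) ** 2) for x, l, h in zip(q, lo, hi)]
    far_sum = sum(far)
    return float(min(far_sum - f + n for f, n in zip(far, near)))


def prune_for_1nn(q: Sequence[float], mbrs: List[MBR]) -> List[MBR]:
    """Keep only MBRs whose MINDIST does not exceed the smallest upper bound.

    Assumes every MBR is non-empty, so each one holds an object within MAXDIST.
    """
    bound = min(maxdist_sq(q, lo, hi) for lo, hi in mbrs)
    return [(lo, hi) for lo, hi in mbrs if mindist_sq(q, lo, hi) <= bound]
```

For any query point and MBR, these values satisfy MINDIST ≤ MINMAXDIST ≤ MAXDIST, which is why replacing the exact MINMAXDIST with an upper bound keeps pruning correct while possibly pruning fewer MBRs.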
