Adaptive nearest neighbor search for relevance feedback in large image databases

Relevance feedback is often used to refine similarity retrievals in image and video databases. Typically this involves modifying the similarity metric based on the user's feedback and recomputing a set of nearest neighbors using the modified similarity values. Such nearest neighbor computations are expensive, given that typical image features, such as color and texture, are represented in high dimensional spaces. Search complexity is a critical issue when dealing with large databases, yet it has received little attention in relevance feedback research. Most current methods report results on very small data sets, on the order of a few thousand items, where a sequential (and hence exhaustive) search is practical. The main contribution of this paper is a novel algorithm for adaptive nearest neighbor computation over high dimensional feature vectors when the number of items in the database is large. The proposed method exploits the correlation between two consecutive nearest neighbor searches when the underlying similarity metric is changing, and filters out a significant number of candidates in a two-stage search and retrieval process, thus reducing the number of I/O accesses to the database. Detailed experimental results are provided using a set of about 700,000 images. Comparison with the existing method shows an order-of-magnitude overall improvement.
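
The two-stage idea can be illustrated with a minimal sketch. This is not the paper's algorithm, only one plausible reading of the abstract under stated assumptions: features lie in [0, 1), the metric is a weighted Euclidean distance whose weights change between feedback rounds, and each vector has a coarse grid approximation that yields a lower bound on its distance to the query. All names (CELLS, quantize, lower_bound, adaptive_knn) are hypothetical. The previous round's neighbors are re-scored under the new weights; the largest of those distances is an upper bound on the new k-th neighbor distance, so any item whose lower bound exceeds it can be skipped without fetching its exact vector.

```python
import heapq
import math

CELLS = 16  # grid cells per dimension (assumption: features lie in [0, 1))

def weighted_dist(q, x, w):
    """Exact weighted Euclidean distance between query q and vector x."""
    total = 0.0
    for qi, xi, wi in zip(q, x, w):
        total += wi * (qi - xi) ** 2
    return math.sqrt(total)

def quantize(x):
    """Coarse approximation: the index of the grid cell holding each coordinate."""
    return [min(int(xi * CELLS), CELLS - 1) for xi in x]

def lower_bound(q, cells, w):
    """Lower bound on the weighted distance from q to any vector in these cells."""
    total = 0.0
    for qi, c, wi in zip(q, cells, w):
        lo, hi = c / CELLS, (c + 1) / CELLS
        gap = lo - qi if qi < lo else qi - hi if qi > hi else 0.0
        total += wi * gap * gap
    return math.sqrt(total)

def adaptive_knn(q, w, k, data, approx, prev_ids=None):
    """Return (ids of the k nearest items, number of exact vectors fetched).

    prev_ids: neighbors from the previous feedback round, if any. Their
    distances under the NEW weights bound the new k-th neighbor distance.
    """
    if prev_ids and len(prev_ids) >= k:
        bound = max(weighted_dist(q, data[i], w) for i in prev_ids)
    else:
        bound = math.inf  # first round: nothing to reuse, so no pruning
    # Stage 1: scan only the approximations and drop items that cannot qualify.
    candidates = [i for i, c in enumerate(approx)
                  if lower_bound(q, c, w) <= bound]
    # Stage 2: fetch exact vectors for the surviving candidates only.
    best = heapq.nsmallest(k, candidates,
                           key=lambda i: weighted_dist(q, data[i], w))
    return best, len(candidates)
```

In use, the first call passes no prev_ids and fetches every vector; after the weights are updated from user feedback, the next call passes the earlier result as prev_ids and typically fetches far fewer vectors while returning the same neighbors an exhaustive search would. How much is saved depends on how strongly the weights change between rounds, which is the correlation the abstract refers to.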
