Towards Optimal Indexing for Relevance Feedback in Large Image Databases

Motivated by the need to efficiently leverage user relevance feedback in content-based retrieval from image databases, we propose a fast, clustering-based indexing technique for exact nearest-neighbor search that adapts to the Mahalanobis distance with a varying weight matrix. We derive a basic property of point-to-hyperplane Mahalanobis distance, which enables efficient recalculation of such distances as the Mahalanobis weight matrix is varied. This property is exploited to recalculate bounds on query-cluster distances via projection on known separating hyperplanes (available from the underlying clustering procedure), to effectively eliminate noncompetitive clusters from the search and to retrieve clusters in increasing order of (the appropriate) distance from the query. We compare performance with an existing variant of VA-File indexing designed for relevance feedback, and observe considerable gains.
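
As an illustration of the kind of point-to-hyperplane computation the abstract refers to, the following is a minimal sketch, not the paper's implementation. It assumes the Mahalanobis distance is defined as d_W(x, y) = sqrt((x - y)^T W (x - y)) with W symmetric positive definite, and that a separating hyperplane is given as {x : n·x = b}. Under those assumptions, the standard closed form for the distance from a query q to the hyperplane is |n·q - b| / sqrt(n^T W^{-1} n). The names `solve`, `dot`, and `hyperplane_distance` are hypothetical helpers introduced here for illustration.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [A | b] without mutating the inputs.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def hyperplane_distance(q, n, b, W):
    """Mahalanobis distance from q to {x : n.x = b} under weight matrix W.

    Assumes W is symmetric positive definite (an assumption of this sketch).
    """
    w_inv_n = solve(W, n)  # W^{-1} n, computed without forming the inverse
    return abs(dot(n, q) - b) / math.sqrt(dot(n, w_inv_n))


if __name__ == "__main__":
    q, n, b = [3.0, 4.0], [1.0, 0.0], 0.0
    identity = [[1.0, 0.0], [0.0, 1.0]]
    weighted = [[4.0, 0.0], [0.0, 1.0]]
    print(hyperplane_distance(q, n, b, identity))  # Euclidean case
    print(hyperplane_distance(q, n, b, weighted))  # reweighted case
```

Only the denominator depends on W, so when the weight matrix changes after a round of relevance feedback, the Euclidean numerator |n·q - b| can be reused and only n^T W^{-1} n needs to be recomputed per hyperplane. Because every point of a cluster lying on the far side of a separating hyperplane is at least this far from the query, the value serves as a lower bound on the query-cluster distance.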
