Spatially-aware indexing for image object retrieval

The success of image object retrieval systems relies on the visual bag-of-words paradigm, which allows image retrieval systems to adopt a retrieval strategy analogous to text retrieval. In this paper we propose two spatially-aware retrieval strategies for image object retrieval that replace the vector space model. The advantages of the proposed spatially-aware indexing and retrieval strategies are threefold: (1) they allow for the deployment of small visual vocabularies, (2) they significantly reduce the number of images evaluated at retrieval time, and (3) they eliminate the need for a post-retrieval phase, which is normally used to test the spatial composition of the visual words in the retrieved images. The first spatially-aware retrieval strategy explores the direct neighbourhoods of two local features for common visual words to determine the similarity of the regions surrounding those features. The second strategy embeds the spatial composition of a local feature's neighbourhood directly in the index using edge signatures. Both strategies rely on the coherence of the neighbourhood of points in different images containing similar objects. A comparison of the spatially-aware retrieval strategies against the vector space baseline shows a significant improvement in early precision, together with a significant reduction in the number of candidates to be considered at retrieval time.
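
To illustrate the idea behind the first strategy, the following is a minimal sketch and not the method as implemented in the paper. It assumes that a local feature is a tuple (x, y, visual word id), that the "direct neighbourhood" is the set of the k spatially nearest features, and that similarity is the Jaccard overlap of the visual words found in the two neighbourhoods. The function names, the value of k, and the choice of Jaccard overlap are all hypothetical.

```python
from math import hypot

# Assumed representation: a local feature is (x, y, visual_word_id).


def neighbour_words(features, idx, k):
    """Visual words of the k spatially nearest features to features[idx]."""
    x0, y0, _ = features[idx]
    others = [f for n, f in enumerate(features) if n != idx]
    # Sort the remaining features by Euclidean distance to the reference point.
    others.sort(key=lambda f: hypot(f[0] - x0, f[1] - y0))
    return {word for _, _, word in others[:k]}


def neighbourhood_similarity(features_a, i, features_b, j, k=3):
    """Jaccard overlap of neighbour word sets (an assumed similarity measure).

    Returns 0.0 when the two reference features carry different visual words,
    since only features with a common visual word are compared.
    """
    if features_a[i][2] != features_b[j][2]:
        return 0.0
    words_a = neighbour_words(features_a, i, k)
    words_b = neighbour_words(features_b, j, k)
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0
```

Under these assumptions, two features that share a visual word score highly only when the words around them also agree, which is the neighbourhood coherence that both strategies rely on.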
