Region similarity arrangement for large-scale image retrieval

Abstract We propose a promising method of geometric verification to improve the precision of Bag-of-Words (BoW) model in image retrieval. Most previous methods focus on the positions of interest points or the absolute differences of regions’ scales and angles. In contrast, our method, named Region Similarity Arrangement (RSA), exploits the spatial arrangement of interest regions. For each image, RSA constructs a Region Property Space, regarding each region’s (scale, angle) pair as a point in a polar coordinate system, and encodes the arrangement of these points into the BoW vector. Furthermore, based on the particular distribution of points in Region Property Space, we design a Spatial Weighting to reduce the burstiness phenomenon during query. From experimental results on Holidays, Oxford5K and Paris, RSA could get comparable results with state-of-the-art methods. In addition, RSA increases no extra memory and negligible computational consumption compared with the baseline BoW approach.

[1]  Jian Sun,et al.  Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[2]  Qi Tian,et al.  Contextual Hashing for Large-Scale Image Search , 2014, IEEE Transactions on Image Processing.

[3]  Ricardo da Silva Torres,et al.  Visual word spatial arrangement for image retrieval and classification , 2014, Pattern Recognit..

[4]  Michael Isard,et al.  Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[5]  Martha Larson,et al.  Pairwise geometric matching for large-scale object retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Ming Yang,et al.  Contextual weighting for vocabulary tree based image retrieval , 2011, 2011 International Conference on Computer Vision.

[7]  Victor S. Lempitsky,et al.  Neural Codes for Image Retrieval , 2014, ECCV.

[8]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[9]  Cordelia Schmid,et al.  Scale & Affine Invariant Interest Point Detectors , 2004, International Journal of Computer Vision.

[10]  Qi Tian,et al.  Packing and Padding: Coupled Multi-index for Accurate Image Retrieval , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Cordelia Schmid,et al.  Evaluation of GIST descriptors for web-scale image search , 2009, CIVR '09.

[12]  Jiri Matas,et al.  Efficient representation of local geometry for large scale object retrieval , 2009, CVPR.

[13]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[14]  Michael Isard,et al.  Bundling features for large scale partial-duplicate web image search , 2009, CVPR.

[15]  Victor Lempitsky,et al.  The inverted multi-index , 2012, CVPR.

[16]  Steven C. H. Hoi,et al.  Fast Object Retrieval Using Direct Spatial Matching , 2015, IEEE Transactions on Multimedia.

[17]  Andrew Zisserman,et al.  Three things everyone should know to improve object retrieval , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Andrew Zisserman,et al.  Triangulation Embedding and Democratic Aggregation for Image Search , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Jie Lin,et al.  A practical guide to CNNs and Fisher Vectors for image instance retrieval , 2015, Signal Process..

[20]  Cordelia Schmid,et al.  Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[21]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Changhu Wang,et al.  Spatial-bag-of-features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[23]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[24]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Michael Isard,et al.  Lost in quantization: Improving particular object retrieval in large scale image databases , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[27]  Yannis Avrithis,et al.  Early burst detection for memory-efficient image retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  C. Schmid,et al.  On the burstiness of visual elements , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Zheng Liu,et al.  Integrated Imaging and Vision Techniques for Industrial Inspection: Advances and Applications , 2015 .

[30]  Qi Tian,et al.  Spatial coding for large scale partial-duplicate web image search , 2010, ACM Multimedia.

[31]  Qi Tian,et al.  Embedding spatial context information into inverted filefor large-scale image retrieval , 2012, ACM Multimedia.