Light-weight binary code embedding of local feature distribution in image search

Binary code embedding, which aims to generate compact and discriminative binary codes from local image features, can remarkably improve image search performance by compensating for the quantization error in the Bag-of-Words (BoW) model. However, existing embedding schemes often ignore the relationship between a local feature and its neighbors, even though this spatial distribution information can greatly improve the discriminative ability of binary codes. To this end, this paper proposes two light-weight binary code embedding schemes that take the spatial distribution of local features into account. The two schemes, Content Similarity Embedding (CSE) and Scale Similarity Embedding (SSE), are highly flexible in balancing computational cost against image search performance. Experimental results on several public benchmarks show that, with the two proposed embedding schemes, image search achieves performance comparable to state-of-the-art methods at much lower computational cost and memory usage.
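
To make the role of binary codes in a BoW pipeline concrete, the following is a minimal sketch of generic binary-signature filtering. It is not the CSE or SSE scheme, whose details are not given here. It assumes each local feature has already been assigned a visual word ID, and that a binary code is obtained by thresholding each descriptor dimension. The names `binarize`, `hamming`, `is_match`, the per-dimension `thresholds`, and the `max_distance` parameter are all hypothetical and chosen for illustration only.

```python
from typing import Sequence


def binarize(descriptor: Sequence[float], thresholds: Sequence[float]) -> int:
    """Pack one bit per dimension: 1 if the value exceeds its threshold.

    The thresholds are an assumption of this sketch (e.g. per-dimension
    medians learned offline); the abstract does not specify them.
    """
    code = 0
    for value, threshold in zip(descriptor, thresholds):
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def is_match(word_a: int, code_a: int, word_b: int, code_b: int,
             max_distance: int) -> bool:
    """Two features match only if they share a visual word AND their
    binary codes are close, which filters out false matches that the
    coarse BoW quantization alone would accept."""
    return word_a == word_b and hamming(code_a, code_b) <= max_distance


# Toy 4-dimensional descriptors, all quantized to the same visual word (7).
thresholds = [0.5, 0.5, 0.5, 0.5]
query = binarize([0.9, 0.1, 0.8, 0.2], thresholds)
near = binarize([0.7, 0.2, 0.9, 0.4], thresholds)
far = binarize([0.1, 0.9, 0.2, 0.8], thresholds)

near_ok = is_match(7, query, 7, near, max_distance=1)
far_ok = is_match(7, query, 7, far, max_distance=1)
```

In this toy case all three features fall into the same visual word, so plain BoW would treat them as matches; the binary code check keeps `near` and rejects `far`. A scheme that also encodes the spatial distribution of neighboring features, as the paper proposes, would add further bits to such a code.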
