Collaborative Index Embedding for Image Retrieval

In content-based image retrieval, SIFT feature and the feature from deep convolutional neural network (CNN) have demonstrated promising performance. To fully explore both visual features in a unified framework for effective and efficient retrieval, we propose a collaborative index embedding method to implicitly integrate the index matrices of them. We formulate the index embedding as an optimization problem from the perspective of neighborhood sharing and solve it with an alternating index update scheme. After the iterative embedding, only the embedded CNN index is kept for on-line query, which demonstrates significant gain in retrieval accuracy, with very economical memory cost. Extensive experiments have been conducted on the public datasets with million-scale distractor images. The experimental results reveal that, compared with the recent state-of-the-art retrieval algorithms, our approach achieves competitive accuracy performance with less memory overhead and efficient query computation.

[1]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Qi Tian,et al.  Packing and Padding: Coupled Multi-index for Accurate Image Retrieval , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Qi Tian,et al.  Query-adaptive late fusion for image search and person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Qi Tian,et al.  Lp-Norm IDF for Large Scale Image Search , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Cordelia Schmid,et al.  Improving Bag-of-Features for Large Scale Image Search , 2010, International Journal of Computer Vision.

[6]  Qi Tian,et al.  Towards Codebook-Free: Scalable Cascaded Hashing for Mobile Image Search , 2014, IEEE Transactions on Multimedia.

[7]  Shih-Fu Chang,et al.  Spherical hashing , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Ji Wan,et al.  Deep Learning for Content-Based Image Retrieval: A Comprehensive Study , 2014, ACM Multimedia.

[9]  Qi Tian,et al.  Scalable Feature Matching by Dual Cascaded Scalar Quantization for Image Retrieval , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[11]  Qi Tian,et al.  Image Classification and Retrieval are ONE , 2015, ICMR.

[12]  Xiao Zhang,et al.  QsRank: Query-sensitive hash code ranking for efficient ∊-neighbor search , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Jitendra Malik,et al.  Shape matching and object recognition using shape contexts , 2010, 2010 3rd International Conference on Computer Science and Information Technology.

[14]  Shiliang Zhang,et al.  Cross Indexing With Grouplets , 2015, IEEE Transactions on Multimedia.

[15]  Stefan Carlsson,et al.  CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[16]  Luc Van Gool,et al.  Hello neighbor: Accurate object retrieval with k-reciprocal nearest neighbors , 2011, CVPR 2011.

[17]  Qi Tian,et al.  Accurate Image Search with Multi-Scale Contextual Evidences , 2016, International Journal of Computer Vision.

[18]  Cordelia Schmid,et al.  Local Convolutional Features with Unsupervised Training for Image Retrieval , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[19]  Qi Tian,et al.  Spatial coding for large scale partial-duplicate web image search , 2010, ACM Multimedia.

[20]  Cordelia Schmid,et al.  Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[21]  Qi Tian,et al.  Heterogeneous Graph Propagation for Large-Scale Web Image Search , 2015, IEEE Transactions on Image Processing.

[22]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[23]  Cordelia Schmid,et al.  Accurate Image Search Using the Contextual Dissimilarity Measure , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Ming Yang,et al.  Query Specific Fusion for Image Retrieval , 2012, ECCV.

[25]  Yoshua. Bengio,et al.  Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..

[26]  Andrew Zisserman,et al.  Three things everyone should know to improve object retrieval , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Cordelia Schmid,et al.  Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[28]  Jian Sun,et al.  Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Rainer Lienhart,et al.  Deep networks for image retrieval on large-scale databases , 2008, ACM Multimedia.

[30]  Qi Tian,et al.  Linear Distance Preserving Pseudo-Supervised and Unsupervised Hashing , 2016, ACM Multimedia.

[31]  Ming Yang,et al.  Contextual weighting for vocabulary tree based image retrieval , 2011, 2011 International Conference on Computer Vision.

[32]  Qi Tian,et al.  Scalable Object Retrieval with Compact Image Representation from Generic Object Regions , 2015, ACM Trans. Multim. Comput. Commun. Appl..

[33]  Qi Tian,et al.  Cross-Indexing of Binary SIFT Codes for Large-Scale Image Search , 2014, IEEE Transactions on Image Processing.

[34]  Xiang Zhang,et al.  OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks , 2013, ICLR.

[35]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[36]  Philip H. S. Torr,et al.  BING: Binarized normed gradients for objectness estimation at 300fps , 2019, Computational Visual Media.

[37]  Michael Isard,et al.  Bundling features for large scale partial-duplicate web image search , 2009, CVPR.

[38]  Qi Tian,et al.  Making Residual Vector Distribution Uniform for Distinctive Image Representation , 2016, IEEE Transactions on Circuits and Systems for Video Technology.

[39]  Svetlana Lazebnik,et al.  Multi-scale Orderless Pooling of Deep Convolutional Activation Features , 2014, ECCV.

[40]  Chunyan Miao,et al.  Online multimodal deep similarity learning with application to image retrieval , 2013, ACM Multimedia.

[41]  Shiliang Zhang,et al.  Semantic-Aware Co-indexing for Image Retrieval , 2013, 2013 IEEE International Conference on Computer Vision.

[42]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[43]  Victor S. Lempitsky,et al.  Neural Codes for Image Retrieval , 2014, ECCV.

[44]  Yong Rui,et al.  Image search—from thousands to billions in 20 years , 2013, TOMCCAP.

[45]  Ming Yang,et al.  Query Specific Rank Fusion for Image Retrieval , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[46]  Shih-Fu Chang,et al.  Mobile product search with Bag of Hash Bits and boundary reranking , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[47]  Victor S. Lempitsky,et al.  Aggregating Local Deep Features for Image Retrieval , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[48]  Wei Li,et al.  Partial-Duplicate Clustering and Visual Pattern Discovery on Web Scale Image Database , 2015, IEEE Transactions on Multimedia.

[49]  Jing Huang,et al.  Image indexing using color correlograms , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[50]  Xiao Zhang,et al.  Efficient indexing for large scale visual search , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[51]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[52]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[53]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[54]  Liqing Zhang,et al.  Edgel index for large-scale sketch-based image search , 2011, CVPR 2011.

[55]  Ying Wu,et al.  Object retrieval and localization with spatially-constrained similarity measure and k-NN re-ranking , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[56]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[57]  Qi Tian,et al.  BSIFT: Toward Data-Independent Codebook for Large Scale Image Search , 2015, IEEE Transactions on Image Processing.

[58]  Nicole Immorlica,et al.  Locality-sensitive hashing scheme based on p-stable distributions , 2004, SCG '04.