Search by Detection: Object-Level Feature for Image Retrieval

In content-based image retrieval (CBIR), images are usually represented by local invariant features or global features. Although these representations have been very successful, they still have some non-trivial problems. In this paper, we propose a novel image representation for image retrieval. We identify regions of interest with an advanced general object detector and describe each region with features extracted by a convolutional neural network (CNN). We evaluate the proposed representation on two public datasets, and the experimental results demonstrate its effectiveness.
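As an illustration of how an object-level representation could be used at query time, the following minimal sketch ranks database images by comparing sets of region descriptors. It assumes the region descriptors have already been produced by some detector and CNN, which are not modeled here. The scoring rule (mean over query regions of the best cosine match) and the names `image_similarity` and `rank` are assumptions made for this example, not the method described in the paper.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def l2_normalize(v: Vector) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two descriptors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def image_similarity(query_regions: List[Vector], db_regions: List[Vector]) -> float:
    """Hypothetical scoring rule: for each query region take its best-matching
    database region, then average those best matches."""
    if not query_regions or not db_regions:
        return 0.0
    best = [max(cosine(q, d) for d in db_regions) for q in query_regions]
    return sum(best) / len(best)


def rank(query_regions: List[Vector], database: Dict[str, List[Vector]]) -> List[str]:
    """Return database image names sorted from most to least similar."""
    scores = {name: image_similarity(query_regions, regions)
              for name, regions in database.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy descriptors standing in for CNN features of detected regions (made up).
database = {
    "image_a": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "image_b": [[0.0, 0.0, 1.0]],
}
query = [[0.9, 0.1, 0.0]]
ranking = rank(query, database)
```

In this toy setup, `image_a` contains a region pointing in nearly the same direction as the query region, so it is ranked ahead of `image_b`. A practical system would replace the exhaustive comparison with an index over the region descriptors.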
