Dimensionality reduction of visual features using sparse projectors for content-based image retrieval

In web-scale image retrieval, the most effective strategy is to aggregate local descriptors into a high-dimensional signature and then reduce it to a low-dimensional one. This strategy allows web-scale image databases to be represented with a small index and explored using fast visual similarity computations. However, computing this index is very expensive because the signature projectors are themselves high-dimensional. In this work, we propose a new, efficient method that greatly reduces the signature dimensionality at low computational and storage cost. Our method linearly projects the signature onto a small subspace using a sparse projection matrix. We report experimental results on two standard datasets (Inria Holidays and Oxford) and with 100k distractor images. We show that our method reduces both the storage cost of the projectors and the computational cost of the projection step while incurring only a very slight loss in the mean Average Precision (mAP) of the resulting signatures.
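The cost argument behind a sparse projector can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not say how the sparse projection matrix is obtained, so the random construction below is a placeholder, and the dimensions (4096 inputs, 64 outputs, 32 nonzeros per row) and the helper names `make_sparse_projector`, `project`, and `stored_values` are assumptions chosen for illustration. The sketch only shows that both storage and multiplication count scale with the number of nonzero entries rather than with the full matrix size.

```python
import random


def make_sparse_projector(in_dim, out_dim, nnz_per_row, seed=0):
    """Build a sparse projector as a list of rows of (column, weight) pairs.

    Hypothetical random construction, used here only as a stand-in for
    whatever sparse projector the actual method would produce.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(out_dim):
        cols = sorted(rng.sample(range(in_dim), nnz_per_row))
        rows.append([(c, rng.gauss(0.0, 1.0)) for c in cols])
    return rows


def project(rows, signature):
    """Compute y = P x, touching only the stored nonzero entries of P."""
    return [sum(w * signature[c] for c, w in row) for row in rows]


def stored_values(rows):
    """Number of weights stored, which is also the multiplication count."""
    return sum(len(row) for row in rows)


# Assumed sizes: a 4096-dimensional signature reduced to 64 dimensions.
in_dim, out_dim, nnz_per_row = 4096, 64, 32
P = make_sparse_projector(in_dim, out_dim, nnz_per_row)

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(in_dim)]  # stand-in signature
y = project(P, x)

dense_cost = in_dim * out_dim    # weights and multiplications, dense matrix
sparse_cost = stored_values(P)   # weights and multiplications, sparse matrix
print(len(y), dense_cost, sparse_cost)
```

With these assumed sizes the dense projector would store 262,144 weights, while the sparse one stores 2,048 weights plus their column indices. Whether retrieval quality survives such sparsity depends on how the projector is built, which this sketch does not address.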
