SSP: Supervised Sparse Projections for Large-Scale Retrieval in High Dimensions

As “big data” transforms the way we solve computer vision problems, the question of how to leverage large labelled databases efficiently becomes increasingly important. High-dimensional features, such as the convolutional neural network activations that drive many leading recognition frameworks, pose particular challenges for efficient retrieval. We present a novel method for learning compact binary codes in which the conventional dense projection matrix is replaced with a discriminatively trained sparse projection matrix. The proposed method encodes two to three times faster than modern dense binary encoding methods while obtaining comparable retrieval accuracy on the SUN RGB-D, AwA, and ImageNet datasets. The method is also more accurate than unsupervised high-dimensional binary encoding methods at similar encoding speeds.
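To make the encoding step concrete, the following is a minimal sketch of binary encoding with a sparse projection followed by Hamming-distance ranking. It is not the trained model described above: the projection here is random rather than discriminatively learned, and the names `make_sparse_projection`, `encode`, and `hamming_rank`, the fixed number of nonzeros per row, and all sizes are assumptions chosen for illustration. The point it shows is only that each bit touches `nnz_per_row` input dimensions instead of all `dim`, which is where a sparse projection can save encoding time.

```python
import numpy as np

def make_sparse_projection(n_bits, dim, nnz_per_row, rng):
    """Random sparse projection, stored row-wise as (column indices, values).

    Assumption: a random matrix stands in for a learned one.
    """
    idx = np.stack([rng.choice(dim, size=nnz_per_row, replace=False)
                    for _ in range(n_bits)])
    vals = rng.standard_normal((n_bits, nnz_per_row))
    return idx, vals

def encode(x, idx, vals):
    """Binary codes b = [W x > 0], reading only the nonzero entries of W."""
    x = np.atleast_2d(x)
    # x[:, idx] has shape (n_samples, n_bits, nnz_per_row)
    return ((x[:, idx] * vals).sum(axis=-1) > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Rank database codes by Hamming distance to a single query code."""
    dists = (db_codes != query_code).sum(axis=1)
    return np.argsort(dists, kind="stable"), dists

# Illustrative sizes only (hypothetical, not taken from the paper).
rng = np.random.default_rng(0)
dim, n_bits, nnz_per_row = 4096, 64, 32
idx, vals = make_sparse_projection(n_bits, dim, nnz_per_row, rng)

db = rng.standard_normal((200, dim))   # stand-in for high-dimensional features
codes = encode(db, idx, vals)          # shape (200, 64), entries in {0, 1}

query_code = encode(db[7], idx, vals)[0]
order, dists = hamming_rank(query_code, codes)
```

In this sketch each code costs `n_bits * nnz_per_row` multiply-adds rather than `n_bits * dim` for a dense matrix. Actual speedups depend on the sparsity level and on the implementation, so the ratio here should not be read as a reproduction of the reported figures.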
