Product Sparse Coding

Sparse coding is a widely involved technique in computer vision. However, the expensive computational cost can hamper its applications, typically when the codebook size must be limited due to concerns on running time. In this paper, we study a special case of sparse coding in which the codebook is a Cartesian product of two subcodebooks. We present algorithms to decompose this sparse coding problem into smaller subproblems, which can be separately solved. Our solution, named as Product Sparse Coding (PSC), reduces the time complexity from O(K) to O(√K) in the codebook size K. In practice, this can be 20-100× faster than standard sparse coding. In experiments we demonstrate the efficiency and quality of this method on the applications of image classification and image retrieval.

[1]  R. Gray,et al.  Vector quantization , 1984, IEEE ASSP Magazine.

[2]  David J. Field,et al.  Sparse coding with an overcomplete basis set: A strategy employed by V1? , 1997, Vision Research.

[3]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[4]  R. Tibshirani,et al.  Least angle regression , 2004, math/0406456.

[5]  B. Turlach Discussion of "Least Angle Regression" by Efron, Hastie, Johnstone and Tibshirani , 2004 .

[6]  Pietro Perona,et al.  Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[7]  A. Bruckstein,et al.  K-SVD : An Algorithm for Designing of Overcomplete Dictionaries for Sparse Representation , 2005 .

[8]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[9]  M. Elad,et al.  $rm K$-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation , 2006, IEEE Transactions on Signal Processing.

[10]  Rajat Raina,et al.  Efficient sparse coding algorithms , 2006, NIPS.

[11]  Florent Perronnin,et al.  Fisher Kernels on Visual Vocabularies for Image Categorization , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Cordelia Schmid,et al.  Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[13]  Guillermo Sapiro,et al.  Discriminative learned dictionaries for local image analysis , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[15]  Thomas S. Huang,et al.  Image super-resolution as sparse representation of raw image patches , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Allen Y. Yang,et al.  Robust Face Recognition via Sparse Representation , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, CVPR.

[18]  Yihong Gong,et al.  Locality-constrained Linear Coding for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[19]  Cordelia Schmid,et al.  Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[20]  Cordelia Schmid,et al.  Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Andrew Y. Ng,et al.  The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization , 2011, ICML.

[22]  Andrew Zisserman,et al.  The devil is in the details: an evaluation of recent feature encoding methods , 2011, BMVC.

[23]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[24]  Florent Perronnin,et al.  High-dimensional signature compression for large-scale image classification , 2011, CVPR 2011.

[25]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[26]  Andrew Zisserman,et al.  Sparse kernel approximations for efficient classification and detection , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Cordelia Schmid,et al.  Aggregating Local Image Descriptors into Compact Codes , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Victor Lempitsky,et al.  The inverted multi-index , 2012, CVPR.

[29]  Andrew Zisserman,et al.  Three things everyone should know to improve object retrieval , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Jian Sun,et al.  Joint Inverted Indexing , 2013, 2013 IEEE International Conference on Computer Vision.

[31]  Jian Sun,et al.  Sparse-Coded Features for Image Retrieval , 2013, BMVC.

[32]  David J. Fleet,et al.  Cartesian K-Means , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Jian Sun,et al.  Optimized Product Quantization for Approximate Nearest Neighbor Search , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[34]  Jian Sun,et al.  Optimized Product Quantization , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.