Sparse Quantization for Patch Description

The representation of local image patches is crucial to the performance and efficiency of many vision tasks. Patch descriptors have been designed to generalize across diverse variations, depending on the application and on the desired trade-off between accuracy and efficiency. We present a novel formulation of patch description that addresses these concerns well. Sparse quantization lies at its heart. It allows for efficient encodings, leading to powerful, novel binary descriptors, and also generalizes existing descriptors such as SIFT and BRIEF. We demonstrate the capabilities of our formulation for both keypoint matching and image classification. Our binary descriptors achieve state-of-the-art results on two keypoint matching benchmarks, namely those by Brown and Mikolajczyk. For image classification, we propose new descriptors that perform similarly to SIFT on Caltech101 and PASCAL VOC07.
