CHoG: Compressed histogram of gradients, a low bit-rate feature descriptor

Establishing visual correspondences is an essential component of many computer vision problems, and is often done with robust local feature descriptors. Transmitting and storing these descriptors is critically important in mobile distributed camera networks and in large indexing problems. We propose a framework for computing low bit-rate feature descriptors with a 20× reduction in bit rate. The framework has low complexity and yields a significant speed-up in the matching stage. We represent gradient histograms as tree structures, which can be compressed efficiently. We show how to compute distances between descriptors efficiently in their compressed representation, eliminating the need for decoding. We perform a comprehensive performance comparison with SIFT, SURF, and other low bit-rate descriptors, and show that our proposed CHoG descriptor outperforms existing schemes.
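
The idea of matching descriptors without decoding them can be illustrated with a small sketch. It assumes each spatial cell's gradient histogram is quantized to the nearest entry of a fixed codebook, so that a descriptor becomes a short list of integer indices, and that distances between codewords are precomputed in a lookup table. The codebook construction, the symmetric KL distance, and all function names here are illustrative assumptions; this is not the tree-based coding described in the abstract.

```python
import math
from itertools import product


def kl_divergence(p, q, eps=1e-9):
    """KL divergence between two distributions, smoothed to avoid log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """Symmetrized KL divergence (an assumed choice of histogram distance)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


def build_codebook(bins, levels):
    """All distributions over `bins` bins whose entries are multiples of 1/levels."""
    codebook = []
    for counts in product(range(levels + 1), repeat=bins):
        if sum(counts) == levels:
            codebook.append(tuple(c / levels for c in counts))
    return codebook


def encode(hist, codebook):
    """Return the index of the codeword closest (in L1) to the normalized histogram."""
    total = sum(hist)
    p = [h / total for h in hist]
    return min(
        range(len(codebook)),
        key=lambda i: sum(abs(a - b) for a, b in zip(p, codebook[i])),
    )


def build_distance_table(codebook):
    """Precompute the distance between every pair of codewords."""
    n = len(codebook)
    return [[symmetric_kl(codebook[i], codebook[j]) for j in range(n)] for i in range(n)]


def descriptor_distance(code_a, code_b, table):
    """Distance between two encoded descriptors using table lookups only."""
    return sum(table[a][b] for a, b in zip(code_a, code_b))


# Hypothetical setup: 4 gradient bins per cell, entries quantized to quarters.
codebook = build_codebook(bins=4, levels=4)
table = build_distance_table(codebook)
bits_per_cell = math.ceil(math.log2(len(codebook)))

# Two toy descriptors, each made of two spatial cells of raw gradient counts.
desc_a = [encode(h, codebook) for h in ([8, 1, 0, 1], [2, 2, 3, 3])]
desc_b = [encode(h, codebook) for h in ([1, 0, 9, 0], [2, 2, 3, 3])]
print(bits_per_cell, descriptor_distance(desc_a, desc_b, table))
```

In this sketch the matching stage touches only integer indices and the precomputed table, which is the property that makes decoding unnecessary. The table grows with the square of the codebook size, so the approach only pays off when the per-cell codebook is small.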
