Boosting Binary Keypoint Descriptors

Binary key point descriptors provide an efficient alternative to their floating-point competitors as they enable faster processing while requiring less memory. In this paper, we propose a novel framework to learn an extremely compact binary descriptor we call Bin Boost that is very robust to illumination and viewpoint changes. Each bit of our descriptor is computed with a boosted binary hash function, and we show how to efficiently optimize the different hash functions so that they complement each other, which is key to compactness and robustness. The hash functions rely on weak learners that are applied directly to the image patches, which frees us from any intermediate representation and lets us automatically learn the image gradient pooling configuration of the final descriptor. Our resulting descriptor significantly outperforms the state-of-the-art binary descriptors and performs similarly to the best floating-point descriptors at a fraction of the matching time and memory footprint.

[1]  Vincent Lepetit,et al.  BRIEF: Computing a Local Binary Descriptor Very Fast , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Vincent Lepetit,et al.  Learning Image Descriptors with the Boosting-Trick , 2012, NIPS.

[3]  Geoffrey E. Hinton,et al.  Semantic hashing , 2009, Int. J. Approx. Reason..

[4]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[5]  Paul A. Viola,et al.  Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[6]  Ji Zhu,et al.  Boosting as a Regularized Path to a Maximum Margin Classifier , 2004, J. Mach. Learn. Res..

[7]  Rongrong Ji,et al.  Supervised hashing with kernels , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Vincent Lepetit,et al.  Efficient Discriminative Projections for Compact Binary Descriptors , 2012, ECCV.

[9]  Ya Zhang,et al.  Boosted multi-task learning , 2010, Machine Learning.

[10]  Trevor Darrell,et al.  The pyramid match kernel: discriminative classification with sets of image features , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[11]  Antonio Torralba,et al.  Multidimensional Spectral Hashing , 2012, ECCV.

[12]  Trevor Darrell,et al.  Learning to Hash with Binary Reconstructive Embeddings , 2009, NIPS.

[13]  David J. Fleet,et al.  Minimal Loss Hashing for Compact Binary Codes , 2011, ICML.

[14]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[15]  David J. Fleet,et al.  Fast search in Hamming space with multi-index hashing , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Rabab Kreidieh Ward,et al.  A Fast Approximate Nearest Neighbor Search Algorithm in the Hamming Space , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Ethan Rublee,et al.  ORB: An efficient alternative to SIFT or SURF , 2011, 2011 International Conference on Computer Vision.

[18]  Bernd Girod,et al.  CHoG: Compressed histogram of gradients A low bit-rate feature descriptor , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Christopher Hunt,et al.  Notes on the OpenSURF Library , 2009 .

[20]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[21]  Shih-Fu Chang,et al.  Sequential Projection Learning for Hashing with Compact Codes , 2010, ICML.

[22]  Roland Siegwart,et al.  BRISK: Binary Robust invariant scalable keypoints , 2011, 2011 International Conference on Computer Vision.

[23]  Andrew Zisserman,et al.  Descriptor Learning Using Convex Optimisation , 2012, ECCV.

[24]  Pascal Fua,et al.  A Real-Time Deformable Detector , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Gregory Shakhnarovich,et al.  Learning task-specific similarity , 2005 .

[26]  Yoram Singer,et al.  Improved Boosting Algorithms Using Confidence-rated Predictions , 1998, COLT' 98.

[27]  C. Lawrence Zitnick,et al.  Binary Coherent Edge Descriptors , 2010, ECCV.

[28]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[29]  Eric R. Ziegel,et al.  The Elements of Statistical Learning , 2003, Technometrics.

[30]  Cordelia Schmid,et al.  Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[31]  Pietro Perona,et al.  Integral Channel Features , 2009, BMVC.

[32]  Antonio Torralba,et al.  Small codes and large image databases for recognition , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Guanghui Wang,et al.  Quasi-perspective Projection Model: Theory and Application to Structure and Motion Factorization from Uncalibrated Image Sequences , 2010, International Journal of Computer Vision.

[34]  Gang Hua,et al.  Discriminative Learning of Local Image Descriptors , 1990, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[35]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[36]  Inderjit S. Dhillon,et al.  Metric and Kernel Learning Using a Linear Transformation , 2009, J. Mach. Learn. Res..

[37]  Pascal Fua,et al.  LDAHash: Improved Matching with Smaller Descriptors , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.