Supervised Hashing Using Graph Cuts and Boosted Decision Trees

To build large-scale query-by-example image retrieval systems, embedding image features into a binary Hamming space provides great benefits. Supervised hashing aims to map the original features to compact binary codes that are able to preserve label based similarity in the binary Hamming space. Most existing approaches apply a single form of hash function, and an optimization process which is typically deeply coupled to this specific form. This tight coupling restricts the flexibility of those methods, and can result in complex optimization problems that are difficult to solve. In this work we proffer a flexible yet simple framework that is able to accommodate different types of loss functions and hash functions. The proposed framework allows a number of existing approaches to hashing to be placed in context, and simplifies the development of new problem-specific hashing methods. Our framework decomposes the hashing learning problem into two steps: binary code (hash bit) learning and hash function learning. The first step can typically be formulated as binary quadratic problems, and the second step can be accomplished by training a standard binary classifier. For solving large-scale binary code inference, we show how it is possible to ensure that the binary quadratic problems are submodular such that efficient graph cut methods may be used. To achieve efficiency as well as efficacy on large-scale high-dimensional data, we propose to use boosted decision trees as the hash functions, which are nonlinear, highly descriptive, and are very fast to train and evaluate. Experiments demonstrate that the proposed method significantly outperforms most state-of-the-art methods, especially on high-dimensional data.

[1]  Olga Veksler,et al.  Fast approximate energy minimization via graph cuts , 2001, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[2]  Vladimir Kolmogorov,et al.  Optimizing Binary MRFs via Extended Roof Duality , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Shih-Fu Chang,et al.  Spherical hashing , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[5]  Piotr Indyk,et al.  Similarity Search in High Dimensions via Hashing , 1999, VLDB.

[6]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[7]  David Suter,et al.  Fast Supervised Hashing with Decision Trees for High-Dimensional Data , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Csaba Szepesvári,et al.  Deep Representations and Codes for Image Auto-Annotation , 2012, NIPS.

[9]  Jorge Nocedal,et al.  Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization , 1997, TOMS.

[10]  Fumin Shen,et al.  Inductive Hashing on Manifolds , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[12]  Mark J. Huiskes,et al.  The MIR flickr retrieval evaluation , 2008, MIR '08.

[13]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[14]  Cordelia Schmid,et al.  TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[15]  Deng Cai,et al.  Extensions to Self-Taught Hashing: Kernelisation and Supervision , 2010 .

[16]  David J. Fleet,et al.  Hamming Distance Metric Learning , 2012, NIPS.

[17]  Pascal Fua,et al.  LDAHash: Improved Matching with Smaller Descriptors , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Wei Liu,et al.  Hashing with Graphs , 2011, ICML.

[19]  J. Friedman Special Invited Paper-Additive logistic regression: A statistical view of boosting , 2000 .

[20]  Sanjiv Kumar,et al.  Angular Quantization-based Binary Codes for Fast Similarity Search , 2012, NIPS.

[21]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[22]  Antonio Torralba,et al.  Multidimensional Spectral Hashing , 2012, ECCV.

[23]  Jian Sun,et al.  K-Means Hashing: An Affinity-Preserving Quantization Method for Learning Binary Compact Codes , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Trevor Darrell,et al.  Learning to Hash with Binary Reconstructive Embeddings , 2009, NIPS.

[25]  David J. Fleet,et al.  Minimal Loss Hashing for Compact Binary Codes , 2011, ICML.

[26]  Rongrong Ji,et al.  Supervised hashing with kernels , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Geoffrey E. Hinton,et al.  Learning a Nonlinear Embedding by Preserving Class Neighbourhood Structure , 2007, AISTATS.

[28]  Krista A. Ehinger,et al.  SUN database: Large-scale scene recognition from abbey to zoo , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[29]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[30]  Guosheng Lin,et al.  Learning Hash Functions Using Column Generation , 2013, ICML.

[31]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[32]  Antonio Torralba,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition , 2022 .

[33]  VekslerOlga,et al.  Fast Approximate Energy Minimization via Graph Cuts , 2001 .

[34]  Yihong Gong,et al.  Learning to Search Efficiently in High Dimensions , 2011, NIPS.

[35]  Miguel Á. Carreira-Perpiñán,et al.  The Elastic Embedding Algorithm for Dimensionality Reduction , 2010, ICML.

[36]  J. Besag On the Statistical Analysis of Dirty Pictures , 1986 .

[37]  Pietro Perona,et al.  Quickly Boosting Decision Trees - Pruning Underachieving Features Early , 2013, ICML.

[38]  Antonio Torralba,et al.  Small codes and large image databases for recognition , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[39]  Jun Wang,et al.  Self-taught hashing for fast similarity search , 2010, SIGIR.

[40]  Shih-Fu Chang,et al.  Semi-Supervised Hashing for Large-Scale Search , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[41]  Koby Crammer,et al.  Breaking the curse of kernelization: budgeted stochastic gradient descent for large-scale SVM training , 2012, J. Mach. Learn. Res..

[42]  Jianxin Wu,et al.  Optimizing Ranking Measures for Compact Binary Code Learning , 2014, ECCV.

[43]  Jonathon Shlens,et al.  Fast, Accurate Detection of 100,000 Object Classes on a Single Machine , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[44]  Andrew Y. Ng,et al.  The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization , 2011, ICML.

[45]  Kristen Grauman,et al.  Kernelized Locality-Sensitive Hashing , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[46]  David Suter,et al.  A General Two-Step Approach to Learning-Based Hashing , 2013, 2013 IEEE International Conference on Computer Vision.

[47]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.