Quantization-based hashing: a general framework for scalable image and video retrieval

As far as we know, we are the first to propose a general framework to incorporate the quantization-based methods into the conventional similarity-preserving hashing, in order to improve the effectiveness of hashing methods. In theory, any quantization method can be adopted to reduce the quantization error of any similarity-preserving hashing methods to improve their performance.This framework can be applied to both unsupervised and supervised hashing. We experimentally obtained the best performance compared to state-ofthe-art supervised and unsupervised hashing methods on six popular datasets.We successfully show it to work on a huge dataset SIFT1B (1 billion data points) by utilizing the graph approximation and out-of-sample extension. Nowadays, due to the exponential growth of user generated images and videos, there is an increasing interest in learning-based hashing methods. In computer vision, the hash functions are learned in such a way that the hash codes can preserve essential properties of the original space (or label information). Then the Hamming distance of the hash codes can approximate the data similarity. On the other hand, vector quantization methods quantize the data into different clusters based on the criteria of minimal quantization error, and then perform the search using look-up tables. While hashing methods using Hamming distance can achieve faster search speed, their accuracy is often outperformed by quantization methods with the same code length, due to the low quantization error and more flexible distance lookups. To improve the effectiveness of the hashing methods, in this work, we propose Quantization-based Hashing (QBH), a general framework which incorporates the advantages of quantization error reduction methods into conventional property preserving hashing methods. The learned hash codes simultaneously preserve the properties in the original space and reduce the quantization error, and thus can achieve better performance. Furthermore, the hash functions and a quantizer can be jointly learned and iteratively updated in a unified framework, which can be readily used to generate hash codes or quantize new data points. Importantly, QBH is a generic framework that can be integrated to different property preserving hashing methods and quantization strategies, and we apply QBH to both unsupervised and supervised hashing models as showcases in this paper. Experimental results on three large-scale unlabeled datasets (i.e., SIFT1M, GIST1M, and SIFT1B), three labeled datastes (i.e., ESPGAME, IAPRTC and MIRFLICKR) and one video dataset (UQ_VIDEO) demonstrate the superior performance of our QBH over existing unsupervised and supervised hashing methods.

[1]  Jiwen Lu,et al.  Deep hashing for compact binary codes learning , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Cordelia Schmid,et al.  Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[4]  Nicu Sebe,et al.  Optimized Graph Learning Using Partial Tags and Multiple Features for Image and Video Annotation , 2016, IEEE Transactions on Image Processing.

[5]  Zi Huang,et al.  Linear cross-modal hashing for efficient multimedia search , 2013, ACM Multimedia.

[6]  Jian Sun,et al.  Graph Cuts for Supervised Binary Coding , 2014, ECCV.

[7]  Jinhui Tang,et al.  Sparse composite quantization , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Zi Huang,et al.  Effective Multiple Feature Hashing for Large-Scale Near-Duplicate Video Retrieval , 2013, IEEE Transactions on Multimedia.

[9]  Prateek Jain,et al.  Fast image search for learned metrics , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Huanbo Luan,et al.  Discrete Collaborative Filtering , 2016, SIGIR.

[11]  David G. Lowe,et al.  Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration , 2009, VISAPP.

[12]  Nenghai Yu,et al.  Order preserving hashing for approximate nearest neighbor search , 2013, ACM Multimedia.

[13]  Xianglong Liu,et al.  Collaborative Hashing , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Jinhui Tang,et al.  Supervised Quantization for Similarity Search , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Nicu Sebe,et al.  Optimal graph learning with partial tags and multiple features for image and video annotation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  David J. Fleet,et al.  Cartesian K-Means , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Rongrong Ji,et al.  Supervised hashing with kernels , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Jun Wang,et al.  Self-taught hashing for fast similarity search , 2010, SIGIR.

[19]  Shih-Fu Chang,et al.  Semi-Supervised Hashing for Large-Scale Search , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Larry S. Davis,et al.  Learning predictable binary codes for face indexing , 2015, Pattern Recognit..

[21]  Lei Huang,et al.  Multi-View Complementary Hash Tables for Nearest Neighbor Search , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[22]  David J. Fleet,et al.  Fast Exact Search in Hamming Space With Multi-Index Hashing , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Wei Liu,et al.  Hashing with Graphs , 2011, ICML.

[24]  Meng Wang,et al.  Play and Rewind: Optimizing Binary Representations of Videos by Self-Supervised Temporal Hashing , 2016, ACM Multimedia.

[25]  Nicu Sebe,et al.  Supervised Hashing with Pseudo Labels for Scalable Multimedia Retrieval , 2015, ACM Multimedia.

[26]  Zi Huang,et al.  Robust Hashing With Local Models for Approximate Similarity Search , 2014, IEEE Transactions on Cybernetics.

[27]  Huchuan Lu,et al.  Video anomaly detection based on locality sensitive hashing filters , 2016, Pattern Recognit..

[28]  Victor S. Lempitsky,et al.  Tree quantization for large-scale similarity search and classification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Patrick P. K. Chan,et al.  Asymmetric Cyclical Hashing for Large Scale Image Retrieval , 2015, IEEE Transactions on Multimedia.

[30]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[31]  Zi Huang,et al.  Inter-media hashing for large-scale retrieval from heterogeneous data sources , 2013, SIGMOD '13.

[32]  Richard I. Hartley,et al.  Optimised KD-trees for fast image descriptor matching , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Nicu Sebe,et al.  A Survey on Learning to Hash , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[34]  Hanjiang Lai,et al.  Simultaneous feature learning and hash coding with deep neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Jingkuan Song,et al.  Scalable Multimedia Retrieval by Deep Learning Hashing with Relative Similarity Learning , 2015, ACM Multimedia.

[36]  Naiyan Wang,et al.  Trinary-Projection Trees for Approximate Nearest Neighbor Search , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37]  Cordelia Schmid,et al.  TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[38]  Zi Huang,et al.  A Sparse Embedding and Least Variance Encoding Approach to Hashing , 2014, IEEE Transactions on Image Processing.

[39]  Hanjiang Lai,et al.  Supervised Hashing for Image Retrieval via Image Representation Learning , 2014, AAAI.

[40]  Jiwen Lu,et al.  Simultaneous Local Binary Feature Learning and Encoding for Face Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[41]  Heng Tao Shen,et al.  Optimized Cartesian K-Means , 2014, IEEE Transactions on Knowledge and Data Engineering.

[42]  Nicu Sebe,et al.  Compact Image Fingerprint Via Multiple Kernel Hashing , 2015, IEEE Transactions on Multimedia.

[43]  Chun Chen,et al.  Harmonious Hashing , 2013, IJCAI.

[44]  Jiwen Lu,et al.  Learning Compact Binary Face Descriptor for Face Recognition , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[45]  Wei Liu,et al.  Supervised Discrete Hashing , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Miguel Á. Carreira-Perpiñán,et al.  Hashing with binary autoencoders , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  David Suter,et al.  Fast Supervised Hashing with Decision Trees for High-Dimensional Data , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[48]  Jian Sun,et al.  K-Means Hashing: An Affinity-Preserving Quantization Method for Learning Binary Compact Codes , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[49]  David J. Fleet,et al.  Hamming Distance Metric Learning , 2012, NIPS.

[50]  Pascal Fua,et al.  LDAHash: Improved Matching with Smaller Descriptors , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[51]  Zhen Wang,et al.  The ordinal relation preserving binary codes , 2015, Pattern Recognit..

[52]  Trevor Darrell,et al.  Learning to Hash with Binary Reconstructive Embeddings , 2009, NIPS.

[53]  David J. Fleet,et al.  Minimal Loss Hashing for Compact Binary Codes , 2011, ICML.