A fast online spherical hashing method based on data sampling for large scale image retrieval

Abstract: Hashing methods are widely used for approximate nearest neighbor search because binary codes require little storage and Hamming distances are fast to compute. However, in most hashing methods, learning the hash functions is costly in both time and storage. To overcome this issue, this paper proposes a fast online unsupervised hashing method based on data sampling that learns hypersphere-based hash functions from streaming data. The method maintains a small data sample that efficiently preserves the properties of the streaming data, learns the hypersphere-based hash functions online from this sample, and justifies the hash functions by proving their theoretical properties. To further improve search accuracy, a new dimensionality reduction algorithm learns a projection matrix from the data sample to construct a low-dimensional space. The data sample is then projected into this space, so the hash functions can be learned online from the small projected sample with low computational complexity and storage cost. Experiments show that the method achieves better search accuracy than other online hashing methods and learns the hash functions faster.
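The general idea of hashing from a small sample of a stream can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the sampling scheme or how the hyperspheres are learned, so the sketch assumes classic reservoir sampling for the sample and a simple heuristic for the spheres (centers drawn from the sample, radius set to the median distance so that roughly half the sample falls inside each sphere). All names (`reservoir_update`, `fit_spheres`, `encode`, `hamming`) and parameter values are hypothetical, and the dimensionality reduction step is omitted.

```python
import math
import random


def reservoir_update(sample, item, seen, capacity, rng):
    """One step of reservoir sampling; `seen` counts items so far, including `item`."""
    if len(sample) < capacity:
        sample.append(item)
    else:
        j = rng.randrange(seen)  # uniform integer in [0, seen)
        if j < capacity:
            sample[j] = item  # keep `item` with probability capacity / seen


def fit_spheres(sample, n_bits, rng):
    """Assumed heuristic: one sphere per bit, centered on a sampled point,
    with the median distance to the sample as its radius."""
    spheres = []
    for center in rng.sample(sample, n_bits):
        dists = sorted(math.dist(center, x) for x in sample)
        spheres.append((center, dists[len(dists) // 2]))
    return spheres


def encode(x, spheres):
    """Bit k is 1 when x lies inside (or on) sphere k, else 0."""
    return [1 if math.dist(x, c) <= r else 0 for c, r in spheres]


def hamming(a, b):
    """Number of positions where two binary codes differ."""
    return sum(u != v for u, v in zip(a, b))


# Simulated stream of 1000 random 8-dimensional points (illustrative data only).
rng = random.Random(0)
capacity, n_bits, dim = 50, 16, 8
sample = []
for seen in range(1, 1001):
    point = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    reservoir_update(sample, point, seen, capacity, rng)

# Fit the hash functions on the sample only, never on the full stream.
spheres = fit_spheres(sample, n_bits, rng)
query = [rng.gauss(0.0, 1.0) for _ in range(dim)]
code = encode(query, spheres)
nearest = min(sample, key=lambda x: hamming(code, encode(x, spheres)))
```

Because the spheres are fit on the fixed-size sample, the cost of refitting depends on the sample size rather than on the length of the stream, which is the property the abstract emphasizes.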
