Post Tuned Hashing: A New Approach to Indexing High-dimensional Data

Learning to hash has proven to be an effective solution for indexing high-dimensional data by projecting them to similarity-preserving binary codes. However, most existing methods end up the learning scheme with a binarization stage, i.e. binary quantization, which inevitably destroys the neighborhood structure of original data. As a result, those methods still suffer from great similarity loss and result in unsatisfactory indexing performance. In this paper we propose a novel hashing model, namely Post Tuned Hashing (PTH), which includes a new post-tuning stage to refine the binary codes after binarization. The post-tuning seeks to rebuild the destroyed neighborhood structure, and hence significantly improves the indexing performance. We cast the post-tuning into a binary quadratic optimization framework and, despite its NP-hardness, give a practical algorithm to efficiently obtain a high-quality solution. Experimental results on five noted image benchmarks show that our PTH improves previous state-of-the-art methods by 13-58% in mean average precision.

[1]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[2]  Alexandr Andoni,et al.  Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[3]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[4]  Trevor Darrell,et al.  Learning to Hash with Binary Reconstructive Embeddings , 2009, NIPS.

[5]  Hanjiang Lai,et al.  Supervised Hashing for Image Retrieval via Image Representation Learning , 2014, AAAI.

[6]  David J. Fleet,et al.  Minimal Loss Hashing for Compact Binary Codes , 2011, ICML.

[7]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[8]  Yongdong Zhang,et al.  Topology preserving hashing for similarity search , 2013, MM '13.

[9]  Nicu Sebe,et al.  A Survey on Learning to Hash , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Piotr Indyk,et al.  Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.

[11]  Shih-Fu Chang,et al.  Locally Linear Hashing for Extracting Non-linear Manifolds , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Cordelia Schmid,et al.  Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Zhi-Quan Luo,et al.  Semidefinite Relaxation of Quadratic Optimization Problems , 2010, IEEE Signal Processing Magazine.

[14]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[15]  Shih-Fu Chang,et al.  Semi-Supervised Hashing for Large-Scale Search , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Wei Liu,et al.  Hashing with Graphs , 2011, ICML.

[17]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[18]  Mohan S. Kankanhalli,et al.  Hierarchical Clustering Multi-Task Learning for Joint Human Action Grouping and Recognition , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Yue Gao,et al.  Multi-Modal Clique-Graph Matching for View-Based 3D Model Retrieval , 2016, IEEE Transactions on Image Processing.

[20]  Tieniu Tan,et al.  Deep semantic ranking based hashing for multi-label image retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Wu-Jun Li,et al.  Double-Bit Quantization for Hashing , 2012, AAAI.

[22]  Svetlana Lazebnik,et al.  Locality-sensitive binary codes from shift-invariant kernels , 2009, NIPS.

[23]  Geoffrey E. Hinton,et al.  Semantic hashing , 2009, Int. J. Approx. Reason..

[24]  Rongrong Ji,et al.  Supervised hashing with kernels , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Shiguang Shan,et al.  Deep Supervised Hashing for Fast Image Retrieval , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Wu-Jun Li,et al.  Isotropic Hashing , 2012, NIPS.

[27]  Gaofeng Meng,et al.  AMVH: Asymmetric Multi-Valued hashing , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Wei Liu,et al.  Supervised Discrete Hashing , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Miguel Á. Carreira-Perpiñán,et al.  Hashing with binary autoencoders , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Wei-Ying Ma,et al.  AnnoSearch: Image Auto-Annotation by Search , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[31]  Wei Liu,et al.  Discrete Graph Hashing , 2014, NIPS.

[32]  Wei Liu,et al.  Learning to Hash for Indexing Big Data—A Survey , 2015, Proceedings of the IEEE.

[33]  WangJun,et al.  Semi-Supervised Hashing for Large-Scale Search , 2012 .

[34]  Deng Cai,et al.  Density Sensitive Hashing , 2012, IEEE Transactions on Cybernetics.

[35]  Shih-Fu Chang,et al.  Spherical Hashing: Binary Code Embedding with Hyperspheres , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Kristen Grauman,et al.  Kernelized Locality-Sensitive Hashing , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37]  Huanbo Luan,et al.  Discrete Collaborative Filtering , 2016, SIGIR.

[38]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.