Reconstruction-based supervised hashing

Abstract: In large-scale similarity search, one promising technique is to encode high-dimensional data as compact binary codes, which are fast to compare and cheap to store. Many existing hashing approaches preserve similarity in the Hamming space by preserving the similarity relationships between pairs of data points. However, most of these methods consider only point-to-point relationships, which cannot capture the data structure comprehensively. In this paper, we propose a reconstruction-based supervised hashing (RSH) method to learn compact binary codes with holistic structure preservation. The proposed method characterizes the similarity structure by the relationship between each data point and the structure generated by the remaining points. The learning objective simultaneously minimizes the distance between each point and the structure with the same class label and maximizes the distance between each point and the structures with different class labels. For cross-modal retrieval, we propose a reconstruction-based hashing method that distills the correlation structure in a common latent Hamming space. The correlation structure characterizes semantic correlation by the relationship between data points and structures in the common Hamming space. Minimizing the reconstruction error of each single-modal latent model makes the hidden-layer outputs representative of the input of each modality. Experimental results on both single-modal and cross-modal datasets demonstrate the effectiveness of our methods compared with several recently proposed approaches.
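
To make the binary-code retrieval setting concrete, the following is a minimal sketch of how compact codes are compared in the Hamming space. It is not the RSH method: the hash functions here are random hyperplane projections (a standard locality-sensitive hashing construction) rather than learned ones, and the names `sign_hash`, `hamming`, and `rank_by_hamming`, as well as the dimensions and code length, are assumptions chosen for illustration.

```python
import random


def sign_hash(vec, hyperplanes):
    """Pack the bits [w . x >= 0] into one integer, one bit per hyperplane."""
    code = 0
    for w in hyperplanes:
        dot = sum(wi * xi for wi, xi in zip(w, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Hamming distance between two codes stored as integers."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code, db_codes):
    """Return database indices sorted by Hamming distance to the query code."""
    return sorted(range(len(db_codes)),
                  key=lambda i: hamming(query_code, db_codes[i]))


# Illustrative setup (assumed values): 8-dimensional data, 16-bit codes.
rng = random.Random(0)
dim, n_bits = 8, 16
hyperplanes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
database = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(5)]

db_codes = [sign_hash(x, hyperplanes) for x in database]
query_code = sign_hash(database[2], hyperplanes)
ranking = rank_by_hamming(query_code, db_codes)
```

Each item is stored as a single 16-bit integer, and comparing two items costs one XOR plus a bit count, which is where the speed and storage benefits of binary codes come from. A supervised method such as RSH would replace the random hyperplanes with hash functions learned from class labels.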
