Weakly Supervised Multimodal Hashing for Scalable Social Image Retrieval

Recent years have witnessed a dramatic increase in the number of community-contributed images. Hashing-based similarity search for social images has attracted considerable interest from the computer vision and multimedia communities because of its computational and memory efficiency. In this paper, we propose a novel weakly supervised hashing method, named weakly supervised multimodal hashing, for scalable social image retrieval. Semantic-aware hash functions are learned by jointly leveraging weakly supervised tag information and visual information. Specifically, because the user-provided tags associated with social images describe their semantic content, the hash functions are learned by exploring the semantic structure induced by these tags. However, user-provided tags are imperfect. To avoid overfitting to the weakly supervised tags, the local discriminative structure and the geometric structure in the visual space are also explored. In addition, to learn compact and non-redundant hash codes, the hash functions are constrained to be orthogonal, and an information-theoretic regularization based on the maximum entropy principle is introduced to maximize the information provided by each hash code. The proposed hash learning problem is formulated as an eigenvalue problem, which can be solved efficiently. Extensive experiments on two widely used social image data sets show encouraging performance compared with state-of-the-art hashing techniques, demonstrating the effectiveness of the proposed method.
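
As a rough illustration of how orthogonal hash functions can be obtained from an eigenvalue problem, the following minimal sketch combines a tag-agreement term with a visual-variance term and keeps the leading eigenvectors as projections. It is not the objective proposed in this paper: the matrix `M`, the weight `alpha`, and the function names `learn_orthogonal_hash`, `hash_codes`, and `hamming_rank` are assumptions made for illustration only, and the local discriminative, geometric, and maximum entropy terms described above are omitted.

```python
import numpy as np


def learn_orthogonal_hash(X, T, n_bits, alpha=1.0):
    """Learn n_bits orthogonal linear hash projections (illustrative only).

    X: (n, d) visual features; T: (n, c) binary tag matrix (weak labels).
    Requires n_bits <= d. The objective here is an assumption, not the
    paper's formulation.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                      # zero-centre the visual features
    XtT = Xc.T @ T                     # (d, c) feature/tag cross term
    # Symmetric matrix: tag-agreement term plus a visual-variance term.
    M = XtT @ XtT.T + alpha * (Xc.T @ Xc)
    # eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
    _, eigvecs = np.linalg.eigh(M)
    W = eigvecs[:, ::-1][:, :n_bits]   # keep the top n_bits eigenvectors
    return W, mean


def hash_codes(X, W, mean):
    """Binarise the projections by thresholding at zero."""
    return ((X - mean) @ W > 0).astype(np.uint8)


def hamming_rank(query_code, db_codes):
    """Return database indices sorted by Hamming distance to the query."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 16))                      # synthetic features
    T = (rng.random(size=(200, 5)) > 0.7).astype(float)  # synthetic tags
    W, mean = learn_orthogonal_hash(X, T, n_bits=8)
    codes = hash_codes(X, W, mean)
    print(hamming_rank(codes[0], codes)[:5])
```

Because the projections are eigenvectors of a symmetric matrix, they are mutually orthonormal by construction, which is one simple way to reduce redundancy between bits. Retrieval then reduces to ranking database codes by Hamming distance to the query code.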
