Online latent semantic hashing for cross-media retrieval

Abstract: Hashing-based cross-media methods have become an increasingly popular technique for large-scale multimedia retrieval, owing to their effectiveness and efficiency. Most existing cross-media hashing methods learn hash functions in batch mode. However, in practical applications, data points often arrive in a streaming manner, which makes batch-based hashing methods lose their efficiency. In this paper, we propose an Online Latent Semantic Hashing (OLSH) method to address this issue. Only newly arriving multimedia data points are used to retrain the hash functions efficiently, while the semantic correlations in old data points are preserved. Specifically, to learn discriminative hash codes, discrete labels are mapped to a continuous latent semantic space in which the relative semantic distances between data points can be measured more accurately. We then propose an online optimization scheme for the challenging task of learning hash functions efficiently on streaming data points; at each round, its computational complexity and memory cost are much smaller than the size of the training dataset. Extensive experiments on real-world datasets, e.g., Wiki, Mir-Flickr25K and NUS-WIDE, show the effectiveness and efficiency of the proposed method.
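
The general idea of updating hash functions from new data only can be illustrated with a minimal sketch. This is not the OLSH algorithm itself, which the abstract does not specify in detail; it swaps in a plain ridge-regression hash function whose sufficient statistics are accumulated chunk by chunk. The class name `OnlineLinearHasher`, the fixed random label projection `P` (standing in for a learned mapping from discrete labels to a continuous latent space), and all sizes are assumptions made for illustration.

```python
import numpy as np


class OnlineLinearHasher:
    """Linear hash functions fitted by ridge regression on streaming chunks.

    Illustrative stand-in only, not the OLSH method from the paper.
    Memory is O(dim^2 + dim * n_bits), independent of how many points
    have been seen so far.
    """

    def __init__(self, dim, n_bits, lam=1e-3):
        self.C = lam * np.eye(dim)         # running X^T X + lam * I
        self.D = np.zeros((dim, n_bits))   # running X^T Z
        self.W = np.zeros((dim, n_bits))   # current projection matrix

    def partial_fit(self, X, Z):
        # Only the new chunk (X, Z) is touched; old data enters through C and D.
        self.C += X.T @ X
        self.D += X.T @ Z
        self.W = np.linalg.solve(self.C, self.D)
        return self

    def encode(self, X):
        # Sign of the linear projection gives codes in {-1, +1}.
        return np.where(X @ self.W >= 0, 1, -1)


rng = np.random.default_rng(0)
dim, n_labels, n_bits, n_points, chunk = 8, 4, 6, 300, 50

# Hypothetical mapping from discrete labels to a continuous latent space.
P = rng.standard_normal((n_labels, n_bits))
X = rng.standard_normal((n_points, dim))                      # features
Y = (rng.random((n_points, n_labels)) < 0.3).astype(float)    # multi-hot labels
Z = Y @ P                                                     # latent targets

# Streaming training: one chunk per round.
online = OnlineLinearHasher(dim, n_bits)
for start in range(0, n_points, chunk):
    online.partial_fit(X[start:start + chunk], Z[start:start + chunk])

# Batch training on all data at once, for comparison.
batch = OnlineLinearHasher(dim, n_bits).partial_fit(X, Z)
same_solution = np.allclose(online.W, batch.W)
codes = online.encode(X)
```

In this simplified setting the streaming solution matches the batch one because the accumulated statistics summarize all earlier chunks exactly, so each round costs time and memory that depend on the chunk size and feature dimension rather than on the total number of points seen.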
