Cross-media hashing with Centroid Approaching

Cross-media retrieval, which aims to address semantic correlation across rich media types, has received increasing interest in recent years. Two key aspects, cross-media representation and indexing, have been studied to handle cross-media similarity measurement and the scalability issue, respectively. In this paper, we propose a new cross-media hashing scheme, called Centroid Approaching Cross-Media Hashing (CAMH), that addresses cross-media representation and indexing simultaneously. Unlike existing indexing methods, the proposed method introduces semantic category information into the learning procedure, leading to more accurate hash codes for instances of multiple media types. In addition, we present a comparative study of cross-media indexing methods under a unified evaluation framework. Extensive experiments on two commonly used datasets demonstrate the good performance of CAMH in terms of search accuracy and time complexity.
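The core idea, pulling the hash codes of same-class instances from different modalities toward a shared per-class centroid in a common Hamming space, can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' algorithm: the random centroid codes, the least-squares projection, and the function name `centroid_hash_codes` are all hypothetical.

```python
import numpy as np

def centroid_hash_codes(X, labels, n_bits, seed=0):
    """Sketch of centroid-approaching hashing for one modality.

    X      : (n_samples, n_features) real-valued features.
    labels : (n_samples,) semantic category labels.
    Returns sign-binarized codes and the learned projection W.
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    # Hypothetical choice: a random +/-1 centroid code per semantic class.
    # Using the same seed across modalities gives them shared centroids.
    centroids = rng.choice([-1.0, 1.0], size=(len(classes), n_bits))
    # Each sample's regression target is its class centroid.
    targets = centroids[np.searchsorted(classes, labels)]
    # Linear projection W fitted so X @ W approaches the centroids,
    # then binarized by sign to obtain hash codes.
    W, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return np.sign(X @ W), W
```

Because both modalities (e.g., image and text features) are fitted against the same centroid codes, codes from either modality can be compared directly by Hamming distance at retrieval time.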