Paper Information - Index and Retrieve Multimedia Data: Cross-Modal Hashing by Learning Subspace Relation

Hashing methods have been widely applied to efficient multimedia data indexing and retrieval owing to the explosive growth of multimedia data. Cross-modal hashing usually learns binary codes by mapping multi-modal data into a common Hamming space. Most supervised methods use relation information, such as class labels, as pairwise similarities between cross-modal data pairs to narrow the intra-modal and inter-modal gaps. In this paper, we propose a novel supervised cross-modal hashing method, dubbed Subspace Relation Learning for Cross-modal Hashing (SRLCH), which exploits the relation information in semantic labels to bring similar data from different modalities closer together in a low-dimensional Hamming subspace. SRLCH preserves the discrete constraints and nonlinear structures while admitting a closed-form solution for the binary codes, which improves training efficiency. An iterative alternating optimization algorithm is developed to learn the hash functions and the unified binary codes simultaneously, so that multimedia data can be indexed efficiently. Evaluations on two cross-modal retrieval tasks over three widely used datasets show that the proposed SRLCH outperforms most cross-modal hashing methods.
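To make the idea of a common Hamming space concrete, here is a minimal sketch of label-supervised cross-modal hashing and retrieval. It is not the SRLCH algorithm: the random per-class target codes, the ridge-regressed linear hash functions, the toy Gaussian features, and all names (`fit_linear_hash`, `encode`, `hamming_distances`) are assumptions made purely for illustration.

```python
import numpy as np


def hamming_distances(query_code, db_codes):
    """Hamming distance between one {-1,+1} code and each row of db_codes."""
    n_bits = db_codes.shape[1]
    # For +/-1 codes: distance = (n_bits - inner product) / 2.
    return (n_bits - db_codes @ query_code) // 2


def fit_linear_hash(features, codes, reg=1e-2):
    """Ridge-regress features onto target codes; returns a projection matrix."""
    dim = features.shape[1]
    gram = features.T @ features + reg * np.eye(dim)
    return np.linalg.solve(gram, features.T @ codes)


def encode(features, projection):
    """Project features and binarize them into {-1,+1} codes."""
    return np.where(features @ projection >= 0, 1, -1)


rng = np.random.default_rng(0)
n_per_class, n_classes, n_bits = 50, 3, 16
labels = np.repeat(np.arange(n_classes), n_per_class)

# Toy "image" and "text" features: one Gaussian cluster per class (assumed data).
img_centers = 3.0 * rng.normal(size=(n_classes, 32))
txt_centers = 3.0 * rng.normal(size=(n_classes, 10))
images = img_centers[labels] + rng.normal(size=(labels.size, 32))
texts = txt_centers[labels] + rng.normal(size=(labels.size, 10))

# Unified target codes shared by both modalities: one random code per class.
# This is a stand-in for learned codes, not how SRLCH derives them.
class_codes = rng.choice([-1, 1], size=(n_classes, n_bits))
target_codes = class_codes[labels].astype(float)

# One linear hash function per modality, both mapping into the same Hamming space.
w_img = fit_linear_hash(images, target_codes)
w_txt = fit_linear_hash(texts, target_codes)
image_codes = encode(images, w_img)
text_codes = encode(texts, w_txt)

# Text-to-image retrieval: rank all image codes by Hamming distance to one text code.
query_index = 0
ranking = np.argsort(hamming_distances(text_codes[query_index], image_codes))
print("query label:", labels[query_index])
print("labels of top-5 retrieved images:", labels[ranking[:5]])
```

Because both modalities are encoded against the same target codes, a text query can be compared directly with image codes using only integer operations, which is what makes Hamming-space indexing cheap at retrieval time.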
