Supervised Matrix Factorization Hashing With Quantitative Loss for Image-Text Search

Image-text hashing approaches have been widely applied in large-scale similarity search because they are efficient in both search speed and storage. Most recent supervised hashing approaches either learn a hash function by constructing a pairwise similarity matrix, or learn the hash function and the hash codes (i.e., 1 or −1) directly from class labels. However, the former suffers from high training complexity and storage cost, and the latter ignores the semantic correlation of the original data; both limitations hinder the learning of discriminative hash codes. To this end, we propose a novel discrete hashing algorithm called supervised matrix factorization hashing with quantitative loss (SMFH-QL). The proposed SMFH-QL first generates hash codes from the class labels, avoiding the construction of a pairwise similarity matrix; then, matrix factorization is used to derive hash codes from the original image-text data, thereby eliminating the impact of class labels and reducing the quantization error. Moreover, we introduce a quantitative loss term that learns hash codes by incorporating both the class labels and the original data information, which facilitates learning a similarity-preserving hash function for image-text search. Extensive experiments on three representative datasets show that SMFH-QL outperforms several existing hashing methods.
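
To make the general setting concrete, the following is a minimal sketch of label-driven cross-modal hashing; it is not the SMFH-QL algorithm itself. It assumes labels are mapped to codes in {−1, +1} by a random projection, that each modality gets a linear hash function fitted by ridge regression, and that retrieval ranks items by Hamming distance. All function names and the regularization parameter `reg` are hypothetical and chosen only for illustration.

```python
import numpy as np

def label_codes(Y, n_bits, rng):
    """Map label vectors Y (n x c) to codes in {-1, +1} with a random projection.
    Assumption for illustration; items with identical labels get identical codes."""
    P = rng.standard_normal((Y.shape[1], n_bits))
    return np.where(Y @ P >= 0, 1, -1)

def fit_linear_hash(X, B, reg=1e-2):
    """Ridge regression from features X (n x d) to codes B (n x k).
    Solves (X^T X + reg * I) W = X^T B for W (d x k)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ B)

def hash_codes(X, W):
    """Apply a linear hash function and binarize to {-1, +1}."""
    return np.where(X @ W >= 0, 1, -1)

def hamming(a, B):
    """Hamming distance between one code a (k,) and each row of B (n x k).
    For +/-1 codes, <a, b> = k - 2 * hamming(a, b)."""
    k = B.shape[1]
    return (k - B @ a) // 2

# Toy usage: synthetic "image" and "text" features that depend on the class label.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=60)
Y = np.eye(3)[labels]
X_img = Y @ rng.standard_normal((3, 16)) + 0.1 * rng.standard_normal((60, 16))
X_txt = Y @ rng.standard_normal((3, 10)) + 0.1 * rng.standard_normal((60, 10))

B = label_codes(Y, n_bits=8, rng=rng)      # shared target codes from labels
W_img = fit_linear_hash(X_img, B)          # one hash function per modality
W_txt = fit_linear_hash(X_txt, B)

query = hash_codes(X_txt[:1], W_txt)[0]    # text query
ranking = np.argsort(hamming(query, hash_codes(X_img, W_img)))  # rank images
```

Because the target codes are built directly from the labels, no n × n pairwise similarity matrix is ever formed, which is the cost the abstract attributes to similarity-matrix approaches. The matrix factorization step and the quantitative loss term of SMFH-QL are not reproduced here.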
