CMPD: Using Cross Memory Network With Pair Discrimination for Image-Text Retrieval
Zhizhong Han | Yu-Shen Liu | Xin Wen
[1] Jacob Abernethy,et al. On Convergence and Stability of GANs , 2018 .
[2] Nitish Srivastava,et al. Learning Representations for Multimodal Data with Deep Belief Nets , 2012 .
[3] Jiwen Lu,et al. Deep Coupled Metric Learning for Cross-Modal Matching , 2017, IEEE Transactions on Multimedia.
[4] Meng Wang,et al. Learning Visual Semantic Relationships for Efficient Visual Retrieval , 2015, IEEE Transactions on Big Data.
[5] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[6] Xiaohua Zhai,et al. Learning Cross-Media Joint Representation With Sparse and Semisupervised Regularization , 2014, IEEE Transactions on Circuits and Systems for Video Technology.
[7] An-An Liu,et al. 3D Object Retrieval Based on Multi-View Latent Variable Model , 2019, IEEE Transactions on Circuits and Systems for Video Technology.
[8] Yang Yang,et al. Adversarial Cross-Modal Retrieval , 2017, ACM Multimedia.
[9] Jianmin Wang,et al. Collective Deep Quantization for Efficient Cross-Modal Retrieval , 2017, AAAI.
[10] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[11] Xuelong Li,et al. Deep Binary Reconstruction for Cross-Modal Hashing , 2017, IEEE Transactions on Multimedia.
[12] Yin Li,et al. Learning Deep Structure-Preserving Image-Text Embeddings , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Krystian Mikolajczyk,et al. Deep correlation for matching images and text , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Liming Chen,et al. DeepVisage: Making Face Recognition Simple Yet With Powerful Generalization Skills , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).
[15] Gang Wang,et al. Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[16] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[17] Michael Isard,et al. A Multi-View Embedding Space for Modeling Internet Images, Tags, and Their Semantics , 2012, International Journal of Computer Vision.
[18] Wei Liu,et al. Pairwise Relationship Guided Deep Hashing for Cross-Modal Retrieval , 2017, AAAI.
[19] Dezhong Peng,et al. Deep Supervised Cross-Modal Retrieval , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Wei Wang,et al. Learning Coupled Feature Spaces for Cross-Modal Matching , 2013, 2013 IEEE International Conference on Computer Vision.
[21] Yoon Kim,et al. Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.
[22] Shiming Xiang,et al. Cross-Modal Hashing via Rank-Order Preserving , 2017, IEEE Transactions on Multimedia.
[23] Ruifan Li,et al. Cross-modal Retrieval with Correspondence Autoencoder , 2014, ACM Multimedia.
[24] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Jeff A. Bilmes,et al. Deep Canonical Correlation Analysis , 2013, ICML.
[26] Huimin Lu,et al. Unsupervised cross-modal retrieval through adversarial learning , 2017, 2017 IEEE International Conference on Multimedia and Expo (ICME).
[27] Tieniu Tan,et al. Joint Feature Selection and Subspace Learning for Cross-Modal Retrieval , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[28] John Shawe-Taylor,et al. Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.
[29] Bhiksha Raj,et al. SphereFace: Deep Hypersphere Embedding for Face Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Xin Huang,et al. An Overview of Cross-Media Retrieval: Concepts, Methodologies, Benchmarks, and Challenges , 2017, IEEE Transactions on Circuits and Systems for Video Technology.
[31] Qian Huang,et al. Multimedia search and retrieval: new concepts, system implementation, and application , 2000, IEEE Transactions on Circuits and Systems for Video Technology.
[32] Tat-Seng Chua,et al. NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.
[33] Wei Xu,et al. Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) , 2014, ICLR.
[34] Roger Levy,et al. On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[35] Xinbo Gao,et al. Triplet-Based Deep Hashing Network for Cross-Modal Retrieval , 2018, IEEE Transactions on Image Processing.
[36] Yuxin Peng,et al. MHTN: Modal-Adversarial Hybrid Transfer Network for Cross-Modal Retrieval , 2017, IEEE Transactions on Cybernetics.
[37] Xin Wen,et al. Adversarial Cross-Modal Retrieval via Learning and Transferring Single-Modal Similarities , 2019, 2019 IEEE International Conference on Multimedia and Expo (ICME).
[38] Quan Wang,et al. Robust and Flexible Discrete Hashing for Cross-Modal Similarity Search , 2018, IEEE Transactions on Circuits and Systems for Video Technology.
[39] Lin Ma,et al. Multimodal Convolutional Neural Networks for Matching Image and Sentence , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[40] Yueting Zhuang,et al. Supervised Coupled Dictionary Learning with Group Structures for Multi-modal Retrieval , 2013, AAAI.
[41] Juhan Nam,et al. Multimodal Deep Learning , 2011, ICML.
[42] Xin Luo,et al. SCRATCH: A Scalable Discrete Matrix Factorization Hashing Framework for Cross-Modal Retrieval , 2020, IEEE Transactions on Circuits and Systems for Video Technology.
[43] Miki Haseyama,et al. A Cross-Modal Approach for Extracting Semantic Relationships Between Concepts Using Tagged Images , 2014, IEEE Transactions on Multimedia.
[44] Aaron C. Courville,et al. Improved Training of Wasserstein GANs , 2017, NIPS.
[45] Bernd Girod,et al. Large-Scale Video Retrieval Using Image Queries , 2018, IEEE Transactions on Circuits and Systems for Video Technology.
[46] Yuxin Peng,et al. Cross-Media Shared Representation by Hierarchical Learning with Multiple Deep Networks , 2016, IJCAI.
[47] Cyrus Rashtchian,et al. Collecting Image Annotations Using Amazon’s Mechanical Turk , 2010, Mturk@HLT-NAACL.
[48] Xuelong Li,et al. Learning Discriminative Binary Codes for Large-scale Cross-modal Retrieval , 2017, IEEE Transactions on Image Processing.
[49] Edward Y. Chang,et al. CBSA: content-based soft annotation for multimodal image retrieval using Bayes point machines , 2003, IEEE Transactions on Circuits and Systems for Video Technology.