Universal Weighting Metric Learning for Cross-Modal Matching
Heng Tao Shen | Xing Xu | Yang Yang | Zheng Wang | Yanli Ji | Jiwei Wei
[1] Huimin Lu, et al. Deep adversarial metric learning for cross-modal retrieval, 2019, World Wide Web.
[2] Gang Wang, et al. Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[3] Xuelong Li, et al. Learning Discriminative Binary Codes for Large-scale Cross-modal Retrieval, 2017, IEEE Transactions on Image Processing.
[4] Kuldip K. Paliwal, et al. Bidirectional recurrent neural networks, 1997, IEEE Trans. Signal Process.
[5] Nir Ailon, et al. Deep Metric Learning Using Triplet Network, 2014, SIMBAD.
[6] Huimin Lu, et al. Ternary Adversarial Networks With Self-Supervision for Zero-Shot Cross-Modal Retrieval, 2020, IEEE Transactions on Cybernetics.
[7] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[8] David J. Fleet, et al. VSE++: Improving Visual-Semantic Embeddings with Hard Negatives, 2017, BMVC.
[9] Yang Yang, et al. Adversarial Cross-Modal Retrieval, 2017, ACM Multimedia.
[10] Yang Yang, et al. Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking, 2019, ACM Multimedia.
[11] Marc'Aurelio Ranzato, et al. DeViSE: A Deep Visual-Semantic Embedding Model, 2013, NIPS.
[12] Peter Young, et al. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions, 2014, TACL.
[13] Yu Liu, et al. Learning a Recurrent Residual Fusion Network for Multimodal Matching, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[14] Lei Zhang, et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[15] Yale Song, et al. Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Jingjing Li, et al. Residual Graph Convolutional Networks for Zero-Shot Learning, 2019, MMAsia.
[17] Jung-Woo Ha, et al. Dual Attention Networks for Multimodal Reasoning and Matching, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Deli Zhao, et al. Recognizing an Action Using Its Name: A Knowledge-Based Approach, 2016, International Journal of Computer Vision.
[19] Yang Liu, et al. Use What You Have: Video retrieval using representations from collaborative experts, 2019, BMVC.
[20] Tianbao Yang, et al. Learning Attributes Equals Multi-Source Domain Generalization, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Bowen Zhang, et al. Cross-Modal and Hierarchical Modeling of Video and Text, 2018, ECCV.
[22] Tao Mei, et al. MSR-VTT: A Large Video Description Dataset for Bridging Video and Language, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[24] Zi Huang, et al. Exploiting Subspace Relation in Semantic Labels for Cross-Modal Hashing, 2021, IEEE Transactions on Knowledge and Data Engineering.
[25] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[26] Xun Wang, et al. Dual Dense Encoding for Zero-Example Video Retrieval, 2018, ArXiv.
[27] Xirong Li, et al. Predicting Visual Features From Text for Image and Video Caption Retrieval, 2017, IEEE Transactions on Multimedia.
[28] Heng Tao Shen, et al. Hierarchical LSTMs with Adaptive Attention for Visual Captioning, 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[29] Yun Fu, et al. Visual Semantic Reasoning for Image-Text Matching, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[30] Yan Huang, et al. Learning Semantic Concepts and Order for Image and Sentence Matching, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[31] Xi Chen, et al. Stacked Cross Attention for Image-Text Matching, 2018, ECCV.
[32] Josep Lladós, et al. Doodle to Search: Practical Zero-Shot Sketch-Based Image Retrieval, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Xuelong Li, et al. From Deterministic to Generative: Multimodal Stochastic RNNs for Video Captioning, 2017, IEEE Transactions on Neural Networks and Learning Systems.
[34] Amit K. Roy-Chowdhury, et al. Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval, 2018, ICMR.
[35] Liqiang Nie, et al. Scalable Deep Hashing for Large-Scale Social Image Retrieval, 2020, IEEE Transactions on Image Processing.
[36] Heng Tao Shen, et al. Collective Reconstructive Embeddings for Cross-Modal Hashing, 2019, IEEE Transactions on Image Processing.
[37] Xiang Zhou, et al. Scalable Zero-Shot Learning via Binary Visual-Semantic Embeddings, 2019, IEEE Transactions on Image Processing.
[38] Heng Tao Shen, et al. Video Captioning by Adversarial LSTM, 2018, IEEE Transactions on Image Processing.
[39] Juan Carlos Niebles, et al. Dense-Captioning Events in Videos, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[40] Lei Zhang, et al. Optimal Projection Guided Transfer Hashing for Image Retrieval, 2019, IEEE Transactions on Circuits and Systems for Video Technology.
[41] Jiwen Lu, et al. Deep Coupled Metric Learning for Cross-Modal Matching, 2017, IEEE Transactions on Multimedia.