Embedded Heterogeneous Attention Transformer for Cross-lingual Image Captioning
[1] Pengfei Zhu,et al. Latent Heterogeneous Graph Network for Incomplete Multi-View Learning , 2022, IEEE Transactions on Multimedia.
[2] Jingkuan Song,et al. S2 Transformer for Image Captioning , 2022, IJCAI.
[3] Zhenzhen Hu,et al. Efficient and self-adaptive rationale knowledge base for visual commonsense reasoning , 2022, Multimedia Systems.
[4] Xiaoyan Cai,et al. Relation-aware Heterogeneous Graph Transformer based drug repurposing , 2021, Expert Syst. Appl.
[5] Pranav Aggarwal,et al. Towards Zero-shot Cross-lingual Image Retrieval and Tagging , 2021, ArXiv.
[6] Hongliang Fei,et al. Heterogeneous Attention Network for Effective and Efficient Cross-modal Retrieval , 2021, SIGIR.
[7] Yang Wang,et al. Exploring Pairwise Relationships Adaptively From Linguistic Context in Image Captioning , 2021, IEEE Transactions on Multimedia.
[8] Xuanjing Huang,et al. TCIC: Theme Concepts Learning Cross Language and Vision for Image Captioning , 2021, IJCAI.
[9] Yejin Choi,et al. VinVL: Revisiting Visual Representations in Vision-Language Models , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Yongjian Wu,et al. RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Yanfang Ye,et al. Heterogeneous Graph Structure Learning for Graph Neural Networks , 2021, AAAI.
[12] Jingjing Liu,et al. UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Xirong Li,et al. Towards annotation-free evaluation of cross-lingual image captioning , 2020, MMAsia.
[14] Philip S. Yu,et al. A Survey on Heterogeneous Graph Embedding: Methods, Techniques, Applications and Sources , 2020, IEEE Transactions on Big Data.
[15] Qiang Wu,et al. Dual Attention on Pyramid Feature Maps for Image Captioning , 2020, IEEE Transactions on Multimedia.
[16] Shafiq R. Joty,et al. UNISON: Unpaired Cross-Lingual Image Captioning , 2020, AAAI.
[17] Weifeng Zhang,et al. Cross-modal Knowledge Reasoning for Knowledge-based Visual Question Answering , 2020, Pattern Recognit.
[18] Xiaojun Wan,et al. Heterogeneous Graph Transformer for Graph-to-Sequence Learning , 2020, ACL.
[19] Xing Xie,et al. Graph Neural News Recommendation with Unsupervised Preference Disentanglement , 2020, ACL.
[20] Yujing Wang,et al. Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering , 2020, IJCAI.
[21] Yahong Han,et al. Reasoning with Heterogeneous Graph Alignment for Video Question Answering , 2020, AAAI.
[22] Tao Mei,et al. X-Linear Attention Networks for Image Captioning , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Ruotian Luo. A Better Variant of Self-Critical Sequence Training , 2020, ArXiv.
[24] Yizhou Sun,et al. Heterogeneous Graph Transformer , 2020, WWW.
[25] Xirong Li,et al. iCap: Interactive Image Captioning with Predictive Text , 2020, ICMR.
[26] Marcella Cornia,et al. Meshed-Memory Transformer for Image Captioning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Linmei Hu,et al. Heterogeneous Graph Attention Networks for Semi-supervised Short Text Classification , 2019, EMNLP.
[28] Jiajun Zhang,et al. Synchronously Generating Two Languages with Interactive Decoding , 2019, EMNLP.
[29] Nong Xiao,et al. Heterogeneous Graph Learning for Visual Commonsense Reasoning , 2019, NeurIPS.
[30] Jie Chen,et al. Attention on Attention for Image Captioning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[31] Simao Herdade,et al. Image Captioning: Transforming Objects into Words , 2019, NeurIPS.
[32] Lingfeng Wang,et al. Deep Hierarchical Encoder–Decoder Network for Image Captioning , 2019, IEEE Transactions on Multimedia.
[33] Shu Zhang,et al. Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Wei Zhao,et al. Multitask Learning for Cross-Domain Image Captioning , 2019, IEEE Transactions on Multimedia.
[35] Chee Seng Chan,et al. COMIC: Toward A Compact Image Captioning Model With Attention , 2019, IEEE Transactions on Multimedia.
[36] Jianfei Cai,et al. Auto-Encoding Scene Graphs for Image Captioning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Tao Mei,et al. Exploring Visual Relationship for Image Captioning , 2018, ECCV.
[38] Wei Liu,et al. Recurrent Fusion Network for Image Captioning , 2018, ECCV.
[39] Philip S. Yu,et al. Leveraging Meta-path based Context for Top-N Recommendation with A Neural Co-Attention Model , 2018, KDD.
[40] Xirong Li,et al. COCO-CN for Cross-Lingual Image Tagging, Captioning, and Retrieval , 2018, IEEE Transactions on Multimedia.
[41] Gang Wang,et al. Unpaired Image Captioning by Language Pivoting , 2018, ECCV.
[42] Philip S. Yu,et al. Heterogeneous Information Network Embedding for Recommendation , 2017, IEEE Transactions on Knowledge and Data Engineering.
[43] Wang-Chien Lee,et al. HIN2Vec: Explore Meta-paths in Heterogeneous Information Networks for Representation Learning , 2017, CIKM.
[44] Gang Wang,et al. Stack-Captioning: Coarse-to-Fine Learning for Image Captioning , 2017, AAAI.
[45] Alan Jaffe,et al. Generating Image Descriptions using Multilingual Data , 2017, WMT.
[46] Xirong Li,et al. Fluency-Guided Cross-Lingual Image Captioning , 2017, ACM Multimedia.
[47] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[48] David J. Crandall,et al. Using Artificial Tokens to Control Languages for Multilingual Image Caption Generation , 2017, ArXiv.
[49] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[50] Jure Leskovec,et al. Inductive Representation Learning on Large Graphs , 2017, NIPS.
[51] Vaibhava Goel,et al. Self-Critical Sequence Training for Image Captioning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Nobuyuki Shimizu,et al. Cross-Lingual Image Caption Generation , 2016, ACL.
[53] Xirong Li,et al. Adding Chinese Captions to Images , 2016, ICMR.
[54] Desmond Elliott,et al. Multilingual Image Description with Neural Sequence Models , 2015, 1510.04709.
[55] Svetlana Lazebnik,et al. Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models , 2015, International Journal of Computer Vision.
[56] Xinlei Chen,et al. Microsoft COCO Captions: Data Collection and Evaluation Server , 2015, ArXiv.
[57] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[58] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[59] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[60] Lexing Xie,et al. Picture tags and world knowledge: learning tag relations from visual semantic sources , 2013, ACM Multimedia.
[61] Yizhou Sun,et al. Mining Heterogeneous Information Networks: Principles and Methodologies , 2012, Mining Heterogeneous Information Networks: Principles and Methodologies.
[62] Philip S. Yu,et al. PathSim , 2011, Proc. VLDB Endow.
[63] Tat-Seng Chua,et al. NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.
[64] Changsheng Xu,et al. Heterogeneous Graph Contrastive Learning Network for Personalized Micro-Video Recommendation , 2023, IEEE Transactions on Multimedia.
[65] Fei Yin,et al. Cross-Lingual Text Image Recognition via Multi-Hierarchy Cross-Modal Mimic , 2023, IEEE Transactions on Multimedia.
[66] Changsheng Xu,et al. Heterogeneous Hierarchical Feature Aggregation Network for Personalized Micro-Video Recommendation , 2022, IEEE Transactions on Multimedia.
[67] Ho-fung Leung,et al. Image Difference Captioning With Instance-Level Fine-Grained Feature Representation , 2022, IEEE Transactions on Multimedia.
[68] Zhenzhen Hu,et al. A Text-Guided Generation and Refinement Model for Image Captioning , 2023, IEEE Transactions on Multimedia.
[69] Rita Cucchiara,et al. From Show to Tell: A Survey on Image Captioning , 2021, ArXiv.
[70] Hanli Wang,et al. CaptionNet: A Tailor-made Recurrent Neural Network for Generating Image Descriptions , 2021, IEEE Transactions on Multimedia.
[71] Kuizhi Mei,et al. Integrating Part of Speech Guidance for Image Captioning , 2021, IEEE Transactions on Multimedia.
[72] Chenhui Chu,et al. Cross-Lingual Visual Grounding , 2021, IEEE Access.
[73] Yang Wang,et al. Cross-Lingual Image Caption Generation Based on Visual Attention Model , 2020, IEEE Access.
[74] Shaowei Liu,et al. General Knowledge Embedded Image Representation Learning , 2018, IEEE Transactions on Multimedia.