News Content Completion with Location-Aware Image Selection
Adam Jatowt | Shao-Ping Lu | Jun Wang | Zhengkun Zhang | Zhe Sun | Zhenglu Yang
[1] Dimosthenis Karatzas, et al. Good News, Everyone! Context Driven Entity-Aware Captioning for News Images, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] David Mimno, et al. Unsupervised Discovery of Multimodal Links in Multi-image, Multi-sentence Documents, 2019, EMNLP.
[3] Peter Young, et al. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions, 2014, TACL.
[4] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[5] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[6] Sanjeev Arora, et al. A Simple but Tough-to-Beat Baseline for Sentence Embeddings, 2017, ICLR.
[7] Matthieu Cord, et al. MUTAN: Multimodal Tucker Fusion for Visual Question Answering, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[8] Yang Yang, et al. Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking, 2019, ACM Multimedia.
[9] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[10] Huchuan Lu, et al. Deep Cross-Modal Projection Learning for Image-Text Matching, 2018, ECCV.
[11] Yu Zhou, et al. MSMO: Multimodal Summarization with Multimodal Output, 2018, EMNLP.
[12] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, ArXiv.
[13] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[14] Yoon Kim, et al. Convolutional Neural Networks for Sentence Classification, 2014, EMNLP.
[15] Lexing Xie, et al. SentiCap: Generating Image Descriptions with Sentiments, 2015, AAAI.
[16] Zhe Gan, et al. StyleNet: Generating Attractive Visual Captions with Styles, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Lexing Xie, et al. Transform and Tell: Entity-Aware News Image Captioning, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Trevor Darrell, et al. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding, 2016, EMNLP.
[19] Samy Bengio, et al. Order Matters: Sequence to sequence for sets, 2015, ICLR.
[20] Lexing Xie, et al. SemStyle: Learning to Generate Stylised Image Captions Using Unaligned Text, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[21] Gang Hua, et al. Hierarchical Multimodal LSTM for Dense Visual-Semantic Embedding, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[22] Xi Chen, et al. Stacked Cross Attention for Image-Text Matching, 2018, ECCV.
[23] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[24] Navdeep Jaitly, et al. Pointer Networks, 2015, NIPS.
[25] Yoshua Bengio, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, 2015, ICML.
[26] Karl Aberer, et al. Upgrading the Newsroom: An Automated Image Selection System for News Articles, 2020.
[27] Jason Weston, et al. Natural Language Processing (Almost) from Scratch, 2011, J. Mach. Learn. Res.
[28] Wei Wang, et al. Instance-Aware Image and Sentence Matching with Selective Multimodal LSTM, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).