暂无分享,去创建一个
[1] Yejin Choi,et al. The Curious Case of Neural Text Degeneration , 2019, ICLR.
[2] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[3] Mauro Cettolo,et al. Cache-based Online Adaptation for Machine Translation Enhanced Computer Assisted Translation , 2013, MTSUMMIT.
[4] Zhe Gan,et al. Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation , 2018, AAAI.
[5] Kate Saenko,et al. Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering , 2015, ECCV.
[6] Alexander J. Smola,et al. Stacked Attention Networks for Image Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Diana Gonzalez-Rico,et al. Contextualize, Show and Tell: A Neural Visual Storyteller , 2018, ArXiv.
[8] David A. Forsyth,et al. Matching Words and Pictures , 2003, J. Mach. Learn. Res..
[9] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[10] Xin Wang,et al. No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling , 2018, ACL.
[11] Alon Lavie,et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments , 2005, IEEvaluation@ACL.
[12] Hexiang Hu,et al. Visual Storytelling via Predicting Anchor Word Embeddings in the Stories , 2020, ArXiv.
[13] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[14] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[15] Lun-Wei Ku,et al. Knowledge-Enriched Visual Storytelling , 2019, AAAI.
[16] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Piji Li,et al. Storytelling from an Image Stream Using Scene Graphs , 2020, AAAI.
[18] Byoung-Tak Zhang,et al. Bilinear Attention Networks , 2018, NeurIPS.
[19] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[20] Peng Gao,et al. Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Trevor Darrell,et al. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding , 2016, EMNLP.
[22] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Xinlei Chen,et al. Mind's eye: A recurrent visual representation for image caption generation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[25] Byoung-Tak Zhang,et al. GLAC Net: GLocal Attention Cascading Networks for Multi-image Cued Story Generation , 2018, ArXiv.
[26] C. Lawrence Zitnick,et al. CIDEr: Consensus-based image description evaluation , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Jiasen Lu,et al. Hierarchical Question-Image Co-Attention for Visual Question Answering , 2016, NIPS.
[28] Gunhee Kim,et al. Expressing an Image Stream with a Sequence of Natural Sentences , 2015, NIPS.
[29] Licheng Yu,et al. Hierarchically-Attentive RNN for Album Summarization and Storytelling , 2017, EMNLP.
[30] Wei Zhang,et al. Hierarchical Photo-Scene Encoder for Album Storytelling , 2019, AAAI.
[31] Tamir Hazan,et al. Factor Graph Attention , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Yueting Zhuang,et al. Informative Visual Storytelling with Cross-modal Rules , 2019, ACM Multimedia.
[33] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Natalie Parde,et al. The Steep Road to Happily Ever after: an Analysis of Current Visual Storytelling Models , 2019, Proceedings of the Second Workshop on Shortcomings in Vision and Language.
[35] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[36] Francis Ferraro,et al. Visual Storytelling , 2016, NAACL.
[37] Lei Li,et al. Knowledgeable Storyteller: A Commonsense-Driven Generative Model for Visual Storytelling , 2019, IJCAI.
[38] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[39] Matthieu Cord,et al. MUTAN: Multimodal Tucker Fusion for Visual Question Answering , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[40] Zhou Yu,et al. Beyond Bilinear: Generalized Multimodal Factorized High-Order Pooling for Visual Question Answering , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[41] Tamir Hazan,et al. High-Order Attention Models for Visual Question Answering , 2017, NIPS.
[42] Chin-Yew Lin,et al. ROUGE: A Package for Automatic Evaluation of Summaries , 2004, ACL 2004.