Vision Matters When It Should: Sanity Checking Multimodal Machine Translation Models
[1] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[2] Graham Neubig, et al. Measuring and Increasing Context Usage in Context-Aware Machine Translation, 2021, ACL.
[3] Myle Ott, et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling, 2019, NAACL.
[4] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Lucia Specia, et al. Distilling Translations with Visual Awareness, 2019, ACL.
[6] Samy Bengio, et al. Generating Sentences from a Continuous Space, 2015, CoNLL.
[7] Jiebo Luo, et al. A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation, 2020, ACL.
[8] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[9] Desmond Elliott, et al. Findings of the Second Shared Task on Multimodal Machine Translation and Multilingual Image Description, 2017, WMT.
[10] Khalil Sima'an, et al. Multi30K: Multilingual English-German Image Descriptions, 2016, VL@ACL.
[11] Jiebo Luo, et al. Dynamic Context-guided Capsule Network for Multimodal Machine Translation, 2020, ACM Multimedia.
[12] Jieyu Zhao, et al. Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints, 2017, EMNLP.
[13] Jindrich Libovický, et al. CUNI System for the WMT18 Multimodal Translation Task, 2018, WMT.
[14] Rico Sennrich, et al. Improving Neural Machine Translation Models with Monolingual Data, 2015, ACL.
[15] Desmond Elliott, et al. Adversarial Evaluation of Multimodal Machine Translation, 2018, EMNLP.
[16] Florian Metze, et al. On Leveraging the Visual Modality for Neural Machine Translation, 2019, INLG.
[17] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[18] Peter Young, et al. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions, 2014, TACL.
[19] Frank Keller, et al. Cross-lingual Visual Verb Sense Disambiguation, 2019, NAACL.
[20] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.
[21] Desmond Elliott, et al. Findings of the Third Shared Task on Multimodal Machine Translation, 2018, WMT.
[22] Nick Campbell, et al. Doubly-Attentive Decoder for Multi-modal Neural Machine Translation, 2017, ACL.
[23] Lucia Specia, et al. Sheffield Submissions for WMT18 Multimodal Translation Shared Task, 2018, WMT.
[24] Khalil Sima'an, et al. A Shared Task on Multimodal Machine Translation and Crosslingual Image Description, 2016, WMT.
[25] Wei Bi, et al. Good for Misconceived Reasons: An Empirical Revisiting on the Need for Visual Context in Multimodal Machine Translation, 2021, ACL.
[26] Lucia Specia, et al. Probing the Need for Visual Context in Multimodal Machine Translation, 2019, NAACL.
[27] Lucia Specia, et al. MultiSubs: A Large-scale Multimodal and Multilingual Dataset, 2021, ArXiv.
[28] Radu Soricut, et al. Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning, 2018, ACL.