暂无分享,去创建一个
[1] Peter Young,et al. Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics , 2013, J. Artif. Intell. Res..
[2] Hervé Jégou,et al. Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion , 2018, EMNLP.
[3] Guillaume Lample,et al. Unsupervised Machine Translation Using Monolingual Corpora Only , 2017, ICLR.
[4] Peter Young,et al. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions , 2014, TACL.
[5] Nobuyuki Shimizu,et al. Cross-Lingual Image Caption Generation , 2016, ACL.
[6] Sanja Fidler,et al. Order-Embeddings of Images and Language , 2015, ICLR.
[7] Jianwei Yang,et al. Neural Baby Talk , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[8] Philipp Koehn,et al. Europarl: A Parallel Corpus for Statistical Machine Translation , 2005, MTSUMMIT.
[9] Vaibhava Goel,et al. Self-Critical Sequence Training for Image Captioning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and VQA , 2017, ArXiv.
[11] Hideki Nakayama,et al. Image-Mediated Learning for Zero-Shot Cross-Lingual Document Retrieval , 2015, EMNLP.
[12] Quoc V. Le,et al. Exploiting Similarities among Languages for Machine Translation , 2013, ArXiv.
[13] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[14] Richard Socher,et al. Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Angeliki Lazaridou,et al. Combining Language and Vision with a Multimodal Skip-gram Model , 2015, NAACL.
[16] Khalil Sima'an,et al. A Shared Task on Multimodal Machine Translation and Crosslingual Image Description , 2016, WMT.
[17] Wei Wang,et al. Instance-Aware Image and Sentence Matching with Selective Multimodal LSTM , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Quoc V. Le,et al. Grounded Compositional Semantics for Finding and Describing Images with Sentences , 2014, TACL.
[19] David J. Fleet,et al. VSE++: Improved Visual-Semantic Embeddings , 2017, ArXiv.
[20] Georgiana Dinu,et al. Improving zero-shot learning by mitigating the hubness problem , 2014, ICLR.
[21] Akikazu Takeuchi,et al. STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset , 2017, ACL.
[22] Douglas A. Reynolds,et al. SHEEP, GOATS, LAMBS and WOLVES A Statistical Analysis of Speaker Performance in the NIST 1998 Speaker Recognition Evaluation , 1998 .
[23] Armand Joulin,et al. Deep Fragment Embeddings for Bidirectional Image Sentence Mapping , 2014, NIPS.
[24] Yuji Matsumoto,et al. Applying Conditional Random Fields to Japanese Morphological Analysis , 2004, EMNLP.
[25] Liwei Wang,et al. Learning Two-Branch Neural Networks for Image-Text Matching Tasks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[26] Balaraman Ravindran,et al. Bridge Correlational Neural Networks for Multilingual Multimodal Representation Learning , 2015, NAACL.
[27] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Guillaume Lample,et al. Word Translation Without Parallel Data , 2017, ICLR.
[29] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Wei Xu,et al. Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) , 2014, ICLR.
[31] Ruslan Salakhutdinov,et al. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models , 2014, ArXiv.
[32] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2015, CVPR.
[33] Nazli Ikizler-Cinbis,et al. Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures , 2016, J. Artif. Intell. Res..
[34] Frank Keller,et al. Image Pivoting for Learning Multilingual Multimodal Representations , 2017, EMNLP.
[35] Nick Campbell,et al. Multilingual Multi-modal Embeddings for Natural Language Processing , 2017, ArXiv.
[36] Stefan Riezler,et al. Multimodal Pivots for Image Caption Translation , 2016, ACL.