Multimodal Logical Inference System for Visual-Textual Entailment
Koji Mineshima | Daisuke Bekki | Hitomi Yanaka | Masashi Yoshikawa | Riko Suzuki