Select, Substitute, Search: A New Benchmark for Knowledge-Augmented Visual Question Answering
Soumen Chakrabarti | Preethi Jyothi | Ganesh Ramakrishnan | Vishwajeet Kumar | Mayank Kothyari | Aman Jain
[1] Raymond J. Mooney,et al. Improving VQA and its Explanations by Comparing Competing Explanations , 2020, ArXiv.
[2] Trevor Darrell,et al. Multimodal Explanations: Justifying Decisions and Pointing to the Evidence , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[3] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[4] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[5] Sameer Singh,et al. Compositional Questions Do Not Necessitate Multi-hop Reasoning , 2019, ACL.
[6] Charles L. A. Clarke,et al. Term proximity scoring for ad-hoc retrieval on very large text collections , 2006, SIGIR.
[7] Praveen Paritosh,et al. Freebase: a collaboratively created graph database for structuring human knowledge , 2008, SIGMOD Conference.
[8] Christopher D. Manning,et al. GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[9] W. Bruce Croft,et al. Proximity-based document representation for named entity retrieval , 2007, CIKM '07.
[10] Ganesh Ramakrishnan,et al. LIGHTEN: Learning Interactions with Graph and Hierarchical TEmporal Networks for HOI in videos , 2020, ACM Multimedia.
[11] Ali Farhadi,et al. OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Kyunghyun Cho,et al. Unsupervised Question Decomposition for Question Answering , 2020, EMNLP.
[13] Ganesh Ramakrishnan,et al. Neural architecture for question answering using a knowledge graph and web corpus , 2017, Information Retrieval Journal.
[14] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[15] Omer Levy,et al. SpanBERT: Improving Pre-training by Representing and Predicting Spans , 2019, TACL.
[16] Matthieu Cord,et al. MUTAN: Multimodal Tucker Fusion for Visual Question Answering , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[17] Mario Fritz,et al. A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input , 2014, NIPS.
[18] Catherine Havasi,et al. ConceptNet 5.5: An Open Multilingual Graph of General Knowledge , 2016, AAAI.
[19] Ellen M. Voorhees,et al. The TREC-8 Question Answering Track Report , 1999, TREC.
[20] R. Thomas McCoy,et al. Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference , 2019, ACL.
[21] Douwe Kiela,et al. Poincaré Embeddings for Learning Hierarchical Representations , 2017, NIPS.
[22] Oren Etzioni,et al. Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge , 2018, ArXiv.
[23] Oren Etzioni,et al. Open Information Extraction: The Second Generation , 2011, IJCAI.
[24] Jonathan Berant,et al. The Web as a Knowledge-Base for Answering Complex Questions , 2018, NAACL.
[25] George A. Miller,et al. Introduction to WordNet: An On-line Lexical Database , 1990.
[26] Jordi Pont-Tuset,et al. The Open Images Dataset V4 , 2018, International Journal of Computer Vision.
[27] Colin Raffel,et al. How Much Knowledge Can You Pack into the Parameters of a Language Model? , 2020, EMNLP.
[28] Andrew Chou,et al. Semantic Parsing on Freebase from Question-Answer Pairs , 2013, EMNLP.
[29] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[30] Haoyu Zhang,et al. Complex Question Decomposition for Semantic Parsing , 2019, ACL.
[31] Luke Zettlemoyer,et al. Don’t Take the Easy Way Out: Ensemble Based Methods for Avoiding Known Dataset Biases , 2019, EMNLP.
[32] Colin Raffel,et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer , 2019, J. Mach. Learn. Res.
[33] Yao Zhao,et al. PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization , 2020, ICML.
[34] François Gardères,et al. ConceptBert: Concept-Aware Representation for Visual Question Answering , 2020, FINDINGS.
[35] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[36] Matthieu Cord,et al. BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection , 2019, AAAI.
[37] Jonathan Berant,et al. MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension , 2019, ACL.
[38] Dhruv Batra,et al. Human Attention in Visual Question Answering: Do Humans and Deep Networks look at the same regions? , 2016, EMNLP.
[39] Jian Zhang,et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text , 2016, EMNLP.
[40] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[41] Sameer Singh,et al. Are Red Roses Red? Evaluating Consistency of Question-Answering Models , 2019, ACL.
[42] Xuchen Yao,et al. Information Extraction over Structured Data: Question Answering with Freebase , 2014, ACL.
[43] William W. Cohen,et al. PullNet: Open Domain Question Answering with Iterative Retrieval on Knowledge Bases and Text , 2019, EMNLP.
[44] ChengXiang Zhai,et al. Positional language models for information retrieval , 2009, SIGIR.
[45] Weifeng Zhang,et al. Cross-modal Knowledge Reasoning for Knowledge-based Visual Question Answering , 2020, Pattern Recognit.
[46] Lei Li,et al. Dynamically Fused Graph Network for Multi-hop Reasoning , 2019, ACL.
[47] Ruslan Salakhutdinov,et al. Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text , 2018, EMNLP.