GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering
暂无分享,去创建一个
[1] R. L. Ebel,et al. An Analysis of Test-Wiseness , 1965 .
[2] George A. Miller,et al. WordNet: A Lexical Database for English , 1995, HLT.
[3] Jeffery J. Mondak,et al. Asked and Answered: Knowledge Levels When We Will Not Take “Don't Know” for an Answer , 2001 .
[4] M. Bar-Hillel,et al. Guess Where: The Position of Correct Answers in Multiple-Choice Test Items as a Psychometric Variable , 2003 .
[5] Leonidas J. Guibas,et al. The Earth Mover's Distance as a Metric for Image Retrieval , 2000, International Journal of Computer Vision.
[6] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[7] Alexei A. Efros,et al. Unbiased look at dataset bias , 2011, CVPR 2011.
[8] Miljenko Huzak,et al. Chi-Square Distribution , 2011, International Encyclopedia of Statistical Science.
[9] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[10] Mario Fritz,et al. A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input , 2014, NIPS.
[11] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[12] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[13] Michael S. Bernstein,et al. Image retrieval using scene graphs , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[15] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[16] David A. Shamma,et al. The New Data and New Challenges in Multimedia Research , 2015, ArXiv.
[17] Dan Klein,et al. Neural Module Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Alexander J. Smola,et al. Stacked Attention Networks for Image Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] David A. Shamma,et al. YFCC100M , 2015, Commun. ACM.
[20] Allan Jabri,et al. Revisiting Visual Question Answering Baselines , 2016, ECCV.
[21] Dhruv Batra,et al. Human Attention in Visual Question Answering: Do Humans and Deep Networks look at the same regions? , 2016, EMNLP.
[22] Yash Goyal,et al. Yin and Yang: Balancing and Answering Binary Visual Questions , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Margaret Mitchell,et al. Generating Natural Questions About an Image , 2016, ACL.
[24] Dhruv Batra,et al. Analyzing the Behavior of Visual Question Answering Models , 2016, EMNLP.
[25] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[26] Michael S. Bernstein,et al. Visual7W: Grounded Question Answering in Images , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Trevor Darrell,et al. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding , 2016, EMNLP.
[28] Li Fei-Fei,et al. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Stefan Lee,et al. The Promise of Premise: Harnessing Question Premises in Visual Question Answering , 2017, EMNLP.
[30] Danfei Xu,et al. Scene Graph Generation by Iterative Message Passing , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Alexander G. Schwing,et al. Creativity: Generating Diverse Questions Using Variational Autoencoders , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Christopher Kanan,et al. Visual question answering: Datasets, algorithms, and future challenges , 2016, Comput. Vis. Image Underst..
[33] Christopher Kanan,et al. An Analysis of Visual Question Answering Algorithms , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[34] Yash Goyal,et al. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and VQA , 2017, ArXiv.
[36] Shaodi You,et al. Automatic Generation of Grounded Visual Questions , 2016, IJCAI.
[37] Anton van den Hengel,et al. Graph-Structured Representations for Visual Question Answering , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Jianfei Cai,et al. VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions , 2018, ECCV.
[39] Christopher D. Manning,et al. Compositional Attention Networks for Machine Reasoning , 2018, ICLR.
[40] Dhruv Batra,et al. Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[41] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[42] Anton van den Hengel,et al. Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[43] Trevor Darrell,et al. Multimodal Explanations: Justifying Decisions and Pointing to the Evidence , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[44] Yoav Artzi,et al. A Corpus for Reasoning about Natural Language Grounded in Photographs , 2018, ACL.