Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions
暂无分享,去创建一个
Tao Mei | Jiebo Luo | Jianlong Fu | Qing Li | Dongfei Yu | Jiebo Luo | Tao Mei | Jianlong Fu | Qing Li | D. Yu
[1] Jianwei Yang,et al. Neural Baby Talk , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[2] Mario Fritz,et al. Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[3] Bohyung Han,et al. Image Question Answering Using Convolutional Neural Network with Dynamic Parameter Prediction , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Liangliang Cao,et al. Focal Visual-Text Attention for Visual Question Answering , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[5] Geoffrey Zweig,et al. From captions to visual concepts and back , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Chunhua Shen,et al. What Value Do Explicit High Level Concepts Have in Vision to Language Problems? , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Allan Jabri,et al. Revisiting Visual Question Answering Baselines , 2016, ECCV.
[8] Richard Socher,et al. Dynamic Memory Networks for Visual and Textual Question Answering , 2016, ICML.
[9] Michael S. Bernstein,et al. Visual7W: Grounded Question Answering in Images , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[11] Geoffrey E. Hinton,et al. Generating Text with Recurrent Neural Networks , 2011, ICML.
[12] Xinlei Chen,et al. Microsoft COCO Captions: Data Collection and Evaluation Server , 2015, ArXiv.
[13] Jiebo Luo,et al. Image Captioning with Semantic Attention , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[15] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[16] Jung-Woo Ha,et al. Dual Attention Networks for Multimodal Reasoning and Matching , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[18] Alexander J. Smola,et al. Stacked Attention Networks for Image Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Shuicheng Yan,et al. A Focused Dynamic Attention Model for Visual Question Answering , 2016, ArXiv.
[20] Dhruv Batra,et al. Analyzing the Behavior of Visual Question Answering Models , 2016, EMNLP.
[21] Jiasen Lu,et al. Hierarchical Question-Image Co-Attention for Visual Question Answering , 2016, NIPS.
[22] Jason Weston,et al. Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks , 2015, ICLR.
[23] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Anton van den Hengel,et al. Graph-Structured Representations for Visual Question Answering , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[26] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[27] Wei Xu,et al. Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question , 2015, NIPS.
[28] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[29] Jianfei Cai,et al. VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions , 2018, ECCV.
[30] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Saurabh Singh,et al. Where to Look: Focus Regions for Visual Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Tao Mei,et al. Multi-level Attention Networks for Visual Question Answering , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[34] Anton van den Hengel,et al. Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[35] Peng Wang,et al. Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge from External Sources , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Qi Wu,et al. The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Yash Goyal,et al. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Kate Saenko,et al. Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering , 2015, ECCV.
[39] Wei Xu,et al. Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) , 2014, ICLR.
[40] Trevor Darrell,et al. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding , 2016, EMNLP.
[41] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[43] Jiebo Luo,et al. VizWiz Grand Challenge: Answering Visual Questions from Blind People , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[44] Richard S. Zemel,et al. Exploring Models and Data for Image Question Answering , 2015, NIPS.