Improving visual question answering using dropout and enhanced question encoder
暂无分享,去创建一个
Hanqing Lu | Yong Li | Zhiwei Fang | Jing Liu | Yanyuan Qiao | Hanqing Lu | Jing Liu | Yong Li | Zhiwei Fang | Yanyuan Qiao
[1] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[2] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[3] Tamir Hazan,et al. High-Order Attention Models for Visual Question Answering , 2017, NIPS.
[4] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[6] Petros Daras,et al. Motion analysis: Action detection, recognition and evaluation based on motion capture data , 2018, Pattern Recognit..
[7] Dhruv Batra,et al. Human Attention in Visual Question Answering: Do Humans and Deep Networks look at the same regions? , 2016, EMNLP.
[8] Michael I. Jordan. Serial Order: A Parallel Distributed Processing Approach , 1997 .
[9] Zhou Yu,et al. Multi-modal Factorized Bilinear Pooling with Co-attention Learning for Visual Question Answering , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[10] Jun Xiao,et al. Multi-Turn Video Question Answering via Multi-Stream Hierarchical Attention Context Network , 2018, IJCAI.
[11] Anton van den Hengel,et al. Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[12] Dan Klein,et al. Neural Module Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[14] Deng Cai,et al. A Better Way to Attend: Attention With Trees for Video Question Answering , 2018, IEEE Transactions on Image Processing.
[15] Juan Carlos Niebles,et al. Leveraging Video Descriptions to Learn Video Question Answering , 2016, AAAI.
[16] Yoshua Bengio,et al. Fraternal Dropout , 2017, ICLR.
[17] Christopher Kermorvant,et al. Dropout Improves Recurrent Neural Networks for Handwriting Recognition , 2013, 2014 14th International Conference on Frontiers in Handwriting Recognition.
[18] Christopher Kanan,et al. Visual question answering: Datasets, algorithms, and future challenges , 2016, Comput. Vis. Image Underst..
[19] Qi Wu,et al. Image Captioning and Visual Question Answering Based on Attributes and External Knowledge , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[20] Yash Goyal,et al. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Mario Fritz,et al. Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[22] Priyanka Jadhav,et al. Automatic Caption Generation for News Images , 2014 .
[23] Adi Livnat,et al. Sex, mixability, and modularity , 2010, Proceedings of the National Academy of Sciences.
[24] Yueting Zhuang,et al. Video Question Answering via Hierarchical Dual-Level Attention Network Learning , 2017, ACM Multimedia.
[25] Matthieu Cord,et al. MUTAN: Multimodal Tucker Fusion for Visual Question Answering , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[26] Trevor Darrell,et al. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding , 2016, EMNLP.
[27] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[29] Qian Zhang,et al. A survey and analysis on automatic image annotation , 2018, Pattern Recognit..
[30] Yueting Zhuang,et al. Video Question Answering via Hierarchical Spatio-Temporal Attention Networks , 2017, IJCAI.
[31] Christopher Kanan,et al. An Analysis of Visual Question Answering Algorithms , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[32] Jiwen Lu,et al. Scene recognition with objectness , 2018, Pattern Recognit..
[33] Mario Fritz,et al. A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input , 2014, NIPS.
[34] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[35] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[36] Qi Wu,et al. Visual question answering: A survey of methods and datasets , 2016, Comput. Vis. Image Underst..
[37] Cheng-Lin Liu,et al. Adaptive spatial pooling for image classification , 2016, Pattern Recognit..
[38] Michael S. Bernstein,et al. Visual7W: Grounded Question Answering in Images , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Sanja Fidler,et al. Skip-Thought Vectors , 2015, NIPS.
[40] Jiasen Lu,et al. Hierarchical Question-Image Co-Attention for Visual Question Answering , 2016, NIPS.
[41] Zhou Yu,et al. Beyond Bilinear: Generalized Multimodal Factorized High-Order Pooling for Visual Question Answering , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[42] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[43] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[44] Zhou Yu,et al. Open-Ended Long-form Video Question Answering via Adaptive Hierarchical Reinforced Networks , 2018, IJCAI.
[45] F. Xavier Roca,et al. A coarse-to-fine approach for fast deformable object detection , 2015, Pattern Recognit..
[46] Alexander J. Smola,et al. Stacked Attention Networks for Image Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[47] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[48] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[49] Yash Goyal,et al. Yin and Yang: Balancing and Answering Binary Visual Questions , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).