RepeatPadding: Balancing words and sentence length for language comprehension in visual question answering
[1] Margaret Mitchell, et al. VQA: Visual Question Answering, 2015, International Journal of Computer Vision.
[2] J. Weijer, et al. Word length, sentence length and frequency: Zipf revisited, 2004.
[3] Richard S. Zemel, et al. Exploring Models and Data for Image Question Answering, 2015, NIPS.
[4] Qi Wu, et al. Visual question answering: A survey of methods and datasets, 2016, Comput. Vis. Image Underst.
[5] Christopher Potts, et al. A large annotated corpus for learning natural language inference, 2015, EMNLP.
[6] T. Bever, et al. Sentence comprehension: The integration of habits and rules. David J. Townsend and Thomas G. Bever. Cambridge, MA: MIT Press, 2001. Pp. 455, 2002, Applied Psycholinguistics.
[7] Mario Fritz, et al. A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input, 2014, NIPS.
[8] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[9] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[10] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[11] Yoshua Bengio, et al. A Neural Probabilistic Language Model, 2003, J. Mach. Learn. Res.
[12] Matthieu Cord, et al. BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection, 2019, AAAI.
[13] Yash Goyal, et al. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Meng Liu, et al. Online Data Organizer: Micro-Video Categorization by Structure-Guided Multimodal Dictionary Learning, 2019, IEEE Transactions on Image Processing.
[15] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[16] S. Piantadosi. Zipf’s word frequency law in natural language: A critical review and future directions, 2014, Psychonomic Bulletin & Review.
[17] Hanzhang Wang, et al. Categorizing Concepts with Basic Level for Vision-to-Language, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[18] Thomas G. Bever, et al. Sentence Comprehension: The Integration of Habits and Rules, 2001.
[19] Anton van den Hengel, et al. Visual Question Answering as Reading Comprehension, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Zhou Yu, et al. Deep Modular Co-Attention Networks for Visual Question Answering, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Yu Cheng, et al. Relation-Aware Graph Attention Network for Visual Question Answering, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[22] Byoung-Tak Zhang, et al. Multimodal Residual Learning for Visual QA, 2016, NIPS.
[23] Sanja Fidler, et al. Skip-Thought Vectors, 2015, NIPS.
[24] Michael S. Bernstein, et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations, 2016, International Journal of Computer Vision.
[25] Byoung-Tak Zhang, et al. Bilinear Attention Networks, 2018, NeurIPS.
[26] Dennis Koelma, et al. The ImageNet Shuffle: Reorganized Pre-training for Video Event Detection, 2016, ICMR.
[27] Lei Zou, et al. Interactive natural language question answering over knowledge graphs, 2019, Inf. Sci.
[28] Jiasen Lu, et al. Hierarchical Question-Image Co-Attention for Visual Question Answering, 2016, NIPS.
[29] Mirella Lapata, et al. Learning to Paraphrase for Question Answering, 2017, EMNLP.
[30] Liqiang Nie, et al. Low-Rank Regularized Multi-Representation Learning for Fashion Compatibility Prediction, 2020, IEEE Transactions on Multimedia.
[31] Anton van den Hengel, et al. Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[32] Alexander J. Smola, et al. Stacked Attention Networks for Image Question Answering, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.
[34] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[35] Yue Gao, et al. Beyond Text QA: Multimedia Answer Generation by Harvesting Web Information, 2013, IEEE Transactions on Multimedia.
[36] Peng Wang, et al. Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge from External Sources, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Jakob Uszkoreit, et al. A Decomposable Attention Model for Natural Language Inference, 2016, EMNLP.
[38] Bin Li, et al. CNN-Based Adversarial Embedding for Image Steganography, 2019, IEEE Transactions on Information Forensics and Security.
[39] Meng Wang, et al. Disease Inference from Health-Related Questions via Sparse Deep Learning, 2015, IEEE Transactions on Knowledge and Data Engineering.
[40] Meng Wang, et al. Multimedia answering: enriching text QA with media information, 2011, SIGIR.
[41] Luke S. Zettlemoyer, et al. AllenNLP: A Deep Semantic Natural Language Processing Platform, 2018, ArXiv.
[42] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[43] Zhou Yu, et al. Beyond Bilinear: Generalized Multimodal Factorized High-Order Pooling for Visual Question Answering, 2017, IEEE Transactions on Neural Networks and Learning Systems.
[44] W. Nelson Francis, et al. Beschrijving en interpretare van Lingulstische Frequences: Computational analysis of present-day American English, 1968.
[45] Lei Zhang, et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[46] Sung-Hyon Myaeng, et al. Semantic passage segmentation based on sentence topics for question answering, 2007, Inf. Sci.