From Superficial to Deep: Language Bias driven Curriculum Learning for Visual Question Answering
Mingrui Lao | Yu Liu | Nan Pu | Wei Chen | Yanming Guo | Michael S. Lew