An Extended Evaluation of the Impact of Different Modules in ST-VQA Systems

Mickaël Coustaty | Antoine Doucet | Nicholas Journet | Juan C. Caicedo | Viviana Beltrán

[1] Ernest Valveny, et al. Scene Text Visual Question Answering, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[2] Lei Zhang, et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4] Shashank Shekhar, et al. OCR-VQA: Visual Question Answering by Reading Text in Images, 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).

[5] Matthieu Cord, et al. MUREL: Multimodal Relational Reasoning for Visual Question Answering, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[6] Mickaël Coustaty, et al. Semantic Text Recognition via Visual Question Answering, 2019, 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW).

[7] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.

[8] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.

[9] Shashank Shekhar, et al. From Strings to Things: Knowledge-Enabled VQA Model That Can Read and Reason, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[10] Gaofeng Meng, et al. Scene text detection and recognition with advances in deep learning: a survey, 2019, International Journal on Document Analysis and Recognition (IJDAR).

[11] Tomas Mikolov, et al. Enriching Word Vectors with Subword Information, 2016, TACL.

[12] Xinlei Chen, et al. Towards VQA Models That Can Read, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).