Qinghua Zheng | Jie Ma | Jun Liu | Junjun Li | Qingyu Yin | Jianlong Zhou | Yi Huang
[1] Joshua B. Tenenbaum, et al. Separating Style and Content, 1996, NIPS.
[2] Margaret Mitchell, et al. VQA: Visual Question Answering, 2015, International Journal of Computer Vision.
[3] Jonghyun Choi, et al. Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Alexander J. Smola, et al. Stacked Attention Networks for Image Question Answering, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Apoorv Saxena, et al. Improving Multi-hop Question Answering over Knowledge Graphs using Knowledge Base Embeddings, 2020, ACL.
[6] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Zhou Yu, et al. Multi-modal Factorized Bilinear Pooling with Co-attention Learning for Visual Question Answering, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[8] Geoffrey E. Hinton, et al. A Simple Framework for Contrastive Learning of Visual Representations, 2020, ICML.
[9] Byoung-Tak Zhang, et al. Bilinear Attention Networks, 2018, NeurIPS.
[10] Xin Hu, et al. Jointly Optimized Neural Coreference Resolution with Mutual Attention, 2020, WSDM.
[11] Jaewoo Kang, et al. Ranking Paragraphs for Improving Answer Recall in Open-Domain Question Answering, 2018, EMNLP.
[12] Matthieu Cord, et al. MUTAN: Multimodal Tucker Fusion for Visual Question Answering, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[13] Lei Zhang, et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[14] Peng Gao, et al. Multi-Modality Latent Interaction Network for Visual Question Answering, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[15] Chuang Gan, et al. The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences from Natural Supervision, 2019, ICLR.
[16] Wei Wang, et al. Multi-Granularity Hierarchical Attention Fusion Networks for Reading Comprehension and Question Answering, 2018, ACL.
[17] Zhou Yu, et al. Deep Modular Co-Attention Networks for Visual Question Answering, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Parisa Kordjamshidi, et al. Cross-Modality Relevance for Reasoning on Language and Vision, 2020, ACL.
[19] Wendy Grace Lehnert, et al. The Process of Question Answering, 2022.
[20] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[21] Mohit Bansal, et al. Revealing the Importance of Semantic Retrieval for Machine Reading at Scale, 2019, EMNLP.
[22] Chang Zhou, et al. Cognitive Graph for Multi-Hop Reading Comprehension at Scale, 2019, ACL.
[23] Mahmoud Khademi, et al. Multimodal Neural Graph Memory Networks for Visual Question Answering, 2020, ACL.
[24] Chuang Gan, et al. Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding, 2018, NeurIPS.