Vision-Language Transformer for Interpretable Pathology Visual Question Answering