A Question-Centric Model for Visual Question Answering in Medical Imaging
Raphael Sznitman | Tommy Löfstedt | Minh H. Vu | Tufve Nyholm