Overview of ImageCLEF 2018 Medical Domain Visual Question Answering Task

This paper presents an overview of the inaugural edition of the Medical Domain Visual Question Answering (VQA-Med) task, held at ImageCLEF 2018. Inspired by the recent success of visual question answering in the general domain, a pilot task was proposed this year to focus on visual question answering in the medical domain. Given medical images accompanied by clinically relevant questions, participating systems were tasked with answering the questions based on the visual content of the images. The dataset comprised 6,413 question-answer pairs and 2,866 medical images extracted from PubMed Central articles. Of these, 5,413 question-answer pairs with 2,278 images were used for training, 500 question-answer pairs with 324 images for validation, and 500 questions with 264 images for testing. Of the 28 registered participants, 5 groups submitted a total of 17 runs, indicating considerable interest in the VQA-Med task.
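
The split sizes reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the task's tooling: the dictionary layout and the helper name `share` are assumptions made for illustration, and only the counts come from the text.

```python
# Split sizes as reported in the overview. The dictionary layout is
# illustrative; the test split contains questions only, counted here
# alongside the question-answer pairs as in the reported total.
splits = {
    "training": {"qa_pairs": 5413, "images": 2278},
    "validation": {"qa_pairs": 500, "images": 324},
    "test": {"qa_pairs": 500, "images": 264},
}

# Totals across the three splits.
total_qa = sum(s["qa_pairs"] for s in splits.values())
total_images = sum(s["images"] for s in splits.values())


def share(split_name):
    """Return the fraction of all questions that fall in the given split."""
    return splits[split_name]["qa_pairs"] / total_qa


if __name__ == "__main__":
    print("questions:", total_qa, "images:", total_images)
    for name in splits:
        print(f"{name}: {share(name):.1%} of questions")
```

The totals computed this way match the 6,413 question-answer pairs and 2,866 images stated for the full dataset.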
