A new visual question answering system for medical images characterization

This article presents a system that combines a Recurrent Neural Network (RNN) and a Convolutional Neural Network (CNN) for visual question answering applied to medical image characterization. The proposed encoder-decoder model uses a pre-trained convolutional neural network to extract image features and a pre-trained word embedding to represent questions and answers.
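To make the data flow of such an encoder-decoder concrete, the following is a minimal sketch, not the authors' implementation. Everything specific in it is an assumption made for illustration: the toy vocabulary, the dimensions, the plain tanh RNN cell, the use of image features to initialize the encoder state, and the random matrices standing in for the pre-trained CNN and the pre-trained word embedding. The weights are untrained, so the generated answer is meaningless; only the structure matters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary; a real system would use the dataset's vocabulary.
VOCAB = ["<start>", "<end>", "what", "is", "shown", "mri", "ct", "brain"]
TOK = {w: i for i, w in enumerate(VOCAB)}
EMB_DIM, IMG_DIM, HID = 8, 16, 12  # illustrative sizes (assumed)

# Stand-ins for the frozen, pre-trained components (random here).
embedding = rng.normal(size=(len(VOCAB), EMB_DIM))  # word-embedding table
cnn_proj = rng.normal(size=(IMG_DIM, 3))            # placeholder "CNN" weights


def cnn_features(image):
    """Placeholder for a pre-trained CNN: average the pixels per channel,
    then project the 3 channel means to an IMG_DIM feature vector."""
    return cnn_proj @ image.mean(axis=(0, 1))


def init(*shape):
    """Small random weights for the (untrained) trainable layers."""
    return rng.normal(scale=0.1, size=shape)


params = {
    "W_img": init(HID, IMG_DIM),                       # image features -> state
    "enc_Wx": init(HID, EMB_DIM), "enc_Wh": init(HID, HID),
    "dec_Wx": init(HID, EMB_DIM), "dec_Wh": init(HID, HID),
    "W_out": init(len(VOCAB), HID),                    # state -> vocab scores
}


def rnn_step(x, h, Wx, Wh):
    """One step of a plain tanh RNN cell (assumed cell type)."""
    return np.tanh(Wx @ x + Wh @ h)


def encode(image, question):
    """Initialize the state from image features, then read the question."""
    h = np.tanh(params["W_img"] @ cnn_features(image))
    for word in question:
        h = rnn_step(embedding[TOK[word]], h, params["enc_Wx"], params["enc_Wh"])
    return h


def decode(h, max_len=5):
    """Greedy decoding: feed back the highest-scoring token until <end>."""
    token, answer = TOK["<start>"], []
    for _ in range(max_len):
        h = rnn_step(embedding[token], h, params["dec_Wx"], params["dec_Wh"])
        token = int(np.argmax(params["W_out"] @ h))
        if token == TOK["<end>"]:
            break
        answer.append(VOCAB[token])
    return answer


image = rng.random((32, 32, 3))  # fake RGB image in place of a medical scan
state = encode(image, ["what", "is", "shown"])
answer = decode(state)
print(answer)
```

In a trained system, only the layers in `params` would typically be learned, while the CNN and the embedding table would be loaded from pre-trained weights and optionally fine-tuned.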
