论文信息 - A new visual question answering system for medical images characterization

A new visual question answering system for medical images characterization

This article presents our proposed system combining a Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN) for the visual question answering applied in the medical images characterization. In our proposed Encoder-Decoder Model we have used a pre-trained convolutional neural network to extract image features, a pre-trained word embedding for questions-answers representation.

Dahdouh Yousra | Allaouzi Imane | Ben Ahmed Mohamed | Bghiel Afrae | Anouar Boudhir Abdelhakim

[1] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[2] Mohamed Ben Ahmed,et al. Deep Neural Networks and Decision Tree Classifier for Visual Question Answering in the Medical Domain , 2018, CLEF.

[3] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4] Peter Hall,et al. Integrating vision processing and natural language processing with a clinical application , 1995, Proceedings 1995 Second New Zealand International Two-Stream Conference on Artificial Neural Networks and Expert Systems.

[5] Mohamed Ben Ahmed,et al. An Encoder-Decoder Model for Visual Question Answering in the Medical Domain , 2019, CLEF.

[6] Sebastian Ruder,et al. An overview of gradient descent optimization algorithms , 2016, Vestnik komp'iuternykh i informatsionnykh tekhnologii.

[7] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[8] Alexander J. Smola,et al. Stacked Attention Networks for Image Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9] Fuji Ren,et al. Employing Inception-Resnet-v2 and Bi-LSTM for Medical Domain Visual Question Answering , 2018, CLEF.

[10] Feifan Liu,et al. UMass at ImageCLEF Medical Visual Question Answering(Med-VQA) 2018 Task , 2018, CLEF.

[11] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[12] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13] Sonit Singh,et al. Pushing the Limits of Radiology with Joint Modeling of Visual and Textual Information , 2018, ACL.

[14] Yang Yin,et al. Hybrid LSTM Neural Network for Short-Term Traffic Flow Prediction , 2019, Inf..

[15] Yuandong Tian,et al. Simple Baseline for Visual Question Answering , 2015, ArXiv.