Automatic labeling of continuous wave Doppler images based on combined image and sentence networks

As medical imaging datasets grow, we are approaching the era of big data for radiologic decision support systems. This requires renewed efforts in dataset curation and labeling. We propose a methodology for weak labeling of medical images for attributes such as anatomy and disease that relies on image-to-sentence transformation. The methodology consists of three models: a convolutional neural network that is trained on a coarse classification task and acts as an image feature generator, a language model that maps sentences to a fixed-length space, and a multi-layer perceptron that acts as a function approximator to map images to the sentence space. The transform model is trained on matched image-sentence pairs from a dataset of echocardiography studies. For a given image, labels are extracted from the sentences closest to the output of the image-sentence transform. We show that the resulting solution achieves 78.2% accuracy in labeling Doppler images with aortic stenosis. We also show that the retrieved sentences are consistent in meaning with the true sentences, with an average BLEU score of 0.34, on par with current high-performing machine translation solutions.
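The final retrieval-and-labeling step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the image has already been mapped into the sentence space, and the cosine similarity measure, the keyword rules, the majority vote, and all names (`nearest_sentences`, `label_sentence`, `vote_label`, the toy vectors and sentences) are hypothetical choices made for the example.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_sentences(query_vec, sentence_vecs, sentences, k=3):
    """Return the k sentences whose embeddings are closest to query_vec."""
    ranked = sorted(
        zip(sentences, sentence_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [sentence for sentence, _ in ranked[:k]]


def label_sentence(sentence, rules):
    """Assign one label per sentence using the first matching keyword rule."""
    text = sentence.lower()
    for keyword, label in rules:
        if keyword in text:
            return label
    return None


def vote_label(retrieved, rules, default="unlabeled"):
    """Majority vote over the labels of the retrieved sentences."""
    votes = Counter()
    for sentence in retrieved:
        label = label_sentence(sentence, rules)
        if label is not None:
            votes[label] += 1
    return votes.most_common(1)[0][0] if votes else default


# Toy sentence space: in the paper these vectors would come from the
# language model; here they are made-up 3-dimensional values.
SENTENCES = [
    "Severe aortic stenosis is present.",
    "No aortic stenosis.",
    "Moderate aortic stenosis with peak velocity elevated.",
    "Normal mitral inflow pattern.",
]
SENTENCE_VECS = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.0, 1.0],
]

# Ordered rules: the negated phrase is checked before the plain one.
RULES = [
    ("no aortic stenosis", "no aortic stenosis"),
    ("aortic stenosis", "aortic stenosis"),
]

# Stand-in for the output of the image-to-sentence transform for one image.
image_in_sentence_space = [1.0, 0.1, 0.0]

retrieved = nearest_sentences(image_in_sentence_space, SENTENCE_VECS, SENTENCES, k=3)
label = vote_label(retrieved, RULES)
```

In this sketch the image feature generator and the multi-layer perceptron are replaced by a fixed vector, so only the nearest-sentence lookup and the label extraction are exercised; the number of retrieved sentences `k` is likewise an assumed parameter.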
