Cross-Modal Information Maximization for Medical Imaging: CMIM

In hospitals, data are siloed in separate information systems, so the same underlying information is available under different modalities, such as the various medical imaging exams a patient undergoes (CT, MRI, PET, ultrasound, etc.) and their associated radiology reports. This offers a unique opportunity to obtain and use, at train time, multiple views of the same information that might not always be available at test time. In this paper, we propose a framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test time, using recent advances in mutual information maximization. By maximizing cross-modal information at train time, we outperform several state-of-the-art baselines in two settings: medical image classification and segmentation. In particular, our method is shown to have a strong impact on the inference-time performance of weaker modalities.
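As a rough illustration of what maximizing cross-modal information at train time can look like, the sketch below computes an InfoNCE-style contrastive loss between paired embeddings of two modalities (for example, an image and its report). The abstract does not state which mutual information estimator is used, so the choice of InfoNCE, the function name `info_nce`, and the `temperature` parameter are assumptions made for illustration only.

```python
import numpy as np


def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE loss between paired embeddings of two modalities.

    Hypothetical helper, not the paper's implementation.
    z_a, z_b: arrays of shape (batch, dim); row i of each array is
    assumed to come from the same patient or exam.
    """
    # L2-normalise so that dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    # Similarity of every modality-A sample to every modality-B sample.
    logits = z_a @ z_b.T / temperature  # shape (batch, batch)

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Matching pairs sit on the diagonal; the loss is their mean
    # negative log-probability.
    return -np.mean(np.diag(log_probs))


rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
noise = 0.01 * rng.normal(size=(8, 16))

aligned = info_nce(z, z + noise)   # views of the same samples
shuffled = info_nce(z, z[::-1])    # deliberately mismatched pairs
```

Minimizing this loss maximizes a lower bound on the mutual information between the two views, log(batch size) minus the loss, so correctly paired embeddings should give a smaller loss than mismatched ones. In this setting, only one modality's encoder would be needed at test time, which is what makes the learned representation usable when a modality is dropped.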
