Multimodal Learning with Incomplete Modalities by Knowledge Distillation

Multimodal learning aims to use information from a variety of data modalities to improve generalization performance. One common approach is to learn from the information shared among the modalities; alternatively, the supplementary information can be fused so that modality-specific information is also exploited. Although the supplementary information is often desirable, most existing multimodal approaches can only learn from samples with complete modalities, which wastes a considerable amount of the collected data. The alternative is model-based imputation of the missing values, which may introduce undesired noise, especially when the sample size is limited. In this paper, we propose a framework based on knowledge distillation that uses the supplementary information from all modalities while avoiding imputation and the noise associated with it. Specifically, we first train a model on each modality independently, using all the data available for that modality. The trained models then serve as teachers for a student model, which is trained on the samples that have complete modalities. We demonstrate the effectiveness of the proposed method in extensive empirical studies on both synthetic and real-world datasets.
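The two-stage procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the teachers and the student are plain logistic regressions, the student target is a simple blend of the hard label and the averaged teacher predictions, and the names `fit_logistic`, `predict_proba`, `distill`, and the blending weight `alpha` are hypothetical. The abstract does not specify the model class or the exact distillation loss.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def fit_logistic(X, t, steps=300, lr=0.5):
    """Fit logistic regression to (possibly soft) targets t in [0, 1] by gradient descent."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        # Gradient of the mean cross-entropy with respect to w.
        w -= lr * Xb.T @ (sigmoid(Xb @ w) - t) / len(Xb)
    return w

def predict_proba(w, X):
    return sigmoid(np.hstack([X, np.ones((len(X), 1))]) @ w)

def distill(modalities, observed, y, alpha=0.5):
    """modalities: list of (n, d_k) arrays; observed: list of boolean (n,) masks;
    y: (n,) labels in {0, 1}. `alpha` is an assumed blending weight."""
    # Stage 1: one teacher per modality, trained on every sample
    # for which that modality is observed (no imputation needed).
    teachers = [fit_logistic(X[m], y[m]) for X, m in zip(modalities, observed)]
    # Stage 2: the student only sees samples with all modalities present.
    complete = np.logical_and.reduce(observed)
    soft = np.mean(
        [predict_proba(w, X[complete]) for w, X in zip(teachers, modalities)],
        axis=0,
    )
    # Assumed distillation target: blend of hard labels and teacher outputs.
    target = (1 - alpha) * y[complete] + alpha * soft
    X_all = np.hstack([X[complete] for X in modalities])
    student = fit_logistic(X_all, target)
    return teachers, student, complete
```

Because each teacher is fit only on the rows where its modality is observed and the student only on complete rows, missing entries (for example rows filled with NaN) are never read, so no imputation step is required in this sketch.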

[38]  S. Rauch,et al.  Structural brain magnetic resonance imaging of limbic and thalamic volumes in pediatric bipolar disorder. , 2005, The American journal of psychiatry.