Monitoring Shortcut Learning using Mutual Information

The failure of deep neural networks to generalize to out-of-distribution data is a well-known problem and raises concerns about deploying trained networks in safety-critical domains such as healthcare, finance, and autonomous vehicles. We study a particular kind of distribution shift: shortcuts, or spurious correlations, in the training data. Shortcut learning is often only exposed when models are evaluated on real-world data that does not contain the same spurious correlations, leaving AI practitioners without a reliable way to assess how effective a trained model will be in real-world applications. In this work, we propose to use the mutual information (MI) between the learned representation and the input as a metric to identify when during training the network latches onto shortcuts. Experiments demonstrate that MI can be used as a domain-agnostic metric for monitoring shortcut learning.
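To make the proposed metric concrete, the following is a minimal sketch of estimating I(X; Z) between an input and a learned representation using a simple histogram (binning) estimator, in the spirit of the information-plane analyses of Tishby and colleagues. The function name, the scalar-summary inputs, and the choice of a binning estimator are illustrative assumptions, not necessarily the exact estimator used in the paper.

```python
import numpy as np

def binned_mutual_information(x, z, n_bins=30):
    """Estimate I(X; Z) in nats by discretizing paired samples into bins.

    x, z: 1-D arrays of paired scalar summaries (e.g. a projection of the
    input and of a hidden-layer activation). This is a simple plug-in
    estimator; it is biased upward for small samples or many bins.
    """
    # Map each sample to a bin index over its own range.
    x_bins = np.digitize(x, np.linspace(x.min(), x.max(), n_bins))
    z_bins = np.digitize(z, np.linspace(z.min(), z.max(), n_bins))

    # Empirical joint and marginal distributions over bin pairs.
    joint, _, _ = np.histogram2d(x_bins, z_bins, bins=n_bins)
    pxz = joint / joint.sum()
    px = pxz.sum(axis=1, keepdims=True)
    pz = pxz.sum(axis=0, keepdims=True)

    # I(X; Z) = sum p(x,z) log[ p(x,z) / (p(x) p(z)) ], over nonzero cells.
    nz = pxz > 0
    return float((pxz[nz] * np.log(pxz[nz] / (px @ pz)[nz])).sum())
```

In the monitoring setting, one would compute such an estimate on a fixed probe batch at each training checkpoint and watch the trajectory of I(X; Z): a representation that collapses onto a shortcut feature retains less information about the full input than one encoding the intended signal.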
