Saliency is a Possible Red Herring When Diagnosing Poor Generalization

Poor generalization is one symptom of models that learn to predict target variables from spuriously correlated image features present only in the training distribution, rather than from the true image features that characterize a class. It is often assumed that this failure can be diagnosed visually using attribution (a.k.a. saliency) maps. We study whether this assumption is correct. In some prediction tasks, such as those on medical images, a subset of images may come with masks drawn by a human expert, indicating the region of the image that contains the information relevant to the prediction. We study multiple methods that take advantage of such auxiliary labels by training networks to ignore distracting features that may be found outside the region of interest. This mask information is used only during training, and its impact on generalization accuracy depends on the severity of the shift between the training and test distributions. Surprisingly, while these methods do improve generalization performance in the presence of a covariate shift, there is no strong correspondence between redirecting attribution toward the features a human expert has labelled as important and improved generalization performance. These results suggest that the root cause of poor generalization may not always be spatially defined, and they raise questions about the utility of masks as “attribution priors” as well as of saliency maps for explainable predictions.
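The abstract does not spell out the training objective, but the family of methods it describes, in the spirit of Right for the Right Reasons and GradMask, can be sketched as a standard classification loss plus a penalty on input-gradient saliency falling outside the expert-drawn mask. The PyTorch function below is a minimal illustration of that idea only: the function name, the squared-gradient penalty, and the `lam` weight are assumptions for exposition, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def masked_saliency_loss(model, x, y, mask, lam=1.0):
    # x:    images, shape (B, C, H, W)
    # y:    integer class labels, shape (B,)
    # mask: binary expert ROI masks, shape (B, 1, H, W); 1 = relevant region
    # lam:  assumed penalty weight (illustrative hyperparameter)
    x = x.clone().requires_grad_(True)
    logits = model(x)
    task_loss = F.cross_entropy(logits, y)

    # Gradient of the target-class scores w.r.t. the input: a simple
    # (non-smoothed) saliency map. create_graph=True lets the penalty
    # itself be backpropagated through during training.
    class_scores = logits.gather(1, y.unsqueeze(1)).sum()
    grads, = torch.autograd.grad(class_scores, x, create_graph=True)

    # Penalize saliency mass that falls outside the region of interest,
    # pushing the network to ignore features found there.
    attribution_penalty = (grads * (1.0 - mask)).pow(2).mean()

    return task_loss + lam * attribution_penalty
```

At each training step one would compute `loss = masked_saliency_loss(model, x, y, mask)` and backpropagate it as usual; for images without an expert mask, the penalty vanishes if the mask is set to all ones, matching the abstract's note that mask information is used only during training.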
