From Heatmaps to Structural Explanations of Image Classifiers

This paper summarizes our efforts over the past few years to explain image classifiers, including negative results and the insights we have gained. The paper starts by describing the explainable neural network (XNN), which attempts to extract and visualize several high-level concepts purely from the deep network, without relying on human linguistic concepts. This helps users understand network classifications that are less intuitive and substantially improves user performance on a difficult fine-grained classification task of discriminating among different species of seagulls. Realizing that an important missing piece was a reliable heatmap visualization tool, we developed I-GOS and iGOS++, which utilize integrated gradients to avoid local optima in heatmap generation and improve performance across all resolutions. During the development of those visualizations, we realized that for a significant number of images, the classifier has multiple different paths to reach a confident prediction. This led to our recent development of structured attention graphs (SAGs), an approach that utilizes beam search to locate multiple coarse heatmaps for a single image and compactly visualizes a set of heatmaps by capturing how different combinations of image regions impact the confidence of a classifier. Throughout this research, we have gained many insights into building deep network explanations, the existence and frequency of multiple explanations, and various tricks of the trade that make explanations work. We share these insights and opinions with readers in the hope that some of them will be informative for future researchers working on explainable deep learning.
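
The abstract only sketches the I-GOS idea at a high level; the snippet below is a minimal, hedged illustration of what such a mask-optimization loop could look like in PyTorch: a coarse mask is optimized so that deleting (blurring) the masked regions lowers the classifier's confidence, with update steps taken along integrated gradients rather than plain gradients. The function name, hyperparameters, blurred baseline, and the simple projected-gradient update are all illustrative assumptions, not the authors' actual implementation.

    # Illustrative sketch of mask-based heatmap generation in the spirit of I-GOS.
    # All names and hyperparameters are assumptions for exposition only.
    import torch
    import torch.nn.functional as F

    def igos_style_heatmap(model, image, target_class, mask_size=7,
                           steps=20, ig_points=5, lr=0.1):
        """image: (1, 3, H, W) tensor; returns a (1, 1, H, W) saliency map in [0, 1]."""
        model.eval()
        # Blurred copy of the image serves as the "deletion" baseline.
        baseline = F.avg_pool2d(image, 31, stride=1, padding=15)
        mask = torch.ones(1, 1, mask_size, mask_size, requires_grad=True)

        for _ in range(steps):
            grad_total = torch.zeros_like(mask)
            # Integrated gradients: average gradients along the straight path
            # from the all-ones mask (original image) to the current mask.
            for k in range(1, ig_points + 1):
                alpha = k / ig_points
                m = 1.0 + alpha * (mask - 1.0)
                up = F.interpolate(m, size=image.shape[-2:], mode='bilinear',
                                   align_corners=False)
                perturbed = image * up + baseline * (1.0 - up)
                score = F.softmax(model(perturbed), dim=1)[0, target_class]
                grad_total += torch.autograd.grad(score, mask)[0] / ig_points
            with torch.no_grad():
                mask -= lr * grad_total      # delete evidence to lower confidence
                mask.clamp_(0.0, 1.0)        # keep the mask a valid soft deletion

        saliency = 1.0 - mask.detach()       # deleted regions are the salient ones
        return F.interpolate(saliency, size=image.shape[-2:], mode='bilinear',
                             align_corners=False)

With an off-the-shelf ImageNet classifier (e.g., a torchvision ResNet) and a normalized input tensor, the returned coarse map could be upsampled and overlaid on the image as a heatmap; the actual I-GOS and iGOS++ objectives and regularizers are described in the cited papers.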
