Embedding Deep Networks into Visual Explanations

In this paper, we propose a novel explanation module for explaining the predictions made by a deep network. The explanation module embeds a high-dimensional deep network layer nonlinearly into a low-dimensional explanation space while retaining faithfulness, so that the original predictions can be reconstructed from the few concepts extracted by the module. We then visualize these concepts so that humans can learn about the high-level concepts the network uses to make decisions. We propose an algorithm called the Sparse Reconstruction Autoencoder (SRAE) for learning the embedding into the explanation space. SRAE aims to reconstruct part of the original feature space while remaining faithful to the network's output, and a pull-away term encourages the dimensions of the explanation space to be mutually orthogonal. A visualization system is then introduced to help humans understand the features in the explanation space. The proposed method is applied to explain CNN models on image classification tasks, and several novel metrics are introduced to evaluate explanation quality quantitatively without human involvement. Experiments show that the proposed approach generates informative explanations of the mechanisms CNNs use to make predictions.
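To make the structure concrete, below is a minimal sketch of an SRAE-style module in PyTorch. It is not the authors' implementation: the architecture, dimensions (d, k, n_classes), loss weights, and the simplification of reconstructing the full feature vector are all illustrative assumptions, and the sparsity machinery of the actual SRAE is omitted. The sketch shows the three ingredients named in the abstract: a nonlinear embedding into a low-dimensional explanation space, a reconstruction plus faithfulness objective, and a pull-away term that pushes the explanation dimensions toward orthogonality.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SRAE(nn.Module):
    """Sketch of a Sparse Reconstruction Autoencoder (SRAE)-style module.

    Maps a high-dimensional deep feature x (dim d) nonlinearly into a
    low-dimensional explanation space z (dim k), then (a) reconstructs
    the original features from z and (b) re-derives the network's
    prediction from z, so z stays faithful to the original model.
    """

    def __init__(self, d=4096, k=5, n_classes=200):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(d, 512), nn.ReLU(), nn.Linear(512, k))
        # The paper reconstructs only part of the feature space; for
        # simplicity this sketch reconstructs the full feature vector.
        self.decoder = nn.Linear(k, d)
        self.head = nn.Linear(k, n_classes)  # faithfulness: predict from z

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z), self.head(z)

def pull_away(z, eps=1e-8):
    """Penalize pairwise cosine similarity between explanation dimensions
    (measured across the batch), pushing them toward orthogonality."""
    zt = z.t()                                   # (k, batch): one row per concept
    zt = zt / (zt.norm(dim=1, keepdim=True) + eps)
    sim = zt @ zt.t()                            # (k, k) cosine similarities
    k = sim.size(0)
    off_diag = sim - torch.eye(k, device=z.device)
    return (off_diag ** 2).sum() / (k * (k - 1))

def srae_loss(x, y_orig, z, x_hat, y_hat, w_rec=1.0, w_faith=1.0, w_pt=0.1):
    rec = F.mse_loss(x_hat, x)        # reconstruct the original features
    faith = F.mse_loss(y_hat, y_orig) # match the original network's prediction
    return w_rec * rec + w_faith * faith + w_pt * pull_away(z)
```

In use, x would be the activations of a chosen high-dimensional layer of the trained CNN and y_orig its original prediction; the faithfulness term ties the few explanation dimensions z back to that prediction, while the pull-away penalty (a form popularized by energy-based GANs) discourages the k concepts from duplicating one another.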
