Exemplary Natural Images Explain CNN Activations Better than Feature Visualizations

Feature visualizations such as synthetic maximally activating images are a widely used explanation method to better understand the information processing of convolutional neural networks (CNNs). At the same time, there are concerns that these visualizations might not accurately represent CNNs' inner workings. Here, we measure how much extremely activating images help humans to predict CNN activations. Using a well-controlled psychophysical paradigm, we compare the informativeness of synthetic images (Olah et al., 2017) with a simple baseline visualization, namely exemplary natural images that also strongly activate a specific feature map. Given either synthetic or natural reference images, human participants choose which of two query images leads to strong positive activation. The experiment is designed to maximize participants' performance, and is the first to probe intermediate instead of final layer representations. We find that synthetic images indeed provide helpful information about feature map activations (82% accuracy; chance would be 50%). However, natural images, originally intended to be a baseline, outperform synthetic images by a wide margin (92% accuracy). Additionally, participants are faster and more confident for natural images, whereas subjective impressions about the interpretability of feature visualizations are mixed. The higher informativeness of natural images holds across most layers, for both expert and lay participants as well as for hand- and randomly-picked feature visualizations. Even if only a single reference image is given, synthetic images provide less information than natural images (65% vs. 73%). In summary, popular synthetic images from feature visualizations are significantly less informative for assessing CNN activations than natural images. We argue that future visualization methods should improve over this simple baseline.
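The "exemplary natural images" baseline described in the abstract amounts to ranking a pool of natural images by how strongly they drive a chosen feature map. The sketch below illustrates one way this selection could be implemented; it is not the authors' code. A pretrained torchvision GoogLeNet stands in for whichever network is probed, the layer and channel choice (`inception4a`, channel 0), the `image_dir` path, and the use of the spatial mean as the activation score are all illustrative assumptions.

```python
# Minimal sketch (not the authors' pipeline): select the k natural images
# that most strongly activate one feature map of a pretrained GoogLeNet.
import heapq

import torch
import torchvision.models as models
import torchvision.transforms as T
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

model = models.googlenet(weights="DEFAULT").eval()

# Capture the output of one intermediate block via a forward hook.
activations = {}
def hook(_module, _inputs, output):
    activations["feat"] = output.detach()
model.inception4a.register_forward_hook(hook)

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
dataset = ImageFolder("image_dir", transform=preprocess)  # hypothetical image folder
loader = DataLoader(dataset, batch_size=32, num_workers=4)

channel, k = 0, 9   # feature map index and number of exemplars (both arbitrary)
top = []            # min-heap of (score, image index)

with torch.no_grad():
    for batch_idx, (images, _labels) in enumerate(loader):
        model(images)
        # Score each image by the spatial mean of the chosen feature map.
        scores = activations["feat"][:, channel].mean(dim=(1, 2))
        for i, score in enumerate(scores.tolist()):
            idx = batch_idx * loader.batch_size + i
            heapq.heappush(top, (score, idx))
            if len(top) > k:
                heapq.heappop(top)  # keep only the k strongest activations

exemplars = [idx for _score, idx in sorted(top, reverse=True)]
print("Most strongly activating image indices:", exemplars)
```

The same loop with `heapq` replaced by keeping the lowest scores would yield minimally activating exemplars, which is how a min/max reference set for a two-alternative forced-choice trial could be assembled.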
