Group visualization of class-discriminative features

Research on explaining the behavior of convolutional neural networks (CNNs) has attracted considerable attention in recent years. Although many visualization methods have been proposed to explain network predictions, most fail to provide a clear correlation between the target output and the features extracted by the convolutional layers. In this work, we introduce the concept of class-discriminative feature groups, which denote features extracted by groups of convolutional kernels correlated with a particular image class. We propose a method to detect class-discriminative feature groups and a visualization method that highlights the image regions correlated with a particular output and interprets the class-discriminative feature groups intuitively. Experiments show that the proposed method can disentangle features by image class and shed light on which feature groups are extracted from which regions of an image. We also apply the method to visualize "lost" features in adversarial samples and features in an image containing a non-class object, demonstrating its ability to debug why the network failed or succeeded.
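
The abstract does not spell out how class-discriminative feature groups are detected, but the general idea of grouping convolutional kernels by their correlation with a class output can be illustrated with a minimal sketch. The snippet below is an assumption-based approximation in the spirit of gradient-based channel scoring, not the method proposed in the paper; the backbone (VGG-16), the layer index, the group size k, and the file name "input.jpg" are all illustrative assumptions.

```python
# Minimal sketch, NOT the paper's algorithm: score each channel of the last
# convolutional layer by the gradient of a class logit, treat the top-k
# channels as one hypothetical "class-discriminative feature group", and turn
# their weighted activations into a heatmap over the input image.
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

store = {}
def hook(module, inputs, output):
    output.retain_grad()          # keep the gradient of this feature map
    store["feat"] = output

# Last conv layer of the VGG-16 feature extractor (index 28, an assumption).
model.features[28].register_forward_hook(hook)

x = preprocess(Image.open("input.jpg").convert("RGB")).unsqueeze(0)
logits = model(x)
target = logits.argmax(dim=1).item()
logits[0, target].backward()                  # gradients of the class score

feat = store["feat"].detach()[0]              # (C, H, W) activations
grads = store["feat"].grad[0]                 # (C, H, W) gradients
scores = grads.mean(dim=(1, 2))               # per-channel class relevance

# The top-k channels form one candidate class-discriminative group.
k = 16
group = scores.topk(k).indices
heatmap = F.relu((feat[group] * scores[group, None, None]).sum(dim=0))
heatmap = F.interpolate(heatmap[None, None], size=(224, 224),
                        mode="bilinear", align_corners=False)[0, 0]
heatmap = heatmap / (heatmap.max() + 1e-8)    # normalized map to overlay on the image
```

Under these assumptions, the selected channel indices play the role of one feature group and the heatmap indicates from which image regions that group is extracted; the paper's actual detection and visualization procedures may differ.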
