Towards better understanding of gradient-based attribution methods for Deep Neural Networks

Understanding the flow of information in Deep Neural Networks (DNNs) is a challenging problem that has gained increasing attention over the last few years. While several methods have been proposed to explain network predictions, there have been only a few attempts to compare them from a theoretical perspective, and no exhaustive empirical comparison has been performed to date. In this work, we analyze four gradient-based attribution methods and formally prove conditions of equivalence and approximation between them. By reformulating two of these methods, we construct a unified framework which enables a direct comparison, as well as a simpler implementation. Finally, we propose a novel evaluation metric, called Sensitivity-n, and test the gradient-based attribution methods alongside a simple perturbation-based attribution method on several datasets in the domains of image and text classification, using various network architectures.
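
As a minimal illustration of what a gradient-based attribution method computes, the sketch below implements the simple "gradient * input" heuristic in PyTorch. The names `model`, `x`, and `target_class` are assumptions made for this example; the sketch shows the general idea of attributing an output score to input features via backpropagation, not the paper's unified framework itself.

```python
# Minimal sketch of a gradient-based attribution (gradient * input), assuming
# a differentiable PyTorch classifier `model` and a single input `x` with a
# leading batch dimension of 1. All names here are illustrative.
import torch

def gradient_times_input(model, x, target_class):
    """Attribute the target-class score to each input feature of x."""
    x = x.clone().detach().requires_grad_(True)
    score = model(x)[0, target_class]   # scalar score of the class of interest
    score.backward()                    # gradient of the score w.r.t. the input
    return (x.grad * x).detach()        # elementwise product: gradient * input
```

Attributions of this kind require only a forward and a backward pass, whereas perturbation-based baselines, which occlude features and re-evaluate the network, need one forward pass per perturbed feature and are therefore typically far more expensive.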
