Learning Propagation Rules for Attribution Map Generation

Prior gradient-based attribution-map methods rely on handcrafted propagation rules for the non-linear/activation layers during the backward pass, so as to produce gradients of the input and, from them, the attribution map. Despite promising results, such methods are sensitive to non-informative high-frequency components and do not adapt across models and samples. In this paper, we propose a dedicated attribution-map method that learns the propagation rules automatically, overcoming the flaws of handcrafted ones. Specifically, we attach a learnable plugin module to the non-linear layers during the backward pass, enabling an adaptive propagation rule for each pixel and yielding a mask over the input. The masked input image is then fed into the model again, and the new output, combined with the original one, serves as guidance for training the plugin. The learnable module can be trained under any auto-grad framework that supports higher-order differentiation. As demonstrated on five datasets and six network architectures, the proposed method yields state-of-the-art results and produces cleaner, more visually plausible attribution maps.
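To make the pipeline concrete, below is a minimal PyTorch-style sketch of the idea described above, not the authors' implementation: the module name `LearnablePropagation`, the 1x1-convolution gate, and the placeholder guidance loss in `training_step` are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LearnablePropagation(nn.Module):
    """Hypothetical plugin: replaces a ReLU's fixed backward rule with a
    learned per-pixel gate, while leaving the forward activation unchanged."""

    def __init__(self, channels):
        super().__init__()
        # A 1x1 conv predicts a per-pixel propagation rule from the input.
        self.gate = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        y = F.relu(x)
        g = torch.sigmoid(self.gate(x))
        # Value equals relu(x); the gradient instead flows through g * x,
        # i.e. an input-adaptive rule rather than the hard 0/1 ReLU rule.
        return y.detach() + g * x - (g * x).detach()


def training_step(model, gated_model, image, target):
    """One step of the mask-then-refeed scheme from the abstract.
    `gated_model` is a copy of `model` with LearnablePropagation modules
    in place of its ReLUs; the guidance loss is a placeholder."""
    image = image.clone().requires_grad_(True)
    score = gated_model(image).gather(1, target[:, None]).sum()

    # create_graph=True retains the graph so the gate parameters receive
    # gradients *through* this gradient (higher-order differentiation).
    grad, = torch.autograd.grad(score, image, create_graph=True)
    mask = grad.abs().sum(dim=1, keepdim=True)
    mask = mask / (mask.amax(dim=(2, 3), keepdim=True) + 1e-8)

    # Re-feed the masked image into the original model; its output,
    # compared against the target class, guides the gates.
    loss = F.cross_entropy(model(image * mask), target)
    return mask, loss
```

The `detach` trick keeps the forward computation identical to the unmodified network, so only the backward pass (and hence the mask) is altered, and `create_graph=True` is what requires the higher-order differentiation support mentioned in the abstract.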
