Explanations can be manipulated and geometry is to blame
Ann-Kathrin Dombrowski | Maximilian Alber | Christopher J. Anders | Marcel Ackermann | Klaus-Robert Müller | Pan Kessel