On the (In)fidelity and Sensitivity for Explanations.
Chih-Kuan Yeh | Cheng-Yu Hsieh | Arun Sai Suggala | David I. Inouye | Pradeep Ravikumar
[1] Motoaki Kawanabe et al. How to Explain Individual Classification Decisions, 2009, J. Mach. Learn. Res.
[2] Erik Strumbelj et al. Explaining prediction models and individual predictions with feature contributions, 2014, Knowledge and Information Systems.
[3] Rob Fergus et al. Visualizing and Understanding Convolutional Networks, 2013, ECCV.
[4] Andrew Zisserman et al. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, 2013, ICLR.
[5] Thomas Brox et al. Striving for Simplicity: The All Convolutional Net, 2014, ICLR.
[6] Alexander Binder et al. On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation, 2015, PLoS ONE.
[7] Weng-Keen Wong et al. Principles of Explanatory Debugging to Personalize Interactive Machine Learning, 2015, IUI.
[8] Anna Shcherbina et al. Not Just a Black Box: Learning Important Features Through Propagating Activation Differences, 2016, ArXiv.
[9] Carlos Guestrin et al. "Why Should I Trust You?": Explaining the Predictions of Any Classifier, 2016, ArXiv.
[10] Yair Zick et al. Algorithmic Transparency via Quantitative Input Influence: Theory and Experiments with Learning Systems, 2016, IEEE Symposium on Security and Privacy (SP).
[11] Percy Liang et al. Understanding Black-box Predictions via Influence Functions, 2017, ICML.
[12] Avanti Shrikumar et al. Learning Important Features Through Propagating Activation Differences, 2017, ICML.
[13] Ramprasaath R. Selvaraju et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization, 2017, IEEE International Conference on Computer Vision (ICCV).
[14] Max Welling et al. Visualizing Deep Neural Network Decisions: Prediction Difference Analysis, 2017, ICLR.
[15] Scott Lundberg et al. A Unified Approach to Interpreting Model Predictions, 2017, NIPS.
[16] Alexander Binder et al. Evaluating the Visualization of What a Deep Neural Network Has Learned, 2015, IEEE Transactions on Neural Networks and Learning Systems.
[17] Bolei Zhou et al. Network Dissection: Quantifying Interpretability of Deep Visual Representations, 2017, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Yarin Gal et al. Real Time Image Saliency for Black Box Classifiers, 2017, NIPS.
[19] Finale Doshi-Velez et al. A Roadmap for a Rigorous Science of Interpretability, 2017, ArXiv.
[20] Klaus-Robert Müller et al. PatternNet and PatternLRP - Improving the interpretability of neural networks, 2017, ArXiv.
[21] John C. Duchi et al. Certifiable Distributional Robustness with Principled Adversarial Training, 2017, ArXiv.
[22] Markus H. Gross et al. A unified view of gradient-based attribution methods for Deep Neural Networks, 2017, NIPS 2017.
[23] Ankur Taly et al. Axiomatic Attribution for Deep Networks, 2017, ICML.
[24] Martin Wattenberg et al. SmoothGrad: removing noise by adding noise, 2017, ArXiv.
[25] Cho-Jui Hsieh et al. Towards Robust Neural Networks via Random Self-ensemble, 2017, ECCV.
[26] Denali Molitor et al. Model Agnostic Supervised Local Explanations, 2018, NeurIPS.
[27] J. Zico Kolter et al. Provable defenses against adversarial examples via the convex outer adversarial polytope, 2017, ICML.
[28] Dumitru Erhan et al. Evaluating Feature Importance Estimates, 2018, ArXiv.
[29] Martin Wattenberg et al. Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV), 2017, ICML.
[30] Aleksander Madry et al. Towards Deep Learning Models Resistant to Adversarial Attacks, 2017, ICLR.
[31] Le Song et al. Learning to Explain: An Information-Theoretic Perspective on Model Interpretation, 2018, ICML.
[32] John C. Duchi et al. Certifying Some Distributional Robustness with Principled Adversarial Training, 2017, ICLR.
[33] Been Kim et al. Sanity Checks for Saliency Maps, 2018, NeurIPS.
[34] Aditi Raghunathan et al. Certified Defenses against Adversarial Examples, 2018, ICLR.
[35] Wojciech Samek et al. Methods for interpreting and understanding deep neural networks, 2017, Digit. Signal Process.
[36] Pradeep Ravikumar et al. Representer Point Selection for Explaining Deep Neural Networks, 2018, NeurIPS.
[37] Andrew Slavin Ross et al. Improving the Adversarial Robustness and Interpretability of Deep Neural Networks by Regularizing their Input Gradients, 2017, AAAI.
[38] Tommi S. Jaakkola et al. On the Robustness of Interpretability Methods, 2018, ArXiv.
[39] Kate Saenko et al. RISE: Randomized Input Sampling for Explanation of Black-box Models, 2018, BMVC.
[40] Jure Leskovec et al. GNNExplainer: Generating Explanations for Graph Neural Networks, 2019, NeurIPS.
[41] Jure Leskovec et al. GNN Explainer: A Tool for Post-hoc Explanation of Graph Neural Networks, 2019, ArXiv.
[42] Been Kim et al. Automating Interpretability: Discovering and Testing Visual Concepts Learned by Neural Networks, 2019, ArXiv.
[43] James Zou et al. Towards Automatic Concept-based Explanations, 2019, NeurIPS.
[44] Dumitru Erhan et al. A Benchmark for Interpretability Methods in Deep Neural Networks, 2018, NeurIPS.
[45] Dumitru Erhan et al. The (Un)reliability of saliency methods, 2017, Explainable AI.
[46] Ziyan Wu et al. Counterfactual Visual Explanations, 2019, ICML.
[47] Tommi S. Jaakkola et al. Towards Robust, Locally Linear Deep Networks, 2019, ICLR.
[48] Oluwasanmi Koyejo et al. Interpreting Black Box Predictions using Fisher Kernels, 2018, AISTATS.
[49] Tim Miller et al. Explanation in Artificial Intelligence: Insights from the Social Sciences, 2017, Artif. Intell.
[50] Abubakar Abid et al. Interpretation of Neural Networks is Fragile, 2017, AAAI.
[51] David Duvenaud et al. Explaining Image Classifiers by Counterfactual Generation, 2018, ICLR.
[52] J. Zico Kolter et al. Certified Adversarial Robustness via Randomized Smoothing, 2019, ICML.
[53] Ting Wang et al. Interpretable Deep Learning under Fire, 2018, USENIX Security Symposium.