Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations
Andrew Slavin Ross | Finale Doshi-Velez | Michael C. Hughes