Attention cannot be an Explanation

Attention-based explanations (viz., saliency maps), by providing interpretability to black-box models such as deep neural networks, are assumed to improve human trust and reliance in the underlying models. Recently, it has been shown that attention weights are frequently uncorrelated with gradient-based measures of feature importance. Motivated by this, we ask a follow-up question: “Assuming that we consider only the tasks where attention weights correlate well with feature importance, how effective are these attention-based explanations at increasing human trust and reliance in the underlying models?” In other words, can we use attention as an explanation? We conduct extensive human-study experiments to qualitatively and quantitatively assess the degree to which attention-based explanations are suitable for increasing human trust and reliance. Our experimental results show that attention cannot be used as an explanation.
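To make the premise concrete, the correlation between attention weights and gradient-based feature importance can be checked per instance, e.g., with Kendall's tau as in prior work on attention-as-explanation. The sketch below is illustrative only and assumes a hypothetical PyTorch classifier `model` whose forward pass returns class logits together with per-token attention weights; it is not the experimental setup of this paper.

```python
# Minimal sketch, assuming a PyTorch model that returns (logits, attention),
# where `attention` holds one weight per input token.
import torch
from scipy.stats import kendalltau

def attention_gradient_correlation(model, token_embeddings, target_class):
    """Kendall's tau between attention weights and a gradient-based importance score."""
    token_embeddings = token_embeddings.clone().requires_grad_(True)  # shape [T, D]
    logits, attention = model(token_embeddings)                       # shapes [C], [T]
    logits[target_class].backward()
    # Gradient-based importance: L2 norm of d(logit)/d(embedding) for each token.
    grad_importance = token_embeddings.grad.norm(dim=-1)              # shape [T]
    tau, _ = kendalltau(attention.detach().numpy(),
                        grad_importance.detach().numpy())
    return tau
```

A low or unstable tau across instances is the observation that motivates the question above; our study restricts attention to tasks where this correlation is high and then measures whether the explanations actually help human users.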
