Why Attention is Not Explanation: Surgical Intervention and Causal Reasoning about Neural Models

As the demand for explainable deep learning grows in the evaluation of language technologies, so does the value of a principled grounding for those explanations. Here we study the state of the art in explanation of neural models for NLP tasks from the viewpoint of philosophy of science. We focus on recent evaluation work that finds brittleness in explanations obtained through attention mechanisms, and we draw on philosophical accounts of explanation to suggest broader conclusions from these studies. From this analysis, we argue that attention layers over text data cannot yield causal explanations. We then introduce NLP researchers to contemporary philosophy-of-science theories that allow robust yet non-causal reasoning in explanation, giving computer scientists a vocabulary for future research.
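
To make the kind of intervention study referred to above concrete, the following is a minimal sketch, not the evaluation papers' actual code, of a permutation-style test on a toy attention-based classifier: predictions computed with the model's learned attention weights are compared against predictions computed after those weights are randomly shuffled. If the output barely changes, the attention distribution is a brittle basis for explanation. All class, function, and variable names here are illustrative assumptions.

    # Toy attention-intervention sketch (hypothetical names; PyTorch assumed).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ToyAttentionClassifier(nn.Module):
        def __init__(self, vocab_size=100, embed_dim=32, num_classes=2):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.attn_scorer = nn.Linear(embed_dim, 1)      # per-token attention scores
            self.classifier = nn.Linear(embed_dim, num_classes)

        def forward(self, token_ids, attn_override=None):
            hidden = self.embed(token_ids)                   # (batch, seq, dim)
            scores = self.attn_scorer(hidden).squeeze(-1)    # (batch, seq)
            # Use the learned attention unless an intervention supplies its own weights.
            attn = F.softmax(scores, dim=-1) if attn_override is None else attn_override
            context = torch.bmm(attn.unsqueeze(1), hidden).squeeze(1)  # attention-weighted sum
            return F.softmax(self.classifier(context), dim=-1), attn

    torch.manual_seed(0)
    model = ToyAttentionClassifier()
    tokens = torch.randint(0, 100, (1, 10))                  # one toy "sentence"

    probs, attn = model(tokens)
    permuted_attn = attn[:, torch.randperm(attn.size(1))]    # surgically scramble the attention mass
    probs_permuted, _ = model(tokens, attn_override=permuted_attn)

    # A small total-variation distance despite a scrambled attention distribution
    # is the kind of brittleness the evaluation studies report.
    print("TVD between predictions:", 0.5 * (probs - probs_permuted).abs().sum().item())

The design choice of exposing an attn_override argument is what makes the intervention "surgical": the attention weights are replaced in isolation while every other computation in the forward pass is held fixed, which is the counterfactual manipulation the causal-reasoning argument turns on.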
