Explaining Explanations in AI

Recent work on interpretability in machine learning and AI has focused on building simplified models that approximate the true criteria used to make decisions. These models are a useful pedagogical device for teaching trained professionals how to predict what decisions will be made by the complex system, and, most importantly, how the system might break. However, when considering any such model, it is important to remember Box's maxim that "all models are wrong but some are useful." We focus on the distinction between these models and explanations as understood in philosophy and sociology. These models can be understood as a "do-it-yourself kit" for explanations, allowing a practitioner to directly answer "what if" questions or generate contrastive explanations without external assistance. Although this is a valuable capability, giving these models as explanations appears more demanding than necessary, and other forms of explanation may not involve the same trade-offs. We contrast the different schools of thought on what makes an explanation, and suggest that machine learning might benefit from viewing the problem more broadly.
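To make the "do-it-yourself kit" idea concrete, the sketch below fits a simple local surrogate to a black-box classifier and then queries the surrogate to answer a "what if" question, broadly in the spirit of local approximation methods such as LIME [8]. It is a minimal illustration, not the method of any cited paper: the random-forest "black box", the Gaussian sampling, the kernel width, and the helper name local_surrogate are all our own assumptions.

```python
# Minimal sketch of a local surrogate used as a "do-it-yourself kit" for
# explanations. All modelling choices here are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Stand-in "black box": a random forest trained on synthetic data.
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

def local_surrogate(x, n_samples=1000, width=0.5):
    """Fit a proximity-weighted linear model to the black box around x."""
    Z = x + rng.normal(scale=width, size=(n_samples, x.size))  # perturb x
    p = black_box.predict_proba(Z)[:, 1]                       # query the box
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / width ** 2)     # proximity weights
    return Ridge(alpha=1.0).fit(Z, p, sample_weight=w)

x0 = np.array([0.2, -0.1, 1.5])
kit = local_surrogate(x0)

# Once fitted, the surrogate answers "what if" questions on its own,
# without going back to the original system.
x_alt = x0.copy()
x_alt[0] -= 1.0  # what if the first feature had been one unit lower?
delta = kit.predict(x_alt.reshape(1, -1))[0] - kit.predict(x0.reshape(1, -1))[0]
print("surrogate coefficients:", kit.coef_)
print("predicted change in score:", delta)
```

Comparing the two surrogate scores also yields a simple contrastive explanation, which is exactly the self-service use described above.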

[1] Paulo J. G. Lisboa, et al. Interpretability in Machine Learning - Principles and Practice, 2013, WILF.

[2] Avanti Shrikumar, et al. Learning Important Features Through Propagating Activation Differences, 2017, ICML.

[3] Adrian Weller, et al. Transparency: Motivations and Challenges, 2019, Explainable AI.

[4] Brian Hobbs, et al. Interpretable Clustering via Discriminative Rectangle Mixture Model, 2016, 2016 IEEE 16th International Conference on Data Mining (ICDM).

[5] Alessandro Mantelero, et al. Personal data for decisional purposes in the age of analytics: From an individual to a collective dimension of data protection, 2016, Comput. Law Secur. Rev.

[6] Sofia Ranchordás, The Black Box Society: The Secret Algorithms That Control Money and Information, 2016.

[7] Cynthia Rudin, et al. The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification, 2014, NIPS.

[8] Carlos Guestrin, et al. "Why Should I Trust You?": Explaining the Predictions of Any Classifier, 2016, ArXiv.

[9] S. C. Olhede, et al. The growing ubiquity of algorithms in society: implications, impacts and innovations, 2018, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.

[10] H. Nissenbaum, Accountability in a computerized society, 1997.

[11] Foster J. Provost, et al. Explaining Data-Driven Document Classifications, 2013, MIS Q.

[12] Jure Leskovec, et al. Interpretable & Explorable Approximations of Black Box Models, 2017, ArXiv.

[13] John David N. Dionisio, et al. Case-based explanation of non-case-based learning methods, 1999, AMIA.

[14] Jonathon Shlens, et al. Explaining and Harnessing Adversarial Examples, 2014, ICLR.

[15] Ed McCarthy, et al. A Unified Approach, 2005.

[16] Yair Zick, et al. Algorithmic Transparency via Quantitative Input Influence: Theory and Experiments with Learning Systems, 2016, 2016 IEEE Symposium on Security and Privacy (SP).

[17] Anind K. Dey, et al. Assessing demand for intelligibility in context-aware applications, 2009, UbiComp.

[18] Bart Baesens, et al. Comprehensible Credit Scoring Models Using Rule Extraction from Support Vector Machines, 2007, Eur. J. Oper. Res.

[19] B. Rehder, A causal-model theory of conceptual representation and categorization, 2003, Journal of Experimental Psychology: Learning, Memory, and Cognition.

[20] D. Hilton, et al. Knowledge-Based Causal Attribution: The Abnormal Conditions Focus Model, 1986.

[21] Denis J. Hilton, et al. Contemporary science and natural explanation: commonsense conceptions of causality, 1988.

[22] James Woodward, et al. Explanation, Invariance, and Intervention, 1997, Philosophy of Science.

[23] Bob Rehder, et al. When similarity and causality compete in category-based property generalization, 2006, Memory & Cognition.

[24] Boris Kment, et al. Counterfactuals and Explanation, 2006.

[25] Suresh Venkatasubramanian, et al. Auditing Black-box Models by Obscuring Features, 2016, ArXiv.

[26] John McClure, et al. Implicit and Explicit Processes in Social Judgments and Decisions: The Role of Goal-Based Explanations, 2003.

[27] Motoaki Kawanabe, et al. How to Explain Individual Classification Decisions, 2009, J. Mach. Learn. Res.

[28] Tim Miller, et al. Explanation in Artificial Intelligence: Insights from the Social Sciences, 2017, Artif. Intell.

[29] P. Kim, Data-Driven Discrimination at Work, 2017.

[30] Ivan Leudar, et al. Explaining in conversation: towards an argument model, 1992.

[31] Alex Pentland, et al. Fair, Transparent, and Accountable Algorithmic Decision-making Processes, 2017, Philosophy & Technology.

[32] Scott Lundberg, et al. A Unified Approach to Interpreting Model Predictions, 2017, NIPS.

[33] Karrie Karahalios, et al. Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms, 2014.

[34] D. Hilton, Knowledge-Based Causal Attribution: The Abnormal Conditions Focus Model, 2004.

[35] Cynthia Rudin, et al. Falling Rule Lists, 2014, AISTATS.

[36] Anna Shcherbina, et al. Not Just a Black Box: Learning Important Features Through Propagating Activation Differences, 2016, ArXiv.

[37] Ramprasaath R. Selvaraju, et al. Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization, 2016.

[38] Mireille Hildebrandt, et al. The Challenges of Ambient Law and Legal Protection in the Profiling Era, 2010.

[39] Chris Russell, et al. Counterfactual Explanations Without Opening the Black Box: Automated Decisions and the GDPR, 2017, ArXiv.

[40] Douglas Walton, et al. Dialogical Models of Explanation, 2007, ExaCt.

[41] Giuseppe Del Re, et al. Models and analogies in science, 2013.

[42] R. Binns, et al. Algorithmic Accountability and Public Reason, 2017, Philosophy & Technology.

[43] Simant Dube, et al. High Dimensional Spaces, Deep Learning and Adversarial Examples, 2018, ArXiv.

[44] Wojciech Samek, et al. Methods for interpreting and understanding deep neural networks, 2017, Digit. Signal Process.

[45] Frank A. Pasquale, The Black Box Society: The Secret Algorithms That Control Money and Information, 2015.

[46] Roger Lamb, et al. Attribution in conversational context: Effect of mutual knowledge on explanation-giving, 1993.

[47] Andrea Vedaldi, et al. Interpretable Explanations of Black Boxes by Meaningful Perturbation, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[48] D. Stapel, Social judgments: Implicit and explicit processes, 2003.

[49] T. Lombrozo, Explanation and categorization: How "why?" informs "what?", 2009, Cognition.

[50] Roman Frigg, et al. Scientific Representation and the Semantic View of Theories, 2006, THEORIA.

[51] Izak Benbasat, et al. Explanations From Intelligent Systems: Theoretical Foundations and Implications for Practice, 1999, MIS Q.

[52] Andrew D. Selbst, et al. Big Data's Disparate Impact, 2016.

[53] D. Hilton, Mental Models and Causal Explanation: Judgements of Probable Cause and Explanatory Relevance, 1996.

[54] Frank A. Pasquale, et al. The Scored Society: Due Process for Automated Predictions, 2014, 89 Wash. L. Rev. 1.

[55] Zachary Chase Lipton, The mythos of model interpretability, 2016, ACM Queue.

[56] Gerrit van Bruggen, et al. How Incorporating Feedback Mechanisms in a DSS Affects DSS Evaluations, 2009, Inf. Syst. Res.

[57] Jude W. Shavlik, et al. In Advances in Neural Information Processing Systems, 1996.

[58] Abhishek Das, et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization, 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[59] John Fox, et al. Argumentation-Based Inference and Decision Making--A Medical Perspective, 2007, IEEE Intelligent Systems.

[60] Cynthia Rudin, et al. Interpretable classification models for recidivism prediction, 2015, arXiv:1503.07810.

[61] Duane Szafron, et al. Visual Explanation of Evidence with Additive Classifiers, 2006, AAAI.

[62] G. Box, Robustness in the Strategy of Scientific Model Building, 1979.

[63] Balázs Bodó, et al. Tackling the Algorithmic Control Crisis – the Technical, Legal, and Ethical Challenges of Research into Algorithmic Agents, 2018.

[64] Michael R. Waldmann, et al. Do Social Norms Influence Causal Inferences?, 2014, CogSci.

[65] Brent Mittelstadt, et al. Automation, Algorithms, and Politics | Auditing for Transparency in Content Personalization Systems, 2016.

[66] Enrico Bertini, et al. Interpreting Black-Box Classifiers Using Instance-Level Visual Explanations, 2017, HILDA@SIGMOD.

[67] Petri Ylikoski, et al. Causal and Constitutive Explanation Compared, 2013, Erkenntnis.

[68] K. McKeown, et al. Justification Narratives for Individual Classifications, 2014.

[69] David Weinberger, et al. Accountability of AI Under the Law: The Role of Explanation, 2017, ArXiv.

[70] Jenna Burrell, et al. How the machine 'thinks': Understanding opacity in machine learning algorithms, 2016.

[71] Douglas Walton, et al. A new dialectical theory of explanation, 2004.

[72] Tim Miller, et al. Explainable AI: Beware of Inmates Running the Asylum Or: How I Learnt to Stop Worrying and Love the Social and Behavioural Sciences, 2017, ArXiv.

[73] Solon Barocas, et al. The Intuitive Appeal of Explainable Machines, 2018.

[74] D. Hilton, Conversational processes and causal explanation, 1990.

[75] Osbert Bastani, et al. Interpretability via Model Extraction, 2017, ArXiv.

[76] Weng-Keen Wong, et al. Principles of Explanatory Debugging to Personalize Interactive Machine Learning, 2015, IUI.

[77] Klaus-Robert Müller, et al. Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models, 2017, ArXiv.

[78] Sameer Singh, et al. Towards Extracting Faithful and Descriptive Representations of Latent Variable Models, 2015, AAAI Spring Symposia.

[79] Michael Veale, et al. Clarity, surprises, and further questions in the Article 29 Working Party draft guidance on automated decision-making and profiling, 2018, Comput. Law Secur. Rev.

[80] Sören Preibusch, et al. Toward Accountable Discrimination-Aware Data Mining: The Importance of Keeping the Human in the Loop - and Under the Looking Glass, 2017, Big Data.

[81] Jure Leskovec, et al. Hidden factors and hidden topics: understanding rating dimensions with review text, 2013, RecSys.

[82] Andrew Zisserman, et al. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, 2013, ICLR.