Human-Centered Evaluation of Explanations

The NLP community is increasingly interested in providing explanations for NLP models to help people make sense of model behavior and potentially improve how people interact with these models. In addition to the computational challenges of generating these explanations, evaluating the generated explanations requires human-centered perspectives and approaches. This tutorial will provide an overview of human-centered evaluation of explanations. First, we will give a brief introduction to the psychological foundations of explanation, as well as the types of NLP model explanations and their corresponding presentation formats, to provide the necessary background. We will then present a taxonomy of human-centered evaluation of explanations and examine two categories in depth: 1) evaluation based on human-annotated explanations; 2) evaluation with human-subject studies. We will conclude by discussing future directions. We will also adopt a flipped format to maximize the interactive components for the live audience.
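
As a concrete illustration of the first category, the sketch below computes token-level precision, recall, and F1 between a model's extracted rationale and a human-annotated rationale, in the spirit of ERASER-style plausibility metrics. This is a minimal sketch: the function name, the representation of rationales as sets of token indices, and the example values are illustrative assumptions, not material from the tutorial itself.

# Minimal sketch (Python): comparing a model-extracted rationale against a
# human-annotated rationale via token-level precision/recall/F1.
# Rationales are represented as sets of token indices; all names and example
# data below are hypothetical.

def rationale_f1(model_tokens, human_tokens):
    """Token-level agreement between a model rationale and a human rationale."""
    # Convention: if both rationales are empty, count it as perfect agreement.
    if not model_tokens and not human_tokens:
        return {"precision": 1.0, "recall": 1.0, "f1": 1.0}
    overlap = len(model_tokens & human_tokens)
    precision = overlap / len(model_tokens) if model_tokens else 0.0
    recall = overlap / len(human_tokens) if human_tokens else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Hypothetical example: indices of tokens the model highlighted vs. the tokens
# a human annotator marked as the rationale for the same instance.
model_rationale = {2, 3, 7, 8}
human_rationale = {3, 7, 8, 9}
print(rationale_f1(model_rationale, human_rationale))
# -> {'precision': 0.75, 'recall': 0.75, 'f1': 0.75}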
