Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing

Explainable Natural Language Processing (EXNLP) has increasingly focused on collecting human-annotated textual explanations. These explanations are used downstream in three ways: as data augmentation to improve performance on a predictive task, as supervision to train models to produce explanations for their predictions, and as a ground truth to evaluate model-generated explanations. In this review, we identify 61 datasets with three predominant classes of textual explanations (highlights, free-text, and structured), organize the literature on annotating each type, identify strengths and shortcomings of existing collection methodologies, and give recommendations for collecting EXNLP datasets in the future.
