When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data

Many methods now exist for conditioning model outputs on task instructions, retrieved documents, and user-provided explanations and feedback. Rather than relying solely on examples of task inputs and outputs, these approaches use valuable additional data for improving model correctness and aligning learned models with human priors. Meanwhile, a growing body of evidence suggests that some language models can (1) store a large amount of knowledge in their parameters, and (2) perform inference over tasks in textual inputs at test time. These results raise the possibility that, for some tasks, humans cannot explain to a model any more about the task than it already knows or could infer on its own. In this paper, we study the circumstances under which explanations of individual data points can (or cannot) improve modeling performance. In order to carefully control important properties of the data and explanations, we introduce a synthetic dataset for experiments, and we also make use of three existing datasets with explanations: e-SNLI, TACRED, and SemEval. We first give a formal framework for the available modeling approaches, in which explanation data can be used as model inputs, as targets, or as a prior. After arguing that the most promising role for explanation data is as model inputs, we propose to use a retrieval-based method and show that it solves our synthetic task with accuracies upwards of 95%, while baselines without explanation data achieve below 65% accuracy. We then identify properties of datasets for which retrieval-based modeling fails. With the three existing datasets, we find no improvements from explanation retrieval. Drawing on findings from our synthetic task, we suggest that at least one of six preconditions for successful modeling fails to hold with these datasets. Our code and data will be made publicly available at: https://github.com/peterbhase/ExplanationRoles
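
To make the retrieval-based use of explanations concrete, the following is a minimal sketch, not the paper's exact pipeline: explanations attached to training examples are indexed with a Sentence-BERT encoder, the nearest ones are retrieved for a new input by embedding similarity, and they are concatenated to that input so the downstream classifier conditions on explanation data at prediction time. The model name, separator token, toy examples, and helper names below are assumptions made for illustration.

```python
# Sketch of explanation retrieval: retrieve explanations from the training
# pool by dense-embedding similarity and append them to the model input.
import numpy as np
from sentence_transformers import SentenceTransformer  # Sentence-BERT encoder

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model choice

# Training pool: task inputs paired with human-written explanations (toy data).
train_inputs = [
    "A man is playing a guitar on stage.",
    "Two dogs are running through a field.",
]
train_explanations = [
    "Playing a guitar on stage implies performing music.",
    "Running through a field implies the dogs are outside.",
]

def normalize(x):
    # L2-normalize rows so dot products act as cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Embed the training inputs once so retrieval is a single dot product later.
train_emb = normalize(encoder.encode(train_inputs, convert_to_numpy=True))

def retrieve_explanations(query, k=2):
    """Return the explanations attached to the k nearest training inputs."""
    q = normalize(encoder.encode([query], convert_to_numpy=True))
    scores = train_emb @ q[0]
    top = np.argsort(-scores)[:k]
    return [train_explanations[i] for i in top]

def build_model_input(query, k=2):
    """Concatenate retrieved explanations onto the query, so a downstream
    classifier receives explanation data as part of its input."""
    return " [SEP] ".join([query] + retrieve_explanations(query, k))

print(build_model_input("A woman plays violin in a concert hall."))
```

With a larger explanation pool, the exact dot-product search above would typically be replaced by an approximate-nearest-neighbor index such as FAISS.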
