Dissecting Generation Modes for Abstractive Summarization Models via Ablation and Attribution

Despite the prominence of neural abstractive summarization models, we know little about how they actually form summaries or where their decisions come from. We propose a two-step method to interpret summarization model decisions. First, we analyze the model's behavior by ablating the full model to categorize each decoder decision into one of several generation modes: roughly, is the model behaving like a language model, is it relying heavily on the input, or is it somewhere in between? After isolating decisions that do depend on the input, we explore interpreting these decisions using several different attribution methods. We compare these techniques based on their ability to select content and to reconstruct the model's predicted token from perturbations of the input, thus revealing whether highlighted attributions are truly important for the generation of the next token. While this machinery can be broadly useful even beyond summarization, we specifically demonstrate its capability to identify phrases the summarization model has memorized and to determine where in the training pipeline this memorization happened, as well as to study complex generation phenomena like sentence fusion on a per-instance basis.
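
To make the two steps concrete, the sketches below illustrate them in Python. They are minimal illustrations, not the paper's exact procedure: the HuggingFace checkpoint facebook/bart-large-xsum, the use of total variation distance, and the 0.1 threshold are all assumptions made for the example. The first sketch labels a single decoder step by comparing the full model's next-token distribution to that of the same model with the source document ablated.

    import torch
    from transformers import BartForConditionalGeneration, BartTokenizer

    tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-xsum")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-xsum").eval()

    def decoder_prefix(summary_prefix):
        """Decoder input ids for a partial summary: start token + prefix (no trailing </s>)."""
        prefix_ids = tokenizer(summary_prefix, return_tensors="pt").input_ids[:, :-1]
        start = torch.tensor([[model.config.decoder_start_token_id]])
        return torch.cat([start, prefix_ids], dim=1)

    def next_token_dist(source_text, summary_prefix):
        """Distribution over the next summary token given a source and a decoder prefix."""
        enc = tokenizer(source_text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(input_ids=enc.input_ids,
                           attention_mask=enc.attention_mask,
                           decoder_input_ids=decoder_prefix(summary_prefix)).logits
        return torch.softmax(logits[0, -1], dim=-1)

    def generation_mode(source_text, summary_prefix, threshold=0.1):
        """Label one decoder step by comparing the full model to a source-ablated run."""
        p_full = next_token_dist(source_text, summary_prefix)
        p_ablated = next_token_dist("", summary_prefix)  # source document ablated away
        tv = 0.5 * (p_full - p_ablated).abs().sum().item()  # total variation distance
        return ("LM-like" if tv < threshold else "input-dependent"), tv

If the ablated and full distributions nearly coincide, the step is driven by the decoder's language-modeling behavior; a large gap marks it as input-dependent and worth attributing. For the attribution step, the second sketch uses plain gradient-times-input saliency as a stand-in for the attribution methods compared in the paper, and tests whether keeping only the top-k attributed source tokens (k = 10 here, arbitrarily) preserves the model's original prediction. It reuses model, tokenizer, and the helpers above.

    def saliency_scores(source_text, summary_prefix):
        """Gradient-times-input importance of each source token for the next prediction."""
        enc = tokenizer(source_text, return_tensors="pt", truncation=True)
        embeds = model.get_input_embeddings()(enc.input_ids).detach().requires_grad_(True)
        logits = model(inputs_embeds=embeds,
                       attention_mask=enc.attention_mask,
                       decoder_input_ids=decoder_prefix(summary_prefix)).logits
        pred = logits[0, -1].argmax()
        logits[0, -1, pred].backward()
        scores = (embeds.grad * embeds).sum(-1)[0].abs()  # one magnitude per source token
        return enc.input_ids[0], scores, pred.item()

    def prediction_preserved(source_text, summary_prefix, k=10):
        """Mask all but the top-k attributed source tokens; check the prediction survives."""
        ids, scores, pred = saliency_scores(source_text, summary_prefix)
        keep = scores.topk(min(k, ids.numel())).indices
        special = (ids == tokenizer.bos_token_id) | (ids == tokenizer.eos_token_id)
        perturbed = torch.where(special, ids, torch.full_like(ids, tokenizer.mask_token_id))
        perturbed[keep] = ids[keep]
        with torch.no_grad():
            logits = model(input_ids=perturbed.unsqueeze(0),
                           decoder_input_ids=decoder_prefix(summary_prefix)).logits
        return logits[0, -1].argmax().item() == pred

An attribution method that truly identifies the tokens behind a decision should preserve the prediction at small k far more often than chance, which is the basis of the comparison described above.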
