Inspecting the Factuality of Hallucinated Entities in Abstractive Summarization

State-of-the-art abstractive summarization systems often generate hallucinations, i.e., content that is not directly inferable from the source text. Although such content is typically assumed to be incorrect, many hallucinations are in fact consistent with world knowledge (factual hallucinations), and including them in a summary can be beneficial by providing additional background information. In this work, we propose a novel detection approach that separates factual from non-factual hallucinations of entities. Our method is based on an entity’s prior and posterior probabilities according to pre-trained and fine-tuned masked language models, respectively. Empirical results suggest that our method vastly outperforms three strong baselines in both accuracy and F1 score and correlates strongly with human judgements on factuality classification tasks. Furthermore, our approach can provide insight into whether a particular hallucination is caused by the summarizer’s pre-training or fine-tuning step.
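As a rough illustration of the prior/posterior idea described above, the sketch below scores a masked entity in a summary with and without the source document as context, using a generic masked language model from HuggingFace Transformers. This is a minimal sketch, not the authors' implementation: the model choice, the entity_probability helper, the single-token simplification, and the example texts are illustrative assumptions, and the paper's actual setup contrasts a pre-trained with a fine-tuned (source-conditioned) masked LM.

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Assumption: any off-the-shelf masked LM suffices for this sketch.
MODEL_NAME = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME).eval()


def entity_probability(context: str, summary_with_mask: str, entity: str) -> float:
    """Probability the masked LM assigns to a (single-token) entity.

    Passing an empty context approximates the entity's prior probability;
    passing the source document approximates its posterior probability.
    """
    text = (context + " " + summary_with_mask).strip()
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()
    with torch.no_grad():
        logits = model(**inputs).logits
    # Probability of the entity's first sub-token at the (first) mask position.
    entity_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(entity))[0]
    row, col = mask_positions[0].tolist()
    probs = torch.softmax(logits[row, col], dim=-1)
    return probs[entity_id].item()


# Hypothetical example: score the hallucinated entity "Smith" with and
# without the source document as context.
summary = f"The new chief executive, {tokenizer.mask_token}, took office in 2020."
prior = entity_probability("", summary, "Smith")
posterior = entity_probability(
    "Company X announced Smith as its chief executive.", summary, "Smith"
)
print(f"prior={prior:.4f} posterior={posterior:.4f}")

In the paper, these two probabilities serve as features for classifying a hallucinated entity as factual or non-factual; the comparison also hints at whether the entity originates from pre-training knowledge or from the fine-tuning data.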
