The Summary Loop: Learning to Write Abstractive Summaries Without Examples

This work presents a new approach to unsupervised abstractive summarization based on maximizing a combination of coverage and fluency under a given length constraint. It introduces a novel method that encourages the inclusion of key terms from the original document in the summary: key terms are masked out of the original document and must be filled in by a coverage model using the current generated summary. A novel unsupervised training procedure leverages this coverage model together with a fluency model to generate and score summaries. When tested on popular news summarization datasets, the method outperforms previous unsupervised methods by more than 2 R-1 points and approaches the results of competitive supervised methods. Our model attains higher levels of abstraction, with copied passages roughly half the length of those in prior work, and learns to compress and merge sentences without supervision.
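To make the coverage idea concrete, below is a minimal sketch, not the authors' implementation: it picks a handful of "key terms" with a toy heuristic, masks them in the document, shows an off-the-shelf masked language model the candidate summary alongside the masked document, and scores coverage as the fraction of terms recovered. The function names, the keyword heuristic, and the use of `bert-base-uncased` through the Hugging Face `fill-mask` pipeline are illustrative assumptions; the paper trains a dedicated coverage model and selects key terms more carefully.

```python
# A minimal sketch of the coverage signal, NOT the paper's implementation.
# Assumptions: key terms come from a toy heuristic, and an off-the-shelf
# masked language model (bert-base-uncased via the `fill-mask` pipeline)
# stands in for the coverage model.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
MASK = fill_mask.tokenizer.mask_token  # "[MASK]" for BERT


def select_keywords(text, top_k=10):
    """Toy key-term selector: the longest unique words stand in for salient
    terms (the paper scores salience more carefully)."""
    words = {w.strip(".,;:!?").lower() for w in text.split()}
    return set(sorted(words, key=len, reverse=True)[:top_k])


def coverage_score(document, summary, top_k=10, max_doc_words=200):
    """Mask key terms in the document, condition the fill-in model on the
    summary plus the masked document, and return the fraction recovered."""
    doc_words = document.split()[:max_doc_words]  # stay inside BERT's window
    keywords = select_keywords(" ".join(doc_words), top_k)

    masked, answers = [], []
    for w in doc_words:
        key = w.strip(".,;:!?").lower()
        if key in keywords:
            masked.append(MASK)
            answers.append(key)
        else:
            masked.append(w)
    if not answers:
        return 0.0

    context = summary + " " + " ".join(masked)
    preds = fill_mask(context)      # one candidate list per mask token
    if len(answers) == 1:
        preds = [preds]             # single-mask case returns a flat list

    hits = sum(
        1
        for gold, candidates in zip(answers, preds)
        if any(c["token_str"].strip().lower() == gold for c in candidates)
    )
    return hits / len(answers)
```

In the Summary Loop, a coverage score of this kind is combined with a fluency score and a length constraint to reward the summarizer during unsupervised training.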
