Summary Level Training of Sentence Rewriting for Abstractive Summarization

As an attempt to combine extractive and abstractive summarization, Sentence Rewriting models adopt the strategy of first extracting salient sentences from a document and then paraphrasing the selected sentences to generate a summary. However, existing models in this framework mostly rely on sentence-level rewards or suboptimal labels, causing a mismatch between the training objective and the evaluation metric. In this paper, we present a novel training signal that directly maximizes summary-level ROUGE scores through reinforcement learning. In addition, we incorporate BERT into our model, making good use of its natural language understanding capability. In extensive experiments, we show that the combination of our proposed model and training procedure achieves new state-of-the-art performance on both the CNN/Daily Mail and New York Times datasets. We also demonstrate that it generalizes better to the DUC-2002 test set.
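The abstract describes rewarding the model with a summary-level ROUGE score through reinforcement learning rather than per-sentence rewards. The snippet below is a minimal, hypothetical sketch of that idea under a REINFORCE-style policy-gradient formulation; it is not the authors' implementation. The simplified ROUGE-L function, the `rewriter` component, and all variable names are illustrative assumptions.

```python
# Hypothetical sketch: a summary-level reward for extract-then-rewrite training.
# The whole rewritten summary is scored against the reference, and that single
# score rewards the extractor's sentence choices (REINFORCE-style).
import torch

def rouge_l_f(candidate_tokens, reference_tokens):
    # Simplified ROUGE-L F1 via longest common subsequence
    # (a stand-in for a full ROUGE implementation).
    m, n = len(candidate_tokens), len(reference_tokens)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if candidate_tokens[i] == reference_tokens[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    lcs = dp[m][n]
    if lcs == 0:
        return 0.0
    prec, rec = lcs / m, lcs / n
    return 2 * prec * rec / (prec + rec)

def summary_level_reinforce_loss(log_probs, extracted_idx, rewriter, doc_sents, reference):
    """log_probs: log-probabilities the extractor assigned to its chosen sentences
    (a list of scalar tensors, one per extraction step); rewriter: any function that
    maps a source sentence to its abstractive rewrite. Both are assumed components."""
    summary = " ".join(rewriter(doc_sents[i]) for i in extracted_idx)
    reward = rouge_l_f(summary.split(), reference.split())  # summary-level, not per-sentence
    # REINFORCE: scale the negative log-likelihood of the chosen actions by the reward
    # (a learned baseline or critic would normally be subtracted to reduce variance).
    return -reward * torch.stack(log_probs).sum()
```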
