On Extractive and Abstractive Neural Document Summarization with Transformer Language Models

We present a method to produce abstractive summaries of long documents, exceeding several thousand words, via neural abstractive summarization. Before generating a summary, we perform a simple extractive step whose output is used to condition the transformer language model on relevant information before it is tasked with generating the summary. We show that this extractive step significantly improves summarization results. We also show that this approach produces more abstractive summaries than prior work that employs a copy mechanism, while still achieving higher ROUGE scores. Note: the abstract above was not written by the authors; it was generated by one of the models presented in this paper.
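Below is a minimal sketch of the extract-then-abstract pipeline the abstract describes, under stated assumptions: a TF-IDF sentence scorer stands in for the extractive step and a public GPT-2 checkpoint (via the Hugging Face transformers library) stands in for the conditional transformer language model. The authors' actual extractor, model, and prompt format are not reproduced here; the "TL;DR:" prompt and all function names are illustrative only.

```python
# Sketch only: TF-IDF extraction + GPT-2 conditioning are assumptions,
# not the paper's exact components.
from sklearn.feature_extraction.text import TfidfVectorizer
from transformers import GPT2LMHeadModel, GPT2Tokenizer


def extract_salient_sentences(document: str, k: int = 5) -> list[str]:
    """Score sentences by mean TF-IDF weight and keep the top k, in document order."""
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    tfidf = TfidfVectorizer().fit_transform(sentences)
    scores = tfidf.mean(axis=1).A.ravel()
    top = sorted(sorted(range(len(sentences)), key=lambda i: -scores[i])[:k])
    return [sentences[i] for i in top]


def summarize(document: str) -> str:
    """Condition the language model on the extracted sentences, then generate."""
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # The extracted sentences serve as the conditioning context; "TL;DR:" is a
    # common zero-shot summarization prompt, not necessarily the paper's.
    prompt = " ".join(extract_salient_sentences(document)) + "\nTL;DR:"
    input_ids = tokenizer.encode(
        prompt, return_tensors="pt", truncation=True, max_length=900
    )

    output_ids = model.generate(
        input_ids,
        max_length=input_ids.shape[1] + 120,  # leave room for the summary tokens
        do_sample=True,                        # sampling tends to yield more abstractive text
        top_k=50,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Return only the newly generated tokens, i.e. the summary continuation.
    return tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)
```

In this sketch the extractive step simply shortens and focuses the conditioning context so that the language model attends to the most salient sentences rather than the full document, which is the role the abstract attributes to the extractive step.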
