A Divide-and-Conquer Approach to the Summarization of Long Documents

We present a novel divide-and-conquer method for the neural summarization of long documents. Our method exploits the discourse structure of the document and uses sentence similarity to split the problem into an ensemble of smaller summarization problems. In particular, we break a long document and its summary into multiple source-target pairs, which are used to train a model that learns to summarize each part of the document separately. These partial summaries are then combined to produce a final complete summary. With this approach, we decompose the problem of long document summarization into smaller and simpler subproblems, reducing computational complexity and creating more training examples, which at the same time contain less noise in the target summaries than those of the standard approach. We demonstrate that this approach, paired with different summarization models, including sequence-to-sequence RNNs and Transformers, can lead to improved summarization performance. Our best models achieve results that are on par with the state-of-the-art on two publicly available datasets of academic articles.
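
The splitting step described above can be sketched in a few lines of Python. This is a minimal, illustrative example that substitutes a plain unigram-overlap score for the sentence-similarity measure implied by the abstract (e.g. a ROUGE-style score); the function names, data layout, and toy data are assumptions made for the example, not the authors' implementation.

```python
from collections import defaultdict

def token_overlap(a: str, b: str) -> float:
    """Unigram Jaccard overlap, a simple stand-in for a ROUGE-style similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta and tb else 0.0

def split_into_pairs(sections, summary_sentences):
    """Assign each summary sentence to its most similar document section and
    return one (section_text, partial_summary) source-target pair per section
    that received at least one summary sentence."""
    partial = defaultdict(list)
    for sent in summary_sentences:
        best = max(range(len(sections)),
                   key=lambda i: token_overlap(" ".join(sections[i]), sent))
        partial[best].append(sent)
    return [(" ".join(sections[i]), " ".join(partial[i])) for i in sorted(partial)]

# Hypothetical usage: each section is a list of sentences, the summary a list of sentences.
sections = [["The model encodes each section separately."],
            ["Training pairs are produced by aligning summary sentences to sections."]]
summary = ["Each section is encoded on its own.",
           "Summary sentences are aligned to sections to form training pairs."]
pairs = split_into_pairs(sections, summary)
```

At inference time, the partial summaries generated for the individual sections would then be concatenated in document order to form the complete summary, as the abstract describes.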
