Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization

We introduce “extreme summarization”, a new single-document summarization task which does not favor extractive strategies and calls for an abstractive modeling approach. The idea is to create a short, one-sentence news summary answering the question “What is the article about?”. We collect a real-world, large-scale dataset for this task by harvesting online articles from the British Broadcasting Corporation (BBC). We propose a novel abstractive model which is conditioned on the article’s topics and based entirely on convolutional neural networks. We demonstrate experimentally that this architecture captures long-range dependencies in a document and recognizes pertinent content, outperforming an oracle extractive system and state-of-the-art abstractive approaches when evaluated automatically and by humans.
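
As a rough illustration of the architecture described above, below is a minimal sketch of a topic-conditioned convolutional encoder. It is not the authors' implementation: it merely assumes that document-level topic distributions (e.g., from LDA) are concatenated with word embeddings before a stack of gated convolutions, and all names, dimensions, and hyperparameters are illustrative.

```python
# Minimal sketch (not the authors' code) of a topic-conditioned
# convolutional encoder: document topic distributions are appended
# to each word embedding, then passed through gated (GLU) convolutions
# with residual connections.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopicConvEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, num_topics=512,
                 hidden=512, kernel=3, layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Project [word embedding; topic vector] into the conv channel size.
        self.input_proj = nn.Linear(emb_dim + num_topics, hidden)
        # Each conv emits 2*hidden channels so a GLU can gate them back to hidden.
        self.convs = nn.ModuleList([
            nn.Conv1d(hidden, 2 * hidden, kernel, padding=kernel // 2)
            for _ in range(layers)
        ])

    def forward(self, tokens, doc_topics):
        # tokens:     (batch, seq_len) word ids
        # doc_topics: (batch, num_topics) topic distribution per document
        x = self.embed(tokens)                               # (B, T, E)
        topics = doc_topics.unsqueeze(1).expand(-1, x.size(1), -1)
        x = self.input_proj(torch.cat([x, topics], dim=-1))  # (B, T, H)
        x = x.transpose(1, 2)                                # (B, H, T)
        for conv in self.convs:
            residual = x
            x = F.glu(conv(x), dim=1)                        # gated convolution
            x = x + residual                                 # residual connection
        return x.transpose(1, 2)                             # (B, T, H)

# Illustrative usage with random inputs:
# enc = TopicConvEncoder(vocab_size=50_000)
# h = enc(torch.randint(0, 50_000, (2, 40)), torch.rand(2, 512))
```

In the paper, the decoder side is convolutional as well, following convolutional sequence-to-sequence models; this sketch covers only one plausible topic-aware encoder.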
