Learning to Extract Coherent Summary via Deep Reinforcement Learning

Coherence plays a critical role in producing a high-quality summary from a document. In recent years, neural extractive summarization has become increasingly attractive, yet most existing approaches ignore the coherence of the summaries they extract. As a step toward extracting coherent summaries, we propose a neural coherence model that captures cross-sentence semantic and syntactic coherence patterns. The proposed model obviates the need for feature engineering and can be trained end-to-end on unlabeled data. Empirical results show that it efficiently captures cross-sentence coherence patterns. Using the combined output of the neural coherence model and the ROUGE package as the reward, we design a reinforcement learning method to train our extractive summarizer, the Reinforced Neural Extractive Summarization (RNES) model, which learns to optimize the coherence and the informative importance of a summary simultaneously. Experimental results show that RNES outperforms existing baselines and achieves state-of-the-art performance in terms of ROUGE on the CNN/Daily Mail dataset. Qualitative evaluation indicates that the summaries produced by RNES are more coherent and readable.
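To illustrate the training scheme the abstract describes — a policy-gradient (REINFORCE) update driven by a reward that mixes a coherence score with a ROUGE-style informativeness score — here is a minimal toy sketch. All names (`combined_reward`, `reinforce_step`, the mixing weight `lam`) and the unigram-overlap stand-in for ROUGE are hypothetical illustrations, not the paper's actual reward or policy; RNES uses a learned neural coherence model and the full ROUGE package.

```python
import math
import random

def combined_reward(summary, reference, coherence_score, lam=0.5):
    """Hypothetical combined reward: a weighted sum of a crude ROUGE-like
    unigram-recall score and an externally supplied coherence score.
    `lam` is an assumed mixing weight, not taken from the paper."""
    ref_tokens = set(reference.split())
    sum_tokens = set(summary.split())
    rouge_like = len(ref_tokens & sum_tokens) / max(len(ref_tokens), 1)
    return lam * rouge_like + (1 - lam) * coherence_score

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def reinforce_step(theta, sentences, reward_fn, lr=0.1):
    """One REINFORCE update for a toy per-sentence Bernoulli extraction
    policy.  `theta` holds one logit per document sentence.  We sample an
    extract/skip label for each sentence, score the resulting summary with
    `reward_fn`, and move each logit in the direction of the log-likelihood
    gradient of its sampled label, scaled by the reward."""
    probs = [sigmoid(t) for t in theta]
    labels = [1 if random.random() < p else 0 for p in probs]
    summary = " ".join(s for s, y in zip(sentences, labels) if y)
    r = reward_fn(summary)
    # d/dtheta log Bernoulli(label | sigmoid(theta)) = label - sigmoid(theta)
    new_theta = [t + lr * r * (y - p) for t, p, y in zip(theta, probs, labels)]
    return new_theta, r
```

In the paper's actual setting the policy is a neural network over sentence representations rather than independent logits, but the update rule has the same shape: sentences that appear in a high-reward (informative and coherent) extraction become more likely to be selected again.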
