Deep reinforcement learning for extractive document summarization

Abstract We present a novel extractive document summarization approach based on a Deep Q-Network (DQN), which models the salience and redundancy of sentences in the Q-value approximation and learns a policy that maximizes the ROUGE score with respect to gold summaries. We design two hierarchical network architectures that generate informative features from the document to represent the states of the DQN and also create a list of candidate actions from the document's sentences. At training time, our model is trained directly on human-written reference summaries, eliminating the need for sentence-level extractive labels. At test time, we evaluate the model on the CNN/Daily Mail corpus, the DUC 2002 dataset, and the DUC 2004 dataset using the ROUGE metric. Our experiments show that our approach achieves performance better than or comparable to state-of-the-art models on these corpora without any access to linguistic annotation. This is the first time a DQN has been applied to extractive summarization tasks.
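The framing of extraction as sequential decision making can be illustrated with a small sketch. It is not the paper's model: the abstract does not give the reward definition or the network details, so the code below assumes the reward for adding a sentence is the gain in a simplified unigram-recall score (a stand-in for ROUGE), and it replaces the learned Q-function with a greedy oracle that sees the reference. The names `rouge1_recall`, `step_reward`, and `greedy_episode` are hypothetical.

```python
from collections import Counter


def rouge1_recall(summary_sentences, reference):
    """Simplified unigram recall; a rough stand-in for the real ROUGE package."""
    ref_counts = Counter(reference.lower().split())
    if not ref_counts:
        return 0.0
    sum_counts = Counter(" ".join(summary_sentences).lower().split())
    # Clipped overlap: each reference word is credited at most as often as it occurs.
    overlap = sum(min(count, sum_counts[word]) for word, count in ref_counts.items())
    return overlap / sum(ref_counts.values())


def step_reward(selected, candidate, reference):
    """Assumed reward: the score gain from adding `candidate` to the partial summary."""
    before = rouge1_recall(selected, reference)
    after = rouge1_recall(selected + [candidate], reference)
    return after - before


def greedy_episode(sentences, reference, max_sentences=2):
    """Select sentences one at a time by immediate reward.

    A trained DQN would pick argmax over Q(state, action) without seeing the
    reference; this oracle only shows the state/action/reward structure.
    """
    selected = []
    remaining = list(sentences)
    while remaining and len(selected) < max_sentences:
        best = max(remaining, key=lambda s: step_reward(selected, s, reference))
        if step_reward(selected, best, reference) <= 0:
            break  # no remaining sentence improves the summary
        selected.append(best)
        remaining.remove(best)
    return selected


sentences = [
    "the cat sat on the mat",
    "the cat sat on the mat today",
    "dogs bark loudly",
]
reference = "the cat sat while dogs bark"
summary = greedy_episode(sentences, reference)
```

In this toy run the second sentence earns zero reward once the first is selected, because it adds no new reference words. That is the sense in which a gain-based reward can capture redundancy as well as salience.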
