Sequence Generative Adversarial Network for Long Text Summarization

In this paper, we propose a new adversarial training framework for the text summarization task. Although sequence-to-sequence models have achieved state-of-the-art performance in abstractive summarization, their standard training strategy, maximum likelihood estimation (MLE), suffers from exposure bias at inference time: the model is trained on ground-truth prefixes but must condition on its own previous predictions when generating. This discrepancy between training and inference makes generated summaries less coherent and less accurate, and the effect is more pronounced when summarizing long articles. To address this issue, we model abstractive summarization with a Generative Adversarial Network (GAN), aiming to minimize the gap between generated summaries and the ground-truth ones. The framework consists of two models: a generator that produces summaries and a discriminator that distinguishes generated summaries from human-written ones. Because the discriminator's signal cannot be backpropagated through discrete word sampling, a reinforcement learning (RL) strategy is used to enable the co-training of the generator and discriminator. In addition, motivated by the nature of the summarization task, we design a novel Triple-RNNs discriminator and extend the off-the-shelf generator by equipping its encoder and decoder with an attention mechanism. Experimental results show that our model significantly outperforms state-of-the-art models, especially on long-text corpora.
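
To make the adversarial training loop concrete, below is a minimal PyTorch-style sketch of one policy-gradient (REINFORCE) update for the generator, in the spirit of SeqGAN-style training. The `generator.sample` and `discriminator(...)` interfaces, argument names, and tensor shapes are illustrative assumptions, not the paper's actual implementation; the key idea is that the discriminator's probability that a sampled summary is human-written serves as the reward, which is propagated through the discrete sampling step via the log-probabilities of the sampled tokens.

```python
import torch

def generator_pg_step(generator, discriminator, articles, optimizer):
    """One REINFORCE update for the generator (illustrative sketch).

    Assumed (hypothetical) interfaces:
      generator.sample(articles) -> (summaries, log_probs)
        summaries: LongTensor (batch, T) of sampled token ids
        log_probs: FloatTensor (batch, T) of log p(token) for each sample
      discriminator(articles, summaries) -> FloatTensor (batch,)
        probability that each summary is human-written
    """
    # Sample summaries token by token, keeping per-token log-probabilities.
    summaries, log_probs = generator.sample(articles)

    # Reward: the discriminator's belief that the sample is real.
    # It is a constant with respect to the generator, hence no_grad.
    with torch.no_grad():
        rewards = discriminator(articles, summaries)  # shape: (batch,)

    # REINFORCE loss: maximize E[reward * log p(summary)],
    # i.e. minimize its negative, summed over tokens and averaged over batch.
    loss = -(rewards.unsqueeze(1) * log_probs).sum(dim=1).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice such training alternates this generator step with ordinary supervised updates of the discriminator on real versus sampled summaries; per-token rewards (e.g. via Monte Carlo rollouts, as in SeqGAN) give a lower-variance signal than the single sequence-level reward used in this sketch.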
