A two-step abstractive summarization model with asynchronous and enriched-information decoding

Most sequence-to-sequence abstractive summarization models generate summaries conditioned on the source article and the words generated so far, but they neglect the future information implied in the words not yet generated; in other words, they lack the ability to "look ahead." In this paper, we present a summarization model with lookahead ability that fully exploits this implied future information. Our model works in two steps: (1) in the training step, we train an asynchronous decoder model in which a backward decoder, guided by no ground truth, explicitly produces and exploits the future information; (2) in the inference step, we propose an enriched-information decoding method that scores hypotheses not only by the joint probability of the generated sequence but also by the future ROUGE reward of the words yet to be generated. This future ROUGE reward is predicted by a novel reward-prediction model that takes the hidden states of the pre-trained asynchronous decoder as input. Experimental results show that our two-step model achieves new state-of-the-art results on the CNN/Daily Mail dataset, and that it generalizes well: on the test-only DUC-2002 dataset it scores higher than the previous state-of-the-art model.
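To make the enriched-information decoding concrete, the sketch below shows how a partial beam hypothesis could be scored by combining the joint log-probability of the words generated so far with a future ROUGE reward predicted from the decoder's hidden state. This is a minimal illustrative sketch, not the authors' implementation: the weighting constant, layer sizes, and all names (`enriched_score`, `reward_predictor`, `LAMBDA`) are assumptions introduced here for illustration.

```python
import torch

# Minimal sketch of enriched-information decoding during beam search.
# All names and hyperparameters below are illustrative assumptions,
# not the paper's actual implementation.

LAMBDA = 0.5  # assumed weight balancing log-likelihood and predicted future reward


def enriched_score(log_prob_so_far: torch.Tensor,
                   decoder_hidden: torch.Tensor,
                   reward_predictor: torch.nn.Module) -> torch.Tensor:
    """Score a partial hypothesis: joint log-probability of the words generated
    so far, plus a future ROUGE reward predicted from the decoder hidden state,
    which stands in for the information carried by the un-generated words."""
    predicted_future_rouge = reward_predictor(decoder_hidden).squeeze(-1)
    return log_prob_so_far + LAMBDA * predicted_future_rouge


# Toy reward-prediction model: a regressor over the hidden state of the
# pre-trained asynchronous decoder (the architecture here is a placeholder).
reward_predictor = torch.nn.Sequential(
    torch.nn.Linear(512, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 1),
)

# Example: re-rank two beam hypotheses that have equal likelihood; the one
# whose hidden state predicts a higher future ROUGE reward wins.
log_probs = torch.tensor([-3.2, -3.2])      # joint log P of each generated prefix
hidden_states = torch.randn(2, 512)         # decoder hidden states of the prefixes
scores = enriched_score(log_probs, hidden_states, reward_predictor)
best = scores.argmax().item()
```

In this formulation, setting `LAMBDA` to zero recovers ordinary likelihood-based beam search, so the predicted future reward acts purely as a lookahead bonus when ranking hypotheses.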
