Abstractive Summarization with Keyword and Generated Word Attention

Abstractive summarization is an important task in natural language processing. In previous work, sequence-to-sequence models have been widely used for abstractive summarization. However, most current abstractive summarization models still suffer from two problems. First, it is difficult for these models to learn an accurate source contextual representation from redundant and noisy source text at each decoding step. Second, there is an information loss problem, which has been ignored in previous work; it arises because these models fail to effectively exploit previously generated words. To address these two problems, we propose a novel keyword and generated word attention model. Specifically, the proposed model first employs the decoder hidden state to capture a keyword context and a previously-generated-word context at each time step. The model then uses these two contexts to create a keyword-aware and a generated-word-aware source context, respectively. The keyword context helps the model learn an accurate source contextual representation, and the generated word context alleviates the information loss problem. Experimental results on a popular Chinese social media dataset demonstrate that the proposed model outperforms the baselines and achieves state-of-the-art performance.
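To make the two-stage attention concrete, below is a minimal sketch of how the keyword and generated word attention could be wired up, assuming additive (Bahdanau-style) attention at every stage and assuming the keyword-aware and generated-word-aware source contexts are computed by re-attending over the encoder states with the decoder state concatenated to the keyword or generated-word context. The abstract does not specify these details, so all module and parameter names here are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdditiveAttention(nn.Module):
    """Standard additive attention: score(q, h_i) = v^T tanh(W_q q + W_h h_i)."""

    def __init__(self, query_size: int, hidden_size: int):
        super().__init__()
        self.w_q = nn.Linear(query_size, hidden_size, bias=False)
        self.w_h = nn.Linear(hidden_size, hidden_size, bias=False)
        self.v = nn.Linear(hidden_size, 1, bias=False)

    def forward(self, query, memory):
        # query: (batch, query_size), memory: (batch, mem_len, hidden)
        scores = self.v(torch.tanh(self.w_q(query).unsqueeze(1) + self.w_h(memory)))
        weights = F.softmax(scores.squeeze(-1), dim=-1)            # (batch, mem_len)
        return torch.bmm(weights.unsqueeze(1), memory).squeeze(1)  # (batch, hidden)


class KeywordGeneratedWordAttention(nn.Module):
    """Hypothetical sketch of the keyword and generated word attention step."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.key_attn = AdditiveAttention(hidden_size, hidden_size)
        self.gen_attn = AdditiveAttention(hidden_size, hidden_size)
        # Query size doubles because the decoder state is concatenated with
        # the keyword / generated-word context before attending over the source.
        self.src_attn_key = AdditiveAttention(2 * hidden_size, hidden_size)
        self.src_attn_gen = AdditiveAttention(2 * hidden_size, hidden_size)

    def forward(self, dec_state, enc_states, keyword_states, gen_states):
        # dec_state:      (batch, hidden)           decoder hidden state s_t
        # enc_states:     (batch, src_len, hidden)  encoder hidden states
        # keyword_states: (batch, n_key, hidden)    representations of keywords
        # gen_states:     (batch, t, hidden)        states of generated words
        c_key = self.key_attn(dec_state, keyword_states)  # keyword context
        c_gen = self.gen_attn(dec_state, gen_states)      # generated-word context
        # Keyword-aware and generated-word-aware source contexts.
        c_src_key = self.src_attn_key(torch.cat([dec_state, c_key], dim=-1), enc_states)
        c_src_gen = self.src_attn_gen(torch.cat([dec_state, c_gen], dim=-1), enc_states)
        return c_src_key, c_src_gen
```

In such a design, the two source contexts would typically be combined with the decoder state before projecting onto the output vocabulary; the abstract only states that the keyword context sharpens the source representation while the generated-word context counters information loss.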
