Source-side Prediction for Neural Headline Generation

The encoder-decoder model is widely used in natural language generation tasks. However, it sometimes generates repeated, redundant output, misses important phrases, and includes irrelevant entities. To address these problems, we propose a novel source-side token prediction module. Our method jointly estimates the probability distributions over the source and target vocabularies to capture the correspondence between source and target tokens. Experiments show that the proposed model outperforms the current state-of-the-art method on the headline generation task. Additionally, we show that our method can learn a reasonable token-wise correspondence without access to any gold alignments.
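To make the idea concrete, below is a minimal sketch of such a joint prediction head in PyTorch. It is an illustrative reconstruction under stated assumptions, not the paper's actual architecture: the class name, dimensions, and the weighting hyperparameter `lambda_src` are all hypothetical.

```python
import torch
import torch.nn as nn

class JointSourceTargetHead(nn.Module):
    """Illustrative sketch: at each decoder step, predict a distribution
    over the target vocabulary (as usual) and, in parallel, a distribution
    over the source vocabulary, so the model is pushed to track which
    source tokens the current step corresponds to."""

    def __init__(self, hidden_size, target_vocab_size, source_vocab_size):
        super().__init__()
        self.target_proj = nn.Linear(hidden_size, target_vocab_size)
        # Extra source-side head: the module proposed in the abstract.
        self.source_proj = nn.Linear(hidden_size, source_vocab_size)

    def forward(self, decoder_hidden):
        # decoder_hidden: (batch, hidden_size) from an attention-based decoder
        log_p_target = torch.log_softmax(self.target_proj(decoder_hidden), dim=-1)
        log_p_source = torch.log_softmax(self.source_proj(decoder_hidden), dim=-1)
        return log_p_target, log_p_source

# Usage with a joint training objective (illustrative values throughout):
head = JointSourceTargetHead(hidden_size=256,
                             target_vocab_size=30000,
                             source_vocab_size=30000)
h = torch.randn(4, 256)                      # fake decoder states, batch of 4
log_p_target, log_p_source = head(h)

lambda_src = 1.0                             # assumed weighting hyperparameter
target_gold = torch.randint(0, 30000, (4,))  # gold target tokens
source_gold = torch.randint(0, 30000, (4,))  # source tokens to be predicted
loss = nn.functional.nll_loss(log_p_target, target_gold) \
     + lambda_src * nn.functional.nll_loss(log_p_source, source_gold)
```

Under this reading, the source-side term acts as an auxiliary loss: no true alignments are required, since the source tokens themselves supervise the extra head, which is consistent with the abstract's claim that a token-wise correspondence is learned without gold alignments.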
