Reducing Odd Generation from Neural Headline Generation

The encoder-decoder model is widely used in natural language generation tasks. However, the model sometimes suffers from repeated redundant generation, misses important phrases, and includes irrelevant entities. To address these problems, we propose a novel source-side token prediction module. Our method jointly estimates the probability distributions over the source and target vocabularies to capture the correspondence between source and target tokens. Experiments show that the proposed model outperforms the current state-of-the-art method on the headline generation task. We also show that our method can learn a reasonable token-wise correspondence without knowing any true alignment.
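
To make the joint objective concrete, here is a minimal PyTorch sketch of a decoder with an auxiliary source-side token prediction head. Everything here is an illustrative assumption rather than the paper's implementation: the module and function names, the layer sizes, the multi-hot bag-of-words training signal `src_bow`, and the loss weight `alpha` are all hypothetical.

```python
# Sketch: a decoder that, at each step, predicts both the next target token
# and a distribution over the source vocabulary, trained with a joint loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoderWithSourcePrediction(nn.Module):
    def __init__(self, tgt_vocab, src_vocab, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(tgt_vocab, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.tgt_out = nn.Linear(hid_dim, tgt_vocab)  # target-vocabulary logits
        self.src_out = nn.Linear(hid_dim, src_vocab)  # auxiliary source-vocabulary logits

    def forward(self, tgt_in, hidden=None):
        h, hidden = self.rnn(self.embed(tgt_in), hidden)
        return self.tgt_out(h), self.src_out(h), hidden

def joint_loss(tgt_logits, src_logits, tgt_gold, src_bow, alpha=1.0):
    # Standard cross-entropy over the generated target tokens.
    ce = F.cross_entropy(tgt_logits.reshape(-1, tgt_logits.size(-1)),
                         tgt_gold.reshape(-1))
    # Auxiliary term (an assumed training signal): push the source-side
    # distribution at every decoder step toward the multi-hot bag of
    # source tokens (src_bow: batch x V_src).
    log_p_src = F.log_softmax(src_logits, dim=-1)  # batch x steps x V_src
    aux = -(log_p_src * src_bow.unsqueeze(1)).sum(-1).mean()
    return ce + alpha * aux
```

The intuition behind such a sketch: tying every decoder state to a distribution over source tokens penalizes generating content with no support in the source and rewards covering its important tokens, which is the kind of source-target correspondence the abstract describes.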
