May the Force Be with Your Copy Mechanism: Enhanced Supervised-Copy Method for Natural Language Generation

Recent neural sequence-to-sequence models with a copy mechanism have achieved remarkable progress in various text generation tasks. These models address the out-of-vocabulary problem and facilitate the generation of rare words. However, identifying which words should be copied is difficult, as evidenced by prior copy models, which suffer from incorrect generation and a lack of abstractness. In this paper, we propose a novel supervised approach to the copy network that helps the model decide which words need to be copied and which need to be generated. Specifically, we re-define the objective function, which leverages source sequences and target vocabularies as guidance for copying. Experimental results on data-to-text generation and abstractive summarization tasks verify that our approach enhances copying quality and improves the degree of abstractness.
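To make the notion of a supervised copy objective concrete, the following is a minimal illustrative sketch rather than the paper's exact formulation: it assumes a pointer-generator-style copy gate p^{gen}_t and binary copy labels c_t derived from whether each target token also occurs in the source sequence, with the auxiliary term added to the standard generation loss.

\mathcal{L} = \mathcal{L}_{\mathrm{gen}} + \lambda \, \mathcal{L}_{\mathrm{copy}}, \qquad
\mathcal{L}_{\mathrm{copy}} = -\sum_{t} \Big[ c_t \log \big(1 - p^{\mathrm{gen}}_t\big) + (1 - c_t) \log p^{\mathrm{gen}}_t \Big]

Here c_t = 1 if the t-th target token appears in the source sequence (and should therefore be copied) and c_t = 0 otherwise, while \lambda balances the usual negative log-likelihood generation loss against the copy supervision; the actual objective proposed in the paper may weight or define these terms differently.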
