Cue-word Driven Neural Response Generation with a Shrinking Vocabulary

Open-domain response generation is the task of generating sensible and informative responses to a source sentence. However, neural models tend to generate safe and meaningless responses. While cue-word introducing approaches encourage responses with concrete semantics and have shown tremendous potential, they still fail to explore diverse responses during decoding. In this paper, we propose a novel but natural approach that produces multiple cue-words during decoding, and then uses the produced cue-words to drive decoding and shrink the decoding vocabulary. The neural generation model can thus explore the full space of responses and discover informative ones efficiently. Experimental results show that our approach significantly outperforms several strong baseline models with much lower decoding complexity. In particular, our approach converges to concrete semantics more efficiently during decoding.
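As a minimal sketch of the shrinking-vocabulary idea described above: at each decoding step, the model's logits over the full vocabulary are restricted to a small set of token ids associated with the active cue-words, and the distribution is renormalized over that shrunken set. Note that `allowed_ids` here is a hypothetical stand-in for the paper's learned cue-word selection; the actual model chooses cue-words dynamically during decoding.

```python
import math

def shrink_vocab_step(logits, allowed_ids):
    """Restrict a decoding step to a shrunken vocabulary.

    logits: list of scores over the full vocabulary.
    allowed_ids: set of token ids kept by the active cue-words
                 (a hypothetical stand-in for the paper's learned selection).
    Returns a probability distribution over only the allowed ids.
    """
    masked = {i: logits[i] for i in allowed_ids}
    # Numerically stable softmax over the shrunken vocabulary only.
    z = max(masked.values())
    exps = {i: math.exp(v - z) for i, v in masked.items()}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

# Toy vocabulary of 6 tokens; suppose the active cue-word keeps ids {1, 3, 4}.
logits = [0.1, 2.0, -1.0, 1.0, 0.5, -0.3]
probs = shrink_vocab_step(logits, {1, 3, 4})
next_id = max(probs, key=probs.get)  # greedy pick from the shrunken vocabulary
```

Because the softmax is computed over a handful of cue-related tokens rather than the full vocabulary, each decoding step is cheaper, which is the source of the lower decoding complexity reported in the abstract.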
