GLSE: Global-Local Selective Encoding for Response Generation in Neural Conversation Model

Generating relevant and informative responses is one of the core problems in response generation. Following the task formulation of neural machine translation (NMT), previous works mainly treat response generation as a mapping from a source sentence to a target sentence. However, when trained to maximize the likelihood of a response given a message in an almost loss-less manner, as in machine translation, dialogue models tend to generate safe, commonplace responses (e.g., "I don't know") regardless of the input. Different from existing works, we propose a Global-Local Selective Encoding model (GLSE) that extends the seq2seq framework to generate more relevant and informative responses. Specifically, two types of selective gate networks are introduced: (i) a local selective word-sentence gate, added after the encoding phase of the seq2seq framework, which learns to tailor the original message information and produces a selected input representation; (ii) a global selective bidirectional-context gate that controls the bidirectional information flow from a BiGRU-based encoder to the decoder. Empirical studies demonstrate the advantage of our model over several classical and strong baselines.
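To make the two gates concrete, the sketch below shows one way they might look in PyTorch. This is a minimal illustration under our own assumptions: the class names, tensor shapes, and exact gating formulas are inferred from the abstract and from standard selective-encoding formulations, not taken from the authors' code.

```python
import torch
import torch.nn as nn


class LocalSelectiveGate(nn.Module):
    """Word-sentence gate (hypothetical): filters each encoder state h_i
    through a sigmoid gate conditioned on h_i and a sentence vector s."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.w_h = nn.Linear(hidden_size, hidden_size, bias=False)
        self.w_s = nn.Linear(hidden_size, hidden_size)

    def forward(self, h: torch.Tensor, s: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, hidden); s: (batch, hidden)
        gate = torch.sigmoid(self.w_h(h) + self.w_s(s).unsqueeze(1))
        return h * gate  # the "selected" input representation


class GlobalBiContextGate(nn.Module):
    """Bidirectional-context gate (hypothetical): balances the final
    forward and backward BiGRU states before they reach the decoder."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.w = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, h_fwd: torch.Tensor, h_bwd: torch.Tensor) -> torch.Tensor:
        # h_fwd, h_bwd: (batch, hidden) final forward/backward GRU states
        z = torch.sigmoid(self.w(torch.cat([h_fwd, h_bwd], dim=-1)))
        return z * h_fwd + (1.0 - z) * h_bwd  # gated decoder initial state


# Usage sketch with toy dimensions.
encoder = nn.GRU(128, 256, batch_first=True, bidirectional=True)
local_gate, global_gate = LocalSelectiveGate(512), GlobalBiContextGate(256)

x = torch.randn(4, 10, 128)              # (batch, seq_len, emb_dim)
out, h_n = encoder(x)                    # out: (4, 10, 512); h_n: (2, 4, 256)
s = torch.cat([h_n[0], h_n[1]], dim=-1)  # whole-sentence vector (4, 512)
selected = local_gate(out, s)            # locally selected encoder states
dec_init = global_gate(h_n[0], h_n[1])   # globally gated decoder init state
```

Under this reading, the local gate prunes message words that are irrelevant to the response, while the global gate decides how much each reading direction of the BiGRU contributes to the decoder's initial state.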
