Hierarchy Response Learning for Neural Conversation Generation

The neural encoder-decoder models have shown great promise in neural conversation generation. However, they cannot perceive and express the intention effectively, and hence often generate dull and generic responses. Unlike past work that has focused on diversifying the output at word-level or discourse-level with a flat model to alleviate this problem, we propose a hierarchical generation model to capture the different levels of diversity using the conditional variational autoencoders. Specifically, a hierarchical response generation (HRG) framework is proposed to capture the conversation intention in a natural and coherent way. It has two modules, namely, an expression reconstruction model to capture the hierarchical correlation between expression and intention, and an expression attention model to effectively combine the expressions with contents. Finally, the training procedure of HRG is improved by introducing reconstruction loss. Experiment results show that our model can generate the responses with more appropriate content and expression.

[1]  Milica Gasic,et al.  POMDP-Based Statistical Spoken Dialog Systems: A Review , 2013, Proceedings of the IEEE.

[2]  Jianfeng Gao,et al.  A Persona-Based Neural Conversation Model , 2016, ACL.

[3]  Hang Li,et al.  Neural Responding Machine for Short-Text Conversation , 2015, ACL.

[4]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[5]  Samy Bengio,et al.  Generating Sentences from a Continuous Space , 2015, CoNLL.

[6]  Jianfeng Gao,et al.  A Diversity-Promoting Objective Function for Neural Conversation Models , 2015, NAACL.

[7]  Qi Wu,et al.  Are You Talking to Me? Reasoned Visual Dialog Generation Through Adversarial Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[8]  Joelle Pineau,et al.  Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models , 2015, AAAI.

[9]  E. Schegloff,et al.  A simplest systematics for the organization of turn-taking for conversation , 1974 .

[10]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[11]  Zhen Xu,et al.  Neural Response Generation via GAN with an Approximate Embedding Layer , 2017, EMNLP.

[12]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[13]  Jun Xu,et al.  Reinforcing Coherence for Sequence to Sequence Model in Dialogue Generation , 2018, IJCAI.

[14]  Christopher D. Manning,et al.  Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.

[15]  Xiaoyan Zhu,et al.  Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory , 2017, AAAI.

[16]  Xiaoyu Shen,et al.  DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset , 2017, IJCNLP.

[17]  Daniel McDuff,et al.  Emotional Dialogue Generation using Image-Grounded Language Models , 2018, CHI.

[18]  Jianfeng Gao,et al.  Deep Reinforcement Learning for Dialogue Generation , 2016, EMNLP.

[19]  Alan Ritter,et al.  Adversarial Learning for Neural Dialogue Generation , 2017, EMNLP.

[20]  Carl Doersch,et al.  Tutorial on Variational Autoencoders , 2016, ArXiv.

[21]  Joelle Pineau,et al.  How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation , 2016, EMNLP.

[22]  Adi Ben-Israel,et al.  Generalized inverses: theory and applications , 1974 .

[23]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[24]  Yu Zhang,et al.  Flexible End-to-End Dialogue System for Knowledge Grounded Conversation , 2017, ArXiv.

[25]  Joelle Pineau,et al.  A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues , 2016, AAAI.

[26]  Maxine Eskénazi,et al.  Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders , 2017, ACL.

[27]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[28]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[29]  James H. Martin,et al.  Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition , 2000 .

[30]  Massimo Poesio,et al.  Towards an Axiomatization of Dialogue Acts , 1998 .

[31]  Yang Feng,et al.  Knowledge Diffusion for Neural Dialogue Generation , 2018, ACL.