Condition-transforming Variational Autoencoder for Conversation Response Generation

This paper proposes a new model, the condition-transforming variational autoencoder (CTVAE), to improve conversation response generation over conditional variational autoencoders (CVAEs). In conventional CVAEs, the prior distribution of the latent variable $z$ is a multivariate Gaussian whose mean and variance are modulated by the input conditions. Previous work has found that this distribution tends to become condition-independent in practice. In our proposed CTVAE model, the latent variable $z$ is instead obtained by applying a non-linear transformation to the concatenation of the input conditions and samples drawn from a condition-independent prior distribution $\mathcal{N}\left( {0,{\mathbf{I}}} \right)$. In objective evaluations, the CTVAE model outperforms the CVAE model on fluency metrics and surpasses a sequence-to-sequence (Seq2Seq) model on diversity metrics. In subjective preference tests, our proposed CTVAE model performs significantly better than the CVAE and Seq2Seq models at generating fluent, informative, and topic-relevant responses.
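The sampling step described above reduces to drawing $\epsilon \sim \mathcal{N}(0, \mathbf{I})$ and computing $z = f([c; \epsilon])$ for a learned non-linear transformation $f$ and condition encoding $c$. The following is a minimal PyTorch sketch of that step; the module name, layer sizes, and activation are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class ConditionTransform(nn.Module):
    """Sketch of the CTVAE sampling rule: z = f([c; eps]), eps ~ N(0, I).

    Hypothetical layer sizes and activation; the paper's actual
    architecture may differ.
    """
    def __init__(self, cond_dim, noise_dim, z_dim, hidden_dim=256):
        super().__init__()
        self.noise_dim = noise_dim
        # Non-linear transformation applied to [condition; noise sample]
        self.transform = nn.Sequential(
            nn.Linear(cond_dim + noise_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, z_dim),
        )

    def forward(self, cond):
        # Sample from the condition-independent prior N(0, I)
        eps = torch.randn(cond.size(0), self.noise_dim, device=cond.device)
        # Transform the concatenation into the latent variable z
        return self.transform(torch.cat([cond, eps], dim=-1))

# Usage: map a batch of encoded conversation contexts to latent samples
sampler = ConditionTransform(cond_dim=512, noise_dim=64, z_dim=128)
cond = torch.randn(8, 512)  # stand-in for encoded input conditions
z = sampler(cond)           # z depends on the condition by construction
```

Because the noise is drawn from a fixed prior and only the transformation sees the condition, the dependence of $z$ on the condition is enforced structurally rather than through a condition-modulated Gaussian, which is the failure mode the abstract attributes to conventional CVAEs.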
