DialogWAE: Multimodal Response Generation with Conditional Wasserstein Auto-Encoder

Variational autoencoders (VAEs) have shown promise in data-driven conversation modeling. However, most VAE conversation models match the approximate posterior distribution over the latent variables to a simple prior such as the standard normal distribution, thereby restricting the generated responses to a relatively simple (e.g., unimodal) scope. In this paper, we propose DialogWAE, a conditional Wasserstein autoencoder (WAE) specially designed for dialogue modeling. Unlike VAEs that impose a simple distribution over the latent variables, DialogWAE models the distribution of data by training a GAN within the latent variable space. Specifically, our model samples from the prior and posterior distributions over the latent variables by transforming context-dependent random noise using neural networks, and minimizes the Wasserstein distance between the two distributions. We further develop a Gaussian mixture prior network to enrich the latent space. Experiments on two popular datasets show that DialogWAE outperforms state-of-the-art approaches in generating more coherent, informative, and diverse responses.
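To make the latent-space matching concrete, the sketch below is a minimal PyTorch illustration of the idea described above, not the authors' released code: a prior network and a recognition (posterior) network each transform context-dependent Gaussian noise into a latent sample, and a critic estimates the Wasserstein distance between the two sample distributions. The module names (NoiseToLatent, LatentCritic), layer sizes, and the WGAN-style critic objective (gradient penalty omitted) are assumptions made for exposition.

```python
import torch
import torch.nn as nn

class NoiseToLatent(nn.Module):
    """Transforms context-conditioned Gaussian noise into a latent sample.
    Instantiated as both the prior network (conditioned on context only)
    and the recognition network (conditioned on context + response)."""
    def __init__(self, cond_dim, noise_dim, z_dim, hidden=200):
        super().__init__()
        # The noise parameters (mean, log-variance) are predicted from the
        # condition, so the noise itself is context-dependent.
        self.noise_params = nn.Linear(cond_dim, noise_dim * 2)
        self.generator = nn.Sequential(
            nn.Linear(cond_dim + noise_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, z_dim),
        )

    def forward(self, cond):
        mu, logvar = self.noise_params(cond).chunk(2, dim=-1)
        eps = mu + (0.5 * logvar).exp() * torch.randn_like(mu)
        return self.generator(torch.cat([cond, eps], dim=-1))

class LatentCritic(nn.Module):
    """Critic scoring latent samples given the context; its score gap
    approximates the Wasserstein distance between prior and posterior."""
    def __init__(self, z_dim, ctx_dim, hidden=200):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + ctx_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, z, ctx):
        return self.net(torch.cat([z, ctx], dim=-1))

# One adversarial step; all dimensions are placeholder assumptions.
ctx_dim, resp_dim, noise_dim, z_dim = 300, 300, 200, 200
prior_net = NoiseToLatent(ctx_dim, noise_dim, z_dim)
post_net = NoiseToLatent(ctx_dim + resp_dim, noise_dim, z_dim)
critic = LatentCritic(z_dim, ctx_dim)

ctx = torch.randn(16, ctx_dim)    # encoded dialogue context (stand-in)
resp = torch.randn(16, resp_dim)  # encoded gold response (stand-in)
z_prior = prior_net(ctx)
z_post = post_net(torch.cat([ctx, resp], dim=-1))
# Minimizing this drives the critic score for posterior samples above
# that for prior samples (WGAN-style; gradient penalty not shown).
critic_loss = critic(z_prior, ctx).mean() - critic(z_post, ctx).mean()
```

The Gaussian mixture prior network mentioned in the abstract would replace the single Gaussian noise source in the prior network with a mixture of Gaussians, allowing the prior to cover multiple response modes rather than a single one.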
