Variational Memory Encoder-Decoder

Introducing variability while maintaining coherence is a core task in learning to generate utterances in conversation. Standard neural encoder-decoder models and their extensions using conditional variational autoencoders often result in either trivial or digressive responses. To overcome this, we explore a novel approach that injects variability into the neural encoder-decoder through the use of external memory as a mixture model, namely the Variational Memory Encoder-Decoder (VMED). By associating each memory read with a mode in the latent mixture distribution at each timestep, our model can capture the variability observed in sequential data such as natural conversations. We empirically compare the proposed model against other recent approaches on various conversational datasets. The results show that VMED consistently achieves significant improvement over the others in both metric-based and qualitative evaluations.
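The following is a minimal, hypothetical sketch (not the authors' released code) of the core idea described above: at each decoding step, each of K memory reads parameterizes one Gaussian mode of the latent mixture, the read attention weights act as mixture weights, and the sampled latent conditions the decoder. The class and parameter names, the static memory matrix, and the single-layer attention are illustrative assumptions; in the full model the memory would be read and written by a controller over the course of decoding.

```python
# Sketch of a per-timestep latent mixture built from memory reads (assumed details).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureLatentFromMemory(nn.Module):
    def __init__(self, mem_slots=4, mem_dim=64, hidden_dim=128, latent_dim=32):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(mem_slots, mem_dim))  # external memory (static here for illustration)
        self.query = nn.Linear(hidden_dim, mem_dim)                  # decoder state -> read key
        self.to_mu = nn.Linear(mem_dim, latent_dim)                  # each memory read defines one Gaussian mode
        self.to_logvar = nn.Linear(mem_dim, latent_dim)

    def forward(self, h_t):
        # h_t: (batch, hidden_dim) decoder hidden state at timestep t
        scores = self.query(h_t) @ self.memory.t()                   # (batch, mem_slots) read scores
        weights = F.softmax(scores, dim=-1)                          # read attention doubles as mixture weights
        mu = self.to_mu(self.memory)                                 # (mem_slots, latent_dim)
        logvar = self.to_logvar(self.memory)
        # pick a mode per example, then reparameterize within that Gaussian
        modes = torch.distributions.Categorical(weights).sample()    # (batch,)
        mu_t, logvar_t = mu[modes], logvar[modes]
        z_t = mu_t + torch.randn_like(mu_t) * torch.exp(0.5 * logvar_t)
        return z_t, weights                                          # z_t would condition the decoder output

# Usage: one latent draw per decoding step
layer = MixtureLatentFromMemory()
h_t = torch.randn(8, 128)
z_t, w = layer(h_t)
print(z_t.shape, w.shape)  # torch.Size([8, 32]) torch.Size([8, 4])
```

Drawing the latent from a mixture rather than a single Gaussian is what lets each memory read contribute a distinct mode, which is the mechanism the abstract credits for capturing conversational variability.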
