Generating diverse conversation responses by creating and ranking multiple candidates

Abstract This paper introduces the systems we built for Track 2 of the Dialog System Technology Challenge 7 (DSTC7). This challenge track aimed to evaluate response generation methods using fully data-driven conversation models in a knowledge-grounded setting, where textual facts were provided as external knowledge for each context-response pair. Sequence-to-sequence models have achieved impressive results in machine translation and have also been widely used for end-to-end generative conversation modelling. However, previous studies found that they tend to produce dull and repetitive responses. Our work aims to promote the diversity of end-to-end conversation response generation by adopting a two-stage pipeline. 1) Generate multiple candidate responses for an input context together with its textual facts. At this stage, two different models are designed, i.e., a variational generative (VariGen) model and a retrieval-based (Retrieval) model. 2) Rank the candidates and return the most relevant response, using a topic coherence discrimination (TCD) model trained to calculate ranking scores. Our experiments demonstrate the effectiveness of both the response ranking strategy and the external textual knowledge for generating better responses. According to the official evaluation results, our Retrieval and VariGen systems ranked first and second respectively among all participant systems on Entropy metrics, which measure the objective diversity of generated responses. In addition, the VariGen system ranked second on the NIST and METEOR metrics, which measure the objective quality of generated responses.
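The second stage of the pipeline above can be sketched as a generic candidate-ranking step: score every candidate response against the context and return the highest-scoring one. The sketch below is a minimal, hypothetical illustration; `overlap_score` is only a stand-in for the paper's trained TCD model, which the abstract does not specify in detail.

```python
def rank_candidates(context, candidates, score_fn):
    """Sort candidate responses by descending relevance score."""
    return sorted(candidates, key=lambda r: score_fn(context, r), reverse=True)


def overlap_score(context, response):
    # Placeholder scorer (NOT the paper's TCD model): fraction of
    # response tokens that also appear in the context.
    ctx = set(context.lower().split())
    resp = set(response.lower().split())
    return len(ctx & resp) / max(len(resp), 1)


# Candidates as they might come from the two first-stage models.
candidates = [
    "i do not know .",                       # a generic, dull response
    "the eiffel tower is in paris france",   # a fact-grounded candidate
]
best = rank_candidates("where is the eiffel tower located",
                       candidates, overlap_score)[0]
```

In the full system, `score_fn` would be the TCD model's learned coherence score rather than word overlap, but the control flow (generate many, rank, return the top one) is the same.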
