Multi-Referenced Training for Dialogue Response Generation

In open-domain dialogue response generation, a dialogue context can be continued with diverse responses, and dialogue models should capture such one-to-many relations. In this work, we first analyze the training objective of dialogue models through the lens of Kullback-Leibler divergence (KLD) and show that the gap between the real-world probability distribution and the single-referenced data's probability distribution prevents the model from learning the one-to-many relations efficiently. We then explore multi-referenced training from two directions, sketched below. Data-wise, we generate diverse pseudo references from a powerful pretrained model to build multi-referenced data that better approximates the real-world distribution. Model-wise, we propose to equip variational models with an expressive prior, named the linear Gaussian model (LGM). Results of both automatic and human evaluation show that the methods yield significant improvements over the baselines.
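To make the KLD view concrete, the following is a minimal reconstruction of the standard argument (our sketch under ordinary maximum-likelihood assumptions, not necessarily the paper's exact derivation). Maximum-likelihood training of a response model $p_\theta(y \mid x)$ is, up to a constant independent of $\theta$, equivalent to minimizing

    \mathbb{E}_{x \sim p(x)} \left[ \mathrm{KL}\!\left( \tilde{p}(y \mid x) \,\|\, p_\theta(y \mid x) \right) \right],

where $\tilde{p}(y \mid x)$ is the empirical response distribution of the training corpus. With single-referenced data, $\tilde{p}(y \mid x)$ places all of its mass on the one recorded reference, whereas the real-world distribution $p^*(y \mid x)$ spreads probability over many acceptable responses, so the model is fit to a degenerate target. Multi-referenced data replaces $\tilde{p}$ with a closer approximation of $p^*$ and thereby shrinks this gap.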
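Data-wise, the pseudo-reference construction can be sketched as follows. This is a hypothetical implementation, assuming DialoGPT as the pretrained generator (via the Hugging Face transformers API) and nucleus sampling for diversity; the paper's actual generator, sampling scheme, and any filtering of low-quality samples may differ.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: DialoGPT stands in for the "powerful pretrained model".
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")
model.eval()

def pseudo_references(context: str, num_refs: int = 10, top_p: float = 0.9):
    # DialoGPT separates dialogue turns with the EOS token.
    input_ids = tokenizer.encode(context + tokenizer.eos_token, return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(
            input_ids,
            do_sample=True,                 # sample rather than decode greedily
            top_p=top_p,                    # nucleus sampling promotes diversity
            num_return_sequences=num_refs,  # several pseudo references per context
            max_new_tokens=40,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Keep only the generated continuation, dropping the context prefix.
    continuations = outputs[:, input_ids.shape[-1]:]
    return [tokenizer.decode(c, skip_special_tokens=True) for c in continuations]

# Each (context, pseudo reference) pair becomes one training example, so a
# single-referenced corpus is expanded into a multi-referenced one.
print(pseudo_references("How was your weekend?"))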
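Model-wise, the abstract names the prior (LGM) but does not define it, so the sketch below is only one plausible reading: a classic linear Gaussian model $z = A h + b + \epsilon$ with $h \sim \mathcal{N}(0, I)$, whose parameters $A$ and $b$ are predicted from the dialogue context. The class name, dimensions, and conditioning scheme are our assumptions, not the paper's specification.

import torch
import torch.nn as nn

class LGMPrior(nn.Module):
    # Hypothetical context-conditioned linear Gaussian prior for a CVAE-style
    # dialogue model; the paper's exact parameterization may differ.
    def __init__(self, ctx_dim: int, h_dim: int, z_dim: int):
        super().__init__()
        self.proj_A = nn.Linear(ctx_dim, z_dim * h_dim)  # context -> loading matrix A
        self.proj_b = nn.Linear(ctx_dim, z_dim)          # context -> offset b
        self.logvar = nn.Parameter(torch.zeros(z_dim))   # diagonal noise covariance
        self.h_dim, self.z_dim = h_dim, z_dim

    def sample(self, ctx: torch.Tensor) -> torch.Tensor:
        batch = ctx.size(0)
        A = self.proj_A(ctx).view(batch, self.z_dim, self.h_dim)
        b = self.proj_b(ctx)
        h = torch.randn(batch, self.h_dim, device=ctx.device)  # h ~ N(0, I)
        noise = torch.randn(batch, self.z_dim, device=ctx.device) * (0.5 * self.logvar).exp()
        return torch.einsum("bzh,bh->bz", A, h) + b + noise    # z = A h + b + eps

Because $z$ is a linear function of jointly Gaussian variables, $p(z \mid x)$ remains Gaussian with mean $b$ and covariance $A A^\top + \mathrm{diag}(e^{\text{logvar}})$, so KL terms against a Gaussian posterior stay tractable while the prior's mean and full covariance adapt to the context.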
