Few-Shot Dialogue Generation Without Annotated Data: A Transfer Learning Approach

Learning with minimal data is one of the key challenges in the development of practical, production-ready goal-oriented dialogue systems. In a real-world enterprise setting, where dialogue systems are developed rapidly and are expected to work robustly for an ever-growing variety of domains, products, and scenarios, efficient learning from a limited number of examples becomes indispensable. In this paper, we introduce a technique that achieves state-of-the-art dialogue generation performance in a few-shot setup without using any annotated data. We do this by leveraging background knowledge from a larger, more highly represented dialogue source, namely the MetaLWOz dataset. We evaluate our model on the Stanford Multi-Domain Dialogue Dataset, which consists of human-human goal-oriented dialogues in the in-car navigation, appointment scheduling, and weather information domains. We show that our few-shot approach achieves state-of-the-art results on that dataset, consistently outperforming the previous best model in BLEU and Entity F1 scores while being more data-efficient, since it requires no data annotation.
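The transfer-learning recipe the abstract describes, training one response-generation model first on a large background dialogue corpus and then on a handful of target-domain dialogues, can be sketched as follows. This is a minimal illustration in PyTorch with placeholder model sizes, hyperparameters, and random stand-in batches; it is not the authors' actual architecture or configuration.

```python
import torch
import torch.nn as nn

PAD, VOCAB, HIDDEN = 0, 1000, 256  # placeholder vocabulary and model sizes

class Seq2Seq(nn.Module):
    """Minimal GRU encoder-decoder over token ids (illustrative architecture)."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, HIDDEN, padding_idx=PAD)
        self.enc = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.dec = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, VOCAB)

    def forward(self, src, tgt):
        _, h = self.enc(self.emb(src))           # encode the dialogue context
        dec_out, _ = self.dec(self.emb(tgt), h)  # teacher-forced decoding
        return self.out(dec_out)

def train(model, batches, epochs, lr):
    """One training phase; called twice so both phases update the same weights."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)
    for _ in range(epochs):
        for src, tgt in batches:
            logits = model(src, tgt[:, :-1])     # predict each next response token
            loss = loss_fn(logits.reshape(-1, VOCAB), tgt[:, 1:].reshape(-1))
            opt.zero_grad()
            loss.backward()
            opt.step()

def fake_batches(n):
    """Random (context, response) token-id batches standing in for real data."""
    return [(torch.randint(1, VOCAB, (8, 20)), torch.randint(1, VOCAB, (8, 12)))
            for _ in range(n)]

model = Seq2Seq()
# Phase 1: background training on the large corpus (MetaLWOz in the paper).
train(model, fake_batches(100), epochs=3, lr=1e-3)
# Phase 2: few-shot adaptation on a handful of target-domain dialogues (SMD).
train(model, fake_batches(2), epochs=20, lr=1e-4)
```

The key point is that both phases share a single set of parameters, so the few-shot phase starts from representations learned on the background dialogues rather than from scratch, and no turn-level annotations are needed in either phase.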
