Continual Dialogue State Tracking via Example-Guided Question Answering

Dialogue systems are frequently updated to accommodate new services, but naively updating them by continually training with data for new services results in diminishing performance on previously learnt services. Motivated by the insight that dialogue state tracking (DST), a crucial component of dialogue systems that estimates the user's goal as a conversation proceeds, is a simple natural language understanding task, we propose reformulating it as a bundle of granular example-guided question answering tasks to minimize the task shift between services and thus benefit continual learning. Our approach alleviates service-specific memorization and teaches a model to contextualize the given question and example to extract the necessary information from the conversation. We find that a model with just 60M parameters can achieve a significant boost by learning to learn from in-context examples retrieved by a retriever trained to identify turns with similar dialogue state changes. Combining our method with dialogue-level memory replay, our approach attains state-of-the-art performance on DST continual learning metrics without relying on any complex regularization or parameter expansion methods.
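To make the reformulation concrete, the sketch below shows one plausible way a single slot of a dialogue turn could be serialized as an example-guided question answering input for a small seq2seq model (e.g., a ~60M-parameter T5-small). This is a minimal illustration under assumed data structures and field names, not the authors' released implementation; the markers, the `Turn` fields, and `build_input` are all hypothetical.

```python
# Minimal sketch (assumptions, not the paper's code) of serializing one slot of a
# DST turn as example-guided question answering: a retrieved in-context example
# with its answer, followed by the target turn's question without an answer.

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    history: List[str]          # dialogue so far, alternating user/system utterances
    slot_question: str          # natural-language question for one slot, e.g. "What area is the restaurant in?"
    value: Optional[str]        # gold slot value, or None if the slot is not mentioned


def render(turn: Turn, with_answer: bool) -> str:
    """Flatten a turn into a context/question (and optionally answer) string."""
    context = " ".join(turn.history)
    parts = [f"[context] {context}", f"[question] {turn.slot_question}"]
    if with_answer:
        parts.append(f"[answer] {turn.value if turn.value is not None else 'NONE'}")
    return " ".join(parts)


def build_input(target: Turn, example: Turn) -> str:
    """Prepend a retrieved example (ideally one with a similar dialogue state
    change) so the model learns to copy the answering pattern from the example
    rather than memorize service-specific behavior."""
    return f"[example] {render(example, with_answer=True)} [target] {render(target, with_answer=False)}"
```

In training, `example` would be drawn from a replay memory or the current service's data by a retriever scoring similarity of dialogue state changes; at inference, the model generates the answer string (or "NONE") for the target question, and per-slot answers are aggregated into the dialogue state.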
