Paper Information - Unsupervised Learning of KB Queries in Task Oriented Dialogs

Task-oriented dialog (TOD) systems converse with users to accomplish a specific task. Doing so requires the system to query a knowledge base (KB) and use the retrieved results to fulfil user needs. Predicting the KB queries is crucial: an incorrect query can lead to severe under-performance. KB queries are usually annotated in real-world datasets and are learnt using supervised approaches to achieve acceptable task completion. This need for query annotations prevents TOD systems from easily adapting to new domains. In this paper, we propose a novel problem of learning end-to-end TOD systems using dialogs that do not contain KB query annotations. Our approach first learns to predict the KB queries using reinforcement learning (RL) and then learns the end-to-end system using the predicted queries. However, predicting the correct query in TOD systems is uniquely plagued by correlated attributes: due to data bias, certain attributes always occur together in the KB. This prevents the RL system from generalising, and accuracy suffers as a result. We propose Correlated Attributes Resilient RL (CARRL), a modification to the RL gradient estimation, which mitigates the problem of correlated attributes and predicts KB queries better than existing weakly supervised approaches. Finally, we compare the performance of our end-to-end system trained using predicted queries to a system trained using annotated gold queries.
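The correlated-attributes problem described in the abstract can be illustrated with a small sketch. This is not the CARRL method, whose details the abstract does not give. The toy KB, the helper names run_query and reward, and the Jaccard-overlap reward are all assumptions made for illustration. The sketch shows only why a reward computed from retrieved results cannot separate the intended query from a spurious one when two attributes always co-occur in the KB.

```python
# Toy KB (hypothetical data): every Italian restaurant happens to be in the
# north, so "cuisine" and "area" are perfectly correlated.
KB = [
    {"name": "roma", "cuisine": "italian", "area": "north", "price": "cheap"},
    {"name": "pisa", "cuisine": "italian", "area": "north", "price": "expensive"},
    {"name": "delhi", "cuisine": "indian", "area": "south", "price": "cheap"},
    {"name": "agra", "cuisine": "indian", "area": "south", "price": "expensive"},
]


def run_query(kb, query):
    """Return the names of rows matching every attribute=value pair in query."""
    return {
        row["name"]
        for row in kb
        if all(row[attr] == value for attr, value in query.items())
    }


def reward(kb, query, mentioned):
    """Assumed weak-supervision reward: Jaccard overlap between the retrieved
    entities and the entities the system mentions later in the dialog."""
    results = run_query(kb, query)
    union = results | mentioned
    return len(results & mentioned) / len(union) if union else 0.0


# The user asked for Italian food; the system later mentions these entities.
mentioned = {"roma", "pisa"}

gold = {"cuisine": "italian"}    # the query the user actually intended
spurious = {"area": "north"}     # never requested, but correlated in the KB
unrelated = {"price": "cheap"}   # retrieves a genuinely different result set

for name, query in [("gold", gold), ("spurious", spurious), ("unrelated", unrelated)]:
    print(name, sorted(run_query(KB, query)), round(reward(KB, query, mentioned), 3))
```

Under these assumptions the gold and spurious queries retrieve identical result sets and therefore receive identical rewards, so a policy trained on this signal alone has no reason to prefer the attribute the user actually requested. The unrelated query retrieves a different set and is penalised as expected.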

Mausam | Nikhil Gupta | Dinesh Raghu
