Paper Information - Learning End-to-End Goal-Oriented Dialog with Maximal User Task Success and Minimal Human Agent Use

Neural end-to-end goal-oriented dialog systems have shown promise in reducing the workload of human agents in customer service, as well as in reducing wait times for users. However, their inability to handle new user behavior at deployment has limited their use in the real world. In this work, we propose an end-to-end trainable method for neural goal-oriented dialog systems that handles new user behaviors at deployment by intelligently transferring the dialog to a human agent. The proposed method has three goals: 1) maximize the user’s task success by transferring to human agents, 2) minimize the load on the human agents by transferring to them only when it is essential, and 3) learn online from the human agent’s responses to reduce the human agents’ load further. We evaluate our proposed method on a modified-bAbI dialog task, which simulates the scenario of new user behaviors occurring at test time. Experimental results show that our proposed method is effective in achieving the desired goals.

Jatin Ganhotra | Janarthanan Rajendran | Lazaros C. Polymenakos
