An Assessment Framework for DialPort

Collecting a large amount of real human-computer interaction data across various domains is a cornerstone of developing better data-driven spoken dialog systems. The DialPort project is creating a portal that gathers a constant stream of real user conversational data on a variety of topics. To keep real users attracted to DialPort, it is crucial to develop a robust evaluation framework that monitors and maintains high performance. Unlike earlier spoken dialog systems, DialPort brings together a heterogeneous set of spoken dialog systems under one outward-looking agent. To assess this new structure, we have identified the unique challenges DialPort will encounter in appealing to real users and have created a novel evaluation scheme that quantitatively assesses its performance in these situations. We consider assessment from the point of view of the system developer as well as that of the end user.