A Reinforcement Learning approach to evaluating state representations in spoken dialogue systems

Although dialogue systems have been an area of research for decades, finding accurate ways of evaluating them is still a very active subfield, since leading methods such as task completion rate or user satisfaction each capture different aspects of the end-to-end human-computer dialogue interaction. In this work, we shift the focus from evaluating a complete dialogue system to presenting metrics for evaluating one of its internal components: the dialogue manager. Specifically, we investigate how to create and evaluate the best state space representations for a Reinforcement Learning model to learn an optimal dialogue control strategy. We present three metrics for evaluating the impact of different state models and demonstrate their use in the domain of a spoken dialogue tutoring system by comparing the relative utility of adding three features to a model of user, or student, state. The motivation for this work is that knowing which features are best to use allows one to construct a better dialogue manager, and thus better-performing dialogue systems.

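As a concrete illustration of the kind of comparison this entails, the sketch below trains a tabular Q-learning policy on a toy tutoring-dialogue simulator under two state representations (with and without a student-certainty feature) and compares them by the expected cumulative reward of the resulting policies. This is a minimal sketch only: the states, actions, transition probabilities, rewards, and the effect of the certainty feature are all invented for illustration and are not the paper's corpus, features, or metrics.

```python
# Hypothetical sketch (not the paper's implementation): compare two state
# representations for a toy tutoring-dialogue simulator by the expected
# cumulative reward of the policy each representation supports. All states,
# actions, probabilities, and rewards below are invented for illustration.
import random
from collections import defaultdict

ACTIONS = ["ask", "hint", "explain"]

def simulate_turn(state, action):
    """Toy environment: reward 1 if the simulated student answers correctly."""
    correctness, certainty = state
    p_correct = 0.4
    if action == "hint" and certainty == "uncertain":
        p_correct = 0.7   # in this toy world, hints help uncertain students most
    if action == "explain" and correctness == "incorrect":
        p_correct = 0.6
    reward = 1.0 if random.random() < p_correct else 0.0
    next_state = ("correct" if reward else "incorrect",
                  random.choice(["certain", "uncertain"]))
    return next_state, reward

def project(state, use_certainty):
    """Map the full simulator state onto the chosen state representation."""
    return state if use_certainty else (state[0],)

def q_learn(use_certainty, episodes=5000, alpha=0.1, gamma=0.9, eps=0.1):
    """Tabular Q-learning over 5-turn dialogues."""
    Q = defaultdict(float)
    for _ in range(episodes):
        state = ("incorrect", "uncertain")
        for _ in range(5):
            s = project(state, use_certainty)
            if random.random() < eps:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: Q[(s, act)])
            next_state, r = simulate_turn(state, a)
            s_next = project(next_state, use_certainty)
            best_next = max(Q[(s_next, act)] for act in ACTIONS)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            state = next_state
    return Q

def expected_cumulative_reward(Q, use_certainty, trials=2000):
    """Average total reward per dialogue under the greedy policy from Q."""
    total = 0.0
    for _ in range(trials):
        state = ("incorrect", "uncertain")
        for _ in range(5):
            s = project(state, use_certainty)
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
            state, r = simulate_turn(state, a)
            total += r
    return total / trials

for use_certainty in (False, True):
    Q = q_learn(use_certainty)
    ecr = expected_cumulative_reward(Q, use_certainty)
    print(f"certainty feature: {use_certainty}, expected cumulative reward: {ecr:.2f}")
```

In this toy setup the richer representation typically earns a higher expected cumulative reward, because its policy can reserve hints for uncertain students; the coarser representation cannot express that distinction. The paper's metrics ask an analogous question about real state features learned from tutoring dialogue data.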