Spoken language interaction with model uncertainty: an adaptive human–robot interaction system

Spoken language is one of the most intuitive forms of interaction between humans and agents. Unfortunately, agents that interact with people using natural language often experience communication errors and do not correctly understand the user's intentions. Recent systems have successfully used probabilistic models of speech, language and user behaviour to achieve robust dialogue performance in the presence of noisy speech recognition and ambiguous language choices, but decisions made using these probabilistic models are still prone to errors owing to the complexity of acquiring and maintaining a complete model of human language and behaviour. In this paper, a decision-theoretic model for human–robot interaction using natural language is described. The algorithm is based on the Partially Observable Markov Decision Process (POMDP), which allows agents to choose actions that are robust not only to uncertainty from noisy or ambiguous speech recognition but also to unknown user models. As in most dialogue systems, a POMDP is defined by a large number of parameters that may be difficult to specify a priori from domain knowledge, and learning these parameters from the user may require an unacceptably long training period. An extension to the POMDP model is described that allows the agent to acquire a linguistic model of the user online, including new vocabulary and word choice preferences. The approach not only avoids a training period of constant questioning as the agent learns, but also allows the agent to actively query for additional information when its uncertainty suggests a high risk of mistakes. The approach is demonstrated both in simulation and on a natural language interaction system for a robotic wheelchair application.
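To make the core mechanism concrete, the sketch below shows a minimal POMDP-style belief update over hidden user intents given noisy speech observations, together with a simple confidence-gated policy that queries the user when uncertainty suggests a high risk of mistakes. This is an illustration under assumed placeholder values, not the paper's actual model: the intent set, observation probabilities, and the `CONFIRM_THRESHOLD` constant are all hypothetical.

```python
import numpy as np

# Hypothetical user intents for a robotic-wheelchair dialogue manager.
states = ["go_kitchen", "go_bedroom", "go_bathroom"]
n = len(states)

# Transition model T[s, s']: intents are assumed static between turns,
# so the transition model is the identity (an illustrative simplification).
T = np.eye(n)

# Observation model O[o, s']: probability of the recognizer returning
# word o when the true intent is s'. Values are illustrative only.
O = np.array([
    [0.7, 0.2, 0.1],   # recognized "kitchen"
    [0.2, 0.6, 0.2],   # recognized "bedroom"
    [0.1, 0.2, 0.7],   # recognized "bathroom"
])

def belief_update(b, obs_idx):
    """Bayes filter: b'(s') is proportional to O(o|s') * sum_s T(s,s') b(s)."""
    predicted = T.T @ b              # prediction step (identity here)
    unnorm = O[obs_idx] * predicted  # weight by observation likelihood
    return unnorm / unnorm.sum()     # normalize to a distribution

b = np.ones(n) / n                   # start from a uniform belief
b = belief_update(b, obs_idx=0)      # recognizer heard "kitchen"
print(dict(zip(states, b.round(3))))

# Confidence-gated action selection: commit to the most likely intent
# only when belief mass exceeds a (hypothetical) threshold; otherwise
# ask a clarifying question rather than risk a wrong action.
CONFIRM_THRESHOLD = 0.9
action = states[int(b.argmax())] if b.max() > CONFIRM_THRESHOLD else "ask_clarify"
print(action)
```

In a full POMDP dialogue manager the threshold rule would be replaced by a policy computed over beliefs, and the paper's extension additionally treats the observation-model parameters themselves as uncertain and learned online; the sketch only conveys the belief-tracking and ask-versus-act trade-off.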
