Optimizing Dialogue Strategy Learning Using Learning Automata

Statistical modeling of dialogue management in the design of spoken dialogue systems is a growing research area. This paper presents an adaptive learning approach to optimizing dialogue strategy. At the core of our system is a method that formalizes dialogue management as a sequential decision-making problem under uncertainty whose underlying probabilistic structure is a Markov chain. Researchers have mostly focused on model-free algorithms for automating the design of dialogue management with machine learning techniques such as reinforcement learning. Model-free algorithms, however, face a dilemma in balancing exploration against exploitation. We therefore present a model-based online policy learning algorithm that uses interconnected learning automata to optimize dialogue strategy. The proposed algorithm derives an optimal policy that prescribes which action to take in each state of the conversation so as to maximize the expected total reward for attaining the goal, and its updates incorporate both exploration and exploitation to improve the naturalness of human-computer interaction. We test the proposed approach with the PARADISE evaluation framework on a task of accessing a railway information system.

Keywords: Dialogue management, Learning automata, Reinforcement learning, Spoken dialogue system
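As a rough illustration of the kind of learning automaton the abstract refers to, the sketch below keeps one automaton per dialogue state and applies the standard linear reward-inaction update. It is a minimal sketch under stated assumptions, not the paper's algorithm: the class and function names, the two state names, the learning rate, and the success probabilities are all hypothetical, and the automata here are independent (the interconnection through state transitions and the model-based part are omitted).

```python
import random


class LearningAutomaton:
    """One automaton holding a probability vector over its actions."""

    def __init__(self, n_actions, learning_rate=0.05):
        self.probs = [1.0 / n_actions] * n_actions  # start uniform
        self.learning_rate = learning_rate

    def choose(self, rng):
        """Sample an action index according to the current probabilities."""
        return rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def update(self, action, reward):
        """Linear reward-inaction update; reward is assumed to lie in [0, 1].

        The chosen action gains probability in proportion to the reward,
        the others lose it proportionally, and a zero reward changes nothing.
        """
        step = self.learning_rate * reward
        for a in range(len(self.probs)):
            if a == action:
                self.probs[a] += step * (1.0 - self.probs[a])
            else:
                self.probs[a] -= step * self.probs[a]


def run_toy_dialogue(episodes=2000, seed=0):
    """Train one automaton per (hypothetical) dialogue state."""
    rng = random.Random(seed)
    # Hypothetical per-state success probability of each action (illustrative only).
    success = {"greet": [0.2, 0.8], "confirm": [0.9, 0.3]}
    automata = {state: LearningAutomaton(2) for state in success}
    for _ in range(episodes):
        for state, automaton in automata.items():
            action = automaton.choose(rng)
            reward = 1.0 if rng.random() < success[state][action] else 0.0
            automaton.update(action, reward)
    return {state: automaton.probs for state, automaton in automata.items()}


if __name__ == "__main__":
    for state, probs in run_toy_dialogue().items():
        print(state, [round(p, 3) for p in probs])
```

Because actions are sampled from the probability vector itself, exploration shrinks gradually as the vector concentrates on rewarded actions, which is one simple way an automaton-based learner can trade exploration against exploitation.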
