Optimizing Dialogue Strategy in Large-Scale Spoken Dialogue System: A Learning Automaton Based Approach

The application of statistical methods to modeling dialogue strategy in spoken dialogue systems is a growing research area. Reinforcement learning is a promising technique for building a dialogue management component that accepts semantic features of the current dialogue state and seeks the best action given those features. In practice, as the number of dialogue states increases, memory and processing requirements grow, and exhaustive search techniques such as dynamic programming lead to sub-optimal solutions. Hence, this paper investigates an adaptive policy iteration method using learning automata, which covers a large state-action space through a hierarchical organization of automata, to learn an optimal dialogue strategy. The proposed approach has clear advantages over baseline reinforcement learning algorithms: it learns faster, exploits well in its updates, and scales to larger problems.
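To make the idea of hierarchically organized learning automata concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the standard linear reward-inaction (L_R-I) update, which the abstract does not specify, and a two-level hierarchy in which a root automaton picks a group of actions and a child automaton picks an action within that group. All names (`LearningAutomaton`, `HierarchicalAutomata`, `train`) and the toy reward table are hypothetical.

```python
import random


class LearningAutomaton:
    """Finite-action automaton with a linear reward-inaction (L_R-I) update.

    Assumption: the paper's actual update scheme is not given in the abstract.
    """

    def __init__(self, n_actions, learning_rate=0.1):
        # Start from a uniform action-probability vector.
        self.probs = [1.0 / n_actions] * n_actions
        self.learning_rate = learning_rate

    def choose(self, rng):
        # Sample an action index according to the current probabilities.
        return rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def update(self, action, reward):
        # reward is in [0, 1]; a reward of 0 leaves the vector unchanged.
        step = self.learning_rate * reward
        for i in range(len(self.probs)):
            if i == action:
                self.probs[i] += step * (1.0 - self.probs[i])
            else:
                self.probs[i] -= step * self.probs[i]


class HierarchicalAutomata:
    """Two-level hierarchy: the root selects a group, a child selects an action."""

    def __init__(self, n_groups, n_actions_per_group, learning_rate=0.1):
        self.root = LearningAutomaton(n_groups, learning_rate)
        self.children = [
            LearningAutomaton(n_actions_per_group, learning_rate)
            for _ in range(n_groups)
        ]

    def choose(self, rng):
        group = self.root.choose(rng)
        action = self.children[group].choose(rng)
        return group, action

    def update(self, group, action, reward):
        # Every automaton on the selected path receives the same reward signal.
        self.root.update(group, reward)
        self.children[group].update(action, reward)


def train(hierarchy, reward_prob, steps, rng):
    """Run the hierarchy against a toy stochastic environment.

    reward_prob[group][action] is a hypothetical probability of reward 1.
    """
    for _ in range(steps):
        group, action = hierarchy.choose(rng)
        reward = 1.0 if rng.random() < reward_prob[group][action] else 0.0
        hierarchy.update(group, action, reward)
    return hierarchy


if __name__ == "__main__":
    rng = random.Random(0)
    toy_rewards = [[0.2, 0.3], [0.4, 0.9]]  # hypothetical environment
    learned = train(HierarchicalAutomata(2, 2), toy_rewards, 5000, rng)
    print(learned.root.probs, [c.probs for c in learned.children])
```

In this sketch each automaton stores only one probability vector over its own choices, so memory grows with the number of nodes in the hierarchy rather than with a flat table over every state-action pair. That is one plausible reading of the scalability claim above, under the stated assumptions.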
