Reinforcement learning automata approach to optimize dialogue strategy in large state spaces

The application of machine learning techniques to the design of dialogue strategies is a growing research area. Most reinforcement learning methods use a tabular representation to learn the value of taking each action from each possible state in order to maximize the total reward. In large state spaces this raises several difficulties, including large tables, the need to account for prior knowledge, and data sparsity. This paper investigates the performance of an online, policy-iterative reinforcement learning automata approach that handles large state spaces by organizing automata hierarchically to learn an optimal dialogue strategy. The results were compared with flat reinforcement learning methods and show that the proposed method learns faster and scales to larger problems.
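
To make the idea of hierarchically organized automata concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes a linear reward-inaction (L_R-I) update, a two-level hierarchy, and a toy reward function; the names `LearningAutomaton`, `toy_reward`, and `train`, the learning rate, and the rewarded (branch, action) pair are all hypothetical choices made for illustration.

```python
import random


class LearningAutomaton:
    """Automaton holding a probability vector over a fixed set of actions."""

    def __init__(self, n_actions, learning_rate=0.1):
        self.probs = [1.0 / n_actions] * n_actions
        self.learning_rate = learning_rate

    def choose(self, rng):
        # Sample an action index according to the current probabilities.
        return rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def update(self, action, reward):
        # Linear reward-inaction (assumed scheme): shift probability mass
        # toward the chosen action in proportion to a reward in [0, 1];
        # a reward of 0 leaves the probabilities unchanged.
        step = self.learning_rate * reward
        for i in range(len(self.probs)):
            if i == action:
                self.probs[i] += step * (1.0 - self.probs[i])
            else:
                self.probs[i] -= step * self.probs[i]


def toy_reward(branch, action):
    # Hypothetical environment: only branch 1 with action 2 is rewarded.
    return 1.0 if (branch, action) == (1, 2) else 0.0


def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    root = LearningAutomaton(n_actions=2)  # picks a child automaton
    children = [LearningAutomaton(n_actions=3) for _ in range(2)]
    for _ in range(episodes):
        branch = root.choose(rng)
        action = children[branch].choose(rng)
        reward = toy_reward(branch, action)
        # Every automaton on the chosen path receives the same reward.
        root.update(branch, reward)
        children[branch].update(action, reward)
    return root, children


if __name__ == "__main__":
    root, children = train()
    print("root:", [round(p, 3) for p in root.probs])
    print("child 1:", [round(p, 3) for p in children[1].probs])
```

In this sketch each automaton stores only a probability vector over its own actions, so memory grows with the number of automata on the hierarchy rather than with a full state-action table. That is the intuition behind the scalability claim, under the assumptions stated above.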
