Particle swarm optimisation of spoken dialogue system strategies

Dialogue management optimisation has long been cast as a problem of planning under uncertainty. Some methods, such as Reinforcement Learning (RL), are now part of the state of the art. Whatever the solving method, strong assumptions are made about the properties of the dialogue system. For instance, RL assumes that the dialogue state space is Markovian. Satisfying such constraints may require substantial engineering work. This paper introduces a more general approach based on fewer modelling assumptions. A Black Box Optimisation (BBO) method, more precisely Particle Swarm Optimisation (PSO), is used to solve the control problem. In addition, PSO can exploit the parallel nature of optimising a system online while many users are calling at the same time. Some preliminary results are presented.
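
To make the black-box view concrete, here is a minimal sketch of a generic global-best PSO loop in Python. It is not the paper's implementation: the function names, the inertia and acceleration coefficients, the box bounds, and the toy objective `toy_dialogue_return` (a hypothetical stand-in for the average return of dialogues run with policy parameters `theta`) are all assumptions made for illustration. In an online setting each particle's evaluation would presumably come from real calls, which is where the parallelism mentioned above would enter.

```python
import random


def pso_maximise(objective, dim, n_particles=20, n_iters=100,
                 inertia=0.72, c_personal=1.49, c_global=1.49,
                 bounds=(-1.0, 1.0), seed=0):
    """Generic global-best PSO; the objective is treated as a black box."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Each particle is one candidate parameter vector.
    positions = [[rng.uniform(lo, hi) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests, initialised with the starting positions.
    best_pos = [p[:] for p in positions]
    best_val = [objective(p) for p in positions]
    g = max(range(n_particles), key=lambda i: best_val[i])
    global_pos, global_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (best_pos[i][d] - positions[i][d])
                    + c_global * r2 * (global_pos[d] - positions[i][d])
                )
                # Move, then clamp to the search box.
                moved = positions[i][d] + velocities[i][d]
                positions[i][d] = min(hi, max(lo, moved))
            # Evaluations are independent across particles, so they could
            # be run in parallel (e.g. one particle per concurrent caller).
            value = objective(positions[i])
            if value > best_val[i]:
                best_val[i], best_pos[i] = value, positions[i][:]
                if value > global_val:
                    global_val, global_pos = value, positions[i][:]
    return global_pos, global_val


def toy_dialogue_return(theta):
    """Hypothetical objective: peaks at an arbitrary target parameter vector."""
    target = [0.3, -0.5]
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


best_theta, best_return = pso_maximise(toy_dialogue_return, dim=2)
print(best_theta, best_return)
```

Note that the loop only ever calls `objective` on a parameter vector and compares the returned scores; it makes no Markov assumption about the dialogue state. A real dialogue return would be noisy, so each evaluation would likely need to be averaged over several dialogues, which this sketch does not do.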
