Adaptive natural language generation in dialogue using reinforcement learning

This paper presents a new model for adaptive Natural Language Generation (NLG) in dialogue, showing how NLG problems can be approached as statistical planning problems using Reinforcement Learning. This approach brings a number of theoretical and practical benefits, such as fine-grained adaptation, generalization, and automatic (global) optimization. We present the model and related work in statistical/trainable NLG, discuss its applications, and provide a demonstration of the approach, showing policy learning for adaptive information presentation decisions (Contrast, Cluster, or List items). An adaptive NLG policy learned in our framework shows a statistically significant 27% relative increase in reward over an “RL-majority” baseline policy for the same task. We thereby also show that such NLG problems should be approached in combination with dialogue management decisions, and we show how to jointly optimize NLG and dialogue management plans.
