Reinforcement Learning of Multi-Issue Negotiation Dialogue Policies

We use reinforcement learning (RL) to learn a multi-issue negotiation dialogue policy. For training and evaluation, we build a hand-crafted agenda-based policy, which serves as the negotiation partner of the RL policy. Both the agenda-based and the RL policies are designed to work across a wide variety of negotiation settings and to perform well against negotiation partners whose behavior has not been observed before. We evaluate the two models by having them negotiate against each other under various settings. The learned model consistently outperforms the agenda-based model. We also ask human raters to judge the rationality of the two negotiators in transcripts of negotiations between the RL policy and the agenda-based policy. The RL policy is perceived as more rational than the agenda-based policy.
