Reinforcement Learning of Two-Issue Negotiation Dialogue Policies

We use hand-crafted simulated negotiators (SNs) to train and evaluate dialogue policies for two-issue negotiation between two agents. These SNs differ in their goals and in their use of strong and weak arguments to persuade their counterparts. They may also make irrational moves, i.e., moves that are not consistent with their goals, to generate a variety of negotiation patterns. Different versions of these SNs interact with each other to generate corpora for Reinforcement Learning (RL) of argumentation dialogue policies for each of the two agents. We evaluate the learned policies against hand-crafted SNs similar to the ones used for training, except that these SNs no longer make irrational moves and are thus harder to beat. The learned policies generally do as well as, or better than, the hand-crafted SNs, showing that RL can be successfully used for learning argumentation dialogue policies in two-issue negotiation scenarios.
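The idea of an SN that occasionally makes irrational moves can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class name `SimulatedNegotiator`, the issue and option names, the two-move repertoire (`accept`/`reject`), and the `irrational_prob` parameter do not come from the paper, which does not specify its SNs at this level of detail.

```python
import random


class SimulatedNegotiator:
    """Hypothetical hand-crafted negotiator for a two-issue negotiation.

    goals: maps each issue to the option this negotiator prefers.
    irrational_prob: chance of making a move that contradicts its goals.
    """

    def __init__(self, goals, irrational_prob, rng=None):
        self.goals = goals
        self.irrational_prob = irrational_prob
        self.rng = rng or random.Random()

    def respond(self, issue, offered_option):
        # The goal-consistent (rational) move: accept only the preferred option.
        rational = "accept" if self.goals[issue] == offered_option else "reject"
        # With probability irrational_prob, play the opposite move instead.
        if self.rng.random() < self.irrational_prob:
            return "reject" if rational == "accept" else "accept"
        return rational


# Assumed issue and option names, for illustration only.
goals = {"issue_a": "x", "issue_b": "y"}

# A noisy SN, as might be used to generate varied training dialogues.
training_sn = SimulatedNegotiator(goals, irrational_prob=0.2, rng=random.Random(0))

# A strictly rational SN, as in the evaluation setting described above.
evaluation_sn = SimulatedNegotiator(goals, irrational_prob=0.0)
```

Setting `irrational_prob` to zero mirrors the evaluation condition in the abstract, where the SNs no longer make irrational moves and are therefore harder to beat.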