Deal or No Deal? End-to-End Learning of Negotiation Dialogues

Much of human dialogue occurs in semi-cooperative settings, where agents with different goals attempt to agree on common decisions. Negotiations require complex communication and reasoning skills, but success is easy to measure, making this an interesting task for AI. We gather a large dataset of human-human negotiations on a multi-issue bargaining task, where agents who cannot observe each other’s reward functions must reach an agreement (or a deal) via natural language dialogue. For the first time, we show it is possible to train end-to-end models for negotiation, which must learn both linguistic and reasoning skills with no annotated dialogue states. We also introduce dialogue rollouts, in which the model plans ahead by simulating possible complete continuations of the conversation, and find that this technique dramatically improves performance. Our code and dataset are publicly available.
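The rollout idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function `choose_by_rollouts`, the `simulate` callback, the `TOY_OFFERS` table, and all payoffs and acceptance probabilities are hypothetical stand-ins. In the paper's setting, `simulate` would sample a complete continuation of the conversation from the learned model and return the reward of the resulting deal.

```python
import random
from typing import Callable, Dict, List, Tuple


def choose_by_rollouts(
    candidates: List[str],
    simulate: Callable[[str, random.Random], float],
    n_rollouts: int = 10,
    seed: int = 0,
) -> str:
    """Return the candidate utterance with the highest mean simulated reward."""
    if not candidates:
        raise ValueError("need at least one candidate utterance")
    rng = random.Random(seed)
    best, best_score = candidates[0], float("-inf")
    for utterance in candidates:
        # Each simulate() call stands for one complete continuation of the
        # dialogue after `utterance`, scored by the reward at its end.
        mean = sum(simulate(utterance, rng) for _ in range(n_rollouts)) / n_rollouts
        if mean > best_score:
            best, best_score = utterance, mean
    return best


# Hypothetical toy environment: (payoff if accepted, acceptance probability).
TOY_OFFERS: Dict[str, Tuple[float, float]] = {
    "i want everything": (10.0, 0.1),
    "i take the books, you take the rest": (6.0, 0.8),
    "you can have it all": (0.0, 1.0),
}


def toy_simulate(utterance: str, rng: random.Random) -> float:
    # Assumption for illustration only: a greedier offer pays more but is
    # accepted less often; a rejected offer yields zero reward.
    payoff, accept_prob = TOY_OFFERS[utterance]
    return payoff if rng.random() < accept_prob else 0.0


if __name__ == "__main__":
    print(choose_by_rollouts(list(TOY_OFFERS), toy_simulate, n_rollouts=200))
```

Averaging over several simulated continuations, rather than scoring a single one, reduces the variance of the estimate for each candidate; the number of rollouts trades computation for a more reliable choice.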

Yann Dauphin | Dhruv Batra | Mike Lewis | Denis Yarats | Devi Parikh
