Paper information - Multi-agent Communication meets Natural Language: Synergies between Functional and Structural Language Learning


We present a method for combining multi-agent communication with traditional data-driven approaches to natural language learning, with the end goal of teaching agents to communicate with humans in natural language. Our starting point is a language model trained on generic, not task-specific, language data. We then place this model in a multi-agent self-play environment that generates task-specific rewards, which are used to adapt or modulate the model, turning it into a task-conditional language model. We introduce a new way of combining the two types of learning, based on reranking language model samples, and show that this method outperforms others at communicating with humans in a visual referential communication task. Finally, we present a taxonomy of the different types of language drift that can occur, together with a set of measures to detect them.
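The reranking idea can be illustrated with a small toy sketch. This is only one possible reading of "reranking language model samples", not the paper's implementation: the language model is replaced by a hand-written probability table, the task reward by a word-overlap "listener", and every name here (`LM_PROBS`, `sample_candidates`, `listener_log_prob`, `rerank`, the `weight` parameter) is hypothetical.

```python
import math
import random

# Hypothetical stand-in for a generic language model: message -> probability.
LM_PROBS = {"a ball": 0.40, "a red ball": 0.35, "a blue ball": 0.25}

# Hypothetical referential task: the speaker must single out the target image.
TARGET_ATTRS = {"red", "ball"}
DISTRACTOR_ATTRS = {"blue", "ball"}


def sample_candidates(lm_probs, num_samples, rng):
    """Draw candidate messages from the toy language model."""
    messages = list(lm_probs)
    weights = [lm_probs[m] for m in messages]
    return rng.choices(messages, weights=weights, k=num_samples)


def listener_log_prob(message, target_attrs, distractor_attrs):
    """Toy task reward: log-probability that a word-overlap listener picks the target."""
    words = set(message.split())
    target_score = len(words & target_attrs)
    distractor_score = len(words & distractor_attrs)
    # Log-softmax over the two images.
    return target_score - math.log(math.exp(target_score) + math.exp(distractor_score))


def rerank(candidates, lm_probs, task_score, weight=1.0):
    """Order unique candidates by log p_LM(message) + weight * task_score(message), best first."""
    unique = list(dict.fromkeys(candidates))  # de-duplicate, keep first-seen order

    def combined(message):
        return math.log(lm_probs[message]) + weight * task_score(message)

    return sorted(unique, key=combined, reverse=True)


def task_score(message):
    """Task reward for the fixed toy target and distractor."""
    return listener_log_prob(message, TARGET_ATTRS, DISTRACTOR_ATTRS)


# Usage: sample from the generic model, then let the task reward pick among the samples.
rng = random.Random(0)
samples = sample_candidates(LM_PROBS, num_samples=20, rng=rng)
ranked = rerank(samples, LM_PROBS, task_score)
print(ranked[0])
```

In this sketch the generic model alone prefers the ambiguous "a ball", while the combined score prefers a message that distinguishes the target. Because the candidates are always drawn from the language model, the output stays within what that model considers plausible language, which is the intuition behind using reranking to combine the two kinds of learning.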
