Emergence of Communication in an Interactive World with Consistent Speakers

Training agents to communicate with one another given only task-based supervision has recently attracted considerable attention, driven by growing interest in models for human-agent interaction. Prior work on the topic has focused on simple environments, where training with policy gradient was feasible despite the non-stationarity of the agents during training. In this paper, we present a more challenging environment for testing the emergence of communication from raw pixels, in which training with policy gradient fails. We propose a new model and training algorithm that exploit the structure of a learned representation space to produce more consistent speakers in the early phases of training, which stabilizes learning. We show empirically that our algorithm substantially improves performance compared to policy gradient. We also propose a new alignment-based metric for measuring context-independence in emergent communication, and find that our method yields more context-independent communication than policy gradient and other competitive baselines.
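Because the environment and model in the paper are pixel-based and specific to this work, the sketch below only illustrates the policy-gradient baseline that the abstract contrasts against: a one-symbol referential game trained end-to-end with REINFORCE from task reward alone. The tabular policies, vocabulary size, and learning-rate values are assumptions chosen for illustration, not the paper's architecture.

```python
# Minimal sketch (not the paper's model): a one-symbol referential game trained
# with REINFORCE, i.e., emergent communication from task-based supervision only.
# Object count, vocabulary size, and the tabular log-linear policies are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
N_OBJECTS, N_SYMBOLS = 5, 5      # assumed sizes
LR, EPISODES = 0.1, 20000

# Speaker maps an object to a symbol; listener maps a symbol to an object guess.
speaker_logits = np.zeros((N_OBJECTS, N_SYMBOLS))
listener_logits = np.zeros((N_SYMBOLS, N_OBJECTS))

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

baseline = 0.0  # running-average reward baseline to reduce gradient variance
for _ in range(EPISODES):
    target = rng.integers(N_OBJECTS)

    p_sym = softmax(speaker_logits[target])
    symbol = rng.choice(N_SYMBOLS, p=p_sym)

    p_obj = softmax(listener_logits[symbol])
    guess = rng.choice(N_OBJECTS, p=p_obj)

    reward = 1.0 if guess == target else 0.0   # task-based supervision only
    advantage = reward - baseline
    baseline += 0.01 * (reward - baseline)

    # REINFORCE for a softmax policy: grad log pi(a) = onehot(a) - probs.
    speaker_logits[target] += LR * advantage * (np.eye(N_SYMBOLS)[symbol] - p_sym)
    listener_logits[symbol] += LR * advantage * (np.eye(N_OBJECTS)[guess] - p_obj)

# After training, the speaker's argmax symbols typically form a near-bijective code.
print(speaker_logits.argmax(axis=1))
```

With small tabular policies and a reward baseline this estimator converges quickly; with raw-pixel observations and sequence-valued messages, the same estimator becomes high-variance and non-stationary, which is the failure mode the paper addresses by encouraging speaker consistency early in training.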
