Dynamic population-based meta-learning for multi-agent communication with natural language

In this work, our goal is to train agents that can coordinate with seen, unseen, and human partners in a multi-agent communication environment involving natural language. Previous work using a single set of agents has shown great progress in generalizing to known partners; however, it struggles when coordinating with unfamiliar agents. To mitigate this, recent work has explored population-based approaches, in which multiple agents interact with each other with the goal of learning more generic protocols. Although these methods can achieve good coordination between unseen partners, they do so only for simple languages, and thus fail to adapt to human partners using natural language. We attribute this to the use of static populations and instead propose a dynamic population-based meta-learning approach that builds such a population iteratively. We perform a holistic evaluation of our method on two different referential games and show that our agents outperform all prior work when communicating with seen partners and humans. Furthermore, we analyze the natural language generation skills of our agents and find that they also outperform strong baselines. Finally, we test the robustness of our agents when communicating with out-of-population agents and assess the importance of each component of our method through ablation studies.
