Capacity, Bandwidth, and Compositionality in Emergent Language Learning

Many recent works have discussed the propensity, or lack thereof, of emergent languages to exhibit properties of natural languages. A favorite property in the literature is compositionality. We note that most of those works have treated communicative bandwidth as the factor of primary importance; while important, it is not the only contributing factor. In this paper, we investigate the learning biases that affect the efficacy and compositionality of emergent languages. Our foremost contribution is to explore how the capacity of a neural network impacts its ability to learn a compositional language. We additionally introduce a set of evaluation metrics with which we analyze the learned languages. Our hypothesis is that there should be a specific range of model capacity and channel bandwidth that induces compositional structure in the resulting language and consequently encourages systematic generalization. While we empirically find evidence for the bottom of this range, we curiously find no evidence for the top of the range, and we believe this remains an open question for the community.
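
To make the two factors in the hypothesis concrete, the sketch below sets up a toy speaker/listener reconstruction game in which model capacity appears as the agents' hidden size and channel bandwidth as the vocabulary size and message length (vocab_size ** message_len distinct messages). This is a minimal illustration under assumed names and hyperparameters (Speaker, Listener, hidden_size, vocab_size, message_len, and the straight-through Gumbel-Softmax channel are all illustrative choices), not the architecture or training procedure used in the paper.

```python
# Minimal sketch (not the paper's exact setup): a speaker/listener pair for a
# reconstruction-style communication game, with the two knobs from the abstract
# made explicit -- model capacity (hidden_size) and channel bandwidth
# (vocab_size ** message_len possible messages). Names and hyperparameters
# here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Speaker(nn.Module):
    """Maps an attribute vector to a sequence of discrete symbols."""

    def __init__(self, n_attributes, hidden_size, vocab_size, message_len):
        super().__init__()
        self.message_len = message_len
        self.encoder = nn.Linear(n_attributes, hidden_size)
        self.rnn = nn.LSTMCell(vocab_size, hidden_size)
        self.to_vocab = nn.Linear(hidden_size, vocab_size)

    def forward(self, attributes, tau=1.0):
        h = torch.tanh(self.encoder(attributes))
        c = torch.zeros_like(h)
        symbol = torch.zeros(attributes.size(0), self.to_vocab.out_features)
        message = []
        for _ in range(self.message_len):
            h, c = self.rnn(symbol, (h, c))
            logits = self.to_vocab(h)
            # Straight-through Gumbel-Softmax keeps the channel discrete while
            # remaining differentiable (one common choice, not the only one).
            symbol = F.gumbel_softmax(logits, tau=tau, hard=True)
            message.append(symbol)
        return torch.stack(message, dim=1)  # (batch, message_len, vocab_size)


class Listener(nn.Module):
    """Reads the message and tries to reconstruct the attribute vector."""

    def __init__(self, n_attributes, hidden_size, vocab_size):
        super().__init__()
        self.rnn = nn.LSTM(vocab_size, hidden_size, batch_first=True)
        self.decoder = nn.Linear(hidden_size, n_attributes)

    def forward(self, message):
        _, (h, _) = self.rnn(message)
        return torch.sigmoid(self.decoder(h[-1]))


if __name__ == "__main__":
    n_attributes, hidden_size, vocab_size, message_len = 12, 64, 8, 4
    speaker = Speaker(n_attributes, hidden_size, vocab_size, message_len)
    listener = Listener(n_attributes, hidden_size, vocab_size)
    attrs = torch.randint(0, 2, (32, n_attributes)).float()
    recon = listener(speaker(attrs))
    loss = F.binary_cross_entropy(recon, attrs)
    print(f"channel size: {vocab_size ** message_len}, loss: {loss.item():.3f}")
```

Sweeping hidden_size at a fixed channel (or varying vocab_size and message_len at a fixed hidden_size) in a setup like this is one way to probe whether compositional structure emerges only within a particular capacity/bandwidth range.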
