Word-order Biases in Deep-agent Emergent Communication

Sequence-processing neural networks have led to remarkable progress on many NLP tasks. As a consequence, there has been increasing interest in understanding to what extent they process language as humans do. We aim here to uncover which biases such models display with respect to "natural" word-order constraints. We train models to communicate about paths in a simple gridworld, using miniature languages that reflect or violate various natural-language trends, such as the tendency to avoid redundancy or to minimize long-distance dependencies. We study how the controlled characteristics of our miniature languages affect individual learning and the languages' stability across multiple network generations. The results paint a mixed picture. On the one hand, neural networks show a strong tendency to avoid long-distance dependencies. On the other hand, there is no clear preference for the efficient, non-redundant encoding of information that is widely attested in natural language. We thus suggest injecting a notion of "effort" into neural networks as a possible way to make their linguistic behavior more human-like.
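To make the long-distance-dependency manipulation concrete, here is a minimal Python sketch of how gridworld paths could be verbalized under two contrasting miniature languages. The grammars, token inventory, and function names (encode_local, encode_long_distance) are illustrative assumptions, not the paper's actual languages: in the second variant, each count word is separated from the direction it modifies, creating exactly the kind of long-distance dependency the abstract describes.

```python
from typing import List, Tuple

# Assumed representation (not from the paper): a path is a sequence of
# (direction, repetition-count) segments, e.g. [("left", 3), ("up", 2)]
# for the move sequence LLLUU.
Path = List[Tuple[str, int]]

def encode_local(path: Path) -> List[str]:
    """Local dependencies: each direction word is immediately
    followed by its count ("left 3 up 2")."""
    tokens: List[str] = []
    for direction, count in path:
        tokens += [direction, str(count)]
    return tokens

def encode_long_distance(path: Path) -> List[str]:
    """Long-distance dependencies: all direction words first, then all
    counts ("left up 3 2"); the i-th count must be linked back to the
    i-th direction across the whole utterance."""
    directions = [d for d, _ in path]
    counts = [str(c) for _, c in path]
    return directions + counts

path: Path = [("left", 3), ("up", 2)]
print(encode_local(path))          # ['left', '3', 'up', '2']
print(encode_long_distance(path))  # ['left', 'up', '3', '2']
```

An agent learning the second variant must hold each direction in memory until its count finally arrives, which is the locality pressure the experiments probe.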
