Emergence of Pragmatics from Referential Game between Theory of Mind Agents

Pragmatics studies how context contributes to the meaning of language [1]. In human communication, language is never interpreted out of context, and sentences usually convey more information than their literal meanings [2]. However, this mechanism is missing in most multi-agent systems [3, 4, 5, 6], which restricts communication efficiency and the capability for human-agent interaction. In this paper, we propose an algorithm with which agents can spontaneously learn to "read between the lines" without any explicit hand-designed rules. We integrate theory of mind (ToM) [7, 8] into a cooperative multi-agent pedagogical setting and propose an adaptive reinforcement learning (RL) algorithm to develop a communication protocol. ToM is a concept from cognitive science which holds that people regularly reason about others' mental states, including beliefs, goals, and intentions, to gain an advantage in competition, cooperation, or coalition. With this ability, agents treat language not only as messages but also as rational acts that reflect others' hidden states. Our experiments demonstrate the advantage of pragmatic protocols over non-pragmatic protocols. We also show that the teaching complexity under the pragmatic protocol empirically approximates the recursive teaching dimension (RTD).
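
To make "reading between the lines" concrete, the following is a minimal sketch of pragmatic inference in a toy referential game. It is not the algorithm proposed in this paper: it uses a fixed, hand-written recursion in which a listener reasons about a speaker who in turn reasons about a literal listener, whereas the paper learns its protocol with RL. The objects, messages, and function names are all hypothetical and chosen only for illustration.

```python
# Toy referential game (hypothetical example, not the paper's setup).
# Three candidate objects and four one-word messages.
OBJECTS = ["blue square", "blue circle", "green square"]
MESSAGES = ["blue", "green", "square", "circle"]


def is_true(message, obj):
    """A message is literally true of an object if it names one of its features."""
    return message in obj.split()


def normalize(weights):
    """Scale non-negative weights so they sum to 1 (all zeros stay zeros)."""
    total = sum(weights)
    return [w / total if total > 0 else 0.0 for w in weights]


def literal_listener(message):
    """Non-pragmatic listener: uniform belief over objects the message is true of."""
    return normalize([1.0 if is_true(message, o) else 0.0 for o in OBJECTS])


def speaker(obj):
    """Speaker who prefers messages that lead a literal listener to the target."""
    target = OBJECTS.index(obj)
    return normalize([literal_listener(m)[target] for m in MESSAGES])


def pragmatic_listener(message):
    """Listener who models the speaker: asks which object would make the
    speaker most likely to choose this message (uniform prior over objects)."""
    m = MESSAGES.index(message)
    return normalize([speaker(o)[m] for o in OBJECTS])


if __name__ == "__main__":
    for name, listener in [("literal", literal_listener),
                           ("pragmatic", pragmatic_listener)]:
        beliefs = listener("blue")
        print(name, {o: round(p, 2) for o, p in zip(OBJECTS, beliefs)})
```

In this sketch the literal listener splits its belief evenly between the two blue objects on hearing "blue". The pragmatic listener leans toward the blue square, because a speaker who meant the blue circle would more likely have said "circle", which identifies it uniquely.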

[1]  De Weerd.  Estimating the use of higher-order theory of mind using computational agents , 2017 .

[2]  Michael C. Frank,et al.  Pragmatic Language Interpretation as Probabilistic Inference , 2022 .

[3]  Noah D. Goodman,et al.  Planning, Inference and Pragmatics in Sequential Language Games , 2018, TACL.

[4]  Neil Immerman,et al.  The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.

[5]  Hans Ulrich Simon,et al.  Recursive teaching dimension, VC-dimension and sample compression , 2014, J. Mach. Learn. Res.

[6]  Morten H. Christiansen,et al.  On The Evolutionary Origin of Symbolic Communication , 2016, Scientific Reports.

[7]  Stephen Clark,et al.  Emergence of Linguistic Communication from Referential Games with Symbolic and Pixel Input , 2018, ICLR.

[8]  Nando de Freitas,et al.  Compositional Obverter Communication Learning From Raw Visual Input , 2018, ICLR.

[9]  M. Tomasello,et al.  Does the chimpanzee have a theory of mind? 30 years later , 2008, Trends in Cognitive Sciences.

[10]  Reinhard Blutner,et al.  Some Aspects of Optimality in Natural Language Interpretation , 2000, J. Semant.

[11]  Krishnendu Chatterjee,et al.  Language acquisition with communication between learners , 2018, Journal of The Royal Society Interface.

[12]  Piotr J. Gmytrasiewicz,et al.  Learning Others' Intentional Models in Multi-Agent Settings Using Interactive POMDPs , 2017, MAICS.

[13]  Xi Chen,et al.  On the Recursive Teaching Dimension of VC Classes , 2016, NIPS.

[14]  Siobhan Chapman.  Logic and Conversation , 2005 .

[15]  Jonathan Berant,et al.  Emergence of Communication in an Interactive World with Consistent Speakers , 2018, ArXiv.

[16]  Noah D. Goodman,et al.  A rational account of pedagogical reasoning: Teaching by, and learning from, examples , 2014, Cognitive Psychology.

[17]  Dan Klein,et al.  Reasoning about Pragmatics with Neural Listeners and Speakers , 2016, EMNLP.

[18]  Christopher Potts,et al.  Emergence of Gricean Maxims from Multi-Agent Decision Theory , 2013, NAACL.

[19]  Prashant Doshi,et al.  Monte Carlo Sampling Methods for Approximating Interactive POMDPs , 2014, J. Artif. Intell. Res.

[20]  Costas Tsatsoulis,et al.  Learning Communication Strategies in Multiagent Systems , 1998, Applied Intelligence.

[21]  Michael C. Frank,et al.  Informative communication in word production and word learning , 2009 .

[22]  Raymond J. Dolan,et al.  Game Theory of Mind , 2008, PLoS Comput. Biol.

[23]  Rob Fergus,et al.  Learning Multiagent Communication with Backpropagation , 2016, NIPS.

[24]  Morten H. Christiansen,et al.  Language evolution: consensus and controversies , 2003, Trends in Cognitive Sciences.

[25]  Stephen Clark,et al.  Emergent Communication through Negotiation , 2018, ICLR.

[26]  Shimon Whiteson,et al.  Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.

[27]  Yi Wu,et al.  Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.

[28]  Gerhard Jäger,et al.  Game theory in semantics and pragmatics , 2012 .

[29]  Chris Watkins,et al.  Learning from delayed rewards , 1989 .

[30]  Chris L. Baker,et al.  Rational quantitative attribution of beliefs, desires and percepts in human mentalizing , 2017, Nature Human Behaviour.

[31]  Dan Klein,et al.  A Game-Theoretic Approach to Generating Spatial Descriptions , 2010, EMNLP.

[32]  Claudia V. Goldman,et al.  Optimizing information exchange in cooperative multi-agent systems , 2003, AAMAS '03.

[33]  Craig Boutilier,et al.  The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.

[34]  H. Francis Song,et al.  Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning , 2018, ICML.

[35]  Bart Verheij,et al.  Higher-order theory of mind in the Tacit Communication Game , 2015, BICA 2015.

[36]  Shimon Whiteson,et al.  Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.

[37]  Roger Levy,et al.  That's what she (could have) said: How alternative utterances affect language use , 2012, CogSci.

[38]  Michael C. Frank,et al.  Predicting Pragmatic Reasoning in Language Games , 2012, Science.

[39]  Shimon Whiteson,et al.  Learning with Opponent-Learning Awareness , 2017, AAMAS.

[40]  Claudia V. Goldman,et al.  Learning to communicate in a decentralized environment , 2007, Autonomous Agents and Multi-Agent Systems.

[41]  R. J. Williams,et al.  Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.

[42]  Bart Verheij,et al.  Theory of Mind in the Mod Game: An Agent-Based Model of Strategic Reasoning , 2014, ECSI.

[43]  Kyunghyun Cho,et al.  Emergent Communication in a Multi-Modal, Multi-Step Referential Game , 2017, ICLR.

[44]  Michael C. Frank,et al.  Learning to Reason Pragmatically with Cognitive Limitations , 2014, CogSci.

[45]  Alexander Peysakhovich,et al.  Multi-Agent Cooperation and the Emergence of (Natural) Language , 2016, ICLR.

[46]  Stefan Lee,et al.  Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning , 2017, IEEE International Conference on Computer Vision (ICCV).

[47]  Anca D. Dragan,et al.  Pragmatic-Pedagogic Value Alignment , 2017, ISRR.

[48]  Jacob L. Mey,et al.  Pragmatics: An Introduction , 2001 .