Paper Information - Social Diversity and Social Preferences in Mixed-Motive Reinforcement Learning

Social Diversity and Social Preferences in Mixed-Motive Reinforcement Learning

Recent research on reinforcement learning in pure-conflict and pure-common interest games has emphasized the importance of population heterogeneity. In contrast, studies of reinforcement learning in mixed-motive games have primarily leveraged homogeneous approaches. Given the defining characteristic of mixed-motive games--the imperfect correlation of incentives between group members--we study the effect of population heterogeneity on mixed-motive reinforcement learning. We draw on interdependence theory from social psychology and imbue reinforcement learning agents with Social Value Orientation (SVO), a flexible formalization of preferences over group outcome distributions. We subsequently explore the effects of diversity in SVO on populations of reinforcement learning agents in two mixed-motive Markov games. We demonstrate that heterogeneity in SVO generates meaningful and complex behavioral variation among agents similar to that suggested by interdependence theory. Empirical results in these mixed-motive dilemmas suggest agents trained in heterogeneous populations develop particularly generalized, high-performing policies relative to those trained in homogeneous populations.
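To make the idea of "preferences over group outcome distributions" concrete, the following is a minimal sketch of one common way to express a Social Value Orientation: as an angle between an agent's own reward and the mean reward of the other group members. The abstract does not give the formula, so the function names (`reward_angle`, `svo_utility`), the `weight` parameter, and the absolute-difference penalty are illustrative assumptions and may differ from the authors' exact formulation.

```python
import math
from typing import Sequence


def reward_angle(own_reward: float, others_rewards: Sequence[float]) -> float:
    """Angle (radians) of the reward vector (own reward, mean reward of others).

    0 means all reward goes to the agent itself; pi/4 means the agent and
    the others receive equal reward on average.
    """
    mean_others = sum(others_rewards) / len(others_rewards)
    return math.atan2(mean_others, own_reward)


def svo_utility(
    own_reward: float,
    others_rewards: Sequence[float],
    target_angle: float,
    weight: float,
) -> float:
    """Hypothetical SVO-shaped utility (an assumption, not taken from the abstract).

    The agent keeps its own reward but is penalized in proportion to how far
    the realized reward angle is from its preferred (target) angle.
    """
    realized = reward_angle(own_reward, others_rewards)
    return own_reward - weight * abs(target_angle - realized)


if __name__ == "__main__":
    # A heterogeneous population: each agent has a different preferred angle.
    population_targets_deg = [0.0, 30.0, 45.0, 75.0]
    for deg in population_targets_deg:
        u = svo_utility(1.0, [1.0, 1.0], math.radians(deg), weight=0.5)
        print(f"target={deg:5.1f} deg  utility={u:.3f}")
```

Under this sketch, a population is heterogeneous when its agents are assigned different `target_angle` values, and homogeneous when they all share one.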
