Emotional Multiagent Reinforcement Learning in Social Dilemmas

Social dilemmas have attracted extensive interest in multiagent systems research as a setting for studying the emergence of cooperative behavior among selfish agents. Without extra mechanisms or assumptions, directly applying multiagent reinforcement learning to social dilemmas converges to the Nash equilibrium of mutual defection among the agents. This paper investigates the importance of emotions in modifying agent learning behaviors in order to achieve cooperation in social dilemmas. Two fundamental variables, individual wellbeing and social fairness, are considered in the appraisal of emotions, which are then used as intrinsic rewards for learning. Experimental results reveal that different structural relationships between the two appraisal variables can lead to distinct agent behaviors and that, under certain circumstances, cooperation can be obtained among the agents.
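The idea of replacing the game payoff with an emotion-derived intrinsic reward can be illustrated with a minimal sketch. This is not the paper's model: the payoff values (T=5, R=3, P=1, S=0), the functions `wellbeing`, `fairness`, and `intrinsic_reward`, and the linear `weight` that combines them are all assumptions chosen for illustration. The sketch shows only that defection is the best response to every opponent action under the raw Prisoner's Dilemma payoffs, while a reward that mixes wellbeing with fairness can make cooperation the best response to a cooperator.

```python
# Illustrative sketch only; every name and number here is an assumption,
# not the appraisal model used in the paper.
C, D = 0, 1  # cooperate, defect

# Row player's payoff in a standard Prisoner's Dilemma (assumed T=5, R=3, P=1, S=0).
PAYOFF = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}
MAX_PAYOFF = 5


def best_response(opponent_action, reward):
    """Return the action that maximizes `reward(own, other)` against a fixed opponent action."""
    return max((C, D), key=lambda a: reward(a, opponent_action))


def extrinsic_reward(own_action, other_action):
    """Raw game payoff, as used by a plain reinforcement learner."""
    return PAYOFF[(own_action, other_action)]


def wellbeing(own_payoff):
    """Hypothetical appraisal: own payoff normalized to [0, 1]."""
    return own_payoff / MAX_PAYOFF


def fairness(own_payoff, other_payoff):
    """Hypothetical appraisal: 1 for equal payoffs, 0 for the largest possible gap."""
    return 1.0 - abs(own_payoff - other_payoff) / MAX_PAYOFF


def intrinsic_reward(own_action, other_action, weight=0.5):
    """Hypothetical linear mix of the two appraisal variables."""
    own = PAYOFF[(own_action, other_action)]
    other = PAYOFF[(other_action, own_action)]
    return (1 - weight) * wellbeing(own) + weight * fairness(own, other)


if __name__ == "__main__":
    for opp in (C, D):
        print("opponent:", "C" if opp == C else "D",
              "| extrinsic best response:", "CD"[best_response(opp, extrinsic_reward)],
              "| intrinsic best response:", "CD"[best_response(opp, intrinsic_reward)])
```

Under these assumed values, the intrinsic reward turns the dilemma into a coordination game: mutual cooperation and mutual defection are both stable, so whether learners reach cooperation depends on how the two appraisal variables are combined and on the learning dynamics.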
