Learning to Reach Agreement in a Continuous Ultimatum Game

It is well known that acting in an individually rational manner, according to the principles of classical game theory, may lead to sub-optimal solutions in a class of problems known as social dilemmas. In contrast, humans generally cope well with social dilemmas, as they are able to balance personal and group benefit. Since agents in multi-agent systems are regularly confronted with social dilemmas, for instance in tasks such as resource allocation, these agents may benefit from the inclusion of mechanisms thought to facilitate human fairness. Although many such mechanisms have already been implemented in a multi-agent systems context, their application is usually limited to rather abstract social dilemmas with a discrete set of available strategies (usually two). Given that many real-world social dilemmas are continuous in nature, we extend this previous work to more general dilemmas in which agents operate in a continuous strategy space. The social dilemma under study here is the well-known Ultimatum Game, in which an optimal solution is achieved if agents agree on a common strategy. We investigate whether a scale-free interaction network helps agents reach agreement, especially in the presence of fixed-strategy agents that represent a desired (e.g. human) outcome. Moreover, we study the influence of rewiring in the interaction network. The agents are equipped with continuous-action learning automata and play a large number of random pairwise games in order to establish a common strategy. From our experiments, we conclude that results obtained in discrete-strategy games generalize to continuous-strategy games to a certain extent: a scale-free interaction network allows agents to achieve agreement on a common strategy, and rewiring in the interaction network greatly enhances the agents' ability to reach agreement. However, it also becomes clear that some alternative mechanisms, such as reputation and volunteering, involve many subtleties and do not yield convincingly beneficial effects in the continuous case.
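To make the setup concrete, the sketch below illustrates the kind of process the abstract describes: learning agents with continuous strategies playing random pairwise Ultimatum Games over a scale-free (Barabási-Albert) interaction network. It is a minimal illustration under stated assumptions, not the authors' implementation: it uses a simplified reward-comparison update instead of the full continuous-action learning automaton rule of Thathachar and Sastry, the pie is normalized to 1, and all parameter values, the networkx dependency, and the helper names (ContinuousLearningAutomaton, play_round, run) are illustrative choices. Rewiring, reputation, volunteering, and fixed-strategy agents, which the paper also studies, are omitted.

    # Minimal sketch (assumptions noted above): continuous Ultimatum Game played by
    # Gaussian learning agents on a Barabasi-Albert (scale-free) network.
    import random
    import networkx as nx  # used only to build the scale-free interaction network

    LEARN_RATE = 0.05      # assumed step size for the strategy update
    SIGMA_MIN = 0.02       # assumed lower bound that keeps some exploration alive

    class ContinuousLearningAutomaton:
        """One agent: samples its offer/acceptance threshold from a Gaussian and
        nudges the mean toward actions that earn more than its running average."""
        def __init__(self):
            self.mu, self.sigma = random.random(), 0.3
            self.avg_payoff = 0.0

        def act(self):
            # Sample a strategy in [0, 1].
            return min(1.0, max(0.0, random.gauss(self.mu, self.sigma)))

        def update(self, action, payoff):
            # Simplified reward-comparison update (not the full CALA rule):
            # move mu toward actions that beat the running average payoff.
            self.mu += LEARN_RATE * (payoff - self.avg_payoff) * (action - self.mu)
            self.sigma = max(SIGMA_MIN, self.sigma * 0.999)  # slowly reduce exploration
            self.avg_payoff += LEARN_RATE * (payoff - self.avg_payoff)

    def play_round(proposer, responder):
        # Proposer offers a share of the pie; responder accepts if the offer
        # meets its acceptance threshold, otherwise both get nothing.
        offer, threshold = proposer.act(), responder.act()
        accepted = offer >= threshold
        p_payoff, r_payoff = (1.0 - offer, offer) if accepted else (0.0, 0.0)
        proposer.update(offer, p_payoff)
        responder.update(threshold, r_payoff)

    def run(n_agents=100, rounds=50_000):
        graph = nx.barabasi_albert_graph(n_agents, m=2)  # scale-free interaction network
        agents = {node: ContinuousLearningAutomaton() for node in graph}
        for _ in range(rounds):
            a = random.choice(list(graph.nodes))
            b = random.choice(list(graph[a]))             # random neighbour of a
            play_round(agents[a], agents[b])
        return [agents[n].mu for n in graph]

    if __name__ == "__main__":
        means = run()
        print("mean strategy:", sum(means) / len(means))

In this sketch, agreement would show up as the agents' Gaussian means clustering around a common value, which is the kind of outcome the experiments in the paper measure.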
