Evolutionary Optimization of Cooperative Strategies for the Iterated Prisoner's Dilemma

The Iterated Prisoner’s Dilemma (IPD) has been studied in fields as diverse as economics, computer science, psychology, politics, and environmental studies. This is due, in part, to the intriguing property that its Nash Equilibrium is not globally optimal. The IPD is typically treated as a single-objective problem in which a player’s goal is to maximize their own score. In some work, minimizing the opponent’s score is an additional objective. Here, we explore the role of explicitly optimizing for mutual cooperation in IPD player performance. We implement a genetic algorithm in which each member of the population evolves using one of four multi-objective fitness functions: selfish, communal, cooperative, and selfless. The last three use a cooperative metric as an objective. As a control, we also consider two single-objective fitness functions. We explore the role of representation in evolving cooperation by implementing four representations for evolving players. Finally, we evaluate the effect of noise on the evolution of cooperative behaviors. Testing our evolved players in tournaments in which a player’s own score is the sole metric, we find that players evolved with mutual cooperation as an objective are very competitive. Thus, learning to play nicely with others is a successful strategy for maximizing personal reward.
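To make the quantities above concrete, the following minimal Python sketch plays one noisy IPD match and reports the three values a multi-objective fitness function could draw on: the player's own score, the opponent's score, and the fraction of rounds with mutual cooperation. It is an illustration under stated assumptions, not the method used in this work. The payoff values (3, 5, 1, 0), the function names `play_match`, `tit_for_tat`, and `always_defect`, the noise model (each move flipped independently with probability `noise`), and the definition of the cooperation metric are all assumptions, since the abstract does not specify them.

```python
import random

# Assumed payoff matrix (the commonly used 3/5/1/0 values); the abstract
# does not state which matrix is used.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(my_history, opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opp_history[-1] if opp_history else "C"


def always_defect(my_history, opp_history):
    """Defect on every round."""
    return "D"


def flip(move):
    """Return the opposite move."""
    return "D" if move == "C" else "C"


def play_match(player_a, player_b, rounds=100, noise=0.0, rng=None):
    """Play one match and return candidate objectives for player_a.

    noise is the (assumed) probability that each move is flipped
    independently before payoffs are awarded.
    """
    if rng is None:
        rng = random.Random(0)
    hist_a, hist_b = [], []
    score_a = score_b = mutual = 0
    for _ in range(rounds):
        move_a = player_a(hist_a, hist_b)
        move_b = player_b(hist_b, hist_a)
        if rng.random() < noise:
            move_a = flip(move_a)
        if rng.random() < noise:
            move_b = flip(move_b)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        if move_a == "C" and move_b == "C":
            mutual += 1
        hist_a.append(move_a)
        hist_b.append(move_b)
    return {
        "own_score": score_a,
        "opponent_score": score_b,
        "mutual_cooperation": mutual / rounds,
    }


if __name__ == "__main__":
    print(play_match(tit_for_tat, tit_for_tat, rounds=10))
    print(play_match(tit_for_tat, always_defect, rounds=10))
    print(play_match(tit_for_tat, tit_for_tat, rounds=10, noise=0.05))
```

A single-objective fitness would use only `own_score`. A multi-objective fitness could pass a tuple such as `(own_score, mutual_cooperation)` to a Pareto-based selection scheme. Which combinations correspond to the selfish, communal, cooperative, and selfless fitness functions is not stated in the abstract and is left open here.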