Theoretical Advantages of Lenient Learners: An Evolutionary Game Theoretic Perspective

This paper analyzes the dynamics of multiple learning agents from an evolutionary game theoretic perspective. We provide replicator dynamics models for cooperative coevolutionary algorithms and for traditional multiagent Q-learning, and we extend these differential equations to account for lenient learners: agents that forgive teammate actions which, due to possible miscoordination, resulted in low rewards. We use these extended formal models to study the convergence guarantees of these algorithms, and to visualize the basins of attraction of optimal and suboptimal solutions in two benchmark coordination problems. The paper demonstrates that lenience provides learners with more accurate information about the benefits of performing their actions, resulting in a higher likelihood of convergence to the globally optimal solution. In addition, the analysis indicates that the choice of learning algorithm has an insignificant impact on the overall performance of multiagent learning algorithms; rather, the performance of these algorithms depends primarily on the level of lenience that the agents exhibit toward one another. Finally, the research herein supports the strength and generality of evolutionary game theory as a backbone for multiagent learning.
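The core intuition behind lenience can be illustrated with a minimal sketch. A lenient learner estimates the value of its own action by keeping only the best of several sampled joint rewards, thereby forgiving low payoffs caused by a teammate's exploratory or mismatched actions. The payoff matrix and the parameter names below (`PAYOFF`, `kappa`) are illustrative assumptions in the spirit of the climbing-game benchmark, not the paper's exact formulation:

```python
import random

# Illustrative 3x3 coordination payoff matrix (climbing-game style):
# the jointly optimal action pair (0, 0) pays 11, but miscoordinating
# on action 0 incurs a large penalty of -30.
PAYOFF = [
    [11, -30, 0],
    [-30, 7, 6],
    [0, 0, 5],
]

def estimate_payoffs(kappa, trials=2000, seed=0):
    """Estimate each row action's value when the teammate acts uniformly
    at random. A lenient learner (kappa > 1) keeps only the best of
    kappa sampled joint rewards; kappa = 1 reduces to plain averaging."""
    rng = random.Random(seed)
    estimates = []
    for action in range(3):
        total = 0.0
        for _ in range(trials):
            samples = [PAYOFF[action][rng.randrange(3)] for _ in range(kappa)]
            total += max(samples)  # lenience: ignore all but the best sample
        estimates.append(total / trials)
    return estimates

naive = estimate_payoffs(kappa=1)     # standard learner: averages all rewards
lenient = estimate_payoffs(kappa=10)  # lenient learner: forgives low rewards
```

Under this sketch, the naive averager is dragged down by the -30 miscoordination penalty and ranks the safe action 2 highest, while the lenient learner correctly ranks action 0 (the jointly optimal action) highest — the mechanism by which lenience steers the dynamics toward the global optimum.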
