Selecting informative actions improves cooperative multiagent learning

In concurrent cooperative multiagent learning, each agent simultaneously learns to improve the overall performance of the team, with no direct control over the actions chosen by its teammates. An agent's action selection directly influences the rewards received by all the agents, so the concurrent learning processes co-adapt to one another. Co-adaptation can drive the team towards suboptimal solutions because agents tend to select the actions that have earned higher rewards, without considering how those choices affect their teammates' search. We argue that, to counter this tendency, agents should also prefer actions that inform their teammates about the structure of the joint search space, helping them choose among their own action options. We analyze this approach in a cooperative coevolutionary framework, and we propose a new algorithm, iCCEA, that highlights the advantages of selecting informative actions. We show that iCCEA generally outperforms other cooperative coevolution algorithms on our test problems.
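
The pull towards suboptimal solutions can be illustrated with a minimal sketch. It is not iCCEA and is not taken from the paper: the 3x3 joint reward table and the function names are assumptions chosen only for illustration. An agent that rates each of its actions by the average reward over its teammate's actions prefers a safe but suboptimal action, whereas rating each action by its best available collaborator exposes the action that belongs to the optimal joint solution.

```python
# Illustrative joint reward table (assumed values, not from the paper).
# Rows are agent A's actions, columns are agent B's actions.
REWARD = [
    [11, -30, 0],
    [-30, 7, 6],
    [0, 0, 5],
]


def average_assessment(rewards, action):
    """Rate an action by its mean reward over all teammate actions."""
    row = rewards[action]
    return sum(row) / len(row)


def optimistic_assessment(rewards, action):
    """Rate an action by its reward with the best teammate action."""
    return max(rewards[action])


def best_action(rewards, assess):
    """Return the index of agent A's action with the highest assessment."""
    return max(range(len(rewards)), key=lambda a: assess(rewards, a))


# Averaging favours the low-risk row; the optimistic rating finds the row
# that contains the highest joint reward.
greedy_choice = best_action(REWARD, average_assessment)
optimistic_choice = best_action(REWARD, optimistic_assessment)
print(greedy_choice, optimistic_choice)
```

In this toy table the averaging agent never learns that its first action can reach the highest joint reward, because the penalties next to that reward dominate the average. Which teammate actions an agent is evaluated against therefore determines what it can learn about the joint search space.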
