An Application of Multiagent Learning in Highly Dynamic Environments

We explore the emergent behavior of game-theoretic algorithms in a highly dynamic applied setting in which the optimal goal for the agents is constantly changing. Our focus is on a variant of the traditional predator-prey problem called Defender. Consisting of multiple predators and multiple prey, Defender shares similarities with rugby, soccer, and football, as well as with current problems in the field of Multiagent Systems (MAS). Observations, communications, and knowledge about the world-state are designed to be information-sparse, modeling real-world uncertainty. We propose a solution to Defender by means of the well-known multiagent learning algorithm fictitious play, and compare it with rational learning, regret matching, minimax regret, and a simple greedy strategy. We provide the modifications required to build these agents and state the implications of applying them to our problem. We show that fictitious play is superior at evenly assigning predators to prey, even though Defender is a game of incomplete and imperfect information whose dimension and payoffs change continually. Interestingly, its performance can be attributed to a synthesis of fictitious play, partial observability, and an anti-coordination game that reinforces the payoff of actions that were previously taken.
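To make the role of fictitious play in an anti-coordination game concrete, the following is a minimal sketch and not the Defender implementation described in the paper. It assumes a static game with full observability in which each predator wants to pick a prey that the other predators are not picking. The names `best_response` and `fictitious_play`, the agent-specific tie-breaking rule, and the choice to start every predator on prey 0 are all illustrative assumptions.

```python
from collections import Counter


def best_response(observed, agent_id, n_targets):
    """Choose the target that the other agents have picked least often.

    In an anti-coordination game the expected payoff of a target falls as the
    empirical frequency of others choosing it rises, so the best response to
    the observed history is the least-chosen target. Ties are broken with an
    agent-specific preference order (an assumption made for this sketch).
    """
    return min(
        range(n_targets),
        key=lambda t: (observed[t], (t - agent_id) % n_targets),
    )


def fictitious_play(n_agents=2, n_targets=2, rounds=5):
    """Run simultaneous-move fictitious play and return the joint-action history."""
    # observed[i][t] = how many times agent i has seen another agent pick target t
    observed = [Counter() for _ in range(n_agents)]
    actions = [0] * n_agents  # assumption: every agent starts on target 0
    history = []
    for _ in range(rounds):
        history.append(tuple(actions))
        # Each agent updates its empirical counts of the other agents' choices.
        for i in range(n_agents):
            for j, a in enumerate(actions):
                if j != i:
                    observed[i][a] += 1
        # Each agent best-responds to the history it has observed so far.
        actions = [best_response(observed[i], i, n_targets) for i in range(n_agents)]
    return history


if __name__ == "__main__":
    for step, joint_action in enumerate(fictitious_play()):
        print(step, joint_action)
```

With two predators and two prey, both agents collide in the first two rounds and then settle on different prey. Once they are separated, each further round adds to the counts that make the current assignment a best response, which is one simple way to see how past actions can reinforce their own payoff. If every agent broke ties identically, the two would keep colliding; the agent-specific tie-breaking here stands in for whatever asymmetry breaks that cycle in a real system.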
