Coach planning with opponent models for distributed execution

In multi-agent domains, generating and coordinating the execution of plans in the presence of adversaries is a significant challenge. In our research, a special "coach" agent works with a team of distributed agents. The coach has a global view of the world but no actions other than occasionally communicating with the team over a limited-bandwidth channel. The coach is given a set of predefined opponent models, each of which predicts future world states caused by the opponents' actions. By observing the world-state changes that result from the execution of its own team and the opponents, the coach selects the best-matched opponent model and uses it to predict the opponents' behavior. Upon opportunities to communicate, the coach uses these predictions to generate a plan for the team; the centralized coach thus plans for distributed execution. We introduce (i) a probabilistic representation and recognition algorithm for the opponent models; (ii) a multi-agent plan representation, Multi-Agent Simple Temporal Networks; and (iii) a plan execution algorithm that enables robust distributed execution in the presence of noisy perception and actions. The complete approach is implemented in a complex simulated robot soccer environment. We present the contributions as developed in this domain, carefully highlighting their generality, along with a series of experiments validating the effectiveness of our coach approach.
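The probabilistic model recognition described above can be illustrated with a minimal sketch: the coach maintains a belief over its predefined opponent models and updates it Bayesian-style as observations of world-state changes arrive. The function and model names below are illustrative assumptions, not the paper's actual representation.

```python
# Hypothetical sketch of opponent-model selection via Bayesian updating.
# Each "model" is assumed to supply a likelihood P(observation | model);
# the coach normalizes after each observation and picks the best match.

def select_opponent_model(models, priors, observations):
    """Return posterior probabilities over opponent models.

    models:       dict mapping model name -> likelihood function
    priors:       dict mapping model name -> prior probability
    observations: iterable of observed world-state changes
    """
    posterior = dict(priors)
    for obs in observations:
        for name, likelihood in models.items():
            posterior[name] *= likelihood(obs)
        total = sum(posterior.values())
        posterior = {name: p / total for name, p in posterior.items()}
    return posterior

# Toy example: two models predicting which flank the opponent attacks.
models = {
    "attack_left":  lambda obs: 0.8 if obs == "left" else 0.2,
    "attack_right": lambda obs: 0.3 if obs == "left" else 0.7,
}
priors = {"attack_left": 0.5, "attack_right": 0.5}

posterior = select_opponent_model(models, priors, ["left", "left", "right"])
best_model = max(posterior, key=posterior.get)  # -> "attack_left"
```

In the paper's setting, the recognized model would then drive the predictions used when the coach next generates a Multi-Agent Simple Temporal Network plan for the team.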
