Empirical evaluation of ad hoc teamwork in the pursuit domain

The concept of creating autonomous agents capable of exhibiting ad hoc teamwork was recently introduced as a challenge to the AI community, and specifically to the multiagent systems community. An agent capable of ad hoc teamwork is one that can cooperate effectively with multiple potential teammates on a set of collaborative tasks. Previous research has investigated theoretically optimal ad hoc teamwork strategies in restrictive settings. This paper presents the first empirical study of ad hoc teamwork in a more open, complex teamwork domain. Specifically, we evaluate a range of effective algorithms for on-line behavior generation on the part of a single ad hoc team agent that must collaborate with a range of possible teammates in the pursuit domain.
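
For readers unfamiliar with the evaluation setting, the sketch below illustrates the pursuit domain as it is typically formulated: several predators chase a prey on a toroidal grid, and the prey is captured when all of its neighboring cells are occupied. All specifics here (grid size, a greedy predator policy, a randomly moving prey) are illustrative assumptions, not details taken from the paper; an ad hoc team agent would have to cooperate with teammate policies like the greedy one shown without prior coordination.

```python
# Minimal pursuit-domain sketch: four predators vs. one prey on a torus.
# Grid size and policies are assumptions for illustration only.
import random

GRID = 7  # assumed grid width/height (toroidal)
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]  # four directions or stay

def add(pos, move):
    """Apply a move and wrap around the torus."""
    return ((pos[0] + move[0]) % GRID, (pos[1] + move[1]) % GRID)

def torus_dist(a, b):
    """Manhattan distance on the torus."""
    dx = min(abs(a[0] - b[0]), GRID - abs(a[0] - b[0]))
    dy = min(abs(a[1] - b[1]), GRID - abs(a[1] - b[1]))
    return dx + dy

def captured(prey, predators):
    """The prey is captured when all four adjacent cells hold a predator."""
    neighbors = {add(prey, m) for m in MOVES if m != (0, 0)}
    return neighbors.issubset(set(predators))

def greedy_predator_move(pred, prey, occupied):
    """One simple teammate policy: step to the free neighboring cell
    closest to the prey (staying put if everything else is blocked)."""
    options = [m for m in MOVES if add(pred, m) not in occupied]
    if not options:
        return pred
    return add(pred, min(options, key=lambda m: torus_dist(add(pred, m), prey)))

def run_episode(max_steps=200, seed=0):
    rng = random.Random(seed)
    prey = (GRID // 2, GRID // 2)
    predators = [(0, 0), (0, GRID - 1), (GRID - 1, 0), (GRID - 1, GRID - 1)]
    for step in range(max_steps):
        if captured(prey, predators):
            return step
        # Prey moves uniformly at random to a cell not blocked by a predator.
        prey_options = [m for m in MOVES if add(prey, m) not in predators]
        prey = add(prey, rng.choice(prey_options)) if prey_options else prey
        # Predators move greedily one at a time, avoiding collisions.
        new_positions = []
        for pred in predators:
            occupied = set(new_positions) | set(predators) | {prey}
            occupied.discard(pred)
            new_positions.append(greedy_predator_move(pred, prey, occupied))
        predators = new_positions
    return None  # prey not captured within the step budget

if __name__ == "__main__":
    steps = run_episode()
    if steps is None:
        print("prey not captured within the step budget")
    else:
        print("prey captured after", steps, "steps")
```

With purely greedy predators the prey may or may not be caught within the step budget; in the ad hoc teamwork setting studied in the paper, one of the predators is replaced by an agent that must choose its behavior on-line given whatever policies its teammates happen to follow.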
