Teamwork with Limited Knowledge of Teammates

While great strides have been made in multiagent teamwork, existing approaches typically assume that extensive information is available about teammates and how to coordinate actions with them. This paper addresses how robust teamwork can still be achieved when limited or no information exists about a specific group of teammates, as in the ad hoc teamwork scenario. The main contribution of this paper is the first empirical evaluation of an agent cooperating with teammates not created by the authors, where the agent is not provided expert knowledge of its teammates. For this purpose, we develop a general-purpose teammate modeling method and test the resulting ad hoc team agent's ability to collaborate with more than 40 unknown teams of agents to accomplish a benchmark task. These agents were designed by people other than the authors, and their designers did not plan for the ad hoc teamwork setting. A secondary contribution of the paper is a new transfer learning algorithm, TwoStageTransfer, that can improve results when the ad hoc team agent does have some limited observations of its current teammates.
