Learning Teammate Models for Ad Hoc Teamwork

Robust autonomous agents should be able to cooperate effectively with new teammates, a capability known as ad hoc teamwork. Reasoning about ad hoc teamwork allows an agent to perform joint tasks with a variety of teammates. Because these teammates may not share a communication protocol or coordination algorithm, the ad hoc team agent must adapt to them solely by observing their behavior. Whereas most past work on ad hoc teamwork assumes that the ad hoc team agent has a prior model of its teammates, this paper is the first to introduce an agent that learns models of its teammates autonomously. In addition, it presents a new transfer learning algorithm for the setting in which the ad hoc agent has only limited observations of potential teammates.
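The abstract does not specify how teammate models are represented or how transfer is performed. Below is a minimal sketch, assuming the teammate model is a supervised action predictor fit to observed (state, action) pairs, with data from previously observed teammates down-weighted to supplement scarce observations of a new teammate. The function names, the fixed `source_weight`, and the use of scikit-learn's `DecisionTreeClassifier` are illustrative choices, not the paper's actual method.

```python
# Minimal sketch (assumptions noted above): learn a new teammate's action
# model from observed (state, action) pairs, padding limited observations
# with down-weighted data from previously observed teammates.
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def learn_teammate_model(target_obs, source_obs, source_weight=0.3):
    """Fit an action predictor for a new teammate.

    target_obs:    (states, actions) observed from the new teammate (may be few).
    source_obs:    list of (states, actions) pairs from previously observed
                   teammates, used only to supplement scarce target data.
    source_weight: down-weighting of source instances relative to target
                   instances (a crude stand-in for a learned transfer weighting).
    """
    X_t, y_t = target_obs
    X_parts, y_parts, w_parts = [X_t], [y_t], [np.ones(len(y_t))]
    for X_s, y_s in source_obs:
        X_parts.append(X_s)
        y_parts.append(y_s)
        w_parts.append(np.full(len(y_s), source_weight))

    X = np.vstack(X_parts)
    y = np.concatenate(y_parts)
    w = np.concatenate(w_parts)

    # A shallow tree keeps the learned policy model simple and fast to query.
    model = DecisionTreeClassifier(max_depth=5)
    model.fit(X, y, sample_weight=w)
    return model  # model.predict(state) approximates the teammate's policy
```

In practice, the per-instance source weights could be tuned automatically (for example, by a boosting-style transfer procedure) rather than fixed, which is closer in spirit to the transfer learning setting the abstract describes.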
