Teaching and leading an ad hoc teammate: Collaboration without pre-coordination

As autonomous agents proliferate in the real world, in both software and robotic settings, they will increasingly need to band together for cooperative activities with previously unfamiliar teammates. In such ad hoc team settings, team strategies cannot be developed a priori. Rather, an agent must be prepared to cooperate with many types of teammates: it must collaborate without pre-coordination. This article considers two aspects of collaboration in two-player teams, involving either simultaneous or sequential decision making. In both cases, the ad hoc agent is more knowledgeable about the environment and attempts to influence the behavior of its teammate so that, together, they attain the best possible joint utility.
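
To make the simultaneous-action setting concrete, the sketch below is an illustrative toy, not the article's exact algorithm. It assumes a shared-payoff matrix game in which the teammate simply best-responds to the ad hoc agent's previous action; knowing the payoff matrix and this teammate model, the ad hoc agent can plan a finite-horizon sequence of actions that leads the teammate to the jointly optimal cell, even at the cost of some short-term payoff. The matrix values and the names `PAYOFF`, `teammate_best_response`, and `plan_leading_sequence` are hypothetical.

```python
"""Minimal sketch of leading a best-response teammate in a repeated
shared-payoff matrix game (illustrative assumptions, not the article's
exact algorithm)."""

import numpy as np

# Shared payoff matrix: rows = ad hoc agent's actions, cols = teammate's actions.
# Values are purely illustrative.
PAYOFF = np.array([
    [25,  1,  0],
    [10, 30, 10],
    [ 0, 33, 40],
])


def teammate_best_response(prev_ad_hoc_action: int) -> int:
    """Assumed teammate model: best-respond to the ad hoc agent's last action."""
    return int(np.argmax(PAYOFF[prev_ad_hoc_action]))


def plan_leading_sequence(start_action: int, horizon: int) -> list[int]:
    """Finite-horizon dynamic program over the ad hoc agent's actions.

    The state is the ad hoc agent's previous action, which determines the
    teammate's next move. Returns the action sequence that maximizes the
    sum of joint payoffs over `horizon` rounds."""
    n = PAYOFF.shape[0]
    value_to_go = {a: 0.0 for a in range(n)}      # indexed by previous action
    best_action = [dict() for _ in range(horizon)]
    for t in reversed(range(horizon)):
        new_value = {}
        for prev in range(n):
            col = teammate_best_response(prev)
            # Our action this round also becomes next round's state.
            scored = [(PAYOFF[a, col] + value_to_go[a], a) for a in range(n)]
            value, action = max(scored)
            new_value[prev] = value
            best_action[t][prev] = action
        value_to_go = new_value
    # Roll the plan forward from the given starting action.
    sequence, prev = [], start_action
    for t in range(horizon):
        a = best_action[t][prev]
        sequence.append(a)
        prev = a
    return sequence


if __name__ == "__main__":
    plan = plan_leading_sequence(start_action=0, horizon=5)
    # With the matrix above, the agent accepts a short-term loss to steer the
    # teammate toward the jointly optimal cell (2, 2).
    print("leading sequence:", plan)
```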

[1]  Sandra Carberry,et al.  Techniques for Plan Recognition , 2001, User Modeling and User-Adapted Interaction.

[2]  A. Schotter,et al.  An Experimental Study of Belief Learning Using Elicited Beliefs , 2002 .

[3]  S. Hart,et al.  A simple adaptive procedure leading to correlated equilibrium , 2000 .

[4]  Sarit Kraus,et al.  Collaborative Plans for Complex Group Action , 1996, Artif. Intell..

[5]  Masaki Aoyagi,et al.  Mutual Observability and the Convergence of Actions in a Multi-Person Two-Armed Bandit Model , 1998 .

[6]  W. Hamilton,et al.  The Evolution of Cooperation , 1984 .

[7]  Andrew W. Moore,et al.  Distributed Value Functions , 1999, ICML.

[8]  J. S. Albus Task decomposition , 1993, Proceedings of 8th IEEE International Symposium on Intelligent Control.

[9]  Ra Kildare,et al.  Ad-hoc online teams as complex systems: agents that cater for team interaction rules , 2004 .

[10]  Yoav Shoham,et al.  Essentials of Game Theory: A Concise Multidisciplinary Introduction , 2008, Essentials of Game Theory: A Concise Multidisciplinary Introduction.

[11]  Michael H. Bowling,et al.  Coordination and Adaptation in Impromptu Teams , 2005, AAAI.

[12]  Edmund H. Durfee,et al.  Rational Coordination in Multi-Agent Environments , 2000, Autonomous Agents and Multi-Agent Systems.

[13]  Milind Tambe,et al.  Towards Flexible Teamwork , 1997, J. Artif. Intell. Res..

[14]  Brett Browning,et al.  Dynamically formed heterogeneous robot teams performing tightly-coordinated tasks , 2006, Proceedings 2006 IEEE International Conference on Robotics and Automation, 2006. ICRA 2006..

[15]  Samuel Barrett and Peter Stone Ad Hoc Teamwork Modeled with Multi-armed Bandits: An Extension to Discounted Infinite Rewards , 2011 .

[16]  Michael N. Huhns,et al.  Agents for establishing ad hoc cross-organizational teams , 2004 .

[17]  Michael L. Littman,et al.  Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.

[18]  Jürgen Eichberger Bayesian Learning in Repeated Normal Form Games , 1995 .

[19]  Long-Ji Lin,et al.  Self-improving reactive agents: case studies of reinforcement learning frameworks , 1991 .

[20]  Long Ji Lin,et al.  Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.

[21]  Michael Wooldridge,et al.  Computational Aspects of Cooperative Game Theory (Synthesis Lectures on Artificial Inetlligence and Machine Learning) , 2011 .

[22]  Godfrey Keller,et al.  Strategic Experimentation with Poisson Bandits , 2009 .

[23]  David C. Parkes,et al.  A General Approach to Environment Design with One Agent , 2009, IJCAI.

[24]  Philip R. Cohen,et al.  Plans for Discourse , 2003 .

[25]  Karen E. Lochbaum,et al.  A Collaborative Planning Model of Intentional Structure , 1998, CL.

[26]  Katia P. Sycara,et al.  Distributed Intelligent Agents , 1996, IEEE Expert.

[27]  Ronen I. Brafman,et al.  On Partially Controlled Multi-Agent Systems , 1996, J. Artif. Intell. Res..

[28]  Manuela M. Veloso,et al.  Task Decomposition, Dynamic Role Assignment, and Low-Bandwidth Communication for Real-Time Strategic Teamwork , 1999, Artif. Intell..

[29]  Sandra Zilles,et al.  Models of Cooperative Teaching and Learning , 2011, J. Mach. Learn. Res..

[30]  H. Young,et al.  The Evolution of Conventions , 1993 .

[31]  Yoav Shoham,et al.  Learning against opponents with bounded memory , 2005, IJCAI.

[32]  Noa Agmon,et al.  Leading ad hoc agents in joint action settings with multiple teammates , 2012, AAMAS.

[33]  Sarit Kraus,et al.  Learning Teammate Models for Ad Hoc Teamwork , 2012, AAMAS 2012.

[34]  H. Peyton Young,et al.  The Possible and the Impossible in Multi-Agent Learning , 2007, Artif. Intell..

[35]  Sean Luke,et al.  Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.

[36]  Noa Agmon,et al.  Ad hoc teamwork for leading a flock , 2013, AAMAS.

[37]  Gal A. Kaminka,et al.  Integration of Coordination Mechanisms in the BITE Multi-Robot Architecture , 2007, Proceedings 2007 IEEE International Conference on Robotics and Automation.

[38]  Candace L. Sidner,et al.  Plan parsing for intended response recognition in discourse 1 , 1985, Comput. Intell..

[39]  Daijiro Okada,et al.  Two-person repeated games with finite automata , 2000, Int. J. Game Theory.

[40]  Michael N. Huhns,et al.  Agents for establishing ad hoc cross-organizational teams , 2004, Proceedings. IEEE/WIC/ACM International Conference on Intelligent Agent Technology, 2004. (IAT 2004)..

[41]  Andrew W. Moore,et al.  Locally Weighted Learning for Control , 1997, Artificial Intelligence Review.

[42]  Karen E. Lochbaum,et al.  An Algorithm for Plan Recognition in Collaborative Discourse , 1991, ACL.

[43]  Ayça Kaya,et al.  When Does it Pay to Get Informed? , 2010 .

[44]  Yoav Shoham,et al.  Multi-Agent Reinforcement Learning:a critical survey , 2003 .

[45]  Lehel Csató,et al.  Sparse On-Line Gaussian Processes , 2002, Neural Computation.

[46]  Robert J. Aumann,et al.  16. Acceptable Points in General Cooperative n-Person Games , 1959 .

[47]  Peter Stone,et al.  Online Multiagent Learning against Memory Bounded Adversaries , 2008, ECML/PKDD.

[48]  L. Shapley A Value for n-person Games , 1988 .

[49]  Sarit Kraus,et al.  Ad Hoc Autonomous Agent Teams: Collaboration without Pre-Coordination , 2010, AAAI.

[50]  Edmund H. Durfee,et al.  Recursive Agent Modeling Using Limited Rationality , 1995, ICMAS.

[51]  Vincent Conitzer,et al.  AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents , 2003, Machine Learning.

[52]  M. Cripps,et al.  Strategic Experimentation with Exponential Bandits , 2003 .

[53]  Sarit Kraus,et al.  Empirical evaluation of ad hoc teamwork in the pursuit domain , 2011, AAMAS.

[54]  Sarit Kraus,et al.  The Evolution of Sharedplans , 1999 .

[55]  Michael Wooldridge,et al.  Computational Aspects of Cooperative Game Theory , 2011, KES-AMSTA.

[56]  Moshe Tennenholtz,et al.  Adaptive Load Balancing: A Study in Multi-Agent Learning , 1994, J. Artif. Intell. Res..

[57]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 2005, IEEE Transactions on Neural Networks.

[58]  Dean Pomerleau,et al.  ALVINN, an autonomous land vehicle in a neural network , 2015 .

[59]  Ming Li,et al.  Soft Control on Collective Behavior of a Group of Autonomous Agents By a Shill Agent , 2006, J. Syst. Sci. Complex..

[60]  Daniel H. Grollman,et al.  Dogged Learning for Robots , 2007, Proceedings 2007 IEEE International Conference on Robotics and Automation.

[61]  Noa Agmon,et al.  Role-Based Ad Hoc Teamwork , 2011, AAAI.

[62]  Edmund H. Durfee,et al.  Blissful Ignorance: Knowing Just Enough to Coordinate Well , 1995, ICMAS.

[63]  Erfu Yang,et al.  Multi-robot systems with agent-based reinforcement learning: evolution, opportunities and challenges , 2009, Int. J. Model. Identif. Control..

[64]  H. Robbins Some aspects of the sequential design of experiments , 1952 .

[65]  Peter Stone,et al.  Leading a Best-Response Teammate in an Ad Hoc Team , 2009, AMEC/TADA.

[66]  Richard S. Sutton,et al.  Introduction to Reinforcement Learning , 1998 .

[67]  Jean Oh,et al.  Electric Elves: Applying Agent Technology to Support Human Organizations , 2001, IAAI.

[68]  Gita Sukthankar,et al.  Toward identifying process models in ad hoc and distributed teams , 2008 .

[69]  Manuela M. Veloso,et al.  Modeling and learning synergy for team formation with heterogeneous agents , 2012, AAMAS.

[70]  R. Aumann Subjectivity and Correlation in Randomized Strategies , 1974 .

[71]  Manuela M. Veloso,et al.  Multiagent Systems: A Survey from a Machine Learning Perspective , 2000, Auton. Robots.

[72]  Craig Boutilier,et al.  The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.

[73]  H. Peyton Young,et al.  Individual Strategy and Social Structure , 2020 .

[74]  Feng Wu,et al.  Online Planning for Ad Hoc Autonomous Agent Teams , 2011, IJCAI.

[75]  Tim Roughgarden,et al.  Algorithmic Game Theory , 2007 .

[76]  Kee-Eung Kim,et al.  Learning to Cooperate via Policy Search , 2000, UAI.