Paper information - Toward human-multiagent teams

Toward human-multiagent teams

One of the most fundamental challenges of building a human-multiagent team is adjustable autonomy, a process in which control over team decisions is dynamically transferred between humans and agents. This thesis studies adjustable autonomy in the context of a human interacting with a team of agents and focuses on four issues that arise when addressing this team-level adjustable autonomy problem in real-time uncertain domains. First, humans and agents may differ significantly in their worldviews and capabilities. This difference leads to inconsistencies in how humans and agents solve problems. Despite such inconsistencies, previous work has rigidly assumed the infallibility of human decisions. However, in some cases following the human's decisions leads to worse human-multiagent team performance. Second, the team should manage the uncertainty of action durations and plan for the optimal action at any point in time. This challenge is crucial because these human-multiagent teams work in real time under strict deadlines, and the duration of actions that involve a human is particularly uncertain. Third, the team needs to be able to plan for the optimal time to interrupt certain actions, because actions may take an uncertain amount of time while the deadline approaches. The human-multiagent team may benefit from attempting an action for a given amount of time and, if it does not finish, interrupting it in order to try another action that has a higher expected reward. Fourth, team-level adjustable autonomy is an inherently distributed and complex problem that cannot be solved optimally and completely online.

My thesis makes four contributions to the field in order to address these challenges. First, I have included in the adjustable autonomy framework the modeling of the resolution of inconsistencies between human and agent views. This diverges from previous work on adjustable autonomy, which traditionally assumes that the human is infallible and treats human decisions as rigid. Instead, it puts humans and agents on an equal footing, allowing each to identify possible team performance problems. I have developed new "resolution adjustable autonomy strategies" that recognize inconsistencies and provide a framework to decide whether a resolution is beneficial. Second, in order to address the challenges brought about by dealing with time, I have modeled these new adjustable autonomy strategies using TMDPs (Time-dependent Markov Decision Problems). This improves on previous approaches, which used a discretized time model and less efficient solutions. Third, I have introduced a new model, Interruptible TMDPs (ITMDPs), that allows an action to be interrupted at any point in continuous time. This results in more accurate modeling of actions and produces additional time-dependent policies that guide interruption during the execution of an action. Fourth, I have created a hybrid approach that decomposes the team-level adjustable autonomy problem into a separate ITMDP for each team decision. Team-based logics are then used to coordinate and execute the team actions that are present in the ITMDP.

In addition to developing these techniques, I have conducted experimental evaluations that demonstrate the contributions of this approach. The approach has been realized in a system that I have constructed, DEFACTO (Demonstrating Effective Flexible Agent Coordination of Teams via Omnipresence), which incorporates this approach to team-level adjustable autonomy along with agent coordination reasoning and a multi-perspective view of the team for the human. DEFACTO has been applied to an urban disaster response domain and used for incident commander training. The Los Angeles Fire Department has been supportive and has given valuable feedback that has shaped the system.
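The interruption idea in the abstract (attempt an action for some time, then switch to another action if it has not finished) can be illustrated with a toy calculation. The sketch below is not the ITMDP model or the solution method from the thesis. It assumes a single primary action whose duration is uniformly distributed, a fallback action with a fixed duration, and a hard deadline, and it uses a simple grid search over candidate interruption times as a stand-in for a continuous-time solution. All function names, parameters, and numbers are hypothetical and chosen only for illustration.

```python
def uniform_cdf(t, low, high):
    """P(duration <= t) for a duration uniform on [low, high] (assumed model)."""
    if t <= low:
        return 0.0
    if t >= high:
        return 1.0
    return (t - low) / (high - low)


def expected_reward(tau, deadline, r_primary, r_fallback, d_fallback, low, high):
    """Expected reward of the policy: start the primary action at time 0,
    interrupt it at time tau if unfinished, then run the fallback action.

    A reward is earned only if the corresponding action finishes by the deadline.
    """
    tau = min(tau, deadline)  # waiting past the deadline cannot help
    p_done = uniform_cdf(tau, low, high)
    # The fallback pays off only if it still fits before the deadline.
    fallback_value = r_fallback if tau + d_fallback <= deadline else 0.0
    return p_done * r_primary + (1.0 - p_done) * fallback_value


def best_interrupt_time(deadline, r_primary, r_fallback, d_fallback,
                        low, high, steps=1000):
    """Grid search over interruption times in [0, deadline] (hypothetical helper)."""
    candidates = [deadline * i / steps for i in range(steps + 1)]
    return max(
        candidates,
        key=lambda tau: expected_reward(
            tau, deadline, r_primary, r_fallback, d_fallback, low, high
        ),
    )


if __name__ == "__main__":
    # Illustrative numbers only: a slow, high-reward action (e.g. waiting for
    # human input) and a quick, lower-reward fallback (e.g. an agent decision).
    params = dict(deadline=10.0, r_primary=10.0, r_fallback=6.0,
                  d_fallback=2.0, low=0.0, high=20.0)
    tau = best_interrupt_time(**params)
    print("interrupt at:", tau)
    print("expected reward:", expected_reward(tau, **params))
```

With these illustrative numbers, waiting until the last moment at which the fallback still fits before the deadline gives a higher expected reward than either never interrupting or skipping the primary action entirely, which is the kind of trade-off a time-dependent interruption policy is meant to capture.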

Milind Tambe | Nathan Schurr
