Improving adjustable autonomy strategies for time-critical domains

As agents begin to perform complex tasks alongside humans as collaborative teammates, it becomes crucial that the resulting human-multiagent teams adapt to time-critical domains. In such domains, adjustable autonomy has proven useful by allowing a dynamic transfer of decision-making control between humans and agents. However, existing adjustable autonomy algorithms commonly discretize time, which not only inflates algorithm runtimes but also yields inaccurate transfer-of-control policies. In addition, existing techniques fail to address the decision-making inconsistencies often encountered in human-multiagent decision making. To address these limitations, we present a novel approach for Resolving Inconsistencies in Adjustable Autonomy in Continuous Time (RIAACT) that makes three contributions. First, we apply a continuous-time planning paradigm to adjustable autonomy, resulting in high-accuracy transfer-of-control policies. Second, our new adjustable autonomy framework both models and plans for the resolution of inconsistencies between human and agent decisions. Third, we introduce a new model, the Interruptible Action Time-dependent Markov Decision Problem (IA-TMDP), which allows actions to be interrupted at any point in continuous time. We show how to solve IA-TMDPs efficiently and leverage them to plan for the resolution of inconsistencies in RIAACT. All three contributions have been realized and evaluated in a complex disaster response simulation system.
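To make the notion of a continuous-time transfer-of-control policy concrete, the sketch below evaluates a simple "wait for the human until time t, then let the agent act" strategy. It is a minimal illustration, not the paper's IA-TMDP algorithm: the exponential human response-time model, the quality and cost constants, and the numeric search over t are all assumptions made for this example.

```python
import math

# Illustrative sketch (not the paper's algorithm): expected utility of a
# continuous-time transfer-of-control strategy that waits for the human
# until time t_transfer, then hands control to the agent. The exponential
# response-time model and all constants below are assumed for this example.

LAMBDA = 0.5     # assumed rate of the human's exponential response time
Q_HUMAN = 10.0   # assumed expected quality of a human decision
Q_AGENT = 6.0    # assumed expected quality of an autonomous agent decision
COST_RATE = 1.0  # assumed cost per unit of elapsed time (time criticality)


def expected_utility(t_transfer: float) -> float:
    """Expected utility of transferring control at time t_transfer."""
    # Probability the human responds before control is transferred.
    p_human = 1.0 - math.exp(-LAMBDA * t_transfer)
    if p_human > 0.0:
        # Mean response time of an exponential truncated to [0, t_transfer].
        e_resp = 1.0 / LAMBDA - (
            t_transfer * math.exp(-LAMBDA * t_transfer) / p_human
        )
    else:
        e_resp = 0.0
    # Human responds in time: human-quality decision after waiting e_resp.
    # Otherwise: the agent decides at t_transfer after the full waiting cost.
    return p_human * (Q_HUMAN - COST_RATE * e_resp) + (1.0 - p_human) * (
        Q_AGENT - COST_RATE * t_transfer
    )


if __name__ == "__main__":
    # Coarse numeric search over the transfer point, shown for readability
    # only; the point of the continuous-time formulation is precisely to
    # avoid this kind of time discretization.
    best_t = max((i * 0.01 for i in range(1001)), key=expected_utility)
    print(f"best transfer time ~ {best_t:.2f}, "
          f"EU = {expected_utility(best_t):.3f}")
```

Under these assumed parameters, waiting is worthwhile only while the expected gain in decision quality from a human response outweighs the accruing time cost; the optimal transfer point is where those two effects balance in continuous time.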
