Probabilistic Plan Recognition in Multiagent Systems

We present a theoretical framework for online probabilistic plan recognition in cooperative multiagent systems. Our model extends the Abstract Hidden Markov Model (AHMM) (Bui, Venkatesh, & West 2002) and consists of a hierarchical dynamic Bayes network that allows reasoning about the interaction among multiple cooperating agents. We provide an in-depth analysis of two policy termination schemes for concurrent action introduced by Rohanimanesh & Mahadevan (2003): T_all and T_any. In the T_all scheme, a joint policy terminates only when all agents have terminated executing their individual policies. In the T_any scheme, a joint policy terminates as soon as any agent terminates executing its individual policy. Since exact inference is intractable, we describe an approximate algorithm using Rao-Blackwellized particle filtering. Our approximate inference procedure reduces the complexity from time exponential in both N (the number of agents) and K (the number of levels in the plan hierarchy) to time linear in N and linear in K̂ ≤ K (the lowest level of plan coordination) under the T_all termination scheme, and to time O(N log N) and linear in K̂ under the T_any termination scheme.
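To make the two termination schemes and the inference strategy concrete, here is a minimal Python sketch. The termination predicates follow the definitions above; `rbpf_step` shows one generic bootstrap Rao-Blackwellized particle filtering step on an assumed switching model. The names and model inputs (`trans_d`, `trans_z`, `obs_lik`, and the plan/state factorization) are illustrative assumptions for exposition, not the paper's implementation.

```python
import numpy as np

# T_all / T_any joint-policy termination schemes (Rohanimanesh &
# Mahadevan 2003): given each agent's individual termination flag,
# decide whether the joint policy terminates at this step.
def t_all(terminated_flags):
    # Joint policy ends only when *all* agents have terminated.
    return all(terminated_flags)

def t_any(terminated_flags):
    # Joint policy ends as soon as *any* agent terminates.
    return any(terminated_flags)

# One bootstrap Rao-Blackwellized particle filter step on a generic
# switching model: each particle samples a discrete "plan" variable d
# and carries an exact distribution over the remaining state z, so z
# is marginalized analytically instead of sampled.
# Assumed (illustrative) model inputs:
#   trans_d[i]      : p(d' | d = i), a probability vector
#   trans_z[i]      : p(z' | z, d' = i), a row-stochastic matrix
#   obs_lik[i][k,o] : p(obs = o | z' = k, d' = i)
def rbpf_step(particles, weights, obs, trans_d, trans_z, obs_lik):
    new_particles, new_weights = [], []
    for (d, z_dist), w in zip(particles, weights):
        # Sample the hard-to-marginalize discrete variable from its prior.
        d_new = np.random.choice(len(trans_d[d]), p=trans_d[d])
        # Exact prediction of the conditional distribution over z.
        z_pred = trans_z[d_new].T @ z_dist
        # Exact Bayes update with the observation likelihood.
        z_post = z_pred * obs_lik[d_new][:, obs]
        marginal = z_post.sum()           # p(obs | sampled plan history)
        new_particles.append((d_new, z_post / marginal))
        new_weights.append(w * marginal)  # incremental importance weight
    weights = np.asarray(new_weights)
    return new_particles, weights / weights.sum()  # resampling omitted
```

The Rao-Blackwellization pattern is the point of the sketch: only the discrete plan variable is sampled, while the remaining state is integrated out exactly within each particle. In the paper's setting, this role is played by the agents' policy and termination variables, which is what reduces the complexity from exponential to linear in N and K̂.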

[1] Svetha Venkatesh et al. Policy Recognition in the Abstract Hidden Markov Model. J. Artif. Intell. Res., 2002.

[2] Nando de Freitas et al. Rao-Blackwellised Particle Filtering for Dynamic Bayesian Networks. UAI, 2000.

[3] Stuart J. Russell et al. Stochastic simulation algorithms for dynamic probabilistic networks. UAI, 1995.

[4] Sridhar Mahadevan et al. Hierarchical multi-agent reinforcement learning. AGENTS '01, 2001.

[5] Simon J. Godsill et al. On sequential Monte Carlo sampling methods for Bayesian filtering. Stat. Comput., 2000.

[6] Aaron F. Bobick et al. A Framework for Recognizing Multi-Agent Action from Visual Evidence. AAAI/IAAI, 1999.

[7] Sridhar Mahadevan et al. Learning to Take Concurrent Actions. NIPS, 2002.

[8] Maja J. Mataric et al. Reinforcement Learning in the Multi-Robot Domain. Auton. Robots, 1997.

[9] Craig Boutilier. Sequential Optimality and Coordination in Multiagent Systems. IJCAI, 1999.

[10] Yaser Al-Onaizan et al. On being a teammate: experiences acquired in the design of RoboCup teams. AGENTS '99, 1999.

[11] David J. Spiegelhalter et al. Local computations with probabilities on graphical structures and their application to expert systems. 1990.

[12] Xavier Boyen et al. Tractable Inference for Complex Stochastic Processes. UAI, 1998.

[13] Sridhar Mahadevan et al. Recent Advances in Hierarchical Reinforcement Learning. Discret. Event Dyn. Syst., 2003.

[14] G. Casella et al. Rao-Blackwellisation of sampling schemes. 1996.

[15] Manuela M. Veloso et al. Coaching a simulated soccer team by opponent model recognition. AGENTS '01, 2001.

[16] Craig Boutilier et al. Context-Specific Independence in Bayesian Networks. UAI, 1996.

[17] Ross D. Shachter. Evaluating Influence Diagrams. Oper. Res., 1986.