Probabilistic Plan Recognition in Multiagent Systems

We present a theoretical framework for online probabilistic plan recognition in cooperative multiagent systems. Our model extends the Abstract Hidden Markov Model (AHMM) (Bui, Venkatesh, & West 2002) and consists of a hierarchical dynamic Bayes network that allows reasoning about the interaction among multiple cooperating agents. We provide an in-depth analysis of two policy termination schemes for concurrent action introduced by Rohanimanesh & Mahadevan (2003): T_all and T_any. In the T_all scheme, a joint policy terminates only when all agents have terminated executing their individual policies. In the T_any scheme, a joint policy terminates as soon as any agent terminates executing its individual policy. Since exact inference is intractable, we describe an approximate algorithm using Rao-Blackwellized particle filtering. Our approximate inference procedure reduces the complexity from time exponential in both N (the number of agents) and K (the number of levels in the plan hierarchy) to time linear in N and linear in K̂ ≤ K (the lowest level of plan coordination) under the T_all termination scheme, and to time O(N log N) and linear in K̂ under the T_any termination scheme.
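To make the two termination schemes and the inference strategy concrete, here is a minimal Python sketch. The termination predicates follow the definitions above; `rbpf_step` shows one generic bootstrap Rao-Blackwellized particle filtering step on an assumed switching model. The names and model inputs (`trans_d`, `trans_z`, `obs_lik`, and the plan/state factorization) are illustrative assumptions for exposition, not the paper's implementation.

```python
import numpy as np

# T_all / T_any joint-policy termination schemes (Rohanimanesh &
# Mahadevan 2003): given each agent's individual termination flag,
# decide whether the joint policy terminates at this step.
def t_all(terminated_flags):
    # Joint policy ends only when *all* agents have terminated.
    return all(terminated_flags)

def t_any(terminated_flags):
    # Joint policy ends as soon as *any* agent terminates.
    return any(terminated_flags)

# One bootstrap Rao-Blackwellized particle filter step on a generic
# switching model: each particle samples a discrete "plan" variable d
# and carries an exact distribution over the remaining state z, so z
# is marginalized analytically instead of sampled.
# Assumed (illustrative) model inputs:
#   trans_d[i]      : p(d' | d = i), a probability vector
#   trans_z[i]      : p(z' | z, d' = i), a row-stochastic matrix
#   obs_lik[i][k,o] : p(obs = o | z' = k, d' = i)
def rbpf_step(particles, weights, obs, trans_d, trans_z, obs_lik):
    new_particles, new_weights = [], []
    for (d, z_dist), w in zip(particles, weights):
        # Sample the hard-to-marginalize discrete variable from its prior.
        d_new = np.random.choice(len(trans_d[d]), p=trans_d[d])
        # Exact prediction of the conditional distribution over z.
        z_pred = trans_z[d_new].T @ z_dist
        # Exact Bayes update with the observation likelihood.
        z_post = z_pred * obs_lik[d_new][:, obs]
        marginal = z_post.sum()           # p(obs | sampled plan history)
        new_particles.append((d_new, z_post / marginal))
        new_weights.append(w * marginal)  # incremental importance weight
    weights = np.asarray(new_weights)
    return new_particles, weights / weights.sum()  # resampling omitted
```

The Rao-Blackwellization pattern is the point of the sketch: only the discrete plan variable is sampled, while the remaining state is integrated out exactly within each particle. In the paper's setting, this role is played by the agents' policy and termination variables, which is what reduces the complexity from exponential to linear in N and K̂.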

[1] Svetha Venkatesh et al. Policy Recognition in the Abstract Hidden Markov Model. J. Artif. Intell. Res., 2002.

[2] Nando de Freitas et al. Rao-Blackwellised Particle Filtering for Dynamic Bayesian Networks. UAI, 2000.

[3] Stuart J. Russell et al. Stochastic simulation algorithms for dynamic probabilistic networks. UAI, 1995.

[4] Sridhar Mahadevan et al. Hierarchical multi-agent reinforcement learning. AGENTS '01, 2001.

[5] Simon J. Godsill et al. On sequential Monte Carlo sampling methods for Bayesian filtering. Stat. Comput., 2000.

[6] Aaron F. Bobick et al. A Framework for Recognizing Multi-Agent Action from Visual Evidence. AAAI/IAAI, 1999.

[7] Sridhar Mahadevan et al. Learning to Take Concurrent Actions. NIPS, 2002.

[8] Maja J. Mataric et al. Reinforcement Learning in the Multi-Robot Domain. Auton. Robots, 1997.

[9] Craig Boutilier. Sequential Optimality and Coordination in Multiagent Systems. IJCAI, 1999.

[10] Yaser Al-Onaizan et al. On being a teammate: experiences acquired in the design of RoboCup teams. AGENTS '99, 1999.

[11] David J. Spiegelhalter et al. Local computations with probabilities on graphical structures and their application to expert systems. 1990.

[12] Xavier Boyen et al. Tractable Inference for Complex Stochastic Processes. UAI, 1998.

[13] Sridhar Mahadevan et al. Recent Advances in Hierarchical Reinforcement Learning. Discret. Event Dyn. Syst., 2003.

[14] G. Casella et al. Rao-Blackwellisation of sampling schemes. 1996.

[15] Manuela M. Veloso et al. Coaching a simulated soccer team by opponent model recognition. AGENTS '01, 2001.

[16] Craig Boutilier et al. Context-Specific Independence in Bayesian Networks. UAI, 1996.

[17] Ross D. Shachter. Evaluating Influence Diagrams. Oper. Res., 1986.