Assistant Agents for Sequential Planning Problems

Optimal planning under uncertainty in collaborative multi-agent domains is known to be computationally intractable, yet the problem still demands a solution. This thesis will explore principled approximation methods that yield tractable planning approaches for AI assistants, allowing an assistant to infer a human's intentions and help the human achieve their goals. AI assistants are ubiquitous in video games, which makes games attractive domains for applying these planning techniques. However, games are also challenging domains, typically featuring very large state spaces and long planning horizons. The approaches in this thesis will leverage recent advances in Monte-Carlo search, approximation of stochastic dynamics by deterministic dynamics, and hierarchical action representation to handle domains that are too complex for existing state-of-the-art planners. These planning techniques will be demonstrated across a range of video game domains.
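
The Monte-Carlo search advances alluded to above center on bandit-based planning (UCT), which copes with large state spaces by building only a sparse lookahead tree: it samples trajectories from a generative model and treats action selection at each tree node as a multi-armed bandit. The following is a minimal illustrative sketch of that idea, not the thesis's actual planner; it assumes a hypothetical simulator object `sim` exposing `actions(state)` and `step(state, action) -> (next_state, reward, done)` over hashable states.

```python
import math
import random

class Node:
    """Search-tree node: visit counts and mean-return estimates per action."""
    def __init__(self, actions):
        self.n = 0                          # total visits to this node
        self.n_a = {a: 0 for a in actions}  # visits per action
        self.q = {a: 0.0 for a in actions}  # mean sampled return per action
        self.children = {}                  # (action, next_state) -> Node

def uct_action(node, c=1.4):
    """Pick the action maximizing the UCB1 score; try untried actions first."""
    for a, n in node.n_a.items():
        if n == 0:
            return a
    return max(node.n_a, key=lambda a: node.q[a]
               + c * math.sqrt(math.log(node.n) / node.n_a[a]))

def rollout(sim, state, depth, gamma):
    """Estimate a leaf's value with a uniformly random playout."""
    ret, disc = 0.0, 1.0
    for _ in range(depth):
        state, r, done = sim.step(state, random.choice(sim.actions(state)))
        ret += disc * r
        disc *= gamma
        if done:
            break
    return ret

def simulate(sim, node, state, depth, gamma):
    """One simulation: descend by UCB1, expand one leaf, back up the return."""
    if depth == 0:
        return 0.0
    a = uct_action(node)
    next_state, r, done = sim.step(state, a)
    key = (a, next_state)
    if done:
        g = r
    elif key not in node.children:              # expand a new leaf
        node.children[key] = Node(sim.actions(next_state))
        g = r + gamma * rollout(sim, next_state, depth - 1, gamma)
    else:                                       # recurse down the tree
        g = r + gamma * simulate(sim, node.children[key], next_state,
                                 depth - 1, gamma)
    node.n += 1
    node.n_a[a] += 1
    node.q[a] += (g - node.q[a]) / node.n_a[a]  # incremental mean update
    return g

def plan(sim, state, n_sims=1000, depth=50, gamma=0.95):
    """Run n_sims simulations from `state`, then act greedily at the root."""
    root = Node(sim.actions(state))
    for _ in range(n_sims):
        simulate(sim, root, state, depth, gamma)
    return max(root.q, key=root.q.get)
```

Because such a planner needs only sampled transitions rather than explicit transition matrices, the same loop applies whether the simulator is the true stochastic game or a determinized approximation of it, which is what makes determinization and hierarchical macro-actions natural companions to Monte-Carlo search.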
