UCT for Tactical Assault Planning in Real-Time Strategy Games

We consider the problem of tactical assault planning in real-time strategy games, where a team of friendly agents must launch an assault on an enemy. This problem offers many challenges, including a highly dynamic and uncertain environment, multiple agents, durative actions, numeric attributes, and differing optimization objectives. While the dynamics of this problem are quite complex, it is often possible to provide or learn a coarse simulation-based model of a tactical domain, which makes Monte-Carlo planning an attractive approach. In this paper, we investigate the use of UCT, a recent Monte-Carlo planning algorithm, for this problem. UCT has recently shown impressive successes in the area of games, particularly Go, but has not yet been considered in the context of multiagent tactical planning. We discuss the challenges of adapting UCT to our domain and describe an implementation that allows for the optimization of user-specified objective functions. We present an evaluation of our approach on a range of tactical assault problems with different objectives in the RTS game Wargus. The results indicate that our planner generates superior plans compared to several baselines and a human player.
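At its core, UCT treats action selection at each search-tree node as a multi-armed bandit and picks the child maximizing the UCB1 score. The sketch below illustrates only this generic selection rule, not the paper's Wargus implementation; the action names, the child dictionaries, and the exploration constant `c` are illustrative assumptions.

```python
import math

def ucb1(child_value, child_visits, parent_visits, c=1.414):
    """UCB1 score: average rollout return plus an exploration bonus.

    Unvisited children score infinity, so every action is tried once
    before any is revisited.
    """
    if child_visits == 0:
        return float("inf")
    exploit = child_value / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore

def select_child(children, parent_visits):
    """The selection step of UCT: pick the child with the highest UCB1 score."""
    return max(
        children,
        key=lambda ch: ucb1(ch["value"], ch["visits"], parent_visits),
    )

# Toy example: three hypothetical assault actions with accumulated
# rollout returns ("value") and visit counts from earlier simulations.
children = [
    {"name": "attack_north", "value": 6.0, "visits": 10},
    {"name": "attack_south", "value": 3.0, "visits": 4},
    {"name": "hold",         "value": 0.0, "visits": 0},
]
best = select_child(children, parent_visits=14)
print(best["name"])  # the unvisited action is selected first
```

In a full planner this selection rule is applied recursively down the tree, a simulation-based model plays out the remainder of the battle, and the observed objective value is backed up along the visited path.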
