Knowledge Generation for Improving Simulations in UCT for General Game Playing

General Game Playing (GGP) aims at developing game-playing agents that are able to play a variety of games and, in the absence of pre-programmed game-specific knowledge, become proficient players. Most GGP players have used standard tree-search techniques enhanced by automatic heuristic learning. The UCT algorithm, a simulation-based tree search, is a newer approach and has been used successfully in GGP. However, it relies heavily on random simulations to assign values to unvisited nodes and to select nodes when descending the tree, which can slow UCT's convergence. In this paper, we discuss the generation and evolution of domain-independent knowledge using both state and move patterns, which is then used to guide the simulations in UCT. To test the improvements, we create matches between a player using the standard UCT algorithm and one using UCT enhanced with knowledge.
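For context, the node-selection step of UCT that the abstract refers to is typically the UCB1 rule: descend to the child maximizing average reward plus an exploration bonus. The following is a minimal illustrative sketch, not the paper's implementation; the data layout (a dict of per-move visit counts and accumulated rewards) and the constant `c` are assumptions.

```python
import math

def ucb1_select(children, total_visits, c=1.4):
    """Pick the child move maximizing the UCB1 score: mean reward plus an
    exploration bonus that shrinks as a child is visited more often.
    `children` maps a move to a (visit_count, total_reward) pair."""
    best_move, best_score = None, float("-inf")
    for move, (visits, reward) in children.items():
        if visits == 0:
            return move  # unvisited children are expanded first
        score = reward / visits + c * math.sqrt(math.log(total_visits) / visits)
        if score > best_score:
            best_move, best_score = move, score
    return best_move
```

Below the tree's frontier, plain UCT plays moves uniformly at random to the end of the game; the knowledge-guided variant discussed in the paper biases those rollout moves instead of choosing them uniformly.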
