Pruning in UCT Algorithm

UCT is a Monte-Carlo planning algorithm that, within a given amount of time, computes near-optimal solutions for Markovian decision processes with large state spaces. It has gained much attention from the research community and has been used in many applications since its publication in 2006, because it significantly improves the effectiveness of Monte-Carlo planning. This paper proposes a modification of the UCT algorithm that prunes certain Markovian decision process actions and their associated states during the Monte-Carlo planning computation. The pruning of actions and states is based on properties of the UCB algorithms underlying UCT. This paper proves that the pruned actions and states are highly unlikely to lie on the solution path returned by the UCT algorithm, so the modified algorithm is almost as good as the original. Additionally, the pruning modification may reduce the size of the Markovian decision process state space and thus improve the effectiveness of the original algorithm. Experimental results in computer Go demonstrate the effectiveness of pruning in the UCT algorithm.
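
To make the idea of UCB-based pruning concrete, the following is a minimal illustrative sketch in Python. It is not the pruning rule proposed in this paper, which the abstract does not specify. It assumes the standard UCB1 index (empirical mean plus an exploration bonus) and a hypothetical rule that marks an action as prunable when its upper confidence bound falls below the lower confidence bound of some other action. The function names, the `stats` layout, and the exploration constant `c` are all assumptions made for illustration.

```python
import math


def ucb1_index(mean_reward, visits, total_visits, c=math.sqrt(2)):
    """Standard UCB1 index: empirical mean plus an exploration bonus.

    Unvisited actions get an infinite index so they are tried first.
    """
    if visits == 0:
        return math.inf
    return mean_reward + c * math.sqrt(math.log(total_visits) / visits)


def confidence_interval(mean_reward, visits, total_visits, c=math.sqrt(2)):
    """Symmetric interval around the empirical mean, using the UCB1 radius."""
    if visits == 0:
        return (-math.inf, math.inf)
    radius = c * math.sqrt(math.log(total_visits) / visits)
    return (mean_reward - radius, mean_reward + radius)


def prunable_actions(stats, c=math.sqrt(2)):
    """Return actions whose upper bound is below the best lower bound.

    `stats` maps each action to a (mean_reward, visits) pair.
    This rule is a hypothetical stand-in, not the paper's criterion.
    """
    total_visits = sum(visits for _, visits in stats.values())
    if total_visits == 0:
        return set()
    intervals = {
        action: confidence_interval(mean, visits, total_visits, c)
        for action, (mean, visits) in stats.items()
    }
    best_lower = max(lower for lower, _ in intervals.values())
    return {
        action
        for action, (_, upper) in intervals.items()
        if upper < best_lower
    }


if __name__ == "__main__":
    # Hypothetical statistics at one node of the search tree.
    node_stats = {"a": (0.9, 1000), "b": (0.1, 1000), "c": (0.5, 0)}
    print(prunable_actions(node_stats))
```

In this sketch an unvisited action is never pruned, because its interval is unbounded; only actions that have been sampled often enough to be confidently worse than another action are removed from further consideration.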