Performance Analysis of Two Parallel Game-Tree Search Applications

Game-tree search plays an important role in artificial intelligence. In this paper we analyze the scalability of two parallel game-tree search applications for chess on two shared-memory multiprocessor systems. One is a chess-playing program based on the recently proposed Parallel Randomized Best-First Minimax search algorithm (PRBFM); the other is Crafty, a state-of-the-art alpha-beta-based chess-playing program. The analysis shows that the hash-table and dynamic tree-splitting operations used in Crafty incur large scalability penalties, while PRBFM avoids those penalties by using a fundamentally different search strategy. Our micro-architectural analysis also shows that PRBFM is memory-friendly while Crafty is latency-sensitive, and that neither is bandwidth-bound. Although PRBFM is slower than Crafty in sequential performance, it will be much faster than Crafty on medium-scale multiprocessor systems because of its much better scalability. This makes PRBFM a promising parallel game-tree search algorithm for future large-scale chip multiprocessor systems.
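The crossover argument (a sequentially slower program overtaking a faster one once enough processors are available) can be illustrated with a minimal sketch. It assumes, purely for illustration, that each program's speedup follows Amdahl's law with a fixed serial fraction; this is not the performance model of the paper, and the function names and all parameter values are hypothetical rather than measurements of PRBFM or Crafty.

```python
from typing import Optional


def amdahl_speedup(processors: int, serial_fraction: float) -> float:
    """Speedup on `processors` CPUs under Amdahl's law (an assumed model)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def crossover_processors(
    seq_slowdown: float,      # hypothetical: how many times slower the scalable program is on 1 CPU
    serial_scalable: float,   # hypothetical serial fraction of the more scalable program
    serial_limited: float,    # hypothetical serial fraction of the less scalable program
    max_processors: int = 1024,
) -> Optional[int]:
    """Return the smallest processor count at which the slower-but-more-scalable
    program becomes strictly faster, or None if that never happens up to
    `max_processors`."""
    for p in range(1, max_processors + 1):
        # Throughput relative to the faster program running on one processor.
        scalable_rate = amdahl_speedup(p, serial_scalable) / seq_slowdown
        limited_rate = amdahl_speedup(p, serial_limited)
        if scalable_rate > limited_rate:
            return p
    return None
```

With these illustrative inputs, `crossover_processors(2.0, 0.0, 0.4)` returns 4: a program that is twice as slow sequentially but scales perfectly overtakes one with a 40% serial fraction at four processors. If both programs have the same serial fraction, the function returns `None`, since the sequential gap is never closed.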