Paper information - Monte-Carlo tree search in production management problems

Classical search algorithms rely on the existence of a sufficiently powerful evaluation function for non-terminal states. In many task domains, the development of such an evaluation function requires substantial effort and domain knowledge, or is not even possible. As an alternative, Monte-Carlo evaluation has been successfully applied in such task domains in recent years. In this paper, we apply a search algorithm based on Monte-Carlo evaluation, Monte-Carlo Tree Search, in the task domain of production management problems. These can be defined as single-agent problems which consist of selecting a sequence of actions with side effects, leading to high quantities of one or more goal products. They are challenging and can be constructed with highly variable difficulty. Earlier research yielded an offline learning algorithm that leads to good solutions, but requires a long time to run. We show that Monte-Carlo Tree Search leads to a solution in a shorter period of time than this algorithm, with improved solutions for large problems. Our findings can be generalized to other task domains.
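
The problem class described above (a single agent choosing a sequence of actions with side effects to maximize a goal product) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy products and actions in `ACTIONS`, the goal product `chair`, the fixed step budget, and the use of the standard UCT selection rule with an arbitrary exploration constant `c`. The paper's own selection strategy and problem instances may differ.

```python
import math
import random

# Hypothetical toy problem: each action maps product -> change in quantity.
# Negative entries are the side effects (products consumed by the action).
ACTIONS = {
    "chop":  {"wood": +2},
    "saw":   {"wood": -1, "plank": +2},
    "build": {"plank": -3, "chair": +1},
}
GOAL = "chair"  # assumed goal product

def legal_actions(stock):
    """Actions whose consumption the current stock can cover."""
    return [name for name, delta in ACTIONS.items()
            if all(stock.get(p, 0) + d >= 0 for p, d in delta.items())]

def apply_action(stock, name):
    """Return a new stock dict after applying the named action."""
    new = dict(stock)
    for p, d in ACTIONS[name].items():
        new[p] = new.get(p, 0) + d
    return new

class Node:
    def __init__(self, stock, steps_left, parent=None, action=None):
        self.stock, self.steps_left = stock, steps_left
        self.parent, self.action = parent, action
        self.children = []
        self.untried = legal_actions(stock) if steps_left > 0 else []
        self.visits, self.total = 0, 0.0

def rollout(stock, steps_left, rng):
    """Monte-Carlo evaluation: play random legal actions until the budget ends."""
    while steps_left > 0:
        stock = apply_action(stock, rng.choice(legal_actions(stock)))
        steps_left -= 1
    return stock.get(GOAL, 0)

def uct_score(parent, child, c):
    """Mean reward plus an exploration bonus (standard UCT, an assumption here)."""
    return child.total / child.visits + c * math.sqrt(math.log(parent.visits) / child.visits)

def mcts(stock, steps_left, iterations=2000, c=1.4, seed=0):
    """Return the most-visited first action after the given number of iterations."""
    rng = random.Random(seed)
    root = Node(stock, steps_left)
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded and has children.
        while not node.untried and node.children:
            parent = node
            node = max(parent.children, key=lambda ch: uct_score(parent, ch, c))
        # Expansion: add one untried action as a new child.
        if node.untried:
            action = node.untried.pop()
            child = Node(apply_action(node.stock, action), node.steps_left - 1, node, action)
            node.children.append(child)
            node = child
        # Simulation: random playout from the new node.
        reward = rollout(node.stock, node.steps_left, rng)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).action
```

Calling `mcts({}, 6)` starts from an empty stock with six steps remaining; in this toy setup the only legal first action is `chop`, so that is what the search returns. Rewards here are raw goal-product counts rather than values scaled to [0, 1], so `c` would need tuning on larger instances.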
