Learning of Evaluation Functions to Realize Playing Styles in Shogi

This paper presents a method for giving a computer player an intended playing style through machine learning of its evaluation function. Recent advances in machine learning have made it possible to tune the feature weight vector of an evaluation function automatically. Building a strong player requires as many moves from strong players' game records as possible, but the number of available records shrinks when we restrict attention to a specific playing style. To pursue both playing style and playing strength, we present a three-step learning procedure: (1) classifying moves with respect to playing style, (2) training the weight vector of an evaluation function on the whole set of game records to maximize playing strength, and (3) carefully modifying the weight vector to improve agreement with the moves of the intended playing style. We applied our method to realize defense-oriented and attack-oriented players in shogi and tested them in self-play against the original version. The results confirmed that the method successfully adjusted the evaluation functions: the frequency of defensive moves increased or decreased significantly in accordance with the game records used, while the winning ratio stayed at almost 50%.
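The three steps can be illustrated with a small sketch. It is not the authors' actual procedure: the linear evaluation function, the three toy features, the pairwise logistic loss, the L2 penalty that keeps the tuned weights close to the base weights, and all names (`train`, `agreement`, `RECORDS`) are assumptions made for illustration, and the style labels of step 1 are simply taken as given.

```python
import math

def dot(w, f):
    return sum(wi * fi for wi, fi in zip(w, f))

def agreement(w, examples):
    """Fraction of examples where the recorded move scores strictly highest."""
    hits = sum(1 for played, alts in examples
               if all(dot(w, played) > dot(w, a) for a in alts))
    return hits / len(examples)

def train(w, examples, anchor=None, penalty=0.0, lr=0.1, epochs=200):
    """Gradient descent on a pairwise logistic loss.

    If `anchor` is given, an L2 penalty pulls the weights towards it,
    which is one (assumed) way to modify the weights "carefully".
    """
    w = list(w)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for played, alts in examples:
            for alt in alts:
                diff = [p - a for p, a in zip(played, alt)]
                # Derivative of log(1 + exp(-margin)) with respect to margin.
                coeff = -1.0 / (1.0 + math.exp(dot(w, diff)))
                for i, d in enumerate(diff):
                    grad[i] += coeff * d
        if anchor is not None:
            for i in range(len(w)):
                grad[i] += penalty * (w[i] - anchor[i])
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w

# Hypothetical toy records: (features after the played move,
# features after each alternative move, style label).
# Feature order (assumed): [material, king_safety, attack_pressure].
RECORDS = [
    ([1, 0, 1], [[0, 1, 0]], "attack"),
    ([1, 1, 0], [[0, 0, 1]], "defense"),
    ([0, 1, 0], [[0, 0, 1]], "defense"),
    ([0, 0, 1], [[0, 1, 0]], "attack"),
    ([1, 0, 0], [[0, 0, 0]], "neutral"),
]

# Step 1: classify moves by style (here the labels are assumed to be given).
everything = [(p, a) for p, a, _ in RECORDS]
defensive = [(p, a) for p, a, s in RECORDS if s == "defense"]

# Step 2: train on the whole set of records for playing strength.
base = train([0.0, 0.0, 0.0], everything)

# Step 3: adjust towards the defensive moves while staying near `base`.
tuned = train(base, defensive, anchor=base, penalty=1.0)

print("agreement with defensive moves, base :", agreement(base, defensive))
print("agreement with defensive moves, tuned:", agreement(tuned, defensive))
```

In this toy setting the penalty term plays the role of the "careful" modification: a larger `penalty` keeps the tuned weights closer to the strength-oriented weights, while a smaller one lets them follow the style-specific moves more freely.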
