Imitation Learning for Playing Shogi Based on Generative Adversarial Networks

In imitation learning for games, AI programs typically learn evaluation and decision-making methods from professional players' game records. However, compared with the astronomical number of possible game states, the supply of top players' records is extremely limited, and this scarcity of high-quality training data can become the bottleneck when training an artificial intelligence. We propose introducing the idea of Generative Adversarial Networks (GANs) into game programming and validate its effectiveness on shogi, a Japanese chess variant. Experiments show that the proposed method alleviates the data-insufficiency problem and yields more competitive AI programs than conventional supervised training methods.
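The adversarial imitation idea can be illustrated with a deliberately small sketch. Everything below is an assumption for illustration, not the paper's architecture: states are random feature vectors, "moves" are four classes, a fixed linear policy stands in for professional game records, the generator is a linear policy, and the discriminator is a bilinear scorer over (state, move) pairs trained to tell expert pairs from generated ones, while the generator is updated by REINFORCE using the discriminator's score as reward.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: 8-dim "board states", 4 possible "moves".
STATE_DIM, N_MOVES = 8, 4
W_expert = rng.normal(size=(STATE_DIM, N_MOVES))  # stands in for game records

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sample_moves(states, W):
    """Sample one move per state from a linear softmax policy."""
    probs = softmax(states @ W)
    return np.array([rng.choice(N_MOVES, p=p) for p in probs]), probs

# Generator policy (trained) and bilinear discriminator: score = s @ D[:, move].
W_gen = np.zeros((STATE_DIM, N_MOVES))
D = np.zeros((STATE_DIM, N_MOVES))

def disc_prob(states, moves):
    """Discriminator's P(pair came from the expert records), per sample."""
    logits = np.einsum("ij,ij->i", states, D[:, moves].T)
    return 1.0 / (1.0 + np.exp(-logits))

lr_d, lr_g, batch = 0.2, 0.1, 64
for _ in range(200):
    s = rng.normal(size=(batch, STATE_DIM))
    a_exp, _ = sample_moves(s, W_expert)   # "expert records"
    a_gen, p_gen = sample_moves(s, W_gen)  # generated play

    # Discriminator step: label expert pairs 1, generated pairs 0.
    for moves, target in ((a_exp, 1.0), (a_gen, 0.0)):
        err = target - disc_prob(s, moves)  # log-likelihood gradient factor
        for i, m in enumerate(moves):
            D[:, m] += lr_d * err[i] * s[i] / batch

    # Generator step: REINFORCE with the discriminator score as reward,
    # pushing the policy toward moves that look expert-like.
    reward = disc_prob(s, a_gen) - 0.5
    for i, m in enumerate(a_gen):
        g = -p_gen[i].copy()
        g[m] += 1.0                         # gradient of log pi(m | s)
        W_gen += lr_g * reward[i] * np.outer(s[i], g) / batch

# Fraction of held-out states where the learned policy's top move matches
# the expert's top move (chance level here is about 0.25).
s_test = rng.normal(size=(1000, STATE_DIM))
agreement = np.mean(
    np.argmax(s_test @ W_gen, axis=1) == np.argmax(s_test @ W_expert, axis=1)
)
print(f"top-move agreement with expert: {agreement:.2f}")
```

The point of the adversarial setup is that the generator never needs more labeled expert data than the records provide: the discriminator supplies a learned similarity signal that keeps shaping the policy even after the records themselves would be exhausted by plain supervised training.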
