Autonomous Agents in Snake Game via Deep Reinforcement Learning

Since DeepMind pioneered a deep reinforcement learning (DRL) model to play Atari games, DRL has become a commonly adopted method for enabling agents to learn complex control policies in video games. However, such approaches still need improvement when applied to more challenging scenarios in which reward signals are sparse and delayed. In this paper, we develop a refined DRL model that enables an autonomous agent to play the classic Snake Game, in which the constraints become stricter as the game progresses. Specifically, we employ a convolutional neural network (CNN) trained with a variant of Q-learning. Moreover, we propose a carefully designed reward mechanism to train the network properly, adopt a training gap strategy that temporarily bypasses training after the location of the target changes, and introduce a dual experience replay method that categorizes experiences into separate pools for better training efficacy. The experimental results show that our agent outperforms the baseline model and surpasses human-level performance at the Snake Game.
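To make the dual experience replay idea concrete, the following is a minimal Python sketch under stated assumptions, not the implementation used in this work. The abstract does not specify how experiences are categorized or in what proportion they are sampled, so the class name `DualReplayBuffer`, the reward-magnitude threshold `reward_threshold`, and the sampling ratio `high_ratio` are all hypothetical choices made for illustration.

```python
import random
from collections import deque


class DualReplayBuffer:
    """Illustrative two-pool replay memory (all parameters are assumptions)."""

    def __init__(self, capacity=10000, reward_threshold=0.5,
                 high_ratio=0.8, seed=None):
        # Each pool gets half of the total capacity; the oldest entries
        # are evicted automatically once a pool is full.
        self.high = deque(maxlen=capacity // 2)
        self.low = deque(maxlen=capacity // 2)
        self.reward_threshold = reward_threshold  # assumed split criterion
        self.high_ratio = high_ratio              # assumed sampling proportion
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        # Categorize by reward magnitude: large rewards or penalties go to
        # the "high" pool, everything else to the "low" pool.
        transition = (state, action, reward, next_state, done)
        pool = self.high if abs(reward) >= self.reward_threshold else self.low
        pool.append(transition)

    def sample(self, batch_size):
        # Draw roughly high_ratio of the batch from the "high" pool and fill
        # the remainder from the "low" pool. The batch may be smaller than
        # batch_size if the pools do not yet hold enough transitions.
        n_high = min(len(self.high), round(batch_size * self.high_ratio))
        n_low = min(len(self.low), batch_size - n_high)
        batch = (self.rng.sample(list(self.high), n_high)
                 + self.rng.sample(list(self.low), n_low))
        self.rng.shuffle(batch)
        return batch
```

Separating the pools lets the training loop control how often rare, informative transitions appear in a minibatch, which is one plausible way to counter sparse and delayed rewards; a schedule that changes `high_ratio` over the course of training would be a natural extension of this sketch.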
