Reinforcement Learning for Heroes of Newerth

In recent years, computer games have become increasingly complex, leaving ample room for human players to devise tactics and strategies. Artificial intelligence (AI) approaches for computer-controlled opponents, however, are lagging behind. Most AI opponents in current games are rule-based and therefore repeat the same patterns; such an agent can neither adapt to the current game situation nor learn from the player. Higher difficulty levels are usually achieved by cheating (e.g., cheaper units) rather than by smarter behavior, which often leaves the player feeling treated unfairly. Reinforcement learning could offer a solution: a game AI that learns from its mistakes instead of relying on cheats to overpower the player would result in a more appealing game experience. In this thesis, fitted Q-iteration (FQI) with extremely randomized trees (ExT) is applied to learn to play Heroes of Newerth (HoN), in order to assess whether it is a viable alternative to current game AI. We show that the chosen algorithms are robust to irrelevant features, but we also point out weaknesses in their performance. We further show that this robustness can be a curse when optimizing parameters, since changes often have no significant impact on performance; for example, most algorithm settings led to similar agent performance. Another weakness of ExT is its random cut-point selection, which struggles when multiple features convey similar information; for example, adding several features that only change when the hero gains a level leads to erratic performance. We will see that a naive configuration of the algorithm is only sufficient to achieve a small degree of self-improvement. Better performance might be achieved by evaluating the relevance of features and filtering them accordingly, or by improving the representation of actions.
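To make the approach concrete, the sketch below illustrates batch-mode fitted Q-iteration with an Extra-Trees regressor: the Q-function is re-fitted in each iteration on targets computed from stored (state, action, reward, next state) transitions. This is a minimal, generic sketch and not the thesis implementation; the function name fitted_q_iteration, the transition format, the discrete action encoding, and the hyperparameters are illustrative assumptions, with the regressor taken from scikit-learn. Terminal-state handling is omitted for brevity.

    import numpy as np
    from sklearn.ensemble import ExtraTreesRegressor

    def fitted_q_iteration(transitions, n_actions, n_iterations=50, gamma=0.95):
        # transitions: list of (state, action, reward, next_state) tuples,
        # where states are fixed-length feature vectors and actions are
        # integer indices (a simplifying assumption for this sketch).
        states      = np.array([t[0] for t in transitions])
        actions     = np.array([t[1] for t in transitions]).reshape(-1, 1)
        rewards     = np.array([t[2] for t in transitions])
        next_states = np.array([t[3] for t in transitions])

        # Regress Q(s, a) on the concatenated (state, action) vector.
        X = np.hstack([states, actions])
        q = None

        for _ in range(n_iterations):
            if q is None:
                # First iteration: Q_1(s, a) = r
                targets = rewards
            else:
                # Q_{k+1}(s, a) = r + gamma * max_a' Q_k(s', a')
                next_q = np.column_stack([
                    q.predict(np.hstack([next_states,
                                         np.full((len(next_states), 1), a)]))
                    for a in range(n_actions)
                ])
                targets = rewards + gamma * next_q.max(axis=1)

            q = ExtraTreesRegressor(n_estimators=50, min_samples_leaf=2)
            q.fit(X, targets)

        # The greedy policy picks argmax_a of q.predict on (state, a).
        return q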
