Methods for approximating value functions for the Dominion card game

Abstract

Artificial neural networks have been used successfully to approximate value functions for decision-making tasks. In domains where decisions require a shift in judgment as the overall state changes, it is hypothesized here that methods using multiple neural networks approximate a value function better than those using a single network. The card game Dominion was chosen as the domain in which to examine this. This paper compares neural networks produced by machine learning methods that have succeeded in other games (such as temporal difference learning in TD-Gammon) against a genetic algorithm method that generates two neural networks for different phases of the game and evolves the transition point between them. The results show a greater success ratio for the genetic algorithm applied to two networks, suggesting that future work on more complex network configurations and richer evolutionary exploration could benefit Dominion as well as other domains that demand shifts in strategy.
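To make the two-network idea concrete, the sketch below shows one plausible shape for it: a genome holding the weights of two small feedforward value networks plus an evolved transition point, with game progress deciding which network scores a state. This is a minimal illustration, not the paper's implementation; the feature size, network topology, toy fitness function, and GA parameters are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 8   # assumed size of the game-state feature vector
N_HIDDEN = 16    # assumed hidden-layer width

def n_weights():
    # Weight count for one network: input->hidden (plus biases),
    # then hidden->output (plus bias).
    return N_FEATURES * N_HIDDEN + N_HIDDEN + N_HIDDEN + 1

def evaluate(weights, features):
    """Score a state with one feedforward network (tanh hidden layer)."""
    i = N_FEATURES * N_HIDDEN
    w1 = weights[:i].reshape(N_FEATURES, N_HIDDEN)
    b1 = weights[i:i + N_HIDDEN]
    w2 = weights[i + N_HIDDEN:i + 2 * N_HIDDEN]
    b2 = weights[-1]
    hidden = np.tanh(features @ w1 + b1)
    return float(hidden @ w2 + b2)

def value(genome, features, progress):
    # The evolved transition point decides which network judges the
    # state; `progress` in [0, 1] stands in for game progress
    # (e.g. supply depletion in Dominion).
    early, late, transition = genome
    return evaluate(early if progress < transition else late, features)

def random_genome():
    return (rng.normal(0.0, 0.5, n_weights()),
            rng.normal(0.0, 0.5, n_weights()),
            float(rng.uniform(0.2, 0.8)))  # transition point is evolved too

def mutate(genome, sigma=0.1):
    early, late, transition = genome
    return (early + rng.normal(0.0, sigma, early.shape),
            late + rng.normal(0.0, sigma, late.shape),
            float(np.clip(transition + rng.normal(0.0, 0.05), 0.0, 1.0)))

def fitness(genome):
    # Stand-in fitness: in the paper this role is played by the success
    # ratio over simulated Dominion games. The toy version just rewards
    # ranking a synthetic "good" state above a "bad" one in both an
    # early-game (0.1) and a late-game (0.9) position.
    good, bad = np.ones(N_FEATURES), -np.ones(N_FEATURES)
    return sum(value(genome, good, p) - value(genome, bad, p)
               for p in (0.1, 0.9))

def evolve(pop_size=20, generations=30):
    population = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[:pop_size // 4]         # truncation selection
        population = elite + [mutate(elite[rng.integers(len(elite))])
                              for _ in range(pop_size - len(elite))]
    return max(population, key=fitness)

if __name__ == "__main__":
    best = evolve()
    print("evolved transition point: %.3f" % best[2])
```

Encoding the transition point in the genome, rather than fixing it by hand, lets selection pressure place the strategy shift wherever it most improves the evolved player, which is the core of the hypothesis the paper tests.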

[1] Ransom K. Winder. Generating Artificial Neural Networks for Value Function Approximation in a Domain Requiring a Shifting Strategy. EvoApplications, 2013.

[2] Simon Haykin. Neural Networks: A Comprehensive Foundation. 1998.

[3] David B. Fogel et al. Evolving neural networks to play checkers without relying on expert knowledge. IEEE Transactions on Neural Networks, 1999.

[4] Martin A. Riedmiller et al. A direct adaptive method for faster backpropagation learning: the RPROP algorithm. IEEE International Conference on Neural Networks, 1993.

[5] Moshe Sipper et al. GP-EndChess: Using Genetic Programming to Evolve Chess Endgame Players. EuroGP, 2005.

[6] Wentong Cai et al. Simulation-based optimization of StarCraft tactical AI through evolutionary computation. IEEE Conference on Computational Intelligence and Games (CIG), 2012.

[7] Moshe Sipper et al. GP-Gammon: Using Genetic Programming to Evolve Backgammon Players. EuroGP, 2005.

[8] Julian Togelius et al. Evolving card sets towards balancing Dominion. IEEE Congress on Evolutionary Computation, 2012.

[9] David E. Goldberg et al. Genetic Algorithms and Machine Learning. Machine Learning, 1988.

[10] Michael Pfeiffer et al. Reinforcement Learning of Strategies for Settlers of Catan. 2004.

[11] Sai-Keung Wong et al. A Study on Genetic Algorithm and Neural Network for Implementing Mini-Games. International Conference on Technologies and Applications of Artificial Intelligence, 2010.

[12] Wee-Chong Oon et al. M2ICAL Analyses HC-Gammon. AAAI, 2007.

[13] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. 1998.

[14] D. B. Fogel et al. A self-learning evolutionary chess program. Proceedings of the IEEE, 2004.

[15] Thomas Bartz-Beielstein et al. Reinforcement learning for games: failures and successes. GECCO, 2009.

[16] Gary Montague et al. Genetic programming: an introduction and survey of applications. 1997.

[17] David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. Learning representations by back-propagating errors. Nature, 1986.

[18] Chih-Sheng Lin et al. Emergent Tactical Formation Using Genetic Algorithm in Real-Time Strategy Games. International Conference on Technologies and Applications of Artificial Intelligence, 2011.

[19] Gerald Tesauro. Temporal difference learning and TD-Gammon. Communications of the ACM, 1995.

[20] Donald C. Wunsch et al. Computer Go: A Grand Challenge to AI. Challenges for Computational Intelligence, 2007.

[21] Moshe Sipper et al. Evolving board-game players with genetic programming. GECCO, 2011.

[22] Jordan B. Pollack et al. Co-Evolution in the Successful Learning of Backgammon Strategy. Machine Learning, 1998.

[23] Kurt Hornik, Maxwell Stinchcombe, and Halbert White. Multilayer feedforward networks are universal approximators. Neural Networks, 1989.

[24] John R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. Complex Adaptive Systems, 1993.

[25] Mukta Paliwal et al. Neural networks and statistical techniques: A review of applications. Expert Systems with Applications, 2009.

[26] Sai-Keung Wong et al. A Study on Genetic Algorithm and Neural Network for Mini-Games. Journal of Information Science and Engineering, 2012.

[27] Rémi Coulom. High-accuracy value-function approximation with neural networks applied to the acrobot. ESANN, 2004.

[28] G. Tesauro. Practical Issues in Temporal Difference Learning. Machine Learning, 1992.

[29] Risto Miikkulainen et al. Evolving Neural Networks to Play Go. Applied Intelligence, 2004.

[31] Darryl Charles et al. Improving Temporal Difference game agent control using a dynamic exploration during control learning. IEEE Symposium on Computational Intelligence and Games, 2009.

[32] Chang Kee Tong et al. Evolving Neural Controllers Using GA for Warcraft 3 - Real Time Strategy Game. Sixth International Conference on Bio-Inspired Computing: Theories and Applications, 2011.

[33] Kaijun Leng et al. A Genetic Algorithm Approach for TOC-based Supply Chain Coordination. 2012.