Paper Information - Countering Evolutionary Forgetting in No-Limit Texas Hold'em Poker Agents

No-Limit Texas Hold’em Poker is a stochastic game of imperfect information. Each player receives cards dealt randomly and does not know which cards the opponents have been dealt. These simple features give No-Limit Texas Hold’em Poker a large decision space in comparison to other classic games such as Backgammon and Chess. Evolutionary algorithms and neural networks have been shown to find solutions in large and non-linear decision spaces and have proven to aid decision making in No-Limit Texas Hold’em Poker. In this paper, a hybrid method known as evolving neural networks is used by No-Limit Texas Hold’em Poker playing agents to make betting decisions. When selecting a new generation of agents, evolutionary forgetting can result in selecting an agent with betting behaviour that has previously been shown to be inferior. To prevent this from occurring, we utilize two heuristics: halls of fame and co-evolution. In addition, we evaluate agent fitness using three fitness functions based upon, respectively, the length of time an agent survives in a tournament, the number of hands won in a tournament, and the average amount of money won across all hands in a tournament. Results show that the length of time an agent survives is indeed an appropriate measure of fitness. Results also show that utilizing halls of fame and co-evolution serves to further improve the fitness of agents. Finally, through monitoring the evolutionary progress of agents, we find that the skill level of agents improves when using our evolutionary heuristics.
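As a rough illustration of the ideas named in the abstract, the following Python sketch shows one way the three fitness measures and a hall of fame could be expressed. It is not the authors' implementation: the `TournamentRecord` fields, the function names, and the fixed-capacity "keep the fittest" hall-of-fame policy are all assumptions made for this example, and the abstract does not specify how the paper defines them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TournamentRecord:
    """Hypothetical summary of one agent's tournament (fields are assumptions)."""
    hands_survived: int        # hands played before elimination or tournament end
    hands_won: int             # number of hands the agent won
    chip_deltas: List[float]   # chips won (+) or lost (-) on each hand played


def survival_fitness(record: TournamentRecord) -> float:
    """Fitness based on how long the agent survived."""
    return float(record.hands_survived)


def hands_won_fitness(record: TournamentRecord) -> float:
    """Fitness based on the number of hands won."""
    return float(record.hands_won)


def mean_winnings_fitness(record: TournamentRecord) -> float:
    """Fitness based on average money won per hand (0.0 if no hands played)."""
    if not record.chip_deltas:
        return 0.0
    return sum(record.chip_deltas) / len(record.chip_deltas)


class HallOfFame:
    """Keeps the fittest agents seen so far, so later generations can be
    evaluated against them and previously beaten behaviour is not re-selected.
    The fixed-capacity policy here is an assumption, not the paper's rule."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.members: List[tuple] = []  # (fitness, agent_id) pairs

    def submit(self, agent_id: str, fitness: float) -> None:
        """Add an agent, then keep only the `capacity` fittest members."""
        self.members.append((fitness, agent_id))
        self.members.sort(key=lambda member: member[0], reverse=True)
        del self.members[self.capacity:]

    def opponents(self) -> List[str]:
        """Agent ids to include as benchmark opponents, fittest first."""
        return [agent_id for _, agent_id in self.members]
```

In a sketch like this, each new generation would play tournaments that include the agents returned by `opponents()`, so that an agent only scores well if it also holds up against earlier strong agents.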

Robert J. Hilderman | Garrett Nicolai
