Paper information - Generating Novice Heuristics for Post-Flop Poker

Agents now exist that can play Texas Hold’em Poker at a very high level, and simplified versions of the game have been solved. However, this does not directly translate into heuristics that humans can learn and use to play the game. We address the problem of learning chains of human-learnable heuristics for playing heads-up limit Texas Hold’em, focusing on the post-flop stages of the game. By restricting the policy space to fast and frugal trees, which are sequences of if-then-else rules, we can learn such heuristics using several methods, including genetic programming. This work builds on our previous work on learning such heuristic rule sets for Blackjack and pre-flop Texas Hold’em, but introduces a richer language for heuristics.
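To make the idea of a fast and frugal tree concrete, the following is a minimal Python sketch of a policy expressed as an ordered sequence of if-then-else rules, where the first matching rule decides the action. It is not the heuristic language or the learned rules from the paper: the feature names (`hand_strength`, `bet_to_call`), the thresholds, the action labels, and the helper names `Rule` and `decide` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A game state is modeled as a flat dictionary of numeric features.
# The feature names used below are illustrative assumptions.
State = Dict[str, float]


@dataclass
class Rule:
    """One if-then step: if `condition` holds, play `action`."""
    description: str
    condition: Callable[[State], bool]
    action: str


def decide(rules: List[Rule], state: State, default: str = "fold") -> str:
    """Evaluate rules in order and return the action of the first match.

    If no rule matches, the final "else" branch returns `default`.
    """
    for rule in rules:
        if rule.condition(state):
            return rule.action
    return default


# Hypothetical rule chain with made-up thresholds (not taken from the paper).
EXAMPLE_RULES = [
    Rule("strong hand: raise",
         lambda s: s["hand_strength"] >= 0.8, "raise"),
    Rule("decent hand and cheap to continue: call",
         lambda s: s["hand_strength"] >= 0.5 and s["bet_to_call"] <= 2, "call"),
    Rule("nothing to call: check",
         lambda s: s["bet_to_call"] == 0, "check"),
]

if __name__ == "__main__":
    state = {"hand_strength": 0.6, "bet_to_call": 1}
    print(decide(EXAMPLE_RULES, state))
```

Because each rule either exits with an action or passes control to the next rule, a chain like this can be read top to bottom and memorized by a human player. A learning method such as genetic programming would then search over the conditions, thresholds, and ordering of the rules.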

Julian Togelius | Andy Nealen | Frank Lantz | Fernando de Mesentier Silva
