Computer poker: A review

The game of poker has been identified as a beneficial domain for current AI research because of the properties it possesses, such as the need to deal with hidden information and stochasticity. This has resulted in increased attention from academic researchers, who have pursued many separate avenues of research in the area of computer poker. Poker has often featured in previous review papers that focus on games in general; however, a comprehensive review with a specific focus on computer poker has so far been lacking in the literature. In this paper, we present a review of recent algorithms and approaches in the area of computer poker, along with a survey of the autonomous poker agents that have resulted from this research. We begin with the first serious attempts to create strong computerised poker players by constructing knowledge-based and simulation-based systems. This is followed by the use of computational game theory to construct robust poker agents and the advances that have been made in this area. Approaches to constructing exploitive agents are then reviewed, and the challenging problems of creating accurate and dynamic opponent models are addressed. Finally, we conclude with a selection of alternative approaches that have received attention in previously published material and the interesting problems that they pose.
