The evolution of blackjack strategies

We investigate the evolution of a blackjack player, using three neural networks (one for splitting, one for doubling down and one for standing/hitting) to evolve blackjack strategies. Initially, a pool of randomly generated players plays 1000 hands of blackjack. An evolutionary strategy then mutates the best networks, while the worst networks are killed. We compare the best evolved strategies with other well-known strategies and show that they can beat the play of an average casino player. We also show that the evolved players learn parts of Thorp's basic strategy.
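The loop described above (random pool, 1000 hands of play, mutate the best, kill the worst) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the network size, the three input features, the infinite deck, the 1:1 payout, the Gaussian mutation with `sigma`, the pool size, keeping the top half, and playing 1000 hands per player per generation are all assumptions. Splitting is omitted from the simplified game, so the `"split"` network is carried and mutated but never consulted.

```python
import math
import random

def draw(rng):
    """Draw from an infinite deck (assumption): face cards count 10, ace counts 11."""
    rank = rng.randint(1, 13)
    return 11 if rank == 1 else min(rank, 10)

def total(cards):
    """Best hand total, and whether an ace is still counted as 11 (soft hand)."""
    value, aces = sum(cards), cards.count(11)
    while value > 21 and aces:
        value, aces = value - 10, aces - 1
    return value, aces > 0

def new_net(rng, n_in=3, n_hidden=4):
    """Flat weight list for a one-hidden-layer network (sizes are assumptions)."""
    return [rng.gauss(0, 1) for _ in range(n_hidden * (n_in + 1) + n_hidden + 1)]

def fires(net, inputs, n_hidden=4):
    """Forward pass; a positive output means 'take the action'."""
    n_in = len(inputs)
    out = net[-1]  # output bias
    for h in range(n_hidden):
        row = net[h * (n_in + 1):(h + 1) * (n_in + 1)]  # n_in weights, then a bias
        act = math.tanh(row[-1] + sum(w * x for w, x in zip(row, inputs)))
        out += net[n_hidden * (n_in + 1) + h] * act
    return out > 0

def play_hand(player, rng):
    """Net win for a one-unit bet (simplified: no splitting, no blackjack bonus)."""
    hand, dealer, bet = [draw(rng), draw(rng)], [draw(rng), draw(rng)], 1

    def features():
        value, soft = total(hand)
        return [value / 21, dealer[0] / 11, float(soft)]

    if fires(player["double"], features()):
        bet = 2
        hand.append(draw(rng))  # doubling down: exactly one more card, then stand
    else:
        while total(hand)[0] < 21 and fires(player["hit"], features()):
            hand.append(draw(rng))
    mine = total(hand)[0]
    if mine > 21:
        return -bet
    while total(dealer)[0] < 17:  # dealer draws to 17 (assumed house rule)
        dealer.append(draw(rng))
    theirs = total(dealer)[0]
    if theirs > 21 or mine > theirs:
        return bet
    return -bet if mine < theirs else 0

def evolve(generations=20, pool_size=20, hands=1000, sigma=0.1, seed=0):
    """Keep the better half of the pool each generation and add mutated copies."""
    rng = random.Random(seed)
    pool = [{k: new_net(rng) for k in ("split", "double", "hit")}
            for _ in range(pool_size)]
    for _ in range(generations):
        ranked = sorted(pool, reverse=True,
                        key=lambda p: sum(play_hand(p, rng) for _ in range(hands)))
        survivors = ranked[:pool_size // 2]  # the worst half is killed
        children = [{k: [w + rng.gauss(0, sigma) for w in net]
                     for k, net in parent.items()} for parent in survivors]
        pool = survivors + children
    return pool[0]  # top-ranked player of the final evaluated generation
```

Calling `best = evolve()` returns a dictionary holding the three weight lists of the top-ranked player. Because each player's fitness is estimated from a finite number of random hands, rankings are noisy; a larger `hands` value reduces that noise at the cost of run time.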
