Play selection in American football: a case study in neuro-dynamic programming

We present a computational case study of neuro-dynamic programming, a recent class of reinforcement learning methods. We cast the problem of play selection in American football as a stochastic shortest path Markov Decision Problem (MDP). In particular, we consider the problem faced by a quarterback in attempting to maximize the net score of an offensive drive. The resulting optimization problem serves as a medium-scale testbed for numerical algorithms based on policy iteration.
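
To make the formulation concrete, the following is a minimal sketch of exact policy iteration on a toy stochastic shortest path problem. It is not the model used in the study: the state (coarse distance to the goal, plays remaining), the two plays, their outcome probabilities, and the touchdown reward are all hypothetical values chosen for illustration. Every policy terminates because the number of plays remaining strictly decreases, so policy evaluation reduces to a single linear solve.

```python
import itertools
import numpy as np

# Hypothetical toy drive: every number below is an illustrative assumption,
# not a value taken from the study.
MAX_CHUNKS = 3            # distance to the goal, in coarse "chunks"
MAX_DOWNS = 4             # plays remaining before the drive ends
TOUCHDOWN_REWARD = 7.0
# play -> list of (probability, chunks gained)
ACTIONS = {
    "run": [(0.6, 1), (0.4, 0)],
    "pass": [(0.35, 2), (0.65, 0)],
}

STATES = list(itertools.product(range(1, MAX_CHUNKS + 1),
                                range(1, MAX_DOWNS + 1)))
INDEX = {s: i for i, s in enumerate(STATES)}


def transitions(state, action):
    """Yield (probability, reward, next_state); next_state is None if terminal."""
    x, d = state
    for p, gain in ACTIONS[action]:
        nx, nd = x - gain, d - 1
        if nx <= 0:
            yield p, TOUCHDOWN_REWARD, None   # touchdown ends the drive
        elif nd == 0:
            yield p, 0.0, None                # out of plays, drive ends
        else:
            yield p, 0.0, (nx, nd)


def evaluate(policy):
    """Policy evaluation: solve (I - P) J = r for the expected net score."""
    n = len(STATES)
    P, r = np.zeros((n, n)), np.zeros(n)
    for s in STATES:
        i = INDEX[s]
        for p, reward, nxt in transitions(s, policy[s]):
            r[i] += p * reward
            if nxt is not None:
                P[i, INDEX[nxt]] += p
    return np.linalg.solve(np.eye(n) - P, r)


def q_value(state, action, J):
    """Expected score of taking `action` now and following J afterwards."""
    return sum(p * (reward + (J[INDEX[nxt]] if nxt is not None else 0.0))
               for p, reward, nxt in transitions(state, action))


def policy_iteration():
    policy = {s: "run" for s in STATES}       # arbitrary initial policy
    while True:
        J = evaluate(policy)
        improved = {}
        for s in STATES:
            best = max(ACTIONS, key=lambda a: q_value(s, a, J))
            # Switch only on strict improvement so ties cannot cause cycling.
            if q_value(s, best, J) > q_value(s, policy[s], J) + 1e-12:
                improved[s] = best
            else:
                improved[s] = policy[s]
        if improved == policy:
            return policy, J
        policy = improved


if __name__ == "__main__":
    policy, J = policy_iteration()
    for s in STATES:
        print(s, policy[s], round(float(J[INDEX[s]]), 3))
```

At this scale the evaluation step is exact. In a larger state space that step would be replaced by an approximation of the cost-to-go, for example a fitted function of state features, which is the setting that approximate policy iteration methods address.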
