Simulation, learning, and optimization techniques in Watson's game strategies
Gerald Tesauro | James Fan | David Gondek | John M. Prager | Jonathan Lenchner
[1] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.
[2] Gerald Tesauro, et al. Temporal difference learning and TD-Gammon, 1995, CACM.
[3] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[4] Gerald Tesauro, et al. On-line Policy Improvement using Monte-Carlo Search, 1996, NIPS.
[5] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[6] Kurt Hornik, et al. On the generation of correlated artificial binary data, 1998.
[7] Matthew L. Ginsberg, et al. GIB: Steps Toward an Expert-Level Bridge-Playing Program, 1999, IJCAI.
[8] Jonathan Schaeffer, et al. The challenge of poker, 2002, Artif. Intell.
[9] Brian Sheppard, et al. World-championship-caliber Scrabble, 2002, Artif. Intell.
[10] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[11] M. Dufwenberg. Game theory, 2011, Wiley Interdisciplinary Reviews: Cognitive Science.