Opponent Modelling in the Game of Tron using Reinforcement Learning