Mastering 2048 With Delayed Temporal Coherence Learning, Multistage Weight Promotion, Redundant Encoding, and Carousel Shaping
[1] Andrew G. Barto, et al. Adaptive Step-Size for Online Temporal Difference Learning, 2012, AAAI.
[2] D. Michie. Game-Playing and Game-Learning Automata, 1966.
[3] J. Albus. A Theory of Cerebellar Function , 1971 .
[4] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[5] Wolfgang Konen, et al. Temporal difference learning with eligibility traces for the game connect four, 2014, IEEE Conference on Computational Intelligence and Games.
[6] Wojciech Jaśkowski, et al. High-Dimensional Function Approximation for Knowledge-Free Reinforcement Learning: a Case Study in SZ-Tetris, 2015, GECCO.
[7] Michael Buro. From Simple Features to Sophisticated Evaluation Functions, 1998, Computers and Games.
[8] Rahul Mehta. 2048 is (PSPACE) Hard, but Sometimes Easy, 2014, Electron. Colloquium Comput. Complex.
[9] Arthur L. Samuel. Some Studies in Machine Learning Using the Game of Checkers, 1967, IBM J. Res. Dev.
[10] Patrick M. Pilarski, et al. Tuning-free step-size adaptation, 2012, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Eduardo F. Morales, et al. An Introduction to Reinforcement Learning, 2011.
[12] Bart De Schutter, et al. Reinforcement Learning and Dynamic Programming Using Function Approximators, 2010.
[13] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[14] Donald F. Beal, et al. Temporal Coherence and Prediction Decay in TD Learning, 1999, IJCAI.
[15] John Levine, et al. An investigation into 2048 AI strategies, 2014, IEEE Conference on Computational Intelligence and Games.
[16] Wojciech Jaśkowski, et al. Coevolutionary CMA-ES for Knowledge-Free Learning of Game Position Evaluation, 2016, IEEE Transactions on Computational Intelligence and AI in Games.
[17] Richard S. Sutton. Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta, 1992, AAAI.
[18] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[19] Edward P. Manning. Using Resource-Limited Nash Memory to Improve an Othello Evaluation Function , 2010, IEEE Transactions on Computational Intelligence and AI in Games.
[20] Gerald Tesauro. Temporal Difference Learning and TD-Gammon, 1995, J. Int. Comput. Games Assoc.
[21] Nikolaus Hansen, et al. Completely Derandomized Self-Adaptation in Evolution Strategies, 2001, Evolutionary Computation.
[22] Simon M. Lucas, et al. Temporal Difference Learning Versus Co-Evolution for Acquiring Othello Position Evaluation, 2006, IEEE Symposium on Computational Intelligence and Games.
[23] Wojciech Jaśkowski, et al. On Scalability, Generalization, and Hybridization of Coevolutionary Learning: A Case Study for Othello, 2013, IEEE Transactions on Computational Intelligence and AI in Games.
[24] Todd W. Neller. Pedagogical possibilities for the 2048 puzzle game , 2015 .
[25] Wojciech Jaśkowski. Systematic n-Tuple Networks for Othello Position Evaluation, 2014, J. Int. Comput. Games Assoc.
[26] Wolfgang Konen, et al. Reinforcement Learning with N-tuples on the Game Connect-4, 2012, PPSN.
[27] Simon M. Lucas. Learning to Play Othello with N-Tuple Systems , 2008 .
[28] Jos W. H. M. Uiterwijk, et al. CHANCEPROBCUT: Forward pruning in chance nodes, 2009, IEEE Symposium on Computational Intelligence and Games.
[29] I-Chen Wu, et al. Multistage Temporal Difference Learning for 2048-Like Games, 2017, IEEE Transactions on Computational Intelligence and AI in Games.
[30] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.
[31] I-Chen Wu, et al. Multi-Stage Temporal Difference Learning for 2048, 2014, TAAI.
[32] R. M. Burstall, et al. Advances in programming and non-numerical computation, 1967, The Mathematical Gazette.
[33] Wojciech Jaśkowski, et al. Temporal difference learning of N-tuple networks for the game 2048, 2014, IEEE Conference on Computational Intelligence and Games.
[34] Wolfgang Konen, et al. Online Adaptable Learning Rates for the Game Connect-4, 2016, IEEE Transactions on Computational Intelligence and AI in Games.
[35] Simon M. Lucas, et al. Preference Learning for Move Prediction and Evaluation Function Approximation in Othello, 2014, IEEE Transactions on Computational Intelligence and AI in Games.
[36] W. W. Bledsoe, et al. Pattern recognition and reading by machine, 1959, IRE-AIEE-ACM '59 (Eastern).
[37] Matthieu Geist, et al. Approximate modified policy iteration and its application to the game of Tetris, 2015, J. Mach. Learn. Res.
[38] Kiminori Matsuzaki, et al. Systematic Selection of N-Tuple Networks for 2048, 2016, Computers and Games.