Pierre Baldi | Forest Agostinelli | Stephen McAleer | Alexander Shmakov
[1] Gene Cooperman, et al. Twenty-six moves suffice for Rubik's cube, 2007, ISSAC '07.
[2] Geoffrey E. Hinton, et al. Deep Learning, 2015, Nature.
[3] Samy Bengio, et al. Neural Combinatorial Optimization with Reinforcement Learning, 2016, ICLR.
[4] Otakar Trunda, et al. Deep Heuristic-learning in the Rubik's Cube Domain: An Experimental Evaluation, 2017, ITAT.
[5] Sean R. Eddy, et al. What is dynamic programming? 2004, Nature Biotechnology.
[6] Tomas Rokicki, et al. Twenty-Two Moves Suffice for Rubik's Cube®, 2010.
[7] Malcolm I. Heywood, et al. The Rubik cube and GP Temporal Sequence learning: An initial study, 2011.
[8] Honglak Lee, et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, 2014, NIPS.
[9] Rémi Coulom, et al. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, 2006, Computers and Games.
[10] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[11] John J. Grefenstette, et al. Evolutionary Algorithms for Reinforcement Learning, 1999, J. Artif. Intell. Res.
[12] Tomas Rokicki, et al. The Diameter of the Rubik's Cube Group Is Twenty, 2013, SIAM J. Discret. Math.
[13] Malcolm I. Heywood, et al. Discovering Rubik's Cube Subgroups using Coevolutionary GP: A Five Twist Experiment, 2016, GECCO.
[14] R. J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[15] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[16] Richard B. Segal, et al. On the Scalability of Parallel UCT, 2010, Computers and Games.
[17] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, MIT Press.
[18] Léon Bottou, et al. From machine learning to machine reasoning, 2011, Machine Learning.
[19] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[20] Richard E. Korf, et al. Finding Optimal Solutions to Rubik's Cube Using Pattern Databases, 1997, AAAI/IAAI.
[21] Calvin Lee, et al. Rubik's cube solver, 2018.
[22] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[23] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[24] Demis Hassabis, et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, 2017, arXiv.
[25] Gerald Tesauro, et al. Temporal difference learning and TD-Gammon, 1995, CACM.
[26] David Barber, et al. Thinking Fast and Slow with Deep Learning and Tree Search, 2017, NIPS.
[27] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.