Konrad Czechowski | Yuhuai Wu | Piotr Miłoś | Tomasz Odrzygóźdź | Marek Zbysiński | Michał Zawalski | Krzysztof Olejnik | Łukasz Kuciński
[1] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[2] Leslie Pack Kaelbling, et al. Learning to Achieve Goals, 1993, IJCAI.
[3] Pierre Baldi, et al. Solving the Rubik’s cube with deep reinforcement learning and search, 2019, Nature Machine Intelligence.
[4] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.
[5] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[6] W. T. Gowers. The Importance of Mathematics, 2002.
[7] Alexei A. Efros, et al. Time-Agnostic Prediction: Predicting Predictable Video Frames, 2018, ICLR.
[8] Bradly C. Stadie, et al. World Model as a Graph: Learning Latent Landmarks for Planning, 2021, ICML.
[9] Pieter Abbeel, et al. Hallucinative Topological Memory for Zero-Shot Visual Planning, 2020, ICML.
[10] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[11] Amit K. Roy-Chowdhury, et al. Learning from Trajectories via Subgoal Discovery, 2019, NeurIPS.
[12] Jessica B. Hamrick, et al. Divide-and-Conquer Monte Carlo Tree Search for Goal-Directed Planning, 2020, ArXiv.
[13] Johann Schumann, et al. Automated Theorem Proving in Software Engineering, 2001, Springer Berlin Heidelberg.
[14] Alex S. Fukunaga, et al. Classical Planning in Deep Latent Space: Bridging the Subsymbolic-Symbolic Boundary, 2017, AAAI.
[15] Chelsea Finn, et al. Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors, 2020, NeurIPS.
[16] Yoshua Bengio, et al. Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon, 2018, Eur. J. Oper. Res.
[17] Yoshua Bengio, et al. Variational Temporal Abstraction, 2019, NeurIPS.
[18] Silvio Savarese, et al. Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation, 2019, CoRL.
[19] Ilya Sutskever, et al. Generative Language Modeling for Automated Theorem Proving, 2020, ArXiv.
[20] Kostas Daniilidis, et al. Keyframing the Future: Keyframe Discovery for Visual Prediction and Planning, 2020, L4DC.
[21] Simon M. Lucas, et al. A Survey of Monte Carlo Tree Search Methods, 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[22] Razvan Pascanu, et al. Imagination-Augmented Agents for Deep Reinforcement Learning, 2017, NIPS.
[23] Hado van Hasselt, et al. Double Q-learning, 2010, NIPS.
[24] Tom Eccles, et al. An investigation of model-free planning, 2019, ICML.
[25] Chelsea Finn, et al. Hierarchical Foresight: Self-Supervised Learning of Long-Horizon Tasks via Visual Subgoal Generation, 2019, ICLR.
[26] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[27] J. Hollerman, et al. Involvement of basal ganglia and orbitofrontal cortex in goal-directed behavior, 2000, Progress in Brain Research.
[28] Thomas Wolf, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, ArXiv.
[29] Nicholas Roy, et al. Learning over Subgoals for Efficient Navigation of Structured, Unknown Environments, 2018, CoRL.
[30] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[31] Wei Gao, et al. Intention-Net: Integrating Planning and Deep Learning for Goal-Directed Autonomous Navigation, 2017, CoRL.
[32] Alan Fern, et al. The first learning track of the international planning competition, 2011, Machine Learning.
[33] Uri Zwick, et al. SOKOBAN and other motion planning problems, 1999, Comput. Geom.
[34] Vladlen Koltun, et al. Semi-parametric Topological Memory for Navigation, 2018, ICLR.
[35] Pieter Abbeel, et al. Automatic Goal Generation for Reinforcement Learning Agents, 2017, ICML.
[36] David Wilkins, et al. Using Patterns and Plans in Chess, 1980, Artif. Intell.
[37] Konrad Czechowski, et al. Uncertainty-sensitive Learning and Planning with Ensembles, 2019, ArXiv.
[38] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[39] Hector Geffner, et al. Purely Declarative Action Descriptions are Overrated: Classical Planning with Simulators, 2017, IJCAI.
[40] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[41] Demis Hassabis, et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, 2017, ArXiv.
[42] Jessica B. Hamrick, et al. On the role of planning in model-based deep reinforcement learning, 2020, ArXiv.
[43] Wojciech M. Czarnecki, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning, 2019, Nature.
[44] Jimmy Ba, et al. INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving, 2020, ICLR.
[45] Hector Geffner, et al. Width and Serialization of Classical Planning Problems, 2012, ECAI.
[46] Demis Hassabis, et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play, 2018, Science.
[47] Jimmy Ba, et al. Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning, 2020, ICML.
[48] Cordelia Schmid, et al. Goal-Conditioned Reinforcement Learning with Imagined Subgoals, 2021, ICML.
[49] Marjan Ghazvininejad, et al. Multilingual Denoising Pre-training for Neural Machine Translation, 2020, Transactions of the Association for Computational Linguistics.
[50] Pieter Abbeel, et al. Learning Plannable Representations with Causal InfoGAN, 2018, NeurIPS.
[51] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.