[1] Keiji Kanazawa, et al. A model for reasoning about persistence and causation, 1989.
[2] Tom Schaul, et al. StarCraft II: A New Challenge for Reinforcement Learning, 2017, arXiv.
[3] Sepp Hochreiter, et al. Learning to Learn Using Gradient Descent, 2001, ICANN.
[4] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[5] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[6] Sergey Levine, et al. Probabilistic Model-Agnostic Meta-Learning, 2018, NeurIPS.
[7] Peter L. Bartlett, et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning, 2016, arXiv.
[8] Kutluhan Erol, et al. Hierarchical task network planning: formalization, analysis, and implementation, 1996.
[9] Joshua Achiam, et al. On First-Order Meta-Learning Algorithms, 2018, arXiv.
[10] Zeb Kurth-Nelson, et al. Learning to reinforcement learn, 2016, CogSci.
[11] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[12] Matthew E. Taylor, et al. Autonomous Extracting a Hierarchical Structure of Tasks in Reinforcement Learning and Multi-task Reinforcement Learning, 2017, arXiv.
[13] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[14] Brian Scassellati, et al. Autonomously constructing hierarchical task networks for planning and human-robot collaboration, 2016, IEEE International Conference on Robotics and Automation (ICRA).
[15] Chong Wang, et al. Neural Logic Machines, 2019, ICLR.
[16] Pieter Abbeel, et al. A Simple Neural Attentive Meta-Learner, 2017, ICLR.
[17] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[18] Vítor Santos Costa, et al. Inductive Logic Programming, 2013, Lecture Notes in Computer Science.
[19] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[20] Dan Klein, et al. Modular Multitask Reinforcement Learning with Policy Sketches, 2016, ICML.
[21] Eric P. Xing, et al. Harnessing Deep Neural Networks with Logic Rules, 2016, ACL.
[22] Honglak Lee, et al. Hierarchical Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies, 2018, NeurIPS.
[23] Wei Xu, et al. A Deep Compositional Framework for Human-like Language Acquisition in Virtual Environment, 2017, arXiv.
[24] Song-Chun Zhu, et al. Jointly Learning Grounded Task Structures from Language Instruction and Visual Demonstration, 2016, EMNLP.
[25] Gaël Varoquaux, et al. Scikit-learn: Machine Learning in Python, 2011, J. Mach. Learn. Res.
[26] Silvio Savarese, et al. Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration, 2018, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[28] Silvio Savarese, et al. Neural Task Programming: Learning to Generalize Across Hierarchical Tasks, 2017, IEEE International Conference on Robotics and Automation (ICRA).
[29] Craig Boutilier, et al. Exploiting Structure in Policy Construction, 1995, IJCAI.
[30] Wei-Yin Loh, et al. Classification and regression trees, 2011, WIREs Data Mining Knowl. Discov.
[31] Ali Farhadi, et al. AI2-THOR: An Interactive 3D Environment for Visual AI, 2017, arXiv.
[32] Yoshua Bengio, et al. Bayesian Model-Agnostic Meta-Learning, 2018, NeurIPS.
[33] Austin Tate, et al. Generating Project Networks, 1977, IJCAI.
[34] Honglak Lee, et al. Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning, 2017, ICML.
[35] Romain Laroche, et al. Hybrid Reward Architecture for Reinforcement Learning, 2017, NIPS.
[36] Sergey Levine, et al. Meta-Reinforcement Learning of Structured Exploration Strategies, 2018, NeurIPS.
[37] Dale Schuurmans, et al. Direct value-approximation for factored MDPs, 2001, NIPS.
[38] Andrew G. Barto, et al. Causal Graph Based Decomposition of Factored MDPs, 2006, J. Mach. Learn. Res.
[39] Ruslan Salakhutdinov, et al. Gated-Attention Architectures for Task-Oriented Language Grounding, 2017, AAAI.
[40] Richard Evans, et al. Learning Explanatory Rules from Noisy Data, 2017, J. Artif. Intell. Res.
[41] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.