Search on the Replay Buffer: Bridging Planning and Reinforcement Learning
Benjamin Eysenbach | Ruslan Salakhutdinov | Sergey Levine