When to Trust Your Model: Model-Based Policy Optimization
Michael Janner | Justin Fu | Marvin Zhang | Sergey Levine
[1] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[2] Sebastian Engell, et al. Model Predictive Control Using Neural Networks [25 Years Ago], 1995, IEEE Control Systems.
[3] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[4] Stefan Schaal, et al. Learning tasks from a single demonstration, 1997, Proceedings of International Conference on Robotics and Automation.
[5] Csaba Szepesvári, et al. Model-based reinforcement learning with nearly tight exploration complexity bounds, 2010, ICML.
[6] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[7] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[8] Sergey Levine, et al. Guided Policy Search, 2013, ICML.
[9] Erik Talvitie, et al. Model Regularization for Stable Sample Rollouts, 2014, UAI.
[10] Shai Ben-David, et al. Understanding Machine Learning: From Theory to Algorithms, 2014.
[11] Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[12] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[13] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[14] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[15] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[16] C. Rasmussen, et al. Improving PILCO with Bayesian Neural Network Dynamics Models, 2016.
[17] Sergey Levine, et al. Optimal control with learned local models: Application to dexterous manipulation, 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[18] Pieter Abbeel, et al. Value Iteration Networks, 2016, NIPS.
[19] Sergey Levine, et al. Continuous Deep Q-Learning with Model-based Acceleration, 2016, ICML.
[20] Daniel Nikovski, et al. Value-Aware Loss Function for Model-based Reinforcement Learning, 2017, AISTATS.
[21] Finale Doshi-Velez, et al. Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks, 2016, ICLR.
[22] Tom Schaul, et al. The Predictron: End-To-End Learning and Planning, 2016, ICML.
[23] Razvan Pascanu, et al. Imagination-Augmented Agents for Deep Reinforcement Learning, 2017, NIPS.
[24] Gabriel Kalweit, et al. Uncertainty-driven Imagination for Continuous Deep Reinforcement Learning, 2017, CoRL.
[25] Balaraman Ravindran, et al. EPOpt: Learning Robust Neural Network Policies Using Model Ensembles, 2016, ICLR.
[26] Satinder Singh, et al. Value Prediction Network, 2017, NIPS.
[27] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, arXiv.
[28] Erik Talvitie, et al. Self-Correcting Models for Model-Based Reinforcement Learning, 2016, AAAI.
[29] Honglak Lee, et al. Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion, 2018, NeurIPS.
[30] Pieter Abbeel, et al. Model-Ensemble Trust-Region Policy Optimization, 2018, ICLR.
[31] Byron Boots, et al. Dual Policy Iteration, 2018, NeurIPS.
[32] Sergey Levine, et al. Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control, 2018, arXiv.
[33] Sergey Levine, et al. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models, 2018, NeurIPS.
[34] Erik Talvitie, et al. The Effect of Planning Shape on Dyna-style Planning in High-dimensional State Spaces, 2018, arXiv.
[35] Rob Fergus, et al. Understanding the Asymptotic Performance of Model-Based RL Methods, 2018.
[36] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[37] Tamim Asfour, et al. Model-Based Reinforcement Learning via Meta-Policy Optimization, 2018, CoRL.
[38] Sergey Levine, et al. Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning, 2018, arXiv.
[39] Kavosh Asadi, et al. Lipschitz Continuity in Model-based Reinforcement Learning, 2018, ICML.
[40] Sergey Levine, et al. Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[41] Yilun Du, et al. Task-Agnostic Dynamics Priors for Deep Reinforcement Learning, 2019, ICML.
[42] Yoshua Bengio, et al. Probabilistic Planning with Sequential Monte Carlo methods, 2018, ICLR.
[43] Yuandong Tian, et al. Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees, 2018, ICLR.
[44] Sergey Levine, et al. Model-Based Reinforcement Learning for Atari, 2019, ICLR.
[45] Nikolai Matni, et al. On the Sample Complexity of the Linear Quadratic Regulator, 2017, Foundations of Computational Mathematics.