Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning
Kimin Lee | Younggyo Seo | Seunghyun Lee | Honglak Lee | Jinwoo Shin
[1] I. Jolliffe. Principal Components in Regression Analysis, 1986.
[2] Manfred Morari, et al. Model predictive control: Theory and practice - A survey, 1989, Automatica.
[3] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[4] Christopher G. Atkeson, et al. A comparison of direct and model-based reinforcement learning, 1997, Proceedings of International Conference on Robotics and Automation.
[5] Jun Morimoto, et al. Robust Reinforcement Learning, 2005, Neural Computation.
[6] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[7] Geoffrey E. Hinton, et al. Visualizing Data using t-SNE, 2008.
[8] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[9] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[10] Dirk P. Kroese, et al. Chapter 3 - The Cross-Entropy Method for Optimization, 2013.
[11] Dirk P. Kroese, et al. The cross-entropy method for estimation, 2013.
[12] Sergey Levine, et al. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics, 2014, NIPS.
[13] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[14] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[15] Ross A. Knepper, et al. DeepMPC: Learning Deep Latent Features for Model Predictive Control, 2015, Robotics: Science and Systems.
[16] Peter L. Bartlett, et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning, 2016, arXiv.
[17] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[18] Sergey Levine, et al. Continuous Deep Q-Learning with Model-based Acceleration, 2016, ICML.
[19] Quoc V. Le, et al. Swish: a Self-Gated Activation Function, 2017, arXiv:1710.05941.
[20] Greg Turk, et al. Preparing for the Unknown: Learning a Universal Policy with Online System Identification, 2017, Robotics: Science and Systems.
[21] Sergey Levine, et al. Deep visual foresight for planning robot motion, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[22] Balaraman Ravindran, et al. EPOpt: Learning Robust Neural Network Policies Using Model Ensembles, 2016, ICLR.
[23] Abhinav Gupta, et al. Robust Adversarial Reinforcement Learning, 2017, ICML.
[24] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[25] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, arXiv.
[26] Marcin Andrychowicz, et al. Sim-to-Real Transfer of Robotic Control with Dynamics Randomization, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[27] Sanja Fidler, et al. NerveNet: Learning Structured Policy with Graph Neural Networks, 2018, ICLR.
[28] Philip Bachman, et al. Deep Reinforcement Learning that Matters, 2017, AAAI.
[29] Raia Hadsell, et al. Graph networks as learnable physics engines for inference and control, 2018, ICML.
[30] Sergey Levine, et al. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models, 2018, NeurIPS.
[31] Nan Jiang, et al. Markov Decision Processes with Continuous Side Information, 2017, ALT.
[32] Quoc V. Le, et al. Searching for Activation Functions, 2018, arXiv.
[33] Dawn Xiaodong Song, et al. Assessing Generalization in Deep Reinforcement Learning, 2018, arXiv.
[34] Sergey Levine, et al. Deep Online Learning via Meta-Learning: Continual Adaptation for Model-Based RL, 2018, ICLR.
[35] Sergey Levine, et al. Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, 2019, ICML.
[36] Pieter Abbeel, et al. Benchmarking Model-Based Reinforcement Learning, 2019, arXiv.
[37] Abhinav Gupta, et al. Environment Probing Interaction Policies, 2019, ICLR.
[38] Sergey Levine, et al. When to Trust Your Model: Model-Based Policy Optimization, 2019, NeurIPS.
[39] Sergey Levine, et al. Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning, 2018, ICLR.
[40] Sergey Levine, et al. SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning, 2018, ICML.
[41] Ruben Villegas, et al. Learning Latent Dynamics for Planning from Pixels, 2018, ICML.
[42] Jimmy Ba, et al. Dream to Control: Learning Behaviors by Latent Imagination, 2019, ICLR.
[43] Sergey Levine, et al. Model-Based Reinforcement Learning for Atari, 2019, ICLR.
[44] Demis Hassabis, et al. Mastering Atari, Go, chess and shogi by planning with a learned model, 2019, Nature.