Sergey Levine | Michael I. Jordan | Ion Stoica | Joseph E. Gonzalez | Alvin Wan | Vladimir Feinberg
[1] Jing Peng, et al. Incremental multi-step Q-learning, 1994, Machine Learning.
[2] Marcin Andrychowicz, et al. Parameter Space Noise for Exploration, 2017, ICLR.
[3] Gabriel Kalweit, et al. Uncertainty-driven Imagination for Continuous Deep Reinforcement Learning, 2017, CoRL.
[4] Pieter Abbeel, et al. Model-Ensemble Trust-Region Policy Optimization, 2018, ICLR.
[5] Razvan Pascanu, et al. Imagination-Augmented Agents for Deep Reinforcement Learning, 2017, NIPS.
[6] Martha White, et al. Linear Off-Policy Actor-Critic, 2012, ICML.
[7] Satinder Singh, et al. Value Prediction Network, 2017, NIPS.
[8] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[9] Dale Schuurmans, et al. Bridging the Gap Between Value and Policy Based Reinforcement Learning, 2017, NIPS.
[10] Nikolai Matni, et al. On the Sample Complexity of the Linear Quadratic Regulator, 2017, Foundations of Computational Mathematics.
[11] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[12] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[13] Sergey Levine, et al. Continuous Deep Q-Learning with Model-based Acceleration, 2016, ICML.
[14] Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[15] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.