暂无分享,去创建一个
Sergey Levine | Shixiang Gu | Vitchyr Pong | Murtaza Dalal | Vitchyr H. Pong | S. Levine | S. Gu | Murtaza Dalal
[1] Marc G. Bellemare,et al. A Distributional Perspective on Reinforcement Learning , 2017, ICML.
[2] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[3] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[4] Justin A. Boyan,et al. Least-Squares Temporal Difference Learning , 1999, ICML.
[5] Peter Dayan,et al. Structure in the Space of Value Functions , 2002, Machine Learning.
[6] Sergey Levine,et al. Learning deep control policies for autonomous aerial vehicles with MPC-guided policy search , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[7] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[8] Martial Hebert,et al. Improved Learning of Dynamics Models for Control , 2016, ISER.
[9] Sergey Levine,et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[10] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[11] Tom Schaul,et al. Universal Value Function Approximators , 2015, ICML.
[12] Vladlen Koltun,et al. Learning to Act by Predicting the Future , 2016, ICLR.
[13] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[14] B. Widrow,et al. Neural networks for self-learning control systems , 1990, IEEE Control Systems Magazine.
[15] Sergey Levine,et al. Continuous Deep Q-Learning with Model-based Acceleration , 2016, ICML.
[16] Sergey Levine,et al. Goal-driven dynamics learning via Bayesian optimization , 2017, 2017 IEEE 56th Annual Conference on Decision and Control (CDC).
[17] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[18] Lihong Li,et al. An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning , 2008, ICML '08.
[19] Yuval Tassa,et al. Learning Continuous Control Policies by Stochastic Value Gradients , 2015, NIPS.
[20] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[21] Tom Schaul,et al. The Predictron: End-To-End Learning and Planning , 2016, ICML.
[22] Navdeep Jaitly,et al. Discrete Sequential Prediction of Continuous Actions for Deep RL , 2017, ArXiv.
[23] Patrick M. Pilarski,et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction , 2011, AAMAS.
[24] Pieter Abbeel,et al. Prediction and Control with Temporal Segment Models , 2017, ICML.
[25] Sergey Levine,et al. Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[26] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[27] Bruno Castro da Silva,et al. Learning Parameterized Skills , 2012, ICML.
[28] Jan Peters,et al. Nonamemanuscript No. (will be inserted by the editor) Reinforcement Learning to Adjust Parametrized Motor Primitives to , 2011 .
[29] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[30] Gaurav S. Sukhatme,et al. Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning , 2017, ICML.
[31] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[32] Carl E. Rasmussen,et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.
[33] Jan Peters,et al. A Survey on Policy Search for Robotics , 2013, Found. Trends Robotics.
[34] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method , 2005, ECML.
[35] Peter Dayan,et al. Q-learning , 1992, Machine Learning.