Reset-free guided policy search: Efficient deep reinforcement learning with stochastic initial states
Sergey Levine | Anurag Ajay | Pieter Abbeel | Chelsea Finn | William H. Montgomery
[1] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[2] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[3] Sergey Levine, et al. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics, 2014, NIPS.
[4] Sergey Levine, et al. Guided Policy Search via Approximate Mirror Descent, 2016, NIPS.
[5] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[6] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.
[7] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[8] Stefan Schaal, et al. Learning force control policies for compliant manipulation, 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[9] Jan Peters, et al. Reinforcement learning in robotics: A survey, 2013, Int. J. Robotics Res.
[10] Eduardo F. Morales, et al. An Introduction to Reinforcement Learning, 2011.
[11] Marc Teboulle, et al. Mirror descent and nonlinear projected subgradient methods for convex optimization, 2003, Oper. Res. Lett.
[12] Pieter Abbeel, et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight, 2006, NIPS.
[13] Hany Abdulsamad, et al. Model-Free Trajectory Optimization for Reinforcement Learning, 2016, ICML.
[14] E. Todorov, et al. A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems, 2005, Proceedings of the 2005 American Control Conference.
[15] Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[16] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[17] Stefan Schaal, et al. Learning and generalization of motor skills by learning from demonstration, 2009, 2009 IEEE International Conference on Robotics and Automation.
[18] Stefan Schaal, et al. Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning, 2007, ESANN.
[19] H. Sebastian Seung, et al. Stochastic policy gradient reinforcement learning on a simple 3D biped, 2004, 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[20] Jun Nakanishi, et al. Learning Attractor Landscapes for Learning Motor Primitives, 2002, NIPS.
[21] Jan Peters, et al. A Survey on Policy Search for Robotics, 2013, Found. Trends Robotics.
[22] Stefan Schaal, et al. Reinforcement learning of motor skills in high dimensions: A path integral approach, 2010, 2010 IEEE International Conference on Robotics and Automation.
[23] Yasemin Altun, et al. Relative Entropy Policy Search, 2010.