Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics
暂无分享,去创建一个
[1] R. Fletcher. Practical Methods of Optimization , 1988 .
[2] Jun Nakanishi,et al. Learning Attractor Landscapes for Learning Motor Primitives , 2002, NIPS.
[3] Jeff G. Schneider,et al. Covariant policy search , 2003, IJCAI 2003.
[4] Emanuel Todorov,et al. Iterative Linear Quadratic Regulator Design for Nonlinear Biological Movement Systems , 2004, ICINCO.
[5] Dirk P. Kroese,et al. The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation and Machine Learning , 2004 .
[6] Dirk P. Kroese,et al. The Cross Entropy Method: A Unified Approach To Combinatorial Optimization, Monte-carlo Simulation (Information Science and Statistics) , 2004 .
[7] Peter Szabó,et al. Learning to Control an Octopus Arm with Gaussian Process Temporal Difference Methods , 2005, NIPS.
[8] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[9] Stefan Schaal,et al. Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning , 2007, ESANN.
[10] Stefan Schaal,et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients , 2008 .
[11] Pieter Abbeel,et al. Learning for control from multiple demonstrations , 2008, ICML '08.
[12] Jan Peters,et al. Learning motor primitives for robotics , 2009, 2009 IEEE International Conference on Robotics and Automation.
[13] Marc Toussaint,et al. Robot trajectory optimization using approximate inference , 2009, ICML '09.
[14] Stefan Schaal,et al. Reinforcement learning of motor skills in high dimensions: A path integral approach , 2010, 2010 IEEE International Conference on Robotics and Automation.
[15] Aude Billard,et al. BM: An iterative algorithm to learn stable non-linear dynamical systems with Gaussian mixture models , 2010, 2010 IEEE International Conference on Robotics and Automation.
[16] Yasemin Altun,et al. Relative Entropy Policy Search , 2010 .
[17] Carl E. Rasmussen,et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.
[18] Yuval Tassa,et al. Synthesis and stabilization of complex behaviors through online trajectory optimization , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[19] Olivier Sigaud,et al. Path Integral Policy Improvement with Covariance Matrix Adaptation , 2012, ICML.
[20] Sergey Levine,et al. Guided Policy Search , 2013, ICML.
[21] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res..
[22] Jan Peters,et al. A Survey on Policy Search for Robotics , 2013, Found. Trends Robotics.
[23] Sergey Levine,et al. Variational Policy Search via Trajectory Optimization , 2013, NIPS.
[24] N. Roy,et al. On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference , 2013 .
[25] Sergey Levine,et al. Learning Complex Neural Network Policies with Trajectory Optimization , 2014, ICML.