Model-Free Trajectory Optimization for Reinforcement Learning
Hany Abdulsamad | Gerhard Neumann | Abbas Abdolmaleki | Riad Akrour
[1] W. Müller. Jacobson, D. H. and D. Q. Mayne: Differential Dynamic Programming. Modern Analytic and Computational Methods in Science and Mathematics, No. 24. American Elsevier Publ. Co., Inc., New York 1970. XVI, 208 pp., 17 figs., Dfl. 51.50, 1973.
[2] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[3] Jun Nakanishi, et al. Learning Attractor Landscapes for Learning Motor Primitives, 2002, NIPS.
[4] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[5] Emanuel Todorov, et al. Optimal Control Theory, 2006.
[6] Stefan Schaal, et al. Path integral-based stochastic optimal control for rigid body dynamics, 2009, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[7] Marc Toussaint, et al. Robot trajectory optimization using approximate inference, 2009, ICML '09.
[8] Yuval Tassa, et al. Iterative local dynamic programming, 2009, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[9] Yasemin Altun, et al. Relative Entropy Policy Search, 2010.
[10] Yuval Tassa, et al. Stochastic Differential Dynamic Programming, 2010, Proceedings of the 2010 American Control Conference.
[11] Jan Peters, et al. A biomimetic approach to robot table tennis, 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[12] Csaba Szepesvári, et al. Algorithms for Reinforcement Learning, 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[13] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[14] Paul Wagner, et al. A reinterpretation of the policy oscillation phenomenon in approximate policy iteration, 2011, NIPS.
[15] Jan Peters, et al. Hierarchical Relative Entropy Policy Search, 2014, AISTATS.
[16] Jan Peters, et al. A Survey on Policy Search for Robotics, 2013, Found. Trends Robotics.
[17] Luca Bascetta, et al. Adaptive Step-Size for Policy Gradient Methods, 2013, NIPS.
[18] Daniele Calandriello, et al. Safe Policy Iteration, 2013, ICML.
[19] Sergey Levine, et al. Learning Complex Neural Network Policies with Trajectory Optimization, 2014, ICML.
[20] Yunpeng Pan, et al. Probabilistic Differential Dynamic Programming, 2014, NIPS.
[21] Sergey Levine, et al. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics, 2014, NIPS.
[22] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[23] Luís Paulo Reis, et al. Model-Based Relative Entropy Stochastic Search, 2016, NIPS.
[24] Anastasios Kyrillidis, et al. Dropping Convexity for Faster Semi-definite Optimization, 2015, COLT.