Reinforcement Learning for Humanoid Robotics
[1] R. Bellman. Dynamic programming, 1957, Science.
[2] Vijaykumar Gullapalli, et al. Learning Control Under Extreme Uncertainty, 1992, NIPS.
[3] Andrew G. Barto, et al. Adaptive linear quadratic control using policy iteration, 1994, Proceedings of 1994 American Control Conference - ACC '94.
[4] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[5] Judy A. Franklin, et al. Biped dynamic walking using reinforcement learning, 1997, Robotics Auton. Syst.
[6] Ruth F. Curtain, et al. Linear-quadratic control: An introduction, 1997, Autom.
[7] Andrew G. Barto, et al. Reinforcement learning, 1998.
[8] Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.
[9] Shigenobu Kobayashi, et al. Reinforcement learning for continuous action using stochastic gradient ascent, 1998.
[10] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.
[11] J. Tsitsiklis, et al. Simulation-based optimization of Markov reward processes: implementation issues, 1999, Proceedings of the 38th IEEE Conference on Decision and Control (Cat. No.99CH36304).
[12] Justin A. Boyan, et al. Least-Squares Temporal Difference Learning, 1999, ICML.
[13] John N. Tsitsiklis, et al. Actor-Critic Algorithms, 1999, NIPS.
[14] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[15] Karsten Berns, et al. Adaptive biologically inspired control for the four-legged walking machine BISAM, 1999.
[16] T. Moon, et al. Mathematical Methods and Algorithms for Signal Processing, 1999.
[17] Chaouki T. Abdallah, et al. Linear Quadratic Control: An Introduction, 2000.
[18] Michail G. Lagoudakis, et al. Model-Free Least-Squares Policy Iteration, 2001, NIPS.
[19] Peter L. Bartlett, et al. Experiments with Infinite-Horizon, Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[20] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.
[21] Jun Nakanishi, et al. Learning Attractor Landscapes for Learning Motor Primitives, 2002, NIPS.
[22] Jun Nakanishi, et al. Learning rhythmic movements by demonstration using nonlinear oscillators, 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[23] Ralf Schoknecht, et al. Optimality of Reinforcement Learning Algorithms with Linear Function Approximation, 2002, NIPS.
[24] Stefan Schaal, et al. Forward models in visuomotor control, 2002, Journal of Neurophysiology.
[25] Dimitri P. Bertsekas, et al. Least Squares Policy Evaluation Algorithms with Linear Function Approximation, 2003, Discret. Event Dyn. Syst.
[26] Steven J. Bradtke, et al. Linear Least-Squares algorithms for temporal difference learning, 2004, Machine Learning.
[27] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.