Natural Actor-Critic
[1] Andrew G. Barto, et al. Adaptive linear quadratic control using policy iteration, 1994, Proceedings of 1994 American Control Conference - ACC '94.
[2] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[3] Andrew G. Barto, et al. Reinforcement learning, 1998.
[4] Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.
[5] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.
[6] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[7] Richard S. Sutton, et al. Dimensions of Reinforcement Learning, 1998.
[8] Justin A. Boyan, et al. Least-Squares Temporal Difference Learning, 1999, ICML.
[9] John N. Tsitsiklis, et al. Actor-Critic Algorithms, 1999, NIPS.
[10] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[11] T. Moon, et al. Mathematical Methods and Algorithms for Signal Processing, 1999.
[12] Kenji Fukumizu, et al. Local minima and plateaus in hierarchical structures of multilayer perceptrons, 2000, Neural Networks.
[13] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.
[14] Peter L. Bartlett, et al. An Introduction to Reinforcement Learning Theory: Value Function Methods, 2002, Machine Learning Summer School.
[15] Jun Nakanishi, et al. Learning rhythmic movements by demonstration using nonlinear oscillators, 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[16] Stefan Schaal, et al. Reinforcement Learning for Humanoid Robotics, 2003.
[17] Jeff G. Schneider, et al. Covariant Policy Search, 2003, IJCAI.
[18] Sethu Vijayakumar, et al. Scaling Reinforcement Learning Paradigms for Motor Learning, 2003.
[19] Douglas Aberdeen, et al. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes, 2003.
[20] Jongho Kim, et al. An RLS-Based Natural Actor-Critic Algorithm for Locomotion of a Two-Linked Robot Arm, 2005, CIS.
[21] Douglas Aberdeen, et al. POMDPs and Policy Gradients, 2006.
[22] Olivier Buffet, et al. Shaping multi-agent systems with gradient reinforcement learning, 2007, Autonomous Agents and Multi-Agent Systems.
[23] Jin Yu, et al. Natural Actor-Critic for Road Traffic Optimisation, 2006, NIPS.
[24] Stefan Schaal, et al. Policy Gradient Methods for Robotics, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[25] Shin Ishii, et al. Fast and Stable Learning of Quasi-Passive Dynamic Walking by an Unstable Biped Robot based on Off-Policy Natural Actor-Critic, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[26] Aude Billard, et al. Reinforcement learning for imitating constrained reaching movements, 2007, Adv. Robotics.
[27] Xinhua Zhang, et al. Conditional Random Fields for Reinforcement Learning, 2007.
[28] Stefan Schaal, et al. Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning, 2007, ESANN.
[29] Xinhua Zhang, et al. Conditional random fields for multi-agent reinforcement learning, 2007, ICML '07.