Natural Actor-Critic
[1] Dimitri P. Bertsekas,et al. Neuro-Dynamic Programming , 2009, Encyclopedia of Optimization.
[2] Xinhua Zhang,et al. Conditional random fields for multi-agent reinforcement learning , 2007, ICML '07.
[3] Stefan Schaal,et al. Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning , 2007, ESANN.
[4] Aude Billard,et al. Reinforcement learning for imitating constrained reaching movements , 2007, Adv. Robotics.
[5] Xinhua Zhang,et al. Conditional Random Fields for Reinforcement Learning , 2007 .
[6] Csaba Szepesvári. Natural Actor-Critic , 2007 .
[7] Olivier Buffet,et al. Shaping multi-agent systems with gradient reinforcement learning , 2007, Autonomous Agents and Multi-Agent Systems.
[8] Jin Yu,et al. Natural Actor-Critic for Road Traffic Optimisation , 2006, NIPS.
[9] Stefan Schaal,et al. Policy Gradient Methods for Robotics , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[10] Shin Ishii,et al. Fast and Stable Learning of Quasi-Passive Dynamic Walking by an Unstable Biped Robot based on Off-Policy Natural Actor-Critic , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[11] Douglas Aberdeen,et al. POMDPs and Policy Gradients , 2006 .
[12] Jongho Kim,et al. An RLS-Based Natural Actor-Critic Algorithm for Locomotion of a Two-Linked Robot Arm , 2005, CIS.
[13] Stefan Schaal,et al. Reinforcement Learning for Humanoid Robotics , 2003 .
[14] Jeff G. Schneider,et al. Covariant policy search , 2003, IJCAI 2003.
[15] Douglas Aberdeen,et al. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes , 2003 .
[16] Sethu Vijayakumar,et al. Scaling Reinforcement Learning Paradigms for Motor Learning , 2003 .
[17] Jun Nakanishi,et al. Learning rhythmic movements by demonstration using nonlinear oscillators , 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[18] Peter L. Bartlett,et al. An Introduction to Reinforcement Learning Theory: Value Function Methods , 2002, Machine Learning Summer School.
[19] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[20] Kenji Fukumizu,et al. Local minima and plateaus in hierarchical structures of multilayer perceptrons , 2000, Neural Networks.
[21] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[22] T. Moon,et al. Mathematical Methods and Algorithms for Signal Processing , 1999 .
[23] Justin A. Boyan,et al. Least-Squares Temporal Difference Learning , 1999, ICML.
[24] Vijay R. Konda,et al. Actor-Critic Algorithms , 1999, NIPS.
[25] Andrew W. Moore,et al. Gradient Descent for General Reinforcement Learning , 1998, NIPS.
[26] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[27] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[28] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.
[29] Richard S. Sutton,et al. Dimensions of Reinforcement Learning , 1998 .
[30] Andrew G. Barto,et al. Adaptive linear quadratic control using policy iteration , 1994, Proceedings of 1994 American Control Conference - ACC '94.