Paper Information - Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning

Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning

Stefan Schaal, University of Southern California, Los Angeles, CA 90089, USA

Abstract. In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists of actor updates, which are achieved using natural stochastic policy gradients, while the critic obtains the natural policy gradient by linear regression. We show that this architecture can be used to learn the “building blocks of movement generation”, called motor primitives. Motor primitives are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. We show that our most modern algorithm, the Episodic Natural Actor-Critic, outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm.
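The abstract's statement that the critic obtains the natural policy gradient by linear regression can be illustrated with a small sketch. This is not the paper's implementation. It assumes a one-step task, a scalar policy parameter theta, a Gaussian policy with fixed standard deviation sigma, and a quadratic reward. The names enac_gradient and train and all numeric settings are hypothetical. Each episode's return is regressed on the summed log-policy derivative plus a constant, and the fitted slope is used as the natural-gradient estimate for the actor update.

```python
import random


def enac_gradient(psi, returns):
    """Least-squares fit of returns ~ w * psi + b for a scalar parameter.

    psi[i]     : sum over episode i of d/dtheta log pi(a_t | s_t)
    returns[i] : return of episode i
    The slope w is taken as the natural-gradient estimate; the intercept b
    estimates the expected return and acts as a baseline.
    """
    n = len(psi)
    s_p = sum(psi)
    s_pp = sum(p * p for p in psi)
    s_r = sum(returns)
    s_pr = sum(p * r for p, r in zip(psi, returns))
    det = n * s_pp - s_p * s_p  # determinant of the 2x2 normal equations
    w = (n * s_pr - s_p * s_r) / det
    b = (s_pp * s_r - s_p * s_pr) / det
    return w, b


def train(target=2.0, sigma=0.5, theta=0.0, iterations=50,
          episodes=200, lr=0.5, seed=0):
    """Toy actor-critic loop (assumed setup): a ~ N(theta, sigma^2), r = -(a - target)^2."""
    rng = random.Random(seed)
    for _ in range(iterations):
        psi, returns = [], []
        for _ in range(episodes):
            a = rng.gauss(theta, sigma)            # sample an action from the policy
            psi.append((a - theta) / sigma ** 2)   # d/dtheta log N(a; theta, sigma^2)
            returns.append(-(a - target) ** 2)     # one-step episode return
        w, _ = enac_gradient(psi, returns)         # critic: linear regression
        theta += lr * w                            # actor: step along the estimate
    return theta


if __name__ == "__main__":
    print(train())  # theta should end up close to the assumed target of 2.0
```

In this toy setting the regression slope approximates the vanilla gradient scaled by sigma squared, which is the inverse Fisher information of the assumed Gaussian policy. That scaling is what makes the estimate a natural gradient rather than a vanilla one.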

Stefan Schaal | Jan Peters

[1] Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.

[2] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.

[3] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.

[4] Jun Nakanishi, et al. Learning rhythmic movements by demonstration using nonlinear oscillators, 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.

[5] Stefan Schaal, et al. Reinforcement Learning for Humanoid Robotics, 2003.

[6] Jeff G. Schneider, et al. Covariant policy search, 2003, IJCAI 2003.

[7] Jan Wessnitzer, et al. ESANN'2007 proceedings - European Symposium on Artificial Neural Networks, 2007.