An extended policy gradient algorithm for robot task learning

In real-world robotic applications, many factors, both low-level (e.g., vision and motion control parameters) and high-level (e.g., the behaviors), determine the quality of a robot's performance. For many tasks, robots therefore require fine-tuning of parameters in the implementation of behaviors and basic control actions, as well as in strategic decision processes. In recent years, machine learning techniques have been used to find optimal parameter sets for different behaviors. A drawback of learning techniques, however, is the time they require: in practical applications, methods designed for physical robots must be effective with small amounts of data. In this paper, we present a method for concurrently learning the best strategy and the optimal parameters by extending the policy gradient reinforcement learning algorithm. The results of our experiments, in a simulated environment and on a real robot, show a very high convergence rate.
