Learning humanoid soccer actions interleaving simulated and real data

This paper presents an approach for learning complex tasks on real robots, such as walking or kicking with a humanoid soccer robot, that takes full advantage of the ability to simulate a virtual model of the robot. The approach avoids damaging the real robot during the time-consuming trials needed to learn a correct behavior, and it avoids overfitting to the virtual robot model. The basic idea is to run most of the learning steps in simulation and to use a few learning steps on the real robot to assess the discrepancies between simulation and reality. The measured discrepancies are then used to correct the fitness function used in simulation. Experiments that interleave learning between a real robot (Robovie-M by VStone) and its virtual model in USARSim are presented. They show that the proposed method is effective and significantly reduces learning time.
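The interleaving idea can be illustrated with a minimal toy sketch. It is not the paper's implementation: the one-dimensional parameter, the two quadratic fitness functions, the additive linear correction model, and the hill-climbing learner are all assumptions chosen for illustration, and the names `sim_fitness`, `real_fitness`, `fit_correction`, and `interleaved_learning` are hypothetical. The toy discrepancy is linear by construction, so the linear correction can recover it exactly; a real robot would not be so convenient.

```python
import random


def sim_fitness(x):
    # Hypothetical simulator fitness: the virtual model prefers x = 2.0.
    return -(x - 2.0) ** 2


def real_fitness(x):
    # Hypothetical real-robot fitness: reality prefers x = 3.0.
    return -(x - 3.0) ** 2


def fit_correction(samples):
    # Least-squares line through the measured (x, discrepancy) pairs.
    # Falls back to a constant offset when the x values do not vary.
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_d = sum(d for _, d in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    if var_x < 1e-12:
        return lambda x: mean_d
    slope = sum((x - mean_x) * (d - mean_d) for x, d in samples) / var_x
    return lambda x: mean_d + slope * (x - mean_x)


def interleaved_learning(rounds=10, sim_steps=50, sigma=0.3, seed=0):
    rng = random.Random(seed)
    best = 0.0
    samples = []  # discrepancies measured on the (toy) real robot
    for _ in range(rounds):
        # One real trial per round: measure real minus simulated fitness.
        samples.append((best, real_fitness(best) - sim_fitness(best)))
        correction = fit_correction(samples)

        def corrected(x):
            # Simulated fitness corrected by the estimated discrepancy.
            return sim_fitness(x) + correction(x)

        # Many cheap learning steps in simulation (simple hill climbing).
        for _ in range(sim_steps):
            candidate = best + rng.gauss(0.0, sigma)
            if corrected(candidate) > corrected(best):
                best = candidate
    return best, len(samples)


if __name__ == "__main__":
    best, real_trials = interleaved_learning()
    print(f"learned parameter: {best:.3f} using {real_trials} real trials")
```

With the default arguments, the sketch spends 10 evaluations on the toy "real robot" and 500 in the toy simulator. Learning on uncorrected simulation alone would settle near the simulator's optimum rather than the real one.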
